Big Data. Surely you’ve heard or read something about it by now. Maybe you even have an opinion. Some people I know think it’s much ado about nothing. Nothing more than hype, which dictionary.com defines as “exaggerated publicity” at one end of the spectrum or “a swindle, deception, or trick” at the other. Others think it’s hyperbole, defined as “an extravagant statement or figure of speech not intended to be taken literally.”

Although I think some of the claims made about Big Data certainly fall into these categories, I lean towards describing the current rising interest in the topic as hyperopia, a form of farsightedness that our data management industry is experiencing. In this sense, hyperopia refers to temporal rather than spatial distance. I think we can see something happening with Big Data now, but it’s still rather fuzzy, whereas the future implications are clearer.

Put another way (if the vision analogy doesn’t work for you), Big Data is no longer in its infancy. But it’s not fully mature yet either. Toddler? Preteen? Teen? I’m not sure where it is in this cycle, except certainly not at the teen stage. Of course, we can expect each stage in this maturity cycle to have its ups and downs. It’s a normal part of the process of technological change. Let’s not expect too much where it’s not appropriate and give it full support where and when it’s ready.

As I mentioned in my New Year’s musings, I felt this was the year to start paying more attention to Big Data. Accordingly, I have begun working with some of the technologies in the Big Data ecosystem and started speaking about it. Of course, I’ve been reading as much as I can, too. As a regular conference speaker, I highly value the educational and networking opportunities unique to conference attendance, so I was thrilled to have the opportunity to attend GigaOM’s Structure:Data 2013 in New York City this week.

The Structure:Data conference used a format unlike any I’ve seen at a conference before. Each session lasted 15 minutes, with the attendees primarily positioned in one spot for the day. The speakers rotated on and off the stage throughout the day, with an emcee introducing each session. The session styles also varied, from a moderated panel to a speaker at a podium with slides to a speaker center stage without slides. I recall only one demo, and even that was prerecorded.

The frequency with which speakers and formats changed meant that I didn’t get too antsy sitting in one spot for an extended period. There were also longer 45- to 60-minute workshops that allowed us a) to get up and move to a different place for a while, and b) to get exposed to a topic in a bit more depth than the main stage topics. Overall, I liked the format for its ability to introduce me to a lot of ideas in a compressed timeframe. The irony did not escape me, though… Big Data in small bites.

Hands down, my favorite session was The CIA’s “grand challenge” with big data delivered by Ira “Gus” Hunt, CTO for the Central Intelligence Agency. His talk (online here) was fascinating and even a bit scary. Did you know your smartphone can identify you by your gait? That’s just one of many tidbits Hunt shared. Others that struck me most in this session:

  • Big Data allows us to grow the haystack and magnify the needles.
  • Information no longer flows from the few to the many, but from the many to the many, thereby generating more and more data. Sure, there’s a lot of noise there, but there’s also a signal to be discovered lying within.
  • Putting sensors in everything results in explosive growth in data volumes. It’s not just Big, but Really Big. Sensors monitor location, health and even identity (through gait, for example).
  • Through analysis of sensor data, the inanimate can become sentient. (Oh, dear…)

The title of the session mentioned challenges, so here are a few that Hunt explored:
  • We don’t know the future value of data.
  • We cannot connect the dots that we don’t have.
  • Traditional requirements analysis doesn’t work in the world of Big Data.

Of particular interest to me, and a theme I heard many times over at the conference, was Hunt’s contention that the power of Big Data can be achieved only when the end user can interact with it. We cannot expect users to be data scientists, nor can we expect them to depend on data scientists. Yet, he says, analytical tools for Big Data are hard to use. (I stopped at some vendor booths to see what they had to show, but in fairness to all, I need more time to digest their offerings before I can comment.) He cited Wikipedia’s definition of data science, and in particular the following quote: “There is probably no living person who is an expert in all of these disciplines – if so they would be extremely rare.”

So are our academic institutions going to miraculously churn out hordes of data scientists to place in every business? I don’t think so. I don’t have anything against data scientists in general, but I think the mission of accomplishing Big Things with Big Data is better served if their skills are used to develop the next generation(s) of software that empowers the business user, rather than having them become yet another barrier between regular people and data just because the data gets “bigger.” Or is this just wishful thinking on my part?