Storytelling with the Sounds of Big Data

Trying to internally “sell” the merits of a big data program to your executive team? Of course, you will need your handy Solution Architect by your side, and a hard-hitting financial analysis vetted by the CFO’s office. But numbers and facts probably won’t make the sales pitch complete. You’ll need to appeal to the emotional side of the sale, and one method to make that connection is to incorporate the sounds of big data.

By Tess Watson. Creative Commons. Courtesy of Flickr.

There’s an interesting book review of “The Sonic Boom” by Joel Beckerman in the Financial Times. In his book, Beckerman makes the statement that “sound is really the emotional engine for any story”—meaning if you’re going to create a powerful narrative, there needs to be an element of sound included.

Beckerman cites examples where sound is intentionally amplified to portray the benefits of a product or service, or even to associate a jingle with a brand promise. For example, the sizzling fajitas that a waiter brings to your table, the boot-up sound on an Apple Mac, or the four closing notes of an AT&T commercial.

Of course, an analytics program pitch to senior management requires your customary facts and figures. You’ll need slides on use cases, a few diagrams of the technical architecture (on-premises, cloud-based, or a combination thereof), prognostications of payback dates and return-on-investment calculations, and a plan to manage the program from an organizational perspective, among other things.

But let’s not overlook the value of telling senior management a good story that humanizes the impact of investing more deeply in an analytics program. And that “good story” can be delivered more successfully when sound is incorporated into the pitch.

So what are the sounds of big data? I can think of a few that, when experienced, can add a powerful dimension to your pitch. First, take your executives on a tour of your data center, or one you’re proposing to utilize, so they can hear the hum of a noisy server room where air conditioning ducts pipe in near-freezing air, CPU fans whir in perpetuity, and cable monkeys scurry back and forth stringing fiber optic lines between various machines. Yes, your executive team will be able to see the servers and feel the biting cold of the data center air conditioning, but you also want them to hear the “sounds” (i.e. listen to this data center) of big data in action.

As another avenue to showcase the sound of big data, perhaps you can replay for your executive team the audio of a customer phone call in which your call center agent struggles to accurately describe where a customer’s product is in transit, or worse, tries to upsell them a product they already own. I’m sure you can think of more “big data” sounds that can accurately depict your daily investment in big data technologies…or lack thereof.

Too often, corporate business cases with a “big ask” for significant headcount, investment dollars and more give too much credence to the left side of the brain, which values logic, mathematics and facts. In the process, we end up ignoring the emotional connection where feelings and intuition interplay.

Remember to incorporate the sounds of big data into your overall analytics investment pitch because what we’re aiming for is a “yes”, “go”, “proceed”, or “what are you waiting for?” from the CFO, CEO or other line of business leader. Ultimately, in terms of our analytics pitch, these are the sounds of big data that really matter.

Too Much Big Data, Too Few Big Ideas

A significant portion of the world’s knowledge is online and accessible to just about anyone with a web browser and internet connection. But as one author argues, all these noisy “Big Data” haystacks don’t translate into much signal, especially in terms of conceptualizing the next big idea.

Courtesy of Flickr. By Klearchos Kapoutsis

With the “Big Data” deluge showing no signs of abating, information overload is the norm. In fact, this “information glut” points to a larger problem: we’re suffering from information overload at the expense of free thinking and the development of new big ideas. That’s the sentiment behind Neal Gabler’s “The Elusive Big Idea.” The substance of Gabler’s argument is that in an era of Big Data, we know more than we’ve ever known, but think about it less.

That’s because the brain – while a wonderful processing engine – just can’t keep up with the data deluge. There’s simply too much to know and too little time to ponder.

Most of us are just flat-out busy with work, family, friends and life’s little and larger troubles. This is why time-saving devices are all the rage. RSS news readers, while not as popular as they once were, are still a valuable tool for sorting through the glut of user-generated content. And without our smartphone calendar reminders telling us where to be and when, most of us would be in a constant state of perspiration, realizing that we’re probably missing out on something, somewhere.

And yet we keep piling it on. Waiting in line to order lunch? Better check on what my friends are doing on Facebook. Need to wait five minutes for that sandwich to be made? Great, now there’s plenty of time to see what’s trending on Yahoo news. With our information addictions, it does appear, as the Pogo quotation so aptly puts it, “We have met the enemy and he is us.”

Neal Gabler has a tough assessment for today’s westernized citizen. He says we prefer “knowing to thinking because knowing has more immediate value.”

Now if this “knowing” translated into something profound, we might be able to justify our information addiction. However, Gabler says our brains are now trained on trivial personal information such as “Where am I going?”, “What are you doing?” or “Whom are you seeing?” And all in 140 characters or less.

With a focus on daily minutiae, it does appear we’re losing capacity for “the big idea.” That’s why attempts at freeing us from our daily inboxes, such as Google’s Free Time, are inspiring. At least there’s an attempt to galvanize thinking, and hopefully ideas will percolate into business value down the road.

Big Data technologies can save us time by sifting through mountains of multi-structured data.  But even then, while technology may help us recognize and match patterns better, or understand links and relationships with more clarity, there are abstract ideas and concepts that can only be tackled with the human mind.

There are many complex and global problems to think about; we just need to free up our minds from the daily clutter to engage them.

A good first step is a quiet room, free of electronics, and some down-time. See if you can stand the silence for more than ten minutes. Keep increasing that time if possible, for the world surely needs more thinking and less menial knowing and doing.

In Big Data Endeavors, Don’t Neglect Softer Business Skills

With technical skills such as Java, C++ and Python in high demand for “Big Data” analytics, it seems like softer business skills such as speaking, writing, planning, leadership and negotiation are falling by the wayside. But the ability to communicate, relate and navigate throughout an organization—the so-called “softer skills”—is especially needed to propagate analysis and communicate the impact of data-driven decision-making.

Courtesy of Flickr. By coryccreamer

In 2012, cloud computing blogger David Linthicum penned a short piece explaining “3 Winners and 3 Losers in the Move to Big Data.” In the post, Linthicum identified data warehouse and BI specialists as one “loser,” presumably because these folks were accustomed to using languages like old-school SQL and supporting “legacy BI” systems.

It’s interesting that, as we near mid-2013, those “legacy” skills of writing for and supporting various BI tools and relational databases are not going away. In fact, the opposite seems true, as open source programmers seek more ways to make projects SQL-like in order to access various distributed file systems and NoSQL and NewSQL data stores. And while the development of SQL-like interfaces helps the business analyst utilize some of these newer platforms, business skills seem to get short shrift in the equation of making an analytics program a success.

It appears the burgeoning role of “data scientist” intends to bridge the gap between technical skills and business acumen. An IBM blog states that while the formal training of a data scientist should include an understanding of computer science and applications, and the ability to write in various languages, data scientists also need business smarts. Thus the data scientist role must marry technical skills with “the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge.”

It would seem that bridging the technical and business acumen gap with the data scientist role is an excellent idea. However, as many articles on this site point out, data scientists are in high demand and can cost an organization a pretty penny. And at this point, there just aren’t that many data scientists available on job boards, or willing to move out of Silicon Valley. So it appears that while there are plenty of employees with technical skills, and line-of-business leaders who understand the inner workings of the enterprise, there’s still a gap that needs bridging. What’s a company to do?

While it’s debatable whether a business analyst can be taught the necessary technical skills to become a data scientist, we can definitely ensure that we don’t neglect softer business skills in the evolution towards a data-driven organization. For example, there are universities that offer classes and executive course work on negotiation, communication and selling skills. In addition, there are programs available such as Toastmasters that can teach leadership and public speaking skills.

Need help writing? Your local university likely has coursework and workshops to improve business writing for proposals, sales briefs, whitepapers and more. Finally, too few employees can perform “critical thinking”: the ability to conceptualize, analyze and then evaluate various streams of information. Coursework from universities across the globe can assist in this area as well.

What say you? Are better business skills needed for analytics professionals? If so, what are those skills? Finally, how would you recommend developing an action plan to “perform a business skills upgrade”?

Should You Be Wary of Big Data Success Stories?

For every successful “Big Data” case study listed in Harvard Business Review, Fortune or the like, there are thousands of failures. It’s a problem of cherry-picking “success stories,” or assuming that most companies are harvesting extreme insights from Big Data analytics projects, when in fact there is a figurative graveyard of big data failures that we never see.

Courtesy of Flickr by timlewisnm.

“Big Data” is a hot topic. There are blogs, articles, analyst briefs and practitioner guides on how to do “Big Data Analytics” correctly. And case studies produced by academics and vendors alike seem to suggest that everyone is having success with Big Data analytics (i.e. uncovering insights and making lots of money).

The truth is that some companies are having wild success reporting, analyzing, and predicting on terabytes and in some cases petabytes of Big Data. But for every eBay, Google, Amazon or Razorfish, there are thousands of companies stumbling, bumbling and fumbling through the process of Big Data analytics with little to show for it.

One recent story detailed a certain CIO who ordered his staff to acquire hundreds of servers with the most capacity available. He wanted to proclaim to the world – and on his resume – that his company built the largest Hadoop cluster on the planet. Despite staff complaints of “where’s the business case?”, the procurement and installation proceeded as planned until the company could claim Hadoop “success.” And as suspected, within 24 months the CIO moved on to greener pastures, leaving the company with a mass of hardware, no business case, and only a fraction of the promised “Big Data” business value.

In an Edge.org article, author and trader Nassim Taleb highlights the problem of observation bias: cherry-picking success stories while ignoring the “graveyard” of failures. It’s easy to pick out the attributes of so-called “winners” while ignoring that the failures likely shared similar traits.

In terms of charting Big Data success, common wisdom says it’s necessary to have a business case, an executive sponsor, funding, the right people with the right skills and more. There are thousands of articles that speak to “How to win” in the marketplace with Big Data. And to be sure, these attributes and cases should be studied and not ignored.

But as Dr. Taleb says, “This (observation) bias makes us miscompute the odds and wrongly ascribe skills” when in fact chance sometimes played a major role. And we must also realize that companies successfully gaining value from Big Data analytics may not have divulged all their secrets to the press and media just yet.

The purpose of this article isn’t to dissuade you from starting your “Big Data” analytics project. And it shouldn’t cause you to discount the good advice and cases available from experts like Tom Davenport, Bill Franks, Merv Adrian and others.

It’s simply counsel that for every James Simons—who makes billions of dollars finding signals in the noise—there are thousands of failed attempts to duplicate his success.

So read up on “Big Data” success stories in HBR, McKinsey and the like, but be aware that these cases probably don’t map exactly to your particular circumstances. What worked for them may not work for you.

Proceed with prudence and purpose (and tongue in cheek, pray for some divine guidance and/or luck) to avoid the cemetery of “Big Data” analytics projects that never delivered.

NSA and the Future of Big Data

The National Security Agency of the United States (NSA) has seen the future of Big Data, and it doesn’t look pretty. Data volumes are growing faster than the NSA can store them, much less analyze them. And if the NSA, with hundreds of millions of dollars to spend on analytics, is challenged, it raises the question: is there any hope for your particular company?

Courtesy of Flickr. By One Lost Penguin

By now, most IT industry analysts accept that “Big Data” is much more than data volumes increasing at an exponential clip. There’s also velocity, the speed at which data are created, ingested and analyzed. And of course, there’s variety in terms of multi-structured data types, including web logs, text, social media, machine data and more.

But let’s get back to data volumes. A commonly referenced IDC report notes that data volumes are more than doubling every two years. Now that’s exponential growth that Professor Albert Bartlett could appreciate!
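
To put that rate in perspective, here is a minimal back-of-the-envelope sketch in Python; the starting volume and time horizon are illustrative assumptions, not IDC figures.

```python
# Back-of-the-envelope projection: volume doubling every two years.
# The 1 PB starting point and 10-year horizon are illustrative, not IDC data.

def projected_volume(start_pb: float, years: float, doubling_period: float = 2.0) -> float:
    """Volume after `years`, assuming it doubles every `doubling_period` years."""
    return start_pb * 2 ** (years / doubling_period)

for years in (2, 4, 6, 8, 10):
    print(f"After {years:2d} years: {projected_volume(1.0, years):5.1f} PB")

# A single petabyte grows 32-fold in a decade -- exactly the kind of
# exponential curve Professor Bartlett warns is easy to underestimate.
```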

What are the consequences of unwieldy data volumes? For starters, it’s nearly impossible to deal effectively with the flood.

In “The Shadow Factory,” James Bamford describes how the NSA is vigorously constructing data centers in remote and not-so-remote locations to properly store the “flood of data” captured from foreign communications, including video, voice, text and spreadsheets. One NSA director is quoted as saying: “Some intelligence data sources grow at a rate of four petabytes per month now…and the rate of growth is increasing!”

Building data centers and storing petabytes of data isn’t the end goal. What the NSA really needs is analysis. And in this area the NSA is falling woefully short, but not for lack of trying.

That’s because, in addition to the fastest supercomputers from Cray and Fujitsu, the NSA needs programmers who can modify algorithms on the fly to account for new keywords that terrorists or other foreign nationals may be using. The NSA also constantly seeks linguists to help translate, document and analyze various foreign languages (something computers struggle with, especially discerning sentiment and context).

According to Bamford, the NSA sifts through petabytes of data on a daily basis and yet the flood of data continues unabated.

In summary, the NSA appears to have more data to store and analyze than budget to procure supercomputers, programmers and analytic talent. There’s just too much data and too little “intelligence” to let directors know which patterns, links and relationships are most important. One NSA director puts it this way: “We’ve been into the future and we’ve seen the problems of a ‘tidal wave’ of data.”

So if one of the most powerful government agencies in the world is struggling with an exponential flood of big data, is there hope for your company?  For advice, we turn to Bill Franks, Chief Analytics Officer for Teradata.

In a Smart Data Collective article, Mr. Franks says that even though the challenge of Big Data may initially be overwhelming, it pays to eat the elephant one bite at a time. “People need to step back, push the hype from their minds, and think things through,” he says. In other words, don’t stress about going big from day one.

Instead, Franks counsels companies to “start small with big data.” Capture a bit at a time, gain value from your analysis, and then collect more, he says. There’s an overwhelming temptation to splurge on hundreds of machines and lots of software to capture and analyze everything. Avoid this route, and instead take the road less traveled—the incremental approach.

The NSA may be drowning in information, but there’s no need to inflict sleepless nights on your IT staff. Think big data but start small. Goodness knows, in terms of data, there will always be plenty more to capture and analyze. The data flood will continue. And from an IT job security perspective, that’s a comforting thought.

How Mobile Operators are Mining Big Data

Mobile phone operators have long mined details on voice and data transactions to measure service quality, place cellular towers in optimal locations and even respond to tariff and rate disputes among various carriers.  But, that’s just scratching the surface for getting value from mobile data.

Image courtesy of Flickr. Milica Sekulic.

Call detail records (CDRs) for mobile transactions are particularly interesting for analysis purposes. According to a Wikipedia entry, CDRs are chock-full of useful data for carriers, including the phone numbers of the originator and the call receiver, start time, duration, route, and call type (voice, SMS, data), among other nuggets. It’s not unusual for mobile operators to mine databases of 100 terabytes (TB) and up to optimize networks, strategically position service personnel, handle customer service requests and more.
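
Sketched in code, a CDR is just a small structured record. Here is a minimal illustration in Python; the field names mirror the Wikipedia description rather than any carrier’s actual schema, and busiest_routes is a hypothetical helper.

```python
# Illustrative sketch of a call detail record (CDR) and one simple rollup.
# Field names follow the Wikipedia description; real carrier schemas differ.
from collections import Counter
from dataclasses import dataclass
from datetime import datetime

@dataclass
class CallDetailRecord:
    originator: str        # calling phone number
    receiver: str          # called phone number
    start_time: datetime
    duration_seconds: int
    route: str             # route/trunk identifier
    call_type: str         # "voice", "sms", or "data"

def busiest_routes(cdrs: list[CallDetailRecord], top_n: int = 5) -> list[tuple[str, int]]:
    """Rank routes by total call seconds -- the kind of rollup behind
    network optimization and cell tower placement decisions."""
    totals: Counter = Counter()
    for cdr in cdrs:
        totals[cdr.route] += cdr.duration_seconds
    return totals.most_common(top_n)
```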

And carriers are also starting to discover value in performing social network analysis (SNA) on relational database and MapReduce/Hadoop platforms to analyze social and relationship connections, find influencers and, if directed by government authorities, even track crime syndicates or monitor terrorist networks.
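
To make the influencer idea concrete, here is a minimal SNA sketch using the open-source networkx library; the call pairs are invented, and real analyses run at far larger scale on the platforms above.

```python
# Minimal social network analysis sketch: treat each call as an edge between
# two subscribers, then rank "influencers" by degree centrality.
# Uses the open-source networkx library; the sample call pairs are invented.
import networkx as nx

call_pairs = [
    ("555-0101", "555-0102"),
    ("555-0101", "555-0103"),
    ("555-0101", "555-0104"),
    ("555-0102", "555-0103"),
    ("555-0105", "555-0101"),
]

graph = nx.Graph()
graph.add_edges_from(call_pairs)

# Degree centrality: the fraction of the network each subscriber touches directly.
centrality = nx.degree_centrality(graph)
influencers = sorted(centrality.items(), key=lambda kv: kv[1], reverse=True)
print(influencers[0])  # 555-0101 tops the list -- the natural "influencer"
```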

While the types of analysis listed above are becoming commonplace, mobile phone operators are learning a lot more from “Big Data” analysis of everything they’re capturing.

Financial Times writer Gillian Tett explores some of these innovative approaches in a recent article (registration required). Tett notes that with mobile phone subscriptions topping 2.5 billion in emerging markets alone, mobile carriers, behavioral scientists and governments are learning more about “people’s movements, habits, and ideas.”

For example, Tett cites the 2010 Haitian earthquake, where aid workers alongside researchers were able to “track Sim cards inside Haitians’ mobile phones.” This in turn helped relief agencies analyze where populations dispersed, and helped route food and medicine to where they were needed most.

Analyst firm IDC notes that smartphones are flying out the door to the tune of 400 million a quarter. With the rise of smartphones, there are more mapping and location-based applications online, too. In fact, when billing, usage, location and social networks, not to mention content accessed, all come into view, there will be little left to the imagination in completing a picture of who you are, where you’ve been, what you’re doing, and where you’re predicted to go next.

This kind of rich information will be accessed for customer, corporate and societal benefit. However, there’s also ripe potential for misuse. The key question is this: is it all much ado about nothing, or a data collection spree with an unhappy ending?

How Much Big Data is Too Much?

With storage costs plummeting and sophisticated software approaches to mining Big Data maturing, it is increasingly cost-effective for corporations and governments to keep all types of data, even data previously discarded. But how much “Big Data” should corporations, entities and governments keep online or archived, especially while “Right to Be Forgotten” debates are swirling?

Image Courtesy of Flickr

Like it or not, all kinds of data are captured every day. James Gleick sums it up nicely in “The Information”:

“The information produced and consumed by humankind used to vanish—that was the norm, the default. The sights, the sounds, the spoken word just melted away. Now the expectations have inverted. Everything may be recorded and preserved at least potentially; every musical performance, every crime, elevator, city street, every volcano or tsunami on the remotest shore…”

With petabytes of storage and virtual machines available in the cloud on a pay-per-use basis, and on-premises storage costs dropping like a rock, it’s conceivable for companies and governments to keep every image, video, recording, keystroke, and web-generated data type. Of course, all these data are of little use without techniques to mine them and perform information discovery. Fortunately, BI and data warehousing technologies have worked wonders over the past thirty to forty years for data that needs to be organized, and we have MapReduce/Hadoop to assist in assembling and analyzing the disorganized data garbage dump.
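
For readers unfamiliar with MapReduce, here is the pattern in miniature as a pure-Python sketch: no Hadoop cluster required, and the “garbage dump” records are invented.

```python
# The MapReduce idea in miniature: map each raw record to (key, value) pairs,
# then reduce by key. Pure-Python sketch; the sample records are invented.
from collections import defaultdict

raw_records = [
    "2013-05-01 video  user=42  bytes=10432",
    "2013-05-01 click  user=17  page=/home",
    "2013-05-02 video  user=42  bytes=20011",
]

def map_phase(record: str):
    """Emit (record_type, 1) for each raw line, whatever its shape."""
    yield record.split()[1], 1

def reduce_phase(pairs):
    """Sum the emitted counts for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

counts = reduce_phase(pair for record in raw_records for pair in map_phase(record))
print(counts)  # {'video': 2, 'click': 1}
```

Hadoop’s contribution is running those two phases in parallel across many machines and petabytes of input, but the shape of the computation is the same.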

There are two consequences of this data deluge.

For individuals, there is the feeling of drowning in a sea of data too overwhelming to manage, much less scrutinize. Novelist David Foster Wallace called this scenario “Total Noise”: the feeling of drowning in a deep pool of too many tweets, posts, phone calls, podcasts and more. And because this total noise causes “information anxiety” for some, plenty of people are deleting their social media accounts.

And there is a second consequence of this data deluge. Since everything that can be captured is in the process of being captured, there are real privacy and security concerns. Our likes, rants, passions and partialities are recorded online and archived offline in perpetuity. These concerns have fomented proposed privacy legislation such as the EU’s “Right to Be Forgotten,” under which digital providers, upon request, will need to cull digital references to individuals.

These consequences raise the question: how much Big Data is too much? What should be kept for corporate reasons (to serve customers better, sell more products, optimize business processes, etc.)? What should be kept for governmental concerns (tracking bank flows for money laundering, watching for potential terrorist activity, monitoring fringe groups that don’t see eye to eye with government officials)? And with legislation such as “Right to Be Forgotten” pending in statehouses across the world, is it more hassle than it’s worth to keep all this Big Data, especially if there are financial penalties for failing to comply?