Is Your IT Architecture Ready for Big Data?

Built in the 1960s, California's aqueduct is an engineering marvel that transports water from Northern California's mountain ranges to thirsty coastal communities. But faced with a potentially lasting drought, the aqueduct is running below capacity; there simply isn't enough water coming from its sources. In terms of big data, the opposite is likely happening in your organization: too much of it, overflowing the riverbanks and causing havoc. And it's only going from bad to worse.

Courtesy of Flickr. Creative Commons. By Herr Hans Gruber

The California aqueduct is a thing of beauty. As described in an Atlantic magazine article:

“A network of rivers, tributaries, and canals deliver runoff from the Sierra Mountain Range’s snowpack to massive pumps at the southern end of the San Joaquin Delta.” From there, these hydraulic pumps push water to California cities via a 444-mile aqueduct that traverses the state and empties into various local reservoirs.

You likely have something analogous to a big data aqueduct in your organization. Source systems kick off data in various formats, which probably go through some refining process and end up in relational form. Excess digital exhaust is conceivably kept in compressed storage, onsite or at a remote location. It's a continual process in which data are ingested, stored, moved, processed, monitored and analyzed throughout your organization.
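In rough code terms, that aqueduct might look like the minimal sketch below. It's an illustrative sketch only; the stage names, record fields and file paths are hypothetical placeholders, not a prescription for your environment.

```python
# A minimal sketch of the "big data aqueduct": raw, multi-format source
# data is ingested, refined into relational (row/column) shape, and the
# excess archived in compressed storage. All names here are hypothetical.
import csv
import gzip
import json
from pathlib import Path

def ingest(raw_path: Path) -> list[dict]:
    """Source systems kick off data in various formats; here, JSON lines."""
    with raw_path.open() as f:
        return [json.loads(line) for line in f]

def refine(records: list[dict], out_path: Path) -> None:
    """Refine into relational form for downstream reporting and analysis."""
    fields = ("ts", "user", "event")  # hypothetical schema
    with out_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows({k: r.get(k) for k in fields} for r in records)

def archive(raw_path: Path, archive_path: Path) -> None:
    """Excess digital exhaust goes to compressed storage, on- or off-site."""
    archive_path.write_bytes(gzip.compress(raw_path.read_bytes()))
```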

But with big data, there’s simply too much of it coming your way. Author James Gleick describes it this way: “The information produced and consumed by humankind used to vanish—that was the norm, the default. The sights, the sounds, the songs, the spoken word just melted away. Now expectations have inverted. Everything may be recorded and preserved, at least potentially: every musical performance; every crime in a shop, elevator, or city street; every volcano or tsunami on the remotest shore.” In short, everything that can be recorded is fair game, and likely sits on a server somewhere in the world.

Put simply, the IT architecture that got us here isn't going to handle the immense data flood coming our way without a serious upgrade in capability and alignment.

IT architecture can be thought of as a view from above: a blueprint of various structures and components and how they function together. In this context, we're concerned with what the overall blueprint of business, information, applications and systems looks like today, and what it needs to look like to meet future business needs.

We need to rethink our architectural approaches for big data. To be sure, some companies (maybe 10%) will never need to harness multi-structured data types. They may never need to dabble with or implement open source technologies. Recommending some sort of "big data" architecture for these companies is counter-productive.

However, the other 90% of companies are waking up to the realization that today's IT architecture and infrastructure won't meet their future needs. These companies urgently need to assess their current situation and future business needs, and then design an architecture that delivers insights from all data types, not just those that fit neatly into relational rows and columns.

The big data onslaught will continue for the foreseeable future, and exponential data growth will only make it more intense. But here's the challenge: the human mind tends to think linearly. We simply don't know how to plan for, much less capitalize on, exponential growth. As a result, the business, information, application and systems infrastructures at most companies aren't equipped to cope with, much less harness, the coming big data flood.

Want to be prepared? Take a fresh look at your existing IT architecture and make sure your data management, data processing, development tools, integration and analytic systems are up to snuff. And whatever your future plans are, consider doubling down on them.

Until convincing proof shows otherwise, it's simply too risky not to have a well-thought-out plan for the stormy days of too much big data ahead.

Changing Your Mind About Big Data Isn’t Dumb

After all the hype about big data and its technological cousin Hadoop, some CIOs are getting skittish about investing additional money in a big data program without a clear business case. Indeed, when it comes to big data it's OK to step back and think critically about what you're doing, pause your programs for a time if necessary, and, yes, even change your mind.

Courtesy of Flickr. Creative Commons. By Steven Depolo

Economist and former Federal Reserve chairman Alan Greenspan has changed his mind many times. In a Financial Times article, columnist Gillian Tett chronicles Greenspan's multiple positions on the value of gold. Tett says that in his formative years, Greenspan was fascinated with the idea of the gold standard (i.e. pegging the value of a currency to a given amount of gold), but he later became a staunch defender of fiat currencies. And now, in his sunset years, Greenspan has shifted again, saying: "Gold is a currency. It is still, by all evidence, a premier currency. No fiat currency, including the dollar, can match it."

To me at least, Greenspan's fluctuating positions on gold reflect a mind that continually adapts to new information. Some would view Greenspan as a "waffler," someone who cannot make up his mind. I don't see it that way. Changing your mind isn't a sign of weakness; rather, it shows pragmatic thinking that adapts as market or business conditions shift.

So what does any of this have to do with big data? While big data and its associated technologies have enjoyed plenty of hype, a new reality is setting in about the value companies are actually getting from their big data investments.

Take, for example, a Barclays survey in which a large percentage of CIOs were "uncertain", thus far, as to the value of Hadoop because of the ongoing costs of support and training, the difficulty of hiring hard-to-find operations and development staff, and the work required to integrate Hadoop with existing enterprise systems.

In another survey, of 111 U.S. data scientists sponsored by Paradigm4, 22 percent of those surveyed said Hadoop and Spark were not well suited to their analytics. In the same survey, 35 percent of data scientists who had tried Hadoop or Spark had stopped using them.

And earlier in the year, Gartner analyst Svetlana Sicular noted that big data has fallen into Gartner's trough of disillusionment, commenting: "My most advanced with Hadoop clients are also getting disillusioned…these organizations have fascinating ideas, but they are disappointed with a difficulty of figuring out reliable solutions."

With all this in mind, I think it makes sense to take a step back and assess your big data progress. If you were one of the early Hadoop adopters, it's a good time to examine your current program, report on results, and test against any return-on-investment projections (hard dollar or soft benefits) you've made. Never formalized a business case for big data? Here's your chance to work one up, because future capital investments will likely depend on it.

In fact, now is the perfect opportunity for deeper thinking on your big data investments. It's time to go beyond the big data pilot and put effort into strategies for integrating those pilots with the rest of your enterprise systems. It's also time to think long and hard about how to make your analytics "consumable by the masses", in other words, accessible to many more business users than those currently using your systems.

And maybe you are in the camp of charting a different course for big data investments. Perhaps business conditions aren't quite right at the moment, or an executive shift warrants a six-month reprieve to focus on other core items. If this is your situation, it might not be a bad idea to let the ever-changing big data technology and vendor landscape shake out a bit before jumping back in.

To be clear, there's no suggestion whatsoever to abandon your plans to harness big data. Now that would be dumb. But much like Alan Greenspan's shifting opinions on gold, it's perfectly OK to reassess your current position and chart a more pragmatic and flexible course toward big data results.

Storytelling with the Sounds of Big Data

Trying to internally "sell" the merits of a big data program to your executive team? Of course, you'll need your handy solution architect by your side and a hard-hitting financial analysis vetted by the CFO's office. But numbers and facts probably won't make the sales pitch complete. You'll need to appeal to the emotional side of the sale, and one method for making that connection is to incorporate the sounds of big data.

By Tess Watson. Creative Commons. Courtesy of Flickr.

There's an interesting Financial Times review of "The Sonic Boom" by Joel Beckerman. In the book, Beckerman states that "sound is really the emotional engine for any story", meaning that if you're going to create a powerful narrative, it needs to include an element of sound.

Beckerman cites examples where sound is intentionally amplified to portray the benefits of a product or service, or to associate a jingle with a brand promise: the sizzling fajitas a waiter brings to your table, the boot-up chime on an Apple Mac, or the four closing notes of an AT&T commercial.

Of course, an analytics program pitch to senior management requires the customary facts and figures: slides on use cases, a few diagrams of the technical architecture (on-premise, cloud-based or a combination thereof), prognostications of payback dates and return-on-investment calculations, and a plan to manage the program from an organizational perspective, among other things.
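As a concrete illustration, the payback portion of that pitch often boils down to arithmetic like the minimal sketch below. Every dollar figure in it is a hypothetical placeholder, not a benchmark.

```python
# A back-of-envelope payback calculation of the sort an analytics pitch
# deck might include. All dollar figures are hypothetical placeholders.

def payback_months(upfront_cost, monthly_benefit, monthly_run_cost):
    """Months until cumulative net benefit covers the upfront investment."""
    net_monthly = monthly_benefit - monthly_run_cost
    if net_monthly <= 0:
        return None  # at these assumptions, the program never pays back
    return upfront_cost / net_monthly

# Hypothetical program: $1.2M upfront, $90K/month benefit, $40K/month to run.
months = payback_months(1_200_000, 90_000, 40_000)
print(f"Estimated payback: {months:.0f} months")  # -> 24 months
```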

But let's not overlook the value of telling senior management a good story, one that humanizes the impact of investing more deeply in an analytics program. And that story can be delivered more successfully when sound is incorporated into the pitch.

So what are the sounds of big data? I can think of a few that, when experienced, add a powerful dimension to your pitch. First, take your executives on a tour of your data center, or the one you're proposing to use. Let them hear the hum of a noisy server room where air conditioning ducts pipe in near-ice-cold air, CPU fans whirl in perpetuity, and cable monkeys scurry back and forth stringing fiber optic lines between machines. Yes, your executive team will see the servers and feel the biting cold of the data center air conditioning, but you also want them to hear the sounds of big data in action.

Another way to showcase the sound of big data: replay for your executive team the audio of a customer phone call in which your call center agent struggles to describe where a customer's order is in transit, or worse, tries to upsell a product the customer already owns. I'm sure you can think of more "big data" sounds that accurately depict your daily investment in big data technologies, or lack thereof.

Too often, corporate business cases with a "big ask" for significant headcount and investment dollars give too much credence to the left side of the brain, which values logic, mathematics and facts. In the process, we ignore the emotional connection where feelings and intuition interplay.

Remember to incorporate the sounds of big data into your overall analytics investment pitch, because what we're aiming for is a "yes", "go", "proceed", or "what are you waiting for?" from the CFO, CEO or line-of-business leader. Ultimately, in terms of our analytics pitch, these are the sounds of big data that really matter.

Too Much Big Data, Too Few Big Ideas

Courtesy of Flickr. By Klearchos Kapoutsis

A significant portion of the world’s knowledge is online and accessible to just about anyone with a web browser and internet connection. But as one author argues, all these noisy “Big Data” haystacks don’t translate into much signal, especially in terms of conceptualizing the next big idea.


With the "Big Data" deluge showing no signs of abating, information overload is the norm. And this "information glut" suggests a larger problem: we're suffering from information overload at the expense of free thinking and the development of big new ideas. That's the sentiment behind Neal Gabler's "The Elusive Big Idea". The substance of Gabler's argument is that in an era of Big Data, we know more than we've ever known, but think about it less.

That’s because the brain – while a wonderful processing engine – just can’t keep up with the data deluge. There’s simply too much to know and too little time to ponder.

Most of us are flat-out busy with work, family, friends and life's little and larger troubles. This is why time-saving devices are all the rage. RSS news readers, while not as popular as they once were, are still a valuable tool for sorting through the overindulgence of user-generated content. And without smartphone calendar reminders telling us where to be and when, most of us would be in a constant state of perspiration, realizing we're probably missing out on something, somewhere.

And yet we keep piling it on. Waiting in line to order lunch? Better check what my friends are doing on Facebook. Five minutes until that sandwich is made? Great, plenty of time to see what's trending on Yahoo News. With our information addictions, it does appear, as the Pogo quotation so aptly illustrates, "We have met the enemy and he is us."

Neal Gabler has a tough assessment for today’s westernized citizen. He says we prefer “knowing to thinking because knowing has more immediate value.”

Now if this "knowing" translated into something profound, we might be able to justify our information addiction. However, Gabler says our brains are now trained on trivial personal information such as: "Where am I going?", "What are you doing?" or "Whom are you seeing?" And all in 140 characters or less.

With a focus on daily minutiae, it does appear we're losing our capacity for "the big idea". That's why attempts at freeing us from our daily inboxes, such as Google's Free Time, are inspiring. At least there's an attempt to galvanize thinking, and hopefully ideas will percolate into business value down the road.

Big Data technologies can save us time by sifting through mountains of multi-structured data.  But even then, while technology may help us recognize and match patterns better, or understand links and relationships with more clarity, there are abstract ideas and concepts that can only be tackled with the human mind.

There are many complex, global problems to think about; we just need to free our minds from the daily clutter to engage them.

A good first step is a quiet room, free of electronics, and some down-time. See if you can stand the silence for more than ten minutes. Keep increasing that time if possible, for the world surely needs more thinking and less menial knowing and doing.

In Big Data Endeavors, Don’t Neglect Softer Business Skills

Courtesy of Flickr. By coryccreamer

With technical skills such as Java, C++ and Python in high demand for "Big Data" analytics, softer business skills such as speaking, writing, planning, leadership and negotiation seem to be falling by the wayside. But the ability to communicate, relate and navigate throughout an organization, the so-called "softer skills", is especially needed to propagate analysis and communicate the impact of data-driven decision-making.


In 2012, cloud computing blogger David Linthicum penned a short piece explaining "3 Winners and 3 Losers in the Move to Big Data". In the post, Linthicum identified one "loser" as data warehouse and BI specialists, presumably because these folks were accustomed to languages like old-school SQL and to supporting "legacy BI" systems.

It's interesting that, as we near mid-2013, those "legacy" skills of writing for and supporting various BI tools and relational databases are not going away. In fact, the opposite seems true, as open source programmers seek more ways to make projects SQL-like in order to access distributed file systems and NoSQL and NewSQL data stores. And while the development of SQL-like interfaces helps the business analyst utilize some of these newer platforms, business skills seem to get short shrift in the equation of making an analytics program a success.
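To make the trend concrete, here's a minimal sketch of a business analyst's familiar SQL running against a Hadoop cluster through a SQL-on-Hadoop engine such as Apache Hive. The host, port and table names are hypothetical placeholders, and the PyHive driver is just one of several possible clients.

```python
# A sketch of the SQL-like access trend: old-school SQL runs against
# files in a distributed file system via a SQL-on-Hadoop engine such as
# Apache Hive. Host, port and table name are hypothetical placeholders.
from pyhive import hive  # third-party driver: pip install pyhive

conn = hive.Connection(host="hadoop-edge-node.example.com", port=10000)
cursor = conn.cursor()

# Familiar SQL, new-school storage: the engine pushes this query down
# to the cluster and scans the underlying HDFS files.
cursor.execute("""
    SELECT region, COUNT(*) AS orders
    FROM   web_orders
    WHERE  order_date >= '2013-01-01'
    GROUP  BY region
""")
for region, orders in cursor.fetchall():
    print(region, orders)
```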

The burgeoning role of "data scientist" is meant to bridge the gap between technical skills and business acumen. An IBM blog states that while the formal training of a data scientist should include an understanding of computer science and applications, plus the ability to write in various languages, data scientists also need business smarts. The role must marry technical skills with "the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge."

Bridging that gap with the data scientist role is an excellent idea. However, as many articles on this site point out, data scientists are in high demand and can cost an organization a pretty penny. And at this point, there just aren't that many data scientists available on job boards, or willing to move out of Silicon Valley. So while there are plenty of employees with technical skills, and line-of-business leaders who understand the inner workings of the enterprise, there's still a gap that needs bridging. What's a company to do?

While it's debatable whether a business analyst can be taught the technical skills needed to become a data scientist, we can certainly make sure softer business skills aren't neglected in the evolution toward a data-driven organization. For example, universities offer classes and executive coursework on negotiation, communication and selling skills. And programs such as Toastmasters can teach leadership and public speaking.

Need help writing? Your local university likely offers coursework and workshops to improve business writing for proposals, sales briefs, whitepapers and more. Finally, too few employees can perform "critical thinking": conceptualizing, analyzing and then evaluating various streams of information. Coursework from universities across the globe can assist in this area as well.

What say you? Are better business skills needed for analytics professionals? If so, what are those skills? Finally, how would you recommend developing an action plan to “perform a business skills upgrade”?

Should You Be Wary of Big Data Success Stories?

Courtesy of Flickr by timlewisnm.

For every successful "Big Data" case study featured in Harvard Business Review, Fortune or the like, there are thousands of failures. It's a problem of cherry-picking "success stories": assuming most companies are harvesting extreme insights from Big Data analytics projects when, in fact, there's a figurative graveyard of big data failures we never see.


“Big Data” is a hot topic. There are blogs, articles, analyst briefs and practitioner guides on how to do “Big Data Analytics” correctly. And case studies produced by academics and vendors alike seem to portray that everyone is having success with Big Data analytics (i.e. uncovering insights and making lots of money).

The truth is that some companies are having wild success reporting, analyzing and predicting on terabytes, and in some cases petabytes, of Big Data. But for every eBay, Google, Amazon or Razorfish, there are thousands of companies stumbling, bumbling and fumbling through the process of Big Data analytics with little to show for it.

One recent story detailed a certain CIO who ordered his staff to acquire hundreds of servers with the most capacity available. He wanted to proclaim to the world, and on his resume, that his company had built the largest Hadoop cluster on the planet. Despite staff complaints of "where's the business case?", procurement and installation proceeded as planned until the company could claim Hadoop "success". As you might suspect, within 24 months the CIO moved on to greener pastures, leaving the company with a mass of hardware, no business case, and only a fraction of the promised "Big Data" business value.

In an Edge.org article, author and trader Nassim Taleb highlights the problem of observation bias: cherry-picking success stories while ignoring the "graveyard" of failures. It's easy to pick out the attributes of so-called "winners" while ignoring that the failures likely shared the same traits.

In terms of charting Big Data success, common wisdom says it's necessary to have a business case, an executive sponsor, funding, and the right people with the right skills, among other things. There are thousands of articles that speak to "how to win" in the marketplace with Big Data. And to be sure, these attributes and cases should be studied, not ignored.

But as Dr. Taleb says, "This (observation) bias makes us miscompute the odds and wrongly ascribe skills" when, in some cases, chance was a major factor. And we must also recognize that companies successfully gaining value from Big Data analytics may not have divulged all their secrets to the press and media just yet.

The purpose of this article isn’t to dissuade you from starting your “Big Data” analytics project. And it shouldn’t cause you to discount the good advice and cases available from experts like Tom Davenport, Bill Franks, Merv Adrian and others.

It’s simply counsel that for every James Simons—who makes billions of dollars finding signals in the noise—there are thousands of failed attempts to duplicate his success.

So read up on "Big Data" success stories in HBR, McKinsey reports and the like, but be aware that these cases probably don't map exactly to your particular circumstances. What worked for them may not work for you.

Proceed with prudence and purpose (and, tongue in cheek, pray for some divine guidance and/or luck) to avoid the cemetery of "Big Data" analytics projects that never delivered.

NSA and the Future of Big Data


The United States National Security Agency (NSA) has seen the future of Big Data, and it doesn't look pretty. With data volumes growing faster than the NSA can store them, much less analyze them, a question arises: if an agency with hundreds of millions of dollars to spend on analytics is struggling, is there any hope for your particular company?

Courtesy of Flickr. By One Lost Penguin

By now, most IT industry analysts accept that "Big Data" is much more than data volumes increasing at an exponential clip. There's also velocity, the speed at which data are created, ingested and analyzed. And of course there's variety, in terms of multi-structured data types including web logs, text, social media, machine data and more.

But let's get back to data volumes. A commonly referenced IDC report mentions that data volumes are more than doubling every two years. Now that's exponential growth that Professor Albert Bartlett could appreciate!
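To see what that doubling actually implies, here's a quick worked example; the one-petabyte starting volume is purely illustrative.

```python
# Worked example of the doubling arithmetic: if data volumes double
# every two years, a hypothetical 1 PB archive today becomes roughly
# 32 PB in a decade. Linear planning badly underestimates the out-years,
# which is exactly Professor Bartlett's point about exponentials.
start_pb = 1.0  # illustrative starting volume, in petabytes
for year in range(0, 11, 2):
    volume = start_pb * 2 ** (year / 2)
    print(f"year {year:2d}: {volume:5.1f} PB")
# year  0:   1.0 PB ... year 10:  32.0 PB
```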

What are the consequences of unwieldy data volumes? For starters, it's nearly impossible to deal effectively with the flood.

In "The Shadow Factory", James Bamford describes how the NSA is vigorously constructing data centers in remote and not-so-remote locations to store the "flood of data" captured from foreign communications, including video, voice, text and spreadsheets. One NSA director is quoted as saying: "Some intelligence data sources grow at a rate of four petabytes per month now…and the rate of growth is increasing!"

Building data centers and storing petabytes of data isn’t the end goal. What the NSA really needs is analysis. And in this area the NSA is falling woefully short, but not for lack of trying.

That's because, in addition to the fastest supercomputers from Cray and Fujitsu, the NSA needs programmers who can modify algorithms on the fly to account for new keywords that terrorists or other foreign nationals may be using. The NSA also constantly seeks linguists to help translate, document and analyze various foreign languages (something computers struggle with, especially discerning sentiment and context).

According to Bamford, the NSA sifts through petabytes of data on a daily basis and yet the flood of data continues unabated.

In summary, the NSA appears to have more data to store and analyze than budget to procure supercomputers, programmers and analytic talent. There's simply too much data and too little "intelligence" to tell directors which patterns, links and relationships matter most. As one NSA director puts it: "We've been into the future and we've seen the problems of a 'tidal wave' of data."

So if one of the most powerful government agencies in the world is struggling with an exponential flood of big data, is there hope for your company?  For advice, we turn to Bill Franks, Chief Analytics Officer for Teradata.

In a Smart Data Collective article, Franks says that even though the challenge of Big Data may seem overwhelming at first, it pays to eat the elephant one bite at a time. "People need to step back, push the hype from their minds, and think things through," he says. In other words, don't stress about going big from day one.

Instead, Franks counsels companies to "start small with big data." Capture a bit at a time, gain value from your analysis, and then collect more, he says. There's an overwhelming temptation to splurge on hundreds of machines and lots of software to capture and analyze everything. Avoid this route and instead take the road less traveled: the incremental approach.
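In code terms, that incremental loop might look something like the sketch below. It's a minimal illustration of the idea, not Franks' actual method; the source names and the value check are hypothetical placeholders.

```python
# A minimal sketch of the "start small" approach: onboard one data
# source at a time, prove value from analysis, and only then expand.
# Source names and the value test are hypothetical placeholders.
candidate_sources = ["clickstream", "call_center_logs", "sensor_feeds"]

def pilot_delivers_value(source: str) -> bool:
    # Placeholder: in practice, run a pilot analysis on this source and
    # test the results against your business-case projections.
    return source != "sensor_feeds"  # pretend the first two pilots paid off

onboarded = []
for source in candidate_sources:
    if not pilot_delivers_value(source):
        print(f"pausing before onboarding {source}; revisit the business case")
        break  # stop and reassess rather than splurging on more capture
    onboarded.append(source)  # capture a bit more, analyze, repeat

print("in production:", onboarded)  # -> ['clickstream', 'call_center_logs']
```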

The NSA may be drowning in information, but there's no need to inflict sleepless nights on your IT staff. Think big data, but start small. Goodness knows there will always be more data to capture and analyze. The flood will continue. And from an IT job security perspective, that's a comforting thought.
