Beware Big Data Technology Zealotry

Undoubtedly you’ve heard it all before: “Hadoop is the next big thing, why waste your time with a relational database?” or “Hadoop is really only good for the following things” or “Our NoSQL database scales, other solutions don’t.” Invariably, there are hundreds of additional arguments proffered by big data vendors and technology zealots inhabiting organizations just like yours. However, there are few crisp binary choices in technology decision making, especially in today’s heterogeneous big data environments.

Courtesy of Flickr. Creative Commons. By Eden, Janine, and Jim.

Teradata CTO Stephen Brobst has a great story regarding a Stanford technology conference he attended. Apparently in one session there were “shouting matches” between relational database and Hadoop fanatics as to which technology better served customers going forward. Mr. Brobst wasn’t amused, concluding: “As an engineer, my view is that when you see this kind of religious zealotry on either side, both sides are wrong. A good engineer is happy to use good ideas wherever they come from.”

Considering various technology choices for your particular organization is a multi-faceted decision-making process. For example, suppose you are investigating a new application and/or database for a mission-critical job. Let’s also suppose your existing solution is working well enough. However, the industry pundits, bloggers and analysts are hyping the next big thing in technology and luring you towards it. At this point, alarm bells should be ringing. Let’s explore why.

First, for companies that are not start-ups, the idea of ripping and replacing an existing and working solution should give every CIO and CTO pause. The use cases enabled by this new technology must significantly stand out.

Second, unless your existing solution is fully depreciated (for on-premises, hardware based solutions), you’re going to have a tough time getting past your CFO. Regardless of your situation, you’ll need compelling calculations for TCO, IRR and ROI.
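To make the CFO conversation a little more concrete, here is a minimal Python sketch of the three calculations named above. Every figure in it (the upfront cost, the annual costs and the annual benefits) is a hypothetical placeholder rather than data from any real project, and the IRR is found with a simple bisection search that assumes the net present value changes sign only once over the search range.

```python
def total_cost_of_ownership(upfront, annual_costs):
    """TCO: upfront spend plus every year's running cost."""
    return upfront + sum(annual_costs)


def simple_roi(total_benefit, total_cost):
    """ROI as a fraction: net gain divided by total cost."""
    return (total_benefit - total_cost) / total_cost


def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs today (year 0)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-7, max_iter=200):
    """IRR by bisection: the rate at which NPV is (approximately) zero.

    Assumes NPV has opposite signs at `low` and `high`.
    """
    f_low = npv(low, cash_flows)
    f_high = npv(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign on the search interval")
    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if abs(f_mid) < tol:
            return mid
        if f_low * f_mid < 0:
            high = mid          # root lies in the lower half
        else:
            low, f_low = mid, f_mid  # root lies in the upper half
    return (low + high) / 2


# Hypothetical three-year project (all numbers are illustrative assumptions).
upfront = 1_000_000
annual_costs = [200_000, 200_000, 200_000]
annual_benefits = [600_000, 600_000, 600_000]

tco = total_cost_of_ownership(upfront, annual_costs)
roi = simple_roi(sum(annual_benefits), tco)
cash_flows = [-upfront] + [b - c for b, c in zip(annual_benefits, annual_costs)]
project_irr = irr(cash_flows)

print(f"TCO: {tco:,}  ROI: {roi:.1%}  IRR: {project_irr:.1%}")
```

A real business case would also need a discount rate, depreciation of the existing system and migration risk, none of which this sketch attempts to model.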

Third, you will need to investigate whether your company has the skill sets to develop and operate this new environment, or whether they are readily available from outside vendors.

Fourth, consider your risk tolerance or appetite for failure: if this new IT project fails, will it be considered a “drop in the bucket,” or could it take down the entire company?

Finally, consider whether you’re succumbing to technology zealotry pitched by your favorite vendor or internal technologist. Oftentimes in technology decision making, the better choice is “and”, not “either/or”.

For example, more companies are adopting a heterogeneous technology environment for unified information where multiple technologies and approaches work together in unison to meet various needs for reporting, dashboards, visualization, ad-hoc queries, operational applications, predictive analytics, and more. In essence, think more about synergies and inter-operability, not isolated technologies and processes.

In counterpoint, some will argue that technology capabilities increasingly overlap, and with a heterogeneous approach companies might be paying for some features twice. It is true that lines are blurring regarding technology capabilities as some of today’s relational databases can accept and process JSON (previously the purview of NoSQL databases), queries and BI reports can run on Hadoop, and “discovery work” can complete on multiple platforms. However, considering the maturity and design of various competing big data solutions, it does not appear—for the immediate future—that one size will fit all.

When it comes to selecting big data technologies, objectivity and flexibility are paramount. You’ll have to settle on technologies based on your unique business and use cases, risk tolerance, financial situation, analytic readiness and more.

If your big data vendor or favorite company technologist is missing a toolbox or multi-faceted perspective and instead seems to employ a “to a hammer, everything looks like a nail” approach, you might want to look elsewhere for a competing point of view.

It’s Time to Ditch Scarcity Thinking

In J.R.R. Tolkien’s “The Hobbit,” Smaug the magnificent dragon sits on his nearly unlimited hoard of treasure and coins and tells “burglar” Bilbo Baggins to “help (himself) again, there’s plenty and to spare.” While it’s certainly true there are many things in this world that are physically scarce, when it comes to living in the information age, we need to retrain our minds to ditch scarcity thinking and instead embrace “sky’s the limit” abundance.

Image courtesy of Flickr.  By SolidEther

Most of us have been taught there are resource constraints for things such as time, talent and natural items such as land, fresh water and more. And of course, there are very real limits to some of these items. However, we currently live in an information age. And in this era, some of our previous thought patterns no longer apply.

Take, for instance, the ability to have an ocean of knowledge at our fingertips. With non-networked computers and other devices, we’re limited to the data at hand, or the storage capacity of these devices. But add in a dash of hard-wired or wireless networking and suddenly physical limits to knowledge disappear.

Apple’s Siri technology is a compelling case in point. Using only the available processing power of an iPhone (which, by the way, is considerable), Siri could arguably answer a limited number of questions based on data in flash storage.

But open up Siri’s natural language processing (the bulk of which is done in the cloud) and suddenly if Siri can’t understand you, or doesn’t know an answer, the web may provide assistance. By leveraging cloud computing and access to the internet, Siri brings a wealth of data to users, and even more intelligence to Apple by capturing all queries “in the cloud” and offering an immense data set for programmers to tune and improve Siri’s capabilities.

It used to be that TV airtime was in short supply. After all, there are only so many channels and airtime programming slots for content, especially during primetime hours. And there’s still an arduous process to create, discover and produce quality content that viewers will want to watch during these scarce blocks of time.

Without regard to conventional thinking, YouTube is turning this process on its head. A New Yorker article details how YouTube is growing its market presence by offering unlimited “channels” that can be played on-demand, anytime and anywhere. “On YouTube, airtime is infinite, content costs almost nothing for YouTube to produce and quantity, not quality is the bottom line,” explains author John Seabrook. Content watching then (whether via YouTube, Netflix, DVR, Slingbox, etc.) is no longer constricted to certain hours, and in effect time is no longer a constraint.

In the past, the music we liked was confined to physical media such as records or compact discs. Then MP3 players such as the iPod expanded our capabilities to listen to more music but were still confined to available device storage. That’s scarcity thinking. Now with wireless networking access, there are few limits to listening to our preferred music through streaming services such as Pandora, or renting music instead of owning it on CD.  Indeed, music subscription services are becoming the dominant model for how music is “acquired”.

There are still real limits to many valuable things in the world (e.g. time, talent, money, physical resources, and even human attention spans). Yet even some of these items are artificially constrained by either politics or today’s business cases.

The information age has brought persons, businesses and societies elasticity, scalability, and the removal of many earlier capacity constraints. We seem to be sitting squarely on Smaug’s unending stack of treasure. But even in the great Smaug’s neck there was a gaping vulnerability. We’ll still need to use prudence, intelligence and far-sighted thinking in this age of abundance, with the understanding that just because some of our constraints are removed, that doesn’t necessarily mean we should become gluttonous and wasteful in our use of today’s resources.

Big Data Technology Training – A Better Approach

Many technology companies begin training by handing employees binders of technical manuals, topics and user guides.  Employees are expected to plow through reams of text and diagrams to learn what they need to know to succeed on the job. Instead of just a “core dump” of manuals and online training courses, technical employees should also get “hands on” simulations, boot camps and courses led by advanced robo-instructors to fully hit the ground running.

Courtesy of Flickr. By Colum O'Dwyer

It’s generally accepted there are two types of knowledge: theoretical knowledge learned via reading books, whitepapers, and other types of documents (also known as classroom knowledge) and experiential knowledge (learning by doing a specific task or involvement in daily activities).

All too often, technology employees coming onto the job on day one are either handed a tome or two to assimilate, or given a long list of pre-recorded webinars to understand the company’s technology, competitive positioning and go-to-market strategies. In best-case scenarios, technology employees are given a week of instructor-led training and possibly some role-playing exercises. However, there is a better way.

A Financial Times article titled “Do it Like a Software Developer” explores new approaches in terms of training and learning for technology companies of all sizes. Facebook, for example, offers application development new hires 1-2 days of coursework and then turns them loose on adding new features to a new or existing software program. In teams of 30-60, new hires are encouraged to work together to add features and present results to business sponsors at the end of the first week of employment. New hires get hands-on and “real life” experience of how to work in teams to achieve specific business results.

Even better, Netflix has a rogue program called “Chaos Monkey” that keeps new and existing application developers on their toes. This program’s purpose is to intentionally and randomly disable systems that keep Netflix’s streaming system running. Employees then scramble to discover what’s going wrong and make necessary adjustments. According to the FT article, Chaos Monkey is only let loose on weekdays when lots of developers are around and there is relatively light streaming traffic. Netflix believes that if left alone, the streaming service will break down anyway, so isn’t it better to keep it optimized by having armies of employees scouring for trouble spots?
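The idea described above can be illustrated with a toy Python sketch. This is not Netflix’s actual Chaos Monkey code; the service names, the 9:00 to 15:00 “safe window” and the function names are all assumptions made up for illustration. It only shows the two rules mentioned in the article: pick a target at random, and only do so on weekdays during hours when engineers are assumed to be at their desks.

```python
import random
from datetime import datetime


def is_safe_window(now):
    """True on Monday-Friday between 09:00 and 15:00 (an assumed window)."""
    is_weekday = now.weekday() < 5      # Monday is 0, Sunday is 6
    in_office_hours = 9 <= now.hour < 15
    return is_weekday and in_office_hours


def pick_victim(services, now, rng):
    """Return one randomly chosen service to disable, or None if it is
    outside the safe window or there is nothing to choose from."""
    if not services or not is_safe_window(now):
        return None
    return rng.choice(sorted(services))


# Hypothetical service names, used only for this example.
services = ["api-gateway", "recommendations", "billing"]
rng = random.Random()

victim = pick_victim(services, datetime.now(), rng)
if victim is None:
    print("Outside the safe window; nothing disabled.")
else:
    print(f"Would disable: {victim}")
```

In a training setting, the “disable” step would point at a non-production copy of the system, so new hires can practice diagnosing failures without customers noticing.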

Simulations, fire drills, and real-life boot camps should supplement book knowledge for technology companies looking to make new hires fully productive. But of course, such events are often considered a luxury for companies with limited training budgets, or a need to get employees on the job as soon as possible. All too often, however, employees will learn one way or another. And mistakes are then made on the customer’s dime. Is it not better to have new employees learn in a safe, controlled “non-production” environment where mistakes can be monitored and quickly corrected by mentors and instructors?

“Hands-on” training and learning activities are not only for application developers. With available and coming Artificial Intelligence (AI) technologies, it’s feasible for “robo-instructors” to guide technology sales employees through customer sales calls via an online interface (with more than canned responses based on rudimentary decision trees).  Or new-hire technology marketing professionals could design a campaign along with a feasible budget for a new product line and present results to business sponsors or be graded by an advanced algorithm. The possibilities for a more robust and experiential training program for technology associates are endless.

At my first job in Silicon Valley, working for a cable modem company, I was handed five thick and heavy technical manuals on day one. There was no instructor-led training, online training or mentoring. It was sink or swim, and many employees (me included) sank to the bottom of the ocean floor.

While these types of lackluster training events at tech companies might be more exception than rule, there’s an opportunity for increased new-hire productivity and job satisfaction. What’s required is a different mindset towards additional training investment and more focus on ingrained learning through experience and daily immersion of activities rather than a book knowledge cram course.

When Big Data Loses to the Anecdote

It’s highly possible that in your next business meeting, you may have data; you may have a solid analysis and even your best recommendations based on a given set of facts, and still lose out to a competing presentation littered with personal anecdotes. That’s because while business cultures like to profess “In God we trust; all others must bring data,” the reality is that human beings still like a gripping narrative, and emotional stories can sometimes override what seems like the best decision on paper.

Courtesy of Flickr. By fotosterona

Try this scenario on for size: as a marketing professional you must convince your CEO that your product needs a radical overhaul. You have tabulated numerical results from surveys; you’ve captured customer comments from online forums, and you’re ready to fire off with “fact” after “fact” to make your case.

Your opponent is the best salesperson in the company. While you present to the CEO and make your case, “Fred” sits there with amusement, not only because he doesn’t support your premise, but also because he’s armed with powerful anecdotes. And when you’ve completed your “business case” chock full of facts and figures for a new product direction and approach, Fred calmly relates three personal customer stories on why your ideas will never work. Then he adds: “All the other sales reps in this company feel the same way.” Question: which argument do you think the CEO adopts?

If you’re honest, you’ll admit that the art of storytelling “wins” over “the facts” in most business cultures. One reason for this is that we’re often inclined to believe the personal anecdote over data. Mathematics professor John Allen Paulos says: “In listening to stories we tend to suspend disbelief in order to be entertained, whereas in evaluating statistics we generally have an opposite belief in order to not be beguiled.” It’s as if we turn our brains “on” for a story, and “off” when numbers come up on the screen.

The second reason is that facts and figures tend to be abstract, whereas the personal anecdote seems “more real”. Take for instance Lucy Kellaway’s Financial Times column on Ryanair. This airline describes itself as “cheapo air”. To keep costs down, Ryanair invests more in infrastructure and operations than customer service. Surely, Ryanair’s CEO has more online chatter, tweets, and phone calls than he’d care to collect, read or listen to.  Ryanair is swimming in data.

However, Lucy Kellaway relates that Ryanair’s CEO recently did an about-face on some of his most notorious airline policies. Why? Because he was tired of being accosted by angry customers when he dined out. While the online Twitterati complained, it was the personal and often angry anecdote, delivered in person, that caused this particular CEO to change his mind.

So it appears that the best strategy to make your next business case is a powerful narrative (go Malcolm Gladwell if you’re able), supported by a statistical underpinning where appropriate. Simply presenting numbers, for numbers’ sake, will only cause glassy eyes and blank stares from those you are trying to persuade. Keep in mind that while it’s tempting to believe that we “must bring the data” and lots of it in order to persuade, sometimes it’s the personal anecdotes that end up making the final sale.

Too Much Big Data, Too Few Big Ideas

A significant portion of the world’s knowledge is online and accessible to just about anyone with a web browser and internet connection. But as one author argues, all these noisy “Big Data” haystacks don’t translate into much signal, especially in terms of conceptualizing the next big idea.

Courtesy of Flickr. By Klearchos Kapoutsis

With the “Big Data” deluge showing no signs of abating, information overload is the norm. In fact, this “information glut” suggests a larger problem. We’re suffering from information overload at the expense of free thinking and development of new big ideas. That’s the sentiment behind Neal Gabler’s “The Elusive Big Idea”.  The substance of Gabler’s argument is that in an era of Big Data, we know more than we’ve ever known, but think about it less.

That’s because the brain – while a wonderful processing engine – just can’t keep up with the data deluge. There’s simply too much to know and too little time to ponder.

Most of us are just flat out busy with work, family, friends and life’s little and larger troubles. This is why time saving devices are the rage. RSS news readers, while not as popular as in the past, are still a valuable tool to sort through the overindulgence of user generated content. And without our smartphone calendar reminders telling us where to be and when, most of us would be in a constant state of perspiration, realizing that we’re probably missing out on something, somewhere.

And yet we keep piling it on. Waiting in line to order lunch? Better check on what my friends are doing on Facebook. Need to wait five minutes for that sandwich to be made? Great, now there’s plenty of time to see what’s trending on Yahoo news. With our information addictions, it does appear, like the Pogo quotation so aptly illustrates, “We have met the enemy and he is us.”

Neal Gabler has a tough assessment for today’s westernized citizen. He says we prefer “knowing to thinking because knowing has more immediate value.”

Now if this “knowing” translated into something profound, we might be able to justify our information addiction. However, Gabler says our brains are now trained on trivial personal information such as “Where am I going?”, “What are you doing?” or “Whom are you seeing?” And all in 140 characters or less.

With a focus on daily minutiae, it does appear we’re losing capacity for “the big idea”. That’s why attempts at freeing us from our daily inboxes, such as Google’s Free Time, are inspiring. At least there’s an attempt to galvanize thinking, and hopefully ideas will percolate into business value down the road.

Big Data technologies can save us time by sifting through mountains of multi-structured data.  But even then, while technology may help us recognize and match patterns better, or understand links and relationships with more clarity, there are abstract ideas and concepts that can only be tackled with the human mind.

There are many complex and global problems to think about; we just need to free up our minds from the daily clutter to engage them.

A good first step is a quiet room, free of electronics, and some down-time. See if you can stand the silence for more than ten minutes. Keep increasing that time if possible, for the world surely needs more thinking and less menial knowing and doing.

Is the Purpose of Analytics Just to Turn a Buck?

Ask just about any company why they are jumpstarting an analytics program and you’ll undoubtedly hear phrases like “We need to reduce costs” or “We must find new customers” or even “We need to shorten our product time-to-market.” And while these are all definitely sound reasons to initiate and nurture an analytics program, there are other rationales beyond “business value” for architecting and implementing an analytical infrastructure and applications.

By TheFixer. Courtesy of Flickr.

A recent Financial Times article mentions how top global business schools are trying to get away from the primacy of “Increasing Shareholder Value.” Indeed, MBA students around the world are generally taught that increasing shareholder value is job number one, and they should do so by cutting costs wherever possible, expanding revenue streams, improving employee productivity and more.

For MBAs, the focus on short term shareholder value is mostly because it’s uncomplicated. “If we can skip the discussions of corporate purpose by stipulating that corporations exist to create shareholder value, then it makes it easier to get down to the more technical details of how we get there,” says Gerald Davis, management professor at University of Michigan.

Bill George, former CEO of Medtronic, has long counseled companies to look past shareholder value as the sole criterion of business success. Instead he says business leaders should consider additional stakeholders of customers, suppliers, employees, and communities when making decisions.

As a business analytics professional, it’s often too easy for me to think about analytics only in the business context (i.e. how they can reduce costs, increase profits, speed time-to-market, improve employee productivity, etc.). In fact, the mission for analytics can easily cross over from the land of shareholder value to safeguarding and improving the well-being and long-term sustainability of other stakeholders.

Examples include:

  • National weather services use Hadoop and NoSQL databases to collect data points from global weather stations and satellites, feed data into predictive climate models, and then recommend courses of action to citizens and governments.
  • Police departments use analytics to predict “hotspots” of criminal activity based on past incidents to help prevent crime and, if not, nab lawbreakers in the act.
  • Governments use real time data collection and analytics to produce readings on local and global air pollution so that citizens can make informed choices about their daily activities.
  • Governments collect and share data on crime and terrorism (and as we’ve seen lately, sometimes a little too well!).
  • Analytics speeds aid relief efforts when natural disasters occur.
  • Predictive analytics tracks disease outbreaks in real time.
  • Access to open data sets and analytics may help farmers in Africa and elsewhere lift millions out of poverty by producing better crop yields.
  • Data scientists are encouraged to share their analytic skills with charities.
  • Companies can track food products with supply chain analytics as they move from “field to fork” to promote food safety.

These are just some examples of the value of analytics beyond shareholder value creation, and there are hundreds more.

Business schools across the globe are revamping their MBA curriculum to focus on shareholder value to a lesser extent and more on sustainability and value for all stakeholders. Perhaps it’s time to look at the worth analytics can bring through a broader and more significant lens of improving societal value, and not just shareholder profits.

Preserving Big Data to Live Forever

If anyone knows how to preserve data and information for long term value, it’s the programmers at Internet Archive, based in San Francisco, CA.  In fact, Internet Archive is attempting to capture every webpage, video, television show, MP3 file, or DVD published anywhere in the world. If Internet Archive is seeking to keep and preserve data for centuries, what can we learn from this non-profit about architecting a solution to keep our own data safeguarded and accessible long-term?

Long term horizon by Irargerich. Courtesy of Flickr.

There’s a fascinating 13-minute documentary on the work of data curators at the Internet Archive. The mission of the Internet Archive is “universal access to all data”. In their efforts to crawl every webpage, scan every book, and make information available to any citizen of the world, the Internet Archive team has designed a system that is resilient, redundant, and highly available.

Preserving knowledge for generations is no easy task. Key components of this massive undertaking include decisions in technology, architecture, data storage, and data accessibility.

First, just about every technology used by Internet Archive is either open source software or commodity hardware. Internet Archive developed Heritrix for web crawling and adding content to its digital archives. To enable full-text search on Internet Archive’s website, Nutch running on Hadoop’s file system is utilized to “allow Google-style full-text search of web content, including the same content as it changes over time.” Some web sites also mention that HBase could be in the mix as a database technology.

Second, the concepts of redundancy and disaster planning are baked into the overall Internet Archive architecture. The non-profit has servers located in San Francisco, but in keeping a multi-century and beyond vision, Internet Archive mirrors data in Amsterdam and Egypt to weather the volatility of historical events.

Third, many companies struggle to decide what data they should use, archive, or throw away. However, with the plummeting cost of hard disk storage and open source Hadoop, capturing and storing all data in perpetuity is more feasible than ever. For Internet Archive, all data are captured and nothing is thrown away.

Finally, it’s one thing to capture and store data, and another to make it accessible. Internet Archive aims to make the world’s knowledge base available to everyone. On the Internet Archive site, users can search and browse through ancient documents, view recorded video from years past and listen to music from artists that no longer walk planet earth. Brewster Kahle, founder of the Internet Archive, says that with a simple internet connection, “A poor kid in Kenya or Kansas can have access to…great works no matter where they are, or when they were (composed).”

Capturing a mountain of multi-structured data (currently 10 petabytes and growing) is an admirable feat; however, the real magic lies in Internet Archive’s multi-century vision of making sure the world’s best and most useful knowledge is preserved. Political systems come and go, but with Internet Archive’s Big Data preservation approach, the treasures of the world’s digital content will hopefully exist for centuries to come.
