Rent vs. Buy? The Cloud Conundrum

Over the long run, is cloud computing a waste of money? Some startups and other “asset-light” businesses seem to think so. However, for specific use cases, cloud computing makes a lot of sense, even over the long haul.

Courtesy of Flickr. By IntelFreePress

A Wired magazine article describes how some Silicon Valley startups are migrating from public clouds to on-premises deployments. Yes, read that again. Cash-poor startups are saying “no” to the public cloud.

On the whole this trend seems counterintuitive. After all, it’s easy to see how capital-constrained startups would be enchanted with public cloud computing: little to no startup costs, no IT equipment to buy, no data centers to build, and no software licensing costs. For startups, then, public cloud computing makes sense for all sorts of applications, and it’s easy to see why entrepreneurs would start with public clouds and stick with them for the foreseeable future.

However, after an initial “kick the tires” experience, various venture-capital-backed firms are migrating away from public clouds.

The Wired article cites how some startups are leaving the public cloud for their own “fleet of good old fashioned computers they could actually put their hands on.” That’s because, over the long run, it’s generally more expensive to rent computing resources than to buy them. The article mentions how one tech startup “did the math” and came up with internal annual costs of $120K for the servers it needed, vs. $320K in public cloud costs.

For another data point, Forbes contributor Gene Marks cites how six of his clients analyzed the costs of the public cloud vs. an on-premises installation monitored and managed by a company’s own IT professionals. The conclusion? Overall, it was “just too expensive” for these companies to run their workloads in the public cloud as opposed to capitalizing new servers and paying the monthly costs of operating them.

Now, to be fair, we need to make sure we’re comparing apples to apples. For an on-premises installation, server hardware costs may be significantly lower over the long run, but it’s also important to include costs such as power, floor space, cooling, and the employee costs of monitoring, maintaining, and upgrading equipment and software. In addition, there are sometimes “hidden” costs: employees spending cycles procuring IT equipment, effort spent on capacity sizing, and the hassle of endless capitalization loops with the Finance group.
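To see how these line items change the math, here is a minimal back-of-the-envelope sketch in Python. Every figure below is a hypothetical placeholder rather than a benchmark; substitute your own quotes and salaries.

    # Back-of-the-envelope TCO comparison: on-premises vs. public cloud.
    # All figures are hypothetical placeholders for illustration only.
    YEARS = 3

    server_capex       = 120_000   # one-time purchase price of servers
    power_cooling_year = 15_000    # electricity and cooling per year
    floor_space_year   = 10_000    # data center floor space per year
    admin_staff_year   = 60_000    # share of IT staff time per year

    on_prem_total = server_capex + YEARS * (
        power_cooling_year + floor_space_year + admin_staff_year
    )

    cloud_cost_year = 200_000      # hypothetical annual cloud bill
    cloud_total = YEARS * cloud_cost_year

    print(f"On-premises over {YEARS} years: ${on_prem_total:,}")
    print(f"Public cloud over {YEARS} years: ${cloud_total:,}")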

Thus, cloud computing still makes a lot of financial sense, especially when capacity planning cycles aren’t linear, when there is a need for “burst capacity,” or when there is unplanned demand (as there often is with fickle customers). And don’t forget use cases such as test and development, proof of concept, data laboratory environments, and disaster recovery.

Another consideration is resource utilization. As I have stated before, if you plan on using IT resources for a brief period of time, cloud computing makes a lot of sense. Conversely, if you plan on running IT resources at 90-100% utilization continually, year after year, it probably makes sense to acquire and capitalize IT assets instead of choosing “pay per use” cloud computing models.
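A rough way to quantify that rule of thumb is to compute the utilization level at which pay-per-use pricing overtakes ownership. The rates here are invented for illustration:

    # Break-even utilization: the fraction of full-time use at which
    # pay-per-use cloud cost equals the cost of owning the capacity.
    # Both rates below are invented for illustration.
    cloud_rate_per_hour = 2.00     # hypothetical on-demand hourly rate
    own_cost_per_year   = 8_000    # hypothetical annualized ownership cost
    hours_per_year      = 24 * 365

    break_even = own_cost_per_year / (cloud_rate_per_hour * hours_per_year)
    print(f"Break-even utilization: {break_even:.0%}")
    # Below this level, renting is cheaper; near 90-100% utilization, owning wins.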

Ultimately, the cloud rent vs. buy decision comes down to more than just the price of servers. Enterprises should be careful to understand their use cases for cloud vs. on-premises IT. In addition, watch for hidden costs in your TCO calculation, and don’t underestimate how much time and effort it really takes to get an IT environment up, running, and performing.

The Center of Analytics Success? Communication!

As technical skills such as programming, application development, and “Big Data” infrastructure management take on added importance, it is imperative that analytics professionals also develop their “softer” business skills, such as communication. Indeed, a lack of proper communication skills on the part of analytics sponsors and staff could have significant repercussions for the long-term viability of an analytics program.

Image courtesy of Flickr. By Greater Boston Chamber

Some would argue, at least for IT professionals, that technical skills trump communication skills. However, for “Big Data” project success, communication matters more than ever, especially when trying to win agreement from line-of-business managers, EVPs, CEOs, and boards of directors to fund, build, and staff your project team.

In fact, in most companies the story goes something like this: after conceptualizing the business problem and getting initial buy-in from management to move forward, you’ll need to write the business plan, collaborate with myriad stakeholders, negotiate various trade-offs, and work with Finance to calculate assorted financial projections such as TCO, IRR, and ROI.

Then, once you’ve achieved buy-in from Finance, Accounting, Legal, HR, LOB leaders, and more, you’ll need to build your executive leadership presentation and, yes, actually deliver it to senior leadership or the Board with aplomb.

And even once you get an analytics implementation started, you’ll surely need communication skills to report on milestone progress, goals achieved, and current and future business results. You may also have to explain technical concepts to people without a technical background. In short, it can be argued that communication skills are at the heart of “success or failure” in getting an analytics program off the ground and running.

Still not convinced? An analyst at a global advisory firm says that communication skills are an “area for improvement” for CIOs—especially for individuals who want to “help lead the transformation of their companies.” And another study found that 41% of survey respondents suggested communication skills were even more important than technical skills to ensure IT success!

If you are Linus Torvalds, perhaps you don’t have to worry about superb communication skills. However, that’s not good advice for the rest of us. And if you’re lacking compelling communication skills, fear not. There are multiple avenues available to help you brush up on skills such as writing, presenting, negotiating, and interpersonal communication. And some of these executive education programs might be covered by employer tuition reimbursement.

Technical skills and programming languages come and go, usually replaced by the next big thing. But communication skills never go out of style. In terms of business and technical communications, concise and impactful writing helps document objectives, deliverables, timelines, and results in an accessible manner. And the ability to present ideas effectively, with persuasive confidence, can help close the gap between “great idea” and actual project funding.

Analytics and Hedgehogs: Lessons from the Tampa Bay Rays

The Tampa Bay Rays spend significantly less on payroll than the wealthier teams in Major League Baseball, yet get results that are sometimes better than those of teams that wildly overspend. The Rays’ success boils down to two things: understanding how to be a hedgehog, and the continual application of statistics and analytics to daily processes.

The Greek poet Archilochus once said: “The fox knows many things, but the hedgehog knows one big thing.” Many interpretations of this phrase exist, but one characterization is a singular focus on a particular discipline, practice, or vision.

According to a Sports Illustrated article, “The Rays Way,” while Major League Baseball teams such as the Los Angeles Angels load up on heavy hitters such as Albert Pujols and Josh Hamilton, the Tampa Bay Rays instead maintain a hedgehog-like, almost maniacal focus on pitching.

For example, SI writer Tom Verducci says, “The Rays are to pitching what Google is to algorithms.” In essence, the Rays have codified methods (for developing young pitchers and preventing injuries) and daily processes (including exclusive stretching and strengthening routines) into a holistic philosophy of “pitching first.”

But enabling that hedgehog-like approach to pitching is a culture of measurement and analysis. To illustrate, the SI article mentions that pitchers are encouraged to have a faster delivery (no more than 1.3 seconds should elapse between the start of a pitch and the ball hitting the catcher’s glove). Pitchers are also instructed to throw the changeup on 15% of deliveries. And while other pitchers simply try to get ahead of batters, the Rays have discovered it’s the first three pitches that matter, with the third being the most important.

In terms of applying analytics, the Rays rely on a small staff of “Moneyball” statistical mavens who provide pitchers with a daily dossier of the hitters they’ll likely face, including the pitches they like and those they hate. Analytics also plays a part in how the Rays position their outfielders and infielders to field balls that might otherwise go into the books as hits.
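The Rays keep their actual tooling proprietary, but conceptually a daily dossier could be distilled from pitch-by-pitch logs along these lines; this Python sketch uses pandas, and the data and column names are invented:

    import pandas as pd

    # Invented pitch-by-pitch log; the Rays' real data and methods are proprietary.
    pitches = pd.DataFrame({
        "batter":     ["Cano", "Cano", "Cano", "Jeter", "Jeter", "Jeter"],
        "pitch_type": ["fastball", "changeup", "fastball", "curve", "fastball", "curve"],
        "outcome":    ["hit", "swinging_strike", "hit", "out", "hit", "out"],
    })

    # For each batter, estimate which pitch types they "like" (reach base on)
    # and which they "hate" (make outs against).
    pitches["reached_base"] = pitches["outcome"].eq("hit")
    dossier = (
        pitches.groupby(["batter", "pitch_type"])["reached_base"]
        .mean()
        .rename("success_rate")
        .reset_index()
        .sort_values(["batter", "success_rate"], ascending=[True, False])
    )
    print(dossier)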

The Rays are guarded about sharing their proprietary knowledge of processes and measurement, and for good reason: last year they had the lowest earned run average (ERA) in the American League and held batters to the lowest batting average (.228) in forty years. Even better, they’ve done this while spending roughly 70% less than the big-market teams and winning 90+ games three years in a row. That’s nailing the hedgehog concept perfectly!

Seeing a case study like this, where a team or organization spends significantly less than competitors and gets better results, can be pretty exciting. However, an element of caution is necessary. It’s not enough to simply follow the hedgehog principle.

The strategy of a hedgehog-like “focus” can be highly beneficial, but in the case of the Tampa Bay Rays, it’s the singular focus on a critical aspect of baseball (i.e., pitching), joined with analytical processes, skilled people, and the right technologies, that really produces the winning combination.

Should You Be Wary of Big Data Success Stories?

For every successful “Big Data” case study featured in Harvard Business Review, Fortune, or the like, there are thousands of failures. It’s a problem of cherry-picking “success stories,” or assuming that most companies are harvesting extraordinary insights from Big Data analytics projects, when in fact there is a figurative graveyard of big data failures that we never see.

Courtesy of Flickr by timlewisnm.

“Big Data” is a hot topic. There are blogs, articles, analyst briefs, and practitioner guides on how to do “Big Data analytics” correctly. And case studies produced by academics and vendors alike seem to suggest that everyone is having success with Big Data analytics (i.e., uncovering insights and making lots of money).

The truth is that some companies are having wild success reporting, analyzing, and predicting on terabytes, and in some cases petabytes, of Big Data. But for every eBay, Google, Amazon, or Razorfish, there are thousands of companies stumbling, bumbling, and fumbling through the process of Big Data analytics with little to show for it.

One recent story detailed a CIO who ordered his staff to acquire hundreds of servers with the most capacity available. He wanted to proclaim to the world, and on his resume, that his company had built the largest Hadoop cluster on the planet. Despite staff complaints of “where’s the business case?” the procurement and installation proceeded as planned until the company could claim Hadoop “success.” And as you might suspect, within 24 months the CIO moved on to greener pastures, leaving the company with a mass of hardware, no business case, and only a fraction of the potential “Big Data” business value.

In an Edge.org article, author and trader Nassim Taleb highlights the problem of observation bias or cherry-picking success stories while ignoring the “graveyard” of failures. It’s easy to pick out the attributes of so-called “winners”, while ignoring that failures likely shared similar traits.

In terms of charting Big Data success, common wisdom says it’s necessary to have a business case, an executive sponsor, funding, the right people with the right skills and more. There are thousands of articles that speak to “How to win” in the marketplace with Big Data. And to be sure, these attributes and cases should be studied and not ignored.

But as Dr. Taleb says, “This (observation) bias makes us miscompute the odds and wrongly ascribe skills” when in fact in some cases chance played a major factor. And we must also realize that companies successfully gaining value from Big Data analytics may not have divulged all their secrets to the press and media just yet.
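A quick simulation makes the graveyard visible. Give enough companies a pure coin-flip shot at a “successful” year, and a handful will string together streaks that look exactly like skill; this Python sketch fabricates the whole scenario:

    import random

    random.seed(42)

    # 10,000 hypothetical companies, each with a 50/50 chance of a
    # "successful" year. No skill is involved anywhere.
    companies, years = 10_000, 5
    lucky_streaks = sum(
        all(random.random() < 0.5 for _ in range(years))
        for _ in range(companies)
    )

    # Roughly companies * 0.5**years will "succeed" five years running.
    print(f"{lucky_streaks} of {companies} look like Big Data geniuses by luck alone")

Study the winners’ attributes all you like; the coin never had any.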

The purpose of this article isn’t to dissuade you from starting your “Big Data” analytics project. And it shouldn’t cause you to discount the good advice and cases available from experts like Tom Davenport, Bill Franks, Merv Adrian and others.

It’s simply counsel that for every James Simons—who makes billions of dollars finding signals in the noise—there are thousands of failed attempts to duplicate his success.

So read up on “Big Data” success stories in HBR, McKinsey, and the like, but be aware that these cases probably don’t map exactly to your particular circumstances. What worked for them may not work for you.

Proceed with prudence and purpose (and tongue in cheek, pray for some divine guidance and/or luck) to avoid the cemetery of “Big Data” analytics projects that never delivered.

Technologies and Analyses in CBS’ Person of Interest

Person of Interest is a broadcast television show on CBS in which a “machine” predicts the person most likely to die within 24-48 hours. Then it’s up to a mercenary and a data scientist to find that person and help them escape their fate. A straightforward plot, really, but not so simple in terms of the technologies and analyses behind the scenes that could make a modern-day prediction machine a reality. I have taken the liberty of framing some components that could be part of such a project. Can you help discover more?

In Person of Interest, “the machine” delivers either a single name or a group of names predicted to meet an untimely death. However, in order to predict such an event, the machine must collect and analyze reams of big data and then produce a result set, which is then delivered to “Harold” (the computer scientist).

In real life, such an effort would be a massive undertaking on a national basis, much less by state or city. However, let’s set aside the enormity (or plausibility) of such a scenario and instead see if we can identify various technologies and analyses that could make a modern-day “Person of Interest” a reality.

It is useful to think of this analytics challenge in terms of a framework: data sources, data acquisition, data repository, data access and analysis, and finally delivery channels.

First, let’s start with data sources. In Person of Interest, the “machine” collects data from sources such as cameras (images, audio, and video), call detail records, voice (landline and mobile), GPS location data, sensor networks, and text (social media, web logs, newspapers, the internet, etc.). Data sets stored in relational databases, both publicly available and not, might also be used for predictive purposes.

Next, data must be assimilated or acquired into a data management repository (most likely a multi-petabyte bank of computer servers). If data are acquired in near real time, they may go into a data warehouse and/or Hadoop cluster (possibly cloud based) for analysis and mining purposes. If data are analyzed in real time, it’s possible that complex event processing (CEP) technologies, which analyze streams in memory, are used to process data “on the fly” and make instant decisions.
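To make the “on the fly” idea concrete, here is a toy sketch of CEP-style stream filtering in Python: a standing rule is evaluated against each event as it arrives, rather than querying data at rest. The event fields and the rule itself are invented for illustration.

    # Toy complex event processing: apply a standing rule to events as they
    # stream past, instead of querying a database after the fact.
    def event_stream():
        # Stand-in for a live feed of sensor/camera/call events (invented).
        yield {"person": "A", "location": "dock", "hour": 3}
        yield {"person": "B", "location": "office", "hour": 14}
        yield {"person": "C", "location": "dock", "hour": 2}

    def flag_suspicious(events):
        for event in events:
            if event["location"] == "dock" and event["hour"] < 5:  # standing rule
                yield event                                        # instant decision

    for alert in flag_suspicious(event_stream()):
        print("ALERT:", alert)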

Analysis can be done at various points: during data streaming (CEP), in the data warehouse after data ingest (which could take just a few minutes), or in Hadoop (batch-processed). Along the way, various algorithms may be running to perform functions such as:

  • Pattern analysis – recognizing and matching voice, video, graphics, or other multi-structured data types. Could be mining both structured and multi-structured data sets.
  • Social network (graph) analysis – analyzing nodes and links between persons, possibly using call detail records and web data (Facebook, Twitter, LinkedIn, and more); see the sketch after this list.
  • Sentiment analysis – scanning text to reveal meaning, as when someone says, “I’d kill for that job.” Do they really mean they would murder someone, or is this just a figure of speech?
  • Path analysis – what are the most frequent steps, paths and/or destinations by those predicted to be in danger?
  • Affinity analysis – if person X is in a dangerous situation, how many others just like him/her are also in a similar predicament?
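For the social network (graph) item, here is a tiny Python sketch using the open source networkx library; the call-detail records are fabricated:

    import networkx as nx

    # Fabricated call-detail records: (caller, callee) pairs.
    calls = [("harold", "john"), ("john", "suspect"), ("suspect", "victim"),
             ("victim", "coworker"), ("harold", "victim")]

    g = nx.Graph()
    g.add_edges_from(calls)

    # High betweenness centrality often marks brokers or intermediaries
    # sitting between many communication paths.
    for person, score in sorted(nx.betweenness_centrality(g).items(),
                                key=lambda kv: -kv[1]):
        print(f"{person}: {score:.2f}")

    # Shortest chain of contacts linking two persons of interest.
    print(nx.shortest_path(g, "harold", "suspect"))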

It’s also possible that an access layer is needed for BI-style reporting, dashboards, or visualization.

Finally, the result set (in this case, the name of the person “the machine” predicts is most likely to be killed in the next twenty-four hours) could be delivered to a device in the field: a mobile phone, tablet, computer terminal, etc.

These are just some of the technologies that would be necessary to make a “real life” prediction machine possible, just like in CBS’ Person of Interest. And I haven’t even discussed networking technologies (internet, intranet, compute fabric etc.), or middleware that would also fit in the equation.

What technologies are missing? What types of analysis are also plausible to bring Person of Interest to life? What’s on the list that should not be? Let’s see if we can solve the puzzle together!

Three Primary Analytics Lessons Learned from 9/11

Regarding the Al Qaeda terrorist attacks of 9/11, it’s distressing to learn there were plenty of chances to stop them. From a lack of information sharing among US agencies, to terrorist database lookups, to airport security, each represented a missed opportunity to potentially prevent 9/11. And yet, even had authorities known what to look for, they were ultimately vulnerable to creative and inventive efforts to rain down destruction.

Image courtesy of Flickr. By yopse – Arnaud Montagard

In James Bamford’s Shadow Factory exposé on the United States National Security Agency (NSA), he cites missed opportunities to catch the 9/11 terrorists.

First, because of information silos and a lack of communication and information sharing between the NSA, CIA, and FBI, some of the 9/11 attackers were known to be in the United States but were unaccounted for. In fact, according to Bamford, two of the attackers were pulled over by the Oklahoma Highway Patrol for speeding just days before the attack. However, because they were not on any known “watch” list, they were given a speeding ticket and sent onward.

In addition, on 9/11 two terrorists were actually flagged by the computer-assisted passenger prescreening system (CAPPS) for purchasing one-way tickets with cash. Once flagged, the only consequence was extra screening of their checked luggage for explosives. Bamford writes: “Both Mihdhar and Moqed were flagged by CAPPS, but since Mihdhar had no checked luggage and Moqed wasn’t carrying any explosives, the procedures had no effect on their mission.” With no explosives found, both the luggage and the would-be terrorists were allowed to board the plane.

Finally, according to Bamford, the last line of defense on 9/11 was airport security screening. However, because aviation security policy at the time allowed knives with blades under four inches onto airplanes, it wasn’t a problem to get the pocket Leatherman tools onboard.

The intention of this column is not to disparage any government agency, nor delve into a discussion on which US leaders and officials knew what, and when. Instead, there are some lessons learned that can be applied across today’s enterprises for better analytics.

First, even if you know what insights you’re looking for, data silos may prevent discovery of the best answer. For most organizations, too many single-subject data marts and/or “spreadmarts” prevent an integrated view of the business. In this instance, the NSA infrastructure acted as a data “vacuum cleaner,” capturing all kinds of voice, video, text, and more on potential terrorists. However, the FBI and CIA also had their own data silos; there was no integrated view. In defense of each organization and its respective mission, and because of privacy concerns, this was purposeful.

Second, your organizational processes may prevent success in analytics. Bamford notes that the NSA was in many instances monitoring communications of Al Qaeda abroad. It even knew days before 9/11 that “something big” was going to happen. Unfortunately, the NSA didn’t know what was going to happen, nor where. And because of existing laws, organizational silos, and “not my business” behaviors, information that might have prevented the 9/11 attacks was not shared across agencies. Some companies have a similar “dog eat dog” culture (think Liar’s Poker) that does not encourage information sharing. These companies will ultimately have less success with corporate analytics.

Third, even when companies invest in the best analytical technologies and people, they can still be subjected to creative strategies that outwit their best efforts. For example, even with the NSA listening to terrorist conversations, mining keywords in voice and data with sophisticated algorithms, and making every effort to track terrorists across the globe, government agencies were unprepared for the use of airplanes as terrorist weapons. The inconceivable, or in Donald Rumsfeld parlance the “unknown unknowns,” often cannot be discovered, even with the best people, processes, and technologies.

Big Data Analytics the Ultimate Solution for HR Woes?

A terrible global economy, too few jobs, too many applicants, and far too many resumes. Sounds like a ripe opportunity for “Big Data” analytics to sort through the plethora of personalities and find needles in the proverbial haystack, doesn’t it? Not so fast. In a rush to use sophisticated algorithms to find and hire the right people, employers may be confusing correlation with cause.

The Wall Street Journal published an article highlighting how companies are using analytics and personality tests to sort through thousands of applications for limited job openings. With too many applications, employers are resorting to machine analytics to parse resumes, discover keywords, and prioritize potential interviewees.

Image courtesy of Flickr. By quinn.anya

Xerox, for example, says it uses hiring software for all 48,700 of its call center jobs to find employees who have personality, patience, and persistence. By having software mine and “score” the best candidates for these types of jobs, Xerox claims this analytics approach has cut attrition by 20%.

With a tough global economy and high unemployment rates, employers are deluged with stacks and stacks of resumes. That’s where Big Data analytics comes into play. Machines are increasingly reading and scoring applicants for call-backs and interviews. And personality tests are chock full of data, which are then used to predict the suitability of candidates for a specific job based on how they answer a battery of questions.

Such tests and software are helping employers “gauge an applicant’s emotional stability, work ethic and attitude toward drugs and alcohol,” according to the WSJ article. And these algorithmic approaches are arguably saving companies money in hiring, attrition, and fraud costs.

However, potential employees are figuring out how to game the system. Resumes are now routinely stacked with so many buzzwords that if a human HR professional reviewed them, they’d be close to incomprehensible. And there are plenty of online forums with tips on how to outmaneuver personality tests. It seems like a cat-and-mouse game with no clear winner.

Worse, employers are confusing correlation with cause. By claiming hiring software lowers attrition or reduces fraud, employers are only focusing on the front end of the employment lifecycle (hiring processes) and missing the bigger picture.

For example, if an employer states that hiring analytics have reduced call center attrition by 20%, they fail to recognize the hundreds of other factors that determine whether an employee decides to stay with a company, including culture, work environment, direct reports, salary, incentives, economic conditions, and more.

Ultimately, declarative statements such as “this software cuts hiring costs by 20%” are made by people who don’t understand that there is almost always a “web of causation,” especially in complex decision-making processes such as hiring.
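To see why the confusion matters, consider a Python simulation in which attrition is driven entirely by a confounder, local job-market conditions, while the hiring software does nothing at all. Everything here is fabricated for illustration:

    import random

    random.seed(1)

    # Fabricated scenario: attrition depends ONLY on local job-market "heat."
    # Sites that adopted the software just happen to sit in cooler markets.
    sites = []
    for _ in range(1_000):
        uses_software = random.random() < 0.5
        market_heat = random.random() * (0.6 if uses_software else 1.0)  # confounder
        quit_rate = 0.10 + 0.20 * market_heat  # attrition ignores the software
        sites.append((uses_software, quit_rate))

    def avg(rows):
        return sum(rate for _, rate in rows) / len(rows)

    print(f"Attrition with software:    {avg([s for s in sites if s[0]]):.1%}")
    print(f"Attrition without software: {avg([s for s in sites if not s[0]]):.1%}")

The software “cut attrition” by several points without doing a thing; the job market did all the work.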

As with anything, analytics should be used as a tool in key decision making processes. It’s better than flying blind. But let’s not draw the conclusion that such tools are beyond a shadow of a doubt helping companies hire better people for a given job.  We can have a degree of belief, but let’s leave certainty to the physicists and economists. Oh wait, they get it wrong too.