Category Archives: Technology Trends

Is there Too Much Complexity in the Cloud?

A recent analyst report suggests public clouds are prone to failure because they are inherently complex. However, just because there are multiple interacting objects in a particular environment, this doesn’t necessarily imply complexity.

Cloud computing is all the rage for business users and technology buyers. And why not? It provides a fast and flexible option for delivering information technology services. In addition, cloud computing drives value through higher utilization of IT assets, elasticity for unplanned demand, and scalability to meet business needs today and tomorrow.

However, there are risks in the cloud, especially in the public cloud, where business and news media regale readers with case studies of data loss, security issues, failed backups and more. Some analysts argue that public clouds are prone to failure, and perhaps always will be, because these environments are complex and tightly coupled. And if indeed this is the case, then IT buyers must consider that failure isn’t only possible, it’s inevitable.

Yet, first we must ask, are public clouds really complex environments?

To understand if a particular system is complex, we must understand if it has characteristics such as connected objects (nodes and links with interdependencies), multiple messages and transactions, hierarchies, and behavioral rules (instructions).

Public cloud services available from companies such as Microsoft, Google, and Amazon Web Services (AWS) often consist of various components such as applications (front end and back end, such as billing), controllers and message passing mechanisms, hardware configurations (disk, CPU, memory), databases (relational and NoSQL), Hadoop clusters and more. In addition, there are various management options (dashboards, performance monitoring, identity and access), and these environments typically operate with multiple users, multiple tenants (compute environments shared with more than one application and/or company), and sometimes span multiple geographies. And from a complexity standpoint, we haven’t even discussed the processes for building cloud environments, much less operating them.

In summary, in a cloud environment there are lots of moving parts interacting with each other (not necessarily in a linear fashion) within any given timeframe.

Multiple interacting agents can help define whether a particular environment is complex or not. Another key determinant is whether processes are tightly or loosely coupled. Richard Bookstaber, author of A Demon of Our Own Design, writes that tightly coupled systems have components that are critically interdependent, with little to no margin for error. “When things go wrong, (an error) propagates linked from start to finish with no emergency stop button to hit,” Bookstaber says. So a tightly coupled system is one where linkages (dependencies) are so “tight” that errors or failures cascade and eventually cause the entire system to fail.
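To make the idea of a cascade concrete, here is a minimal sketch in Python. It is an illustration only, not a model of any real cloud provider: the component names and the dependency map are hypothetical, and "slack" is reduced to a simple set of components that absorb an upstream failure instead of passing it on.

```python
from collections import deque

def cascade(dependents, start, buffered=frozenset()):
    """Return the set of components that fail when `start` fails.

    `dependents` maps a component to the components that rely on it.
    Components listed in `buffered` have slack: they absorb an upstream
    failure and do not propagate it any further.
    """
    failed = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, ()):
            if child in failed or child in buffered:
                continue  # already down, or protected by a buffer
            failed.add(child)
            queue.append(child)
    return failed

# Hypothetical service chain, invented for this example.
dependents = {
    "storage": ["database"],
    "database": ["billing", "api"],
    "api": ["dashboard"],
}

tight = cascade(dependents, "storage")                         # no slack anywhere
loose = cascade(dependents, "storage", buffered={"database"})  # one buffer in place

print(sorted(tight))
print(sorted(loose))
```

With no buffers, the single storage failure reaches every component in the chain. With one buffered component, the failure stops at its source, which is the "emergency stop button" a tightly coupled system lacks.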

This discussion is important from a risk management perspective for cloud computing. If we believe that data is one of the most valuable assets of a corporation, and if we believe public clouds are complex environments with tightly coupled components that have little to no slack (buffers) to stop failures, then there should be a set of practices and processes in place to manage the potential risk of data breach, theft, loss or corruption.

So what say you, should public clouds be considered “complex” environments?  Are they “high risk” systems prone to failure?

Of Baby Black Swans and the Race to Zero

Defined as extreme events with high impact, Black Swans are infrequent occurrences that pack a punch (e.g., in financial markets, the 2008 crisis or the 2010 flash crash). However, a new study shows that as machine trading and speed intertwine, these extreme events are occurring more often than previously imagined. As markets continue to connect and participants become linked, each extreme bounce and/or collision may slowly break the system.

Nassim Nicholas Taleb is the person most responsible for burning the concept of low probability, high impact events into the minds of global business executives.  Coining the term “Black Swans” as the name for extreme outliers with devastating consequences, Taleb has put executives on notice that they need more built-in redundancy and should incorporate slack in business processes to cushion against failure.

However, as technology proliferates and speeds up processes, humans appear to be increasingly removed from decision making. Thus, ensuring a little slack in the system may not be enough to protect against system meltdown.

Take for example a complex “system” such as global financial markets. In an effort to gain competitive advantage, computer scientists, quants, and software programmers are building machines that scan data streams, analyze, and decide trading strategies in microseconds. These individuals (sometimes hedge fund managers) or corporations (such as larger investment banks) are shrinking the window for decision making down to levels where humans cannot react fast enough: microseconds today and nanoseconds in the future.

Trading equities is now a technological “arms race”, where companies compete buying and selling at near light speed. And while the concept of using speed for competitive advantage doesn’t sound like such a bad idea, there are also ramifications for a race to zero.

The first issue with this trading arms race is the exclusion of participants who cannot afford the requisite technology. Just as it takes nearly a billion dollars to win a US election, thus ensuring few can join the fray, it takes multi-millions to build and co-locate ultra-fast computerized trading platforms. A second issue is that as trading nears the speed of light, there is ultimately less and less slack in the system to correct trading errors. And since financial markets are tightly coupled, one single error in a fragile system can cascade with cataclysmic results.

Trading at near light speed, in an already fragile and tightly coupled system, is driving more extreme events, which appear to be fracturing global markets. And contrary to common belief, these events aren’t just happening once every two to three years.

A team of physicists, system engineers, and software programmers recently published a paper suggesting that abrupt “events” are occurring in the financial markets much more often than previously thought. In fact, over the years 2006-11, the authors report a total of 18,520 spikes in stock movements. These extreme events (I’ll call them baby black swans) arguably should have a low probability of occurring according to normal distribution statistical models.

The aforementioned study notes: “There is far greater tendency for these financial fractures to occur, within a given duration time window, as we move to smaller timescales.” This means that in financial markets, as faster computers slice decision making windows down to nanoseconds, we should expect more volatility. Moreover, if a given system is not designed to handle extreme volatility, there is a high probability of fissures and potential for total system breakdown.
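To see why extreme moves “should” be rare under a normal distribution model, here is a rough back-of-the-envelope sketch in Python. It is not the study’s method; the thresholds and the sample size of one million observations are assumptions chosen purely for illustration, and it treats each observation as an independent draw from a normal distribution.

```python
import math

def two_sided_tail(k):
    """Probability that a standard normal variable lies more than
    k standard deviations from its mean, in either direction."""
    return math.erfc(k / math.sqrt(2))

def expected_events(k, observations):
    """Expected number of moves beyond k standard deviations
    among `observations` independent normal draws."""
    return two_sided_tail(k) * observations

# Hypothetical thresholds and sample size, for illustration only.
for k in (3, 5, 8):
    print(k, two_sided_tail(k), expected_events(k, 1_000_000))
```

Under this model, a move beyond three standard deviations has a probability of roughly 0.0027, and the probability falls off very quickly from there. If a market produces large moves far more often than such a model predicts, the model is understating the tails.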

In 2010’s Flash Crash, the US stock market plunged 1000 points in nine minutes and then regained those losses just as fast.  Never before had market participants seen thousand point swings within a ten minute timeframe. If the authors in the study cited in this article are correct, this kind of extreme volatility is only the beginning.

Questions:

  • Is this “race to zero” latency risky, or is this much ado about nothing?
  • Speed is a competitive advantage. Do you see a similar “race to zero” in decision making processes in other industries?

Are Data Scientists the Next Masters of the Universe?

Back in the late 1970s, traders buying and selling mortgages were pushed aside for new masters of the universe: “quants,” or individuals who used mathematics to slice and dice mortgages into debt tranches. And in the same way, today’s traditional Business Intelligence (BI) professionals must be looking over their collective shoulders as business and IT publications tout the emerging role of “data scientist”.

Before Lew Ranieri came on the scene, mortgages were a very staid business. Banks would loan money and keep assets on the books for up to thirty years (depending on how quickly the loan was paid back). Except for underwriting skills, there wasn’t much complexity to the mortgage business.

As a trader for Salomon Brothers, Lew Ranieri changed all that. Ranieri’s insight was that mortgages could be bundled together and then sliced into different tranches of varied risk. This slicing exercise was quite complex because of home buyers’ ability to prepay their loans early or refinance. Michael Lewis, of Liar’s Poker fame, writes: “Mortgages were acknowledged to be the most mathematically complex securities in the marketplace. The complexity arose entirely out of the option the homeowner has to prepay his loan…mortgages were about math.”

Suddenly the very boring business of home loans became a very complex business challenge: how to slice the pie based on risk profiles and cash flows from interest and principal. Lewis writes: “Different investors place different prices on risk. Risk could be canned and sold like tomatoes.” And this mathematical complexity demanded a new skill set, quantitative analysis, to perform the necessary mathematical modeling to ensure investment banks remained profitable in this new business.

Pushed out by a new breed of mathematical whizz-kids, many former investment bankers and traders either retired or left for smaller financial firms. And the rise of the quants, the new masters of the universe, was complete by the mid-1980s.

Is a similar shift happening in the field of Business Intelligence with the emerging “data scientist” role? The skill set of today’s data scientist is much broader than that of someone who solely performs BI or ETL application development. With new sources and types of data (e.g., multi-structured), the data scientist must be able to develop new data-driven products such as churn models, create recommendation algorithms, assist marketers with behavioral segmentation and targeting, and more.

But that’s not all. Fellow SmartDataCollective contributor Daniel Tunkelang says the data scientist “also needs to possess creativity and strong communication skills. Creativity drives the process of hypothesis generation, i.e., picking the right problems to solve that will create value for users and drive business decisions.” It’s a tall order to find all these skill sets in one person, much less build an internal competency center with such talent.

Perhaps for the foreseeable future, there’s room for both traditional BI professionals and the new breed of data scientists, as today both are valuable contributors in the field of analytics. However, with data growth on a fast-paced exponential curve, not to mention the complexity and velocity of multi-structured data, it’s easy to see how the mix of skill sets needed to succeed in the future will tilt more in favor of the data scientist role.

The mortgage bankers never saw Lew Ranieri coming. Regarding the rise of data scientists, should traditional BI professionals be worried?

Why Capacity Management Matters For Countries…and Data Warehouses

A quick glance at business news shows that few things are growing these days. Economies are slowing down (China), stagnant (USA) or in recession (Greece, Spain, Portugal et al), and even 401(k)s are barely hanging on with the new normal of 1-2% growth (if you’re lucky). And while global CEOs, presidents, and prime ministers stay up at night worrying how to ignite growth, in many instances the foundation (or infrastructure) to support growth is severely lacking.

Growth is a national and global obsession. Countries seek to grow their industry competitiveness and tax base. Companies seek to grow their customer base, revenues and profitability. Individuals seek to grow job skills, knowledge and paychecks.  And there are few corporate or country agendas that lack plans for creation of business, innovation, and even employment. In short, “growth” is a hot topic.

However, in terms of growth, it’s difficult to get from Point A to B without the supporting infrastructure.

For example, the Financial Times reports that while Indonesia has been one of the few bright spots for growth (6% in 2011), there’s plenty of missed opportunity. There are more goods coming in and out of the country than Indonesia’s airports, roads, and bridges can handle. Aircraft wait in the sky for places to land, boats waste time as they cannot dock and unload their cargo, and goods sit on the ground longing for delivery trucks. Indonesia’s creaking and decayed infrastructure is costing its citizens plenty of opportunity.

And breakneck growth in India is also an issue, as poor planning and lack of infrastructure hurt this country’s prospects for a brighter future. Another Financial Times article mentions how some Indian mega-cities haven’t kept up with the massive influx of people seeking factory jobs: “Once pleasant (towns) are congested, chaotic mess(es) with snarled traffic, housing shortages and a frenzied edge.” The article also cites overwhelmed sewers, poor drainage, and lack of city planning where slums sit next door to factories. “It’s just haphazard growth,” says one businessman.

Growth without a plan isn’t a recipe for success, and growth without investment in infrastructure leads to an execution nightmare.

That’s why capacity management is so important. Proper capacity management seeks to spare companies (and countries) from hardship by anticipating growth trends so leaders can lay a foundation to support innovation and drive financial value. Capacity Management seeks to understand a “current state”, and also provide a vision of “where you want to be in the future”. These plans examine today’s data, extrapolate trends, and recommend a path of investment in talent and infrastructure to meet the needs of tomorrow and possibly three, five and even ten years from now.

And some trends are obvious, especially predictions for growth in “Big Data,” where firms like IDC anticipate that by 2020, the amount of data generated each year will reach 35 zettabytes. Linear planning strategies in this instance aren’t going to help your company survive this data onslaught.
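A small sketch in Python shows how quickly a linear plan can fall behind compounding growth. The numbers are hypothetical (a 100 TB warehouse that grew by 40 TB, or 40%, last year) and are not drawn from IDC or any real capacity plan.

```python
def linear_projection(start, yearly_increase, years):
    """Capacity needed if growth adds the same fixed amount every year."""
    return start + yearly_increase * years

def compound_projection(start, yearly_rate, years):
    """Capacity needed if growth compounds at the same rate every year."""
    return start * (1 + yearly_rate) ** years

# Hypothetical starting point: 100 TB today, up 40 TB (40%) over last year.
start_tb = 100
for years in (1, 3, 5, 10):
    linear = linear_projection(start_tb, 40, years)
    compound = compound_projection(start_tb, 0.40, years)
    print(f"year {years}: linear {linear:.0f} TB, compound {compound:.0f} TB")
```

Both projections agree after one year, which is exactly why a trend can look linear at first. By year five, the linear plan calls for 300 TB while the compounding one calls for roughly 538 TB, and the gap keeps widening after that.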

Examine trends in your industry. What kinds of growth do you see? What trends appear linear but could explode into exponential curves? How would you rate your infrastructure today, and its ability to accommodate future growth?

Is your infrastructure ready to welcome new customers, revenues and profits? Have you the right talent and skills to succeed today and five years from now?

Or will you, like India and Indonesia, leave opportunity sitting on the doorstep?

Bringing SQL-MapReduce Capabilities to Life

Most data management professionals know that multi-structured data such as Web server logs, social media and sensor data abound in their enterprise. However, they may lack a clear view on how to derive value from it. The volumes of information generated from these sources are often referred to as “big data,” which speaks to its complexity, variety and velocity.

Here is a link to my latest Teradata Magazine article on Aster Data’s Analytic Pipeline service. Enjoy. View PDF
