Privacy Ramifications of IT Infrastructure Everywhere

Most people don’t notice how thoroughly information technology pervades our daily lives. Granted, some IT infrastructure is out in the open and easy to spot, such as the computer and router on your desk hooked up via network cables. However, plenty of IT infrastructure is nearly invisible, residing in locked network rooms or heavily guarded data centers. And some is buried underneath city streets, arrayed on rooftops, or even camouflaged as trees at the local park. Let’s take a closer look at a few ramifications of IT infrastructure everywhere.

Courtesy of Flickr. By Jonathan McIntosh.

1.  Technology is pervasive and commonplace in our daily lives. Little is seen, much is hidden.

Good news: Companies have spent billions of dollars on wired and wireless connections that span cities, countries and oceans. This connectivity has enabled companies to ship work to lower-cost providers in developing countries, and allowed certain IT projects to “follow the sun” and thus finish faster. And because IT infrastructure is everywhere, it is that much easier for police forces and governments to identify and prosecute perpetrators of crime.

Bad news: This same IT infrastructure can also be used to monitor and analyze where and how people gather, what they say, whom they associate with, how they vote, and their religious and political views. Closed-circuit TV cameras on street corners (or concealed as mailboxes), ATMs, POS systems, red-light cameras, and drones make up a pervasive and possibly invasive infrastructure that never sleeps. You may be free to assemble; however, IT infrastructure might be watching.

2.  Some information technology is affordable or, in some cases, “free”, but the true costs may be hidden.

Good news: Google’s G+ or Gmail, Facebook, and Yahoo’s portal and email services are low or no cost for consumers and businesses. In addition, plenty of cloud providers such as Amazon, Google or Dropbox offer a base level of storage for documents or photos with no upfront hard-dollar cost. On the surface it appears we are getting something for practically nothing.

Bad news: There’s no such thing as a free lunch, as Janet Vertesi, assistant professor of sociology at Princeton, can attest. For months she tried to hide her pregnancy from Big Data, but she realized that Facebook, Google and other free “services” were watching her every post, email, and interaction in search of ways to advertise and sell her something. While she was not paying a monthly fee for these online services, there was in fact a “cost”—Vertesi was exchanging her online privacy for the ability of advertisers to better target her and serve appropriate advertising.

3. IT infrastructure is expected to be highly available. Smartphones, internet access, and computers are simply expected to work and be immediately available for use.

Good news: With IT infrastructure, high availability (four to five 9’s) is the name of the game. Anything less doesn’t cut it. Cloud services from IaaS to SaaS are expected to stay up and running, and phone networks are expected to have enough bandwidth to support our calls and web browsing—even at busy sporting events. And for the most part, IT infrastructure delivers time and again, meeting the expectation of consumers and businesses that technology is highly available.

Bad news: Not only is IT infrastructure always on, but thanks to Moore’s Law and the plummeting cost of disk storage, it never forgets. For example, when disk and tape space was expensive, closed-circuit TV systems would record a day’s worth of coverage and then write over it the next day. Now, multiple cameras can record 30 days of surveillance on an 80 GB hard drive. And we haven’t even mentioned offsite or cloud storage, which makes it possible to keep audio, video, documents, photos, call detail records and more—essentially forever. Youthful transgressions can be published for all time, and today’s mistakes are recorded for years to come. The internet never forgets—unless you live in the European Union.
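
To see what the 80 GB figure implies, here is a back-of-envelope sketch in Python. The four-camera count and the decimal-gigabyte convention are assumptions for illustration, not figures from the text:

```python
# Back-of-envelope check of the "30 days on an 80 GB drive" figure.
# Assumptions (not from the article): 4 cameras, GB = 10**9 bytes.

DRIVE_BYTES = 80 * 10**9   # 80 GB drive
DAYS = 30                  # retention window
CAMERAS = 4                # assumed number of cameras

seconds = DAYS * 24 * 60 * 60
bytes_per_camera_per_sec = DRIVE_BYTES / seconds / CAMERAS
kbps = bytes_per_camera_per_sec * 8 / 1000

print(f"~{kbps:.0f} kbit/s per camera")  # prints "~62 kbit/s per camera"
```

A low-resolution, low-frame-rate CCTV stream fits comfortably in that budget, which is why month-long retention on commodity disks became routine.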

In the book Sorting Things Out, Geoffrey C. Bowker and Susan Leigh Star use the term “infrastructural inversion” for the process of focusing on invisible systems—how they work—and how “people can change this invisibility when necessary”. IT infrastructure is one such system: it permeates our daily lives, often unseen but ever so critical to our societies.

There are undoubtedly other ramifications to this unseen IT infrastructure. Here’s hoping you’ll join the conversation with your thoughts!

Driving Data: A Slippery Ethical Slope?

When thinking about telematics, it’s easy to conjure up images of fleet tracking via GPS, satellite navigation systems for driving directions, or even the ubiquitous on-board security and diagnostic systems. However, what’s less understood is that data on your driving habits, locations and more are being collected, sometimes without your explicit knowledge.

Image courtesy of Flickr. By Michael Loke

Most people don’t realize that driving data are being collected in 80% of the cars sold in the United States.  According to an Economist article, event data recorders (EDRs) are installed in most cars to analyze how airbags are deployed.  Some EDRs can also record events such as “forward and sideway acceleration and deceleration, vehicle speed, engine speed and steering inputs.”

The Economist article also says EDR data can show if a driver stepped on the gas just before an accident, or how quickly brakes were applied. And EDRs can also record whether seat belts were locked. These data can be used to augment a police crash report, corroborate accident events as remembered by a driver, or even be used against a driver when negligence is suspected.

This brings to mind a key question: who owns these data? The Economist article says that if you are the car owner, it’s probably you. However, if your car is totaled in a crash and you sell it to the insurance company as part of a claim resolution, then it’s likely the insurance company now owns the data.

Data can be used for purposes advantageous and disadvantageous to a driver.

An MIT Technology Review article describes a new $70 device that hooks into your car’s EDR. The device wirelessly transmits data via Bluetooth to your mobile phone about your driving efficiency, the cost of your daily commute, and possible engine issues. The company providing the device can also deliver a “score” for your driving habits, gas savings and safety relative to other drivers.

Driving data can also be collected for things you did not intend. For example, a team of scientists used mobile phone location data gleaned from wireless networks to detect commute patterns from more than 1 million users over three weeks in the San Francisco Bay Area.

These scientists discovered that “cancelling some car trips from strategically located neighborhoods could drastically reduce gridlock and traffic jams.” In other words, some neighborhoods are responsible for a fair portion of Bay Area freeway congestion. The scientists claimed that by cancelling just 1% of trips from these neighborhoods, congestion for everyone else could be reduced by 14%.

Of course, drivers in urban areas could be incentivized to use public transportation, carpool or telecommute, but it’s also possible that a more heavy-handed government approach could restrict commutes from these neighborhoods—on certain days—“for the good of all.”

Data are, of course, benign. However, driving data from GPS and other devices are collected daily—and sometimes without your consent.

Altruistically, these data may ultimately be used to design better cars, better freeways and improve the overall quality of life for everyone concerned. Yet, it’s also important to realize that mobile data from daily road travels can also be utilized for tracking purposes, to pin down exactly where you are located at any given moment in time, and how you arrived.

And that thought should give everyone pause.

Will Pay-Per-Use Pricing Become the Norm?

CIOs across the globe have embraced cloud computing for myriad reasons; however, a key argument is cost savings. If a typical corporate server is utilized at just 5-10% over the life of the asset, then it’s fair to argue the CIO paid roughly 10x too much for that asset (relative to full utilization). Thus, to get better value, a CIO has two choices: embark on a server consolidation project, or use cloud computing models to access processing power and/or storage, when needed, on a metered basis.
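
The ~10x figure is simple arithmetic, sketched below with an assumed server price (the $10,000 is hypothetical; the 5-10% utilization range comes from the paragraph above):

```python
# Effective cost of an underutilized server: a sketch with assumed numbers.

server_cost = 10_000.0   # hypothetical purchase price over the asset's life
utilization = 0.10       # 10% average utilization (upper end of 5-10%)

# Cost per unit of capacity actually used, and the implied overpayment factor.
effective_cost = server_cost / utilization
overpay_factor = 1 / utilization

print(effective_cost)   # 100000.0 -> $100k per fully-used server's worth of work
print(overpay_factor)   # 10.0     -> the ~10x figure cited above
```

At 5% utilization the same arithmetic yields a 20x factor, which is why metered cloud pricing looks attractive even before any hardware savings.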

Cloud computing isn’t the only place where utility-based pricing is taking off. An article in the Financial Times shows how the use of “Big Data”, in terms of volume, variety and velocity, is stoking a revolution in real-time, pay-per-use pricing models.

The FT article cites Progressive Insurance as an example. With the simple installation of a device that measures driver speed, braking, location and other data points, Progressive can gather multiple data streams and compute a usage-based pricing model for drivers who want to reduce premiums. For example, rates may vary depending on how hard a customer brakes, how “heavy they are on the accelerator”, or how many miles they drive.

The installed device streams automobile data wirelessly back to Progressive’s corporate headquarters, where billing computations take place in near real time. Of course, the driver must be willing to embark on such a pricing endeavor, and possibly give up some privacy; however, this is often a small price to pay for a pricing model that correlates safer driving habits with a lower insurance premium.

And this is just the tip of the iceberg. Going a step further to true utility based pricing, captured automobile data points also make it possible to create innovative pricing models based on other risk factors.

For example, if an insurance company decides it is riskier to drive in certain locales, or from 2am-5am, it can attach a “premium price” to those decisions, thus letting a driver choose their insurance rate. Even more futuristic, it might be possible to be charged more or less based on how many passengers are riding with you!
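
A pricing model along these lines can be sketched as a simple function of telematics inputs. All the weights, rates, and field names below are hypothetical illustrations; Progressive’s actual model is proprietary and not described in the articles above:

```python
# A minimal sketch of usage-based insurance pricing. Every number here is
# invented for illustration, not taken from any real insurer's model.

def monthly_premium(base_rate, miles, hard_brakes, night_trips):
    """Adjust a base premium using telematics-style inputs."""
    mileage_charge = 0.02 * miles        # per-mile exposure
    braking_charge = 1.50 * hard_brakes  # hard-braking events this month
    night_charge = 5.00 * night_trips    # trips in the risky 2am-5am window
    return base_rate + mileage_charge + braking_charge + night_charge

# A cautious low-mileage driver vs. a heavier night-time commuter.
safe = monthly_premium(base_rate=60.0, miles=400, hard_brakes=2, night_trips=0)
risky = monthly_premium(base_rate=60.0, miles=1200, hard_brakes=15, night_trips=8)

print(safe, risky)  # the safer profile earns the lower premium
```

The point of the sketch is the shape of the model, not the numbers: each behavior the device can observe becomes a priced risk factor, and the bill follows the data stream in near real time.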

Whether it is utility-based pricing of electricity by time of day, cloud computing, or pay-as-you-go insurance, with the explosion of “big data” and other technologies it’s already possible to stream and collect various data, calculate a price and then bill a customer in a matter of minutes. The key considerations will be consumer acceptance of such pricing models (given the various privacy tradeoffs) and adoption rates.

If the million “data collection” devices Progressive has installed are any indication, not to mention the general acceptance of utility-priced cloud computing models, it appears we’ve embarked upon a journey from which it’s far too late to turn back.

How Much Big Data is Too Much?

With storage costs plummeting and software approaches to mining Big Data growing more sophisticated, it is increasingly cost effective for corporations and governments to keep all types of data, even data previously discarded. But how much “Big Data” should corporations, entities and governments keep online or archived, especially while “Right to Be Forgotten” debates are swirling?

Image Courtesy of Flickr

Like it or not, all kinds of data are captured every day. James Gleick, in The Information, sums it up nicely:

“The information produced and consumed by humankind used to vanish—that was the norm, the default. The sights, the sounds, the spoken word just melted away. Now the expectations have inverted. Everything may be recorded and preserved at least potentially; every musical performance, every crime, elevator, city street, every volcano or tsunami on the remotest shore…”

With petabytes of storage and virtual machines available in the cloud on a pay-per-use basis, and on-premise storage costs dropping like a rock, it’s conceivable for companies and governments to keep every image, video, recording, keystroke, and web-generated data type. Of course, all these data are of little use without techniques to mine them and perform information discovery. Fortunately, BI and data warehousing technologies have worked wonders over the past thirty to forty years for data that needs to be organized, and we have MapReduce/Hadoop to assist in assembling and analyzing the disorganized data garbage dump.
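
As a minimal illustration of the MapReduce style of analysis mentioned above, here is a toy word count over unstructured log lines (the log lines are invented; a real Hadoop job distributes the map and reduce phases across a cluster rather than a single process):

```python
# A toy MapReduce-style word count over unstructured log lines, sketching
# the kind of analysis the Hadoop/MapReduce model enables at scale.
from collections import Counter

lines = [
    "login alice ok",
    "login bob fail",
    "login alice ok",
]

# Map phase: emit one token per word. Reduce phase: sum counts per word.
mapped = (word for line in lines for word in line.split())
counts = Counter(mapped)

print(counts["login"])  # 3
print(counts["alice"])  # 2
```

The same map-then-reduce pattern, scaled across thousands of machines, is what lets an organization wring structure out of data it never bothered to organize in the first place.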

There are two consequences of this data deluge.

For individuals, there is the feeling of drowning in a sea of data that is difficult to manage, much less scrutinize. Novelist David Foster Wallace coined the term “Total Noise” for the feeling of drowning in a deep pool of too many tweets, posts, phone calls, podcasts and more. And because this total noise causes “information anxiety” for some, plenty of people are deleting their social media accounts.

And there is a second consequence of this data deluge. Since everything that can be captured is in the process of being captured, there are real privacy and security concerns. Our likes, rants, passions and partialities are recorded online and archived offline in perpetuity. These concerns have fomented potential privacy legislation such as the EU’s “Right to Be Forgotten”, under which digital providers—upon request—will need to cull digital references to individuals.

These consequences raise the question: how much Big Data is too much? What should be kept for corporate reasons (to serve customers better, sell more products, optimize business processes, etc.)? What should be kept for governmental concerns (tracking bank flows for money laundering, watching for potential terrorist activity, monitoring fringe groups that don’t see eye to eye with government officials)? And with legislation such as the “Right to Be Forgotten” pending in statehouses across the world, is it more hassle than it’s worth to keep all this Big Data, especially if there are financial penalties for non-compliance?


Has Personalized Filtering Gone Too Far?

In a world of plenty, algorithms may be our saving grace as they map, sort, reduce, recommend, and decide how airplanes fly, packages ship, and even who shows up first in online dating profiles. But in a world where algorithms increasingly determine what we see and don’t see, there’s danger of filtering gone too far.

The global economy may be a wreck, but data volumes keep advancing. In fact, there is so much information competing for our limited attention that companies are increasingly turning to compute power and algorithms to make sense of the madness.

The human brain has its own methods for dealing with information overload. For example, think about the millions of daily inputs the human eye receives, and how it transmits and coordinates that information with the brain. A task as simple as descending a shallow flight of stairs takes incredible information processing. Of course, not all received data points are relevant to the task of walking down a stairwell, so the brain must decide which data to process and which to ignore. And with our visual systems bombarded with sensory input from the time we wake until we sleep, it’s amazing the brain can do it all.

But the brain can’t do it all—especially not with the onslaught of data and information exploding at exponential rates. We need what author Rick Bookstaber calls “artificial filters,” computers and algorithms to help sort through mountains of data and present the best options. These algorithms are programmed with decision logic to find needles in haystacks, ultimately presenting us with more relevant choices in an ocean of data abundance.
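
An “artificial filter” of this kind can be sketched in a few lines: score items against a user profile, rank them, and surface only the top few, so everything below the cut never reaches the reader. The profile weights and items below are invented purely for illustration:

```python
# A minimal "artificial filter": rank items by an interest score and show
# only the top results. Weights and items are invented; real recommender
# systems are far more involved.

interests = {"privacy": 0.9, "sports": 0.1, "cloud": 0.6}  # assumed profile

articles = [
    ("Cloud pricing trends", {"cloud"}),
    ("Local sports roundup", {"sports"}),
    ("Privacy in the cloud", {"privacy", "cloud"}),
]

def score(tags):
    # Sum the user's interest weight for each tag on the item.
    return sum(interests.get(t, 0.0) for t in tags)

# The filter: rank by score and keep only the top two.
ranked = sorted(articles, key=lambda a: score(a[1]), reverse=True)
top = [title for title, _ in ranked[:2]]

print(top)  # ['Privacy in the cloud', 'Cloud pricing trends']
```

Note what happens to the sports roundup: it isn’t rejected, it simply never appears—which is exactly the quiet pre-cognitive sorting the next paragraphs worry about.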

Algorithms are at work all around us. Google’s PageRank presents us relevant results—in real time—drawn from web server farms across the globe. Online dating sites sort through millions of profiles, seeking compatible matches for subscribers. And Facebook shows us friends we should “like.”

But algorithmic programming can go too far. As humans are more and more inundated with information, there’s a danger in turning over too much “pre-cognitive” work to algorithms. When we let computers sort the friends we would “like”, pick the most relevant advertisements or best travel deals, and choose ideal dating partners for us, there’s a danger of missing the completely unexpected discovery or the most unlikely correlation. And even as algorithms “watch” and process our online behavior and learn what makes us tick, there’s still a high possibility that the results presented will be far from what we might consider “the best choice.”

With a data flood approaching, there’s a temptation to let algorithms do more and more of our pre-processing cognitive work. And if we continue to let algorithms “sort and choose” for us, we should be extremely careful to understand who designs these algorithms and how they decide. Perhaps it’s cynical to ask, but when it comes to algorithms we should always wonder: are we really getting the best choice, or the choice that someone or some company has ultimately designed for us?

*  Rick Bookstaber makes the case that personalized filters may ultimately reduce human freedom. He says, “If filtering is part of thinking, then taking over the filtering also takes over how we think.” Are there dangers in too much personalized filtering?