How To Work With Big Data


Let’s start with the basics.  Big data refers to complex sets of data that are too large for analysis by traditional analytical applications.

For a business, it can be difficult to learn accurately from the data gathered on consumers. Capturing data efficiently is challenging, and analyzing that data can be even more challenging. So the big question here is: how do you use data to predict user trends and user behavior, and how do you extract value from the data you have acquired?

What makes big data challenging is its tremendous size, the speed at which so much data has to be analyzed, and the difficulty of zooming in on which data is meaningful, given the wide variety of data being recorded. Interpreting your big data accurately can make your consumer strategy more efficient and reduce the risk in your company’s investments. You should regularly check how effectively you are interacting with your data. Ideally, if you can accurately read consumer trends, then it should be easier to make a successful business decision based on that data.

There are a few different methods for visually representing your collected data. But which visualizations will show us the consumer trends that we don’t expect to see? Think about a few standard website analytics: page views, clicks, and bounce rate. Using a line graph, you can visualize and compare these numbers across days, weeks, and months. But how do you interact with those numbers to make smart investments?
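As a rough illustration of comparing those numbers across weeks, here is a minimal Python sketch that rolls daily page views, clicks, and bounce rate up into weekly totals, which is the kind of series you would then plot on a line graph. The sample rows, the `DAILY` list, and the `weekly_summary` function are hypothetical, and bounce rate is assumed to mean bounces divided by sessions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily rows: (day, page_views, clicks, bounces, sessions).
DAILY = [
    (date(2024, 1, 1), 1200, 300, 400, 800),
    (date(2024, 1, 2), 1000, 250, 200, 700),
    (date(2024, 1, 8), 1500, 400, 300, 1000),
    (date(2024, 1, 9), 1300, 350, 200, 1000),
]


def weekly_summary(rows):
    """Roll daily analytics rows up into (ISO year, ISO week) buckets."""
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for day, views, clicks, bounces, sessions in rows:
        iso = day.isocalendar()
        bucket = totals[(iso[0], iso[1])]
        bucket[0] += views
        bucket[1] += clicks
        bucket[2] += bounces
        bucket[3] += sessions

    summary = {}
    for week, (views, clicks, bounces, sessions) in sorted(totals.items()):
        summary[week] = {
            "page_views": views,
            "clicks": clicks,
            # Assumed definition: bounce rate = bounces / sessions.
            "bounce_rate": bounces / sessions if sessions else 0.0,
        }
    return summary


if __name__ == "__main__":
    for week, stats in weekly_summary(DAILY).items():
        print(week, stats)
```

Summing the raw bounces and sessions before dividing, rather than averaging the daily rates, keeps a quiet day from counting as much as a busy one.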

One method for interacting with big data is to let that data permeate your company: the 50 monkeys on 50 typewriters technique. If your entire staff has hands-on tools for interacting with your big data analytics, then they can all problem-solve and innovate with that data. One of the best practices for extracting value from your big data is to invest in an environment for visualizing it. You can have data analytics for every single human thought, but if you don’t organize it into a logical environment, then the data is useless. Given its complexity, big data is easily distorted. Establishing an accurate data representation is the most important thing that you can do to help guide your company’s investments.

Innovation will develop from cross-referencing and combining various data values, including ones that may appear to be disconnected. It is challenging to integrate disparate data sets, but integration is essential to understanding the relationship between your business and the consumer. Let’s look at an example. IBM and The Weather Channel partnered in an attempt to counteract the economic effects of weather, which cost businesses across the US billions of dollars. The partnership intends to forecast the weather by collecting over 10 billion data points from phones, vehicles and other sensors. Businesses can use the data to adjust staffing and other expenses. Upfront this is an expensive and challenging venture, but it will likely save far more than it costs in the future. When you look at massive amounts of data holistically, you can make intelligent investments. If you were looking only at individual segments of data, those logical investment choices might not be visible.
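To make the idea of integrating disparate data sets a little more concrete, here is a small Python sketch that joins daily sales with weather conditions by date and averages sales per condition. All of the figures, the `sales` and `weather` dictionaries, and the `average_sales_by_condition` function are invented for illustration; this is not the method used by IBM or The Weather Channel.

```python
# Hypothetical daily sales totals, keyed by date.
sales = {
    "2024-06-01": 5200.0,
    "2024-06-02": 3100.0,
    "2024-06-03": 5400.0,
    "2024-06-04": 2900.0,
}

# Hypothetical weather observations from a separate source, keyed by date.
weather = {
    "2024-06-01": "clear",
    "2024-06-02": "rain",
    "2024-06-03": "clear",
    "2024-06-04": "rain",
    "2024-06-05": "clear",  # no matching sales record, so it is ignored
}


def average_sales_by_condition(sales, weather):
    """Join the two data sets on date, then average sales per condition."""
    grouped = {}
    for day, amount in sales.items():
        condition = weather.get(day)
        if condition is None:
            continue  # keep only days present in both data sets
        grouped.setdefault(condition, []).append(amount)
    return {c: sum(values) / len(values) for c, values in grouped.items()}


if __name__ == "__main__":
    print(average_sales_by_condition(sales, weather))
```

Neither data set shows the pattern on its own. Only after the join can you see that, in this made-up sample, rainy days bring in less revenue, which is the kind of signal a business could use to adjust staffing.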

How to successfully incorporate big data into business strategies is still mostly a mystery. The recent development of virtual reality in the data analysis field has the potential to transform that. Your brain absorbs information in three dimensions; if we can develop smart data representations in 3D, then we will understand the data much more efficiently. Two-dimensional data representations are ubiquitous right now. Biologically speaking, humans are poorly designed for comprehending the complexity and multidimensionality of big data. 2D visualizations reduce their data sets so that we can understand them. If we could visualize multidimensional data sets and simultaneously have the ability to separate that data into individual parts, it would be much easier to understand data patterns. Virtual reality has the potential to create these nuanced representations.

VR technology is being incorporated into big data analysis in small increments. We are still waiting on the first major VR data visualization for goal-oriented data discoveries. The recent VR trend in video game development has rapidly advanced this technology. Google’s VR software has been open sourced, and Oculus Rift is a popular VR technology owned by Facebook. Software development for VR games is a niche market right now, but VR games are to their predecessors what sound cinema was to silent film. It is only a matter of time before VR is ubiquitous in games and before it transcends games. VR will be used for all types of data interactions. I predict that the proliferation of VR hardware in the coming years will mirror the path of PCs from the 1980s to their ubiquity today. The tech is becoming more powerful, more affordable and more convenient.

Currently, our best 2D data projections consist of complex data sets built into digital environments that allow the viewer to manipulate the model. Within a 3D VR projection, users can navigate their data from any angle they choose. Scaling to the user’s point of view is difficult in 2D modeling. It can be particularly challenging when you have varied data sets and want to find a specific value. Smaller details tend to get lost in 2D abstractions that are trying to represent big-picture analytical comparisons. The interactive capability of VR will solve this problem. Accounting for individual bits of data within a mass is an essential building block for 3D visualizations. Depth of detail will ultimately depend on the specific VR environment, but 3D is capable of incredibly deep representations, while 2D is not. Imagine that you are looking at a city that is 4 square miles. Now imagine trying to understand that city by looking at it in a profile view, and only seeing 2 miles. From that perspective, you can stack the buildings on top of each other or line them up in one long line. You can still use that profile, or 2D visualization, to calculate the average living space per person.

But imagine trying to calculate the average daily walking distance for each person, or the average distance from a house to a store, or other specific data. These calculations can be made with the 2D visualization, but they are much more complex and challenging to comprehend. Now imagine a scalable 3D interactive simulation of that same city. You can walk around the city and you have interactive tools for breaking it into individual parts. The calculations, though complex, are much easier to wrap your mind around. Your brain absorbs a story faster than it can internalize a raw data set, and digital visualizations create visual storytelling. Without the right framework you will not recognize important patterns in your data. With enough people studying your data you will get insights, just as you would when crowdsourcing any problem.
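For a sense of what one of those city calculations involves once every building has a position on a map, here is a minimal Python sketch of the average distance from a house to its nearest store. The coordinates are invented, measured in miles on a 2 by 2 mile grid to match the 4 square mile city, and the sketch assumes straight-line distance rather than actual walking routes.

```python
import math

# Hypothetical (x, y) positions in miles within a 2 x 2 mile city.
houses = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
stores = [(0.0, 1.0), (2.0, 1.0)]


def average_distance_to_nearest_store(houses, stores):
    """Average straight-line distance from each house to its closest store."""
    nearest = [min(math.dist(house, store) for store in stores) for house in houses]
    return sum(nearest) / len(nearest)


if __name__ == "__main__":
    miles = average_distance_to_nearest_store(houses, stores)
    print(f"Average distance to the nearest store: {miles:.2f} miles")
```

The calculation needs both map coordinates at once. A profile view that collapses the city into a single line throws one of them away, which is why the same question is so much harder to answer from the flattened picture.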

The best way to get the most out of big data is to establish a good interactive environment for that data. We don’t yet have great tools for big data analysis, though a number of tools are available. Right now, we are trying to find where the secrets of big data are hidden, and all we have is a compass. We are trying to reach the point where we can locate those secrets with Google Maps. It is a large gap. The newest innovation is big data analysis with virtual reality.

So the question remains, where will big data visualizations on VR lead us, and how successful will they be?  Time will tell.