Vast Data Plus User Communities Equals Self-Improving Machines

Alfred Spector, Google, at Structure Big Data 2011

Making sense of vast amounts of data is getting easier thanks to processor improvements, faster networks and a growing amount of cloud storage capacity, but there's another factor that's accelerating the ability to sift through information: user communities. At the Structure Big Data event on Wednesday, Alfred Spector, a VP of Research and Special Initiatives at Google, illustrated how to combine low-level user data with the massive information stores and cloud computing services offered by his company.

Perhaps the most prominent example is Google's geographic data, used in both the Google Maps and Google Earth products. The company harvests global information to create products that are useful in their own right, but each can be supplemented with localized user data. A modern data management web app makes it easy for Google to host and manage data tables or personalized maps, and to let users collaborate on and publish them. For example, Google Maps data combined with information from hospitals and doctors can easily show which nearby health-care providers have flu vaccines available.
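The flu vaccine example boils down to joining a table of provider locations with a user-supplied availability table and filtering by distance. Here is a minimal Python sketch of that idea. It does not use any Google API; the provider names, coordinates, availability flags and the `nearby_with_vaccine` helper are all hypothetical, invented purely for illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

# Hypothetical base data: where each provider is located.
providers = [
    {"name": "Clinic A", "lat": 37.7793, "lon": -122.4193},
    {"name": "Clinic B", "lat": 37.8044, "lon": -122.2712},
    {"name": "Clinic C", "lat": 37.7800, "lon": -122.4100},
]

# Hypothetical community-supplied data: who currently has flu vaccines.
has_flu_vaccine = {"Clinic A": True, "Clinic B": True, "Clinic C": False}

def nearby_with_vaccine(lat, lon, max_km):
    """Return names of providers within max_km that report vaccine stock."""
    matches = []
    for provider in providers:
        distance = haversine_km(lat, lon, provider["lat"], provider["lon"])
        if distance <= max_km and has_flu_vaccine.get(provider["name"], False):
            matches.append(provider["name"])
    return matches

# A user in central San Francisco searching within 5 km.
print(nearby_with_vaccine(37.7749, -122.4194, max_km=5))
```

The base map data and the availability table stay separate here on purpose: the second table is the part a user community could keep up to date.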

Making large amounts of data usable and modifiable by end users has the potential to create solutions that Google hasn't envisioned yet. What it has already done is allow for what Spector calls a "hybrid intelligence," because users and computers are doing more together than either could do individually. Scientists who track global warming may only have access to limited datasets that show a small part of the overall situation. Google Earth, however, can augment its base data with sensor information from various satellites and data points, providing a more holistic view of global warming.

This combination of user communities and data is leading to smarter machines as well. The voice search features offered by Google are becoming more accurate thanks to speech recognition data provided by users. In effect, the speech service is training itself, because it's learning from all of the incoming data.

Just as they can with Google Maps data, end users can leverage these smarter machines as well. Spector said that end users could create a spam-killing blog moderator by training the system with examples of both good blog posts and spam comments. Those inputs, combined with Google's prediction APIs and Python scripts, would effectively create an intelligent automated moderator that could continuously improve its own performance.
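To make the moderator idea concrete, here is a minimal Python sketch of a trainable comment classifier. It is not Google's prediction API and makes no calls to it; as a stand-in, it uses a small naive Bayes classifier with add-one smoothing, written from scratch. The `CommentModerator` class, its labels and the training sentences are all hypothetical.

```python
import math
import re
from collections import Counter

LABELS = ("ok", "spam")

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

class CommentModerator:
    """Naive Bayes stand-in for a hosted prediction service."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        """Add one labeled example; can be called at any time."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text, label):
        """Log-probability of the label given the text, up to a constant."""
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Smoothed prior, then smoothed per-word likelihoods.
        log_prob = math.log((self.doc_counts[label] + 1) / (total_docs + len(LABELS)))
        for word in tokenize(text):
            count = self.word_counts[label][word]
            log_prob += math.log((count + 1) / (total_words + len(vocabulary) + 1))
        return log_prob

    def classify(self, text):
        """Return the label with the higher score."""
        return max(LABELS, key=lambda label: self.score(text, label))

moderator = CommentModerator()
moderator.train("Great write-up on speech recognition, thanks", "ok")
moderator.train("Interesting point about hybrid intelligence", "ok")
moderator.train("Buy cheap pills now, click here", "spam")
moderator.train("Click here for free money now", "spam")

print(moderator.classify("click here for cheap pills"))
```

The "continuously improve" part corresponds to calling `train` again whenever a human confirms or corrects one of the moderator's decisions, so each new comment becomes another training example.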
