It’s kind of amazing that we all settled on the term “big data” before the “Internet of things” really arrived. That pending revolution, in which we’ll see all kinds of new objects connected to the Internet thanks to the cheap hardware provided by the smartphone boom, will generate information on a scale we can’t even really comprehend yet.
Most businesses have woken up to the reality that data analysis is the lifeblood of a 21st century corporation. Yet they are not prepared for the incoming flood of data that these connected objects will provide. It’s one thing to track your customers as they flow through your website, or observe that the yellow button gets way more clicks than the red button. It’s another thing to track data created inside and outside a company about its employees, customers, and partners.
We need new technologies to make sense of this glut of information. And this is going to change the way that big data companies like Hortonworks and others sell products and services.
This looming problem is something we’re sure to discuss at Structure Data, scheduled for March 9th and 10th in San Francisco. We’re featuring speakers such as William Ruh of GE, who will talk about the impact the industrial Internet will have on the manufacturing sector; Jerome Dubreuil of Samsung, who will illustrate just how much data connected home devices generate; and a panel of healthcare experts who will sort through the dual challenges of the retiring baby boom generation and an explosion in quantified-self health apps.
This may sound like a buzzword salad to many of you. But those in charge of the massive players in this market are making moves to get themselves ready for the data deluge from a realized Internet of things. Hortonworks, for example, bought a company last year called Onyara that focused on finding smarter ways to analyze data than just throwing a bunch of servers at the problem. (Hortonworks CEO Rob Bearden will be speaking at Structure Data.)
For a long time, this was a tried-and-true approach to data analysis: Find a problem, collect or obtain a data set, and run a smart piece of software on a gazillion servers (if you want results before the end of the year). But the Internet of things will overwhelm this line of thinking. The sheer number of events will be too much for companies to parse.
Get Data Sheet, Fortune’s technology newsletter.
This exposes a problem in the nascent world of “big data.” Industry-wide best practices for some of the trickier problems are hard to come by, since most companies have data issues and opportunities that are unique to their business. Data scientists and engineers tend to build data infrastructure in ways they’ve seen work in the past, and when the underlying assumptions behind those methods change dramatically—like, say, when you dump hundreds of millions of additional inputs into the system—they’re going to need a new method.
That method likely involves machine learning techniques that can help data analysis systems learn which inputs to prioritize, the same way you use context clues to pick out the handful of relevant emails in your inbox each morning from the hundreds of extraneous ones.
Herein lies the opportunity for data analysis companies like Hortonworks as the Internet of things evolves. IBM, which has put most of its eggs in Watson’s basket, launched a product earlier this month that helps application developers prioritize extremely important data—such as oxygen levels in a mine—over the reams of other data produced by sensors in that environment.
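To make the idea of prioritization concrete, here is a minimal sketch of the concept. It is not IBM’s or Hortonworks’ actual technology: it uses made-up sensor values and a simple rolling z-score filter as a crude stand-in for a learned model, flagging only the readings that deviate sharply from the recent baseline.

```python
from collections import deque
from statistics import mean, stdev

def prioritize(readings, window=20, threshold=3.0):
    """Flag readings that deviate sharply from the recent baseline.

    A reading is 'high priority' when it sits more than `threshold`
    standard deviations from the mean of the last `window` values --
    a statistical stand-in for a learned prioritization model.
    """
    recent = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) >= 2:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                flagged.append((i, value))
        recent.append(value)
    return flagged

# A steady oxygen level (in percent) with one sudden drop.
levels = [20.9, 21.0, 20.8, 21.1, 20.9, 21.0, 14.2, 20.9]
print(prioritize(levels))  # → [(6, 14.2)]
```

A filter like this discards the boring 99 percent of sensor events at the edge, so only the handful that matter travel upstream for expensive analysis.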
If the budding giants of big data—Hortonworks, Cloudera, and their ilk—don’t adjust to meet this challenge, someone surely will. And even if the Internet of things arrives bit by bit, instead of as a tsunami, the data scientists, engineers, and developers who get this right over the next few years will be in prime position for both profits and influence over the development of our connected world.
Find out how that world will evolve at Structure Data in March.