Data: the stuff that anchors our decision-making in some kind of empirical reality, instead of pure instinct or gut feel.
But getting the data we need, when we need it, in a readable form that delivers clear answers… all of this has a somewhat fraught history. Beginning with Decision Support Systems/Executive Information Systems (DSS/EIS: c. 1960s), and running through Business Intelligence (BI: c. 1990s to present), the promise has been charts that answer key business questions. The reality has been long integration and configuration cycles, and relatively inflexible output, i.e. “Oh, you want an answer to that question? Well, that’s a whole ’nother set of queries…”
Stepping into the breach is data science. “Data science” (like its close partner, “machine learning”) is one of those impressive-sounding but vague terms that we in the tech world are so good at coining. At its core, data science is built to answer questions you might not believe were answerable, or even think to ask. It brings a much broader and more nuanced analysis to data, uncovering hidden correlations and insights.
A question like “How close are we to our monthly sales goal?” is perfectly suited to BI. But a question like “Which employees are at risk of leaving?” takes data science to answer. Where BI is responsive to a specifically framed input with a set of known variables, data science is both prescriptive and predictive, finding patterns among intangible variables.
Business Intelligence falls down when there’s no obvious relationship between the question and the available data. This is exactly where data science excels. In questions that are suited for data science, the answer may be a single number, or it may be a set of characteristics that require some data manipulation and modeling to obtain, and that don’t easily lend themselves to standard visualizations like a bar or pie chart.
Once we have the ability to predict relationships in our data, those same predictions allow us to make specific recommendations about what actions to take next. For instance, maybe we discover that senior developers...
- Hired into the Raleigh delivery center
- With a company tenure of more than two years
- Whose commit rates fall off by more than 4% month over month
...are at risk of leaving within two quarters. Maybe we also discover that the most engaged and productive developers are those working on teams with a Workload Balance of at least 60 percent. Data science can detect all of this and make a recommendation (e.g. “Rotate this person to this team”).
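To make that concrete, here is a minimal Python sketch of what applying such a finding could look like once the pattern has been discovered. It does not do the discovering (that is the modeling work), and every name in it is a hypothetical chosen for illustration: the `Developer` fields, the helper functions, and the assumption that Workload Balance is stored as a fraction between 0 and 1.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Developer:
    # All field names are illustrative assumptions, not a real HR schema.
    name: str
    level: str               # e.g. "senior"
    hire_location: str       # delivery center the person was hired into
    tenure_years: float
    commits_last_month: int
    commits_this_month: int


def commit_rate_change(dev: Developer) -> float:
    """Month-over-month change in commits as a fraction (-0.05 is a 5% drop)."""
    if dev.commits_last_month == 0:
        return 0.0  # no baseline to compare against
    delta = dev.commits_this_month - dev.commits_last_month
    return delta / dev.commits_last_month


def is_at_risk(dev: Developer) -> bool:
    """Apply the example pattern from the text: all conditions must hold."""
    return (
        dev.level == "senior"
        and dev.hire_location == "Raleigh"
        and dev.tenure_years > 2
        and commit_rate_change(dev) < -0.04  # fell off by more than 4%
    )


def recommend_team(dev: Developer, workload_balance: Dict[str, float]) -> Optional[str]:
    """Suggest a team with Workload Balance of at least 60 percent, if any."""
    if not is_at_risk(dev):
        return None
    candidates = {team: wb for team, wb in workload_balance.items() if wb >= 0.60}
    if not candidates:
        return None
    # Pick the candidate team with the highest Workload Balance.
    return max(candidates, key=candidates.get)
```

Given a developer who matches the pattern and a mapping of team names to Workload Balance scores, `recommend_team` returns the name of a team to rotate that person onto, or `None` when the developer is not flagged or no team clears the 60 percent bar. In practice the thresholds would come out of a fitted model rather than being hard-coded like this.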
That’s the real benefit of data science. We’re not just reporting on the organization; we’re starting to optimize it, because we understand the hidden variables that drive it. This is what’s really meant when we talk about creating a “data-driven” company: not a gloss on BI, but a whole new way of using data to understand and improve ourselves.