Data Analytics: Bridging the Language of Academic and Business Worlds

By Christian Conroy
“Well, I mean, it can’t really technically be predictive, you know,” I stated plainly to my boss, assuming he would understand given the type and quality of the data we were working with.
“But it has to be. We want to be able to use this as a practical tool,” he said back, his eyebrows raised almost as if to say, “How are you not getting this?”
Given my background in data analysis, meager as it was, it was only natural that one of my main roles on the Business Market Intelligence & Strategy team at Electrolux during my tenure as a Wallenberg Fellow would focus on elevating the team’s capabilities from data management to data analysis. The task was to use historical macroeconomic indicator data for multiple countries across multiple years to model country market size, and then to apply that model to future projections of the same indicators. While I was happy to have the opportunity to apply the statistical analysis skills that I had spent the past few years cultivating through both a Fulbright Research Grant at an experimental economics institution and the first academic year of a public policy program at Georgetown, I knew that even approaching something that could be labeled predictive would be a difficult challenge.
Unlike in the previous roles I have held in risk analysis and business intelligence, the human capital that makes up modern business intelligence teams is no longer centered on qualitative expertise in the political nuances of specific foreign governments, the unique cultural aspects of consumers abroad, or the drivers of non-market economic policies in protectionist economies. Instead, modern teams consist of experts in IT integration, statistical analytics, data visualization, and project management. Like most industries – roughly 47% of jobs in the US are at risk of automation over the next two decades, according to a study from the Oxford Martin School – business intelligence as a sector is now focused more on cost- and labor-saving tools than on human labor. Hence, it was my task to further this trend within the Electrolux business intelligence team and pave the way for the company to begin investing in advanced data analytics tools that would automate the predictive model I had been asked to create.
Responding to the point made above by my boss, and later discussing the issue with colleagues, I defended my doubts about any model’s predictive capabilities by ignorantly spouting off a hodgepodge of esoteric econometrics terms – autocorrelated errors, multilevel fixed effects, bias and endogeneity, statistical significance and hypothesis testing, multicollinearity – to try to prove that the word “predictive” is often overused in the business world. At a company retreat, I proudly showed scatter plots and regression equations and explained how two-way fixed effects regressions on panel data work, only to be met with relative silence and mild confusion from the audience. Of course, none of these words meant anything in the context of business intelligence. The words, perhaps subconsciously spouted off more to make it seem like I knew what I was talking about than to actually demonstrate it, naturally fell on deaf ears.
Not content to let the business world triumph over the principles that the last few years of training had taught me, I reached out to a number of professors at Georgetown, hoping they would either help me understand the best methods for building a predictive model or at least join me in looking down upon a money-hungry business world that cares more about selling a product to an equally uninformed audience than about demonstrating real statistical validity. To my chagrin, I received no such support. Instead, the professors responded with a bunch of case-by-case jargon, emphasizing in the end that the methods I used in the analysis ultimately depended on what the end goals were. One professor even advised me to “impose minimal restrictions on the underlying causal relationships,” focusing instead on a guess-and-check strategy that attempts to match predicted values to actual values as closely as possible.
I was shocked. I was frustrated. If the model is not causal in some way, then what is the point? Even if they weren’t going to provide me with any code to do the analysis, I had hoped that they would at least give me a cookie-cutter solution that I could later google enough about to figure out how to operationalize. Instead of offering such guidance, the professors were essentially telling me that all of those rules that I had thought were ironclad – errors cannot be correlated with your independent variables, something must be controlled for when the value of time period 2 is based on the value of time period 1, a model is not predictive without random sampling and the creation of control and treatment groups – were flexible. What they were telling me was that it was more important to tailor any analysis I did to the end goal of the business, namely to make a practical predictive tool, than it was to demonstrate statistical validity.
In the end, despite all of the self-learning I had to do in order to develop an even partly workable model, all I had to do was create something that predicted market size numbers as close as possible to the actual market size numbers in the historical data. Though it was difficult to let myself sleep knowing that the model I was developing was full of statistical faux pas, it was also liberating to move beyond an academic world often defined by rules and textbooks and into one of money-making decisions defined more by the ends than by the means. The entire experience has caused me to rethink exactly what I am learning within my graduate programs and, I hope, to start a positive dialogue with colleagues and professors at the McCourt School of Public Policy about how coursework can focus not just on empirically evaluating government policy interventions but also on leveraging predictive analytics for business applications.
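For readers who want to see what “as close as possible to the actual numbers” can look like in practice, here is a minimal sketch, not the model built at Electrolux. Everything in it is an assumption made for illustration: the data are synthetic, the two indicators (GDP per capita and population) and the log-linear form are invented, and the error measure (mean absolute percentage error) is just one reasonable choice. The idea is to fit on earlier years, hold out the most recent years, and judge the model only by how far its predictions land from the actual values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic panel (illustrative only): 8 hypothetical countries over 12 years.
n_countries, n_years = 8, 12
years = np.arange(n_years)
gdp = rng.uniform(10, 50, size=(n_countries, 1)) * (1.03 ** years)  # invented indicator
pop = rng.uniform(5, 80, size=(n_countries, 1)) * (1.01 ** years)   # invented indicator
noise = rng.normal(0, 0.05, size=(n_countries, n_years))

# Invented data-generating rule for (log) market size; not taken from the essay.
log_market = 0.5 + 0.8 * np.log(gdp) + 1.0 * np.log(pop) + noise


def design(gdp_block, pop_block):
    """Stack an intercept, log GDP and log population into a design matrix."""
    g = np.log(gdp_block).ravel()
    p = np.log(pop_block).ravel()
    return np.column_stack([np.ones_like(g), g, p])


# Fit on the first 9 years, hold out the last 3 to mimic forecasting.
split = 9
X_train = design(gdp[:, :split], pop[:, :split])
y_train = log_market[:, :split].ravel()
X_test = design(gdp[:, split:], pop[:, split:])
y_test = log_market[:, split:].ravel()

# Ordinary least squares, with no claim that the coefficients are causal.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)


def mape(y_log_true, y_log_pred):
    """Mean absolute percentage error, computed on the original scale."""
    actual = np.exp(y_log_true)
    predicted = np.exp(y_log_pred)
    return float(np.mean(np.abs(predicted - actual) / actual))


in_sample_error = mape(y_train, X_train @ coef)
holdout_error = mape(y_test, X_test @ coef)
print(f"in-sample MAPE: {in_sample_error:.3f}")
print(f"hold-out MAPE:  {holdout_error:.3f}")
```

The hold-out error is the number a business audience would care about: it says how wrong the tool would have been had it been used three years earlier. Nothing in this check requires the coefficients to be unbiased or causally interpretable, which is exactly the shift in mindset described above.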
The takeaway is that new experiences often remind us of how constrained we are by the environments we have previously been in and grown comfortable with. Just as it is easier for a child to learn a language than for an adult who has already built up rigid sets of rules in his or her mind, it is often easier to learn something when coming in with a blank slate. Putting oneself in an environment where everyone is effectively speaking a different language – which held doubly true here, on a team composed largely of Italians and Swedes – can actually be constructive, especially if it forces one to challenge one’s assumptions or admit to not understanding something as well as one thought.