Learning data science with spreadsheets – when might it make sense?


Algorithms now decide our credit score, which patients receive medical care, and which families get access to stable housing. This hidden web of automated systems can trap people in poverty. — MIT Technology Review

  1. We interact with machine learning algorithms all the time. That is not to say that your aim or mine should be to become a data science engineer overnight. But understanding data science is a matter of staying relevant in the workplace and being aware of how our attention and choices are nudged by machines.

  2. At one extreme, 2020 was evidence enough that data analytics can be a matter of life and death. At the other, data science applications might simply tickle a curious brain: not only how they work, but also why they work and when they might go wrong.

  3. There’s value in learning data science even if you’ll never write a line of code. For one, it’s common knowledge among experts that most machine learning products have little or nothing to do with machine learning; the label “AI enabled” is often an unreasonable exaggeration. That trend isn’t a coincidence in a world where investors, business executives and users have little understanding of data science.

  4. As a practical matter, understanding the mechanics of data science could help you avoid wasting time and resources on projects that try to solve problems which even the most cutting-edge data science models have not yet conquered. At their worst, such machine learning projects become expeditions that spend millions of dollars only to build a product which reinforces management’s biases and instincts.

  5. So, learn data science to unlock opportunities to solve problems previously considered unsolvable, and to separate problems that need machine learning from those solvable with rudimentary business rule tables.

  6. Enter spreadsheets. To be clear at the outset: if you want to code a machine learning solution with your own bare hands and scale it up for use by thousands of users, then by all means ignore spreadsheets.

  7. But if you invested the past decade perfecting your spreadsheet skills, then the spreadsheet is your unparalleled data science learning platform. You don’t need to figure out the syntax or memorize lists of built-in functions. You focus on learning the science using a tool that serves you, not the other way round.

  8. Despite their shortcomings, spreadsheets are the most ubiquitous and familiar analytics platform. It doesn’t hurt that they are a visual analytics tool needing little or no code.

  9. Through tedious formulas and admittedly inefficient implementations, Excel will lay bare how an algorithm works, which parameters affect performance and when things don’t work. It’s the perfect DIY platform to get your hands dirty.