I came across a LinkedIn post where someone asked which is more important to learn: “Data Munging or Machine Learning?”
I like to take a step back and say, “Both are important, and yes, data munging is more important than Machine Learning, even though most boot camps I have come across do not cover the topic much. But…”
Actually, many of the courses you have attended skip a lot of steps in between. They start with a pre-defined business problem and then show you the Machine Learning models to use for it.
In the harsh business reality, there is NO pre-defined business problem and there is no such thing as “clean” data, at least not like what you see in the boot camps. The data scientist first needs to define the business problem to solve, which is where the project’s potential business value is determined, and then translate it into a machine learning problem while checking the suitability of the data.
Once the fit between the business problem and the data is good, and stakeholders (usually business users) are on board, we move on to Exploratory Data Analysis and Data Cleaning. Now that is the real world, or at least part of it.
As the data scientist, your most important task is to determine the business problem to solve. That is where your value is! Determining the business problem sets the tone for the rest of the project. Data Munging and Machine Learning are just part of your toolkit. Do remember that! :)
Here are a few posts that may be useful.
Data Cleaning for Data Scientist
Exploratory Data Analysis (Non-Visual)
Exploratory Data Analysis (Visual)
Have fun on your data science journey! All the best! :)
Podcast: As many of you may know, I have started my podcast channel “Symbolic Connection” with a friend of mine, Thu Ya Kyaw. Since the last newsletter, we have published two more episodes, featuring my co-host and my good friend from Bangkok, Charin Polpanumas, Lead Data Scientist with Central Group, as guests. :)
Currently, I am reading the following books:
1) Invisible Women: Data Bias in a World Designed for Men
2) The Robots are Coming: The Future of Jobs in the Age of Automation
3) Seeing What Others Don’t: The Remarkable Ways We Gain Insights
Feedback is most welcome! Please send it through my LinkedIn or Twitter. Consider sharing the newsletter if you found it useful. Just hit that “Share” button! :)
Great article, Koo Ping Shung. I have been looking for something of this sort. Thank you so much for writing this.