
How is everyone doing? I hope you are enjoying your Data Science learning and having a great time with your family.
In recent weeks, I have been teaching a module at ESSEC’s Singapore campus. The module is an introduction to Data Science for MBA students. In it, I aim to give a comprehensive overview of what Data Science is and why businesses need to start working on it ASAP. In a small section, I also covered another overlooked area in Data Science & Artificial Intelligence: metadata.
Metadata, in short, is data about data: for instance, the dataset name, when the dataset was created, etc. Why is it important? Well, if you are going to work with data, you cannot avoid data errors. Data errors can be very time-consuming because time is spent on investigating and rectifying them. As data scientists, we can keep the investigation time manageable by documenting the ETL (Extract, Transform & Load) process and the metadata.
Well-documented metadata can cut down investigation time tremendously. But, like data itself, it is often overlooked. If you do not start now, then as your data grows and your organization matures in Data Science, the "monster" may grow too big to handle.
Unfortunately, I have not come across a single piece of software that can manage metadata, so metadata ends up in disparate sources: programming scripts and logs, policy documents, etc. You therefore need to plan how to store it so that it is easily accessible when needed.
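To make the idea of planning your metadata storage a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a recommended tool: the class `DatasetMetadata`, its field names, and the `.meta.json` file naming are all my own assumptions, and you should adapt them to whatever your organization actually needs to track.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class DatasetMetadata:
    # Hypothetical fields for illustration; adapt them to your own needs.
    name: str          # the dataset name
    created_at: str    # when the dataset was created
    source: str        # where the data came from
    etl_steps: list = field(default_factory=list)  # log of ETL steps applied

    def log_step(self, description: str) -> None:
        """Record one ETL step together with a UTC timestamp."""
        self.etl_steps.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "description": description,
        })


def save_metadata(meta: DatasetMetadata, folder: Path) -> Path:
    """Write the record as <name>.meta.json (an assumed naming scheme)."""
    path = folder / f"{meta.name}.meta.json"
    path.write_text(json.dumps(asdict(meta), indent=2))
    return path


def load_metadata(path: Path) -> DatasetMetadata:
    """Read a metadata record back from its JSON file."""
    return DatasetMetadata(**json.loads(path.read_text()))
```

You would create one record per dataset, call `log_step` each time a script transforms the data, and save the file in the same folder as the dataset. When a data error shows up later, the investigation can start from that one file instead of from scattered scripts and logs.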
Remember, spend more time on your data and metadata. The benefits reaped will be tremendous. I have written a more detailed post on Metadata. Do have a read.
Thank you for showing me your kind support with your precious time!
If you found the newsletter useful, do share it with your network. If you have any feedback, feel free to send me a Tweet or message me on LinkedIn! If you have some time on hand, do visit my website! :)
Note: In my previous newsletter, I shared a post on how Data Science and Artificial Intelligence may be impacted by Covid-19. Here is the post if you have not read it. Feedback is welcome!
Books I am reading now:
Upheaval: Turning Points for Nations in Crisis
Rethink: The Surprising History of New Ideas
Human Compatible: Artificial Intelligence and the Problem of Control