In this first post of Train Test Split, I would like to share my view on the meaning of Data Science. After working for some time on Data Science projects in Dutch industry, and after lengthy discussions with colleagues and supervisors, I have formed a personal view of what Data Science really means.
I am sure my view is not perfect, but I hope it can offer some insight to people who are new to the field and maybe serve as a reminder to people who are already working in Data Science. Let’s start by acknowledging that the world of Data Science has been hype-heavy these past few years.
Data Science Is All The Rage
Nowadays, everyone is talking about Data Science, Machine Learning, Deep Learning, Artificial Intelligence, and Big Data. The internet is full of blog posts, articles, online courses and degrees of all kinds that have to do with the aforementioned topics. It’s only natural that people get lost in the craze that is happening around us.
Companies everywhere are looking for people with Data Science skills. However, amidst this hype, I dare say that the real point of Data Science can easily get lost. So I would like to share my personal view on the essence of what Data Science really is.
The Essence Of Data Science
The essence of Data Science is not at all complicated and we should not make it complicated for no reason. It is very simple to state but quite difficult to do in reality.
The essence of Data Science, at least in my opinion, is creating value out of Data in order to make an impact on a business or research problem. People tend to get lost in the techniques, the fancy algorithms, and the latest, shiny software packages. However, these are not the point; they are only a means to an end. I dare say that sometimes, at least in my experience, simple solutions can have a stronger impact than fancy ones.
Data Scientist Vs Data Analyst
Keeping in mind the real meaning of Data Science (according to me at least :P), it’s also interesting to discuss the difference between a Data Scientist and a Data Analyst.
For anyone looking for a job or getting into the field, the many different roles in the Data Science ecosystem can be confusing. Job titles range from Data Scientist to Data Analyst, Business Analyst, Big Data Engineer, Data Engineer, and Machine Learning Engineer, just to name a few. Discussing the differences between these is outside the scope of this post (and sometimes there is no real difference :P), but the difference between a Data Analyst and a Data Scientist deserves a mention.
In my view, a Data Analyst only looks at what is already there in the data. A Data Scientist should go further and be able to create value out of the data. Another major difference is that a Data Analyst only looks at the past, whereas a Data Scientist is also concerned with the future and should therefore be adept at Predictive Modeling and Machine Learning techniques.
Moreover, I strongly believe that a Data Scientist should be able to create, or at least prototype, data products (or data-driven software products, for that matter), whereas a Data Analyst is not concerned with a final product but only with the result of their analysis.
Don’t Lose The Essence
At this point, I would like to give a warning to people entering the field and a reminder to people who are already working or studying in Data Science. For the people entering the field: please don’t get swept up in the hype around the terminology. Realize that Data Science, in its essence, is about creating value out of Data.
Similarly, for the people already in the field: don’t forget the point of what you are doing. The real meaning of Data Science is creating value out of data and making a real impact on a problem, be it a business problem or a research problem. Techniques, algorithms, and software packages are very important, but they are not the point themselves. They are the tools we use to bring about the desired result, which is the value extracted from the data. Don’t forget the real impact that your work in Data Science should have. That is the meaning of Data Science.
Don’t Be In It For The Money
Data Science can be a very lucrative field, and Data Science skills are highly sought after by companies. However, I would strongly urge people not to fall into the trap of getting into or staying in Data Science just for the money. Yes, getting paid well for your work and having valuable skills is awesome. That being said, there are many roads someone can take in life to meet those goals, and I don’t believe making money should be the only criterion. Do get into Data Science if you are interested in working with data and have a passion for creating value out of it. Do not get into Data Science for the money.
In this post, I tried to share my view on the real meaning of Data Science. This view took shape through very interesting discussions I have had over the past year with colleagues and supervisors in my PDEng program, and I would really like to thank the people who shared their opinions with me on this important topic. I hope the ideas can be of help to someone starting out in the field or someone who is trying to find their way amidst the hype of the Data Science world.
I believe Data Science has much to offer to our world, but it is not magic and shouldn’t be viewed as magic. We should try to strip away all the hype and noise surrounding it and focus on its essence: creating value out of Data that can have a positive impact on a business or research problem.