“Chemical Data Science” is a recognized field that combines the principles of chemistry with data science techniques. In practice, that means applying data analysis, machine learning, and computational methods to problems in chemistry and related fields.
The established academic field that connects chemistry and computing is often called Cheminformatics. Cheminformatics focuses primarily on molecules: representing chemical structures in data form, calculating descriptors, predicting molecular properties, and supporting applications such as drug discovery and materials design.
In this blog, however, I’ll be using the broader term Chemical Data Science. It covers not only cheminformatics but also the wider use of modern data science tools and methods (Python, SQL, Excel, and machine learning) applied to chemical and process-related data.
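To make "representing chemical structures in data form" and "calculating descriptors" a little more concrete, here is a minimal Python sketch. It turns a simple molecular formula into a dictionary of element counts and then computes one of the most basic descriptors, the molar mass. The function names `parse_formula` and `molar_mass` and the small `ATOMIC_MASS` table are my own illustration, not part of any library, and the parser assumes plain formulas without parentheses or charges. Real cheminformatics work would normally rely on a dedicated toolkit rather than hand-rolled code like this.

```python
import re

# Rounded standard atomic weights in g/mol.
# Only a handful of elements are included, purely for illustration.
ATOMIC_MASS = {
    "H": 1.008,
    "C": 12.011,
    "N": 14.007,
    "O": 15.999,
    "S": 32.06,
    "Cl": 35.45,
}


def parse_formula(formula):
    """Convert a simple formula such as 'C2H6O' into element counts.

    Assumes no parentheses, hydrates, or charges (hypothetical helper).
    """
    counts = {}
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molar_mass(formula):
    """Sum atomic mass times count over every element in the formula.

    Raises KeyError for elements missing from ATOMIC_MASS.
    """
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())


if __name__ == "__main__":
    print(parse_formula("C2H6O"))         # {'C': 2, 'H': 6, 'O': 1}
    print(round(molar_mass("C2H6O"), 3))  # ethanol, in g/mol
```

For ethanol (C2H6O) this gives a molar mass of about 46.07 g/mol. The point is less the number itself than the pattern: a chemical object becomes a data structure, and a property is computed from it.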
Chemical Data Science includes:
In other words, Chemical Data Science blends the rigor of chemistry with the toolkit of data science. It’s an evolving field, recognized by universities and industry alike, and my goal in this blog is to share practical, approachable examples that showcase how data science can unlock insights hidden in chemical information.