With the increasing availability of large amounts of data, methods that fall under the term data science are becoming important assets for chemical engineers. Broadly speaking, these methods are needed to carry out three tasks: data management, statistical and machine learning, and data visualization. While claims have been made that data science is essentially statistics, consideration of these three tasks makes it clear that data science is broader than statistics alone; furthermore, statistical methods from a data-poor era are likely insufficient. While there have been many successful applications of data science methodologies, many challenges must still be addressed. For example, a large dataset is not necessarily meaningful or information rich. From an organizational point of view, a lack of domain knowledge and a lack of a trained workforce, among other issues, are cited as barriers to the successful implementation of data science. Many of the methodologies employed in data science are familiar to chemical engineers; however, an undergraduate chemical engineering program generally does not cover all the methods required to carry out data science projects. One option to address this is to adjust the curriculum by modifying existing courses and introducing electives. Other options include introducing a data science minor, a postgraduate certificate, or a Master’s program in data science.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.