Machine Learning and Data Science

mccormick.northwestern.edu/machine-learning-data-science-minor/

The McCormick minor in Machine Learning and Data Science provides students with practical knowledge fundamental to the data science lifecycle. Students will gain experience with a variety of data models and with techniques for collecting, cleaning, and analyzing data. They will also learn to glean insights from data using multiple modern computational tools and to think critically about the construction and implications of analyses and models for data-driven decision making.

MLDS minor students will choose a specialization in Machine Learning or Data Engineering, or a Hybrid. Students specializing in Machine Learning will dive into the computational and mathematical roots of the algorithms, building a deep understanding that will enable them to reason about the successes and failures of Machine Learning systems. Students with Data Engineering and Hybrid specializations will receive hands-on training with the tools needed to work effectively with large data sets in the cloud or on their own machines. Their studies will culminate in a Data Engineering Studio course, where they will synthesize their in-depth knowledge of statistics, machine learning, and computing in a quarter-long project using real-world data. This project will enable them to apply their skills in a sustainable, reproducible manner.

The program also includes two elective courses to either enrich each student's major study or broaden their experience in data-intensive analysis in other disciplines. The minor is designed to empower students to leverage data science tools to amplify work in their own disciplines. This involves developing comprehensive data science pipelines and using computational data analysis for the estimation, prediction, design, and control of engineering systems.

Data Science and Engineering Courses

DATA_ENG 200-0 Foundations of Data Science (1 Unit)   Foundations of Data Science covers the fundamentals of data science and the context within which the field operates. The course introduces the steps of the data science lifecycle and the associated tools and techniques through implementation in languages such as Python.

DATA_ENG 300-0 Data Engineering Studio (1 Unit)   Data Engineering Studio teaches how to build a sustainable data science lifecycle. Students will analyze data in multiple contexts (e.g., SQL, building machine learning models), share the findings with peers, and practice iteratively refining the analysis based on feedback. They will become acquainted with the common pitfalls in applying data analytics to real-world datasets. Several modern data engineering tools, such as Docker containers, Spark, Airflow, and MLflow, will be covered. Prerequisites: DATA_ENG 200-0 and 1 unit from each of the following core areas: Statistics Foundations, Intermediate Programming/Algorithmic Skills, and Applied Machine Learning.