18-899-K2   Applied Machine Learning

Location: Africa

Units: 6

Semester Offered: Spring

Course description

This course will provide the expertise and skills necessary for applying machine learning techniques to large real-world datasets to facilitate knowledge discovery, predictive analytics and decision support. A variety of sophisticated techniques for refining, visualizing, exploring and modelling data will be introduced and demonstrated. The advantages and disadvantages of linear, nonlinear, nonparametric and ensemble methods will be discussed while exploring the challenges of both supervised and unsupervised learning. The importance of quantifying uncertainty and communicating confidence in model results will be emphasized. Applications will include visualization, clustering, ranking, pattern recognition, anomaly detection, data mining, classification, regression, forecasting and risk analysis. Participants will obtain hands-on experience through project assignments that use publicly available datasets and address practical challenges.

Learning objectives

The objective of this course is to provide students with an overview of the use and potential of machine learning in research, business and government. A typical challenge might be to use a large database to segment clients, understand their behaviour and formulate a new strategy for optimising key performance indicators (KPIs). Participants will learn how to plan, design and implement an empirical research project using sophisticated quantitative techniques. They will learn how to identify relationships between explanatory variables and KPIs, communicate the outcomes of complicated analyses and construct automated systems for decision-making. There will be a strong emphasis on the challenges of working with large datasets and on understanding the risks of over-fitting.


After completing this course, students should be able to:

  • Design an empirical project in response to a specific objective
  • Identify, collect, clean and organize a large dataset
  • Explore the dataset using visualization techniques
  • Present a summary that is appropriate for end-users
  • Apply machine learning techniques
  • Describe the advantages and disadvantages of different models
  • Select an approach that is optimal for meeting the objective
  • Communicate model output and conclusions to end-users

Content details

  1. Regression
  2. Feature selection
  3. Nonlinear techniques
  4. Unsupervised learning
  5. Supervised learning
  6. Ensemble approaches


Prerequisites

  • Data and Inference Mini-Course
  • Background in a quantitative discipline (Engineering, Computer Science, Physics, Mathematics, Statistics)
  • Programming experience

Faculty: Patrick McSharry