

Causality and Machine Learning
80816/80516, Spring 2025
Key Information and Links
Instructor: Kun Zhang
Lectures: Tuesdays and Thursdays, 12:30 – 1:50 PM. Location: Tepper Building 1308.
Office Hours: Wednesdays, 3:00 – 4:00 PM (other times by appointment). Location: 161B Baker Hall or Zoom (if needed).
Canvas: https://canvas.cmu.edu/courses/46438
Syllabus
1. Course description
Over the past few decades, significant progress has been made in tackling long-standing causality problems, such as discovering causal relations from observational data and inferring causal effects. Moreover, it has recently been shown that a causal perspective aids in understanding and solving various machine learning problems, such as transfer learning, out-of-distribution prediction, disentanglement, representation learning, and adversarial vulnerability. Accordingly, this course is concerned with understanding causality, learning it from observational data, and using it to tackle other learning problems.
The course covers representations of causal models, how causality differs from association, methods for causal discovery and causal representation learning, and how causality enhances advanced learning tasks, including generative AI. We will address the following questions: Why is causality essential? How can we learn it, including latent variables, from observational data? How can we make sure the estimated representation is causal? What role does causality play in learning under data heterogeneity? Can causal principles make generative AI more controllable and capable of extrapolation? How can deep learning benefit from a causal perspective?
Two main causality problems are emphasized. One is causal discovery, or causal representation learning. It is well known that “correlation does not imply causation,” but we will make this precise by asking what assumptions, what information in the data, and what procedures enable us to successfully recover causal information. Causal relations may also hold among underlying hidden variables; we will see how to uncover these hidden “causal” variables, as well as their causal relations, from the measured variables. The other problem is how to properly make use of causal information. This includes identification of causal effects, counterfactual reasoning, and improving machine learning with causal knowledge.
2. Course objectives
As an outcome of this course, participants are expected to:
- Understand how causality is different from association and why it is useful,
- Get familiar with graphical models, causality-related concepts and principles, and emerging approaches to causal discovery or causal representation learning from observational data,
- Be acquainted with the state-of-the-art of causality research in different disciplines,
- Be able to develop suitable methods for causal representation learning or causal discovery to address problems in specific domains,
- Properly leverage causality in understanding and solving advanced machine learning and artificial intelligence problems,
- Identify and formulate causal problems in their respective fields, and be able to find potential solutions.
3. Who can attend
There are no formal prerequisites, but a background in introductory statistics or machine learning will be helpful. The course is accessible to students from across disciplines, and we especially welcome students from different departments.
4. Course materials
Reading materials will be available online or distributed in class. In addition, we will refer to several chapters of the following two books frequently (some chapters will be available on Canvas):
- Peter Spirtes, Clark Glymour, and Richard Scheines (SGS). 2000. Causation, Prediction, and Search, 2nd edition. MIT Press, Cambridge, Massachusetts.
- Judea Pearl. 2009. Causality: Models, Reasoning and Inference, 2nd edition. Cambridge University Press, Cambridge.
5. Grading
Attendance counts for 5% of your grade. You may miss two classes without penalty; after that, each additional absence lowers your final course grade by 1% (up to the full 5%), unless you obtain approval. In addition, 10% of your grade is for active involvement in in-class discussions (raising or answering questions and participating in discussions).
There will be four homework assignments, worth 40% of your grade in total. Submit your homework on Canvas as MS Word or PDF files (in special situations, you may submit homework by email to the instructor). 20% of an assignment's score will be deducted if it is late, unless you obtain the instructor's approval in advance.
The project/essay proposal, from each individual or team of two students, counts for 10% and is due on March 14 at 11:59 PM; the final project report or essay counts for 35% and is due on May 2 at 11:59 PM. Please work together with the instructor to decide on the topic of your project/essay by February 28. (See more detail after the Class Schedule.)
Remark: For evaluation of the project report and presentations, we will adopt discipline-specific criteria for students from different disciplines (e.g., philosophy, machine learning, statistics, computer science, psychology, information systems, social and decision sciences, public policy, and biology). The evaluation is partly based on the significance, expected output, and novelty of the problems in the students’ respective fields and their interest to general audiences.
Class schedule
Class meetings consist of lecture presentations on principles and methodologies for causal discovery, causal inference, counterfactual reasoning, and causal representation learning. If time permits, we may have guest lectures on various topics.
The course is divided into nine parts; see below. Students are expected to finish the readings and try to come up with questions before coming to class.
Part I. Introduction (1 week)
- 01/14 (Tue) & 01/16 (Thu): Introduction: Concepts, problems, and a big picture of machine learning
- Introduction to machine learning and artificial intelligence & how they are connected with causality
- Causality-related concepts, principles, and problems: definition of causality, motivation for causal analysis, directed acyclic graphs, interventions, structural equation models
- Discussion: Why do we care about causality?
- Research problems in causality: Causal discovery, causality for machine learning, identification of causal effects, counterfactual reasoning, generative AI, (automated) scientific discovery
- Summary of fundamental problems and recent achievements
- Reading: 1. Pages 1-6 of “Causal discovery and inference: concepts and recent methodological advances” (by Spirtes & Zhang), Applied Informatics, 2016; 2. Chapter 1 of Pearl’s book
- Open discussion: What do you think of causal analysis in your field?
Part II. Preliminaries: Statistics, information theory, basic machine learning, graphical models, and traditional multivariate analysis (2 weeks)
- 01/21 (Tue): From probability theory to statistics
- Probability axioms, discrete and continuous variables
- Statistical independence and conditional independence
- Sample statistics: expectation, covariance, and correlation; uncorrelatedness vs. independence (a numerical sketch follows this session’s reading)
- Central Limit Theorem and Cramér Decomposition Theorem
- Gaussian distribution
- Why is it widely assumed but rarely encountered?
- Is it a blessing or a challenge to causal discovery?
- Three ways of making use of data
- Bayes’ rule
- Statistical tests
- Maximum likelihood estimation (point estimation)
- Linear regression
- Reading: Chapter on maximum likelihood estimation of “Probability and Statistical Inference” (by R. V. Hogg, E. A. Tanis, and D. L. Zimmerman)
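To make “uncorrelatedness vs. independence” concrete, here is a minimal numerical sketch (assuming only NumPy; the variable names and sample size are illustrative) of the classic example X ~ N(0, 1), Y = X²: the two are uncorrelated yet fully dependent.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x ** 2  # Y is a deterministic function of X

# The sample correlation is near 0, since Cov(X, X^2) = E[X^3] = 0
# for any distribution symmetric about zero, such as the Gaussian.
print("corr(X, Y):", np.corrcoef(x, y)[0, 1])

# Yet X and Y are clearly dependent: conditioning on |X| changes
# the distribution of Y drastically.
print("E[Y | |X| > 1]: ", y[np.abs(x) > 1].mean())
print("E[Y | |X| <= 1]:", y[np.abs(x) <= 1].mean())
```

Independence requires every (measurable) function of X to be uncorrelated with every function of Y, which is why independence constraints carry strictly more information than correlations; this distinction matters throughout the causal discovery methods in Parts IV and V.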
- 01/23 (Thu) and 01/28 (Tue): Traditional machine learning: Settings, assumptions, basic methods, and model selection
- Supervised learning
- From linear to nonlinear models
- Nonparametric models
- Bias-variance tradeoff
- Model selection
- Unsupervised learning
- Two ways to “simplify” data
- Assumptions underlying clustering
- 01/30 (Thu): Multivariate analysis: Goals, techniques, and connections to causal discovery
- Why multivariate analysis
- Principal component analysis (PCA)
- Factor analysis (is it useful for causal discovery?)
- Independent component analysis (ICA): Linear and nonlinear cases (see the sketch below)
- The (imprecise) connection between multivariate analysis methods and causal analysis
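To preview linear ICA and its indeterminacies, here is a minimal sketch (assuming NumPy and scikit-learn; the mixing matrix and sources are illustrative):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
n = 5_000
# Two independent, non-Gaussian sources (uniform and Laplacian).
s = np.column_stack([rng.uniform(-1, 1, n), rng.laplace(size=n)])
a = np.array([[1.0, 0.5],
              [0.3, 1.0]])  # mixing matrix
x = s @ a.T                 # observed linear mixtures

ica = FastICA(n_components=2, random_state=0)
s_hat = ica.fit_transform(x)

# The recovered sources match the true ones only up to permutation and
# scaling: exactly the indeterminacy of linear ICA.
print(np.corrcoef(s.T, s_hat.T).round(2))
```

The fact that non-Gaussianity makes the mixing identifiable (up to these indeterminacies) is what LiNGAM, covered in Part V, exploits for causal discovery.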
- 02/04 (Tue): Graphical models, d-separation, and representation of causal relations
- Graphical models
- d-separation (see the sketch below)
- Markov conditions
- Causal graphical models
- Reading: OLI “Causal and statistical reasoning,” Module 13 (CMU users can sign in through the OLI site)
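As a quick illustration of d-separation, here is a minimal sketch assuming the networkx library (the function is named d_separated in networkx 2.8-3.2 and is_d_separator in newer releases, so adjust to your version):

```python
import networkx as nx

# A small DAG with a chain X -> Z -> Y and a collider X -> W <- Y.
g = nx.DiGraph([("X", "Z"), ("Z", "Y"), ("X", "W"), ("Y", "W")])

# Conditioning on the mediator Z blocks the chain, and the collider W
# blocks the other path by default, so X and Y are d-separated given {Z}.
print(nx.d_separated(g, {"X"}, {"Y"}, {"Z"}))        # True

# Conditioning on the collider W opens the path X -> W <- Y.
print(nx.d_separated(g, {"X"}, {"Y"}, {"Z", "W"}))   # False
```

Under the causal Markov condition, every d-separation in the graph implies a conditional independence in the data distribution; constraint-based discovery (Part IV) runs this implication in reverse.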
Part III. Identification of causal effects & counterfactual reasoning (1 week)
- 02/06 (Thu): Identifiability & identification of causal effects, the potential outcome framework, and graphical identifiability criteria
- Interventions and causal effects
- Potential outcome framework vs. graphical criteria
- Controlling confounding bias: back-door and front-door criteria (a simulation sketch follows this session’s reading)
- Unification of the criteria for causal effect identification
- Nonparametric vs. parametric cases
- Propensity score and its applications
- New machine learning methods for causal effect estimation
- Reading: Pages 65-78 of Pearl’s book
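To see the back-door adjustment at work, here is a minimal simulation (NumPy only; all probabilities and coefficients are illustrative) in which a binary confounder Z drives both the treatment X and the outcome Y:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
z = rng.binomial(1, 0.5, n)                   # confounder
x = rng.binomial(1, 0.2 + 0.6 * z)            # treatment depends on Z
y = rng.binomial(1, 0.1 + 0.3 * x + 0.4 * z)  # outcome depends on X and Z

# The naive contrast is confounded by Z and overestimates the effect.
naive = y[x == 1].mean() - y[x == 0].mean()

# Back-door adjustment: sum_z [P(y | x=1, z) - P(y | x=0, z)] P(z).
adjusted = sum(
    (y[(x == 1) & (z == v)].mean() - y[(x == 0) & (z == v)].mean())
    * (z == v).mean()
    for v in (0, 1)
)
print(f"naive: {naive:.3f}, adjusted: {adjusted:.3f}, true effect: 0.300")
```

The adjusted contrast recovers the interventional effect because Z blocks every back-door path from X to Y.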
- 02/11 (Tue): Counterfactual reasoning
- Counterfactual reasoning vs. traditional prediction
- Methods for counterfactual reasoning
- Three-step procedure (abduction, action, prediction) and its limitations; see the sketch below
- Recent developments: Nonparametric counterfactuals & natural counterfactuals
- Reading: 1. Pages 78-89 of Pearl’s book; 2. “Natural counterfactuals” (by Hao et al.), NeurIPS 2024
- Assignment 1 released
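Here is a toy sketch of the three-step counterfactual procedure in a two-variable linear structural causal model (the coefficients and observed values are illustrative):

```python
# SCM: X := U_x,  Y := 2*X + U_y.
# Observed: X = 1, Y = 3.  Query: what would Y have been had X been 0?

# Step 1 (abduction): infer the exogenous terms from the observation.
x_obs, y_obs = 1.0, 3.0
u_x = x_obs               # from X := U_x
u_y = y_obs - 2 * x_obs   # from Y := 2X + U_y, so U_y = 1

# Step 2 (action): replace the equation for X by the intervention X := 0.
x_cf = 0.0

# Step 3 (prediction): propagate through the modified model with U_y fixed.
y_cf = 2 * x_cf + u_y
print("counterfactual Y:", y_cf)  # 1.0
```

This is the standard abduction-action-prediction recipe; its limitations in nonparametric settings motivate the recent developments listed above.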
Part IV. Traditional approaches to causal discovery: “Independence” in causal models, constraint- and score-based causal discovery (1 week)
- 02/13 (Thu): “Independence” in causal models & constraint-based causal discovery
- “Independence” implied by causal models: general ideas
- “Independence” instantiation 1: conditional independence for causal discovery
- PC algorithm for causal discovery
- FCI algorithm for causal discovery
- GES algorithm for causal discovery
- Demonstration: Using causal-learn or TETRAD for causal analysis (see the sketch below)
- Reading: 1. Chapters 5.4.1 & 5.4.2 of the SGS book; 2. Chapter 6.7 of the SGS book
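As a starting point for the demonstration, here is a minimal sketch of running the PC algorithm with the causal-learn package (pip install causal-learn); the chain-structured data below are simulated and purely illustrative:

```python
import numpy as np
from causallearn.search.ConstraintBased.PC import pc

rng = np.random.default_rng(2)
n = 2_000
x = rng.standard_normal(n)
z = 0.8 * x + 0.3 * rng.standard_normal(n)   # X -> Z
y = 0.8 * z + 0.3 * rng.standard_normal(n)   # Z -> Y
data = np.column_stack([x, z, y])

# PC with the default Fisher-z conditional independence test; the result
# is a CPDAG, i.e., the Markov equivalence class of the true DAG.
cg = pc(data, alpha=0.05)
print(cg.G)
```

With Gaussian data, PC can only identify the graph up to its Markov equivalence class; orienting the remaining edges is one motivation for the functional-causal-model methods of Part V.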
Part V. Functional causal model-based approaches to causal discovery: Linear, non-Gaussian methods and beyond (2.5 weeks)
- 02/18 (Tue): Linear, non-Gaussian, acyclic causal models (LiNGAM)
- Structural equation models and independent noise (“Independence” instantiation 2)
- LiNGAM: identifiability and identification of the causal model
- Reading: “A linear non-Gaussian acyclic model for causal discovery” (by Shimizu et al.), Journal of Machine Learning Research, 2006
- 02/20 (Thu): Estimating Linear, non-Gaussian, acyclic causal models (LiNGAM) with ICA and other methods
- ICA and its relation to LiNGAM
- Estimating LiNGAM with the independent noise condition (see the sketch below)
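Here is a minimal sketch of estimating a LiNGAM model with the lingam package by Shimizu et al. (pip install lingam); the two-variable model is simulated with uniform, hence non-Gaussian, noise:

```python
import numpy as np
import lingam

rng = np.random.default_rng(3)
n = 2_000
e1 = rng.uniform(-1, 1, n)   # non-Gaussian noise
e2 = rng.uniform(-1, 1, n)
x1 = e1
x2 = 1.5 * x1 + e2           # true model: x1 -> x2
X = np.column_stack([x1, x2])

model = lingam.DirectLiNGAM()
model.fit(X)
print(model.causal_order_)      # expected: [0, 1]
print(model.adjacency_matrix_)  # entry [1, 0] should be close to 1.5
```

Were the noise terms Gaussian, the two directions x1 -> x2 and x2 -> x1 would fit the data equally well; non-Gaussianity breaks this symmetry, which is the core insight behind LiNGAM.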
- 02/25 (Tue): Traditional causal discovery in the presence of confounders
- Linear causal discovery in the presence of confounders: What is identifiable and how to estimate it?
- Reading: 1. “Estimation of causal effects using linear non-Gaussian causal models with hidden variables” (by Hoyer et al.), International Journal of Approximate Reasoning, 2008; 2. “Learning Linear Non-Gaussian Causal Models in the Presence of Latent Variables” (by Salehkaleybar et al.), Journal of Machine Learning Research (JMLR), 2020
- Discussion: What do you think of causal discovery and those methods?
- 02/27 (Thu): Discovery of cyclic causal models (causal models with feedback)
- Interpretation of feedback in causal representations
- Linear causal discovery in the presence of feedback
- Reading: “Discovering cyclic causal models by independent components analysis” (by Lacerda et al.), UAI 2008
- Assignment 2 released
- 03/11 (Tue): Nonlinear causal models & Independent causal mechanism for causal discovery
- Post-nonlinear causal models
- Nonlinear additive noise models (a sketch follows this session’s reading)
- Estimation of nonlinear causal models
- “Independent nonlinear mechanism” for causal discovery in deterministic cases
- “Independent mechanism” in linear, high-dimensional case
- Reading: Pages 11-18 of “Causal discovery and inference: concepts and recent methodological advances” (by Spirtes & Zhang), Applied Informatics, 2016
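To illustrate the additive-noise idea, here is a minimal sketch (assuming NumPy and scikit-learn): regress each variable on the other and prefer the direction whose residual is more independent of the input. The HSIC-style statistic below is a crude biased estimate rather than a calibrated test, and all model choices are illustrative:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(5)
n = 500
x = rng.uniform(-2, 2, n)
y = x ** 3 + rng.uniform(-1, 1, n)   # true model: x -> y, additive noise

def rbf_gram(v, gamma=1.0):
    """RBF Gram matrix of a 1-d sample."""
    d = (v[:, None] - v[None, :]) ** 2
    return np.exp(-gamma * d)

def hsic(a, b):
    """Crude biased HSIC estimate between 1-d samples a and b."""
    m = len(a)
    h = np.eye(m) - np.ones((m, m)) / m      # centering matrix
    ka, kb = h @ rbf_gram(a) @ h, h @ rbf_gram(b) @ h
    return np.trace(ka @ kb) / m ** 2

def residual_dependence(cause, effect):
    """Fit a nonlinear regression and measure residual-input dependence."""
    reg = KernelRidge(kernel="rbf", alpha=0.1).fit(cause[:, None], effect)
    return hsic(cause, effect - reg.predict(cause[:, None]))

# The direction with the more independent residual is preferred: here x -> y.
print("x -> y:", residual_dependence(x, y))
print("y -> x:", residual_dependence(y, x))
```

In the correct direction the residual approximates the true independent noise, so the dependence measure should be markedly smaller for x -> y than for y -> x.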
Part VI. Practical issues in causal discovery (1 week)
- 03/13 (Thu): Causal discovery in the presence of nonstationary causal models, mixed types of variables, and selection bias
- Modeling causal processes with continuous/discrete variables
- Causal discovery and visualization of nonstationary causal models
- Effects of different types of selection bias
- Causal discovery in the presence of selection bias
- Reading: 1. “Causal discovery from heterogeneous/nonstationary data” (by Huang et al.) JMLR 2020; 2. “On the Identifiability and Estimation of Functional Causal Models in the Presence of Outcome-Dependent Selection” (by Zhang et al.), UAI 2016; 3. “Detecting and Identifying Selection Structure in Sequential Data” (by Zheng et al.), ICML 2024
- 03/18 (Tue): Handling measurement error, missing values, and time series
- Causal discovery in the presence of measurement error
- Missing data as a causal problem & causal discovery in the presence of missing data
- Granger causality and its relation to constraint-based causal discovery (see the sketch after the readings)
- Structural equation models for causal discovery from time series: Granger causality with instantaneous effects, causal discovery from subsampled data, causal discovery from partially observed processes
- Reading: 1. “Causal discovery in the presence of measurement error: Identifiability conditions” (by Zhang et al.), UAI 2018; 2. “Causal discovery in the presence of missing data” (by Tu et al.), AISTATS 2019; 3. “Testing for causality: a personal viewpoint” (by Granger), J. Econ. Dyn. Control, 1980
- Reading: The idea of establishing identifiability: a linear, non-Gaussian case
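Here is a minimal sketch of a bivariate Granger-causality test using statsmodels (pip install statsmodels); the series are simulated so that x Granger-causes y with a one-step lag:

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(4)
n = 1_000
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    # y depends on its own past and on the past of x.
    y[t] = 0.6 * y[t - 1] + 0.8 * x[t - 1] + 0.3 * rng.standard_normal()

# Convention: the test asks whether the SECOND column Granger-causes the first.
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=2)
# Small p-values on the F-tests indicate that past x improves prediction of y.
```

Granger causality is essentially a conditional-independence statement restricted to temporal predecessors, which is how it connects to the constraint-based methods of Part IV; it can be misleading under instantaneous effects, subsampling, or hidden common causes, issues taken up in this session.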
Part VII. Causal representation learning (CRL) (2 weeks)
- 03/20 (Thu): CRL in the IID case: Benefits from functional constraints or sparsity
- Estimation of latent variables and their causal relations
- Tetrad conditions
- Rank deficiency and Rank-based Latent Causal Discovery (RLCD)
- Generalized Independent Noise (GIN) conditions
- Sparsity constraints for identifiability of nonlinear ICA
- Reading: 1. “A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables” (by Dong et al.), ICLR 2024; 2. “Generalized Independent Noise Condition for Estimating Linear Non-Gaussian Latent Variable Causal Graphs” (by Xie et al.), NeurIPS 2020; 3. “Generalizing Nonlinear ICA Beyond Structural Sparsity” (by Zheng & Zhang), NeurIPS 2023
- 03/25 (Tue): CRL from changes: Benefits from multiple distributions
- Nonlinear ICA with surrogate variables
- Partial disentanglement with identifiable changing components
- General, nonparametric case
- Minimal change principle
- Reading: 1. “Nonlinear ICA using auxiliary variables and generalized contrastive learning” (by Hyvärinen et al.), AISTATS 2019; 2. “Partial disentanglement for domain adaptation” (by Kong et al.), ICML 2022; 3. “Causal Representation Learning from Multiple Distributions: A General Setting” (by Zhang et al.), ICML 2024
- 03/27 (Thu): CRL from temporal data
- Why temporal information helps
- Temporal disentanglement
- With instantaneous relations
- Reading: 1. “Temporally Disentangled Representation Learning” (by Yao et al.), NeurIPS 2022; 2. “On the Identification of Temporally Causal Representation with Instantaneous Dependence” (by Li et al.), arXiv 2024
- Assignment 3 released
- 04/01 (Tue): Real-world problems of CRL
- Psychometric studies
- Multi-modal causal discovery with latent variables
- Refined (causal) CLIP model
- Reading: 1. “Multi-domain image generation and translation with identifiability guarantees” (by Xie et al.), ICLR 2023; 2. A paper on the refined causal CLIP model will be shared later
Part VIII. Causal view for machine learning and artificial intelligence (2 weeks)
- 04/08 (Tue): Transfer learning and image translation
- A picture of machine learning, especially deep learning
- Transfer learning
- Image-to-image translation: A causal perspective
- Reading: 1."Domain Adaptation as a Problem of Inference on Graphical Models” (by Zhang et al.), NeurIPS 2020; 2. “Multi-domain image generation and translation with identifiability guarantees” (by Xie et al.), ICLR 2023 3. A paper on general settings of domain adaptation will be shared later
- Discussion: How is general-purpose AI connected to causality, and how can we achieve it?
- 04/10 (Thu): Semi-supervised learning, reinforcement learning, and large models
- Semi-supervised learning
- Reinforcement learning: learning and using causal representations
- Causality of/for large models
- Reading: 1. “On causal and anticausal learning” (by Schölkopf et al.), ICML 2012; 2. “AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning” (by Huang et al.), ICLR 2022; 3. A paper on causality of/for large models will be provided later, considering the rapid evolution of the field
- 04/15 (Tue): Unsupervised deep learning, deep generative models, and causal generative AI
- Adversarial vulnerability
- How causality helps in unsupervised deep learning
- Disentanglement
- Autoencoders, generative adversarial networks (GANs), and diffusion models (e.g., Stable Diffusion)
- How causal learning benefits generative AI
- Reading: 1. “Generative Adversarial Nets” (by Goodfellow et al.), NIPS 2014; 2. “Nonlinear ICA using auxiliary variables and generalized contrastive learning” (by Hyvärinen et al.), AISTATS 2019; 3. “Causal Compositional Image Generation with Minimal Change” (by Xie et al.), arXiv 2024
- Assignment 4 released
- 04/17 (Thu): Fairness in machine learning
- Group-level and individual-level fairness
- Evaluating fairness vs. achieving fairness
- Procedural fairness
- Reading: to be provided
Part IX. Real applications of causal discovery, review, and outlook (1 week)
- 04/22 (Tue): Applications of causal discovery and causal representation learning
- Causal analysis in neuroscience (especially in brain network discovery from fMRI)
- Causal analysis in finance and biology
- Real data sets for causal discovery
- Discussion: causality vs. selection
- 04/24 (Thu): Review and Outlook
- Review: Why causality? How do we find & make use of causality?
- How to achieve automated scientific discovery (generation of causal hypotheses followed by verifications)?
- Discussion:
- Can we avoid using “causality”?
- Causality in the era of large models?
- Does causality facilitate a second scientific revolution?
Final project
Topics
Participants are encouraged to present their own causality-related problems or data sets for the final project (if they prefer not to do the problem sets or write an essay). Alternatively, they can choose one of the following topics (reading materials will be provided). Please determine the topic together with the instructor by 02/28.
- Domain specific causal discovery (e.g., for fMRI, MEG, economic data, financial data, and climate data)
- Causal perspective of multi-task learning
- Causal treatment of out-of-distribution prediction
- Prediction in nonstationary environments: Invariant representations or the ability to adapt?
- Understanding “feedback” in directed graphical causal representations
- Causality and heterogeneity in learning
- Causality and transfer learning with different feature spaces: Why and how does causal knowledge help?
- Causal discovery and complexity measures
- Causal discovery in the presence of confounders from distribution changes
- Towards “universal” causal discovery
- Creativity by causal knowledge integration and counterfactual reasoning
- Causal links vs. associations between genes, traits, and disease (c.f. "Integrative analysis of 111 reference human epigenomes," available at http://www.nature.com/nature/journal/v518/n7539/full/nature14248.html)
- Causal analysis in stock market and interpretation of the causal relations
- Causality in climate analysis (e.g., prediction and understanding of El Niño; c.f. the “Azimuth El Niño Project”)
- Causality and prediction in nonstationary environments (e.g., how to improve the performance of Google Flu Trends; c.f. http://gking.harvard.edu/publications/parable-google-flu%C2%A0traps-big-data-analysis)
- Finding causal knowledge and using it for crime control
- Causality-based computational social science
- Large language models and causality
Requirements
Students are expected to present the selected problems, make progress on the topics, and summarize their achievements. The project may be completed alone or in two-person groups.
Key dates
- 02/28: The last day to decide on the topic of your project/essay.
- 03/04: The last day to send the instructor a short (half-page) description of the proposed final project/essay & a short (one-page) description of your initial ideas as to how to tackle the problem (due: 11:59 p.m.).
- 05/02: The final version of the project report/essay (due: 11:59 p.m.).
Additional information
To students
Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising properly, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress.
All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful.
If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.
Diversity statement
We must treat every individual with respect. We are diverse in many ways, and this diversity is fundamental to building and maintaining an equitable and inclusive campus community. Diversity can refer to multiple ways that we identify ourselves, including but not limited to race, color, national origin, language, sex, disability, age, sexual orientation, gender identity, religion, creed, ancestry, belief, veteran status, or genetic information. Each of these diverse identities, along with many others not mentioned here, shape the perspectives our students, faculty, and staff bring to our campus. We, at CMU, will work to promote diversity, equity and inclusion not only because diversity fuels excellence and innovation, but because we want to pursue justice. We acknowledge our imperfections while we also fully commit to the work, inside and outside of our classrooms, of building and sustaining a campus community that increasingly embraces these core values.
Each of us is responsible for creating a safer, more inclusive environment.
Unfortunately, incidents of bias or discrimination do occur, whether intentional or unintentional. They contribute to creating an unwelcoming environment for individuals and groups at the university. Therefore, the university encourages anyone who experiences or observes unfair or hostile treatment on the basis of identity to speak out for justice and support, within the moment of the incident or after the incident has passed. Anyone can share these experiences using the following resources:
- Center for Student Diversity and Inclusion: csdi@andrew.cmu.edu, (412) 268-2150
- Report-It online anonymous reporting platform: reportit.net (username: tartans; password: plaid)
All reports will be documented and reviewed to determine whether any follow-up action is needed. Regardless of incident type, the university will use all shared experiences to transform our campus climate to be more equitable and just.
Course policies
Remember: If you registered for this class, you have until March 31st to change your grade in this course from a letter grade to a Pass/Fail grade.
Cheating and plagiarism
It is the responsibility of each student to be aware of the university policies on academic integrity, including the policies on cheating and plagiarism. This information is available at http://www.cmu.edu/academic-integrity.
Disability
If you have a disability and have an accommodations letter from the Disability Resources office, I encourage you to discuss your accommodations and needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.
Template from CMU 10-701 Course. Many thanks to the original creators!