

Causality and Machine Learning
80816/80516, Spring 2025
Key Information and Links
Instructor: Kun Zhang
Lectures: Tuesdays and Thursdays, 12:30 – 1:50 PM. Location: Tepper Building 1308.
Office Hours: Wednesdays, 3:00 – 4:00 PM (other times by appointment). Location: 161B Baker Hall or Zoom (if needed).
Canvas: https://canvas.cmu.edu/courses/46438
Syllabus
1. Course description
Over the past few decades, significant progress has been made in tackling long-standing causality problems, such as discovering causal relations from observational data and inferring causal effects. Moreover, it has recently been shown that a causal perspective aids in understanding and solving various machine learning problems, such as transfer learning, out-of-distribution prediction, disentanglement, representation learning, and adversarial vulnerability. Accordingly, this course is concerned with understanding causality, learning it from observational data, and using it to tackle other learning problems.
The course covers representations of causal models, how causality differs from association, methods for causal discovery and causal representation learning, and how causality enhances advanced learning tasks, including generative AI. We will address the following questions: Why is causality essential? How can we learn it, including latent variables, from observational data? How can we make sure the estimated representation is causal? What role does causality play in learning under data heterogeneity? Can causal principles make generative AI more controllable and capable of extrapolation? How can deep learning benefit from a causal perspective?
Two main causality problems are emphasized. One is causal discovery, or causal representation learning. It is well known that “correlation does not imply causation,” but we will make this precise by asking what assumptions, what information in the data, and what procedures enable us to successfully recover causal information. Causal relations may also hold among underlying hidden variables; we will see how to uncover these hidden “causal” variables, as well as their causal relations, from the measured variables. The other problem is how to properly make use of causal information. This includes identification of causal effects, counterfactual reasoning, and improving machine learning with causal knowledge.
2. Course objectives
As an outcome of this course, participants are expected to:
- Understand how causality is different from association and why it is useful,
- Get familiar with graphical models, causality-related concepts and principles, and emerging approaches to causal discovery or causal representation learning from observational data,
- Be acquainted with the state-of-the-art of causality research in different disciplines,
- Be able to develop suitable methods for causal representation learning or causal discovery to address problems in specific domains,
- Properly leverage causality in understanding and solving advanced machine learning and artificial intelligence problems,
- Identify and formulate causal problems in their respective fields, and be able to find potential solutions.
3. Who can attend
There are no formal prerequisites, but a background in introductory statistics or machine learning will be helpful. The course is accessible to students from across disciplines, and we especially welcome students from different departments.
4. Course materials
Reading materials will be available online or distributed in class. In addition, we will refer to several chapters of the following two books frequently (some chapters will be available on Canvas):
- Peter Spirtes, Clark Glymour, and Richard Scheines (SGS). 2000. Causation, Prediction, and Search, 2nd edition. MIT Press, Cambridge, Massachusetts.
- Judea Pearl. 2009. Causality: Models, Reasoning and Inference, 2nd edition. Cambridge University Press, Cambridge.
5. Grading
Attendance counts for 5% of your grade. You may miss two classes without penalty; after that, each additional absence lowers your final course grade by 1% (up to the full 5%), unless you obtain approval. In addition, 10% of your grade is for active involvement in in-class discussions (raising or answering questions and participating in discussions).
There will be four homework assignments, worth 40% of your grade in total. Submit your homework on Canvas as MS Word or PDF files (in special situations, you may submit homework by email to the instructor). 20% of an assignment's score will be deducted if it is late, unless you obtain the instructor's approval in advance.
The project/essay proposal, from each individual or team of two students, counts for 10% and is due on March 14 at 11:59 PM; the final project report or essay counts for 35% and is due on May 2 at 11:59 PM. Please work together with the instructor to decide on the topic of your project/essay by February 28. (See more detail after the Class Schedule.)
Remark: For evaluation of the project report and presentations, we will adopt discipline-specific criteria for students from different disciplines (e.g., philosophy, machine learning, statistics, computer science, psychology, information systems, social and decision sciences, public policy, and biology). The evaluation is partly based on the significance, expected output, and novelty of the problems in the students’ respective fields and their interest to general audiences.
Class schedule
Class meetings consist of lecture presentations on principles and methodologies for causal discovery, causal inference, counterfactual reasoning, and causal representation learning. If time permits, we may have guest lectures on various topics.
The course is divided into nine parts; see below. Students are expected to finish the readings and try to come up with questions before coming to class.
Part I. Introduction (1 week)
- 01/14 (Tue) & 01/16 (Thu): Introduction: Concepts, problems, and a big picture of machine learning
- Introduction to machine learning and artificial intelligence & how they are connected with causality
- Causality-related concepts, principles, and problems: definition of causality, motivation for causal analysis, directed acyclic graphs, interventions, structural equation models
- Discussion: Why do we care about causality?
- Research problems in causality: Causal discovery, causality for machine learning, identification of causal effects, counterfactual reasoning, generative AI, (automated) scientific discovery
- Summary of fundamental problems and recent achievements
- Reading: 1. Pages 1-6 of “Causal discovery and inference: concepts and recent methodological advances” (by Spirtes & Zhang), Applied Informatics, 2016; 2. Chapter 1 of Pearl’s book
- Open discussion: What do you think of causal analysis in your field?
Part II. Preliminaries: Statistics, information theory, basic machine learning, graphical models, and traditional multivariate analysis (2 weeks)
- 01/21 (Tue): From probability theory to statistics
- Probability axioms, discrete and continuous variables
- Statistical independence and conditional independence
- Sample statistics: expectation, covariance, and correlation; uncorrelatedness vs. independence (a numerical sketch follows this session’s reading)
- Central Limit Theorem and Cramér Decomposition Theorem
- Gaussian distribution
- Why is it widely assumed but rarely encountered?
- Is it a blessing or a challenge to causal discovery?
- Three ways of making use of data
- Bayes’ rule
- Statistical tests
- Maximum likelihood estimation (point estimation)
- Linear regression
- Reading: Chapter on maximum likelihood estimation of “Probability and Statistical Inference” (by R. V. Hogg, E. A. Tanis, and D. L. Zimmerman)
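To make “uncorrelatedness vs. independence” concrete, here is a minimal numerical sketch (assuming only NumPy; the variable names and sample size are illustrative) of the classic example X ~ N(0, 1), Y = X²: the two are uncorrelated yet fully dependent.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x ** 2  # Y is a deterministic function of X

# The sample correlation is near 0, since Cov(X, X^2) = E[X^3] = 0
# for any distribution symmetric about zero, such as the Gaussian.
print("corr(X, Y):", np.corrcoef(x, y)[0, 1])

# Yet X and Y are clearly dependent: conditioning on |X| changes
# the distribution of Y drastically.
print("E[Y | |X| > 1]: ", y[np.abs(x) > 1].mean())
print("E[Y | |X| <= 1]:", y[np.abs(x) <= 1].mean())
```

Independence requires every (measurable) function of X to be uncorrelated with every function of Y, which is why independence constraints carry strictly more information than correlations; this distinction matters throughout the causal discovery methods in Parts IV and V.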
- 01/23 (Thu) and 01/28 (Tue): Traditional machine learning: Settings, assumptions, basic methods, and model selection
- Supervised learning
- From linear to nonlinear models
- Nonparametric models
- Bias-variance tradeoff
- Model selection
- Unsupervised learning
- Two ways to “simplify” data
- Assumptions underlying clustering
- 01/30 (Thu): Multivariate analysis: Goals, techniques, and connections to causal discovery
- Why multivariate analysis
- Principal component analysis (PCA)
- Factor analysis (is it useful for causal discovery?)
- Independent component analysis (ICA): Linear and nonlinear cases (see the sketch below)
- The (imprecise) connection between multivariate analysis methods and causal analysis
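To preview linear ICA and its indeterminacies, here is a minimal sketch (assuming NumPy and scikit-learn; the mixing matrix and sources are illustrative):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
n = 5_000
# Two independent, non-Gaussian sources (uniform and Laplacian).
s = np.column_stack([rng.uniform(-1, 1, n), rng.laplace(size=n)])
a = np.array([[1.0, 0.5],
              [0.3, 1.0]])  # mixing matrix
x = s @ a.T                 # observed linear mixtures

ica = FastICA(n_components=2, random_state=0)
s_hat = ica.fit_transform(x)

# The recovered sources match the true ones only up to permutation and
# scaling: exactly the indeterminacy of linear ICA.
print(np.corrcoef(s.T, s_hat.T).round(2))
```

The fact that non-Gaussianity makes the mixing identifiable (up to these indeterminacies) is what LiNGAM, covered in Part V, exploits for causal discovery.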
- 02/04 (Tue): Graphical models, d-separation, and representation of causal relations
- Graphical models
- d-separation (see the sketch below)
- Markov conditions
- Causal graphical models
- Reading: OLI “Causal and statistical reasoning,” Module 13 (CMU users can sign in through the OLI site)
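As a quick illustration of d-separation, here is a minimal sketch assuming the networkx library (the function is named d_separated in networkx 2.8-3.2 and is_d_separator in newer releases, so adjust to your version):

```python
import networkx as nx

# A small DAG with a chain X -> Z -> Y and a collider X -> W <- Y.
g = nx.DiGraph([("X", "Z"), ("Z", "Y"), ("X", "W"), ("Y", "W")])

# Conditioning on the mediator Z blocks the chain, and the collider W
# blocks the other path by default, so X and Y are d-separated given {Z}.
print(nx.d_separated(g, {"X"}, {"Y"}, {"Z"}))        # True

# Conditioning on the collider W opens the path X -> W <- Y.
print(nx.d_separated(g, {"X"}, {"Y"}, {"Z", "W"}))   # False
```

Under the causal Markov condition, every d-separation in the graph implies a conditional independence in the data distribution; constraint-based discovery (Part IV) runs this implication in reverse.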
Part III. Identification of causal effects & counterfactual reasoning (1 week)
- 02/06 (Thu): Identifiability & identification of causal effects, the potential outcome framework, and graphical identifiability criteria
- Interventions and causal effects
- Potential outcome framework vs. graphical criteria
- Controlling confounding bias: back-door and front-door criteria (a simulation sketch follows this session’s reading)
- Unification of the criteria for causal effect identification
- Nonparametric vs. parametric cases
- Propensity score and its applications
- New machine learning methods for causal effect estimation
- Reading: Pages 65-78 of Pearl’s book
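To see the back-door adjustment at work, here is a minimal simulation (NumPy only; all probabilities and coefficients are illustrative) in which a binary confounder Z drives both the treatment X and the outcome Y:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
z = rng.binomial(1, 0.5, n)                   # confounder
x = rng.binomial(1, 0.2 + 0.6 * z)            # treatment depends on Z
y = rng.binomial(1, 0.1 + 0.3 * x + 0.4 * z)  # outcome depends on X and Z

# The naive contrast is confounded by Z and overestimates the effect.
naive = y[x == 1].mean() - y[x == 0].mean()

# Back-door adjustment: sum_z [P(y | x=1, z) - P(y | x=0, z)] P(z).
adjusted = sum(
    (y[(x == 1) & (z == v)].mean() - y[(x == 0) & (z == v)].mean())
    * (z == v).mean()
    for v in (0, 1)
)
print(f"naive: {naive:.3f}, adjusted: {adjusted:.3f}, true effect: 0.300")
```

The adjusted contrast recovers the interventional effect because Z blocks every back-door path from X to Y.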
- 02/11 (Tue): Counterfactual reasoning
- Counterfactual reasoning vs. traditional prediction
- Methods for counterfactual reasoning
- Three-step procedure (abduction, action, prediction) and its limitations; see the sketch below
- Recent developments: Nonparametric counterfactuals & natural counterfactuals
- Reading: 1. Pages 78-89 of Pearl’s book; 2. “Natural counterfactuals” (by Hao et al.), NeurIPS 2024
- Assignment 1 released
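Here is a toy sketch of the three-step counterfactual procedure in a two-variable linear structural causal model (the coefficients and observed values are illustrative):

```python
# SCM: X := U_x,  Y := 2*X + U_y.
# Observed: X = 1, Y = 3.  Query: what would Y have been had X been 0?

# Step 1 (abduction): infer the exogenous terms from the observation.
x_obs, y_obs = 1.0, 3.0
u_x = x_obs               # from X := U_x
u_y = y_obs - 2 * x_obs   # from Y := 2X + U_y, so U_y = 1

# Step 2 (action): replace the equation for X by the intervention X := 0.
x_cf = 0.0

# Step 3 (prediction): propagate through the modified model with U_y fixed.
y_cf = 2 * x_cf + u_y
print("counterfactual Y:", y_cf)  # 1.0
```

This is the standard abduction-action-prediction recipe; its limitations in nonparametric settings motivate the recent developments listed above.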
Part IV. Traditional approaches to causal discovery: “Independence” in causal models, constraint- and score-based causal discovery (1 week)
- 02/13 (Thu): “Independence” in causal models & constraint-based causal discovery
- “Independence” implied by causal models: general ideas
- “Independence” instantiation 1: conditional independence for causal discovery
- PC algorithm for causal discovery
- FCI algorithm for causal discovery
- GES algorithm for causal discovery
- Demonstration: Using causal-learn or TETRAD for causal analysis (see the sketch below)
- Reading: 1. Chapters 5.4.1 & 5.4.2 of the SGS book; 2. Chapter 6.7 of the SGS book
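As a starting point for the demonstration, here is a minimal sketch of running the PC algorithm with the causal-learn package (pip install causal-learn); the chain-structured data below are simulated and purely illustrative:

```python
import numpy as np
from causallearn.search.ConstraintBased.PC import pc

rng = np.random.default_rng(2)
n = 2_000
x = rng.standard_normal(n)
z = 0.8 * x + 0.3 * rng.standard_normal(n)   # X -> Z
y = 0.8 * z + 0.3 * rng.standard_normal(n)   # Z -> Y
data = np.column_stack([x, z, y])

# PC with the default Fisher-z conditional independence test; the result
# is a CPDAG, i.e., the Markov equivalence class of the true DAG.
cg = pc(data, alpha=0.05)
print(cg.G)
```

With Gaussian data, PC can only identify the graph up to its Markov equivalence class; orienting the remaining edges is one motivation for the functional-causal-model methods of Part V.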
Part V. Functional causal model-based approaches to causal discovery: Linear, non-Gaussian methods and beyond (2.5 weeks)
- 02/18 (Tue): Linear, non-Gaussian, acyclic causal models (LiNGAM)
- Structural equation models and independent noise (“Independence” instantiation 2)
- LiNGAM: identifiability and identification of the causal model
- Reading: “A linear non-Gaussian acyclic model for causal discovery” (by Shimizu et al.), Journal of Machine Learning Research, 2006
- 02/20 (Thu): Estimating Linear, non-Gaussian, acyclic causal models (LiNGAM) with ICA and other methods
- ICA and its relation to LiNGAM
- Estimating LiNGAM with the independent noise condition (see the sketch below)
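Here is a minimal sketch of estimating a LiNGAM model with the lingam package by Shimizu et al. (pip install lingam); the two-variable model is simulated with uniform, hence non-Gaussian, noise:

```python
import numpy as np
import lingam

rng = np.random.default_rng(3)
n = 2_000
e1 = rng.uniform(-1, 1, n)   # non-Gaussian noise
e2 = rng.uniform(-1, 1, n)
x1 = e1
x2 = 1.5 * x1 + e2           # true model: x1 -> x2
X = np.column_stack([x1, x2])

model = lingam.DirectLiNGAM()
model.fit(X)
print(model.causal_order_)      # expected: [0, 1]
print(model.adjacency_matrix_)  # entry [1, 0] should be close to 1.5
```

Were the noise terms Gaussian, the two directions x1 -> x2 and x2 -> x1 would fit the data equally well; non-Gaussianity breaks this symmetry, which is the core insight behind LiNGAM.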
- 02/25 (Tue): Traditional causal discovery in the presence of confounders
- Linear causal discovery in the presence of confounders: What is identifiable and how to estimate it?
- Reading: 1. “Estimation of causal effects using linear non-Gaussian causal models with hidden variables” (by Hoyer et al.), International Journal of Approximate Reasoning, 2008; 2. “Learning Linear Non-Gaussian Causal Models in the Presence of Latent Variables” (by Salehkaleybar et al.), Journal of Machine Learning Research (JMLR), 2020
- Discussion: What do you think of causal discovery and those methods?
- 02/27 (Thu): Discovery of cyclic causal models (causal models with feedback)
- Interpretation of feedback in causal representations
- Linear causal discovery in the presence of feedback
- Reading: “Discovering cyclic causal models by independent components analysis” (by Lacerda et al.), UAI 2008
- Assignment 2 released
- 03/11 (Tue): Nonlinear causal models & Independent causal mechanism for causal discovery
- Post-nonlinear causal models
- Nonlinear additive noise models (a sketch follows this session’s reading)
- Estimation of nonlinear causal models
- “Independent nonlinear mechanism” for causal discovery in deterministic cases
- “Independent mechanism” in linear, high-dimensional case
- Reading: Pages 11-18 of “Causal discovery and inference: concepts and recent methodological advances” (by Spirtes & Zhang), Applied Informatics, 2016
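To illustrate the additive-noise idea, here is a minimal sketch (assuming NumPy and scikit-learn): regress each variable on the other and prefer the direction whose residual is more independent of the input. The HSIC-style statistic below is a crude biased estimate rather than a calibrated test, and all model choices are illustrative:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(5)
n = 500
x = rng.uniform(-2, 2, n)
y = x ** 3 + rng.uniform(-1, 1, n)   # true model: x -> y, additive noise

def rbf_gram(v, gamma=1.0):
    """RBF Gram matrix of a 1-d sample."""
    d = (v[:, None] - v[None, :]) ** 2
    return np.exp(-gamma * d)

def hsic(a, b):
    """Crude biased HSIC estimate between 1-d samples a and b."""
    m = len(a)
    h = np.eye(m) - np.ones((m, m)) / m      # centering matrix
    ka, kb = h @ rbf_gram(a) @ h, h @ rbf_gram(b) @ h
    return np.trace(ka @ kb) / m ** 2

def residual_dependence(cause, effect):
    """Fit a nonlinear regression and measure residual-input dependence."""
    reg = KernelRidge(kernel="rbf", alpha=0.1).fit(cause[:, None], effect)
    return hsic(cause, effect - reg.predict(cause[:, None]))

# The direction with the more independent residual is preferred: here x -> y.
print("x -> y:", residual_dependence(x, y))
print("y -> x:", residual_dependence(y, x))
```

In the correct direction the residual approximates the true independent noise, so the dependence measure should be markedly smaller for x -> y than for y -> x.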
Part VI. Practical issues in causal discovery (1 week)
- 03/13 (Thu): Causal discovery in the presence of nonstationary causal models, mixed types of variables, and selection bias
- Modeling causal processes with continuous/discrete variables
- Causal discovery and visualization of nonstationary causal models
- Effects of different types of selection bias
- Causal discovery in the presence of selection bias
- Reading: 1. “Causal discovery from heterogeneous/nonstationary data” (by Huang et al.) JMLR 2020; 2. “On the Identifiability and Estimation of Functional Causal Models in the Presence of Outcome-Dependent Selection” (by Zhang et al.), UAI 2016; 3. “Detecting and Identifying Selection Structure in Sequential Data” (by Zheng et al.), ICML 2024
- 03/18 (Tue): Handling measurement error, missing values, and time series
- Causal discovery in the presence of measurement error
- Missing data as a causal problem & causal discovery in the presence of missing data
- Granger causality and its relation to constraint-based causal discovery (see the sketch after the readings)
- Structural equation models for causal discovery from time series: Granger causality with instantaneous effects, causal discovery from subsampled data, causal discovery from partially observed processes
- Reading: 1. “Causal discovery in the presence of measurement error: Identifiability conditions” (by Zhang et al.), UAI 2018; 2. “Causal discovery in the presence of missing data” (by Tu et al.), AISTATS 2019; 3. “Testing for causality: a personal viewpoint” (by Granger), J. Econ. Dyn. Control, 1980
- Reading: The idea of establishing identifiability: a linear, non-Gaussian case
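Here is a minimal sketch of a bivariate Granger-causality test using statsmodels (pip install statsmodels); the series are simulated so that x Granger-causes y with a one-step lag:

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(4)
n = 1_000
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    # y depends on its own past and on the past of x.
    y[t] = 0.6 * y[t - 1] + 0.8 * x[t - 1] + 0.3 * rng.standard_normal()

# Convention: the test asks whether the SECOND column Granger-causes the first.
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=2)
# Small p-values on the F-tests indicate that past x improves prediction of y.
```

Granger causality is essentially a conditional-independence statement restricted to temporal predecessors, which is how it connects to the constraint-based methods of Part IV; it can be misleading under instantaneous effects, subsampling, or hidden common causes, issues taken up in this session.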
Part VII. Causal representation learning (CRL) (2 weeks)
- 03/20 (Thu): CRL in the IID case: Benefits from functional constraints or sparsity
- Estimation of latent variables and their causal relations
- Tetrad conditions
- Rank deficiency and Rank-based Latent Causal Discovery (RLCD)
- Generalized Independent Noise (GIN) conditions
- Sparsity constraints for identifiability of nonlinear ICA
- Reading: 1. “A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables” (by Dong et al.), ICLR 2024; 2. “Generalized Independent Noise Condition for Estimating Linear Non-Gaussian Latent Variable Causal Graphs” (by Xie et al.), NeurIPS 2020; 3. “Generalizing Nonlinear ICA Beyond Structural Sparsity” (by Zheng & Zhang), NeurIPS 2023
- 03/25 (Tue): CRL from changes: Benefits from multiple distributions
- Nonlinear ICA with surrogate variables
- Partial disentanglement with identifiable changing components
- General, nonparametric case
- Minimal change principle
- Reading: 1. “Nonlinear ICA using auxiliary variables and generalized contrastive learning” (by Hyvärinen et al.), AISTATS 2019; 2. “Partial disentanglement for domain adaptation” (by Kong et al.), ICML 2022; 3. “Causal Representation Learning from Multiple Distributions: A General Setting” (by Zhang et al.), ICML 2024
- 03/27 (Thu): CRL from temporal data
- Why temporal information helps
- Temporal disentanglement
- With instantaneous relations
- Reading: 1. “Temporally Disentangled Representation Learning” (by Yao et al.), NeurIPS 2022; 2. “On the Identification of Temporally Causal Representation with Instantaneous Dependence” (by Li et al.), arXiv 2024
- Assignment 3 released
- 04/01 (Tue): Real-world problems of CRL
- Psychometric studies
- Multi-modal causal discovery with latent variables
- Refined (causal) CLIP model
- Reading: 1. “Multi-domain image generation and translation with identifiability guarantees” (by Xie et al.), ICLR 2023; 2. A paper on the refined causal CLIP model will be shared later
Part VIII. Causal view for machine learning and artificial intelligence (2 weeks)
- 04/08 (Tue): Transfer learning and image translation
- A picture of machine learning, especially deep learning
- Transfer learning
- Image-to-image translation: A causal perspective
- Reading: 1."Domain Adaptation as a Problem of Inference on Graphical Models” (by Zhang et al.), NeurIPS 2020; 2. “Multi-domain image generation and translation with identifiability guarantees” (by Xie et al.), ICLR 2023 3. A paper on general settings of domain adaptation will be shared later
- Discussion: How is general-purpose AI connected to causality, and how can we achieve it?
- 04/10 (Thu): Semi-supervised learning, reinforcement learning, and large models
- Semi-supervised learning
- Reinforcement learning: learning and using causal representations
- Causality of/for large models
- Reading: 1. “On causal and anticausal learning” (by Schölkopf et al.), ICML 2012; 2. “AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning” (by Huang et al.), ICLR 2022; 3. A paper on causality of/for large models will be provided later, considering the rapid evolution of the field
- 04/15 (Tue): Unsupervised deep learning, deep generative models, and causal generative AI
- Adversarial vulnerability
- How causality helps in unsupervised deep learning
- Disentanglement
- Autoencoders, generative adversarial networks (GANs), and diffusion models (e.g., Stable Diffusion)
- How causal learning benefits generative AI
- Reading: 1. “Generative Adversarial Nets” (by Goodfellow et al.), NIPS 2014; 2. “Nonlinear ICA using auxiliary variables and generalized contrastive learning” (by Hyvärinen et al.), AISTATS 2019; 3. “Causal Compositional Image Generation with Minimal Change” (by Xie et al.), arXiv 2024
- Assignment 4 released
- 04/17 (Thu): Fairness in machine learning
- Group-level and individual-level fairness
- Evaluating fairness vs. achieving fairness
- Procedural fairness
- Reading: to be provided
Part IX. Real applications of causal discovery, review, and outlook (1 week)
- 04/22 (Tue): Applications of causal discovery and causal representation learning
- Causal analysis in neuroscience (especially in brain network discovery from fMRI)
- Causal analysis in finance and biology
- Real data sets for causal discovery
- Discussion: causality vs. selection
- 04/24 (Thu): Review and Outlook
- Review: Why causality? How do we find & make use of causality?
- How to achieve automated scientific discovery (generation of causal hypotheses followed by verifications)?
- Discussion:
- Can we avoid using “causality”?
- Causality in the era of large models?
- Does causality facilitate a second scientific revolution?
Final project
Topics
Participants are encouraged to present their own causality-related problems or data sets for the final project (if they prefer not to do the problem sets or write an essay). Alternatively, they can choose one of the following topics (reading materials will be provided). Please determine the topic together with the instructor by 02/28.
- Domain specific causal discovery (e.g., for fMRI, MEG, economic data, financial data, and climate data)
- Causal perspective of multi-task learning
- Causal treatment of out-of-distribution prediction
- Prediction in nonstationary environments: Invariant representations or the ability to adapt?
- Understanding “feedback” in directed graphical causal representations
- Causality and heterogeneity in learning
- Causality and transfer learning with different feature spaces: Why and how does causal knowledge help?
- Causal discovery and complexity measures
- Causal discovery in the presence of confounders from distribution changes
- Towards “universal” causal discovery
- Creativity by causal knowledge integration and counterfactual reasoning
- Causal links vs. associations between genes, traits, and disease (c.f. "Integrative analysis of 111 reference human epigenomes," available at http://www.nature.com/nature/journal/v518/n7539/full/nature14248.html)
- Causal analysis in stock market and interpretation of the causal relations
- Causality in climate analysis (e.g., prediction and understanding of El Niño; c.f. the “Azimuth El Niño Project”)
- Causality and prediction in nonstationary environments (e.g., how to improve the performance of Google Flu Trends; c.f. http://gking.harvard.edu/publications/parable-google-flu%C2%A0traps-big-data-analysis)
- Finding causal knowledge and using it for crime control
- Causality-based computational social science
- Large language models and causality
Requirements
Students are expected to present the selected problems, make progress on the topics, and summarize their achievements. The project may be completed alone or in two-person groups.
Key dates
- 02/28: The last day to decide on the topic of your project/essay.
- 03/04: The last day to send the instructor a short (half-page) description of the proposed final project/essay & a short (one-page) description of your initial ideas as to how to tackle the problem (due: 11:59 p.m.).
- 05/02: The final version of the project report/essay (due: 11:59 p.m.).
Additional information
To students
Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising properly, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress.
All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful.
If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.
Diversity statement
We must treat every individual with respect. We are diverse in many ways, and this diversity is fundamental to building and maintaining an equitable and inclusive campus community. Diversity can refer to multiple ways that we identify ourselves, including but not limited to race, color, national origin, language, sex, disability, age, sexual orientation, gender identity, religion, creed, ancestry, belief, veteran status, or genetic information. Each of these diverse identities, along with many others not mentioned here, shape the perspectives our students, faculty, and staff bring to our campus. We, at CMU, will work to promote diversity, equity and inclusion not only because diversity fuels excellence and innovation, but because we want to pursue justice. We acknowledge our imperfections while we also fully commit to the work, inside and outside of our classrooms, of building and sustaining a campus community that increasingly embraces these core values.
Each of us is responsible for creating a safer, more inclusive environment.
Unfortunately, incidents of bias or discrimination do occur, whether intentional or unintentional. They contribute to creating an unwelcoming environment for individuals and groups at the university. Therefore, the university encourages anyone who experiences or observes unfair or hostile treatment on the basis of identity to speak out for justice and support, within the moment of the incident or after the incident has passed. Anyone can share these experiences using the following resources:
- Center for Student Diversity and Inclusion: csdi@andrew.cmu.edu, (412) 268-2150
- Report-It online anonymous reporting platform: reportit.net (username: tartans; password: plaid)
All reports will be documented and reviewed to determine whether any follow-up action is needed. Regardless of incident type, the university will use all shared experiences to transform our campus climate to be more equitable and just.
Course policies
Remember: If you registered for this class, you have until March 31st to change your grade in this course from a letter grade to a Pass/Fail grade.
Cheating and plagiarism
It is the responsibility of each student to be aware of the university policies on academic integrity, including the policies on cheating and plagiarism. This information is available at http://www.cmu.edu/academic-integrity.
Disability
If you have a disability and have an accommodations letter from the Disability Resources office, I encourage you to discuss your accommodations and needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.
Template from CMU 10-701 Course. Many thanks to the original creators!