George H. Chen

You can also find my papers listed on Google Scholar.

Some Working Papers

"Can Platform Accountability Reduce Sex Trafficking? Evidence from the Price Effect"
Helen S. Zeng, George H. Chen, Brett Danaher, Michael D. Smith
"Multi-stage Readmission and Mortality Prediction along Patient Care Pathway"
Xinyu Yao, Rema Padman, George H. Chen, Karmel S. Shehadeh, Arman Kilic
Under review

2025

"The Impact of Medication Non-adherence on Adverse Outcomes: Evidence from Schizophrenia Patients via Survival Analysis"
Shahriar Noroozizadeh, Pim Welle, Jeremy C. Weiss, George H. Chen
Accepted at the Conference on Health, Inference, and Learning (CHIL), 2025
[paper draft coming soon]

2024

"Generalized Prompt Tuning: Adapting Frozen Univariate Time Series Foundation Models for Multivariate Healthcare Time Series"
Mingzhu Liu, Angela H. Chen, George H. Chen
Machine Learning for Health (ML4H), December 2024
[arXiv] [publisher's link] [code]
(A short version of this paper not specific to healthcare was presented at the NeurIPS 2024 workshop on Time Series in the Age of Large Models)
"An Introduction to Deep Survival Analysis Models for Predicting Time-to-Event Outcomes"
George H. Chen
Foundations and Trends in Machine Learning, December 2024
[arXiv] [publisher's link] [code]
"Fairness in Survival Analysis with Distributionally Robust Optimization"
Shu Hu*, George H. Chen*
(* = equal contribution)
Journal of Machine Learning Research (JMLR), August 2024
[arXiv] [publisher's link] [code]
Note: Journal paper version of our ML4H 2022 paper—generalizes the DRO approach from our earlier paper to a wide class of survival models (such as but not limited to Cox, DeepHit, and SODEN models), adds theoretical analysis for our split DRO approach, and derives an exact DRO Cox method without sample splitting
"Neural Topic Models with Survival Supervision: Jointly Predicting Time-to-Event Outcomes and Learning How Clinical Features Relate"
George H. Chen*, Linhong Li*, Ren Zuo, Amanda Coston, Jeremy C. Weiss
(* = equal contribution)
Artificial Intelligence in Medicine, June 2024
[arXiv] [publisher's link] [code]
Note: Journal paper version of our AIME 2020 paper—fixes some model presentation glitches, includes more combinations of topic models with survival models, and has much more thorough discussion
"Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee"
George H. Chen
Journal of Machine Learning Research (JMLR), February 2024
[arXiv (includes minor corrections)] [publisher's link] [code] [ICML poster]
(Presented at the International Conference on Machine Learning (ICML) in July 2024 as part of the journal-to-conference track)
Best paper finalist (applied track) at the INFORMS Data Mining and Decision Analytics Workshop 2022
Note: This paper is about learning a flexible survival model that is in some sense interpretable and that simultaneously learns a "kernel function" (which measures the similarity between any two data points). This paper is the third in a trilogy of papers I've written on kernel survival analysis and aims to combine insights from my ICML 2019 paper (on theory for nearest neighbor and kernel Kaplan-Meier estimators) and my MLHC 2020 paper (on how to automatically learn kernel functions for kernel Kaplan-Meier estimators) so as to obtain a class of scalable, interpretable, and accurate kernel Kaplan-Meier estimators, where a special case of this class of estimators has a theoretical guarantee.
"Improving Fairness in Deepfake Detection"
Yan Ju*, Shu Hu*, Shan Jia, George H. Chen, Siwei Lyu
(* = equal contribution)
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), January 2024
[arXiv] [publisher's link]

2023

"Temporal Supervised Contrastive Learning for Modeling Patient Risk Progression"
Shahriar Noroozizadeh, Jeremy C. Weiss, George H. Chen
Machine Learning for Health (ML4H), December 2023
[arXiv] [publisher's link] [code]
(Preliminary version presented at the AAAI 2023 Workshop on Representation Learning for Responsible Human-Centric AI)
"Neurological Prognostication of Post-Cardiac-Arrest Coma Patients Using EEG Data: A Dynamic Survival Analysis Framework with Competing Risks"
Xiaobin Shen, Jonathan Elmer, George H. Chen
Machine Learning for Healthcare (MLHC), August 2023
[arXiv (includes minor corrections)] [publisher's link] [code]
"A General Framework for Visualizing Embedding Spaces of Neural Survival Analysis Models Based on Angular Information"
George H. Chen
Conference on Health, Inference, and Learning (CHIL), June 2023
[arXiv] [publisher's link] [code] [poster]
"Influence via Ethos: On the Persuasive Power of Reputation in Deliberation Online"
Emaad Manzoor, George H. Chen, Dokyun Lee, Michael D. Smith
Management Science, May 2023
[arXiv] [publisher's link] [Cornell news]
Best paper at the AAAI Workshop on AI for Behavior Change 2021

2022

"BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs"
Kay Liu*, Yingtong Dou*, Yue Zhao*, Xueying Ding, Xiyang Hu, Ruitong Zhang, Kaize Ding, Canyu Chen, Hao Peng, Kai Shu, Lichao Sun, Jundong Li, George H. Chen, Zhihao Jia, Philip S. Yu
(* = equal contribution)
Neural Information Processing Systems (NeurIPS) (Datasets and Benchmarks track), November-December 2022
[arXiv] [code]
"Distributionally Robust Survival Analysis: A Novel Fairness Loss Without Demographics"
Shu Hu*, George H. Chen*
(* = equal contribution)
Machine Learning for Health (ML4H), November 2022
[arXiv] [publisher's link] [code]
Note: In this original conference version of the paper, we only considered the Cox model. For a substantial extension to a much wider class of survival models, an exact Cox DRO method, and theory on our split DRO approach, please see our JMLR 2024 journal paper extension.
"TOD: Tensor-Based Outlier Detection, a General GPU-Accelerated Framework"
Yue Zhao, George H. Chen, Zhihao Jia
Proceedings of the VLDB Endowment, Vol 16, No. 3, November 2022
[arXiv] [publisher's link] [code]
(Presented at the International Conference on Very Large Data Bases, August-September 2023)
"ECOD: Unsupervised Outlier Detection Using Empirical Cumulative Distribution Functions"
Zheng Li*, Yue Zhao*, Xiyang Hu, Nicola Botta, Cezar Ionescu, George H. Chen
(* = equal contribution)
IEEE Transactions on Knowledge and Data Engineering, March 2022
[arXiv] [publisher's link] [code]

2021

"Consumer Behavior in the Online Classroom: Using Video Analytics and Machine Learning to Understand the Consumption of Video Courseware"
Mi Zhou, George H. Chen, Pedro Ferreira, Michael D. Smith
Journal of Marketing Research, December 2021
[SSRN] [publisher's link]

2020

"Neural Topic Models with Survival Supervision: Jointly Predicting Time-to-Event Outcomes and Learning How Clinical Features Relate"
Linhong Li, Ren Zuo, Amanda Coston, Jeremy C. Weiss, George H. Chen
International Conference on Artificial Intelligence in Medicine (AIME), August 2020
[arXiv (journal version; fixes various bugs in the conference paper version)] [code] [talk slides]
"Predicting Mortality Risk in Viral and Unspecified Pneumonia to Assist Clinicians with COVID-19 ECMO Planning"
Helen Zhou*, Cheng Cheng*, Zachary C. Lipton, George H. Chen, Jeremy C. Weiss
(* = equal contribution)
International Conference on Artificial Intelligence in Medicine (AIME), August 2020
[arXiv] [code]
(Also presented at the International Conference on Machine Learning (ICML) Workshop on Machine Learning for Global Health, July 2020)
"Deep Kernel Survival Analysis and Subject-Specific Survival Time Prediction Intervals"
George H. Chen
Machine Learning for Healthcare (MLHC), August 2020
[arXiv] [publisher's link] [code] [poster]
Note: This paper is essentially a sequel to my theory paper on nearest neighbor and kernel survival analysis (ICML 2019), which left open the problem of how to automatically learn kernel functions for survival analysis aside from using random survival forests. In my follow-up JMLR 2024 paper, I show how to scale deep kernel survival analysis up to large datasets and how to establish an accuracy guarantee for a special case of the resulting estimator.

2019

"Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption"
Wei Ma*, George H. Chen*
(* = equal contribution)
Neural Information Processing Systems (NeurIPS), December 2019
[arXiv] [code] [poster] [talk slides]
Best paper (theoretical track) at the INFORMS Data Mining and Decision Analytics Workshop 2019
"Truck Traffic Monitoring with Satellite Images"
Lynn H. Kaack, George H. Chen, M. Granger Morgan
ACM Conference on Computing and Sustainable Societies (COMPASS), July 2019
[arXiv]
(Also presented at the International Conference on Machine Learning (ICML) Workshop on Climate Change, June 2019)
"Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates"
George H. Chen
International Conference on Machine Learning (ICML), June 2019
[arXiv (includes minor corrections)] [publisher's link] [code] [talk slides] [poster]
Note: I wrote two follow-up papers; see the notes for my MLHC 2020 paper and my JMLR 2024 paper.
"An Interpretable Produce Price Forecasting System for Small Farmers in India using Collaborative Filtering and Adaptive Nearest Neighbors"
Wei Ma, Kendall Nowocin, Niraj Marathe, George H. Chen
Information and Communication Technologies and Development (ICTD), January 2019
[arXiv]

2018

"Explaining the Success of Nearest Neighbor Methods in Prediction"
George H. Chen, Devavrat Shah
Foundations and Trends in Machine Learning, May 2018
[arXiv (includes some corrections)] [publisher's link]

2017

"Survival-Supervised Topic Modeling with Anchor Words: Characterizing Pancreatitis Outcomes"
George H. Chen, Jeremy C. Weiss
Neural Information Processing Systems (NeurIPS) Workshop on Machine Learning for Health (ML4H), December 2017
[arXiv (short workshop version)]
(Also presented at the Society for Medical Decision Making North American Meeting, October 2017)
"Toward Reducing Crop Spoilage and Increasing Small Farmer Profits in India: a Simultaneous Hardware and Software Solution"
George H. Chen, Kendall Nowocin, Niraj Marathe
Information and Communication Technologies and Development (ICTD), November 2017
[arXiv]

2015

"A Latent Source Model for Patch-Based Image Segmentation"
George H. Chen, Devavrat Shah, Polina Golland
Medical Image Computing and Computer-Assisted Intervention (MICCAI), October 2015
[arXiv] [paper] [poster]
Note: For a more comprehensive exposition of this paper, consider reading Chapter 5 of my Ph.D. thesis.
"Latent Source Models for Nonparametric Inference"
George H. Chen
Ph.D. thesis, MIT, May 2015
[paper]
Received the George M. Sprowls award for best Ph.D. thesis in Computer Science at MIT
"Targeting Villages for Rural Development Using Satellite Image Analysis"
Kush R. Varshney, George H. Chen, Brian Abelson, Kendall Nowocin, Vivek Sakhrani, Ling Xu, Brian L. Spatocco
Big Data, March 2015
[paper]

2014

"A Latent Source Model for Online Collaborative Filtering"
(alphabetical author ordering) Guy Bresler, George H. Chen, Devavrat Shah
Neural Information Processing Systems (NeurIPS), December 2014
[arXiv - longer version] [paper - short conference version] [poster]
Selected as a spotlight (one of 62/1678 submissions)
Note: An expanded version including intuition for how collaborative filtering relates to an MAP item recommender and derivations for the examples is in Chapter 4 of my Ph.D. thesis; the notation has also been changed to be more similar to the other two papers that went toward my thesis.

2013

"A Latent Source Model for Nonparametric Time Series Classification"
(alphabetical author ordering) George H. Chen, Stanislav Nikolov, Devavrat Shah
Neural Information Processing Systems (NeurIPS), December 2013
[arXiv - longer version] [paper - short conference version] [poster]
Note: An expanded version with a lower bound on the misclassification rate and further discussion is in Chapter 3 of my Ph.D. thesis.
"Sparse Projections of Medical Images onto Manifolds"
George H. Chen, Christian Wachinger, Polina Golland
Information Processing in Medical Imaging (IPMI), June-July 2013
[arXiv] [paper] [poster]

2012

"Deformation-Invariant Sparse Coding"
George H. Chen
Master's thesis, MIT, May 2012
[paper] [poster]

2011

"Deformation-Invariant Sparse Coding for Modeling Spatial Variability of Functional Patterns in the Brain"
George H. Chen, Evelina G. Fedorenko, Nancy G. Kanwisher, Polina Golland
Neural Information Processing Systems (NeurIPS) Workshop on Machine Learning and Interpretation in Neuroimaging, December 2011
[paper] [talk slides]

2010

"Indoor Localization and Visualization Using a Human-Operated Backpack System"
Timothy Liu, Matthew Carlberg, George Chen, Jacky Chen, John Kua, Avideh Zakhor
International Conference on Indoor Positioning and Indoor Navigation (IPIN), September 2010
[paper]
"Indoor Localization Algorithms for a Human-Operated Backpack System"
George Chen, John Kua, Stephen Shum, Nikhil Naikal, Matthew Carlberg, Avideh Zakhor
International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT), May 2010
[paper]

2009

"Classifying Urban Landscape in Aerial LIDAR Using 3D Shape Analysis"
Matthew Carlberg, Peiran Gao, George Chen, Avideh Zakhor
International Conference on Image Processing (ICIP), November 2009
[paper]
"2D Tree Detection in Large Urban Landscapes Using Aerial LIDAR Data"
George Chen, Avideh Zakhor
International Conference on Image Processing (ICIP), November 2009
[paper]
"Image Augmented Laser Scan Matching for Indoor Dead Reckoning"
Nikhil Naikal, John Kua, George Chen, Avideh Zakhor
International Conference on Intelligent Robots and Systems (IROS), October 2009
[paper]