Saadia Gabriel

Hi! I'm Saadia Gabriel (she/her).

I am an Assistant Professor of Computer Science at the UCLA Samueli School of Engineering. I'm a co-director of UCLA NLP and affiliated with the Bunche Center for African-American Studies. Previously, I was an NYU Faculty Fellow and MIT CSAIL Postdoctoral Fellow working with the wonderful Prof. Marzyeh Ghassemi. I received my PhD from the Paul G. Allen School of Computer Science & Engineering at the University of Washington, where I was very fortunate to be advised by Prof. Yejin Choi and Prof. Franziska Roesner. My work focuses on measuring factuality, intent and potential harm of human-written language. At UCLA, I run the Misinformation, AI and Responsible Society (MARS) Lab. You can find out more about my research agenda here. I serve as an area chair for *ACL, ICLR, COLM and ICML. Sadly, I get too many emails to respond to them individually, but you can fill out this form if you are an undergrad or master's student at UCLA interested in collaborating with us.

New:

Early 2026: Invited talk at UW NLP seminar.
June 2026: Invited speaker at 7th Annual Conference on Health, Inference, and Learning (CHIL 2026).
March 2026: Invited talk at UC Riverside.
January 2026: Invited talk at UToronto SRI seminar.
December 2025: Salman's work on AI debate for factuality assessment will appear at NeurIPS 2025 ✨.
November 2025: Ashima's work on ModelCitizens and Genglin's work on MOSAIC have been accepted to the EMNLP main conference!
October 2025: Excited to have our work funded by a second grant from the UCLA Initiative to Study Hate!
October 2025: I'll be speaking at LA Tech Week.
October 2025: Check out our COLM 2025 paper on X-teaming! I also have invited talks at the workshops on Social Simulations with LLMs and Visions of Language Modeling.
June 2025: Excited to have my research group's work supported by 2025 Google Research Scholar and ML & Systems Junior Faculty awards.
June 2025: Invited speaker at RSS 2025 workshop on Reliable Robotics: Safety and Security in the Face of GenAI.
April 2025: Lots of UCLA events: invited talk at the UCLA DataX+IDRE AI Symposium, UCLA Law panel on Ethics of Nonprofits and AI, and UCLA Women & Philanthropy panel on Hate and Social Media.
April 2025: Invited talk at Conference for Emerging Black Academics in STEM (Caltech).
February 2025: Invited talk at Stanford NLP Seminar.
February 2025: Invited talk at IASEAI in Paris.
November 2023: Honored to be named to the Forbes 30 Under 30 list in Science.
November 2023: Invited talk at NYU CDS Seminar.
November 2023: Guest lecture in NLP at MIT.
November 2023: Invited talk at Northeastern.
November 2023: Presenting at NYU-KAIST Inclusive AI Workshop.
October 2023: Invited talk at Mount Holyoke College.
October 2023: Guest lecture on AI Ethics at Oakton College.
September 2023: Co-teaching my first class as a professor (NYU Data Science Capstone).
September 2023: Thank you to MIT (Generative AI Impact Award) and Cohere for $61,000 of grant support over the summer. I look forward to discussing the funded projects!
August 2023: New paper on LLMs for mental health prediction.
August 2023: New paper and dataset (Socratis) exploring capabilities of multimodal models for understanding emotional reactions to images.
June 2023: Panelist at CHIL 2023 on LLMs for healthcare.
June 2023: Talk at Spotify NYC.
April 2023: Invited talks at UCLA, MIT and Princeton.
March 2023: Guest lectures at the University of Washington (Undergraduate NLP, CSE 447) and Carnegie Mellon University (Computational Ethics, CS 11-830).
March 2023: Invited talks at the University of Chicago, Northeastern and Cornell.
February 2023: Invited talks at the University of Pittsburgh, University of Michigan, UMass Amherst, Boston University and Johns Hopkins.
January 2023: Invited talks at Heriot-Watt and Emory.
October 2022: New paper on testing robustness of NLI and hate speech classifiers with generated adversaries accepted to EMNLP Findings!
August 2022: Guest lecture in UW Intro to Machine Learning course (CSE 416).
July 2022: Named an outstanding reviewer for NAACL 2022.
July 2022: Socio-Cultural Inclusion co-chair for NAACL 2022.
May 2022: Our team's proposal to investigate misinformation and social biases will be part of a new TACC high-performance computing program initiative.
April 2022: Invited talk at Cornell JEDI dialogues seminar.
February 2022: Two papers accepted to ACL 2022 main conference!
February 2022: DARPA SemaFor keynote talk on Misinfo Reaction Frames.
December 2021: Invited talk at Stanford NLP seminar.
October 2021: Presenting at MIT EECS Rising Stars Workshop.
July 2021: Co-organizing Safety for E2E Conversational AI at SIGDIAL 2021.
May 2021: Work on evaluating effectiveness of factuality metrics for summarization (GO FIGURE) accepted to ACL 2021 Findings!
April 2021: New preprint on defending against misinformation.
January 2021: Invited talk at UMass Amherst Rising Stars Seminar.
December 2020: Paragraph-level Commonsense Transformers accepted to AAAI 2021. Presenting at NeurIPS 2020 Resistance AI Workshop.
October 2020: Presented on Social and Power Implications of Language at UW colloquium.
September 2020: Presented on summarization with cooperative generator-discriminator networks and detection of implicit social biases in text at BBN Technologies.
July 2020: Presented as part of Voice Tech Global panel on implicit bias towards the Black community and conversational AI.

Preprints

The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP
Issaka et al.
Arxiv 2025.
[Preprint]

OpenThoughts: Data Recipes for Reasoning Models
Guha et al.
Arxiv 2025.
[Preprint]

Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English
Runtao Zhou, Guangya Wan, Saadia Gabriel, Sheng Li, Alexander J Gates, Maarten Sap, Thomas Hartvigsen.
Arxiv 2025.
[Preprint]

Generalization in Healthcare AI: Evaluation of a Clinical Large Language Model
Salman Rahman, Lavender Yao Jiang, Saadia Gabriel, Yindalon Aphinyanaphongs, Eric Karl Oermann, Rumi Chunara.
Arxiv 2024.
[Preprint]

Journal Papers

Investigating machine moral judgement through the Delphi experiment
Liwei Jiang, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jenny Liang, Sydney Levine, Jesse Dodge, Keisuke Sakaguchi, Maxwell Forbes, Jack Hessel, Jon Borchardt, Taylor Sorensen, Saadia Gabriel, Yulia Tsvetkov, Oren Etzioni, Maarten Sap, Regina Rini, Yejin Choi.
Nature Machine Intelligence 2025.
[Paper]

Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data
Xuhai Xu, Bingsheng Yao, Yuanzhe Dong, Saadia Gabriel, Hong Yu, James Hendler, Marzyeh Ghassemi, Anind K. Dey, Dakuo Wang.
IMWUT 2024.
[Preprint]

Conference Papers

AI Debate Aids Assessment of Controversial Claims
Salman Rahman, Sheriff Issaka, Ashima Suvarna, Genglin Liu, James Shiffer, Jaeyoung Lee, Md Rizwan Parvez, Hamid Palangi, Shi Feng, Nanyun Peng, Yejin Choi, Julian Michael, Liwei Jiang, Saadia Gabriel.
NeurIPS 2025.
[Preprint]

ModelCitizens: Representing Community Voices in Online Safety
Ashima Suvarna, Christina Chance, Karolina Naranjo, Hamid Palangi, Sophie Hao, Thomas Hartvigsen, Saadia Gabriel.
EMNLP 2025.
[Preprint]

MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations
Genglin Liu, Vivian Le, Salman Rahman, Elisa Kreiss, Marzyeh Ghassemi, Saadia Gabriel.
EMNLP 2025.
[Preprint]. To be presented at the ICML 2025 MAS workshop.

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
Salman Rahman, Liwei Jiang, James Shiffer, Genglin Liu, Sheriff Issaka, Md Rizwan Parvez, Hamid Palangi, Kai-Wei Chang, Yejin Choi, Saadia Gabriel.
COLM 2025.
[Preprint]. To be presented at the ICML 2025 MAS workshop.

Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures
Akhila Yerukola, Saadia Gabriel, Nanyun Peng, Maarten Sap.
ACL 2025.
[Preprint]

MisinfoEval: Generative AI in the Era of "Alternative Facts"
Saadia Gabriel, Liang Lyu, James Siderius, Marzyeh Ghassemi, Jacob Andreas, Asu Ozdaglar.
EMNLP 2024.
[Paper][Talk][Environment/Code/Data]. MIT Generative AI Impact Award.
Press: MIT News

Can AI Relate: Testing Large Language Model Response for Mental Health Support
Saadia Gabriel, Isha Puri, Xuhai Xu, Matteo Malgaroli, Marzyeh Ghassemi.
EMNLP 2024 Findings.
[Paper][Talk][Code/Data]. An earlier version appeared in this working paper.
Press: MIT News

How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models
Jaeyoung Lee, Ximing Lu, Jack Hessel, Faeze Brahman, Youngjae Yu, Yonatan Bisk, Yejin Choi, Saadia Gabriel.
EMNLP 2024 Findings.
[Preprint]

NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries?
Saadia Gabriel, Hamid Palangi, Yejin Choi.
EMNLP 2022 Findings.
[Paper]

Misinfo Reaction Frames: Reasoning about Readers’ Reactions to News Headlines
Saadia Gabriel, Skyler Hallinan, Maarten Sap, Pemi Nguyen, Franziska Roesner, Eunsol Choi, Yejin Choi.
ACL 2022.
[Paper][Data/Models]

ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection
Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, Ece Kamar.
ACL 2022.
[Paper][Data/Models]
Press: TechCrunch, Microsoft Research Blog

GO FIGURE: A Meta Evaluation of Factuality in Summarization
Saadia Gabriel, Asli Celikyilmaz, Rahul Jha, Yejin Choi, Jianfeng Gao.
ACL 2021 Findings.
[Paper]

Discourse Understanding and Factual Consistency in Abstractive Summarization
Saadia Gabriel, Antoine Bosselut, Jeff Da, Ari Holtzman, Jan Buys, Kyle Lo, Asli Celikyilmaz, Yejin Choi.
EACL 2021.
[Paper]

Paragraph-level Commonsense Transformers with Recurrent Memory
Saadia Gabriel, Chandra Bhagavatula, Vered Shwartz, Ronan Le Bras, Maxwell Forbes, Yejin Choi.
AAAI 2021.
[Paper] [Project Page]

Social Bias Frames: Reasoning about Social and Power Implications of Language
Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi.
ACL 2020.
Also presented at West Coast NLP Summit (WeCNLP) 2020 and awarded Best Paper.
[Paper] [Data]

Detecting and Tracking Communal Bird Roosts in Weather Radar Data
Zezhou Cheng, Saadia Gabriel, Pankaj Bhambhani, Daniel Sheldon, Subhransu Maji, Andrew Laughlin, David Winkler.
AAAI 2020.
[Paper]

The Risk of Racial Bias in Hate Speech Detection
Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, Noah A. Smith.
ACL 2019. Best Paper Nominee.
[Paper]
Press: Forbes, Vox, Observer, Fortune, TechCrunch

MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms
Aida Amini, Saadia Gabriel, Shanchuan Lin, Rik Koncel-Kedziorski, Yejin Choi, Hannaneh Hajishirzi.
NAACL 2019.
[Paper] [Data]

Early Fusion for Goal Directed Robotic Vision
Aaron Walsman, Yonatan Bisk, Saadia Gabriel, Dipendra Misra, Yoav Artzi, Yejin Choi, Dieter Fox.
IROS 2019. Best Paper Nominee.
[Paper]

Workshop Papers

Socratis: Are large multimodal models emotionally aware?
Katherine Deng, Arijit Ray, Reuben Tan, Saadia Gabriel, Bryan Plummer, Kate Saenko.
ICCV 2023 WECEIA.
[Paper]