About the Course

See the syllabus for details. This topics course is designed for graduate students who are interested in the emerging area of AI security and trustworthy machine learning. Modern machine learning systems are increasingly deployed in high-stakes applications, making it essential to understand their vulnerabilities, limitations, and the principles needed to ensure reliability. The goal of this course is to help students develop a solid conceptual and practical foundation in the security and trustworthiness aspects of machine learning by studying core threat models, analyzing state-of-the-art research papers, and working on research-oriented projects.

Course Material

| Date | Topic | Paper Link | Slides |
| --- | --- | --- | --- |
| Jan. 08 | Logistics and Overview | | S1 S2 |
| Jan. 13 | MLP and CNN | | S1 S2 |
| Jan. 15 | RNN | | S1 S2 |
| Jan. 20 | Transformer | | S1 S2 |
| Jan. 22 | Travel (Class Canceled) | | |
| Jan. 27 | Generative models | P1 P2 | S1 S2 |
| Jan. 29 | LLM | | S1 |
| Feb. 03 | Backdoor attack | P1 P2 | |
| Feb. 05 | Backdoor defense | P1 P2 | |
| Feb. 10 | Federated learning | P1 P2 | |
| Feb. 12 | Adversarial attack | P1 P2 | |
| Feb. 17 | Project Proposal Discussion | | |
| Feb. 19 | Project Proposal Discussion | | |
| Feb. 24 | Adversarial defense | P1 P2 | |
| Feb. 26 | Privacy attacks | P1 P2 | |
| Mar. 03 | Explainable AI | P1 P2 | |
| Mar. 05 | Fairness | P1 P2 | |
| Mar. 10 | Watermark LLM | P1 P2 | |
| Mar. 12 | Jailbreak attack | P1 P2 | |
| Mar. 17 | Spring Break | | |
| Mar. 19 | Spring Break | | |
| Mar. 24 | Midpoint Project Review | | |
| Mar. 26 | Midpoint Project Review | | |
| Mar. 31 | Jailbreak defense | P1 P2 | |
| Apr. 02 | Well-being Day | | |
| Apr. 07 | Backdoor in LLM | P1 P2 | |
| Apr. 09 | Backdoor defense in LLM | P1 P2 | |
| Apr. 14 | Hallucination | P1 P2 | |
| Apr. 16 | Safety alignment | P1 P2 | |
| Apr. 21 | Final Presentation | | |
| Apr. 23 | Final Presentation | | |

Class Participation

Use the participation record to report class participation.

Paper Presentation

Each student will present one or two research papers during the semester, depending on enrollment. You can use the paper list to sign up for your preferred papers.

Slides must be uploaded by 11:59 PM the night before your presentation. Late submissions will incur penalties.

Final Project Details

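This course includes a final project in lieu of a final exam. Projects may be completed individually or in groups of two. Groups of more than two are not permitted. The final project consists of three parts, each described below: a project proposal (P1), a project presentation (P2), and a project paper (P3).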

Group List

Please form your final project group before January 21 and sign up using the shared spreadsheet. Please do not modify other groups' entries.

Three Parts Including Point Values

I will meet with each student or group to discuss potential project topics. Suitable topics include, but are not limited to:

  • Conducting a careful empirical study comparing state-of-the-art methods;
  • Reproducing an influential research paper and analyzing its limitations;
  • Developing a small methodological or algorithmic extension;
  • Writing a structured survey of a focused sub-area in trustworthy machine learning.

P1: Project Proposal (10 Points): The project proposal is limited to 2 pages (excluding references) and contains:

  • The problem you aim to address;
  • A brief review of related work;
  • The method(s) you plan to use or compare;
  • Evaluation metrics and expected outcomes;
  • References.

See the LaTeX template at link.

P2: Project Presentation (40 Points): Presentations will take place during the final two to three lectures of the semester. Each student or group will give a short presentation (length to be announced) summarizing the problem, approach, results, and conclusions. Attendance is required for all presentations.

P3: Project Paper (50 Points): Students must submit a written final report in PDF format. The report must use the NeurIPS LaTeX style files and should be no more than 8 pages excluding references (there is no minimum length requirement). The report may include a discussion of possible future extensions.

Due Dates of Individual Parts

| Part | Description | Location | Due Date (Time) |
| --- | --- | --- | --- |
| P1 | Project Proposal | Canvas | Feb. 15 (11:59 PM) |
| | Proposal Meeting | Hanes 334 | Feb. 17 / Feb. 19 (Lecture Time) |
| P2 | Presentation Slides | Canvas | Apr. 20 / Apr. 22 (11:59 PM) |
| | Final Presentation | Class | Apr. 21 / Apr. 23 (Lecture Time) |
| P3 | Final Report | Canvas | Apr. 30 (11:59 PM) |

This page was last updated on 2026-01-13 11:00:18.60922 Eastern Time.