Introduction to Natural Language Processing (NLP)

UCLA CS 162, Spring 2025

Lecture: M/W 2-3:50pm, ENGR VI MLC

Discussion: F 2-3:50pm, Fowler Museum at UCLA A139

Instructor: Saadia Gabriel
Email: skgabrie@cs.ucla.edu
Office: Eng VI 295A
Office Hours: By appointment

TA: Ashima Suvarna
Email: asuvarna31@cs.ucla.edu
Office: Eng VI 295B
Office Hours: Tuesdays 11:30-12:30pm

Course Description: Natural Language Processing (NLP) is a rapidly developing field, with recent advances in deep neural networks revolutionizing many NLP applications. This course is intended as an introduction to a wide range of NLP tasks, algorithms for solving these problems effectively (including recent advances in deep learning models), and methods for evaluating their performance. There will be a focus on statistical and neural network learning algorithms that train on (annotated) text corpora to automatically acquire the knowledge needed to perform the task. Class lectures will discuss general issues as well as present abstract algorithms. The homework will touch on both the theoretical foundations of linguistic phenomena and the implementation of the algorithms. Implemented versions of some of the algorithms will be provided to give a feel for how the systems discussed in class "really work" and to allow for extensions and experimentation as part of the course projects.

Schedule:

Date Topic Description Assignment(s)
3/31 Week 1: Intro   We will go over the syllabus, schedule, and course expectations. There will be an overview of NLP history and applications as well as challenges of pragmatic reasoning. [Slides]
4/2 Week 1: Semantics & Pragmatics How do we represent meaning? We first go over types of word representations, similarity measures and defining relations between words. [Slides]
4/4 TA-led Discussion Linear Algebra and Calculus Review, Intro to Google Cloud and Colab
4/7 Week 2: Semantics & Pragmatics We continue our discussion of word representations, cover document-level representation, and examine the role of context in defining meaning. [Slides]
4/9 Week 2: Language Models Let's talk about n-gram models. We discuss what language models are and some of the simplest ways to implement them. We also start to consider evaluation. [Slides]
4/11 TA-led Discussion Data Preparation for NLP Models, Neural Network Basics and PyTorch
4/14 Week 3: Language Models We discuss how to make our language models better using smoothing, and introduce our first NLP application (text classification). [Slides]
4/16 Week 3: Language Models We start discussing neural network based language models. [Slides]
4/18 NLP Seminar Talk Attend David Bamman's talk for extra credit.
4/21 Week 4: Language Models We discuss recurrent neural networks (RNNs). [Slides]
4/23 Week 4: Language Models We discuss transformer neural networks, which are used for many state-of-the-art language models like the one behind ChatGPT. [Slides]
4/25 TA-led Discussion Midterm Review
4/28 Week 5: Language Models We discuss large language models, including pre- and post-training. We also cover multilinguality and a second NLP application: machine translation.
4/30 Week 5: Midterm There will be an in-class exam component; details will be released a week beforehand.
5/2 TA-led Discussion Introduction to HuggingFace, Neural Network Basics and PyTorch
5/5 Week 6: Parts-of-speech and Sequence Tagging We will discuss parts of speech, hidden Markov models (HMM) and the Viterbi algorithm.
5/7 Week 6: Information Extraction and Retrieval We will start by introducing information extraction tasks like Named Entity Recognition (NER).
5/9 NLP Seminar Talk Attend Swabha Swayamdipta's talk for extra credit.
5/12 Week 7: Information Extraction and Retrieval We discuss retrieval methods, retrieval-augmented generation, and a third NLP application: question answering.
5/14 Week 7: NLP Ethics An in-depth overview of recent ethical concerns in NLP, including social biases in LLMs and AI-generated disinformation.
5/16 TA-led Discussion Midterm Solutions and Basics of Prompting LLMs
5/19 Week 8: Guest Lecture Liwei Jiang (UW) on AI Safety
5/21 Week 8: Guest Lecture Irene Chen (UC Berkeley) on LLMs in Healthcare
  • Mid-project report due at 11:59pm
5/23 TA-led Discussion Final Project Office Hours
5/26 Holiday
  • Final presentation slides due 5/27 at 11:59pm
5/28 Week 9: Final Presentations Schedule TBD
5/30 TA-led Discussion Final Review Session
6/2 Week 10: Final Presentations Schedule TBD
6/4 Week 10: Final Presentations & Conclusion Schedule TBD
  • Take-home final out
6/6 TA-led Discussion Final Review Session
  • Take-home final and final project reports due on 6/13 at 11:59pm

Resources:

Assignments and announcements will be posted on Bruin Learn. We will be using Piazza for course discussion, including online help with homework assignments.

Grading:

Detailed guidelines for assignments will be released later in the quarter.

  • Homework assignments (30%)
    • Students will individually complete 3 written and coding problem sets related to weekly topics.
  • Course Project (30%)
    • Students will form groups of 4 and build a working AI detection system that can evaluate a hidden test set of human-written and AI-generated text. The group submission should provide all code and clear instructions in order for the course staff to run the detection system on the test set. Further instructions and test set formatting details will be provided.
    • This will be graded based on a 1-page project proposal (5%), a mid-project report (5%), final in-person presentations (5%) and a final write-up (15%).
  • In-Class Midterm Exam (15%)
    • Details TBD.
  • Take-home Final Exam (20%)
    • Details TBD.
  • Participation (5%)
    • Students will be asked to submit one question at the end of every lecture (starting in week 2), which will be answered later in the course. Additionally, students will be asked to provide short, constructive feedback on their peers' final presentations that can aid in editing the final project write-ups.

Course Policies:

Late Policy. Students will have 5 late days that can be used on any homework assignment (or across multiple homework assignments) without penalty. Since the final project is a group assignment, there are no late days for it, but extensions will be considered under extraordinary circumstances. Students are expected to communicate potential presentation conflicts (e.g., illness, conference travel) to the instructor in advance.

Academic Honesty. Homework assignments are expected to be completed individually and the instructor will check for similarity in responses. For all assignments, any collaborators or other sources of help should be explicitly acknowledged. Violations of academic integrity (please consult the student conduct code) will be handled based on UCLA guidelines.

Accommodations. Our goal is to have a fair and welcoming learning environment. Students should contact the instructor at the beginning of the quarter if they will need special accommodations or have any concerns.

Use of ChatGPT and Other LLM Tools. Students are expected to first draft writing without any LLMs, and all ideas presented must be their own. Students may use LLMs for grammar correction and minimal editing if they add an acknowledgement of this use. Any work suspected to be entirely AI-generated will be given a grade of 0.

Acknowledgements: The syllabus, assignments and lecture slides were adapted from Nanyun (Violet) Peng's course materials.