Language Processing

CS 562/662 Natural Language Processing

This course, sometimes referred to as Computational Linguistics, covers key models and algorithms for the automatic processing of natural language text. In natural language processing (NLP) tasks, the inputs are word sequences and the outputs are linguistic annotations of those sequences. These annotations are crucial for downstream applications such as automatic speech recognition, machine translation, information extraction, and question answering. Students in this course will implement key parts of a natural language pipeline, including text normalization, part-of-speech tagging, named entity recognition, parsing, topic modeling, and sentiment analysis.

Prerequisite: CS/EE 555/655 Analyzing Sequences


CS 550/650 Dialogue

Spoken dialogue systems are already being deployed to help people look up flight information, trade stocks, access email, and check traffic conditions. With continuing advances in speech technology, more information and services will become readily available through such systems.

This course teaches the fundamentals of spoken dialogue systems. Spoken dialogue systems include components for speech recognition, parsing, semantic interpretation, dialogue management, text generation, speech synthesis, and agent architecture. The course is organized around three frameworks for dialogue management: finite-state machines, form-filling, and speech-act reasoning. We examine how speech recognition, parsing, and semantic interpretation fit into each framework. We will also contrast hand-crafting a dialogue manager with learning one automatically, specifically through reinforcement learning.


CS 635 Information Retrieval

Information Retrieval Methodologies & Applications covers the computational and methodological aspects of search over both textual and multimedia data. The course format combines lectures with student-led presentations of primary source articles from the scientific literature on the subject (both classic and modern papers will be featured). The course will involve substantial reading (2-4 papers or book chapters per week) and will feature regular assignments that give students hands-on experience building information retrieval systems and working with standard off-the-shelf software tools. The course will also include a final project in which students will be expected to solve an IR problem of their choice (e.g., implementing an algorithm from a paper) and evaluate their solution.

In addition to computational techniques, the course will include in-depth discussion of evaluation methodologies, covering both classical frequentist approaches and recent Bayesian approaches to IR evaluation. The course will also feature occasional guest lectures from professionals working in IR.