Language Processing

CS 562/662 Natural Language Processing

This course, sometimes referred to as Computational Linguistics, covers key models and algorithms used for automatic processing of natural language text. In natural language processing (NLP) tasks, the inputs are word sequences and the outputs are linguistic annotations of those sequences. These annotations are crucial for downstream applications such as automatic speech recognition, machine translation, information extraction, and question answering. Students in this course will implement key parts of a natural language pipeline, including text normalization, part-of-speech tagging, named entity recognition, parsing, topic modeling, and sentiment analysis.

Prerequisite: CS/EE 555/655 Analyzing Sequences

CS 550/650 Dialogue

Spoken dialogue systems are already deployed to help people look up flight information, trade stocks, access email, and check traffic conditions. With continuing advances in speech technology, more information and services will become readily available through such systems.

This course teaches the fundamentals of spoken dialogue systems. Spoken dialogue systems include components for speech recognition, parsing, semantic interpretation, dialogue management, text generation, speech synthesis, and agent architecture. The course is organized around three frameworks for dialogue management: finite-state machines, form-filling, and speech-act reasoning. We examine how speech recognition, parsing, and semantic interpretation fit into each framework. We also contrast hand-crafting a dialogue manager with using machine learning, specifically reinforcement learning.

CS 506/606 Information Retrieval

The course will cover a variety of topics in the general area of information retrieval (IR). An initial series of lectures will cover the fundamentals of IR, with a particular emphasis on evaluation and on applications of modern natural language processing techniques in the field. The remainder of the course will be taught seminar-style and will consist of a review of selected recent papers from the IR literature. To provide a historical perspective, we will also include several classic papers in this review. Example topics include (but are not limited to): practical issues related to web crawling; indexing of raw text and other data, such as word lattices output from speech recognizers; and query expansion and suggestion methods. Issues in IR evaluation will be covered throughout the course. Students will be expected to participate actively in discussions of research papers and to lead the discussions in several sessions. The course will also include one or more homework assignments and a term project, which will involve implementing and evaluating IR systems.