CS 562/662 Natural Language Processing
This course, sometimes referred to as Computational Linguistics, covers key models and algorithms used for automatic processing of natural language text. In natural language processing (NLP) tasks, the inputs are word sequences and the outputs are linguistic annotations of those sequences. These annotations are crucial for downstream applications such as automatic speech recognition, machine translation, information extraction, and question answering. Students in this course will implement key parts of a natural language pipeline, including text normalization, part-of-speech tagging, named entity recognition, parsing, topic modeling, and sentiment analysis.
Prerequisite: CS/EE 555/655 Analyzing Sequences
This course teaches the fundamentals of spoken dialogue systems. Spoken dialogue systems include components for speech recognition, parsing, semantic interpretation, dialogue management, text generation, speech synthesis, and agent architecture. The course is organized around three frameworks for dialogue management: finite-state machines, form-filling, and speech-act reasoning. We examine how speech recognition, parsing, and semantic interpretation fit into each framework. We will also contrast hand-crafting a dialogue manager with using machine learning, specifically reinforcement learning.
This course covers a variety of topics in the general area of information retrieval (IR). An initial series of lectures will cover the fundamentals of IR, with particular emphasis on evaluation and on applications of modern natural language processing techniques in the field. The remainder of the course will be taught seminar-style and will consist of a review of selected recent papers from the IR literature. To provide a historical perspective, this review will also include several classic papers. Example topics include (but are not limited to): practical issues related to web crawling; indexing of raw text and other data, such as word lattices output by speech recognizers; and query expansion and suggestion methods. Issues in IR evaluation will be covered throughout the course. Students will be expected to participate actively in discussions of research papers and to lead the discussion in several sessions. The course will also include one or more homework assignments and a term project, which will involve implementing and evaluating IR systems.