Natural Language Processing

CS 635 Information Retrieval

This course on information retrieval (IR) methodologies and applications covers the computational and methodological aspects of search over both textual and multimedia data. The course combines lectures with student-led presentations of primary source articles from the scientific literature, featuring both classic and modern papers. It involves substantial reading (2-4 papers or book chapters per week) and regular assignments that give students hands-on experience building information retrieval systems and working with standard off-the-shelf software tools. The course also includes a final project in which students solve an IR problem of their choice (e.g., implementing an algorithm from a paper) and evaluate their solution. In addition to computational techniques, the course includes in-depth discussion of evaluation methodologies, covering both classical frequentist approaches and recent Bayesian approaches to IR evaluation. The course also features occasional guest lectures from professionals working in IR.

CS 560/660 Dialogue

Spoken dialogue systems are already being deployed to help people find flight information, trade stocks, access email, and check traffic conditions. With continuing advances in speech technology, more information and services will become readily available. This course teaches the fundamentals of spoken dialogue systems. Spoken dialogue systems include components for speech recognition, parsing, semantic interpretation, dialogue management, text generation, speech synthesis, and agent architecture. The course is organized around three frameworks for dialogue management: finite-state machines, form-filling, and speech-act reasoning. We examine how speech recognition, parsing, and semantic interpretation fit into each framework. We will also contrast hand-crafting a dialogue manager with learning one automatically, specifically through reinforcement learning.
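As a rough illustration of the first framework, finite-state dialogue management, the sketch below walks a fixed graph of states and collects one piece of information per state. The state names, prompts, and the `run_dialogue` helper are hypothetical and are not taken from the course materials; the user turns are assumed to be already-recognized text rather than speech.

```python
# Hypothetical finite-state dialogue manager for a flight-information task.
# State names, prompts, and transitions are illustrative assumptions.
STATES = {
    "ask_origin": {
        "prompt": "Where are you flying from?",
        "slot": "origin",
        "next": "ask_destination",
    },
    "ask_destination": {
        "prompt": "Where are you flying to?",
        "slot": "destination",
        "next": "confirm",
    },
    "confirm": {
        "prompt": "Shall I look up flights?",
        "slot": "confirmed",
        "next": "done",
    },
}


def run_dialogue(user_turns):
    """Walk the fixed state graph, consuming one user turn per state."""
    state = "ask_origin"
    slots = {}
    transcript = []
    turns = iter(user_turns)
    while state != "done":
        node = STATES[state]
        transcript.append(("system", node["prompt"]))
        reply = next(turns, None)
        if reply is None:
            # The user stopped responding; end the dialogue early.
            break
        transcript.append(("user", reply))
        slots[node["slot"]] = reply.strip()
        state = node["next"]
    return slots, transcript


slots, transcript = run_dialogue(["Portland", "Boston", "yes"])
```

In this sketch the system always asks its questions in the same order, which is the defining limitation of the finite-state approach; a form-filling manager would instead accept the slots in whatever order the user supplies them.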

CS 656 Introduction to Linguistics and Communication Disorders

The course will cover the structural and interpretative components of language (e.g., morphology, phonetics) as well as language use in context (e.g., pragmatics, sociolinguistics). Linguistics will be considered from theoretical, developmental, and clinical perspectives, and course topics will include related areas of study such as computational linguistics. Bilingual and multicultural issues will also be addressed in terms of linguistic variation and change.

This course will provide a background in linguistics and language development for students in the CSEE program. Because the CSEE program is associated with the Center for Spoken Language Understanding, many students conduct research involving language use, speech processing, and deviations from typical development. An understanding of linguistics is an important foundational skill for these students.

CS 562/662 Natural Language Processing

This course, which is sometimes referred to as computational linguistics, covers key models and algorithms used for the automatic processing of natural language text. In natural language processing (NLP) tasks, the inputs are word sequences and the outputs are linguistic annotations of those sequences. These annotations are crucial for downstream applications such as automatic speech recognition, machine translation, information extraction, and question answering. Students in this course will implement key parts of a natural language pipeline, including text normalization, part-of-speech tagging, named entity recognition, parsing, topic modeling, and sentiment analysis.
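To make the "word sequence in, annotations out" framing concrete, here is a minimal sketch of two pipeline stages, text normalization followed by part-of-speech tagging. The tiny `LEXICON` and the `normalize` and `tag` helpers are assumptions made for illustration; a tagger built in a course like this would more likely be a statistical model trained on annotated data than a lookup table.

```python
import re

# Toy lexicon mapping words to part-of-speech tags.
# This is an illustrative assumption, not a real tagging resource.
LEXICON = {"the": "DET", "dog": "NOUN", "barks": "VERB", "loudly": "ADV"}


def normalize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tag(tokens):
    """Pair each token with its lexicon tag, or 'UNK' if the word is unseen."""
    return [(token, LEXICON.get(token, "UNK")) for token in tokens]


# Input: a raw sentence. Output: the word sequence with one annotation per word.
annotated = tag(normalize("The dog barks loudly!"))
```

Each later stage in a pipeline of this kind (named entity recognition, parsing, and so on) would consume the annotated sequence and add its own layer of labels.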

Prerequisite: CS/EE 555/655 Analyzing Sequences