Linguistic, mathematical, and computational fundamentals of natural language processing (NLP).
Topics include part-of-speech tagging, hidden Markov models, syntax and parsing, lexical semantics, compositional semantics, machine translation, text classification, and discourse and dialogue processing. Additional topics include sentiment analysis, text generation, and deep learning for NLP.
Introduction to Natural Language Processing. Jacob Eisenstein. First Edition, October 2019. MIT Press. ISBN: 9780262042840. https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf
Speech and Language Processing. Daniel Jurafsky and James Martin. Third Edition, 2019. Prentice Hall. https://web.stanford.edu/~jurafsky/slp3/
Approximately 50 pages of the textbooks
Lecture
(CPSC 202 and CPSC 223) or permission of the instructor. All programming assignments are in Python.
Class logistics, Why is NLP hard, Methods used in NLP, Mathematical and probabilistic background, Linguistic background, Python libraries for NLP, NLP resources, Word distributions, NLP tasks, Preprocessing
Language Modeling, Noisy Channel, Hidden Markov Models, The Viterbi Algorithm, Statistical Part of Speech Tagging, Syntax and Parsing, Context-Free Grammars, CKY Parsing, the Penn Treebank, Parsing Evaluation, Dependency Syntax, Dependency Parsing, Features and Unification, Tree-Adjoining Grammars, Combinatory Categorial Grammars, Noun sequence parsing
Text Similarity, Stemming, WordNet, Word Similarity, Vector Semantics, Dimensionality Reduction, Representing Meaning, First Order Logic, Inference, Semantic Parsing, Abstract Meaning Representation, Sentiment Analysis
Question Answering, Text Summarization, Text Generation, Discourse Analysis, Dialogue Systems, Machine Translation, Syntax-based Machine Translation
Text Classification, Vector Classification, Linear Models, Text clustering
Perceptron, Word Embeddings, word2vec, Deep Neural Networks, Sentence Representations, Neural approaches (question answering, parsing, machine translation, summarization, etc.), Transformers, BERT
Unless otherwise specified in an assignment, all submitted work must be your own original work. Any excerpts, statements, or phrases from the work of others must be clearly identified as quotations and properly cited. Any violation of the University's policies on Academic and Professional Integrity may result in serious penalties, ranging from failing an assignment, to failing the course, to expulsion from the program.
Violations of academic and professional integrity will be reported to Student Affairs. Consequences impacting assignment or course grades are determined by the faculty instructor; additional sanctions may be imposed.