Linguistic, mathematical, and computational fundamentals of natural language processing (NLP).
Topics include part-of-speech tagging, hidden Markov models, syntax and parsing, lexical semantics, compositional semantics, machine translation, text classification, and discourse and dialogue processing. Additional topics include sentiment analysis, text generation, and deep learning for NLP.
Speech and Language Processing, Daniel Jurafsky and James Martin, Second Edition, 2009, Prentice Hall. ISBN-13: 978-0131873216, ISBN-10: 0131873210. https://web.stanford.edu/~jurafsky/slp3/
30-50 pages of the textbook
CPSC 202 and CPSC 223, or permission of the instructor. All programming assignments are in Python.
Class logistics, Why is NLP hard, Methods used in NLP, Mathematical and probabilistic background, Linguistic background, Python libraries for NLP, NLP resources, Word distributions, NLP tasks, Preprocessing
Language Modeling, Noisy Channel, Hidden Markov Models, The Viterbi Algorithm, Statistical Part-of-Speech Tagging, Brown Clustering, Information Extraction, Syntax and Parsing, Context-Free Grammars, CKY Parsing, The Penn Treebank, Parsing Evaluation, Lexicalized Parsing, Dependency Syntax, Dependency Parsing, Features and Unification, Mildly Context-Sensitive Grammars, Tree-Adjoining Grammars, Combinatory Categorial Grammars, Noun Sequence Parsing, Prepositional Phrase Attachment
Text Similarity, Stemming, WordNet, Word Similarity, Vector Semantics, Dimensionality Reduction, Text Kernels, Lexical Acquisition, Representing Meaning, First Order Logic, Inference, Semantic Parsing, Abstract Meaning Representation, Sentiment Analysis, Word Sense Disambiguation
Question Answering, Text Summarization, Text Generation, Discourse Analysis, Dialogue Systems, Machine Translation, Noisy Channel Methods, Syntax-based Machine Translation
Text Classification, Vector Classification, Linear Models, Perceptron, Support Vector Machines, Kernel Methods, Feature Selection, Text Clustering, Word Embeddings, word2vec, Deep Neural Networks, Sentence Representations
Unless otherwise specified in an assignment, all submitted work must be your own original work. Any excerpts, statements, or phrases from the work of others must be clearly identified as quotations and properly cited. Any violation of the University's policies on Academic and Professional Integrity may result in serious penalties, ranging from failing an assignment, to failing the course, to expulsion from the program.
Violations of academic and professional integrity will be reported to Student Affairs. Consequences impacting assignment or course grades are determined by the faculty instructor; additional sanctions may be imposed.