Computational Linguistics (CL) is the study of natural language from a computational perspective.
This is a diverse field with many subdivisions.
Stony Brook has a separate computational linguistics department.
CL means different things to different people. We will focus mainly on advanced topics in Natural Language Processing:
Syllabus
- Syntactic Parsing -- Joint models, semi-supervised approaches, neural models.
- Semantic Parsing -- CCG, large-scale models, online learning methods.
- Semantic Role Labeling -- Graph-based, unsupervised induction, semi-supervised methods.
- Relation & Event Extraction -- Latent variable, neural models, matrix factorization.
- Lexical Semantics -- Distributional methods, deep learning, topic models.
- Reasoning w/ Textual Knowledge -- Probabilistic Soft Logic and Markov Logic Networks.
We'll likely add a few more depending on interest.
Course Structure
[This is tentative. Likely to change slightly.]
- Paper Summary Reports (20%) -- Best 10 scores on summaries of papers we read.
- Paper Presentation (5%) -- Your in-class presentation grade.
- Mid-term (15%) -- In-class exam.
- Assignments (10%) -- Two programming assignments.
- Final Project (40%) -- Research project extending a state-of-the-art system.
- Final Exam (10%) -- Take-home. To be turned in within 24 hours.
Requirements FAQ
- Do I need to know Machine Learning?
Yes. Much of the CL covered in this class relies on statistical and machine learning-based methods.
You are expected to have taken a grad-level ML course.
Among other things, you should at least be familiar with standard ML techniques such as
Logistic Regression, Expectation Maximization, Hidden Markov Models, Conditional Random Fields,
and Neural Networks. We will NOT delve into introductions of these methods in class.
- Do I need to know Natural Language Processing?
Yes. This is an advanced topics class.
You are expected to have taken a grad-level NLP course.
The material assumes that you have already had a basic introduction to the standard techniques
in NLP pipelines such as Part-of-Speech tagging, syntactic parsing, and information extraction.
Texts
None required. The following books contain useful reference material:
- Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition,
D. Jurafsky & J.H. Martin,
Prentice Hall, Second Edition, 2009.
- Foundations of Statistical Natural Language Processing,
C.D. Manning & H. Schuetze,
Cambridge: MIT Press, 1999.