Natural Language Processing
Computer Science 333
- Instructor: John Donaldson
- Office: King 223A
- Office hours: MWF 1:30-3 pm (or by appointment)
- Meeting time: MWF 3:30-4:20 pm, King 123
- Prerequisite: CSCI 151 (Data Structures)
- Textbook: Jurafsky and Martin, Speech and Language Processing,
second edition, Prentice Hall, 2009. ISBN 978-0131873216
- To learn the basic concepts and theory of natural language
processing; i.e., processing human language with a computer.
- To learn about the areas in which natural language processing
is being applied to problems today.
- To learn to use tools and develop software systems to process
natural language.
Your grade will be based on homework, a project, and two exams.
Point breakdown (tentative):
- Midterm Exam (October 10)
- Final Exam (December 17, 9 am)
Late labs are strongly discouraged. You may hand in up to two labs
one day late without penalty. Be sure to submit early!
Otherwise, labs that are up to 24 hours late will be penalized by 50%.
Labs that are more than 24 hours late will not be graded.
Problem sets are due at the beginning of lecture. Late
problem sets are not accepted.
If for some reason (such as a severe illness) you will not be
able to complete a lab or take a test, talk to me immediately, and
prior to the deadline. I will handle these situations on a
case-by-case basis.
Tutors provided by Oberlin College are available. If interested,
see Kay Knight in Peters 114.
If you have a disability that might impact your performance in
this course, or requires special accommodation, please contact me
as soon as possible so that appropriate arrangements can be
made. Support is available through Student Academic
Services, specifically Jane Boomer. You will need to contact them
to get your disability documented before accommodations can be
made.
All work in this course is to be performed in accordance with the
Honor System. You must write the Honor Pledge and sign at the
end of each and every submission. Electronic submissions must
include the Honor Pledge and your name in the comments. The
pledge is "I have adhered to the Honor Code in this assignment."
That being said, in a hands-on course such as this one, some
discussion of lab assignments is expected and encouraged. A
few specific dos and don'ts:
Do:
- ask questions about the requirements of an assignment
- discuss with your classmates general approaches to solving a
problem prior to starting your own design and coding
- get/give help from/to another student in solving a
particularly tough debugging problem
In the end, the work you submit must be your own. If you're
not sure what is acceptable in a given situation, please ask me.
Don't:
- obtain a copy of another student's code (including a student
who has taken the course before)
- give a copy of your code to another student
- collaborate with a partner or group to work on an assignment
- discuss an exam in any way with another student who may be
taking the exam at another time
- Regular expressions and finite state automata.
- Language models. N-grams.
- Part of speech tagging.
- Hidden Markov models.
- Context-free grammars for English.
- Parsing algorithms. The CKY algorithm and the Earley
algorithm.
- Statistical parsing. Probabilistic context-free grammars.
- Features and unification.
- Computational semantics with first-order logic.
- Word sense disambiguation.
- Computational discourse.
- Information extraction.
- Machine translation.