Difference between revisions of "Winter 2018 CS291A Deep Learning for NLP"

Latest revision as of 18:31, 27 February 2018

Instructor: William Wang
Time: T R 1:00pm - 2:50pm
Location: PHELPS 2510
Reader: Ke Ni, ke00@ucsb.edu
Instructor Office Hours: Tu 3-4pm HFH 1115 starting 01/23.
Prerequisites:
- Machine Learning (CS 165B) or equivalent
- Good programming skills and knowledge of data structures (e.g., CS 130A)
- Solid background in machine learning, linear algebra, probability, and calculus.
- Comfortable with deep learning platforms such as TensorFlow, Torch, Theano, MXNet, Caffe, etc.
- Prior experience with AWS / Google Cloud is not required, but could be very useful.

In-class Presentation

01/25 Word embeddings
01/30 Neural network basics (Project proposal due to Grader: Ke Ni <ke00@ucsb.edu>, HW1 out)
02/01 Recursive Neural Networks
02/06 RNNs
02/08 LSTMs/GRUs
02/13 Sequence-to-sequence models and neural machine translation (HW1 due and HW2 out)
02/15 Attention mechanisms
02/20 Convolutional Neural Networks (Mid-term report due to Grader: Ke Ni <ke00@ucsb.edu>)
02/22 Language and vision
02/27 Deep Reinforcement Learning 1 (HW2 due)
03/01 Deep Reinforcement Learning 2
03/06 Unsupervised Learning

Course Objective

At the end of the quarter, students should have a good understanding of basic deep learning models, and should be able to implement some fundamental algorithms for simple problems in deep learning. Students will also develop an understanding of the open research problems in deep learning, and be able to conduct cutting-edge research with novel contributions that improve existing solutions.

Piazza

http://www.piazza.com/ucsb/winter2018/cs291a

Syllabus

Winter 2018 CS291A Syllabus

Course Description

Deep learning has revolutionized many subfields within AI. DeepMind's AlphaGo combined convolutional neural networks with deep reinforcement learning and Monte Carlo tree search (MCTS), and won many games against top human Go players. In computer vision, most of the leading systems in ImageNet competitions are based on deep neural networks. Deep learning has also changed the game in NLP: for example, Google has recently replaced its phrase-based machine translation system with a neural machine translation system. Throughout the quarter, we will go over some of the basics of neural networks, and we will also go through the deep learning revolution after 2006. In this graduate class, we will also emphasize the development of students' paper reading and presentation abilities: each student will need to present research papers related to this topic. Last but not least, the most important aspect of this course is for students to work on a novel research project on open problems related to NLP and deep learning, and gain hands-on experience doing cutting-edge research.

Text Book

No textbook is required, but the following optional textbook is recommended:

Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press
HTML version of the book: http://www.deeplearningbook.org/

Project

One key aspect of this class is for students to gain hands-on experience with open research problems. To do this, each student will need to propose a research project. The teaching staff will provide feedback on the proposal and track the progress of each student. Computing resources will be provided: each team will receive a sufficient amount of Google Cloud credits for its project.

In the project proposal, each team must clearly address the following aspects of their project:

What is the motivation of the problem?
What is the exact definition of the problem? How do we formulate the problem in machine learning?
What are some existing approaches to this problem?
What are some existing datasets that you can work on?
What is the novelty in your project? New problem? New approach? New dataset?
How are you going to implement your approach and verify the idea?

Good places to look for project inspiration:

Recent papers from ACL, EMNLP, NAACL from ACL Anthology: http://aclweb.org/anthology/
Recent papers from ICML, NIPS, and ICLR conferences: http://jmlr.org/proceedings/ http://papers.nips.cc/

FAQ: Can I use my existing research project / thesis research project as the project in this class?
A: I would prefer students to get out of their comfort zone and try something new in this class. If you are using existing techniques from your existing project, it is unlikely that you will learn anything new during the course project. However, you may still draw inspiration from your research problem to formulate your class project.

Available Datasets

Wikipedia Harassment/Personal Attack Dataset https://figshare.com/projects/Wikipedia_Talk/16731 Ex Machina: Personal Attacks Seen at Scale, https://arxiv.org/abs/1610.08914
Stance Detection / Fake News Detection / Automated Fact-Checking, email William.
Deep learning for abstractive humor generation. Dataset: https://www.cs.ucsb.edu/~william/papers/meme.pdf
NELL Knowledge Graph http://rtw.ml.cmu.edu/
Relation Prediction / Reasoning FB15K-237 https://www.microsoft.com/en-us/download/details.aspx?id=52312
Abstractive summarization datasets https://www.aclweb.org/anthology/D/D15/D15-1044.pdf
WikiHow: learning processes from lists and free text https://github.com/paolo7/KnowHowDataset

Grading

There will be two homework assignments (20%), one project (65%), and an in-class paper presentation (15%). The in-class presentation consists of a 12-minute presentation (12 slides max) and 3 minutes of Q&A. The project grade breaks down as follows: 1-page proposal (10%), 2-page mid-term report (10%), final presentation (15%), and final report (30%). Four late days are allowed with no penalty. After that, 50% will be deducted if the submission is within 4 days after the due date, unless you have a note from a doctor's office. Homework assignment submissions that are five days late will receive zero credit.
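As a quick sanity check on the weights above, the following is a minimal sketch of how the components could be combined into one course score. The component names, the `course_score` helper, and the 0-100 scale for each component are illustrative assumptions, not part of the official grading procedure; the two homework assignments are treated as a single combined score because the syllabus does not state how the 20% is split, and late penalties are not modeled.

```python
# Weights (percent of the course grade) as stated in the Grading section.
# Component names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "homework": 20,                    # two assignments, combined
    "paper_presentation": 15,          # in-class paper presentation
    "project_proposal": 10,            # 1-page proposal
    "project_midterm_report": 10,      # 2-page mid-term report
    "project_final_presentation": 15,
    "project_final_report": 30,
}

# The four project parts should add up to the stated 65% project weight.
PROJECT_TOTAL = sum(w for name, w in WEIGHTS.items() if name.startswith("project_"))


def course_score(scores):
    """Return the weighted course score (0-100) from per-component scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {name: 80 for name in WEIGHTS}
    example["homework"] = 90
    print(PROJECT_TOTAL, sum(WEIGHTS.values()), course_score(example))
```

The project parts sum to the 65% project weight, and all components together sum to 100%.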

Final Report Format

You must use the ICML 2018 LaTeX style files for writing the report. The final report must be 3-5 pages long, including references. You are encouraged to include the following components in your report (not necessarily in this order): abstract, introduction (motivation, task definition, your novel contributions), related work, technical approach (such as the math formulation of the problem, algorithms, and theorems, if any), experiments, discussion, and conclusion.

Academic Integrity

We follow UCSB's academic integrity policy (from UCSB Campus Regulations, Chapter VII: "Student Conduct and Discipline"):

It is expected that students attending the University of California understand and subscribe to the ideal of academic integrity, and are willing to bear individual responsibility for their work. Any work (written or otherwise) submitted to fulfill an academic requirement must represent a student’s original work. Any act of academic dishonesty, such as cheating or plagiarism, will subject a person to University disciplinary action. Using or attempting to use materials, information, study aids, or commercial “research” services not authorized by the instructor of the course constitutes cheating. Representing the words, ideas, or concepts of another person without appropriate attribution is plagiarism. Whenever another person’s written work is utilized, whether it be a single phrase or longer, quotation marks must be used and sources cited. Paraphrasing another’s work, i.e., borrowing the ideas or concepts and putting them into one’s “own” words, must also be acknowledged. Although a person’s state of mind and intention will be considered in determining the University response to an act of academic dishonesty, this in no way lessens the responsibility of the student.

More specifically, we follow Stefano Tessaro and William Cohen's policy in this class:

You cannot copy the code or answers to homework questions or exams from your classmates or from other sources. You may discuss course materials and assignments with your classmates, but you cannot write anything down. You must write down the answers / code independently. The presence or absence of any form of help or collaboration, whether given or received, must be explicitly stated and disclosed in full by all involved, on the first page of their assignment. Specifically, each assignment solution must start by answering the following questions:

(1) Did you receive any help whatsoever from anyone in solving this assignment? Yes / No.
- If you answered 'yes', give full details (e.g., "Jane explained to me what is asked in Question 3.4").
(2) Did you give any help whatsoever to anyone in solving this assignment? Yes / No.
- If you answered 'yes', give full details (e.g., "I pointed Joe to section 2.3 to help him with Question 2").
No electronics are allowed during exams, but you may prepare an A4-sized note and bring it to the exam.
If you have questions, ask the teaching staff.

Academic dishonesty will be reported to the highest line of command at UCSB. Students who engage in such activities will receive an F grade automatically.

Accessibility

Students with a documented disability are asked to contact the DSP office to arrange the necessary academic accommodations.

Discussions

All discussions and questions should be posted on our course Piazza site.

@@ Line 15: / Line 15: @@
 **Conner : [https://people.cs.umass.edu/~arvind/emnlp2014.pdf Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space, Neelakantan et al., EMNLP 2014]
 **Sanjana : [http://www.anthology.aclweb.org/D/D14/D14-1162.pdf Glove: Global Vectors for Word Representation, J Pennington, R Socher, CD Manning - EMNLP, 2014]
-** : [http://www.aclweb.org/anthology/P15-1173 AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes, Rothe and Schutze, ACL 2015]
+**Wenhu : [http://www.aclweb.org/anthology/P15-1173 AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes, Rothe and Schutze, ACL 2015]
 *01/30 Neural network basics (Project proposal due to Grader: Ke Ni < ke00@ucsb.edu> , HW1 out)
-**Mohith : [http://www.iro.umontreal.ca/~vincentp/ift3395/lectures/backprop_old.pdf Learning representations by back-propagating errors, Nature, 1986]
+**Jashanvir : [http://www.iro.umontreal.ca/~vincentp/ift3395/lectures/backprop_old.pdf Learning representations by back-propagating errors, Nature, 1986]
-**Dan : [https://arxiv.org/abs/1609.04747 An overview of gradient descent optimization algorithms, Sebastian Ruder, Arxiv 2016]
+**Metehan : [https://arxiv.org/abs/1609.04747 An overview of gradient descent optimization algorithms, Sebastian Ruder, Arxiv 2016]
 **Vivek P.: [http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf Dropout: A simple way to prevent neural networks from overfitting (2014), N. Srivastava et al., JMLR 2014]
 *02/01 Recursive Neural Networks
@@ Line 58: / Line 58: @@
 *03/06 Unsupervised Learning
 **Hongmin : [http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf Generative Adversarial Nets, Goodfellow et al., NIPS 2014]
-**Austin : [https://arxiv.org/abs/1312.6114 Auto-encoding variational Bayes, Kingma and Welling, ICLR 2014]
+**Burak : [https://arxiv.org/abs/1312.6114 Auto-encoding variational Bayes, Kingma and Welling, ICLR 2014]
 **Pushkar : [https://arxiv.org/pdf/1511.06434.pdf Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford et al., 2015]
+**Liu : [http://papers.nips.cc/paper/5949-semi-supervised-sequence-learning.pdf Semi-supervised Sequence Learning, Dai et al., NIPS 2015]
 ====Course Objective====
