Spring 2019 CS165B Machine Learning
Instructor and Venue
- Instructor: William Wang
- TAs:
- Yuqing Zhu yuqingzhu@ucsb.edu
- Hongmin Wang hongmin_wang@cs.ucsb.edu
- Grader: TBD
- Time: T R 11:00am - 12:15pm
- Location: SH 1431
- Discussions:
- T 4:00-4:50 GIRV 2112
- T 5:00-5:50 387 1015
- TA Office Hours Location: Room 104 at Trailer 935, next to Phelps
- Yuqing Zhu: TBD
- Hongmin Wang: TBD
- Instructor Office Hours: Th 12:30pm-1:30pm HFH 1115 (No OH on 4/11)
- Prerequisites:
- The formal prerequisite is Computer Science 130A (Data Structures and Algorithms I). This implies that you have taken CS 40, CS 32, and PSTAT 120A or ECE 139 and you have studied topics such as algorithms, data structures, searching and sorting techniques, recursion, and induction, all of which are relevant to this course. Most importantly, you need to be able to think logically about problems and solution strategies, and you must be familiar enough with writing software to implement solutions on your own. If you have completed the CS lower division courses, the Math and PSTAT requirements, and CS 130A, you will be well-prepared.
- Some familiarity with basic concepts of machine learning, linear algebra, probability, and calculus.
Course Objective
At the end of the quarter, students should have a good understanding of basic ML problems and tasks and should be able to implement some fundamental algorithms for simple problems in ML. Students will also develop an understanding of the open research problems in ML.
Piazza
Please enroll if you haven't: [1]
Syllabus
- 04/02: Introduction and Logistics
- 04/04: Key Machine Learning Concepts
- 04/09: Machine Learning in Practice (Guest Lecturer: Dr. Andrew Mutz, CTO at Appfolio (NASDAQ: APPF))
- 04/11: Classification (Guest Lecturer: Hongmin Wang, HW#1 out)
- 04/16: Machine Learning in Practice (Guest Lecturer: Dr. Nilesh Mishra, Engineering Manager at LogMeIn (NASDAQ: LOGM))
- 04/18: Classification
- 04/23: Decision Trees
- 04/25: Decision Trees (HW#1 due, HW#2 out)
- 04/30: Linear models
- 05/02: Linear models, perceptrons, midterm review
- 05/07: In-Class Midterm exam
- 05/09: Perceptrons, Support vector machines (HW#2 due, HW#3 out)
- 05/14: Kernel methods, distance measures, clustering
- 05/16: Clustering, naive Bayes classifier
- 05/21: Features, model ensembles
- 05/23: Machine learning experiments (HW#3 due, HW#4 out)
- 05/28: Neural networks
- 05/30: Neural networks & Deep learning
- 06/04: Neural networks & Deep learning
- 06/06: Final Exam Review (HW#4 due)
Course Description
- What is this course about?
The course covers the core principles and key techniques of machine learning (ML), which is the study of algorithms that learn from data and experience. Topics include classification, concept learning, decision trees, artificial neural networks, Bayesian learning, and others. Both theory and practice will be emphasized. However, this is an introductory course: you won't learn everything about machine learning here, but you'll be prepared to go deeper in the subject.
- Overview
Machine learning studies the question "How can we build computer programs that automatically improve their performance through experience?" This includes learning to perform different types of tasks based on different types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments; medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records; speech recognition systems that learn to better understand your speech based on experience listening to you; and computer vision systems that learn how to represent and recognize objects. This course is designed to give students a thorough grounding in the methods, theory, and algorithms of machine learning. The topics of the course draw from several fields that contribute to machine learning, including classical and Bayesian statistics, pattern recognition, and information theory.
- We will cover topics such as:
- Overview of machine learning
- Decision Tree Learning
- Artificial Neural Networks
- Bayesian Learning
- Computational Learning Theory
- Instance-Based Learning
- Genetic Algorithms
- Classification
Note that CS165A, taught in the Winter Quarter, is a complementary course on Artificial Intelligence, although it is not a prerequisite to CS165B. Artificial Intelligence is a broader area and encompasses topics beyond machine learning such as logic, problem-solving and search, expert systems, intelligent agents, knowledge representation, planning, and natural language processing.
- What you will learn
By the end of the quarter, you should understand what machine learning is all about: what it entails, what it can (and can't) be used for, what its underlying assumptions and principles are, and what the main tools and techniques are. You will gain experience, both conceptually and practically, through homework assignments that involve solving problems and implementing machine learning algorithms. You will see many examples of working machine learning systems and research throughout the quarter to provide a broad exposure to the topic.
- Why study machine learning?
Machine learning, which was originally an area of study within the broad field of artificial intelligence, has a long history, with notable failures and successes, but always with a promise of being really useful some day. That day has come. In recent years machine learning has become an extremely important and relevant topic and a core competency for many (perhaps most, maybe even all) areas of science, business, and government. The explosion of "big data" has made machine learning a critical technology for finding the information needles in the data haystacks, so to speak. Huge amounts of data combined with machine learning techniques are at the heart of what Google, Amazon, Netflix, Facebook, and many other key companies do, and they are increasingly important in scientific pursuits.
Gaining knowledge and experience in machine learning is valuable for a wide range of career paths, including research, industry, and entrepreneurship.
Text Book
There is a required textbook, available at the UCSB bookstore, from which reading will be assigned throughout the quarter.
- Required: Machine Learning: The Art and Science of Algorithms that Make Sense of Data, by Peter Flach, Cambridge University Press, 2012.
Also available at Amazon, etc.
- (optional) Machine Learning, Tom Mitchell. (Errata and new chapters)
- (optional) Pattern Recognition and Machine Learning, Chris Bishop.
- (Freely downloadable) A Course in Machine Learning, Hal Daumé III
- (Freely downloadable) Information Theory, Inference, and Learning Algorithms, David MacKay
Grading
There will be four homework assignments (40%), one mid-term exam (20%), and a final exam (40%). Four late days are allowed with no penalty. After that, 50% will be deducted if the submission is within 4 days after the due date, unless you have a note from a doctor's office. Homework assignment submissions that are five or more days late will receive zero credit. Your grades can be found on GauchoSpace.
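As a rough illustration of the weighting above, the following Python sketch combines the three components into a single score. The function name `course_score` is hypothetical, and the sketch assumes (this is not stated in the policy) that the four homework assignments count equally and that every score is a percentage between 0 and 100. Late penalties are not modeled.

```python
# Weights taken from the grading breakdown: homework 40%, midterm 20%, final 40%.
WEIGHTS = {"homework": 0.40, "midterm": 0.20, "final": 0.40}


def course_score(homework_scores, midterm, final):
    """Return the weighted course score on a 0-100 scale.

    Assumption (not from the syllabus): the four homework assignments
    are weighted equally, and all inputs are percentages in [0, 100].
    """
    if len(homework_scores) != 4:
        raise ValueError("expected exactly four homework scores")
    homework_avg = sum(homework_scores) / len(homework_scores)
    return (
        WEIGHTS["homework"] * homework_avg
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["final"] * final
    )


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(course_score([90, 80, 100, 70], midterm=75, final=85))
```

Because the final exam and the homework carry equal weight, a strong homework record can offset a weaker final by the same margin, and vice versa.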
Academic Integrity
We follow UCSB's academic integrity policy (from UCSB Campus Regulations, Chapter VII: "Student Conduct and Discipline"):
- It is expected that students attending the University of California understand and subscribe to the ideal of academic integrity, and are willing to bear individual responsibility for their work. Any work (written or otherwise) submitted to fulfill an academic requirement must represent a student’s original work. Any act of academic dishonesty, such as cheating or plagiarism, will subject a person to University disciplinary action. Using or attempting to use materials, information, study aids, or commercial “research” services not authorized by the instructor of the course constitutes cheating. Representing the words, ideas, or concepts of another person without appropriate attribution is plagiarism. Whenever another person’s written work is utilized, whether it be a single phrase or longer, quotation marks must be used and sources cited. Paraphrasing another’s work, i.e., borrowing the ideas or concepts and putting them into one’s “own” words, must also be acknowledged. Although a person’s state of mind and intention will be considered in determining the University response to an act of academic dishonesty, this in no way lessens the responsibility of the student.
More specifically, we follow Stefano Tessaro and William Cohen's policy in this class:
You cannot copy the code or answers to homework questions or exams from your classmates or from other sources. You may discuss course materials and assignments with your classmates, but you cannot write anything down during those discussions. You must write down the answers and code independently. The presence or absence of any form of help or collaboration, whether given or received, must be explicitly stated and disclosed in full by all involved, on the first page of their assignment. Specifically, each assignment solution must start by answering the following questions:
- (1) Did you receive any help whatsoever from anyone in solving this assignment? Yes / No.
- If you answered 'yes', give full details (e.g. "Jane explained to me what is asked in Question 3.4").
- (2) Did you give any help whatsoever to anyone in solving this assignment? Yes / No.
- If you answered 'yes', give full details (e.g. "I pointed Joe to section 2.3 to help him with Question 2").
- No electronics are allowed during exams, but you may prepare an A4-sized note and bring it to the exam.
- If you have questions, ask the teaching staff.
Academic dishonesty will be reported to the highest line of command at UCSB. Students who engage in such activities will receive an F grade automatically.
Accessibility
Students with a documented disability are asked to contact the DSP office to arrange the necessary academic accommodations.
Discussions
All discussions and questions should be posted on our course Piazza site.