Spring 2022 CS165B Machine Learning

Revision as of 22:52, 3 April 2022 by Wangwilliamyang


Instructor and Venue

  • Instructor: William Wang
  • TAs:
    • Alex Rich <anrich@ucsb.edu>
    • Yijun Xiao <yijunxiao@cs.ucsb.edu>
    • Weixi Feng <weixifeng@ucsb.edu>
    • Xinyi Wang <xinyi_wang@ucsb.edu>
  • Time: T R 12:30pm - 1:45pm
  • Location: IV THEA2
  • Discussions:
    • TBD
  • TA Office Hours Location:
  • Instructor Office Hours: Starting 2nd week, Th 2-3pm Henley Hall 2005
  • Prerequisites:
    • The formal prerequisite is Computer Science 130A (Data Structures and Algorithms I). This implies that you have taken CS 40, CS 32, and PSTAT 120A or ECE 139 and you have studied topics such as algorithms, data structures, searching and sorting techniques, recursion, and induction, all of which are relevant to this course. Most importantly, you need to be able to think logically about problems and solution strategies, and you must be familiar enough with writing software to implement solutions on your own. You will be well-prepared if you have completed the CS lower-division courses, the Math and PSTAT requirements, and CS 130A.
    • Some familiarity with basic concepts of machine learning, linear algebra, probability, and calculus.

Course Objective

At the end of the quarter, students should have a good understanding of basic ML problems and tasks and should be able to implement some fundamental algorithms for simple ML problems. Students will also develop an understanding of the open research problems in ML.

Piazza

piazza.com/ucsb/spring2022/cs165b

Syllabus

  • 3/29: Introduction and Logistics (Reading Prologue)
  • 3/31: Key Machine Learning Concepts (Reading Ch. 1)
  • 4/5: Classification (HW#1 out) (Reading Ch. 2)
  • 4/7: Classification (Reading Ch. 3)
  • 4/12: Decision Trees (Reading Ch. 4)
  • 4/14: Decision Trees (HW#1 due, HW#2 out) (Reading Ch. 5)
  • 4/19: Linear models (Reading Ch. 7)
  • 4/21: Linear models and perceptrons, Midterm review
  • 4/26: Perceptrons, Support vector machines
  • 4/28: In-Class Midterm exam
  • 5/3: Responsible Machine Learning (Guest Lecturer: Sharon Levy, Amazon Fellow)
  • 5/5: Kernel methods, distance measures, clustering (Reading Ch. 8) (HW#2 due, HW#3 out)
  • 5/10: Clustering, naive Bayes classifier (Reading Ch. 9.1, 9.2)
  • 5/12: Features, model ensembles (Reading Ch. 11)
  • 5/17: Machine learning experiments (Reading Ch. 10)
  • 5/19: Neural networks (Reading Ch. 12) (HW#3 due, HW#4 out)
  • 5/24: Neural networks & Deep learning (Reading NN and DL chapter 1)
  • 5/26: Neural networks & Deep learning (Reading NN and DL chapter 6)
  • 5/31: Multimodal Machine Learning (Guest Lecturer: Wanrong Zhu, Regents Fellow)
  • 6/2: Final Exam Review (HW#4 due)
  • 6/4-6/10: Final Exam in the Final Week TBD

Course Description

  • What is this course about?

The course covers the core principles and key techniques of machine learning (ML), which is the study of algorithms that learn from data and experience. Topics include classification, concept learning, decision trees, artificial neural networks, Bayesian learning, and others. Both theory and practice will be emphasized. However, this is an introductory course - you won't learn everything about machine learning here, but you'll be prepared to go deeper in the subject.

  • Overview

Machine learning studies the question "How can we build computer programs that automatically improve their performance through experience?" This includes learning to perform different types of tasks based on different types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments; medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records; speech recognition systems that learn to better understand your speech based on experience listening to you; and computer vision systems that learn how to represent and recognize objects. This course is designed to give students a thorough grounding in the methods, theory, and algorithms of machine learning. The topics of the course draw from several fields that contribute to machine learning, including classical and Bayesian statistics, pattern recognition, and information theory.

  • We will cover topics such as:
    • Overview of machine learning
    • Decision Tree Learning
    • Artificial Neural Networks
    • Bayesian Learning
    • Computational Learning Theory
    • Instance-Based Learning
    • Genetic Algorithms
    • Classification

Note that CS165A, taught in the Winter Quarter, is a complementary course on Artificial Intelligence, although it is not a prerequisite to CS165B. Artificial Intelligence is a broader area and encompasses topics beyond machine learning such as logic, problem-solving and search, expert systems, intelligent agents, knowledge representation, planning, and natural language processing.

  • What you will learn

By the end of the quarter, you should understand what machine learning is all about - what it entails, what it can (and can't) be used for, what its underlying assumptions and principles are, and what its main tools and techniques are. You will gain experience, both conceptually and practically, through homework assignments that involve solving problems and implementing machine learning algorithms. You will see many examples of working machine learning systems and research throughout the quarter to provide a broad exposure to the topic.

  • Why study machine learning?

Machine learning, which was originally an area of study within the broad field of artificial intelligence, has a long history, with notable failures and successes, but always with a promise of being really useful some day. That day has come. In recent years machine learning has become an extremely important and relevant topic and a core competency for many (perhaps most - maybe even all) areas of science, business, and government. The explosion of "big data" has made machine learning a critical technology for finding the information needles in the data haystacks, so to speak. Huge amounts of data combined with machine learning techniques are at the heart of what Google, Amazon, Netflix, Facebook, and many other key companies do, and are increasingly important in scientific pursuits.

Gaining knowledge and experience in machine learning is valuable for a wide range of career paths, including research, industry, and entrepreneurship.

Textbook

There is a required textbook, available at the UCSB bookstore, from which readings will be assigned throughout the quarter.

  • Required: Machine Learning: The Art and Science of Algorithms that Make Sense of Data, by Peter Flach, Cambridge University Press, 2012.

Also available at Amazon, etc.

  • (optional) Machine Learning, Tom Mitchell. (Errata and new chapters)
  • (optional) Pattern Recognition and Machine Learning, Chris Bishop.
  • (Freely downloadable) A Course in Machine Learning, Hal Daumé III
  • (Freely downloadable) Information Theory, Inference, and Learning Algorithms, David MacKay

Grading

There will be four homework assignments (40%), one midterm exam (20%), and a final exam (40%). Four late days are allowed with no penalty. After that, 50% will be deducted from any submission turned in within 4 days after the due date, unless you have a note from a doctor's office. Homework submissions that are five days late will receive zero credit. Your grade can be found on GauchoSpace.
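As a rough illustration of how the stated weights combine, here is a minimal Python sketch. The function name `course_score`, the 0-100 score scale, and the equal weighting of the four homework assignments are assumptions made for this example; only the 40% / 20% / 40% split comes from the policy above, and the late-day rules are not modeled.

```python
# Weights stated in the Grading section; everything else is illustrative.
HOMEWORK_WEIGHT = 0.40
MIDTERM_WEIGHT = 0.20
FINAL_WEIGHT = 0.40


def course_score(homework_scores, midterm, final):
    """Return a weighted course score on a 0-100 scale.

    Assumption: the four homework assignments count equally. The syllabus
    only gives their combined 40% weight, not a per-assignment breakdown.
    """
    if len(homework_scores) != 4:
        raise ValueError("expected exactly four homework scores")
    homework_average = sum(homework_scores) / len(homework_scores)
    return (HOMEWORK_WEIGHT * homework_average
            + MIDTERM_WEIGHT * midterm
            + FINAL_WEIGHT * final)


# Hypothetical scores, chosen only to show the call.
print(course_score([90, 80, 100, 70], midterm=75, final=85))
```

This is only a way to estimate where you stand; the grade posted on GauchoSpace is the authoritative one.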

Academic Integrity

We follow UCSB's academic integrity policy (from UCSB Campus Regulations, Chapter VII: "Student Conduct and Discipline"):

  • It is expected that students attending the University of California understand and subscribe to the ideal of academic integrity, and are willing to bear individual responsibility for their work. Any work (written or otherwise) submitted to fulfill an academic requirement must represent a student’s original work. Any act of academic dishonesty, such as cheating or plagiarism, will subject a person to University disciplinary action. Using or attempting to use materials, information, study aids, or commercial “research” services not authorized by the instructor of the course constitutes cheating. Representing another person's words, ideas, or concepts without appropriate attribution is plagiarism. Whenever another person’s written work is utilized, whether it be a single phrase or longer, quotation marks must be used and sources cited. Paraphrasing another’s work, i.e., borrowing the ideas or concepts and putting them into one’s “own” words, must also be acknowledged. Although a person’s state of mind and intention will be considered in determining the University response to an act of academic dishonesty, this in no way lessens the responsibility of the student.

More specifically, we follow Stefano Tessaro and William Cohen's policy in this class:

You cannot copy the code or answers to homework questions or exams from your classmates or from other sources. You may discuss course materials and assignments with your classmates, but you cannot write anything down. You must write down the answers and code independently. The presence or absence of any form of help or collaboration, whether given or received, must be explicitly stated and disclosed in full by all involved, on the first page of their assignment. Specifically, each assignment solution must start by answering the following questions:

  • (1) Did you receive any help whatsoever from anyone in solving this assignment? Yes / No.
    • If you answered 'yes', give full details (e.g. "Jane explained to me what is asked in Question 3.4").
  • (2) Did you give any help whatsoever to anyone in solving this assignment? Yes / No.
    • If you answered 'yes', give full details (e.g. "I pointed Joe to section 2.3 to help him with Question 2").
  • No electronics are allowed during exams, but you may prepare an A4-sized note and bring it to the exam.
  • If you have questions, ask the teaching staff.

Academic dishonesty will be reported to the highest line of command at UCSB. Students who engage in such activities will receive an F grade automatically.

  • Unauthorized uploading of any course materials to any external website is strictly prohibited.

Accessibility

Students with a documented disability are asked to contact the DSP office to arrange the necessary academic accommodations.

Discussions

All discussions and questions should be posted on our course Piazza site.