CSE 544, Spring 2017: Probability and Statistics for Data Science

News:
4/25: Assignment 5 out, due in CS 347 on 5/11 (Thursday).



When: Mon Wed, 2:30pm - 3:50pm
Where: Old CS 2129
Instructor: Anshul Gandhi
Instructor Office Hours: Mon 5-6pm and Wed 4-5pm, in 347, New CS building
        Also by appointment (email the instructor to schedule)
Course TA: Xi Zhang (xizhang1 [at] cs [dot] stonybrook [dot] edu)

Course Info

This course covers probability and statistics topics required for data scientists to analyze and interpret data. The course is also part of the Data Science and Engineering Specialization. The course is targeted primarily at PhD and Masters students in the Computer Science Department. Topics covered include Probability Theory, Random Variables, Stochastic Processes, Statistical Inference, Hypothesis Testing, Regression, Classification, and Clustering. For more details, refer to the syllabus below.

The class is expected to be interactive, and students are encouraged to participate in class discussions.
Grading will be on a curve, and will tentatively be based on assignments, exams, a group project, and class participation. For more details, refer to the section on grading below.

Syllabus & Schedule

Date | Topic | Readings | Notes
Jan 23 (Mon) | Course introduction, class logistics | |
Jan 25 (Wed) | Probability review - 1 | AoS 1.1 - 1.6; MHB 3.1 - 3.5 |
Jan 30 (Mon) | Probability review - 2 | AoS 1.7; MHB 3.6, 3.10 - 3.11 |
Feb 01 (Wed) | Random variables - 1 | AoS 2.1 - 2.3; MHB 3.7 - 3.9 | assignment 1 out
Feb 06 (Mon) | Random variables - 2 | AoS 2.4; MHB 3.7 - 3.9, 3.14.1 | MATLAB scripts: draw_Bernoulli, draw_Binomial, draw_Geometric, sample_Bernoulli, sample_Binomial, sample_Geometric
Feb 08 (Wed) | Random variables - 3 | AoS 2.7; MHB 3.14.1, 3.10, 3.13 | MATLAB scripts: draw_Uniform, draw_Exponential, draw_Normal, sample_Uniform, sample_Exponential, sample_Normal
Feb 13 (Mon) | Conditioning and Expectations | AoS 2.8; MHB 3.11 - 3.12, 3.15 |
Feb 15 (Wed) | Probability inequalities, Stochastic processes, Markov chains | AoS 4.1 - 4.2, 23.1 - 23.3; MHB 3.14.2, 8.1 - 8.7 | assignment 1 due; assignment 2 out
Feb 20 (Mon) | Non-parametric inference - 1 | AoS 6.1 - 6.2, 7.1 - 7.2 | MATLAB scripts: draw_ecdf_random, draw_ecdf_normal, draw_ecdf_exponential
Feb 22 (Wed) | Non-parametric inference - 2 | AoS 20.2 | MATLAB scripts: draw_histogram_random, draw_histogram_normal
Feb 27 (Mon) | Non-parametric inference - 3, Confidence intervals | AoS 20.3, 6.3.2, 7.1 | MATLAB scripts: draw_kde
Mar 01 (Wed) | Parametric inference - 1 | AoS 6.3.1 - 6.3.2, 9.1 - 9.2 | assignment 2 due
Mar 06 (Mon) | Mid-term 1 | |
Mar 08 (Wed) | Parametric inference - 2 | AoS 9.3 - 9.4, 9.9 | assignment 3 out; Required data file for Q8.
Mar 13 (Mon) | Spring break | | No class
Mar 15 (Wed) | Spring break | | No class
Mar 20 (Mon) | Hypothesis testing - 1 | AoS 10 - 10.1, 10.10.2 |
Mar 22 (Wed) | Project discussion - 1 | | Finalize project dataset. Meet in CS 220.
Mar 27 (Mon) | Hypothesis testing - 2 | AoS 10.2, 10.5 | assignment 3 due; assignment 4 out; Required data: q5_sigma3.dat, q5_sigma100.dat, q7_X.dat, q7_Y.dat.
Mar 29 (Wed) | Bayesian inference | AoS 11.1 - 11.2, 11.7 |
Apr 03 (Mon) | Statistical Models | |
Apr 05 (Wed) | Project discussion - 2 | | Finalize project deliverables.
Apr 10 (Mon) | Mid-term prep | | No class. assignment 4 due
Apr 12 (Wed) | Mid-term 2 | |
Apr 17 (Mon) | Regression | |
Apr 19 (Wed) | Time series analysis | |
Apr 24 (Mon) | Project discussion - 3 | | Mid-project review.
Apr 26 (Wed) | Classification | | assignment 5 out. Required data: q2.dat, q4.dat, q5.dat.
May 01 (Mon) | Clustering | |
May 03 (Wed) | Project prep | |
May 08 (Mon) | Project final ppts | |
May 10 (Wed) | Project final ppts | |

Resources

Grading (tentative)

Group Project

Academic Integrity

Each student must pursue his or her academic goals honestly and be personally accountable for all submitted work. Representing another person's work as your own is always wrong. Faculty are required to report any suspected instances of academic dishonesty to the Academic Judiciary. For more comprehensive information on academic integrity, including categories of academic dishonesty, please refer to the academic judiciary website at http://www.stonybrook.edu/commcms/academic_integrity. Please note that any incident of academic dishonesty will immediately result in an F grade for the student.

Critical Incident Management

Stony Brook University expects students to respect the rights, privileges, and property of other people. Faculty are required to report to the Office of Judicial Affairs any disruptive behavior that interrupts their ability to teach, compromises the safety of the learning environment, or inhibits students' ability to learn.

Disability Support Services

If you have a physical, psychological, medical or learning disability that may impact your course work, please contact Disability Support Services, ECC (Educational Communications Center) Building, room 128, (631) 632-6748. They will determine with you what accommodations, if any, are necessary and appropriate. All information and documentation is confidential. http://studentaffairs.stonybrook.edu/dss.

Please report any errors to the instructor.