CSE 390, Fall 2017: Probability & Statistics for Data Science

News:
09/19: Lecture 6 slides and Python scripts posted.
09/14: Lecture 5 slides and Python scripts posted.
09/12: Lecture 4 slides posted.
08/29: Lecture 1 slides posted.
08/05: Our first lecture will be on Aug 29th (Tues) at 4pm in Frey 205.

When: Tue and Thu, 4:00pm - 5:20pm
Where: Frey Hall 205
Instructor: Anshul Gandhi
Instructor Office Hours: Tue 3-4pm and Thu 5:30-6:30pm, 347, New CS building
Course TAs: Caitao Zhan, Kunal Shah
TA Office Hours: By appointment (please email the TA(s) to schedule)

Course Info

This undergraduate-level special topics course covers the probability and statistics that data scientists need to analyze and interpret data. The course combines theoretical topics with some programming assignments. It is aimed primarily at junior and senior undergraduates who are comfortable with basic probability concepts and basic programming. Undergraduates from Computer Science, Applied Mathematics and Statistics, and Electrical and Computer Engineering are well suited to this class. Topics covered include Probability Theory, Random Variables, Stochastic Processes, Statistical Inference, Hypothesis Testing, Regression, Classification, and Clustering. For more details, refer to the syllabus below.

The class is expected to be interactive and students are encouraged to participate in class discussions.
Grading will be on a curve, and will tentatively be based on assignments, exams, and class participation. For more details, refer to the section on grading below.

Syllabus & Schedule

Aug 29 (Tue) [Lec 01]
Course introduction, class logistics

Aug 31 (Thu) [Lec 02]
Probability review - 1
  • Basics: sample space, outcomes, probability
  • Events: mutually exclusive, independent
  • Calculating probability: sets, counting, tree diagram
Readings: AoS 1.1 - 1.6; MHB 3.1 - 3.5

Sep 05 (Tue)
Labor Day, no class

Sep 07 (Thu) [Lec 03]
Probability review - 2
  • Conditional probability
  • Law of total probability
  • Bayes' theorem
Readings: AoS 1.7; MHB 3.6, 3.10 - 3.11
Notes: Assignment 1 out

Sep 12 (Tue) [Lec 04]
Random variables - 1: Overview
  • Discrete and Continuous RVs
  • Mean, Moments, Variance
  • pmf, pdf, cdf
Readings: AoS 2.1 - 2.3; MHB 3.7 - 3.9

Sep 14 (Thu) [Lec 05]
Random variables - 2: Discrete RVs
  • Bernoulli(p)
  • Binomial(n, p)
  • Geometric(p)
  • Indicator RV
Readings: AoS 2.4; MHB 3.7 - 3.9, 3.14.1
Notes: Python scripts: draw_Bernoulli, draw_Binomial, draw_Geometric, sample_Bernoulli, sample_Binomial, sample_Geometric

Sep 19 (Tue) [Lec 06]
Random variables - 3: Continuous RVs
  • Uniform(a, b)
  • Exponential(λ)
  • Normal(μ, σ²), and its several properties
Readings: AoS 2.7; MHB 3.14.1, 3.10, 3.13
Notes: Assignment 1 due; Assignment 2 out; Python scripts: draw_Uniform, draw_Exponential, draw_Normal, sample_Uniform, sample_Exponential, sample_Normal

Sep 21 (Thu)
Instructor traveling, no class

Sep 26 (Tue) [Lec 07]
Random variables - 4: Joint distributions & conditioning
  • Joint probability distribution
  • Linearity (and product) of expectation
  • Conditional expectation
  • Sum of a random number of RVs
Readings: AoS 2.8; MHB 3.11 - 3.12, 3.15

Sep 28 (Thu) [Lec 08]
Probability inequalities
  • Markov's inequality
  • Chebyshev's inequality
  • Weak law of large numbers
  • Central limit theorem
Readings: AoS 4.1 - 4.2, 23.1 - 23.3; MHB 3.14.2, 8.1 - 8.7

Oct 03 (Tue) [Lec 09]
Non-parametric inference - 1
Readings: AoS 6.1 - 6.2, 7.1 - 7.2
Notes: Assignment 2 due

Oct 05 (Thu) [Lec 10]
Non-parametric inference - 2
Readings: AoS 20.2

Oct 10 (Tue)
Mid-term 1

Resources

Grading (tentative)

Academic Integrity

Each student must pursue his or her academic goals honestly and be personally accountable for all submitted work. Representing another person's work as your own is always wrong. Faculty are required to report any suspected instances of academic dishonesty to the Academic Judiciary. For more comprehensive information on academic integrity, including categories of academic dishonesty, please refer to the academic judiciary website at http://www.stonybrook.edu/commcms/academic_integrity. Please note that any incident of academic dishonesty will immediately result in an F grade for the student.

Critical Incident Management

Stony Brook University expects students to respect the rights, privileges, and property of other people. Faculty are required to report to the Office of Judicial Affairs any disruptive behavior that interrupts their ability to teach, compromises the safety of the learning environment, or inhibits students' ability to learn.

Disability Support Services

If you have a physical, psychological, medical, or learning disability that may impact your course work, please contact Disability Support Services, ECC (Educational Communications Center) Building, room 128, (631) 632-6748. They will determine with you what accommodations, if any, are necessary and appropriate. All information and documentation is confidential. See http://studentaffairs.stonybrook.edu/dss.

Please report any errors to the instructor.