CSE 519 - Data Science

Fall 2018

Data Science is a rapidly emerging discipline at the intersection of statistics, machine learning, data visualization, and mathematical modeling. This course is designed to provide a hands-on introduction to Data Science by challenging student groups to build predictive models for upcoming events and to validate their models against the actual outcomes.

  • Course Time: 8:30-9:50AM Tuesday and Thursday
    Place: 102 Frey Hall
  • Steven Skiena's office hours are 2:30PM-4PM Tuesday-Thursday, in 251 New Computer Science, and by appointment.
  • The course teaching assistants will be:
    • Allen Kim. His email address is allen.kim@stonybrook.edu. He will have office hours Monday and Wednesday 2:30PM-4PM in room 340 New Computer Science.
    • Harsh Agarwal. His email address is hagarwal@cs.stonybrook.edu. He will have office hours Thursday 2-4PM in the TA room 2217 in Old Computer Science.
    • Shouvik Roy. His email address is shroy@cs.stonybrook.edu. He will have office hours Tuesday 3-4:30PM and Wednesday 4-5:30PM in the TA room 2206 Old Computer Science.
    • Raveendra Soori. His email address is raveendra.soori@stonybrook.edu. He will have office hours Tuesday and Thursday from 1:30-3PM in the TA rooms 2203 and 2217 in Old Computer Science.
    • Sahil Sobti. His email address is ssobti@cs.stonybrook.edu. He will have office hours Thursday 2-4PM in the TA room 2203 in Old Computer Science.
  • Videos and slides from my Fall 2016 lectures are available here. The video from Fall 2017 appears here, but the quality is not good. The best material should always be available at www.data-manual.com.
  • Sign up for the Piazza class discussion board at https://piazza.com/stonybrook/fall2018/cse519.
  • Syllabus
  • Lecture Schedule

Textbook

We will use my new book The Data Science Design Manual, Springer-Verlag, 2017. The associated website www.data-manual.com points to many resources, including lecture notes/videos, errata, a problem solution Wiki, and sample Python notebooks for generating figures from the book.

I welcome feedback on the book. Please keep track of any errata you find and send them to me, ideally in one batch at the end of the semester.

Homework Assignments

Lecture Notes

I will give about 25 formal lectures this semester. All classes will be filmed by Echo360 and made available on Blackboard.

Ritika Nevatia is graciously making the lecture notes she takes in class available to all interested students. Check them out.

Old lecture notes are available from the previous offering in Fall 2014.

Semester Projects

Roughly half of the course grade will come from a course project. Students will typically work in small groups (2-3 people) on independent research projects. I will distribute a list of possible projects about six weeks into the semester. You are encouraged to develop your own project ideas, although I must approve them.

Recommended Readings

The field of data science is still emerging, but there are several books which it will be useful to read and consult:

Videos: The Quant Shop

The Quant Shop is a series of eight 30-minute programs on Data Science, produced from the Fall 2014 offering of this course. Watch them for inspiration at the Quant Shop Vimeo channel.

Related Links

Professor

Steven S. Skiena
251 New Computer Science Building
Department of Computer Science
Stony Brook University
Stony Brook, NY 11794-2424, USA
skiena@cs.stonybrook.edu
631-632-9026