CSE 519 - Data Science

Fall 2016

Data Science is a rapidly emerging discipline at the intersection of statistics, machine learning, data visualization, and mathematical modeling. This course is designed to provide a hands-on introduction to Data Science by challenging student groups to build predictive models for upcoming events, and validating their models against the actual outcomes.

  • Course Time: 1:00-2:20PM Monday and Friday
    Place: Frey Hall 301
  • Steven Skiena's office hours are 2:30PM-4PM Tuesday-Thursday, in 251 New Computer Science, and by appointment.
  • The course teaching assistant will be Junting Ye. His email address is cse519@cs.stonybrook.edu. He will have office hours Tuesday 3PM-4PM and Wednesday 10AM-11AM in the Graduate Student Lounge, 3rd Floor, New Computer Science.
  • Syllabus
  • Lecture Schedule

Textbook

We will use a preliminary manuscript of my forthcoming book “The Data Science Design Manual”, which has a target publication date of August 2017. Copies of the manuscript will be available for purchase at cost (about $25) from the Stony Brook campus UPS store, located on the lower level of Melville Library, E0320. Phone: (631) 632-1831.

I will welcome feedback on the book and corrections to the manuscript at the end of the semester, so please mark them clearly (with a highlighter?) in your copy as you read it. Extra credit will be awarded to students who report a substantial number of corrections, especially on chapters (ID mod 12)+1 and (ID mod 12)+7, so that I get feedback on everything.
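As a rough illustration of the chapter arithmetic above, the following Python sketch computes the two chapters for a given ID. It assumes that "ID" means a numeric student ID; the function name `assigned_chapters` and the sample ID are hypothetical and not part of the course materials.

```python
from typing import Tuple


def assigned_chapters(student_id: int) -> Tuple[int, int]:
    """Return the two chapters given by (ID mod 12)+1 and (ID mod 12)+7.

    Assumption: student_id is the numeric student ID as a non-negative integer.
    """
    remainder = student_id % 12          # a value from 0 to 11
    return remainder + 1, remainder + 7  # first is 1..12, second is 7..18


if __name__ == "__main__":
    # Hypothetical ID, for illustration only.
    print(assigned_chapters(123456789))  # → (10, 16)
```

Across all possible IDs, the first formula yields chapters 1 through 12 and the second yields chapters 7 through 18, so between them every chapter from 1 to 18 is assigned to someone.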

Homework Assignments

Lecture Notes

I will give about 25 formal lectures this semester. All classes will be filmed by Echo360 and made available on Blackboard.

Old lecture notes are available from the previous offering in Fall 2014.

Semester Projects

Roughly half of the course grade will come from a course project. Students will typically work in small groups (2-3 people) on independent research projects. I will distribute a list of possible projects about six weeks into the semester. You are encouraged to develop your own project ideas, although I must approve them.

Recommended Readings

The field of data science is still emerging, but there are several books that will be useful to read and consult:

Videos: The Quant Shop

The Quant Shop is a series of eight 30-minute programs on Data Science, produced as part of the Fall 2014 offering of this course. Watch them for inspiration at the Quant Shop Vimeo channel.

Related Links

  • My algorithms textbook is the way to get a job at Google
  • Bing Predicts uses search queries and other modeling to predict the outcome of a variety of events.
  • CS109 Data Science, Harvard University, Fall 2015 -- This course stresses statistical modeling and Python programming. Very interesting, well-thought-out assignments.

Professor

Steven S. Skiena
251 New Computer Science Building
Department of Computer Science
Stony Brook University
Stony Brook, NY 11794-2424, USA
skiena@cs.stonybrook.edu
631-632-9026