General Info:

Instructors: Prof. Klaus Mueller and Prof. Kazem Mahdavi
    Office hours: SUNY Korea B-471, We 2-3pm (or send email for other arrangements)
    Phone: +82-32-626-1200
    Email: mueller{remove_this}@cs.sunysb.edu

Teaching assistant: 

Meeting time and venue:

     SUNY Korea, TBD, TuTh 3:30 - 4:50 pm

Summary:
   Digital data is growing at a tremendous rate. Nearly every aspect of life and matter is now recorded and stored on cheap disks, whether in the cloud, in businesses, or in research labs. We can now afford to explore very complex relationships in which many variables play a part. But this data-driven world requires powerful tools that allow us to be creative and to sculpt intricate insight from the raw block of data. This course covers the fundamental concepts of data science, the umbrella term for the many disciplines that contribute these tools, and equips students with the key skill set for becoming good data scientists. Major topics include project scoping, data preparation, basic statistics, visualization, statistical learning, data mining, high-performance computing, analysis of various types of structured and unstructured data, matrix methods, scalability, and optimization.
 
Prerequisites:
    Graduate standing

Texts:
Required:
  - Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by F. Provost and T. Fawcett, O'Reilly Media, 2013
  - Data Mining: The Textbook by C. Aggarwal, Springer, 2015

Optional:
  - Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die by E. Siegel and T. Davenport, 2013
  - Automate This: How Algorithms Came to Rule Our World by C. Steiner, 2012
  - Doing Data Science: Straight Talk from the Frontline by C. O'Neil and R. Schutt, O'Reilly Media, 2013

Grading:

    Midterm: 30%
    Projects (2): 15% each
    Final Project: 40% (10% for each component: proposal, preliminary report, final report, presentation)