General Info:
Instructors: Prof. Klaus Mueller and Prof. Kazem Mahdavi
Office hours: SUNY Korea B-471, We 2-3pm
(or send email for other arrangements)
Phone: +82-32-626-1200
Email:
mueller{remove_this}@cs.sunysb.edu
Teaching assistant:
Meeting time and venue:
SUNY Korea, TBD, TuTh 3:30 - 4:50 pm
Summary:
The growth of digital data is tremendous. Nearly every aspect of life and matter
is being recorded and stored on cheap disks, whether in the cloud, in businesses,
or in research labs. We can now afford to explore very complex relationships in
which many variables play a part. But this data-driven world calls for powerful
tools that allow us to be creative, to sculpt intricate insight from the raw
block of data. This course covers the fundamental concepts of data science, the
umbrella term for the many disciplines that contribute these tools. It will
equip students with the key skills needed to become good data scientists. Major
topics include scoping projects, data preparation, statistics basics, visualization,
statistical learning, data mining, high performance computing, various types
of structured and unstructured data analysis, matrix methods, scalability, and
optimization.
Prerequisites:
Graduate standing
Texts:
Required:
- Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by F. Provost and T. Fawcett, O'Reilly Media, 2013
- Data Mining: The Textbook by C. Aggarwal, Springer, 2015
Optional:
- Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die by E. Siegel and T. Davenport, 2013
- Automate This: How Algorithms Came to Rule Our World by C. Steiner, 2012
- Doing Data Science: Straight Talk from the Frontline by C. O’Neil and R. Schutt, O'Reilly Media, 2013
Grading:
Midterm: 30%
Projects (2): 15% each
Final Project: 40% (10% for each component: proposal, preliminary report, final report, presentation)