Mini-project #3 is out, due October 27.
Mini-project #2 is out, due October 8.
(Make sure to consider the size of available datasets while scoping projects)
TA information is now up to date.
Mini-project #1 is out, due September 17.
Welcome to the class! Hope you will enjoy it :)
Students are encouraged to study the Syllabus to have a general understanding of the
course organization, as well as the Assignments to have an idea about the workload.
Instructor: Leman Akoglu
- Office: 257 Computer Science
- Office hours: Tue 2:30PM - 3:30PM
- Email: invert (cs.stonybrook.edu @ leman)
Teaching Assistant: Ali Selman Aydin
- Office: Computer Science 2203 (old bldg)
- Office hours: Monday 4PM - 5PM
- Email: invert (stonybrook.edu @ aliselman.aydin)
Tue & Thu 5:30PM - 6:50PM
Computer Science 2311 (old bldg)
Knowledge discovery in data is "the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" --- Fayyad et al. (1996)
Large-scale data generated by humans and machines is available everywhere. Acquiring the
fundamental skills to 1) analyze and understand and 2) manage and process these large
datasets is crucial in today's data-driven world for producing data products that solve real-world problems.
This course will cover the fundamental concepts in data science, to equip students with the key skills needed to become good data scientists. Major topics
include scoping projects, data preparation, statistics basics, visualization, statistical learning, data mining, various types of structured and unstructured data analysis, matrix methods, scalability, and optimization.
We expect students to have a copy of the following book, from which most readings will be assigned.
Below you can find a list of other recommended books to learn certain subjects in more depth.
I will also post the lecture notes on the Blackboard
for other pointers.
BULLETIN BOARD and other info
MISC - FUN: