## CSE 548 / AMS 542 Fall 2012. Graduate Analysis of Algorithms

 Lecturer: Rob Johnson TAs: Sam McCauley (smccauley AT cs DOT stonybrook DOT edu) Li Niu (lniu AT cs DOT stonybrook DOT edu) Time: MoWe 2:30-3:50pm Location: Engineering 143 Office Hours: Rob: We 4:00-5:30, Th 3:30-5:00 2313D Computer Science Building Sam: MoTh 1:15-2:15, 1309 Computer Science Building Li: Fr 9-10, TA Room (Computer Science Building) Home page: http://www.cs.sunysb.edu/~rob/teaching/cse548-fa12

## News

• The final exam is available for your enjoyment.
• Sam's review session will take place tomorrow, Thursday Dec. 6th, at 1pm
in room CS 2114.  He will post pictures of the blackboard on Blackboard.
He will cover B^epsilon-trees and the answers to the mid-term and HW3.

Rob's review session will take place Monday, Dec. 10th, at 12pm in room
CS 2120.  Bring questions!

The final exam will take place Tuesday, Dec. 11th, at 5:30pm in Benedict
hall.  This is a dorm building.  I don't know the room number, so arrive
early and look around.  I will!

Also, Matthew Cordaro has graciously posted a PDF with all his notes for
the entire class on dropbox: http://tinyurl.com/bay8ndr.

• Here's source for the midterm.
• The res of HW3 is now available, with source.
• HW3 consists of two parts:
• Write up solutions to all the questions on our midterm.
• Several additional questions (to be posted soon).
HW3 is due in class on November 26th.
• The online book, Mathematics for Computer Science, by Eric Lehman and Tom Leighton, is a good reference on the background math for this class. See in particular chapters 18 onward for background on probability.
• Michael Bender has a great tutorial on external memory data structures.
• Yufei Tao's notes on the external memory model and sorting are a great resource for reviewing that material.
• Here are some old mid-terms and final exams for your reference on the style of my tests. Keep in mind that these are for completely different classes, so the material has nothing to do with the tests we will have in this class. Only use these to get an idea of the format of the tests.
• An addendum to the Oct. 10 lecture is now available.
• HW2 is now available, with source. (Now corrected!)
• Some requested reference materials:
• HW1 is now available, with source.
• Additional info on the Bloom Filter. (For the curious -- not required reading). Also, note that their notation switches m and n from what we used in class.

## Overview

This course will introduce the tools for designing and analyzing algorithms for solving common problems in computer science, with a special emphasis on algorithms that one would actually want to use. The course will also introduce probabilistic methods needed to analyze randomized algorithms, and will cover basic complexity theory and approximation algorithms for hard problems.

## Topics

Some subset of the following:
• Hashing: chaining vs. linear probing, balls and bins, expected occupancy, w.h.p. definition, w.h.p. max bin, coupon collecting (a slight detour...) , average vs worst case, algorithmic complexity attacks, 2-universal hashing, the power of 2 choices, cuckoo hashing, bloom filters, quotient filters
• Sorting: merge sort, quick sort, divide and conquer, recurrence relations, etc., median-finding, lower bounds
• Union-find (and amortized analysis)
• Heaps: Binomial heaps, Fibonacci heaps
• Ordered structures: 2-3 trees, red-black trees, splay trees, treaps, cache-oblivious trees?, skip lists
• Data structures for external storage: DAM model, B+-trees, LSM-trees, Fractal trees (aka LSM-trees + fractional cascading), B^\epsilon trees, Sorting in external storage, k-way merge sort, distribution sort?
• String algorithms: substring searching, longest-common substring, edit distance, longest repeated substring?
• Graph algorithms: DFS, BFS, connected components, topological sort, minimum spanning tree, matching, max-flow/min-cut, min-cost flow, dijkstra's algorithm, all-pairs shortest paths, CFL reachability
• Complexity theory: P vs NP, NP-completeness, NP-complete examples, reductions, approximation algorithms, co-NP, PSPACE, EXP, P#, etc.
• Other fascinating topics: number theory (primality testing), FFT, error-correcting codes, network coding, compression, expander graphs, polynomial identity testing, matrix multiplication
• Linear programming: simplex algorithm, duality, primal dual method, polynomial-time algorithms, integer-linear programming, convex programming

Subject to tweaks throughout the semester.
• Homeworks (25%). I will assign 8-10 homeworks throughout the semester. Homeworks must be written using LaTeX.
• Class participation (5%). I strongly encourage discussion and debate in class. Please don't hesitate to ask questions if you don't understand or disagree with anything covered in this course.
• Midterm exam (30%). You will receive 25% credit for any problem for which you write "I don't know" (and nothing else).
• Final exam (40%). The final is cumulative, so it will have questions covering topics from the entire semester. You will receive 25% credit for any problem for which you write "I don't know" (and nothing else).

## Class Notes

If you would like to earn extra credit, you may volunteer to write up notes for a lecture. We will choose the best three sets of notes submitted for each lecture and post these to the course website. Only notes posted to the course website will earn extra credit. Extra credit will be curved to the final class grades such that 8 sets of notes will improve your grade by one step (e.g. from a B+ to an A-). Thus, for example, if you have notes posted for every lecture, then your final grade will be improved by one whole letter grade. Notes must be submitted in PDF format. Notes should be emailed to the TAs.

## References

There is no required course textbook. I will be using the following resources to prepare the class, so they should be good references for studying:

## Tips on doing well in the class

• Attend lectures. I often make a serious effort to synthesize material from several topics in order to stream-line the presentation and bring out connections between different areas of computer science. You may not be able to find this information anywhere else, and it might be in the homework or tests.
• Attend office hours. I do my best teaching in office hours.
• Demonstrate scrupulous ethics. Cheating will, by default, result in an F in the class and, if possible, expulsion from school. Fortunately, it's easy to avoid cheating on the homework: cite all materials and people referenced while working on the assignment. You may work in groups, but you must cite your collaborators on your homework. Citing external sources will not hurt your grade. Failing to cite an external source may result in a charge of academic dishonesty! No matter how well cited, handing in a solution you do not understand is plagiarism!
• Ask questions in class. If you don't understand it, then probably nobody does.
• Don't fall behind. Come to office hours as soon as you have questions.
• Start the homework as soon as it is handed out. That's what I did in grad school.
• Turn in all assignments on time. Late assignments will not be accepted.

## Lecture Schedule

Here's a very optimistic schedule. We'll be lucky to get through 2/3 of this material in the semester.
Warning: The notes below may contain errors. Use with caution.
Date
8/27 No class: Travel
8/29 No class: Travel
9/3 No class: Labor day
9/5 Hashing: chaining vs. linear probing, balls and bins, expected occupancy, w.h.p. definition, w.h.p. max bin, coupon collecting (a slight detour...)
9/10 Hashing: average vs worst case, algorithmic complexity attacks, 2-universal hashing, the power of 2 choices, cuckoo hashing, bloom filters, quotient filters
9/12 Sorting: merge sort, quick sort, divide and conquer, recurrence relations
9/17 Sorting: median-finding, lower bounds; Union-find and amortized analysis
9/19 Heaps: Binomial heaps, Fibonacci heaps
Rob's OH canceled
9/24 Guest lecture: Don Porter
Concurrent Algorithms
Rob's OH canceled
9/26 Guest lecture: Don Porter
Concurrent Algorithms
Rob's OH canceled
10/1 Ordered structures: 2-3 trees, red-black trees, splay trees
10/3 Ordered structures: treaps, cache-oblivious trees?, skip lists
10/8 Data structures for external storage: DAM model, B+-trees, LSM-trees
10/10 Data structures for external storage: Fractal trees (aka LSM-trees + fractional cascading), B^\epsilon trees, Sorting in external storage, k-way merge sort, distribution sort?
10/15 Guest lecture (Rob's OH canceled)
TBD
10/17 Guest lecture (Rob's OH canceled)
TBD
10/22 String algorithms: substring searching, longest-common substring, edit distance, longest repeated substring?
10/24 Graph algorithms: DFS, BFS, connected components, topological sort, minimum spanning tree,
10/29 Graph algorithms: matching, max-flow/min-cut, min-cost flow, dijkstra's algorithm, all-pairs shortest paths, CFL reachability
10/31 Guest lecture (Rob's OH cancelled)
TBD
11/5 Guest lecture (Rob's OH cancelled)
TBD
11/7 Guest lecture (Rob's OH cancelled)
TBD
11/12 Complexity theory: P vs NP, NP-completeness, NP-complete examples
11/14 Complexity theory: reductions, approximation algorithms, co-NP, PSPACE, EXP, P#, etc.
11/19 Linear programming: simplex algorithm, duality, primal dual method
11/21 No class: Thanksgiving break
11/26 Linear programming: polynomial-time algorithms, integer-linear programming, convex programming
11/28 Other fascinating topics: number theory (primality testing), FFT
12/3 Other fascinating topics: error-correcting codes, network coding, compression
12/5 Other fascinating topics: expander graphs, polynomial identity testing, matrix multiplication
12/11 Final Exam: 5:30-8:00

Note: If you have a physical, psychological, medical or learning disability that may impact on your ability to carry out assigned course work, please contact the staff in the Disabled Student Services office (DSS), Room 133, Humanities, 632-6748v/TDD. DSS will review your concerns and determine with you what accommodations are necessary and appropriate. All information and documentation of disability are confidential.

Note: Each student must pursue his or her academic goals honestly and be personally accountable for all submitted work. Representing another person's work as your own is always wrong. Any suspected instance of academic dishonesty will be reported to the Academic Judiciary. For more comprehensive information on academic integrity, including categories of academic dishonesty, please refer to the academic judiciary website at http://www.stonybrook.edu/uaa/academicjudiciary/.