[ General Information
| Course Outline
| Other Pointers
[ Selected Projects ] [ Announcements ] [ Proposed Projects ]
Course description: This course is for students interested in advanced database system design and implementation. The course will quickly cover relational databases and then cover advanced topics in modern database systems, including object-oriented databases, XML databases, distributed databases, and on-line analytical processing. We will discuss various data description and query languages, database design, and query processing and optimization, and also look at distributed object model, and data mining and data warehouses. Students undertake a semester project that includes the design and implementation of a database system, which includes the use of object-oriented features and XML. | Prerequisites: CSE305; or discrete math (sets, relations, functions, logic, etc), data structures and algorithms, good knowledge of Java, plus quick grasp of 305 materials. | Credits: 3.
Instructor: Annie Liu | Email: liuATcsDOTsunysbDOTedu | Office: Computer Science 1433 | Phone: 632-8463.
TA: Bin Tang | Email: bintangATcsDOTsunysbDOTedu
Hours: Tue Thu 11:20-12:40AM, in Computer Science 2129/31.
Annie's office hours: Tue 10-11:20AM, Thu 2-3:20PM, in CS 1433.
Bin's office hours: Mon 1-3 PM, in CS 2110.
Textbook: Database and Transaction Processing: An Application-Oriented Approach by Philip M. Lewis, Arthur Bernstein, and Michael Kifer. Addison Wesley, 2002. You will be given access to the slides from the textbook, but you should also take good course notes yourself.
Grading: A project description and two parts of the course project are worth 5%, 20%, and 10%, respectively, of the grade. A midterm exam and a final exam are worth 25% and 40%, respectively, of the grade. There will also be bonus points with the projects and exams as they fit. No late handins will receive any credit. All work must be done individually unless permitted explicitly; you may discuss with others and look up references, but you must write up your own solutions independently and credit all sources that you used. Any plagiarism or other forms of cheating will result in an F or worse.
Course homepage: http://www.cs.sunysb.edu/~liu/cse532/, containing all course related information.
We will first quickly review relational databases, including entity-relationship model, relational algebra, SQL, relational normalization, and SQL in application programs, and then more carefully discuss relational calculus. Just before discussing relational calculus, we will have a quiz.
We will then focus on two main parts of the course: (1) object-oriented databases, and (2) XML databases. For the first part, we will discuss the motivation and concepts of object data model, the ODMG standard, object-oriented extensions of SQL, and the distributed object model CORBA. For the second part, we will first discuss semistructured data, XML basics and DTD, and XML schema, and then discuss XML query languages including XPath, XSLT, and XQuery. Just before covering XML query languages, we will have a review lecture followed by an in-class midterm exam.
We will next discuses how queries are processed and optimized, followed by a study of distributed databases, and finally cover OLAP together with data mining and data warehouses. At the end, there will be a review lecture and a final exam.
The course project is on the design and implementation of a database system and has two parts. The first part includes the use of object-oriented features; it will start right after the first quiz and be due on the day of the midterm review. The second part includes the use of XML; it will start right after the midterm exam and be due on the day of the final review.
Handout Q: Questionnaire
Slides for Ch. 5: Entity-Relationship Model
Slides for Ch. 6: Relational Algebra and SQL
Slides for Ch. 8: Relational Normalization Theory
Slides for Ch.10: SQL in the Real World
Slides for Ch. 7: Relational Calculus and Visual Query Languages
Slides for Ch.16: Object Databases
Slides for Ch.17: XML and Web Data
Slides for Ch.17 new version of the part on XQuery: XQuery
Slides for an XQuery tutorial that includes types in XQuery: XQuery Tutorial
Slides for Ch.13: Query Processing Basics
Slides for Ch.14: Query Optimization Overview
Slides for Ch.18: Distributed Databases
Slides for Ch.19: OLAP and Data Mining
Handout H1: Homework 1: Description of a Database System Application
Handout Qz: Quiz
Handout P1: Project Part I: An Object-Relational Database System
Handout H2: Homework 2 and Solution: Practice Problems for Midterm
Handout M: Midterm Exam
Handout MS: Solution and Grading Guide for Midterm Exam
Handout P2: Project Part II: An XML-Based Database System
Handout H3: Homework 3 and Solution: Practice Problems for Final
Handout F: Final Exam
Handout FS: Solution to Final Exam
Windows NT Transaction Processing Labs: policies, FAQ, connecting to DB2, DB2 Docs, JBuilder, links, etc.
IBM's DB2 V7 manuals, including
SQL Getting Started (html) | (pdf) | (pdf at Trans Labs)
SQL References, V7.1 (html) | (pdf) | (pdf at Trans Labs)
SQL References, V7.2 (pdf)
Application Development Guide, V7.1 (html) | (direct link to html found by Alex, faster too!) | (pdf)
Application Development Guide, V7.2 (pdf) | (direct link to pdf found by Alex)
1.4.1 API Specification
JDBC 3.0 API Specification
W3C XML Query page:
specifications, implementations, links, etc.
Quick reference for XML Path Language (XPath) 1.0
Quick reference for XML Schema (XSD) Part I Structures 1.0 | Part II Datatypes 1.0
Ghostscript, Ghostview and GSview: Ghostview and GSview are for viewing and printing Postscript documents (some of the handouts for this course will be in this format). If you use Linux, then this software is already installed on your machine. On Windows, you need to download both Ghostview and Ghostscript (and also some fonts). Unzip Ghostview and run setup.exe. It will unpack and install the rest.
The gzip homepage: GNU zip for compression.
This course includes both theory and implementation aspects, so you should have the prerequisites for both. The official title of the course is Theory of Database Systems, but the official content is really Advanced Database Systems, hence the title we use. For this semester, the exact content is as described at the top of the course homepage, which is very similar to the content of the course in Spring 2002.
You should learn all information on the course homepage. Check the homepage periodically for Announcements.
Do all course work. The projects and exercises are integral parts of the course as they provide concrete experiences with the abstract ideas covered in the class.
Computing facilities: You will be given an account in the graduate NT machines lab and sbpus Solaris machines room. Never let anyone else use your account; it is against the rules. Please be conscious of security in the lab; theft or vandalism will be punished severely. If you have any problems with the hardware or software in the lab (other than with the requirements of the project itself), please email ntadminATcsDOTsunysbDOTedu with a copy to me; neither the TA nor I could fix such problems.
Disability: If you have a physical, psychological, medical or learning disability that may have an impact on your ability to carry out assigned course work, please contact the staff in the Disabled Student Services office (DSS), Room 133 Humanities, 632-6748/TDD. DSS will review your concerns and determine with you what accommodations are necessary and appropriate. All information and documentation of disability are confidential.