CSE305 (Spring `02)
Principles of Database Systems

[ General Information | Course Outline | Lectures | Handouts | Other Pointers | Requirements ]
[ Announcements | What's new ]


General Information

Course description: This course is for students interested in the design of database management systems to obtain integrity and availability of data. We will focus mainly on relational data models and relational query operations, together with SQL for data definitions and queries. We will discuss design methods including the entity-relationship model and relational normalization theory. We will also discuss how data is stored and indexed and how queries are processed and optimized. Students undertake a semester project that includes the design and implementation of a database system. We will touch upon advanced topics such as object-oriented databases as time permits. | Prerequisites: CSE214 and CSE220, or undergrad discrete math (sets, relations, functions, predicate logic), data structures, computer organization, plus programming in Java. | Credits: 4.

Instructor: Annie Liu | Email: liu AT cs DOT sunysb DOT edu | Office: Computer Science 1433 | Phone: 632-8463.

TAs: Zhe Fan | Email: fzhe AT cs DOT sunysb DOT edu | Phone: 216-2338.
Diptikalyan Saha | Email: dsaha AT cs DOT sunysb DOT edu | Phone: 632-7679.
Wenxin Song | Email: wenxin AT cs DOT sunysb DOT edu | Phone: TBA.
Lujin Wang | Email: lujin AT cs DOT sunysb DOT edu | Phone: TBA.

Hours: Tue Thu 9:50-11:10AM, in Harriman 137.
Annie's office hours: Tue 8:30-9:50AM, Thu 11:10AM-12:30PM, in CS 1433.
Lujin's office hours: Wed 3-5PM, Fri 2:10-3:10PM, in CS 2110.
Zhe's office hours: Tue 4-5PM, Thu 1-2PM, Fri 10:20-11:20am, in CS 2110.
Dipti's office hours: Mon 9:30-10:30AM, in CS 2110.
Wenxin's office hours: Mon 2:10-3:10PM, in CS 2110.

Textbook: Database and Transaction Processing: An Application-Oriented Approach by Philip M. Lewis, Arthur Bernstein, and Michael Kifer. Addison Wesley, 2002. In addition, you should take good notes during the lectures.

Grading: Weekly or biweekly assignments, on homeworks or the project, together worth 30% of the grade. Two midterms and a final, each worth 20%, 20%, and 30%, respectively, of the grade. See Requirements.

Course homepage: http://www.cs.sunysb.edu/~liu/cse305/, containing all course related information.


Course Outline

We will start with an overview of databases and transaction processing, followed with an introduction to the basic concepts and to SQL.

Then, we will study basic database design using the entity-relationship model, learning to model entities, relationships, and additional constraints.

We will then focus on two main parts of the course: (1) relational data model and the SQL data definition sublanguage, and (2) relational algebra and the SQL query sublanguage.

Next, we will discuss more database design using relational normalization theory, discussing normal forms and their construction.

Additionally, we will discuss triggers and more SQL.

Finally, we will see how data is stored and indexed and how queries are processed and optimized.


Lectures

Lecture 1 (01/24/02): Overview: course information and requirements; database management systems, transaction processing, analytic processing. Reading: Ch.1, Sec.2.3. Homework: Handout H1.
Lecture 2 (01/29/02): Introduction to basic concepts and SQL: tables/relations/predicates, operations; SQL, examples, selection, projection, join, etc. Reading: Ch.2, Ch.3. Homework: Handout H2.
Lecture 3 (01/31/02): Entity-relationship model: entity and relationship (type, schema), attributes, roles, keys; representation using relations/tables, set value problem. Reading: Sec.5.1-5.3.
Lecture 4 (02/05/02): Entity-relationship model: representation using SQL, foreign keys and keys, modeling using UML associations; entity type hierarchies. Reading: Sec.5.3-5.4.1. Project 1: Handout P1.
Lecture 5 (02/07/02): Entity-relationship model: participation constraints, representation using SQL; at most / at least / exactly one; entity or attribute / relationship, information lose. Reading: Sec.5.4.2-5.8.
Lecture 6 (02/12/02): Relational data model: physical, conceptual, external schemas, data model; relational model, relation instance/schema, relational database. Reading: Sec.4.1-4.2.1.
Lecture 7 (02/14/02): Relational data model: integrity constraints, key constraints, foreign key constraints, inclusion dependence; SQL DDL: table, keys, null, default value. Reading: Sec.4.2.2-4.3.5.
Lecture 8 (02/19/02): SQL data definition language: semantics constraint: check, assertion, create domain; foreign key constraint; event and trigger. Reading: Sec.4.3.5-4.3.8. Homework: Handout H3.
Lecture 9 (02/21/02): SQL data definition language: view, modifying schema, access control; project 1 solution, possible designs and trade-offs. Reading: Sec.4.3.9-4.4.
Midterm 1 (02/26/02): In-class exam. Requirements: closed all items except your own handwritten (not mechanically or electronically reproduced) notes. Reading: anything relaxing, as you are a third done.
Lecture 11 (02/28/02): Relational algebra: overview, basic operators: selection, projection, set union / difference / intersection, Cartesian product, join, renaming. Reading: Sec.6.1. Project 2: Handout P2.
Lecture 12 (03/05/02): Relational algebra: division; SQL query sublanguage: simple queries, relating to relational algebra, set operations, nested queries (self study). Reading: Sec.6.2.1-6.2.3.
Lecture 13 (03/07/02): SQL query sublanguage: division, SQL queries, evaluation strategy, relating to relational algebra (review); aggregates, grouping, sorting (self study). Reading: Sec.6.2.1-6.2.5.
Lecture 14 (03/12/02): SQL query sublanguage: review of relational algebra, SQL query sublanguage, writing queries; views, modifying data, updating views. Reading: Sec.6.2.6-6.4.
Lecture 15 (03/14/02): Relational normalization: redundancy, decomposition; functional dependency, reflexivity+augmentation+transitivity, closure, correctness. Reading: Sec.8.1-8.4. Project 3: Handout P3.
Lecture 16 (03/19/02): Relational normalization: attribute closure and computation, normal forms, Boyce-Codd normal form, third normal form, decomposition. Reading: Sec.8.4-8.6.
Lecture 17 (03/21/02): Relational normalization: schema decomposition, lossless, dependency preserving, BCNF decomposition algorithm. Reading: Sec.8.6-8.7 Homework: Handout H4.
Spring break: March 25-29.
Lecture 18 (04/02/02): Relational normalization: discussion of P3 and review, 3NF decomposition algorithm, minimal cover, tradeoffs, multi-valued dependency and 4NF. Reading: Sec.8.8-8.9, 8.11-8.14.
Lecture 19 (04/04/02): Triggers and active databases: event, condition, action, meanings, in SQL:99, chain reaction. Review for midterm 2: topics, concepts, problem solving, algorithms. Reading: Ch.9. H4S.
Midterm 2 (04/09/02): In-class exam. Requirements: closed all items except your own handwritten (not mechanically or electronically reproduced) notes.
Lecture 21 (04/11/02): SQL in application programs: statement vs call level interface, embedded SQL, in/out/status, transaction, cursor, stored procedure, constraint checking. Reading: Sec.10.1-10.3.
Lecture 22 (04/16/02): SQL in application programs: overview of SLI (embedded and dynamic SQL, SQLJ) and CLI (JDBC, ODBC); dynamic SQL; JDBC. Reading: Sec.10.4-10.8. Project 4: Handout P4.
Lecture 23 (04/18/02): Data storage and indexing: disk organization, latency, page transfer; heap files, performance; sorted files, overflow chain; indices, search key, index structure. Reading: Sec.11.1-11.4.0.
Lecture 24 (04/23/02): Data storage and indexing: indices, integrated vs separate; clustered vs unclustered, range search, sparse vs dense; multilevel indexing, ISAM. Reading: Sec.11.4.1-11.5.1.
Lecture 25 (04/25/02): Data storage and indexing: review; multilevel indexing using B+ trees, insertion; hash indexing; choosing an index. Reading: Sec.11.5.2-11.6, 11.8,11.9. Homework: Handout H5.
Lecture 26 (04/30/02): Query processing and optimization basics. External sort, duplicate removal; selection for simple and complex conditions; nested loops, sort-merge, hash join. Reading: Ch.13.
Lecture 27 (05/02/02): Query optimization: overview, principles, rules, and examples; review for the final exam, topics, more sample questions; demo, grades, etc. Reading: Ch.14.
Lecture 28 (05/07/02): Self review, and more meetings can be scheduled if you need to see me about your grades, after they are posted later in the week, and before the final.
Final Exam (05/14/02): 8:00-10:30AM, in Union Auditorium, Final exam. Closed all items except your own handwritten (not mechanically or electronically reproduced) notes. Bring your id card for checking.

Handouts

Slides for Ch.1: Overview: Database Management, Transaction Processing, Analytic Processing

Slides for Ch.2: A Closer Look: Basic Description of Data and Operations and SQL

Slides for Ch.5: Entity-Relationship Model (Part 1) | (Part 2) | (Part 3)

Slides for Ch.4: Relational Data Model and SQL DDL (Part 1) | (Part 2) | (Part 3) | (Part 4)

Slides for Ch.6: Relational Algebra and SQL DML (Part 1) | (Part 2) | (Part 3) | (Part 4)

Slides for Ch.8: Relational Normalization (Part 1) | (Part 2) | (Part 3) | (Part 4)

Slides for Ch.9: Triggers and Active Databases

Slides for Ch.10: SQL in the Real World (Part 1) | (Part 2)

Slides for Ch.11: Physical Data Organization and Indexing (Part 1) | (Part 2) | (Part 3)

Slides for Ch.13: Query Processing Basics

Slides for Ch.14: Query Optimization Overview

Handout H1: Homework 1: Questionnaire

Handout H2: Homework 2: An Interesting Database System Application

Handout H3: Homework 3: ER Model, Relational Data Model, SQL Data Definition Language

Handout H3S: Solution to Homework 3

Handout H4: Homework 4: Relational Algebra, SQL Query Sublanguage, Normal Forms

Handout H4S: Solution to Homework 4

Handout H5: Homework 5: More SQL, Data Storage and Indexing, Query Processing

Handout H5S: Solution to Homework 5

Handout P1: Project 1: Entity-Relationship Model for Stony Brook Car Pooling Management System

Handout P1S: Solution to Project 1

Handout P2: Project 2: Creating Relations / SQL Tables

Handout P2S: Solution to Project 2

Handout P3: Project 3: Querying and Updating Data

Handout P3S: Solution to Project 3

Handout P4: Project 4: Building Application Programs, Testing, and Improvements

Handout M1: Midterm 1

Handout M1S: Solution to Midterm 1

Handout M2: Midterm 2

Handout M2S: Solution to Midterm 2

Handout F: Final Exam

Handout FS: Solution to Final Exam


Other Pointers

Windows NT Transaction Processing Labs: policies, FAQ, Sybase, JDBC, JBuilder, etc.

Sybase's Transact-SQL User's Guide.
It is installed in the Trans Labs at H:\Documentation\Sybase\sqlug_v12_0.pdf
Other Adaptive Server Enterprise 12.0 Product Documentation.
Sybase ASE v12.0 is currently running on all Sybase servers in the department.

Ghostscript, Ghostview and GSview: Ghostview and GSview are for viewing and printing Postscript documents (some of the handouts for this course will be in this format). If you use Linux, then this software is already installed on your machine. On Windows, you need to download both Ghostview and Ghostscript (and also some fonts). Unzip Ghostview and run setup.exe. It will unpack and install the rest.

The gzip homepage: GNU zip for compression.

Miscellaneous:
A little SQL tutorial, provided by Deji Abraham. (I wish we could remove the geocities ads on it.)
Anomaly and decomposition in the context of a little database tutorial, provided by Eve Fon Wu.


Requirements

You are required to learn all information on the course homepage. You should check the homepage periodically for Announcements and What's new

You are required to do all course work. In particular, you are required to learn all materials covered in lectures as they are needed for the course work. Besides the exams, there are two kinds of assignments, described below.

  • Homework assignments will be given as preparations for exams. They will not be graded, but solutions will be provided and/or discussed, and you will get -10% of the course grade for each homework that you did not turn in by the deadline, or for any obvious junk that you did turn in.

  • The project requires to build a transaction processing system. Parts of it will be done in groups of two students. The project will have four deliverables: the ER model, the relational data model, the relational query model, and the GUI.

Unless permitted explicitly, all work on the assignments must be done individually. You may discuss with others and look up references, but you must write up your solutions independently and credit all other sources that you used. Any plagiarism or other forms of cheating will result in an F.

Unless specified otherwise, all assignments will be due in class on the due date, the same day that the subsequent assignment or exam will come. No late assignment will be accepted.

The following information must be contained in all project files and homeworks: Student ID, Name, Email address,  Due Date. It is important that you explain your work carefully. All work must be submitted in a neat and organized fashion, and it must be typed and stapled. You might lose points if these conditions are not met. If you feel that you deserve a higher grade see one of the TAs promptly no later than two weeks after the assignment was graded.

Using the Transaction Processing Lab: For the project, you will be given an account in the Transaction Processing Lab. Do not let anyone else use your account. Please be conscious of security in the lab since theft of departmental equipment is a problem that affects us all. Although your project will have to be demonstrated in the lab, if you have your own computer it may be possible for you to do much of the development there. If you have any problems with the hardware or software in the lab (other than with the requirements of the project itself) send E-mail to ntadmin AT cs DOT sunysb DOT edu with a copy to me. Neither the TAs nor I can fix such problems.

Again on cheating: Unfortunately, cheating has increasingly become a problem among students in the Department. You should be aware that cheating on homeworks or the project will be treated as severely as cheating on exams. While we don't mind if you discuss the homeworks or project with other students, if we find that you have turned in work that was essentially done by someone else, or you have given your work to someone else, you will fail the course. Each semester several such cases are discovered. If we decide to turn the case over to the CASA committee, the punishment can be worse. If for any reason you can't do the work yourself, come to see us right away and we will try to get some help for you. The last day to drop a course without tuition liability is Jan 29. The last day to drop a course without a withdrawal being recorded is Feb 5.

Disability: If you have a physical, psychological, medical or learning disability that may have an impact on your ability to carry out assigned course work, I urge you to contact the staff in the Disabled Student Services office (DSS), Room 133 Humanities, 632-6748/TDD. DSS will review your concerns and determine, with you, what accommodations are necessary and appropriate. All information and documentation of disability is confidential.


liu AT cs DOT sunysb DOT edu