CSE 532
Fall 2002
Stony Brook
Advanced Database Systems
Annie Liu
Project Part II
Handout P2
Nov. 5, 2002
Due Dec. 12


An XML-Based Database System

In Project Part II, you are to design and implement an XML-based database system to support your favorite database application. See the information at the bottom of this page about using Quip, an XQuery implementation from Software AG.

Your data will be in plain XML files, not in an XML server. Your application will consist of a set of XQuery queries issued on the command line or through Quip's graphical user interface.

You will design the XML Schema appropriate for your application (even though we do not require you to pass it to a parser), generate data in XML files that conform to this schema, and write XQuery queries.

Express as many of the constraints as possible in the XML Schema. Express the rest as XQuery queries, written so that a constraint is considered violated if the corresponding query returns a non-empty answer.
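To illustrate the convention that a non-empty answer signals a violation, here is a minimal sketch written in Python rather than XQuery (the project itself still requires XQuery). Everything in it is hypothetical: the element names library, book, and loan, the attributes id and book, and the constraint that every loan must refer to an existing book.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the second loan refers to a book that does not exist.
SAMPLE = """
<library>
  <book id="b1"/>
  <book id="b2"/>
  <loan book="b1"/>
  <loan book="b9"/>
</library>
"""

def dangling_loans(root):
    """Return the book ids referenced by loans but not defined by any book.

    An empty list means the (assumed) constraint holds; a non-empty list
    means it is violated, mirroring the convention for constraint queries.
    """
    book_ids = {book.get("id") for book in root.findall("book")}
    return [loan.get("book") for loan in root.findall("loan")
            if loan.get("book") not in book_ids]

root = ET.fromstring(SAMPLE.strip())
violations = dangling_loans(root)
print("violated" if violations else "satisfied", violations)
```

An XQuery version of such a constraint would follow the same shape: select the offending elements, so that an empty result means the constraint is satisfied.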

Implement and test these constraints and the queries in your application on your generated data in XML files.

At the end of the project, you will be asked to hand in your project document and give a short demo (15-20 minutes); sign-up sheets will be provided.

Your database application

This should be the same application as in your Project Part I. Make sure you have a sufficient number of data types, constraints, and queries as required for your project description from Homework 1.

Documentation

Your complete implementation (queries, constraints, XML Schema) must be accompanied by a project document, which should include queries, constraints, XML Schema, and a brief user guide.

Teaming

This Project Part II should be done individually.

Bonus

Points for the bonus parts below are rough estimates. If you want to do any of them, or propose something else, let me know once you have a plan or a result so that I can give you a better estimate of the points.

A. (20%) Write the XQuery types corresponding to your XML Schema; if any constraint cannot be expressed using XQuery types, express it using XQuery queries as described above. Compare the size of this part with the size of your XML Schema (using rough counts of pages, lines, or words, whichever is appropriate).

B. (20%) Generate data that is as realistic as possible (in terms of both the content and the size) in any way you like and perform the queries in your application on this data. Explain your method, include the source code, describe how to run it, and summarize the results of the queries (show more details if a query finds anything interesting).
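One possible way to generate data is a small script with a seeded random number generator, so that the same file can be reproduced for the demo. The sketch below is only an illustration under stated assumptions: the library/book/loan vocabulary, the year range, and the function name generate_library are all hypothetical and should be replaced by whatever your own schema requires.

```python
import random
import xml.etree.ElementTree as ET

def generate_library(n_books, n_loans, seed=0):
    """Build a hypothetical <library> document with random books and loans."""
    rng = random.Random(seed)  # fixed seed makes the output reproducible
    root = ET.Element("library")
    for i in range(n_books):
        book = ET.SubElement(root, "book", {"id": f"b{i}"})
        # Assumed value range; pick distributions that fit your application.
        ET.SubElement(book, "year").text = str(rng.randint(1950, 2002))
    for _ in range(n_loans):
        # Every loan refers to an existing book, so this data satisfies
        # the (assumed) referential constraint by construction.
        ET.SubElement(root, "loan", {"book": f"b{rng.randrange(n_books)}"})
    return root

root = generate_library(n_books=1000, n_loans=5000)
xml_bytes = ET.tostring(root, encoding="utf-8")
print(len(root.findall("book")), "books,", len(root.findall("loan")), "loans,",
      len(xml_bytes), "bytes")
```

To save the result as a file for your queries, wrap the root in ET.ElementTree(root) and call its write method with a file name of your choice. Scaling n_books and n_loans lets you run the same queries on data of different sizes.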

C. (30%) Take an interesting query, either existing or new, and write it in the shortest, clearest, and most efficient ways you can. You should have two or more versions of the query. Compare the versions in terms of length, clarity, and efficiency (both analytically and by experiments), and discuss the trade-offs.

D. (30%) Take another XQuery implementation and try it with the XML Schema, XQuery types, XQuery queries, and Java interface (whichever it supports) for your application. Describe what you did, summarize your experience, and compare that implementation with Quip. You can find lists of XQuery implementations here and here. We recommend Galax (for XQuery types and conformance with the new standard), X-Hive (for performance, possibly), and Oracle and Kawa (for Java interface, possibly).

Using Quip, an XQuery implementation from Software AG

Quip is installed in the Graduate NT Lab. You can start Quip by using the shortcut in the Programs item of the Start menu. All Quip logfiles are written to "C:\Temp\%USERNAME%".

You can also download a copy of Quip from the Software AG site: http://developer.softwareag.com/tamino/quip/