In this homework, you are asked to describe the project you would
like to do for part II of the course.
Possible projects are as described in class and summarized below.
- Extending and improving your system from part I. You can modify
the schema and constraints as needed. Examples are:
- Automatically collecting real data, using XQuery, XSLT, etc.
- Automatically generating HTML web pages, using XQuery, XSLT, etc.
- Comparing alternative and additional queries,
possibly in different languages, in terms of
clarity (ease of writing and reading) and
efficiency (analysis of running times).
- Generating a large data set and evaluating actual query performance,
on alternative and additional queries.
- Using other tools or ideas, extending your system from part I with
them, and comparing your system from part I with one that uses them. Examples are:
- XQuery tools besides Quip:
Qexo (gnu), Galax (Bell Labs), others, commercial tools.
- Webpage generation tools: XStrudel (ATT Labs, UPenn, WashU)
- Intelligent query tools: MultiRunner, MultiQuery, etc.
- An intelligent query tool: WinAgent (SB).
- Search engines: Google, etc.
- Semantic web tools: Cwm (W3C) for query using rules.
- Programming languages: Java, Python, Perl, etc. for
general purpose programming and comparing productivity and efficiency.
- Extending and improving other tools. For example, one can improve
the interface, implementation, and output of MultiRunner. Talk to me
if you are interested.
- Implementing your favorite tool, either for web queries or having
a web query or XML query component. Examples are ideas in the wish
list from assignment 1, organized as below:
I also have problems in program analysis and in database and web
applications that mainly involve querying XML documents. Talk to me if
you are interested.
- Semantics: standardizing web pages--chaoying liu, perfect search engine--qiting huang, ask what about--muhammad atiyat, smarter search engine--marc fruchter, interactive and intelligent--wong kwok bun, smart web query--ahmed al-badawna, want precise answer--yash chandra, personalize--sofoklis papasofokli.
- Multimode (also semantics): answer and interact with voice--jiande he, search by voice and picture and return good matches--guochan cen, web image query/monitoring--adewale oluwasanmi, image grabber--courtland eppelsheimer.
- General tools (also semantics): google--sungpill han, google--shenan xue, baidubar--wei chen, keywords priority--jin chao lei, logs--andrew chen.
- Downloading files: better search--david cheung, finding--xiao wang, file seeker--chang tai zheng.
- Computing: software version tracker--matthew salacain, intelligent driver searching--ben reisner, notebook computer--daniel yu.
- Shopping and banking: dynamic price--robin soohoo, shop on line--bin zhou, collect transactions and statements of different banks and accounts--haris khan.
- Transportation and travel: different modes--john paul nazarre, flight closest date--gemma jittansingh-mayers, car rental--youssef bittaf.
- Schools and jobs: college decision maker--simon ho, grad programs--chaitanya attaluri, sb search--lun yin cheng, solar job search--ying liu, job--brian lin wan.
- News and comments: e-news assistant/collaborator--kunal kadbet, comments at portal sites--mark drago.
- Sport and entertainment: national football league--ralph d'ambrosio.
- More or less related to web queries: web programming complexity reducer--guoqiang zhang, web application design--shika saraf, web site & info authentication--stephanie stradford, on demand computing--qiang tong.
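As an illustration of the "generating a large data set and evaluating actual query performance" example in the first project option above, here is a minimal sketch in Python using only the standard library. The catalog/item/price schema, the function names, and the data sizes are assumptions made up for this illustration; they are not part of any assignment, and your own project would use your part I schema and your own query tool.

```python
import random
import time
import xml.etree.ElementTree as ET


def make_catalog(n, seed=0):
    """Build a synthetic <catalog> with n <item> elements (hypothetical schema)."""
    rng = random.Random(seed)
    root = ET.Element("catalog")
    for i in range(n):
        item = ET.SubElement(root, "item", id=str(i))
        ET.SubElement(item, "price").text = str(rng.randint(1, 100))
    return root


def cheap_ids_findall(root, limit):
    """Query A: select items with findall, then filter on price."""
    return [item.get("id") for item in root.findall("item")
            if int(item.findtext("price")) < limit]


def cheap_ids_iter(root, limit):
    """Query B: the same result, written as an explicit loop over iter()."""
    result = []
    for item in root.iter("item"):
        if int(item.find("price").text) < limit:
            result.append(item.get("id"))
    return result


def time_query(query, root, limit, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        query(root, limit)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Grow the data set and compare the two alternative queries at each size.
    for n in (1_000, 10_000, 100_000):
        root = make_catalog(n)
        a = time_query(cheap_ids_findall, root, 50)
        b = time_query(cheap_ids_iter, root, 50)
        print(f"{n:>7} items: findall {a:.4f}s, iter {b:.4f}s")
```

Taking the best of several runs reduces noise from other activity on the machine. Checking that both queries return the same answer before timing them is a reasonable first step in any such comparison.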
The total amount of work should be about three times as much as
assignments 2 and 3 together. It is all the work for the rest of the
course, and is for a total of about three times as much credit as
Note that you could choose to do one of the projects in item 1 above,
and do another one or portions in other items later for appropriate
You are asked to describe the "why", "what", and "how" of your
proposed project. This time, we ask that you carefully describe the
"what" and "how". For "what", say clearly what your project will be.
For the system you will produce, say what it will take as input and
produce as output (from a user's perspective);
if it is interactive, explain the interactions clearly. If there are
multiple parts, describe each part clearly. For "how", include how
you will carry out your project (from an implementor's
perspective), how your system will be implemented, using what
tools, in what language, and what test data you will use.
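To make the "what" concrete, it can help to write down a tiny input and the output it should produce. The following is a minimal sketch in Python, assuming a hypothetical course-list document; the element names, the sample data, and the function name are invented for illustration only.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical input: a small XML document describing courses.
SAMPLE_XML = """<courses>
  <course code="C1"><title>Web Queries &amp; XML</title></course>
  <course code="C2"><title>Databases</title></course>
</courses>"""


def courses_to_html(xml_text):
    """Input: XML text with <course code="..."><title> elements.
    Output: an HTML fragment listing each course as a list item."""
    root = ET.fromstring(xml_text)
    rows = []
    for course in root.findall("course"):
        # Escape text so that characters such as & stay valid in HTML.
        code = html.escape(course.get("code", ""))
        title = html.escape(course.findtext("title", default=""))
        rows.append(f"  <li>{code}: {title}</li>")
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(courses_to_html(SAMPLE_XML))
```

A description at this level of detail (what goes in, what comes out, and which operation connects them) is the kind of precision asked for here, whatever tool or language you actually choose.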
In all cases, your project will be processing some XML and/or HTML
data. The precise kinds of data and kinds of operations in your system must be clear from
your description. For a small example (of a simpler project), see
There are two additional requirements that I
discussed in detail with every team I met. They are summarized here.
(1) Reuse: After you decide the what and how,
spend some time on the web looking for existing systems and tools that
can be used for part of your project, and reuse as much as possible;
if you cannot find anything useful, still summarize your search effort.
(2) Incremental development: Find the minimum
that you can definitely do and still have a useful deliverable, and
divide the rest of the work into one to three increments that together
achieve the overall goal.
By midnight on Monday, March 22, send me an email in plain text
containing three lines of the following form
<a href="url-of-your-project-description">Name of Your Project</a>
Name of the first person in your team
Name of the second person in your team
and on Tuesday, March 23, hand in a printout of your description in class.
This homework is worth about 3% of the course grade (a good
description and design can be reused in the next two homeworks and the
project report, which count for all the remaining points in the
course). Exceptionally well thought-out and well-written homeworks
will receive appropriate extra credit.