CSE 676: Special Topics in Artificial Intelligence
(Spring 2003)
 Description:
This seminar covers papers on building wrappers for web sources.
 Class Hours and Location:
Friday, 3:30 pm to 5:30 pm in the Computer Science Seminar Room.
 Instructor:
I.V. Ramakrishnan
Office: 1421 Computer Science
Email: ram@cs.sunysb.edu
 Reading List:
Papers highlighted in 'yellow' will be discussed this week.  
1.
  Califf, M.E. and Mooney, R.J.
Relational Learning of Pattern-Match Rules for Information Extraction.
2.
  Crescenzi, V. and Mecca, G.
Grammars have Exceptions.
3.
  Crescenzi, V., Mecca, G. and Merialdo, P.
RoadRunner: Towards Automatic Data Extraction.
4.
  Embley, D.W., Campbell, D.M., Jiang, Y.S., Liddle, S.W., Ng, Y.-K., Quass, D., and Smith, R.D.
Conceptual-Model-Based Data Extraction from Multiple-Record Web Pages.
5.
  Embley, D.W., Jiang, Y.S. and Ng, Y.K.
Record-Boundary Discovery in Web Documents.
6.
  Freitag, D.
Machine Learning for Information Extraction in Informal Domains.
7.
  Golgher, P.B., da Silva, A.S., Laender, A.H.F., and Ribeiro-Neto, B.A.
Bootstrapping for Example-Based Data Extraction.
8.
  Hammer, J., Breunig, M., Garcia-Molina, H., Nestorov, S., Vassalos, V., and Yerneni, R.
Template-Based Wrappers in the TSIMMIS System.
9.
  Hsu, C.N., and Dung, M.T.
Generating Finite-State Transducers for Semi-structured Data Extraction from the Web.
10.
  Kushmerick, N.
Wrapper Induction: Efficiency and Expressiveness.
11.
  Alberto H. F. Laender, Berthier Ribeiro-Neto, and Altigran S. da Silva.
DEByE - Data Extraction by Example.
12.
  Liu, L., Pu, C. and Han, W.
XWRAP: An XML-enabled Wrapper Construction System for Web Information Sources.
13.
  Muslea, I., Minton, S., and Knoblock, C.A.
Hierarchical Wrapper Induction for Semistructured Information Sources.
14.
  Sahuguet, A. and Azavant, F.
Building Intelligent Web Applications using Lightweight Wrappers.
15.
  Soderland, S.
Learning Information Extraction Rules for Semi-structured and Free Text.
16.
  Arvind Arasu and Hector Garcia-Molina
Extracting Structured Data from Web Pages.
17.
  Seung-Jin Lim and Yiu-Kai Ng
An Automated Change-Detection Algorithm for HTML Documents Based on Semantic Hierarchies.
18.
  Vinayak Borkar, Kaustubh Deshmukh, and Sunita Sarawagi
Automatic Segmentation of Text into Structured Records.
19.
  David W. Embley, Douglas M. Campbell, Randy D. Smith
Ontology-Based Extraction and Structuring of Information from Data-Rich Unstructured Documents.
May-11
20.
  David W. Embley and L. Xu
Record Location and Reconfiguration in Unstructured Multiple-Record Web Documents.
May-11
21.
  Natalya F. Noy and Deborah L. McGuinness
Ontology Development 101: A Guide to Creating Your First Ontology.
Last updated on 9 May, 2003