
1 Text Mining and Document Analysis

The increasing volume of informative content on the World-Wide Web, coupled with the decreasing costs of computation and communication, has created exciting new opportunities in text mining. Toward this end, we have started the Lydia project, a natural language system for rapidly assimilating the primary vocabulary associated with high-quality curated text and extracting the relations among its terms.

Lydia-style text analysis has natural applications in the financial, legal, medical, and homeland security sectors, and we look forward to interacting with associated companies. The ultimate goal of Lydia is to build a relational encyclopedia of much of the world's knowledge through the analysis of news sources, reference texts, and primary sources such as government documents. Lydia (http://www.textmap.org) is still at a relatively early stage of development, but it is already producing interesting analysis of significant volumes of text.

Projects in text mining include:

Useful background for these projects includes Internet programming experience, natural language processing, and/or Hadoop Map/Reduce programming. There is room for several students on this project.


Steve Skiena
2009-08-28