News Archive
 
 
March 15, 2016: NSF REU Opening: Exploring Scalable Data Analytics for Big Data at Stony Brook University.

I am looking for a highly motivated undergraduate student with a CS or informatics major to work on an NSF-sponsored Research Experiences for Undergraduates (REU) project.
If you are interested, please submit your application at SPIDAL REU at Stony Brook University.

March 1, 2016: We are organizing the Second International Workshop on Data Management and Analytics for Medicine and Healthcare (DMAH'2016), in conjunction with VLDB 2016.

The workshop will bring together people working across the fields of information management and medical informatics to discuss innovative data management and analytics technologies, highlighting end-to-end applications, systems, and methods to address problems in healthcare, public health, and everyday wellness, with clinical, physiological, imaging, behavioral, environmental, and -omic data, and data from social media. The workshop will be held in New Delhi, India on September 9, 2016. The workshop is in conjunction with the Very Large Databases Conference (VLDB).



February 28, 2016: The Department of Biomedical Informatics is selected again by Google as a Google Summer of Code Organization.

Google Summer of Code is a global program focused on introducing students to open source software development. Students work on a 3-month programming project with an open source organization during their break from university. The Department of Biomedical Informatics (organization administrator: Fusheng Wang) of Stony Brook University is selected as one of the open source organizations to provide projects for students.

December 23, 2015: We won the EDBT 2016 Test of Time Award.

Our paper "Bridging Physical and Virtual Worlds: Complex Event Processing for RFID Data Streams" is selected for the EDBT 2016 Test of Time Award. The International Conference on Extending Database Technology (EDBT) is a premier conference on database technologies. The conference gives the EDBT test-of-time (ToT) award to recognize one paper, or a small number of papers, presented at EDBT earlier that have best met the "test of time", i.e. that have had the most impact in terms of research, methodology, conceptual contribution, or transfer to practice over the past decade(s).



EDBT Test of Time Awards

September 28, 2015: Shuai Zheng joined the Centers for Disease Control and Prevention.

Shuai Zheng successfully defended his Ph.D. thesis "Online Learning Based Clinical Information Extraction and Classification" and joined the Centers for Disease Control and Prevention.

April 14, 2015: I will teach CSE532: Theory of Database Systems this fall.

The course is for Ph.D. students only.

Course Website

April 2, 2015: We are awarded an Amazon AWS Teaching Grant.

We are awarded an Amazon AWS in Education Teaching Grant to support CSE532.

AWS in Education

March 26, 2015: NSF REU Opening: Exploring Scalable Data Analytics for Big Data at Stony Brook University.

I am looking for a highly motivated undergraduate student with a CS or informatics major to work on an NSF-sponsored Research Experiences for Undergraduates (REU) project.
If you are interested, please submit your application at SPIDAL REU at Stony Brook University.

January 31, 2015: We are organizing the First International Workshop on Data Management and Analytics for Medicine and Healthcare (DMAH'2015), in conjunction with VLDB 2015.

The workshop will bring together people working across the fields of information management and medical informatics to discuss innovative data management and analytics technologies, highlighting end-to-end applications, systems, and methods to address problems in healthcare, public health, and everyday wellness, with clinical, physiological, imaging, behavioral, environmental, and -omic data, and data from social media. The workshop will be held on the Big Island of Hawaii on September 5, 2015. The workshop is in conjunction with the Very Large Databases Conference (VLDB).

Download the brochure.

January 8, 2015: I joined Stony Brook University.

I moved to Stony Brook University for a joint appointment in the Department of Biomedical Informatics and the Department of Computer Science, starting in January 2015.

October 31, 2014: Congratulations to Ablimit Aji on successfully defending his Ph.D. thesis!

Ablimit Aji has defended his Ph.D. thesis, and is now Dr. Aji. His dissertation title is "High Performance Spatial Query Processing for Large Scale Spatial Data Warehousing".
He is currently a research scientist at HP Labs.

Group picture.

September 7, 2014: We are awarded an NSF ACI Award on "CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science".

We are awarded $5M from NSF for "CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science". The project will design and implement a software Middleware for Data-Intensive Analytics and Science (MIDAS) that will enable scalable applications with the performance of HPC (High Performance Computing) and the rich functionality of the commodity Apache Big Data Stack. Further, this project will design and implement a set of cross-cutting high-performance data-analysis libraries, which will support new programming and execution models for data-intensive analysis in a wide range of science and engineering applications.

NSF Press Release: Laying the Groundwork for Data-driven Science

August 20, 2014: NSF CAREER award.


August 14, 2014: We are awarded an Amazon AWS in Education Research grant.

We are awarded an Amazon AWS in Education Research Grant for building Cloud Enabled Pathology Image Analytics Software.
AWS in Education Research Grants

July 10, 2014: We are funded by the Centers for Disease Control and Prevention for Clinical Natural Language Processing.

We are funded by the Centers for Disease Control and Prevention (CDC) for the project "Adaptive Self Learning Technology and Surveillance of Venous Thromboembolism through Electronic Health Records". The project will use natural language processing technologies to extract and integrate clinical data records for a venous thromboembolism (VTE) surveillance study.

April 1, 2014: Winner of 2014 IBM Champion.

Recognized as a 2014 IBM Champion of the IBM Champion program. This program recognizes individuals who make significant contributions to IBM communities by advocating the solutions; sharing their knowledge and expertise; and helping nurture and grow independent communities.

April 18, 2014: A virtual machine version of the PAIS system is available for download.

You can now install PAIS from the VM release. You can follow the instructions to set it up.

March 28, 2014: Ablimit joined HP Labs.

Farewell lunch picture.

December 12, 2013: Hadoop-GIS is available in Amazon Elastic MapReduce.

We have provided a library for Hadoop-GIS to run on Amazon Elastic MapReduce (EMR). You can follow the instructions to quickly start running your large scale spatial queries on the Amazon cloud.

September 24, 2013: PAIS Portal demo is available.

We have a demo pathology analytical imaging portal ready. You can browse, download, and visualize images and analytical results.

August 28, 2013: Hadoop-GIS 1.0 alpha released.

Hadoop-GIS 1.0 alpha is released. Please find the source code at https://github.com/hadoop-gis. A demo will be presented at the SIGSpatial GIS Conference in Orlando, November 5-8, 2013. We are also looking for community contributors. A Hive feature request has been submitted at Hive JIRA.

July 4, 2013: Our work on an adaptive learning based approach for clinical information extraction is accepted to the AMIA Annual Symposium, 2013.

Our paper "ASLForm: An Adaptive Self Learning Medical Form Generating System" is accepted to the AMIA Annual Symposium, Washington, DC, November 16-20, 2013. ASLForm is an interactive data extraction system for clinical narrative reports, based on incremental learning. ASLForm provides users a convenient interface that can be used as a simple data extraction and data entry system. User feedback incrementally refines the decision model in real time, which further reduces the interaction effort required from users thereafter. The system eventually achieves high accuracy on data extraction with minimal effort from users.
You can watch a demo video here.

June 1, 2013: Our paper entitled "Hadoop-GIS: A High Performance Spatial Data Warehousing System Over MapReduce" is accepted to VLDB 2013.

Our paper entitled "Hadoop-GIS: A High Performance Spatial Data Warehousing System Over MapReduce" is accepted to the 39th International Conference on Very Large Databases (VLDB'2013), Trento, Italy, August 26-30, 2013. Hadoop-GIS is a scalable and high performance spatial data warehousing system for running large scale spatial queries on Hadoop. Hadoop-GIS supports multiple types of spatial queries on MapReduce through space partitioning, customizable spatial query engine RESQUE, implicit parallel spatial query execution on MapReduce, and effective methods for amending query results through handling boundary objects on MapReduce. Hadoop-GIS takes advantage of global partition indexing and customizable on demand local spatial indexing to achieve efficient query processing. Hadoop-GIS is integrated into Hive to support declarative spatial queries with an integrated architecture.
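
The partition-then-amend idea described above can be illustrated with a small sketch. This is not Hadoop-GIS or RESQUE code: it assumes a fixed square grid, axis-aligned bounding boxes, and hypothetical helper names (`tiles_for`, `intersects`, `partitioned_join`). Each object is copied to every tile it overlaps, each tile is joined independently (the part that would run in parallel), and duplicate pairs caused by boundary-crossing objects are removed at the end.

```python
from collections import defaultdict
from itertools import product


def tiles_for(box, tile):
    """Return the ids of all grid tiles that an (xmin, ymin, xmax, ymax) box overlaps."""
    xmin, ymin, xmax, ymax = box
    xs = range(int(xmin // tile), int(xmax // tile) + 1)
    ys = range(int(ymin // tile), int(ymax // tile) + 1)
    return product(xs, ys)


def intersects(a, b):
    """True when two closed bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partitioned_join(left, right, tile=10.0):
    """Spatial join of two {id: box} datasets using grid partitioning."""
    # "Map" step: assign every object to each tile it overlaps.
    # Objects that cross tile boundaries are replicated.
    buckets = defaultdict(lambda: ([], []))
    for side, objs in enumerate((left, right)):
        for oid, box in objs.items():
            for t in tiles_for(box, tile):
                buckets[t][side].append((oid, box))

    # "Reduce" step: join each tile on its own, then amend the result by
    # dropping duplicate pairs reported by more than one tile.
    pairs = set()
    for lefts, rights in buckets.values():
        for (lid, lbox), (rid, rbox) in product(lefts, rights):
            if intersects(lbox, rbox):
                pairs.add((lid, rid))
    return sorted(pairs)


left = {"a": (1, 1, 4, 4), "b": (8, 8, 12, 12)}      # "b" spans four tiles
right = {"x": (3, 3, 5, 5), "y": (9, 9, 11, 11), "z": (20, 20, 21, 21)}
print(partitioned_join(left, right))  # [('a', 'x'), ('b', 'y')]
```

In this toy input the pair ("b", "y") is found in four different tiles, which is why the final deduplication step is needed.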

May 10, 2013: Winner of 2013 IBM Champion.

Recognized as a 2013 IBM Champion of the IBM Champion program. This program recognizes individuals who make significant contributions to IBM communities by advocating the solutions; sharing their knowledge and expertise; and helping nurture and grow independent communities.
April 25, 2013: We are organizing the Fifth International Workshop on Cloud Data Management (CloudDB 2013), in conjunction with CIKM 2013.

The Fifth International Workshop on Cloud Data Management (CloudDB 2013) will bring together researchers and practitioners in cloud computing and data-intensive system design, programming, parallel algorithms, data management, scientific applications, and information-based applications to maximize performance, minimize cost and improve the scale of their endeavours.

April 19, 2013: Our temporal coalescing method is adopted by Teradata Temporal.

Teradata offers native support for a wide range of temporal analytics. Temporal coalescing is a key temporal query processing operation, and we developed an ordered analytical function based approach to support temporal coalescing, which is adopted by Teradata for its temporal database support.
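
As a rough illustration of what temporal coalescing does, the sketch below merges overlapping or adjacent validity periods that share the same non-temporal value. It is neither Teradata's implementation nor our published method; it only mimics, in plain Python, the kind of ordered computation that SQL analytic functions provide (a running maximum over preceding rows, then a running sum of "new period" flags). The function name `coalesce_periods` and the half-open `[start, end)` convention are assumptions made for this example.

```python
from itertools import accumulate, groupby


def coalesce_periods(rows):
    """Merge overlapping/adjacent [start, end) periods per key.

    rows: iterable of (key, start, end) tuples.
    """
    result = []
    ordered = sorted(rows)  # order by key, then start, then end
    for key, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        ends = [r[2] for r in group]
        # Running MAX(end) over all preceding rows of the same key.
        prev_max_end = [None] + list(accumulate(ends, max))[:-1]
        # A row opens a new period when it starts after everything seen so far.
        flags = [
            1 if pm is None or row[1] > pm else 0
            for row, pm in zip(group, prev_max_end)
        ]
        # Running SUM(flag) numbers the coalesced periods.
        period_ids = list(accumulate(flags))
        for _, members in groupby(zip(period_ids, group), key=lambda p: p[0]):
            members = [row for _, row in members]
            result.append(
                (key, min(m[1] for m in members), max(m[2] for m in members))
            )
    return result


history = [
    ("e1", 1, 5), ("e1", 5, 8), ("e1", 10, 12), ("e1", 2, 3), ("e2", 1, 2),
]
print(coalesce_periods(history))
# [('e1', 1, 8), ('e1', 10, 12), ('e2', 1, 2)]
```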
October 12, 2012: We won the best presentation award at Pathology Informatics Conference 2012.

Our talk "MIGIS: High Performance Spatial Query System for Analytical Pathology Imaging" won the best presentation award of the session at the Pathology Informatics Conference, Chicago, October 9-12, 2012. Pathology Informatics is a subdiscipline of pathology and clinical laboratory medicine, and the annual conference is the largest meeting with a broad attendance of experts in the fields of pathology and informatics.
May 10, 2012: We are organizing the Fourth International Workshop on Cloud Data Management (CloudDB 2012), in conjunction with CIKM 2012.

The Fourth International Workshop on Cloud Data Management (CloudDB 2012) will bring together researchers and practitioners in cloud computing and data-intensive system design, programming, parallel algorithms, data management, scientific applications, and information-based applications to maximize performance, minimize cost and improve the scale of their endeavours.

May 3, 2012: Our paper on "Accelerating Pathology Image Data Cross Comparison on CPU-GPU Hybrid Systems" is accepted to VLDB 2012.

Our paper on building a hybrid CPU-GPU based solution for spatial cross-matching of pathology image data is accepted to the Very Large Data Bases Conference (VLDB'2012). The work is in collaboration with Ohio State University.


January 31, 2012: We are organizing the Extremely Large Databases Asia (XLDB Asia) Conference in Beijing this summer (June 22-23, 2012).

The Extremely Large Databases (XLDB) series of conferences/workshops (xldb.org) has been held successfully under the leadership of Jacek Becla in recent years. The first XLDB satellite conference in Asia (XLDB Asia) will be held in Beijing, China on June 22-23, 2012. This conference will bring together people with highly demanding data challenges, and researchers and solution providers who are developing systems to address such challenges. The conference will facilitate discussions on:
    The state of the art data handling technologies on extremely large datasets;
    Practical use cases of current and anticipated data challenges;
    Lessons and innovations on building extremely large databases;
    Trends and strategies for surmounting current hurdles.
The conference will provide a meeting place for database researchers, for businesses with advanced solutions, and for people from many research disciplines, industries and organizations who need to urgently address real data challenges.



December 8, 2011: AWS in Education research grant award.

We have been selected to receive an Amazon AWS in Education research grant (two years).
July 20, 2011: Dr. Wang was elected as a SPIE Senior Member, among 28 new senior members honored this year.

Dr. Wang was elected by the Society of Photographic Instrumentation Engineers (SPIE) Board of Directors and the Membership Committee to the grade of Senior Member, among 28 new senior members honored this year from 16,000 members, for his achievements in data management, integration and standardization for medical images and RFID. SPIE Senior Members are Members of distinction who are honored for their professional experience, their active involvement with the optics community and SPIE, and/or significant performance that sets them apart from their peers. See list of 2011 SPIE Senior Members.
June 24, 2011: Best Paper Award Received at ICDCS 2011.

The 31st International Conference on Distributed Computing Systems (ICDCS 11) presented the Best Paper Award to the researchers at Ohio State University (Rubao Lee, Tian Luo, Yin Hua, Xiaodong Zhang), Emory University (Fusheng Wang), and Facebook (Yongqiang He) for their paper entitled "YSmart: Yet another SQL-to-MapReduce Translator."
MapReduce has become a standard software framework for big data analytics in distributed systems, where system execution of SQL queries is a critical data processing task. An SQL-to-MapReduce translator automatically converts database queries to MapReduce jobs for their execution in distributed systems. Complex SQL query jobs generated by existing translators, such as the one in the data warehouse Hive developed at Facebook, and the one in the Pig MapReduce programming environment developed by Yahoo!, are executed at unacceptably slow speeds. The authors of the paper demonstrate that the poor performance of these translators is caused by the framework of one-operation-to-one-job mapping, which does not consider correlations of input data and input keys among the queries. The authors design and implement a correlation-aware SQL-to-MapReduce translator, called YSmart. The translator achieves superior performance compared with the existing translators in Hive and Pig. The translator has been patched into Hive for adoption, and an independent version of YSmart will be released for public usage.
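
The difference between one-operation-to-one-job mapping and a correlation-aware plan can be shown with a toy planner. This is a deliberately simplified illustration, not YSmart's actual rules or code: the `Op` record and both planner functions are hypothetical, data dependencies between operations are ignored, and "correlation" is reduced to "same input table and same partition key".

```python
from collections import defaultdict
from typing import NamedTuple


class Op(NamedTuple):
    """One relational operation in a query plan (hypothetical structure)."""
    name: str
    table: str  # input data scanned by the operation
    key: str    # key used to partition that input


def one_op_one_job(ops):
    """Naive translation: every operation becomes its own MapReduce job."""
    return [[op.name] for op in ops]


def correlation_aware(ops):
    """Merge operations that scan the same input with the same partition key,
    so the shared input is read and shuffled once instead of once per operation."""
    jobs = defaultdict(list)
    for op in ops:
        jobs[(op.table, op.key)].append(op.name)
    return list(jobs.values())


plan = [
    Op("agg_clicks", "clicks", "user_id"),
    Op("join_users", "clicks", "user_id"),
    Op("agg_orders", "orders", "user_id"),
]
print(one_op_one_job(plan))     # [['agg_clicks'], ['join_users'], ['agg_orders']]
print(correlation_aware(plan))  # [['agg_clicks', 'join_users'], ['agg_orders']]
```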
ICDCS is an annual conference providing a forum for engineers and scientists in academia, industry and government to present their latest research findings in many aspects of distributed and parallel computing. The annual conference this year was held in Minneapolis, Minnesota, June 21-24, 2011.
September 20, 2010: Dr. Wang won the best presentation award at Pathology Informatics Conference 2010.

Dr. Wang's talk "Developing Data Model Standards and Databases for Pathology Analytical Imaging" won the best presentation award of the session at the Pathology Informatics Conference, Boston, September 19-22, 2010. Pathology Informatics is a subdiscipline of pathology and clinical laboratory medicine, and the annual conference is the largest meeting with a broad attendance of experts in the fields of pathology and informatics.