III: Medium: Collaborative Research:
Collective Opinion Fraud Detection:
Identifying and Integrating Cues from Language, Behavior, and Networks

PI: Leman Akoglu
Co-PI: Yejin Choi
Department of Computer Science
Stony Brook University
Stony Brook, NY 11794
Phone: 1 (631) 632 9801
Fax: 1 (631) 632 1784
Email: {leman,ychoi} AT cs.stonybrook.edu
Website: http://www.cs.stonybrook.edu/~leman

PI: Bing Liu
Department of Computer Science
University of Illinois at Chicago
Phone: 1 (312) 685 2570
Fax: 1 (312) 413 0024
Email: liub AT cs.uic.edu
Website: http://www.cs.uic.edu/~liub/

PI: Christos Faloutsos
Computer Science Department
Carnegie Mellon University
Pittsburgh, PA 15213
Phone: 1 (412) 268 1457
Fax: 1 (412) 268 5576
Email: christos AT cs.cmu.edu
Website: www.cs.cmu.edu/~christos/

This material is based upon work supported by the National Science Foundation under Grant No. IIS-1408287. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


1.1. Abstract

Link to NSF abstract

1.2. Keywords

Opinion Fraud, Fraud Detection, Deception, Linguistic Patterns, Behavioral Analysis, Graph Mining.

1.3. Funding agency


In addition to the PIs, the following graduate students work on the project.


3.1. Project goals

Technical Merits:

Given how critical opinion fraud has become in online communities, how can one identify fake reviews and attribute them to the culprits responsible? By combining the PIs' expertise across the modalities in which deception leaves footprints (language, user behavior, and relational information), this project presents a research program that will yield much-needed solutions to this emerging, prevalent, and socially impactful problem. The ultimate goal is to create a unified detection framework that synergistically integrates multiple information sources, namely linguistics, user behavior, and network effects, to obtain the best of all worlds. The main idea is to formulate the problem as a relational inference task on composite heterogeneous networks, providing a principled, extensible approach that can blend and reinforce all of the above cues toward effective and robust fraud detection. From a scientific point of view, the research brings together three disciplines: natural language analysis, behavioral modeling, and graph mining. The outcome is a suite of novel, principled, and scalable techniques and models that will enhance our understanding of how opinion fraud, and misinformation in general, is created and disseminated at large scale. The PIs will collaborate with industry partners such as Yelp, Google, and Amazon, directly solicit online fake reviews, and conduct well-designed user studies to test and validate their techniques.

Broader Impacts:

This work will enable the development of opinion fraud and misinformation detection solutions that are critical to achieving integrity and credibility on the Web. The outcome of this research will benefit billions of Web users, governments, law enforcement agencies, multi-billion-dollar industries, and service providers. The two main groups that this project will directly and significantly impact are Web users and e-commerce site owners. The PIs will collaborate with Yelp in evaluating and integrating the developed techniques and tools. The PIs will further reach out to other industry contacts at Amazon, Google, and TripAdvisor, and aim to disseminate research results to them through published manuscripts and tutorials at major conferences attended by many industry practitioners, as well as by releasing publicly available open-source software for opinion fraud detection. The PIs will also educate the public by reaching out to popular press media for interviews and educational press articles.

3.2. Results

  • GUI for Manual Inspection of Opinion Fraud: We have developed a GUI tool for manual inspection and evaluation of opinion fraud. This Work is shared under the Creative Commons Attribution-NonCommercial 4.0 International Public License. Licensees may copy, distribute, display, and perform the Work, and make derivative works based on it, only for non-commercial purposes. You may download the GUI here (zip).

3.3. Related Publications

    • Mining Big Time-series Data on the Web
      Yasushi Sakurai, Yasuko Matsubara, and Christos Faloutsos
      Proceedings of the 25th International Conference Companion on World Wide Web, WWW '16 Companion, pp. 1029-1032, 2016.

3.4. Tutorials/Workshops

3.5. Dissertations

  1. Danai Koutra, Exploring and Making Sense of Large Graphs. Dissertation, CMU, August 2015. (KDD dissertation award).
  2. Alex Beutel, User Behavior Modeling with Large-Scale Graph Analysis. May 2016.


The educational contributions of the project include:
  • Under construction

Point of Contact: Leman Akoglu, leman AT cs.stonybrook.edu

Last updated: July 13, 2016, by Leman Akoglu