Yejin Choi
    Assistant Professor
    Stony Brook University (SUNY Stony Brook)
    1422 Computer Science
    Stony Brook, NY 11794-4400
    (phone) 631-632-8457
    (fax) 631-632-8334


  • Joining University of Washington CSE this fall!
  • Area chair for EMNLP 2014
  • 1 paper at ACL
  • Our EMNLP paper on predicting successful novels is featured in numerous media outlets: IEEE Spectrum Podcast, Toronto Star, NPR, and CBS Radio Canada
  • 3 papers (2 long + 1 short) at EMNLP 2013
  • 1 paper at ICCV 2013 -- Best Paper Award
  • New media coverage: our collaboration with Mike Luca at Harvard Business School is featured in The Atlantic
  • 2 papers at ACL 2013: one on connotation lexicon, another on new image-text parallel corpus
  • New media coverage: our work on connotation lexicon is featured by FastCompany
  • 1 journal paper to appear in TPAMI 2013
  • Invited speaker at Vision+NLP Workshop at NAACL 2013
  • Panel speaker at Student Research Workshop at NAACL 2013
  • Interview with News for New York @ WNBC on deception cues in product reviews
  • Area chair for EMNLP 2012
  • Area chair for NAACL 2012
Office Hours:




Recent Research Projects:

  • Language and Vision; Language Grounding

    Web data today is increasingly multi-modal, creating both opportunities and a need for integrative models that bridge Natural Language Processing and Computer Vision. Our recent explorations include:
    - Generating natural language descriptions of images by guiding object detection with a language prior [CVPR-11], by predicting likely action verbs from language-driven world knowledge [CoNLL-11], and by composing phrases retrieved by partial image matching [ACL-12].
    - Understanding characteristics of visual descriptions [NAACL-12].
    - Constructing a new image-text parallel corpus by reducing information misalignment between images and text [ACL-13].
  • Writing Styles, Deception Detection, Personal Analytics, Forensic Language Technologies

    Language is a window into people's minds. We explore data-driven approaches to statistical stylometry (i.e., the study of linguistic styles) and forensic language technologies (e.g., authorship verification, obfuscation, deception detection). This research is naturally interdisciplinary, with broad connections to Psychology, Social Science, Cognitive Science, Psycholinguistics, and Literature.

    Our recent work includes:
    - Predicting the success of novels [EMNLP-13a], and creative lexical compositions [EMNLP-13b].
    - Uncovering the (hidden) intent of authors, such as deception [ACL-11, ACL-12, ICWSM-12] and textual vandalism [ACL-11].
    - Detecting socio-cognitive identities, such as authorship [EMNLP-12], gender [CoNLL-11], and nationality.



Short Bio: