Recent Updates

Happening Aug 2024 @ACL in Bangkok : I am leading the organization of the Workshop on Human-Centered Large Language Modeling! Follow @HuCLLM

Happening Jun 2024 @NAACL in Mexico City : I will be giving an oral presentation on our position paper Large Human Language Models: A Need and the Challenges!

Happening Jun 2024 @NAACL 2024 in Mexico City : I will be presenting at the tutorial From Text to Context: Contextualizing Language with Humans, Groups, and Communities for Socially Aware NLP!

Apr 2024 : Invited to attend the CRA Grad Cohort for Women Workshop in Minnesota, and my interview will be recorded!

Apr 2024 : Invited to the German Consulate New York to attend the symposium Smart Minds Meet Smart Machines - AI For Science and the Public Good.

Mar 2024 : I will be giving a guest lecture to the NLP grad class at Stony Brook University on Transformer and Self-Attention!

Mar 2024 : Our position paper arguing for Large Human Language Models has been accepted to NAACL 2024!

Nikita Soni

I am a PhD candidate at Stony Brook University, New York, co-advised by H. Andrew Schwartz and Niranjan Balasubramanian. My research interest lies in Human-Centered Natural Language Processing and enabling Language Modeling with the context of the human behind the language.
Previously, I worked in the software industry, exploring multiple facets of software engineering (details in CV). Personally, I'm a very outdoorsy person, but I also enjoy my own company.

Download CV

Let's chat more over a cup of coffee.

Research Purpose

I am enthused about NLP's expanding reach in our lives and its unexplored ability to understand human nature better and more efficiently than ever. Language is more than words; it expresses identities, psychologies, cultures, and much more. I find it a compelling challenge to direct NLP language models to look beyond their current limitations and consider the human behind the language. The purpose of my research is to foster more empathy in an AI-centric future, thereby augmenting humanity rather than detracting from it.

Publications

Automatic implicit motives codings are as accurate as humans', cheaper, and 99% faster
PsyArXiv (2024). preprint
[pdf] August Nilsson, J. Malte Runge, Oscar Kjell, Nikita Soni, Adithya V Ganesan, and Carl Viggo Nilsson.

Robust language-based mental health assessments in time and space through social media
To appear in Nature npj Digital Medicine (2024). journal
[pdf] Siddharth Mangalik, Johannes C Eichstaedt, Salvatore Giorgi, Jihu Mun, Farhan Ahmed, Gilvir Gill, Adithya V Ganesan, Shashanka Subrahmanya, Nikita Soni, Sean AP Clouston, and H Andrew Schwartz.

Large Human Language Models: A Need and the Challenges.
To appear in NAACL (2024). conference
[pdf] Nikita Soni, H Andrew Schwartz, João Sedoc and Niranjan Balasubramanian.

Archetypes and Entropy: Theory-Driven extraction of Evidence for Suicide Risk.
CLPsych workshop in EACL (2024). workshop
[pdf] Vasudha Varadarajan, Allison Lahnala, Adithya V Ganesan, Gourab Dey, Siddharth Mangalik, Ana-Maria Bucur, Nikita Soni, Rajath Rao, Kevin Lanning, Isabella Valejo, Lucie Flek, H Andrew Schwartz, Charles Welch, and Ryan L Boyd.

Comparing Pre-trained Human Language Models: Is it Better with Human Context as Groups, Individual Traits, or Both?
arXiv (2024). preprint
[pdf] Nikita Soni, Niranjan Balasubramanian, H Andrew Schwartz, Dirk Hovy.

I slept like a baby: Using human traits to characterize deceptive ChatGPT and human text.
IACT workshop at ACM SIGIR conference (2023). workshop
[pdf] Salvatore Giorgi, David M. Markovitz, Nikita Soni, Vasudha Varadarajan, Siddharth Mangalik and H Andrew Schwartz.

Human Language Modeling [website]
ACL-Findings (2022). conference
[pdf] Nikita Soni, Matthew Matero, Niranjan Balasubramanian and H. Andrew Schwartz

WWBP-SQT-lite: Multi-level Models and Difference Embeddings for Moments of Change Identification in Mental Health Forums
CLPsych Workshop in NAACL (2022). workshop
[pdf] Adithya V Ganesan, Vasudha Varadarajan, Juhi Mittal, Shashanka Subrahmanya, Matthew Matero, Nikita Soni, Sharath Chandra Guntuku, Johannes Eichstaedt, and H Andrew Schwartz.

Detecting Dissonant Stance in Social Media: The Role of Topic Exposure.
NLP+CSS Workshop in EMNLP (2022). workshop
[pdf] Vasudha Varadarajan, Nikita Soni, Weixi Wang, Christian Luhmann, H Andrew Schwartz, and Naoya Inoue.

MeLT: Message-Level Transformer with Masked Document Representations as Pre-Training for Stance Detection
EMNLP-Findings (2021). conference
[pdf] Matthew Matero, Nikita Soni, Niranjan Balasubramanian and H. Andrew Schwartz

Recent Services

Reviewer for ACL Rolling Review Feb 2024 (ACL)

Reviewer for ACL Rolling Review Dec 2023 (NAACL)

PC member of ICWSM Data Challenge 2023

Reviewer for EMNLP 2023

Reviewer of Language Modeling & Analysis of Language Models Track, EMNLP 2022

PC member of The 5th workshop on Natural Language Processing and Computational Social Science (NLP+CSS), EMNLP 2022

Volunteer for Diversity & Inclusion Committee, NAACL Conference 2022

Follow @nikita_soni_

Experience

Education

2019 -

2008 - 2012

Research Visits

Research Internships

Software Engineering Jobs
