Nikita Soni

I am a PhD candidate at Stony Brook University, New York, co-advised by H. Andrew Schwartz and Niranjan Balasubramanian. My research focuses on integrating the author's context into language modeling to build human-context-aware models that can be useful in multiple areas, such as mental health and education. I lead the workshop Human-Centered Large Language Modeling (1st edition at ACL 2024). I have presented at the tutorial on Contextualizing Language with Humans (at NAACL 2024). I am also leading a SemEval-2026 shared task on Predicting Variation in Emotional Affect (at ACL 2026). I also organize Birds of a Feather sessions (1st editions at ACL 2024) for human-centered LLM mentorship and community building, as well as another to discuss mental health among researchers (students and academics alike). I actively serve as a reviewer for many top NLP conferences and workshops.

I am very excited about the following research directions:

  1. Building and Evaluating Human-Context-aware Language Models (ACL 2022, NAACL 2024, EMNLP 2021). How can we design models that address the ecological fallacy in LLMs, which arises from treating text sequences as independent during pre-training, even when they are authored by the same person, and thereby discounting the human (author's) context? For example, independently processing two blog posts written by the same person, one in which they express how deeply they were affected by an altercation with a loved one and another in which they write about their childhood trauma involving the same loved one, can miss the author's context needed to better understand their emotions.

  2. Human-Centered Applications and Interdisciplinary Focus (ACL 2022, NAACL 2024, NAACL 2025, WASSA 2024, CLPsych 2025, EMNLP 2021). Applying human-context-aware models to a range of downstream human-centered NLP tasks such as sentiment analysis and stance detection (e.g., towards abortion), social scientific tasks such as assessing age and personality, psychological tasks such as assessing affect and empathy, and mental health applications such as identifying suicide risk.

  3. Datasets with Human Context (In submission). Building and curating datasets that ethically include anonymized human contexts.

  4. Benchmarking Human-Context-aware Models (Ongoing work). Creating benchmarks to standardize evaluations for human-context-aware models.

Recent Updates

Aug 2024 @ACL in Bangkok : I am leading the organization of the Workshop on Human-Centered Large Language Modeling!

Jun 2024 @NAACL in Mexico City : I will be giving an oral presentation on our position paper Large Human Language Models: A Need and the Challenges!

Jun 2024 @NAACL 2024 in Mexico City : I will be a presenter at the Tutorial on From Text to Context: Contextualizing Language with Humans, Groups, and Communities for Socially Aware NLP!

Apr 2024 : Invited to attend the CRA Grad Cohort for Women Workshop in Minnesota, and my interview will be recorded!

Apr 2024 : Invited to the German Consulate New York to attend the symposium Smart Minds Meet Smart Machines - AI For Science and the Public Good.

Mar 2024 : I will be giving a guest lecture to the NLP grad class at Stony Brook University on Transformer and Self-Attention!

Mar 2024 : Our position paper arguing for Large Human Language Models has been accepted to NAACL 2024!

Publications

Addressing the Ecological Fallacy in Larger LLMs with the Author’s Context

In Submission at EMNLP 2025 (Long) preprint

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Schwartz, Niranjan Balasubramanian

Systematic Evaluation of Auto-Encoding and Large Language Model Representations for Capturing Author States and Traits

Findings of the Association for Computational Linguistics: ACL 2025 (Long) conference

[pdf]

Khushboo Singh, Vasudha Varadarajan, Adithya V. Ganesan, August Håkan Nilsson, Nikita Soni, Syeda Mahwish, Pranav Chitale, Ryan L. Boyd, Lyle Ungar, Richard N. Rosenthal, H. Andrew Schwartz

Evaluation of LLMs-based Hidden States as Author Representations for Psychological Human-Centered NLP Tasks

Findings of the Association for Computational Linguistics: NAACL 2025 (Short) conference

[pdf]

Nikita Soni, Pranav Chitale, Khushboo Singh, Niranjan Balasubramanian, H. Andrew Schwartz

Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models

Proceedings of the 10th Workshop workshop

[pdf]

Nikita Soni, August Håkan Nilsson, Syeda Mahwish, Vasudha Varadarajan, H. Andrew Schwartz, Ryan L. Boyd

The Consistent Lack of Variance of Psychological Factors Expressed by LLMs and Spambots

Detecting AI-Generated Content Workshop at COLING 2025 (Short) conference

[pdf]

Vasudha Varadarajan, Salvatore Giorgi, Siddharth Mangalik, Nikita Soni, Dave M. Markowitz, H. Andrew Schwartz

Large Human Language Models: A Need and the Challenges

The 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (Long) conference

[pdf]

Nikita Soni, H. Andrew Schwartz, João Sedoc, Niranjan Balasubramanian

Comparing Pre-trained Human Language Models: Is it Better with Human Context as Groups, Individual Traits, or Both?

14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis (WASSA 2024), ACL 2024 (Long) workshop

[pdf]

Nikita Soni, Niranjan Balasubramanian, H. Andrew Schwartz, Dirk Hovy

Proceedings of the 1st Human-Centered Large Language Modeling Workshop

1st Human-Centered Large Language Modeling Workshop (ACL 2024) workshop

[pdf]

Nikita Soni, Lucie Flek, Ashish Sharma, Diyi Yang, Sara Hooker, H. Andrew Schwartz

From Text to Context: Contextualizing Language with Humans, Groups, and Communities for Socially Aware NLP

2024 Conference of the North American Chapter of the Association for Computational Linguistics conference

[pdf]

Adithya V Ganesan, Siddharth Mangalik, Vasudha Varadarajan, Nikita Soni, Swanie Juhng, João Sedoc, H. Andrew Schwartz, Salvatore Giorgi, Ryan L. Boyd

Automatic Implicit Motives Codings are at Least as Accurate as Humans’ and 99% Faster

Journal of Personality and Social Psychology: Personality Processes and Individual Differences, 2024 (Journal) journal

[pdf]

August Nilsson, J. Malte Runge, Oscar Kjell, Nikita Soni, Adithya V Ganesan, Carl Viggo Nilsson

Robust language-based mental health assessments in time and space through social media

Nature npj Digital Medicine (2024) journal

[pdf]

Siddharth Mangalik, Johannes C Eichstaedt, Salvatore Giorgi, Jihu Mun, Farhan Ahmed, Gilvir Gill, Adithya V Ganesan, Shashanka Subrahmanya, Nikita Soni, Sean AP Clouston, H Andrew Schwartz

Archetypes and Entropy: Theory-Driven extraction of Evidence for Suicide Risk

CLPsych workshop in EACL (2024) workshop

[pdf]

Vasudha Varadarajan, Allison Lahnala, Adithya V Ganesan, Gourab Dey, Siddharth Mangalik, Ana-Maria Bucur, Nikita Soni, Rajath Rao, Kevin Lanning, Isabella Valejo, Lucie Flek, H Andrew Schwartz, Charles Welch, Ryan L Boyd

I slept like a baby: Using human traits to characterize deceptive ChatGPT and human text

IACT workshop at ACM SIGIR conference (2023) workshop

[pdf]

Salvatore Giorgi, David M. Markowitz, Nikita Soni, Vasudha Varadarajan, Siddharth Mangalik, H Andrew Schwartz

Human Language Modeling

ACL-Findings (2022) conference

[pdf]

Nikita Soni, Matthew Matero, Niranjan Balasubramanian, H. Andrew Schwartz

WWBP-SQT-lite: Multi-level Models and Difference Embeddings for Moments of Change Identification in Mental Health Forums

CLPsych Workshop in NAACL (2022) workshop

[pdf]

Adithya V Ganesan, Vasudha Varadarajan, Juhi Mittal, Shashanka Subrahmanya, Matthew Matero, Nikita Soni, Sharath Chandra Guntuku, Johannes Eichstaedt, H Andrew Schwartz

Detecting Dissonant Stance in Social Media: The Role of Topic Exposure.

NLP+CSS Workshop in EMNLP (2022) workshop

[pdf]

Vasudha Varadarajan, Nikita Soni, Weixi Wang, Christian Luhmann, H Andrew Schwartz, Naoya Inoue

MeLT: Message-Level Transformer with Masked Document Representations as Pre-Training for Stance Detection

EMNLP-Findings (2021) conference

[pdf]

Matthew Matero, Nikita Soni, Niranjan Balasubramanian, H. Andrew Schwartz

Recent Services

Organizing a SemEval-2026 Shared Task: Predicting Variation in Emotional Valence and Arousal over Time from Ecological Essays
Nikita Soni, H. Andrew Schwartz, Tony Bui, Ryan Boyd, August Håkan Nilsson, Syeda Mahwish, Adithya V Ganesan, Lyle Ungar, Niranjan Balasubramanian, Saif M. Mohammad

Organized the 1st Human-Centered Large Language Modeling Workshop co-located with ACL 2024
Nikita Soni, Lucie Flek, Ashish Sharma, Diyi Yang, Sara Hooker, and H. Andrew Schwartz.

Tutorial on From Text to Context: Contextualizing Language with Humans, Groups, and Communities for Socially Aware NLP at NAACL 2024
Adithya V Ganesan, Siddharth Mangalik, Vasudha Varadarajan, Nikita Soni, Swanie Juhng, João Sedoc, H. Andrew Schwartz, Salvatore Giorgi, and Ryan Boyd

Organized a Birds of a Feather Session at ACL 2024 focused on mentorship and community building for Human-Centered Large Language Modeling.

Organized a Birds of a Feather Session at ACL 2024 to engage in an open dialogue on mental health challenges faced by graduate students, postdocs, faculty, and researchers.

Program Committee at EMNLP 2025, NAACL 2025, ACL 2024, NAACL 2024, NAACL 2023, EMNLP 2023, EMNLP 2022

Program Committee at The 5th workshop on Natural Language Processing and Computational Social Science (NLP+CSS).

Program Committee at ICWSM Data Challenge 2023

Volunteer for Diversity & Inclusion Committee at NAACL 2022

Experience

Research Visits

Software Engineering Jobs