I'm on the academic job market. Please reach out if you or someone you know is hiring for postdoc positions and you think I would be a good fit!

Nikita Soni

I am a PhD candidate at Stony Brook University, New York, co-advised by H. Andrew Schwartz and Niranjan Balasubramanian. My research focuses on integrating the author's context into language modeling to build human-context-aware models that can be useful in multiple avenues, such as mental health and education. I also spent a semester on a research visit to Dirk Hovy's lab (MilaNLP) at Bocconi University in Milan and continue to collaborate on this line of work. I lead the Workshop on Human-Centered Large Language Modeling (1st edition at ACL 2024). I have presented at the tutorial on Contextualizing Language with Humans (at NAACL 2024). I am also leading a SemEval-2026 shared task on Predicting Variation in Emotional Affect (at ACL 2026). In addition, I organize Birds of a Feather sessions (1st editions at ACL 2024): one for mentorship and community building around human-centered LLMs, and another to discuss mental health among researchers (students and academics alike). I actively serve as a reviewer for many top NLP conferences and workshops.

I am very excited about the following research directions:

Building and Evaluating Human-Context-aware Language Models (ACL 2022, NAACL 2024, EMNLP 2021). How can we design models that address the ecological fallacy in LLMs, which arises from pre-training on text sequences independently, even when they are authored by the same person, and thereby discounting the human (author's) context? For example, consider two blog posts written by the same person: one where they express how deeply they were affected by an altercation with a loved one, and another where they write about their childhood trauma involving the same loved one. Processing the two posts independently can miss the author's context that would help in better understanding their emotions.
Human-Centered Applications and Interdisciplinary Focus (ACL 2022, NAACL 2024, NAACL 2025, WASSA 2024, CLPsych 2025, EMNLP 2021). Applying human-context-aware models to a range of downstream human-centered NLP tasks: sentiment analysis and stance detection (e.g., towards abortion), social scientific tasks such as assessing age and personality, psychological tasks such as assessing affect and empathy, and mental health applications such as identifying suicide risk.
Datasets with Human Context (In submission). Building and curating datasets that ethically include anonymized human contexts.
Benchmarking Human-Context-aware Models (Ongoing work). Creating benchmarks to standardize evaluations for human-context-aware models.

News

Nov 2025 : Invited talk at H2Lab @ University of Washington, on Human-Centered Large Language Modeling

Jun 2025 : Leading the organization of SemEval 2026 Shared Task: Predicting Variation in Emotional Valence and Arousal over Time from Ecological Essays

May 2025 @NAACL in Albuquerque : I am presenting 2 papers. See you in Albuquerque, New Mexico!

[.. sorry, didn't keep up with updates in between 😅 ]

Aug 2024 @ACL in Bangkok : I am leading the organization of the Workshop on Human-Centered Large Language Modeling! Follow @HuCLLM

Jun 2024 @NAACL in Mexico City : I will be giving an oral presentation on our position paper Large Human Language Models: A Need and the Challenges!

Jun 2024 @NAACL in Mexico City : I will be a presenter at the tutorial From Text to Context: Contextualizing Language with Humans, Groups, and Communities for Socially Aware NLP!

Apr 2024 : Invited to attend the CRA Grad Cohort for Women Workshop in Minnesota, and my interview will be recorded!

Apr 2024 : Invited to the German Consulate New York to attend the symposium Smart Minds Meet Smart Machines - AI For Science and the Public Good.

Mar 2024 : I will be giving a guest lecture to the NLP grad class at Stony Brook University on Transformer and Self-Attention!

Mar 2024 : Our position paper arguing for Large Human Language Models has been accepted to NAACL 2024!

Publications

Addressing the Ecological Fallacy in Larger LLMs with the Author’s Context

Ongoing Work [Long]

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Schwartz, Niranjan Balasubramanian

Systematic Evaluation of Auto-Encoding and Large Language Model Representations for Capturing Author States and Traits

Findings of the Association for Computational Linguistics: ACL 2025 [Long]