Soumyadeep Roy

Postdoc at Stanford Medicine

I am a postdoctoral scholar working with Prof. Tina Hernandez-Boussard at the Division of Computational Medicine, Department of Medicine, Stanford University.

Currently, I am building and evaluating LLM-based clinical decision support systems for two applications: (i) clinical guideline adherence over real-world patient trajectories and (ii) synthetic cohort generation for evaluating AI-based patient-to-trial matching systems like TrialGPT.

My work sits at the intersection of biomedical AI, machine learning, natural language processing, and real-world evidence. As medicine continues to generate increasingly complex clinical, genomic, and textual data, this kind of research is central to the future of AI in medicine.

I have worked with clinical data (structured EHR and unstructured notes) on Parkinson’s disease (L3S Research Center and Hannover Medical School, Germany), oncology (breast, lung, and prostate cancer; GE Healthcare India), and postoperative pain management (Stanford Medicine).

Research Career Overview

Translational Works

[Boston GrandHack 2026] Co-developed AuriCare, a holistic pain-management decision-support concept, presented at MIT Hacking Medicine’s Boston GrandHack 2026. Blog Event

[Patent, filed 2025] Lead co-inventor on a deep-learning representation-learning system for medical-imaging-equipment sensor logs, enabling anomaly detection and predictive maintenance (Wipro GE Healthcare).

[Translation Funded Project] My PhD work on interpretable clinical trial search has been continued as a funded translational research project under the AI4ICPS programme at IIT Kharagpur, led by my co-author. Official Website Poster

news

Jul 04, 2026 Serving as a reviewer for NeurIPS 2026, ACL Rolling Review (March, May 2026) and AAAI 2027
May 10, 2026 Guest lecturer for the Stanford BMDS 223 course “Deploying and Evaluating Fair AI in Healthcare” (Spring 2026): one lecture on “Bias Evaluation in LLMs” and one hands-on coding workshop on “Bias Audit on Real-World Data (MIMIC-IV)”
Apr 10, 2026 Our work on efficient vocabulary adaptation for the medical and legal domains was accepted to the ACL 2026 main track as a full paper. I will be presenting in person in San Diego, California. Code Preprint
Mar 16, 2026 Presented AuriCare, a holistic pain management decision support concept, at MIT Hacking Medicine’s Boston GrandHack 2026. Demo Blog
Feb 20, 2026 Served as a reviewer for 2 conferences (FAccT 2026, ACL ARR January 2026) and 2 journals (JAMIA, Frontiers in AI)
Feb 14, 2026 Our work “LongTailQA: Benchmarking LLMs and RAG Models on Disambiguated Long-Tail Entities” was accepted to LREC 2026. This was a year-long collaborative effort with PhD students and colleagues from L3S Research Center, Germany
Dec 05, 2025 Presented our work on vocabulary adaptation (VA) for training medical language models at the Microsoft Research India (Bangalore) Friday Breakfast talk series. Link to slides
Nov 21, 2025 Our Parkinson’s disease subtyping paper with L3S Research Center, Germany, and Hannover Medical School was published in Frontiers in Artificial Intelligence, in the Medicine and Public Health section: https://doi.org/10.3389/frai.2025.1668206. Link to slides
Sep 03, 2025 Started my postdoc at Stanford Medicine with Prof. Tina Hernandez-Boussard. I will work on understanding how real-world patient trajectories deviate from clinical guidelines. Do these deviations lead to positive patient outcomes or avoidable harm?
Aug 01, 2025 Served as a reviewer for A* conferences such as EMNLP, AAAI, and ACL Rolling Review (July), and for journals such as Frontiers in Genetics and Knowledge and Information Systems.

selected publications

  1. Learning Faster with Better Tokens: Parameter-Efficient Vocabulary Adaptation for Specialized Text Summarization
    Gunjan Balde, Soumyadeep Roy, Mainack Mondal, and 1 more author
    In Main Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics, Sep 2026
  2. Decision tree-based approach to robust Parkinson’s disease subtyping using clinical data of the Michael J. Fox Foundation LRRK2 cross-sectional study
    Soumyadeep Roy, Stefanie Krähe, Michael Marschollek, and 3 more authors
    Frontiers in Artificial Intelligence, Sep 2025
  3. Building Trustworthy AI Models for Medicine: From Theory to Applications
    Soumyadeep Roy, Sowmya S. Sundaram, Dominik Wolff, and 1 more author
    In The 18th ACM International Conference on Web Search and Data Mining, Sep 2025
  4. MEDVOC: Vocabulary Adaptation for Fine-tuning Pre-trained Language Models on Medical Text Summarization
    Gunjan Balde, Soumyadeep Roy, Mainack Mondal, and 1 more author
    In Proceedings of the 33rd International Joint Conference on Artificial Intelligence, Sep 2024
  5. Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions
    Soumyadeep Roy, Aparup Khatua, Fatemeh Ghoochani, and 3 more authors
    In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Sep 2024
  6. GENEMASK: Fast Pretraining of Gene Sequences to Enable Few-Shot Learning
    Soumyadeep Roy, Jonas Wallat, Sowmya S. Sundaram, and 2 more authors
    In 26th European Conference on Artificial Intelligence ECAI 2023, Sep 2023
  7. Interpretable Clinical Trial Search using Pubmed Citation Network
    Soumyadeep Roy, Niloy Ganguly, Shamik Sural, and 1 more author
    In 2023 IEEE International Conference on Digital Health (ICDH), Sep 2023
  8. Knowledge-Aware Neural Networks for Medical Forum Question Classification
    Soumyadeep Roy, Sudip Chakraborty, Aishik Mandal, and 6 more authors
    In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Sep 2021