Soumyadeep Roy

Address:
Complex Networks Research Lab, Dept. of CSE,
Indian Institute of Technology Kharagpur,
West Bengal, India - 721302
Curriculum Vitae | Research Blog
I am an incoming postdoctoral scholar in the Department of Biomedical Informatics, Stanford University.
My primary area of research is natural language processing, with expertise in medical and healthcare applications. My research areas of interest are Foundation Models for Medicine, Generative AI, Text Summarization, and Efficient Pretraining.
I hold a PhD in Computer Science and Engineering from the Indian Institute of Technology Kharagpur, where I worked with Prof. Niloy Ganguly and Prof. Shamik Sural. Here, I was part of the Complex Networks Research Group (CNeRG).
My PhD thesis, titled “Domain Adaptation for Medical Language Understanding”, develops novel domain adaptation techniques to effectively and efficiently adapt open-domain AI models to the medical domain.
Please go through my CV for further details.
news
Jul 11, 2025 | Delighted to share that our work “Where does it hurt? Medical Intent Classification from Dialogues: A Dataset on Doctor Intents and Benchmarking Study”, with Berlin University of Applied Sciences Berlin (BHT) and L3S Research Center, Germany, has been accepted to the 28th European Conference on Artificial Intelligence (ECAI 2025), to be held in Bologna, Italy. |
---|---|
Jun 01, 2025 | Relaunched my research blog about NLP, medical AI and academic research, at https://datanalytics101.com |
May 15, 2025 | Our work “Evaluation of LLMs in Medical Text Summarization: The Role of Vocabulary Adaptation in High OOV Settings”, carried out at IIT Kharagpur, was accepted to ACL 2025 Findings, an A* conference in natural language processing. |
May 05, 2025 | Successfully defended my PhD Thesis “Domain Adaptation for Medical Language Understanding” on May 5, 2025 |
Mar 14, 2025 | Moderated a panel discussion with Prof. Preslav Nakov, Prof. Krishna Gummadi, Prof. Ritumbra Manuvie, Prof. Jeanne Mifsud Bonnici, and Prof. Prasenjit Mitra at the Workshop on Generative AI for Disinformation and Misinformation Detection, held at WSDM 2025 |
Mar 10, 2025 | Delivered a 3-hour tutorial “Building Trustworthy AI Models for Medicine: From Theory to Applications” (tutorial website) together with Dominik Wolff (Hannover Medical School, Germany), Sowmya S. Sundaram (Stanford University) and Prof. Niloy Ganguly (IIT Kharagpur), at the 18th ACM International Conference on Web Search and Data Mining (WSDM 2025) held in Hannover, Germany |
Jan 20, 2025 | Selected to attend the Google DeepMind Research Symposium, held at the Google Bangalore office. |
Dec 20, 2024 | Presented our SIGIR 2024 paper “Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions” as a poster at the 6th Indian Symposium on Machine Learning (IndoML 2024), held at the BITS Pilani Goa Campus |
Dec 13, 2024 | Our benchmarking and fine-grained evaluation work with CAIMED and L3S colleagues, “A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task”, has been accepted to the WSDM Day programme at WSDM 2025, to be held in Germany (https://www.wsdm-conference.org/2025/) |
Nov 04, 2024 | Joined the Healthcare Innovation and Technology Center of Wipro GE Healthcare Pvt. Ltd. in Bangalore, India, as a PhD intern |