Soumyadeep Roy

Address:
Complex Networks Research Lab, Dept. of CSE,
Indian Institute of Technology Kharagpur,
West Bengal, India - 721302
Curriculum Vitae

I successfully defended my Ph.D in May 2025 in the Department of Computer Science and Engineering at IIT Kharagpur, supervised by Prof. Niloy Ganguly and Prof. Shamik Sural. My Ph.D thesis is titled “Domain Adaptation for Medical Language Understanding”.
Research Interests: I have expertise in developing foundation models for medical applications, ranging from NLP (text) models to DNA and sc-RNA-based biological foundation models. For NLP, I work on developing efficient domain adaptation techniques for adapting open-domain generative AI models to medical tasks, specifically by optimizing the model vocabulary during finetuning (vocabulary adaptation).
Research Experience: After submitting my Ph.D thesis, I was a Ph.D intern at the Health Innovation and Technology Centre of GE Healthcare India, where I worked on generative AI for an oncology use case and developed an embedding model from error codes/system logs of medical devices. I am looking for research scientist and postdoc positions in Foundation Models and Generative AI for medicine. During my Ph.D, I was part of the Complex Networks Research Group (CNeRG) at IIT Kharagpur. I spent 2.5 years (January 2021 - July 2023) working as a Research Associate at the Leibniz AI Future Lab, L3S Research Center in Germany, with Prof. Wolfgang Nejdl. In collaboration with the Hannover Medical School, Germany, I developed a machine-learning methodology for identifying novel patient subtypes for Parkinson’s disease. Before joining the Ph.D program, I completed an M.S (Research) degree in the CSE Dept. at IIT Kharagpur and worked as a Junior Research Fellow (first 2 years) and then as a Senior Research Fellow. During my bachelor's, I also secured the Indian Academy of Sciences Summer Research Fellowship and spent a three-month summer internship at IIT Kharagpur.
news
Dec 13, 2024 | Our benchmarking and fine-grained evaluation work with CAIMED and L3S colleagues, “A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task”, has been accepted to the WSDM Day programme at WSDM 2025, to be held in Germany (https://www.wsdm-conference.org/2025/) |
---|---|
Nov 04, 2024 | Joined the Healthcare Innovation and Technology Center of Wipro GE Healthcare Pvt. Ltd. in Bangalore, India, as a Ph.D intern |
Oct 25, 2024 | Our tutorial proposal “Building Trustworthy AI Models for Medicine: From Theory to Applications” has been accepted at the 18th ACM International Conference on Web Search and Data Mining (WSDM 2025), to be held in Germany (https://www.wsdm-conference.org/2025/) |
Oct 16, 2024 | Submitted my Ph.D thesis titled “Domain Adaptation for Medical Language Understanding”. |
Sep 20, 2024 | Our work at IIT Kharagpur, “Adaptive BPE Tokenization for Enhanced Vocabulary Adaptation in Finetuning Pretrained Language Models”, was accepted to EMNLP 2024 Findings (an A* conference) as a short paper |