Soumyadeep Roy

Address:
Complex Networks Research Lab, Dept. of CSE,
Indian Institute of Technology Kharagpur,
West Bengal, India - 721302
Curriculum Vitae
I am an Institute Ph.D. Research Fellow in the Department of Computer Science and Engineering at IIT Kharagpur, supervised by Prof. Niloy Ganguly and Prof. Shamik Sural. I submitted my Ph.D. thesis, titled “Domain Adaptation for Medical Language Understanding”, in October 2024.
Currently, I am doing a research internship at the Health Innovation and Technology Centre of GE Healthcare India. I am looking for research scientist and postdoc positions in Foundation Models and Generative AI for medicine.
Research Interests: I have expertise in developing foundation models for medical applications, ranging from NLP (text) models to DNA and scRNA-based biological foundation models. For NLP, I develop efficient domain adaptation techniques for adapting open-domain generative AI models to medical tasks, specifically by optimizing the model vocabulary during finetuning (vocabulary adaptation).
Research Experience: During my Ph.D., I was part of the Complex Networks Research Group (CNeRG) at IIT Kharagpur. I spent 2.5 years (January 2021 - July 2023) working as a Research Associate at the Leibniz AI Future Lab, L3S Research Center in Germany, with Prof. Wolfgang Nejdl. In collaboration with the Hannover Medical School, Germany, I developed a novel machine-learning methodology for identifying new patient subtypes of Parkinson’s disease. Before joining the Ph.D. program, I completed an M.S. Research degree in the CSE Dept. at IIT Kharagpur and worked as a Junior Research Fellow (first 2 years) and Senior Research Fellow. During my bachelor's, I also secured the Indian Academy of Sciences Summer Research Fellowship and spent a three-month summer internship at IIT Kharagpur.
news
Dec 13, 2024 | Our benchmarking and fine-grained evaluation work with CAIMED and L3S colleagues, “A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task”, has been accepted to the WSDM Day programme at WSDM 2025, to be held in Germany (https://www.wsdm-conference.org/2025/) |
---|---|
Nov 04, 2024 | Joined the Healthcare Innovation and Technology Center of Wipro GE Healthcare Pvt. Ltd. in Bangalore, India, as a Ph.D. intern |
Oct 25, 2024 | Our tutorial proposal “Building Trustworthy AI Models for Medicine: From Theory to Applications” has been accepted at the 18th ACM International Conference on Web Search and Data Mining (WSDM 2025), to be held in Germany (https://www.wsdm-conference.org/2025/) |
Oct 16, 2024 | Submitted my Ph.D. thesis titled “Domain Adaptation for Medical Language Understanding”. |
Sep 20, 2024 | Our work at IIT Kharagpur, “Adaptive BPE Tokenization for Enhanced Vocabulary Adaptation in Finetuning Pretrained Language Models”, was accepted to EMNLP 2024 Findings (an A* conference) as a short paper |
selected publications
- A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task. In The 18th ACM International Conference on Web Search and Data Mining, 2025
- Building Trustworthy AI Models for Medicine: From Theory to Applications. In The 18th ACM International Conference on Web Search and Data Mining, 2025