Soumyadeep Roy

Curriculum Vitae



Complex Networks Research Lab, Dept. of CSE,

Indian Institute of Technology Kharagpur,

West Bengal, India - 721302


I am an Institute Ph.D. Research Fellow in the Department of Computer Science and Engineering at IIT Kharagpur, supervised by Prof. Niloy Ganguly and Prof. Shamik Sural. Currently, I am part of the Complex Networks Research Group (CNeRG) at IIT Kharagpur.

Research Interests: I am passionate about interdisciplinary research in medical NLP, fast pretraining of gene transformer models, and adapting LLMs for medical question answering and summarization, particularly in limited-data settings.

Research Experience: I spent 2.5 years (January 2021 - July 2023) working as a Research Associate at the Leibniz AI Future Lab, L3S Research Center, Germany, with Prof. Wolfgang Nejdl. In collaboration with the Hannover Medical School, Germany, I developed a machine-learning methodology for identifying novel patient subtypes of Parkinson’s disease. Before joining the Ph.D. program, I completed an M.S. (Research) degree in the CSE Department at IIT Kharagpur and worked as a Junior Research Fellow (first 2 years) and then as a Senior Research Fellow. During my bachelor’s studies, I also secured the Indian Academy of Sciences Summer Research Fellowship and completed a three-month summer internship at IIT Kharagpur.


Apr 17, 2024 Our work at IIT Kharagpur, “MEDVOC: Vocabulary Adaptation for Fine-tuning Pre-trained Language Models on Medical Text Summarization”, was accepted to the IJCAI 2024 Main Track (a Core A* conference).
Mar 26, 2024 Our work in collaboration with L3S Research Center, Germany, “Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions”, was accepted to SIGIR 2024 (a Core A* conference) as a full paper in the “Resource and Reproducibility” Track.
Feb 20, 2024 Our proposal was accepted. Our clinical trial search work will now be further investigated as a new project titled “Developing a Fair and Interpretable Clinical Trial Search System”, led by Dr. Koustav Rudra (Project PI) of the Centre of Excellence in Artificial Intelligence, IIT Kharagpur.
Dec 22, 2023 Awarded 3rd prize in the poster presentation at the 4th Indian Symposium on Machine Learning (IndoML 2023), held at IIT Bombay, for our ECAI 2023 work on fast pretraining of gene transformer models.
Sep 15, 2023 Serving as a reviewer for AAAI 2024 and EMNLP 2023.

selected publications

  1. MEDVOC: Vocabulary Adaptation for Fine-tuning Pre-trained Language Models on Medical Text Summarization
    Gunjan Balde, Soumyadeep Roy, Mainack Mondal, and 1 more author
    In Proceedings of the 33rd International Joint Conference on Artificial Intelligence, 2024
  2. Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions
    Soumyadeep Roy, Aparup Khatua, Fatemeh Ghoochani, and 3 more authors
    In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
  3. GENEMASK: Fast Pretraining of Gene Sequences to Enable Few-Shot Learning
    Soumyadeep Roy, Jonas Wallat, Sowmya S. Sundaram, and 2 more authors
    In 26th European Conference on Artificial Intelligence (ECAI 2023), Sep 2023
  4. Interpretable Clinical Trial Search using Pubmed Citation Network
    Soumyadeep Roy, Niloy Ganguly, Shamik Sural, and 1 more author
    In 2023 IEEE International Conference on Digital Health (ICDH), Sep 2023
  5. Knowledge-Aware Neural Networks for Medical Forum Question Classification
    Soumyadeep Roy, Sudip Chakraborty, Aishik Mandal, and 6 more authors
    In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Sep 2021
  6. An Integrated Approach for Improving Brand Consistency of Web Content: Modeling, Analysis, and Recommendation
    Soumyadeep Roy, Shamik Sural, Niyati Chhaya, and 2 more authors
    ACM Trans. Web, May 2021