Clinical trial search

Developing an aspect-based, interpretable clinical trial search system using the PubMed citation network

Method overview of the proposed metapath-based similarity search (MPSS).

Publications and Awards

Full paper at IEEE ICDH 2023, recognized as a “candidate for the Best Student Paper”. Best Graduate Forum Presentation Award at COMSNETS 2018. One full paper accepted at CSoNet 2019.

Proposal accepted: our clinical trial search work will be investigated further as a new project titled “Developing a Fair and Interpretable Clinical Trial Search System” by Dr. Koustav Rudra (Project PI) of the Centre of Excellence in Artificial Intelligence, IIT Kharagpur, over the next 2 years.

Selected as a candidate for the Best Student Research Paper at IEEE ICDH 2023, held in Chicago.

Abstract

Clinical trials are an essential source of information for practicing Evidence-Based Medicine because they help to determine the efficacy of newly developed treatments and drugs. However, most of the existing trial search systems focus on a specific disease (e.g., cancer) and utilize disease-specific knowledge bases that hinder the adaptation of such methods to new diseases.

In this work, we overcome both limitations and propose a graph-based model that draws on both clinical trials and the PubMed database to alleviate the shortage of relevant clinical trials for a query. We construct a large heterogeneous graph (750K nodes and 1.2 million edges) made of clinical trials and the PubMed articles linked to them.
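To make the idea of a heterogeneous graph concrete, the following is a minimal sketch of a graph whose nodes carry a type and whose edges carry a label. It is an illustration only, not the project's implementation: the class name, the node types ("trial", "article"), the edge labels ("linked_to", "cites"), and all identifiers are hypothetical.

```python
from collections import defaultdict


class HeteroGraph:
    """Toy labeled graph: every node has a type, every edge has a label."""

    def __init__(self):
        self.node_type = {}
        # node -> edge label -> set of neighbouring nodes
        self.adj = defaultdict(lambda: defaultdict(set))

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, src, dst, label):
        # Stored in both directions so paths can be walked either way.
        self.adj[src][label].add(dst)
        self.adj[dst][label].add(src)

    def neighbours(self, node_id, label):
        """Return a copy of the neighbours reachable over edges with `label`."""
        return set(self.adj.get(node_id, {}).get(label, set()))


# Hypothetical identifiers, chosen only for illustration.
g = HeteroGraph()
g.add_node("TRIAL-1", "trial")
g.add_node("TRIAL-2", "trial")
g.add_node("ART-1", "article")
g.add_node("ART-2", "article")
g.add_edge("TRIAL-1", "ART-1", "linked_to")  # trial linked to an article
g.add_edge("ART-1", "ART-2", "cites")        # citation between articles
g.add_edge("ART-2", "TRIAL-2", "linked_to")
```

In this toy graph, TRIAL-1 and TRIAL-2 share no article directly but are connected through a citation between their linked articles, which is the kind of indirect connection a bibliographic network can expose.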

Because both the nodes and the edges of the graph are labeled, we develop a novel metapath-based similarity search (MPSS) method to retrieve and rank clinical trials across multiple disease classes. We primarily focus on consumers and other users who have no prior medical knowledge.
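The text above does not spell out how MPSS scores similarity along a metapath. As a rough illustration of the general idea, the sketch below counts metapath instances between trials and normalizes them with the well-known PathSim measure; PathSim is a stand-in here, not necessarily the scoring used by MPSS. The toy edges, the "linked_to" label, and all function names are assumptions, and node types are ignored for brevity (only edge labels are followed).

```python
from collections import defaultdict

# Hypothetical toy graph as (source, edge_label, target) triples.
EDGES = [
    ("T1", "linked_to", "A1"),
    ("T2", "linked_to", "A1"),
    ("T2", "linked_to", "A2"),
    ("T3", "linked_to", "A2"),
]


def build_index(edges):
    """Index neighbours by (node, edge_label), in both directions."""
    index = defaultdict(set)
    for src, label, dst in edges:
        index[(src, label)].add(dst)
        index[(dst, label)].add(src)
    return index


def count_paths(index, start, metapath):
    """Count metapath instances from `start`; returns {end_node: count}."""
    frontier = {start: 1}
    for label in metapath:
        nxt = defaultdict(int)
        for node, n in frontier.items():
            for nb in index.get((node, label), ()):
                nxt[nb] += n
        frontier = nxt
    return dict(frontier)


def pathsim(index, a, b, metapath):
    """PathSim for a symmetric metapath: 2*|a~>b| / (|a~>a| + |b~>b|)."""
    from_a = count_paths(index, a, metapath)
    from_b = count_paths(index, b, metapath)
    denom = from_a.get(a, 0) + from_b.get(b, 0)
    return 2 * from_a.get(b, 0) / denom if denom else 0.0


index = build_index(EDGES)
# Symmetric metapath trial -> article -> trial, written as edge labels.
TRIAL_ARTICLE_TRIAL = ["linked_to", "linked_to"]
score_t1_t2 = pathsim(index, "T1", "T2", TRIAL_ARTICLE_TRIAL)  # share A1
score_t1_t3 = pathsim(index, "T1", "T3", TRIAL_ARTICLE_TRIAL)  # share nothing
```

Here T1 and T2 share one article, so they receive a positive score, while T1 and T3 share none and score zero. The path that connects two trials can also be kept and shown to the user as a human-readable justification.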

Because no existing trial search evaluation dataset spans multiple diseases, we contribute a high-quality, well-annotated query-relevant trial set comprising around 25 queries and, on average, approximately 95 annotated trials per query. We also perform a detailed evaluation of MPSS on the TREC Precision Medicine Benchmark Dataset, a disease-specific clinical trial search setting.

Methodological overview of MPSS. A user enters a query as free-form text; medical concepts are extracted from the query and then used to retrieve relevant trials from the clinical trials registry database. Our proposed metapath-based similarity search algorithm, which uses the PubMed bibliographic network, then improves the retrieval performance of MPSS. We then introduce three metadata-based ranking aspects (relevance, adversity, and popularity) as well as a single ranked list that combines all three through aspect-based rank fusion. MPSS follows a faceted search paradigm in which the user can select any one of the four rankings according to their information need. For the additional trials retrieved through PubMed-enhanced retrieval, we present the metapath as an explanation in the final ranked list, which makes the trial search results more explainable.
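The overview does not say which fusion rule combines the three aspect rankings. The sketch below uses reciprocal rank fusion (RRF), a common rank-fusion technique, purely as a stand-in; the constant k=60, the trial identifiers, the "combined" facet name, and the function names are all assumptions made for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = {}
    for ranked in rankings.values():
        for rank, trial_id in enumerate(ranked, start=1):
            scores[trial_id] = scores.get(trial_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by trial id for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))


def ranked_list_for(aspect, rankings):
    """Faceted choice: one of the three aspects, or "combined" for the fused list."""
    if aspect == "combined":
        return reciprocal_rank_fusion(rankings)
    return rankings[aspect]


# Hypothetical per-aspect rankings of the same three trials.
aspect_rankings = {
    "relevance": ["TRIAL-A", "TRIAL-B", "TRIAL-C"],
    "adversity": ["TRIAL-B", "TRIAL-A", "TRIAL-C"],
    "popularity": ["TRIAL-B", "TRIAL-C", "TRIAL-A"],
}
fused = ranked_list_for("combined", aspect_rankings)
```

With these toy inputs, TRIAL-B ranks first in the fused list because it sits at or near the top of all three aspect rankings, while a user who cares only about one aspect can request that ranking directly.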