Home
Welcome to the CTSuggest wiki!
1. Step 1: Trial Specification
The first step of the CTSuggest app serves as a user-friendly starting point both for reviewing existing clinical trials and for creating new custom ones. Once a trial is selected, the following fields are populated with detailed information:
- Title: The official name of the trial.
- Brief Summary: A concise overview of what the trial is about.
- Condition: The health condition or disease that the trial addresses.
- Eligibility Criteria: This details the criteria for participant inclusion and exclusion.
- Intervention: Descriptions of the interventions tested.
- Outcome: The expected outcome of the trial.
This step is divided into two sections, Existing Trials and Custom-Made Trials. The aforementioned fields are present in both sections.
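The fields above can be pictured as a single trial record. A minimal sketch in Python, assuming hypothetical field names (the app's internal schema is not documented here):

```python
# Hypothetical sketch of the trial record assembled in Step 1.
# Field names are assumptions; the app's internal schema may differ.
trial = {
    "trial_id": "NCT00126737",   # set to "custom" for custom-made trials
    "title": "Official name of the trial",
    "brief_summary": "Concise overview of the trial",
    "condition": "Health condition or disease the trial addresses",
    "eligibility_criteria": "Participant inclusion and exclusion criteria",
    "intervention": "Descriptions of the interventions tested",
    "outcome": "Expected outcomes of the trial",
    # Present only for existing trials (see Section 1.1):
    "actual_features": ["Age", "Sex", "BMI"],
}

def is_existing_trial(record: dict) -> bool:
    """Existing trials carry expert-determined Actual Features."""
    return "actual_features" in record
```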
1.1 Existing Trials
Here, users can easily look up and review details of existing clinical trials. The datasets used are CT-Pub and CT-Repo, which are publicly available on clinicaltrials.gov. The section begins with a dropdown menu where users can select a clinical trial ID (default: NCT00126737). The dropdown includes a search feature, making it easier to find specific trials.
To load an existing trial's data into these fields, users click the "Load an existing trial" button, which brings up all the relevant trial information and stores it for the LLM to use when generating and evaluating features.
- Actual Features: The actual baseline features determined by medical experts that serve as a reference for the LLM in generating and evaluating the candidate baseline features.
The Actual Features field appears only for existing trials. Users should keep in mind that, in this case, they should only click the "Load an existing trial" button, after which the information is stored for the LLMs to receive; pressing other buttons at this point might confuse the LLMs. Users can also modify the trial information and then store it for the LLM to use; if they do, they must click the "Update" button so the changes are stored, and no evaluation is performed in that case.
1.2 Custom-Made Trials
Users can also create a custom trial by clicking the "Create a new trial" button and filling out the aforementioned fields manually. This option, however, excludes the Actual Features field, because medical experts have not yet determined reference baseline features for these trials, so feature generation relies entirely on the LLM. Since there are no reference features, the LLM performs no evaluation, and Step 3 is skipped entirely. Once all information is entered, clicking the "Update" button saves the new trial details into the system, and the LLM receives them for the subsequent steps.
Users can also export the data in JSON format for external use or reporting by clicking the "Download trial information as JSON file" button. For existing trials, the export includes the trial ID, the trial state, and the aforementioned fields; for custom-made trials, it includes the trial state and all fields except Actual Features.
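The export logic can be sketched as a small helper; a hypothetical illustration with assumed key names, not the app's actual serialization code:

```python
import json

# Illustrative field names; the real export's keys may differ.
FIELDS = ["title", "brief_summary", "condition",
          "eligibility_criteria", "intervention", "outcome"]

def export_trial(record: dict, state: str) -> str:
    """Mimics the "Download trial information as JSON file" action."""
    payload = {"trial_state": state}
    payload.update({f: record.get(f, "") for f in FIELDS})
    if state == "existing":
        # Only existing trials carry a trial ID and Actual Features.
        payload["trial_id"] = record.get("trial_id", "")
        payload["actual_features"] = record.get("actual_features", [])
    return json.dumps(payload, indent=2)
```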
This step is straightforward, making it easy for users to either pull up existing data or input new data efficiently. This supports a smooth and effective workflow for research in clinical trial management.
2. Step 2: Generation of Baseline Features
This step follows immediately after the previous one: the LLM (default: gpt-4o with three-shot learning and explanations on) receives the stored information, processes it, and, guided by certain prompts, generates the baseline features that serve as the candidate features, along with their explanations. To achieve this, the LLMs follow two in-context learning approaches. The first is zero-shot learning, where the LLM has no prior example of how the output should look and generates everything from the prompt alone. The second is three-shot learning, where the LLM is shown the details of three trial IDs, which helps it generate features in a format that aligns more closely with the actual reference features. The three trial IDs used for this approach can also be changed. The prompts used for this process are described below:
2.1 System Prompt
This is the prompt the LLM receives for zero-shot learning. It instructs the LLM that it is a clinical trial expert: when given a query containing the detailed trial information from the previous step, its task is to provide a list of baseline features (e.g., age, sex, BMI, blood pressure) that serve as the candidate features, with no extra text in the list, followed by a detailed explanation of why each feature is relevant to the trial's goals.
2.2 System Prompt (Three-Shot)
This prompt is received by the LLM for three-shot learning. Again, it instructs the LLM that it is a clinical trial expert and that, when given a query containing trial information, its job is to provide the list of baseline features along with explanations, much like the zero-shot prompt. The key addition is the inclusion of three example trials (default: NCT00000620, NCT01483560, NCT04280783) that the LLM can view, helping it generate baseline features that align more closely with the actual reference features. The prompt is designed to accept any three trial IDs with no duplicates, so users can experiment by choosing which trials the LLM sees before it produces its output. Both of the above prompts are based on the prompts used in CTBench, with slight modifications.
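The three-shot assembly described above can be sketched roughly as follows. The instruction text here is an invented placeholder (the actual CTBench-derived wording differs), and `build_three_shot_prompt` is a hypothetical helper:

```python
# Placeholder instruction; the app's actual CTBench-based wording differs.
SYSTEM = ("You are a clinical trial expert. Given trial information, "
          "list the baseline features with no extra text.")

def build_three_shot_prompt(examples, query):
    """examples: list of (trial_id, trial_info) pairs. Exactly three
    distinct IDs are required, mirroring the app's no-duplicates rule."""
    ids = [tid for tid, _ in examples]
    if len(ids) != 3 or len(set(ids)) != 3:
        raise ValueError("exactly three distinct example trial IDs required")
    shots = "\n\n".join(f"Trial {tid}:\n{info}" for tid, info in examples)
    return f"{SYSTEM}\n\nExamples:\n{shots}\n\nQuery:\n{query}"
```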
2.3 Explanation Prompt
The LLM can generate the baseline-feature report with or without explanations. If explanations are on, it also receives this additional prompt, which tells the LLM to provide a short, clear explanation for each baseline feature, consistent with the trial's objective, after listing them. All of the prompts discussed are available in the app for users to view (but not edit) and are included in this step's downloadable JSON report.
This step basically has two sections:
2.4 Report
On this page, users can click the "Generate" button, which calls the LLM to generate the baseline features with or without explanations (default: gpt-4o with three-shot learning and explanations on) using the trial details saved in the previous step. The report contains the trial ID (set to "custom" for custom-made trials), the title, the suggested baseline features (with or without explanations), the LLM used, and the in-context learning setting, including the three examples picked for three-shot learning. As in the previous step, users can download the report as a JSON file by clicking "Download generation report as JSON file"; this file includes the report as well as the prompts used for generation.
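The report's contents can be pictured as a simple structure; a sketch with assumed key names (the real file also embeds the prompts used):

```python
# Hypothetical shape of the downloadable generation report; key names
# are assumptions, and the actual file also includes the prompts.
DEFAULT_EXAMPLES = ["NCT00000620", "NCT01483560", "NCT04280783"]

def make_generation_report(trial_id, title, features,
                           llm="gpt-4o", setting="three-shot",
                           examples=None, explanations=True):
    return {
        "trial_id": trial_id,  # "custom" for custom-made trials
        "title": title,
        "suggested_baseline_features": features,
        "llm": llm,
        "in_context_setting": setting,
        "three_shot_examples": (examples or DEFAULT_EXAMPLES)
                               if setting == "three-shot" else [],
        "explanations_included": explanations,
    }
```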
2.5 Options
This page provides a range of options for users to experiment with. It has a dropdown for selecting the LLM used for generation, a toggle between three-shot and zero-shot learning, and the ability to pick the trial IDs used for three-shot learning, provided there are no duplicates. There is also an option to include or exclude explanations for users who prefer a simpler report. Once users are satisfied with their choices, clicking the "Update" button (distinct from the Step 1 "Update" button) generates a report according to their preferences.
This step gives users first-hand experience experimenting with different LLMs according to their preferences. The candidate baseline features generated here are passed on to the next step, where the LLM compares them with the actual reference features.
3. Step 3: Evaluation using LLM as a judge
This is the final step of the app. Here, the candidate baseline features generated in the previous step are evaluated and matched against the actual reference features by the LLM (default: gpt-4o), using an evaluation prompt based on CTBench with slight modifications.
3.1 Evaluation Prompt
This is the final prompt received by the LLM. It instructs the LLM that it is an expert in the medical field and clinical trial design with access to two lists, one of actual reference features and one of candidate baseline features, and that its task is to match each candidate feature with the reference feature that is most semantically similar in meaning and context. It is strictly instructed to match each reference and candidate feature at most once; if more than one candidate feature aligns with a reference, it must choose the one that fits best, and it must also provide the lists of remaining unmatched reference and candidate features. The final answer is a JSON object in the following format:
{
"matched_features": [["<reference feature 1>", "<candidate feature 1>"], ["<reference feature 2>", "<candidate feature 2>"]],
"remaining_reference_features": ["<unmatched reference feature 1>", "<unmatched reference feature 2>"],
"remaining_candidate_features": ["<unmatched candidate feature 1>", "<unmatched candidate feature 2>"]
}
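Once the judge returns this JSON object, downstream code only needs to parse it and count the matches. A minimal sketch (hypothetical helper, not the app's actual post-processing):

```python
import json

def parse_evaluation(raw: str) -> dict:
    """Parse the judge LLM's JSON answer (format shown above) and derive
    simple match counts."""
    result = json.loads(raw)
    return {
        "matched": len(result["matched_features"]),
        "unmatched_reference": len(result["remaining_reference_features"]),
        "unmatched_candidate": len(result["remaining_candidate_features"]),
    }
```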