Lin, Tianyan edited this page Mar 23, 2025 · 27 revisions

Welcome to the CTSuggest wiki!

Step 1: Trial Specification

The first step of the CTSuggest app serves as a user-friendly starting point for both reviewing existing clinical trials and creating custom ones. Once a trial is selected, the following fields are populated with detailed information:

  • Title: The official name of the trial.
  • Brief Summary: A concise overview of what the trial is about.
  • Condition: The health condition or disease that the trial addresses.
  • Eligibility Criteria: The inclusion and exclusion criteria for trial participants.
  • Intervention: Descriptions of the interventions being tested.
  • Outcome: The outcome measures of the trial.

This step is divided into two sections: Existing Trials and Custom-Made Trials. The fields above appear in both sections.

1.1 Existing Trials

Here, users can look up and review details of existing clinical trials. The datasets used are CT-Pub and CT-Repo, both built from publicly available records on clinicaltrials.gov. The section begins with a dropdown menu where users select a clinical trial ID (default: NCT00126737); the dropdown includes a search feature for quicker access.

Clicking the "Load an existing trial" button populates these fields with all relevant information for use by the LLM.

  • Actual Features: Baseline features determined by medical experts that serve as a reference for the LLM.

⚠️ The Actual Features field appears only for existing trials.
After selecting a trial, users should click only "Load an existing trial"; other actions may confuse the LLM.
If users modify any data, they must click "Update" to store the changes. Note: evaluation (Step 3) will not run on modified data.

1.2 Custom-Made Trials

Users can create a custom trial by manually filling out the fields and clicking "Create a new trial".

This option excludes Actual Features since there are no reference features.
As a result, Step 3 (Evaluation) will be skipped.

Once all details are entered, users click "Update" to save the trial. The LLM will receive this data for the next step.

Users can export:

  • For existing trials: trial ID, trial state, all fields (including Actual Features)
  • For custom trials: trial state, all fields (excluding Actual Features)

Export is available as a JSON file via the "Download trial information as JSON file" button.
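The exported JSON presumably follows the shape described above. A minimal sketch, assuming illustrative key names (they are not the app's documented schema):

```python
import json

# Hypothetical export payload for an existing trial; key names are
# illustrative, not CTSuggest's exact schema.
export = {
    "trial_id": "NCT00126737",   # present for existing trials only
    "trial_state": "existing",
    "fields": {
        "title": "Example Trial",
        "brief_summary": "A short overview of the trial.",
        "condition": "Type 2 Diabetes",
        "eligibility_criteria": "Adults aged 18-65.",
        "intervention": "Drug A vs. placebo",
        "outcome": "Change in HbA1c at 24 weeks",
        # Omitted entirely for custom trials.
        "actual_features": ["Age", "Sex", "BMI"],
    },
}

# Serialize as the downloadable JSON file would be.
payload = json.dumps(export, indent=2)
```

For a custom trial, `trial_id` and `actual_features` would simply be absent, matching the export rules listed above.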


Step 2: Generation of Baseline Features

In this step, the LLM (default: gpt-4o, with three-shot learning and explanations enabled) receives the trial data and generates candidate baseline features with explanations.

Two learning approaches are used:

  • Zero-shot learning: No prior examples are shown to the LLM.
  • Three-shot learning: The LLM sees three example trials before generating output.

The three-shot examples (defaults: NCT00000620, NCT01483560, NCT04280783) are user-editable and must be unique.
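The uniqueness requirement can be enforced with a simple check. A sketch, using the default IDs from above (the `validate_examples` function is hypothetical):

```python
# Default three-shot example IDs listed above; users may replace them.
DEFAULT_EXAMPLES = ["NCT00000620", "NCT01483560", "NCT04280783"]

def validate_examples(ids):
    """Require exactly three unique NCT IDs for three-shot learning."""
    if len(ids) != 3:
        raise ValueError("three-shot learning needs exactly 3 example trials")
    if len(set(ids)) != len(ids):
        raise ValueError("example trial IDs must be unique")
    return ids

validate_examples(DEFAULT_EXAMPLES)  # passes: three unique IDs
```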

2.1 System Prompt (Zero-Shot)

The LLM is instructed to act as a clinical trial expert. Given trial details, it generates:

  • A list of relevant baseline features (e.g., age, sex, BMI, blood pressure)
  • An explanation for each feature

No unrelated text is allowed in the output.

2.2 System Prompt (Three-Shot)

Same as above, but includes three example trials to help align the format with actual reference features.

The prompts are based on those from CTBench, with slight modifications.
Users can replace the example trials (duplicates are not allowed).
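Putting 2.1 and 2.2 together, the prompt assembly might look like the following sketch. The prompt wording is paraphrased (not the exact CTBench text), and `build_messages` is a hypothetical helper:

```python
# Paraphrased system prompt; the app's actual wording follows CTBench.
SYSTEM_PROMPT = (
    "You are a clinical trial expert. Given the trial details, list the "
    "relevant baseline features with an explanation for each. "
    "Do not include any unrelated text in the output."
)

def build_messages(trial_text, examples=None):
    """Zero-shot if examples is None; three-shot otherwise."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for ex in examples or []:
        # Each example pairs a trial with its reference baseline features,
        # helping the model align its output format (see 2.2).
        messages.append({"role": "user", "content": ex["trial"]})
        messages.append({"role": "assistant", "content": ex["features"]})
    messages.append({"role": "user", "content": trial_text})
    return messages

msgs = build_messages("Title: Example Trial\nCondition: Type 2 Diabetes")
```

In the zero-shot case this yields just the system prompt plus the trial; in the three-shot case, three user/assistant example pairs sit in between.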

2.3 Explanation Prompt

If explanations are enabled, the LLM receives this extra prompt to provide a rationale for each baseline feature.

All prompts are view-only in the app and can be downloaded as part of the generation report (JSON).

2.4 Report

Clicking "Generate" produces a report that includes:

  • Trial ID (a custom ID for custom trials)
  • Title
  • Suggested baseline features (with or without explanations)
  • LLM used
  • In-context learning setting
  • Three-shot example IDs (if used)

Download the full report and prompts via "Download generation report as JSON file".

2.5 Options

Users can configure:

  • LLM selection
  • Zero-shot vs. three-shot learning
  • Custom three-shot examples
  • Enable/disable explanations

Once configured, clicking "Update" will regenerate the report with the selected options.

This allows hands-on experimentation with LLMs and in-context setups.


Step 3: Evaluation Using LLM as a Judge

This step is only available for existing trials. The LLM compares the generated candidate features against expert-defined reference features.

3.1 Evaluation Prompt

The LLM is instructed to:

  1. Match each candidate feature to the most semantically similar reference feature
  2. Match each only once
  3. If multiple candidates fit a reference, choose the best one
  4. Report unmatched features and hallucinations

Output Format (JSON):

```json
{
  "matched_features": [
    ["<reference feature 1>", "<candidate feature 1>"],
    ["<reference feature 2>", "<candidate feature 2>"]
  ],
  "remaining_reference_features": ["<unmatched reference 1>", "<unmatched reference 2>"],
  "remaining_candidate_features": ["<unmatched candidate 1>", "<unmatched candidate 2>"]
}
```
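From the matched and remaining lists, standard retrieval-style metrics can be derived. A sketch, assuming precision/recall/F1 as the scoring scheme (the metric names and the `evaluation_metrics` function are assumptions, not the app's documented output):

```python
def evaluation_metrics(result):
    """Precision/recall/F1 from the judge's JSON output (format above)."""
    matched = len(result["matched_features"])
    missed = len(result["remaining_reference_features"])   # references not matched
    extra = len(result["remaining_candidate_features"])    # candidates not matched
    precision = matched / (matched + extra) if matched + extra else 0.0
    recall = matched / (matched + missed) if matched + missed else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}

# Illustrative judge output: 2 matches, 1 missed reference, 1 extra candidate.
result = {
    "matched_features": [["Age", "age (years)"], ["Sex", "sex"]],
    "remaining_reference_features": ["BMI"],
    "remaining_candidate_features": ["smoking status"],
}
m = evaluation_metrics(result)
```

Here both precision and recall come out to 2/3: two matched features against one hallucinated candidate and one missed reference feature.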

