Experiment Run Page
nipdep edited this page Mar 17, 2025
The Experiment Page provides the functionality to set up and run experiments on your dataset. The application supports two experiment modes:
- Answer Generation – The system generates answers for the user prompt using the provided dataset without any fact verification.
- Fact Extraction & Verification – In addition to generating answers, the system performs a structured verification process, including information extraction, SPARQL queries, and post-processing to validate facts against a knowledge graph (KG). To use this mode, you must have a pre-configured pipeline.
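The verification flow in Fact Extraction mode can be sketched as a chain of extraction, SPARQL querying, and post-processing. The sketch below is illustrative only: `extract_facts`, the triple format, and the stubbed `query_fn` are placeholders for the functions a real pipeline file would supply.

```python
# Minimal sketch of the Fact Extraction & Verification flow.
# All names here are illustrative; the uploaded pipeline file supplies
# the real extraction, query, and post-processing functions.

SPARQL_TEMPLATE = "ASK {{ <{subject}> <{predicate}> <{object}> }}"

def extract_facts(answer: str) -> list[tuple[str, str, str]]:
    # Placeholder extractor: pretend each line of the generated answer
    # encodes one triple as "subject|predicate|object".
    return [tuple(line.split("|")) for line in answer.splitlines() if "|" in line]

def verify_facts(answer: str, query_fn) -> list[bool]:
    # Render one SPARQL ASK query per extracted triple and run it
    # against the knowledge graph via the supplied query function.
    results = []
    for s, p, o in extract_facts(answer):
        query = SPARQL_TEMPLATE.format(subject=s, predicate=p, object=o)
        results.append(query_fn(query))
    return results

# Stubbed query function standing in for a real SPARQL endpoint call.
known = {"ASK { <Berlin> <capitalOf> <Germany> }"}
checks = verify_facts("Berlin|capitalOf|Germany", lambda q: q in known)
```

In a real run, `query_fn` would POST the rendered query to the configured SPARQL endpoint instead of checking a local set.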
1. Choose the setup method:
   - Select whether to build the experiment using a Prompt-based setup or a Pipeline file.
   - Prompt-based setup only generates answers and does not perform fact verification.
   - Pipeline file setup generates answers and extracts facts using predefined functions from the uploaded pipeline file.
2. Upload your dataset – Provide the dataset file containing the input data for the experiment.
3. Select LLMs – Choose the Large Language Models (LLMs) to run the experiment against.
4. Set the number of retries (optional) – To generate multiple versions of an answer for a single query, specify the number of retries.
5. Submit the experiment – Upon submission, the system:
   - Validates all experiment configurations to ensure inputs are correctly provided.
   - Builds the execution pipeline based on the selected setup method.
   - Executes the pipeline across the selected LLMs over the dataset.
6. Receive confirmation and view results:
   - A confirmation popup indicates the experiment run status.
   - Results are displayed in a structured table, with an option to download the full experiment report.
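The execution step above amounts to a nested loop over LLMs, dataset rows, and retries. A minimal sketch, with `generate` standing in for a real LLM call (the actual runner and record fields may differ):

```python
# Illustrative run loop: every selected LLM answers every dataset row,
# repeated once per retry attempt. `generate` is a stand-in for a real
# LLM invocation.

def run_experiment(llms, dataset, retries, generate):
    results = []
    for llm in llms:
        for query_id, row in enumerate(dataset):
            for attempt in range(retries):
                results.append({
                    "llm": llm,
                    "query_id": query_id,
                    "try": attempt,
                    "generated_answer": generate(llm, row),
                })
    return results

results = run_experiment(
    llms=["model-a", "model-b"],
    dataset=[{"question": "capital of France?"}],
    retries=2,
    generate=lambda llm, row: f"{llm} answer to {row['question']}",
)
# 2 LLMs x 1 row x 2 retries -> 4 result records
```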
| Field | Description | Required | Input Type |
|---|---|---|---|
| Setup Method | Select "Answer Generation" for response generation only, or "Fact Extraction" to include fact verification. | Yes | Selection |
| System Prompt | Defines the system prompt template for Answer Generation mode. Use {{column_name}} to dynamically insert dataset values into queries. | Yes | String |
| User Prompt | Specifies the task or query for generating answers. | Yes | String |
| Pipeline Configuration File | Upload a pre-configured pipeline file (only required for Fact Extraction mode). | Yes (if using Fact Extraction) | JSON |
| Dataset File | Upload the dataset containing queries and relevant input information. | Yes | CSV |
| Number of Retries | Number of times the same query should be tested with an LLM. | No | Integer |
| LLMs | Select the Large Language Models to run the experiment against. | Yes | Category |
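To illustrate the `{{column_name}}` placeholder syntax from the System Prompt field, here is one plausible way such templates could be filled from a dataset row (the application's exact substitution rules may differ):

```python
import re

# Sketch: fill {{column_name}} placeholders in a prompt template with
# values from one dataset row. Column names are assumed to be simple
# word characters.

def build_system_prompt(template: str, row: dict) -> str:
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)

prompt = build_system_prompt(
    "Answer questions about {{country}} using facts from {{source}}.",
    {"country": "France", "source": "Wikidata"},
)
```

A missing column would raise a `KeyError` here; a production implementation would likely validate placeholders against the dataset's column names first.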
The system generates structured experiment results in the following format:
```json
{
  "experiment_info": {
    "datetime": "Timestamp of the experiment run",
    "mode": "Experiment mode; 'Answer Generation' or 'Fact Extraction'",
    "retries": "Number of retries per query",
    "LLMs": "LLMs used in the experiment"
  },
  "pipeline": {
    "dataset_structure": ["List of expected column names"],
    "system_prompt_builder": "Function used to incorporate dataset values into the system prompt",
    "user_prompt": "Defined user prompt",
    "information_extraction": "Function responsible for extracting relevant facts",
    "sparql_endpoint": "SPARQL knowledge graph endpoint",
    "sparql_query_template": "Template for SPARQL queries",
    "query_function": "Function used to execute SPARQL queries",
    "post_processor": "Function handling post-processing of SPARQL responses"
  },
  "dataset": ["Dataset contents"],
  "results": [
    {
      "llm": "LLM used to generate the answer",
      "query_id": "Index of the query in the dataset",
      "try": "Retry attempt number",
      "system_prompt": "Final system prompt used after dataset values were inserted",
      "generated_answer": "Response generated by the LLM",
      "extracted_info": "Comma-separated string of extracted factual information",
      "<post-processed column names>": "Values produced after post-processing (varies based on configuration)"
    }
  ]
}
```
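Since the downloaded report is plain JSON, it is straightforward to consume programmatically. A minimal sketch, using a toy two-row report in place of a real download, groups generated answers by LLM:

```python
import json

# Sketch: load a downloaded experiment report and collect the generated
# answers per LLM. The toy report below stands in for a real file.

report_json = """
{
  "experiment_info": {"mode": "Answer Generation", "retries": 1, "LLMs": ["model-a"]},
  "results": [
    {"llm": "model-a", "query_id": 0, "try": 0, "generated_answer": "Paris"},
    {"llm": "model-a", "query_id": 1, "try": 0, "generated_answer": "Berlin"}
  ]
}
"""

report = json.loads(report_json)
per_llm = {}
for record in report["results"]:
    per_llm.setdefault(record["llm"], []).append(record["generated_answer"])
```

The same loop works for Fact Extraction reports; those records additionally carry `extracted_info` and the post-processed columns.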