# Setup Pipeline Page
nipdep edited this page Mar 17, 2025
The Pipeline Setup Page allows users to configure a four-step pipeline essential for fact verification and answer generation. Each step builds upon the previous one, ensuring a structured approach to extracting, validating, and processing information.
## Pipeline Steps
- System Prompt Generation – Create an optimal prompt template that integrates dataset values into the system prompt.
- Information Extraction – Define a function that extracts the key information requiring validation from the generated answers.
- SPARQL Query Construction – Develop a function that formulates a SPARQL query to retrieve relevant knowledge from a knowledge graph (KG) based on extracted information.
- Post-Processing – Since SPARQL responses may not always follow the expected format, this step ensures the extracted information is structured correctly, transforming and concatenating it into a standardized format.
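Conceptually, the four steps chain into a single flow. The sketch below is purely illustrative — none of these function names or bodies come from the tool itself; they only show how the stages hand data to one another:

```python
# A minimal, illustrative sketch of the four-step pipeline.
# All names and logic here are hypothetical, not the tool's actual API.

def build_system_prompt(template: str, row: dict) -> str:
    """Step 1: integrate dataset values into the system prompt."""
    return template.format(**row)

def extract_information(answer: str) -> str:
    """Step 2: pull the claim that needs validation out of an answer.
    Naively takes the last word; a real extractor is generated from
    your instructions."""
    return answer.rstrip(".").split()[-1]

def build_sparql_query(template: str, entity: str) -> str:
    """Step 3: insert the extracted entity into the query template."""
    return template % entity

def post_process(bindings: list) -> str:
    """Step 4: concatenate raw SPARQL bindings into one standard string."""
    return "; ".join(b["value"] for b in bindings)
```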
At each stage, the system provides options to view, update, and test the generated functions before proceeding. All four steps must be completed before finalizing the pipeline. The UI enforces a sequential order, where the next stage is unlocked only after successfully setting up and accepting the previous step.
The setup therefore proceeds in this order:
- Configure the System Prompt.
- Define the Information Extractor.
- Set up the SPARQL Query.
- Configure the Post-Processor.
## Step 1: System Prompt Generation

| Field | Description | Required | Input Type |
|---|---|---|---|
| System Prompt Instruction | Defines the context and scenario for generating the system prompt. | Yes | String |
| User Prompt | Specifies the task that the system should generate responses for. | Yes | String |
| Upload Sample Data | Provides a dataset that defines the expected data fields (case-sensitive). | Yes | CSV |
| Generated Text | Optionally edit or update the system-generated prompt. | No | String |
| Upload Test Data | Allows testing the generated system prompt with sample data. | No | CSV |
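For example, a system-prompt template might embed dataset fields as placeholders. The template and row below are hypothetical; what matters is that placeholder names match the column headers of the uploaded sample CSV exactly (case-sensitive):

```python
# Hypothetical system-prompt template. Placeholder names must match the
# sample CSV's column headers exactly (case-sensitive).
SYSTEM_PROMPT_TEMPLATE = (
    "You are a fact-verification assistant working in the {domain} domain. "
    "Ground every answer about {subject} in the knowledge graph."
)

row = {"domain": "physics", "subject": "Marie Curie"}  # one sample-data row
system_prompt = SYSTEM_PROMPT_TEMPLATE.format(**row)
```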
## Step 2: Information Extraction

| Field | Description | Required | Input Type |
|---|---|---|---|
| Instruction | Clear guidelines on what information should be extracted, preferably with examples. | Yes | String |
| Generated Text | Optionally refine the system-generated extraction function. | No | String |
| Test Input | A sample string to validate the information extraction process. | No | CSV |
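As a sketch of what a generated extractor might look like — the real function is produced from your Instruction text, so this regex-based example is purely hypothetical:

```python
import re

# Illustrative only: pulls a double-quoted entity name out of a
# generated answer. The actual extractor is generated from the
# Instruction field above.
def extract_entity(answer: str):
    match = re.search(r'"([^"]+)"', answer)
    return match.group(1) if match else None
```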
## Step 3: SPARQL Query Construction

| Field | Description | Required | Input Type |
|---|---|---|---|
| SPARQL Endpoint | URL of the knowledge graph SPARQL endpoint. | Yes | String (URL) |
| Query Template | A well-structured SPARQL query template. Use `%s` as a placeholder for dynamic values. | Yes | String (SPARQL Query) |
| Test Input | Entity name to be inserted into `%s` for testing query execution. | Yes | String |
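A hypothetical template for a DBpedia-style endpoint illustrates the `%s` convention (such endpoints predeclare the `rdfs:` and `dbo:` prefixes; adjust vocabulary for your own KG):

```python
# Hypothetical query template; %s marks the dynamic slot that the
# pipeline fills with the extracted entity.
QUERY_TEMPLATE = (
    'SELECT ?abstract WHERE { '
    '?s rdfs:label "%s"@en ; dbo:abstract ?abstract . '
    'FILTER (lang(?abstract) = "en") } LIMIT 1'
)

def build_query(template: str, entity: str) -> str:
    """Substitute the test entity into the %s placeholder."""
    return template % entity
```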
## Step 4: Post-Processing

| Field | Description | Required | Input Type |
|---|---|---|---|
| Instruction | Detailed guidelines on how to process each column in the SPARQL query response. Use `` markers to denote column names. | Yes | String |
| Generated Text | Optionally refine the system-generated post-processing function. | No | String |
| Test Input | A sample string to validate the post-processing logic. | No | CSV |
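To see why this step is needed: the standard SPARQL JSON results format wraps every cell in a `{"type": ..., "value": ...}` binding, so raw responses rarely match the format downstream steps expect. A hypothetical post-processor might flatten one column like this:

```python
# Illustrative post-processor for SPARQL JSON results. The actual
# function is generated from the Instruction field above.
def post_process(results: dict, column: str) -> str:
    """Collect one column's values from the bindings and join them."""
    bindings = results.get("results", {}).get("bindings", [])
    values = [b[column]["value"] for b in bindings if column in b]
    return " | ".join(values)
```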
Upon successful setup, the configured pipeline will be structured as follows:
```json
{
  "dataset_structure": ["List of expected column names"],
  "system_prompt_builder": "Function to integrate dataset values into the system prompt",
  "user_prompt": "User-defined task prompt",
  "information_extraction": "Function for extracting relevant information",
  "sparql_endpoint": "SPARQL knowledge graph endpoint",
  "sparql_query_template": "Template for querying SPARQL endpoint",
  "query_function": "Function executing the SPARQL query",
  "post_processor": "Function for processing SPARQL query responses"
}
```
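A quick way to sanity-check a saved pipeline configuration is to confirm that all eight keys are present. The sketch below is an assumption about how one might validate it; only the key names are taken from the structure above:

```python
import json

# Key names taken from the pipeline structure shown above.
REQUIRED_KEYS = {
    "dataset_structure", "system_prompt_builder", "user_prompt",
    "information_extraction", "sparql_endpoint", "sparql_query_template",
    "query_function", "post_processor",
}

def missing_pipeline_keys(config_json: str) -> set:
    """Return the set of required keys absent from a saved pipeline."""
    config = json.loads(config_json)
    return REQUIRED_KEYS - config.keys()
```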