# Setup Pipeline Page
nipdep edited this page Mar 17, 2025
The Pipeline Setup Page allows users to configure a four-step pipeline essential for fact verification and answer generation. Each step builds upon the previous one, ensuring a structured approach to extracting, validating, and processing information.
## Pipeline Steps
- System Prompt Generation – Create an optimal prompt template that integrates dataset values into the system prompt.
- Information Extraction – Define a function that extracts the key information requiring validation from the generated answers.
- SPARQL Query Construction – Develop a function that formulates a SPARQL query to retrieve relevant knowledge from a knowledge graph (KG) based on extracted information.
- Post-Processing – Since SPARQL responses may not always follow the expected format, this step ensures the extracted information is structured correctly, transforming and concatenating it into a standardized format.
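Conceptually, the four steps chain into a single flow. The sketch below is purely illustrative — none of these function names or bodies come from the tool itself; they only show how the stages hand data to one another:

```python
# A minimal, illustrative sketch of the four-step pipeline.
# All names and logic here are hypothetical, not the tool's actual API.

def build_system_prompt(template: str, row: dict) -> str:
    """Step 1: integrate dataset values into the system prompt."""
    return template.format(**row)

def extract_information(answer: str) -> str:
    """Step 2: pull the claim that needs validation out of an answer.
    Naively takes the last word; a real extractor is generated from
    your instructions."""
    return answer.rstrip(".").split()[-1]

def build_sparql_query(template: str, entity: str) -> str:
    """Step 3: insert the extracted entity into the query template."""
    return template % entity

def post_process(bindings: list) -> str:
    """Step 4: concatenate raw SPARQL bindings into one standard string."""
    return "; ".join(b["value"] for b in bindings)
```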
At each stage, the system provides options to view, update, and test the generated functions before proceeding. All four steps must be completed before finalizing the pipeline. The UI enforces a sequential order, where the next stage is unlocked only after successfully setting up and accepting the previous step.
The setup therefore proceeds in this order:
- Configure the System Prompt.
- Define the Information Extractor.
- Set up the SPARQL Query.
- Configure the Post-Processor.
## Step 1: System Prompt Generation

| Field | Description | Required | Input Type |
|---|---|---|---|
| System Prompt Instruction | Defines the context and scenario for generating the system prompt. | Yes | String |
| User Prompt | Specifies the task that the system should generate responses for. | Yes | String |
| Upload Sample Data | Provides a dataset that defines the expected data fields (case-sensitive). | Yes | CSV |
| Generated Text | Optionally edit or update the system-generated prompt. | No | String |
| Upload Test Data | Allows testing the generated system prompt with sample data. | No | CSV |
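For example, a system-prompt template might embed dataset fields as placeholders. The template and row below are hypothetical; what matters is that placeholder names match the column headers of the uploaded sample CSV exactly (case-sensitive):

```python
# Hypothetical system-prompt template. Placeholder names must match the
# sample CSV's column headers exactly (case-sensitive).
SYSTEM_PROMPT_TEMPLATE = (
    "You are a fact-verification assistant working in the {domain} domain. "
    "Ground every answer about {subject} in the knowledge graph."
)

row = {"domain": "physics", "subject": "Marie Curie"}  # one sample-data row
system_prompt = SYSTEM_PROMPT_TEMPLATE.format(**row)
```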
## Step 2: Information Extraction

| Field | Description | Required | Input Type |
|---|---|---|---|
| Instruction | Clear guidelines on what information should be extracted, preferably with examples. | Yes | String |
| Generated Text | Optionally refine the system-generated extraction function. | No | String |
| Test Input | A sample string to validate the information extraction process. | No | CSV |
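As a sketch of what a generated extractor might look like — the real function is produced from your Instruction text, so this regex-based example is purely hypothetical:

```python
import re

# Illustrative only: pulls a double-quoted entity name out of a
# generated answer. The actual extractor is generated from the
# Instruction field above.
def extract_entity(answer: str):
    match = re.search(r'"([^"]+)"', answer)
    return match.group(1) if match else None
```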
## Step 3: SPARQL Query Construction

| Field | Description | Required | Input Type |
|---|---|---|---|
| SPARQL Endpoint | URL of the knowledge graph SPARQL endpoint. | Yes | String (URL) |
| Query Template | A well-structured SPARQL query template. Use `%s` as a placeholder for dynamic values. | Yes | String (SPARQL Query) |
| Test Input | Entity name to be inserted into `%s` for testing query execution. | Yes | String |
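A hypothetical template for a DBpedia-style endpoint illustrates the `%s` convention (such endpoints predeclare the `rdfs:` and `dbo:` prefixes; adjust vocabulary for your own KG):

```python
# Hypothetical query template; %s marks the dynamic slot that the
# pipeline fills with the extracted entity.
QUERY_TEMPLATE = (
    'SELECT ?abstract WHERE { '
    '?s rdfs:label "%s"@en ; dbo:abstract ?abstract . '
    'FILTER (lang(?abstract) = "en") } LIMIT 1'
)

def build_query(template: str, entity: str) -> str:
    """Substitute the test entity into the %s placeholder."""
    return template % entity
```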
## Step 4: Post-Processing

| Field | Description | Required | Input Type |
|---|---|---|---|
| Instruction | Detailed guidelines on how to process each column in the SPARQL query response. Use `` markers to denote column names. | Yes | String |
| Generated Text | Optionally refine the system-generated post-processing function. | No | String |
| Test Input | A sample string to validate the post-processing logic. | No | CSV |
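To see why this step is needed: the standard SPARQL JSON results format wraps every cell in a `{"type": ..., "value": ...}` binding, so raw responses rarely match the format downstream steps expect. A hypothetical post-processor might flatten one column like this:

```python
# Illustrative post-processor for SPARQL JSON results. The actual
# function is generated from the Instruction field above.
def post_process(results: dict, column: str) -> str:
    """Collect one column's values from the bindings and join them."""
    bindings = results.get("results", {}).get("bindings", [])
    values = [b[column]["value"] for b in bindings if column in b]
    return " | ".join(values)
```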
Upon successful setup, the configured pipeline will be structured as follows:
```json
{
  "dataset_structure": ["List of expected column names"],
  "system_prompt_builder": "Function to integrate dataset values into the system prompt",
  "user_prompt": "User-defined task prompt",
  "information_extraction": "Function for extracting relevant information",
  "sparql_endpoint": "SPARQL knowledge graph endpoint",
  "sparql_query_template": "Template for querying SPARQL endpoint",
  "query_function": "Function executing the SPARQL query",
  "post_processor": "Function for processing SPARQL query responses"
}
```
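A quick way to sanity-check a saved pipeline configuration is to confirm that all eight keys are present. The sketch below is an assumption about how one might validate it; only the key names are taken from the structure above:

```python
import json

# Key names taken from the pipeline structure shown above.
REQUIRED_KEYS = {
    "dataset_structure", "system_prompt_builder", "user_prompt",
    "information_extraction", "sparql_endpoint", "sparql_query_template",
    "query_function", "post_processor",
}

def missing_pipeline_keys(config_json: str) -> set:
    """Return the set of required keys absent from a saved pipeline."""
    config = json.loads(config_json)
    return REQUIRED_KEYS - config.keys()
```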