Setup Pipeline Page

nipdep edited this page Mar 17, 2025 · 2 revisions

Introduction

The Pipeline Setup Page lets users configure the four-step pipeline used for fact verification and answer generation. Each step builds on the previous one: a system prompt is generated, key information is extracted from the answer, that information is looked up in a knowledge graph, and the result is converted to a standardized format.

Pipeline Steps

  1. System Prompt Generation – Create an optimal prompt template that integrates dataset values into the system prompt.
  2. Information Extraction – Define a function that extracts, from the generated answers, the key information that requires validation.
  3. SPARQL Query Construction – Develop a function that formulates a SPARQL query to retrieve relevant knowledge from a knowledge graph (KG) based on extracted information.
  4. Post-Processing – Since SPARQL responses may not always follow the expected format, this step ensures the extracted information is structured correctly, transforming and concatenating it into a standardized format.

At each stage, the system provides options to view, update, and test the generated function before proceeding. All four steps must be completed before the pipeline can be finalized. The UI enforces a sequential order: each stage is unlocked only after the previous one has been set up and accepted.

Operational Flow

  1. Configure the System Prompt.
  2. Define the Information Extractor.
  3. Set up the SPARQL Query.
  4. Configure the Post-Processor.
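
The sequential unlocking described above can be pictured as a small state tracker. This is only a sketch of the rule, not the page's actual implementation; the class name `PipelineSetup` and the step identifiers are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the four stages, in their required order.
STEPS = ["system_prompt", "information_extraction", "sparql_query", "post_processing"]


@dataclass
class PipelineSetup:
    """Tracks which stages have been accepted so far."""
    accepted: list = field(default_factory=list)

    def unlocked_step(self):
        """Return the one stage that may be configured now, or None when all are done."""
        if len(self.accepted) < len(STEPS):
            return STEPS[len(self.accepted)]
        return None

    def accept(self, step):
        """Accept a stage; any stage other than the unlocked one is rejected."""
        if step != self.unlocked_step():
            raise ValueError(f"{step!r} is locked; expected {self.unlocked_step()!r}")
        self.accepted.append(step)

    def is_complete(self):
        """The pipeline can be finalized only when all four stages are accepted."""
        return len(self.accepted) == len(STEPS)


setup = PipelineSetup()
setup.accept("system_prompt")
print(setup.unlocked_step())
```

Accepting `"sparql_query"` before `"information_extraction"` raises `ValueError` in this sketch, which mirrors the locked stage in the UI.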

Pipeline Configuration Inputs

1. System Prompt Setup

| Field | Description | Required | Input Type |
| --- | --- | --- | --- |
| System Prompt Instruction | Defines the context and scenario for generating the system prompt. | Yes | String |
| User Prompt | Specifies the task that the system should generate responses for. | Yes | String |
| Upload Sample Data | Provides a dataset that defines the expected data fields (case-sensitive). | Yes | CSV |
| Generated Text | Optionally edit or update the system-generated prompt. | No | String |
| Upload Test Data | Allows testing the generated system prompt with sample data. | No | CSV |
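
As a rough illustration of how dataset values could be integrated into a system prompt, the sketch below reads the column names from a sample CSV and fills a template from one row. The `{Field}` placeholder syntax, the function names, and the sample data are all assumptions made for this example; the prompt builder the page actually generates may look different.

```python
import csv
import io
from string import Formatter


def expected_fields(sample_csv_text):
    """Read the header row of the sample CSV; names are kept exactly as written."""
    reader = csv.DictReader(io.StringIO(sample_csv_text))
    return list(reader.fieldnames or [])


def build_system_prompt(template, row, fields):
    """Fill the template from one dataset row, rejecting unknown placeholders."""
    placeholders = {name for _, name, _, _ in Formatter().parse(template) if name}
    unknown = placeholders - set(fields)
    if unknown:
        # Field names are case-sensitive, so {name} does not match a "Name" column.
        raise KeyError(f"template uses fields not in the sample data: {sorted(unknown)}")
    return template.format(**{name: row[name] for name in placeholders})


# Made-up sample data for illustration only.
sample = "Name,Country\nMarie Curie,Poland\n"
fields = expected_fields(sample)
row = next(csv.DictReader(io.StringIO(sample)))
prompt = build_system_prompt(
    "You are verifying facts about {Name}, who was born in {Country}.", row, fields
)
print(prompt)
```

Checking placeholders against the sample header before formatting surfaces a case mismatch as an explicit error rather than a half-filled prompt.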

2. Information Extraction Setup

| Field | Description | Required | Input Type |
| --- | --- | --- | --- |
| Instruction | Clear guidelines on what information should be extracted, preferably with examples. | Yes | String |
| Generated Text | Optionally refine the system-generated extraction function. | No | String |
| Test Input | A sample string to validate the information extraction process. | No | CSV |
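
What an extraction function looks like depends entirely on the instruction given. As one hypothetical case, suppose the instruction asks the model to wrap every entity that needs validation in double square brackets; a matching extractor could then be as small as the sketch below. The `[[...]]` convention and the function name are assumptions for illustration, not something the page prescribes.

```python
import re


def extract_information(answer):
    """Return the distinct [[...]]-tagged entities in order of first appearance."""
    entities = []
    for match in re.findall(r"\[\[(.+?)\]\]", answer):
        entity = match.strip()
        if entity and entity not in entities:
            entities.append(entity)
    return entities


# A made-up test input of the kind that could be pasted into the Test Input field.
test_input = "[[Marie Curie]] shared the prize with [[Pierre Curie]]; [[Marie Curie]] later won again."
print(extract_information(test_input))
```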

3. SPARQL Query Setup

| Field | Description | Required | Input Type |
| --- | --- | --- | --- |
| SPARQL Endpoint | URL of the knowledge graph SPARQL endpoint. | Yes | String (URL) |
| Query Template | A well-structured SPARQL query template. Use %s as a placeholder for dynamic values. | Yes | String (SPARQL Query) |
| Test Input | Entity name to be inserted into %s for testing query execution. | Yes | String |
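
To make the `%s` placeholder concrete, the sketch below fills a template with an entity name and prepares (without sending) an HTTP GET request in the style of the SPARQL 1.1 Protocol. It assumes the placeholder is filled with Python's `%` operator, which is a guess about the tool's internals; the endpoint URL, the query, and the function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def escape_literal(value):
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_query(template, entity):
    """Insert the escaped entity name into the template's single %s placeholder."""
    return template % escape_literal(entity)


def build_request(endpoint, query):
    """Prepare a GET request asking for results in the SPARQL JSON format."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Hypothetical template and endpoint, for illustration only.
TEMPLATE = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "SELECT ?subject WHERE { ?subject rdfs:label \"%s\"@en . } LIMIT 5"
)
query = build_query(TEMPLATE, "Marie Curie")
request = build_request("https://example.org/sparql", query)
print(query)
```

Under this assumption, any other literal percent sign in the template would have to be written as `%%`, and the template must contain exactly one `%s`.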

4. Post-Processing Setup

| Field | Description | Required | Input Type |
| --- | --- | --- | --- |
| Instruction | Detailed guidelines on how to process each column in the SPARQL query response. Use backtick markers (``) to denote column names. | Yes | String |
| Generated Text | Optionally refine the system-generated post-processing function. | No | String |
| Test Input | A sample string to validate the post-processing logic. | No | CSV |
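
A post-processing function of the kind described here might flatten a response in the standard SPARQL JSON results layout (`results.bindings`) into one line of text. The sketch below is one possible shape under that assumption; the function names, the `column: value` output format, and the sample response are invented for illustration, and the generated function on the page may take a different input.

```python
def local_name(value):
    """Reduce a URI to its last path or fragment segment; leave literals unchanged."""
    if value.startswith(("http://", "https://")):
        tail = value.rstrip("/").rsplit("/", 1)[-1]
        return tail.rsplit("#", 1)[-1].replace("_", " ")
    return value


def post_process(response, columns):
    """Collect distinct values per column and concatenate them into one string."""
    bindings = response.get("results", {}).get("bindings", [])
    parts = []
    for column in columns:
        values = []
        for row in bindings:
            cell = row.get(column)  # a column may be missing from some rows
            if cell is None:
                continue
            text = local_name(cell["value"])
            if text not in values:
                values.append(text)
        if values:
            parts.append(f"{column}: {', '.join(values)}")
    return "; ".join(parts)


# Made-up response in the SPARQL JSON results layout.
response = {
    "head": {"vars": ["birthPlace", "birthDate"]},
    "results": {"bindings": [
        {"birthPlace": {"type": "uri", "value": "http://example.org/resource/Warsaw"},
         "birthDate": {"type": "literal", "value": "1867-11-07"}},
        {"birthPlace": {"type": "uri", "value": "http://example.org/resource/Congress_Poland"}},
    ]},
}
print(post_process(response, ["birthPlace", "birthDate"]))
```

Rows that lack a column are skipped rather than treated as errors, since responses may not always follow the expected format.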

Pipeline Output Structure

Upon successful setup, the configured pipeline will be structured as follows:

{
    "dataset_structure": ["List of expected column names"],
    "system_prompt_builder": "Function to integrate dataset values into the system prompt",
    "user_prompt": "User-defined task prompt",
    "information_extraction": "Function for extracting relevant information",
    "sparql_endpoint": "SPARQL knowledge graph endpoint",
    "sparql_query_template": "Template for querying SPARQL endpoint",
    "query_function": "Function executing the SPARQL query",
    "post_processor": "Function for processing SPARQL query responses"
}
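
Before using a saved pipeline, it can help to confirm that every key listed above is present and non-empty. The check below is a minimal sketch; only the key names come from the structure above, while the function name and the sample values are hypothetical.

```python
# Key names taken from the output structure above.
REQUIRED_KEYS = (
    "dataset_structure",
    "system_prompt_builder",
    "user_prompt",
    "information_extraction",
    "sparql_endpoint",
    "sparql_query_template",
    "query_function",
    "post_processor",
)


def missing_keys(pipeline):
    """Return the required keys that are absent or hold an empty value."""
    missing = []
    for key in REQUIRED_KEYS:
        value = pipeline.get(key)
        if value is None or value == "" or value == []:
            missing.append(key)
    return missing


# A deliberately incomplete configuration, for illustration only.
partial = {
    "dataset_structure": ["Name", "Country"],
    "user_prompt": "Verify the birthplace.",
    "sparql_endpoint": "",
}
print(missing_keys(partial))
```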