nipdep edited this page Mar 17, 2025 · 4 revisions

Introduction

ChatBS-NextGen is a user-friendly application designed to assess the factual accuracy of AI-generated responses using semantic knowledge graphs. By extracting entities from LLM outputs and mapping them to authoritative knowledge sources, it streamlines fact verification, enabling seamless factoid extraction through simple text input.

The platform supports multiple LLMs, multi-query workflows, and iterative experiment pipelines, generating structured, downloadable reports for each run. This makes it well suited to organizations, researchers, and individuals who need to verify AI-generated content reliably.

Detailed description

ChatBS-NextGen automates the complex yet repetitive process of answer generation across multiple LLMs using custom datasets. To check answer consistency, it allows the same query to be run multiple times. The results are compiled into a structured yet easily interpretable report, detailing both the generated responses and experiment logs.
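The repeated-run consistency check described above can be sketched as follows. This is a minimal illustration, not ChatBS-NextGen's actual code: `run_repeated`, `agreement_ratio`, and the `ask` callable (a stand-in for any LLM client) are hypothetical names, and exact match after normalization is an assumed, deliberately simple consistency measure.

```python
from collections import Counter
from typing import Callable, List


def run_repeated(ask: Callable[[str], str], query: str, runs: int) -> List[str]:
    """Send the same query `runs` times and collect the raw answers."""
    return [ask(query) for _ in range(runs)]


def agreement_ratio(answers: List[str]) -> float:
    """Share of runs that returned the most common (normalized) answer."""
    if not answers:
        return 0.0
    # Assumption: trimming whitespace and lowercasing is enough to compare answers.
    normalized = [a.strip().lower() for a in answers]
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)
```

A ratio near 1.0 suggests the model answers the query consistently; free-text answers would likely need a softer comparison than exact match.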

For factual verification, the application follows a structured factoid extraction process. It first identifies the information that needs verification, queries a designated authoritative knowledge graph (KG) for validated facts, and then presents the user with a side-by-side comparison of extracted facts from both the AI-generated response and the KG.
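One possible shape for the side-by-side comparison is sketched below. The data model is an assumption made for illustration: facts are keyed by a hypothetical `(entity, property)` pair, and `compare_facts` and its status labels are not taken from the application.

```python
from typing import Dict, List, Optional, Tuple

FactKey = Tuple[str, str]  # (entity, property), e.g. ("Marie Curie", "birthplace")


def compare_facts(
    llm_facts: Dict[FactKey, str],
    kg_facts: Dict[FactKey, str],
) -> List[Dict[str, Optional[str]]]:
    """Align facts by (entity, property) and label each row."""
    rows: List[Dict[str, Optional[str]]] = []
    for entity, prop in sorted(set(llm_facts) | set(kg_facts)):
        llm_value = llm_facts.get((entity, prop))
        kg_value = kg_facts.get((entity, prop))
        if llm_value is None:
            status = "kg_only"      # the KG holds a fact the response did not state
        elif kg_value is None:
            status = "unverified"   # the response states a fact the KG lacks
        elif llm_value.strip().lower() == kg_value.strip().lower():
            status = "match"
        else:
            status = "mismatch"
        rows.append(
            {"entity": entity, "property": prop,
             "llm": llm_value, "kg": kg_value, "status": status}
        )
    return rows
```

Each returned row can be rendered as one line of a two-column table, with the `status` field driving any highlighting.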

The application provides two dedicated interfaces: one for configuring the pipeline and another for running experiments. Upon logging in, users land on the experiment page, where they can run analyses on their data. With minimal setup, users can start an experiment by defining the system prompt and user prompt and selecting the LLMs they wish to test.

In the pipeline configuration interface, users can construct a four-step workflow:

  1. System Prompt Generation – Configuring system instructions and contextual data.
  2. Information Extraction – Identifying key entities and factoid candidates.
  3. SPARQL Query Execution – Retrieving authoritative data from the knowledge graph.
  4. Post-Processing – Structuring and presenting results for easy interpretation.
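The four steps above can be sketched as plain functions. Everything here is an assumption for illustration: the function names are hypothetical, the regex is a naive stand-in for the real extraction step (it would also pick up ordinary capitalized sentence openers), the query assumes the knowledge graph exposes English `rdfs:label` values, and `execute` is an injected callable expected to return the standard SPARQL JSON results structure.

```python
import re
from typing import Callable, Dict, List, Tuple


def build_system_prompt(instructions: str, context: str = "") -> str:
    """Step 1: combine system instructions with optional contextual data."""
    return instructions if not context else f"{instructions}\n\nContext:\n{context}"


def extract_entities(answer: str) -> List[str]:
    """Step 2 (naive stand-in): treat runs of capitalized words as entities."""
    found = re.findall(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b", answer)
    return list(dict.fromkeys(found))  # de-duplicate, keep first-seen order


def build_sparql_query(label: str, limit: int = 10) -> str:
    """Step 3: build a query for the properties of an entity with this label."""
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?property ?value WHERE {\n"
        f'  ?entity rdfs:label "{escaped}"@en .\n'
        "  ?entity ?property ?value .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def post_process(result: Dict) -> List[Tuple[str, str]]:
    """Step 4: flatten SPARQL JSON results into (property, value) pairs."""
    bindings = result.get("results", {}).get("bindings", [])
    return [(b["property"]["value"], b["value"]["value"]) for b in bindings]


def verify(answer: str, execute: Callable[[str], Dict]) -> Dict[str, List[Tuple[str, str]]]:
    """Run steps 2 to 4 for every entity found in an LLM answer."""
    return {
        entity: post_process(execute(build_sparql_query(entity)))
        for entity in extract_entities(answer)
    }
```

Passing `execute` in as a parameter keeps the sketch independent of any particular endpoint or HTTP client, and makes each step testable on its own.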

On the experiment page, users can refine system and user prompts, incorporate additional data into the system prompt, and select the LLMs for each experiment run. With these capabilities, ChatBS-NextGen provides a streamlined yet highly flexible environment for evaluating the factual accuracy and consistency of AI-generated responses.
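As a rough sketch of what an experiment definition might hold, the settings above can be modeled as a small data class. `ExperimentConfig`, its field names, and the placeholder model identifiers in the usage example are hypothetical and do not reflect the application's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class ExperimentConfig:
    system_prompt: str
    user_prompt: str
    models: List[str]
    extra_context: str = ""  # additional data appended to the system prompt
    repetitions: int = 1     # how many times each model receives the same query

    def full_system_prompt(self) -> str:
        """System prompt with the optional extra data attached."""
        if not self.extra_context:
            return self.system_prompt
        return f"{self.system_prompt}\n\n{self.extra_context}"

    def plan_runs(self) -> Iterator[Dict[str, object]]:
        """Yield one run description per (model, repetition) pair."""
        for model in self.models:
            for attempt in range(1, self.repetitions + 1):
                yield {
                    "model": model,
                    "attempt": attempt,
                    "system": self.full_system_prompt(),
                    "user": self.user_prompt,
                }
```

For example, `ExperimentConfig("You are a fact checker.", "Who discovered radium?", ["model-a", "model-b"], repetitions=2)` plans four runs, two per model.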