Upload any document and ask questions in natural language. DocMind uses a self-correcting RAG pipeline that retrieves, verifies, and — when context is insufficient — automatically rewrites the query and falls back to web search before generating an answer.
5-Step RAG Pipeline
Hybrid Search + RRF
SSE Real-Time Stream
pgvector Vector Store
Standard RAG pipelines retrieve documents and generate answers in a single pass. If the retrieved context is wrong or incomplete, the answer is wrong — and you never know. Corrective RAG adds a verification loop that detects bad context and self-corrects before answering.
Orchestrated as a LangGraph state machine with conditional edges. If the grading step detects insufficient context, the pipeline branches into query transformation and web search before generating — this is the corrective loop.
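The conditional branch can be sketched as plain Python, without the LangGraph dependency. The node names, state fields, and thresholds below are illustrative assumptions, not DocMind's actual code; in LangGraph the routing function would be wired in via a conditional edge.

```python
# Minimal sketch of the corrective branch. Threshold values are assumed.
RELEVANCE_THRESHOLD = 0.7   # assumed grading cutoff per document
MIN_RELEVANT_DOCS = 2       # assumed minimum before correction triggers

def grade(state):
    """Grading node: keep only documents scoring above the threshold."""
    kept = [d for d in state["documents"] if d["score"] >= RELEVANCE_THRESHOLD]
    state["documents"] = kept
    # Too little surviving context means the answer would be ungrounded.
    state["needs_correction"] = len(kept) < MIN_RELEVANT_DOCS
    return state

def route_after_grading(state):
    """Mirrors a LangGraph conditional edge: choose the next node from state."""
    return "transform_query" if state["needs_correction"] else "generate"
```

In the real graph, `transform_query` leads into web search and then back to generation, closing the corrective loop.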
Hybrid search: pgvector cosine similarity + PostgreSQL tsvector full-text, merged with Reciprocal Rank Fusion.
Cohere cross-encoder re-scores candidates. Higher precision than bi-encoder similarity alone.
A score threshold filters out irrelevant documents. If too many are filtered, the correction branch is triggered.
The query is rewritten for better retrieval, and Tavily web search provides external context as a fallback.
LLM produces an answer grounded in verified context. Streamed in real-time via Server-Sent Events.
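The fusion in step 1 is small enough to sketch in full. Rankings from the vector and keyword searches are merged by Reciprocal Rank Fusion; `k = 60` is the conventional RRF constant, and equal weighting of the two rankings is an assumption about DocMind's setup.

```python
def rrf_merge(vector_ranking, keyword_ranking, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank).
    Each ranking is a list of document ids, best first. k=60 is the usual
    default; it damps the influence of any single ranking's top positions."""
    scores = {}
    for ranking in (vector_ranking, keyword_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Documents ranked highly by both searches accumulate the largest scores.
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears near the top of both lists beats one that tops only a single list, which is what makes the merge robust across query types.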
Every component is chosen for a reason. No unnecessary abstractions, no over-engineering — just the right tool for each layer.
The RAG pipeline is a directed graph with conditional edges — not a linear chain. Nodes execute async, edges branch on grading results.
Semantic similarity (pgvector HNSW) and keyword matching (tsvector GIN) merged via Reciprocal Rank Fusion for robust retrieval across query types.
Every vector, document, and query is scoped by user_id — enforced at the SQL WHERE clause level. Complete data isolation between users.
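The isolation guarantee can be sketched with an in-memory SQLite stand-in; the real store is PostgreSQL with pgvector, and the schema below is illustrative only. The point is that `user_id` appears in every WHERE clause, so a matching document owned by another user is simply never returned.

```python
import sqlite3

# Illustrative schema; the production store is PostgreSQL + pgvector.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER, user_id TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [(1, "alice", "quarterly report"), (2, "bob", "meeting report")],
)

def search(user_id, pattern):
    """Every query is scoped by user_id at the WHERE clause level:
    rows belonging to other users are invisible even if they match."""
    rows = conn.execute(
        "SELECT content FROM documents WHERE user_id = ? AND content LIKE ?",
        (user_id, f"%{pattern}%"),
    ).fetchall()
    return [r[0] for r in rows]
```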
Server-Sent Events deliver LLM tokens in real-time. The frontend renders chunks as they arrive — no polling, no waiting for the full response.
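The wire format is simple enough to show directly. The `event` and `data` field names come from the SSE specification; the JSON payload shape and the `done` sentinel are assumptions about DocMind's protocol, and in FastAPI the generator would be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.

```python
import json

def sse_frame(token, event=None):
    """Format one Server-Sent Events frame: `field: value` lines
    terminated by a blank line, per the SSE spec."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps({'token': token})}")
    return "\n".join(lines) + "\n\n"

def stream_answer(tokens):
    """Yield one frame per LLM token, then a terminal event so the
    frontend knows the answer is complete. Payload shape is assumed."""
    for t in tokens:
        yield sse_frame(t)
    yield sse_frame("", event="done")
```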
Redis caches embedding vectors with SHA-256 keys and 1-hour TTL. Repeated queries skip the OpenAI API entirely.
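Key derivation might look like the sketch below; the `emb:` prefix and key layout are assumptions, not DocMind's exact scheme. Hashing the model name together with the text keeps keys short, fixed-length, and distinct across embedding models.

```python
import hashlib

def embedding_cache_key(model, text):
    """Deterministic Redis key from model + input text. Identical inputs
    always hit the same key; the layout here is an illustrative assumption."""
    digest = hashlib.sha256(f"{model}:{text}".encode("utf-8")).hexdigest()
    return f"emb:{model}:{digest}"

# A cached lookup would then be roughly (Redis calls sketched, not run):
#   if (hit := redis.get(key)) is not None: return deserialize(hit)
#   vec = embed(text); redis.setex(key, 3600, serialize(vec))  # 1-hour TTL
```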
Regex-based prompt injection detection catches common attack patterns before they reach the LLM. Zero API cost, sub-millisecond latency.
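A minimal version of such a filter is shown below; the patterns are illustrative examples of common injection phrasings, not DocMind's actual list, which would need to be considerably broader.

```python
import re

# Illustrative patterns only; a production denylist would be broader.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"disregard\s+(the\s+)?(system\s+)?prompt", re.IGNORECASE),
    re.compile(r"you\s+are\s+now\s+in\s+developer\s+mode", re.IGNORECASE),
]

def looks_like_injection(text):
    """Return True if any known attack pattern matches. Pure regex:
    no API call, so the check costs nothing and runs in microseconds."""
    return any(p.search(text) for p in INJECTION_PATTERNS)
```

The trade-off is recall: regex catches known phrasings cheaply but misses novel attacks, which is why it sits in front of, not instead of, the grounded-generation step.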
Next.js 14 (Frontend): App Router, RSC
FastAPI (Backend): Async Python
LangGraph (Orchestration): State machine
LangChain (LLM Framework): Chains & prompts
pgvector (Vector Store): HNSW + cosine
PostgreSQL (Database): tsvector + GIN
Redis (Cache): Embeddings + rate limit
Better Auth (Auth): Sessions + cookies
OpenAI (LLM Provider): GPT-4o + embeddings
Cohere (Reranking): Cross-encoder v3.5
PyMuPDF (PDF Processing): Text + images
Docker (Infrastructure): Compose + CI/CD
Upload PDFs with text or visual content. PyMuPDF extracts and chunks text using tiktoken, with optional GPT-4o Vision for charts and diagrams.
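Chunking can be sketched as a sliding token window. In DocMind the encoder would be tiktoken's for the target model; here a whitespace split stands in so the sketch has no external dependency, and the window sizes are illustrative defaults rather than the app's settings.

```python
def chunk_tokens(text, max_tokens=500, overlap=50, encode=str.split):
    """Sliding-window chunking: consecutive windows share `overlap` tokens
    so no sentence is cut off without context. With tiktoken, tokens would
    be integer ids and you would decode each window instead of joining."""
    tokens = encode(text)
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start : start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # last window already reaches the end of the text
    return chunks
```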
Ask anything about your documents. The Corrective RAG pipeline retrieves, verifies, and generates answers streamed in real time via SSE.
Rate every answer with thumbs up/down. Feedback is tracked per user, and an analytics dashboard shows satisfaction rates over time.
Create a free account and start querying in minutes. No credit card required.
Get Started