Purpose
Expose a private, extensible RAG platform that works offline and keeps data inside the execution environment.
Current state
Last touched: 2024-03-20.
Functionality and completeness: core ingestion and API flows are documented; deployment guidance is evolving.
Next steps
- Add baseline automated tests to cover critical flows.
- Add a CI pipeline for build/test/lint.
- Document the deployment/runtime environment (or add a Dockerfile).
- Document interfaces (CLI flags, API endpoints, file formats).
- Add structured logging and basic health checks (a logging sketch follows this list).
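Structured logging could start with the standard library alone. The sketch below is one way to emit JSON log lines; nothing here exists in the codebase yet, and the field names are illustrative.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])
logging.getLogger("private_gpt").info("structured logging enabled")
```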
Interfaces
Inputs
- settings.yaml profiles
- Document folders (documents for ingestion)
- Local model files
- Optional external model/provider keys
Outputs
- Embeddings persisted in the vector store
- Generated answers
- API responses and UI outputs
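Since profiles drive the other inputs, a first debugging step is often to inspect the active profile. A minimal, schema-agnostic sketch, assuming PyYAML and a settings.yaml at the repo root; the settings-<profile>.yaml naming is an assumption:

```python
from pathlib import Path

import yaml  # assumption: PyYAML is available

def load_profile(name: str | None = None) -> dict:
    """Load settings.yaml, or settings-<name>.yaml for a named profile."""
    filename = f"settings-{name}.yaml" if name else "settings.yaml"
    with Path(filename).open() as fh:
        return yaml.safe_load(fh)

if __name__ == "__main__":
    # Print only the top-level section names; the schema itself is not assumed.
    print(sorted(load_profile()))
```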
Reality to Action trace
- Reality Ingestion: contributes in this stage.
- Canonical Storage: contributes in this stage.
- Automation Engines: contributes in this stage.
- Human Interfaces: contributes in this stage.
- Operational Adoption: contributes in this stage.
Core workflow
Draft steps, inferred from the sections above (to be confirmed):
1. Load the active settings.yaml profile to select the LLM, embedding, and vector-store backends.
2. Ingest documents from the configured folders.
3. Chunk and embed the documents; persist the embeddings in the vector store.
4. Receive a query through the FastAPI server or the optional Gradio UI.
5. Embed the query and retrieve the most relevant chunks from the vector store.
6. Generate an answer with the LLM using the retrieved context.
7. Return the answer as an API response or UI output.
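End to end, the workflow can be driven over the local HTTP API. The sketch below is illustrative only: the port and endpoint paths are assumptions and should be checked against the routes the FastAPI server actually exposes.

```python
import requests

BASE = "http://localhost:8001"  # assumed default bind address

# 1. Ingest a document (endpoint path is an assumption; multipart upload
#    is the typical shape for file ingestion).
with open("docs/report.pdf", "rb") as fh:
    requests.post(f"{BASE}/v1/ingest/file", files={"file": fh}).raise_for_status()

# 2. Query through the OpenAI-compatible chat endpoint noted under Artifacts.
resp = requests.post(
    f"{BASE}/v1/chat/completions",
    json={"messages": [{"role": "user", "content": "Summarize the report."}]},
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```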
Artifacts
- OpenAI-compatible API schema for chat and completions
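Because the schema mirrors OpenAI's, the official Python client can be pointed at the local server. A sketch, assuming the server listens on localhost:8001; the model name is illustrative and the server may ignore it:

```python
from openai import OpenAI

# api_key must be non-empty for the client, but a local server need not check it.
client = OpenAI(base_url="http://localhost:8001/v1", api_key="not-needed")

reply = client.chat.completions.create(
    model="private-gpt",  # illustrative name
    messages=[{"role": "user", "content": "What do the ingested documents say?"}],
)
print(reply.choices[0].message.content)
```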
Operational notes
Constraints and scars
- Large model dependencies and vector stores require significant local resources.
Reliability posture
Failure modes and safe behavior: missing models or vector store failures prevent ingestion or query.
Idempotency / retries / batching: ingestion re-runs reindex documents; there is no built-in retry policy.
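Since re-running ingestion just reindexes documents, a caller-side retry wrapper is safe. A minimal sketch; ingest_file is a hypothetical function standing in for whatever the caller uses, and the backoff schedule is arbitrary:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 2.0):
    """Call fn, retrying with exponential backoff on any exception."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Safe because ingestion is idempotent (re-runs reindex the same documents):
# with_retries(lambda: ingest_file("docs/report.pdf"))  # ingest_file is hypothetical
```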
Observability
- Logs: application logs via FastAPI/Uvicorn and Python logging
- Metrics/health checks: none documented
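Since no health check is documented, below is a minimal FastAPI sketch of what one could look like; this endpoint does not exist in the codebase and would need real probes behind it.

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health() -> dict:
    # Replace with real probes: model loaded, vector store reachable, etc.
    return {"status": "ok"}
```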
Security and privacy
Designed to keep data local; avoid enabling external providers without strict governance. Local data and embeddings should be stored in protected directories.
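One way to act on the "protected directories" guidance, sketched with the standard library; the folder names come from the top-level layout listed below, and owner-only 0o700 is a conventional choice rather than a project requirement:

```python
import os
from pathlib import Path

# Restrict the data and model folders to the owning user (rwx for owner only).
for folder in ("local_data", "models"):
    path = Path(folder)
    if path.is_dir():
        os.chmod(path, 0o700)
```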
Dependencies
Upstream
- Optional OpenAI/Ollama/Gemini/SageMaker providers (config-driven)
Ownership
Owners: Josh Barton
Users: Josh Barton (owner)
privateGPT
Architecture & Major Components
High-level diagram (text):
- Entry/trigger -> core logic -> outputs (details per docs below)
- FastAPI server exposes high-level (RAG) and low-level (embeddings/retrieval) APIs.
- Components wrap LlamaIndex abstractions (LLM, embeddings, vector store) for swappable backends.
- Optional Gradio UI provides a browser-based test client.
Top-level folders: .github, fern, local_data, models, private_gpt, scripts, tests, tiktoken_cache
Key abstractions: server routers, service layers, component providers
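The component-provider abstraction can be pictured as a factory that returns a LlamaIndex backend chosen by configuration. A sketch, not the project's actual code: the backend classes are real llama-index integrations (each shipped in its own package), while the settings shape and default model name are assumptions.

```python
from llama_index.core.embeddings import BaseEmbedding

def embedding_component(settings: dict) -> BaseEmbedding:
    """Pick an embedding backend from config so backends stay swappable."""
    mode = settings.get("embedding", {}).get("mode", "local")  # assumed shape
    if mode == "openai":
        from llama_index.embeddings.openai import OpenAIEmbedding
        return OpenAIEmbedding()
    # Default: a local HuggingFace model keeps embeddings inside the environment.
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    return HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
```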
Setup / Build / Run
- Build system: Python project managed with Poetry (pyproject.toml).
- Uses Poetry/Makefile workflows; profiles in settings.yaml control runtime behavior.