Atlas project concept

privateGPT

Expose a private, extensible RAG platform that works offline and keeps data inside the execution environment. Ingest documents from a local folder, splitting and embedding them into a vector store. Expose chat and completion endpoints that retrieve context from ingested data. Provide low-level APIs for embeddings and retrieval to support custom pipelines.

Type
System
Lifecycle
Maintenance
Last touched
2024-03-20
Visibility
Public

Purpose

Expose a private, extensible RAG platform that works offline and keeps data inside the execution environment.

Current state

Last touched: 2024-03-20. Functionality and completeness: Core ingestion and API flows are documented; deployment guidance is evolving.

Next step

  • Add baseline automated tests to cover critical flows
  • Add a CI pipeline for build/test/lint
  • Document the deployment/runtime environment (or add a Dockerfile)
  • Document interfaces (CLI flags, API endpoints, file formats)
  • Add structured logging and basic health checks

Interfaces

Inputs
  • settings.yaml profiles, document folders, optional external model keys
  • Documents for ingestion
  • Local model files
Outputs
  • Embeddings and vector store contents
  • API responses and UI outputs
  • Generated answers

Reality to Action trace

Reality Ingestion

Ingests documents from local folders, splitting and embedding them.

Canonical Storage

Persists embeddings and document chunks in a local vector store.

Automation Engines

RAG pipelines retrieve stored context and generate completions.

Human Interfaces

Chat/completion API endpoints and the optional Gradio UI.

Operational Adoption

Runs offline inside the execution environment; deployment guidance is evolving.

Core workflow

TBD. Document the 5-10 steps that define the core workflow.
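Until the workflow is written up, the ingest -> embed -> retrieve flow implied by the summary can be sketched as a toy. Token-set overlap stands in for a real embedding model and vector store; none of the names below are privateGPT's actual components.

```python
# Toy ingest -> embed -> retrieve sketch. Token-set "embeddings" and Jaccard
# similarity stand in for the real embedding model and vector store.

def chunk(text: str, size: int = 80) -> list[str]:
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> set[str]:
    """Toy embedding: a lowercased token set with punctuation stripped."""
    return set(text.lower().replace(".", " ").replace("?", " ").split())

def ingest(docs: list[str]) -> list[tuple[str, set[str]]]:
    """Chunk and 'embed' each document into an in-memory store."""
    return [(c, embed(c)) for doc in docs for c in chunk(doc)]

def retrieve(query: str, store: list[tuple[str, set[str]]], k: int = 2) -> list[str]:
    """Rank stored chunks by Jaccard similarity to the query."""
    q = embed(query)
    ranked = sorted(store,
                    key=lambda item: len(q & item[1]) / (len(q | item[1]) or 1),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

store = ingest(["privateGPT keeps data local.",
                "The vector store holds embeddings."])
print(retrieve("where are embeddings stored?", store, k=1))
```

In the real system the retrieved chunks are then passed to the LLM as context for the chat/completion response.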

Artifacts

  • OpenAI-compatible API schema for chat and completions
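A request against the OpenAI-compatible schema can be sketched as below. The `/v1/chat/completions` path, port 8001, and the `use_context` flag are assumptions from typical privateGPT setups; check the project's API docs for the exact schema.

```python
# Sketch of an OpenAI-style chat payload for privateGPT. The `use_context`
# flag (assumed privateGPT extension) asks the server to retrieve context
# from ingested documents before answering.
import json

def build_chat_request(question: str, use_context: bool = True) -> dict:
    return {
        "messages": [{"role": "user", "content": question}],
        "use_context": use_context,  # assumption: retrieval toggle
        "stream": False,
    }

payload = build_chat_request("What does the ingestion pipeline do?")
print(json.dumps(payload, indent=2))

# To send it (server assumed at localhost:8001):
#   import urllib.request
#   req = urllib.request.Request(
#       "http://localhost:8001/v1/chat/completions",
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read())
```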

Operational notes

Constraints and scars

  • Large model dependencies and vector stores require significant local resources.

Reliability posture

  • Failure modes and safe behavior: missing models or vector-store failures prevent ingestion and query.
  • Idempotency / retries / batching: ingestion re-runs reindex documents; there is no built-in retry policy.
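Because there is no built-in retry policy, callers that need one can wrap their ingestion or query calls. A minimal sketch; `flaky_ingest` is a hypothetical stand-in for whatever client call you wrap.

```python
# Minimal retry wrapper for flaky ingestion/query calls (privateGPT itself
# ships no retry policy, so this lives on the caller side).
import time

def with_retries(fn, attempts: int = 3, delay: float = 0.0):
    """Call fn(), retrying on any exception up to `attempts` times."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(delay)  # back off before the next attempt

calls = {"n": 0}
def flaky_ingest():
    """Hypothetical ingestion call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("vector store temporarily unavailable")
    return "indexed"

print(with_retries(flaky_ingest))  # succeeds on the third attempt
```

Since ingestion re-runs reindex documents, retrying a failed run is safe from a data standpoint; the cost is repeated embedding work.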

Observability

  • Logs: FastAPI/Uvicorn output and application logs via Python logging
  • Metrics/health checks: none documented

Security and privacy

Designed to keep data local; avoid enabling external providers without strict governance. Local data and embeddings should be stored in protected directories.

Dependencies

Upstream
  • Optional OpenAI/Ollama/Gemini/SageMaker providers (config-driven)

Ownership

Owners

Josh Barton

Users

Josh Barton (owner)

Architecture & Major Components

  • High-level diagram (text): entry/trigger -> core logic -> outputs (details in the docs below)
  • FastAPI server exposes high-level (RAG) and low-level (embeddings/retrieval) APIs.

  • Components wrap LlamaIndex abstractions (LLM, embeddings, vector store) for swappable backends.

  • Optional Gradio UI provides a browser-based test client.

  • Top-level folders: .github, fern, local_data, models, private_gpt, scripts, tests, tiktoken_cache

  • Key abstractions: server routers, service layers, component providers
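The "swappable backend" idea can be sketched as a small provider interface with config-driven selection. Class and method names below are illustrative, not privateGPT's actual classes.

```python
# Sketch of the swappable-component pattern: services depend on a narrow
# interface, and a profile string picks the concrete provider.
from abc import ABC, abstractmethod

class LLMComponent(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class LocalLLM(LLMComponent):
    def complete(self, prompt: str) -> str:
        return f"[local model] {prompt}"

class OpenAILLM(LLMComponent):
    def complete(self, prompt: str) -> str:
        return f"[openai] {prompt}"

# Registry keyed by mode string, mirroring config-driven provider selection.
PROVIDERS: dict[str, type[LLMComponent]] = {
    "local": LocalLLM,
    "openai": OpenAILLM,
}

def make_llm(mode: str) -> LLMComponent:
    """Instantiate the provider named by the active profile."""
    return PROVIDERS[mode]()

print(make_llm("local").complete("hello"))  # prints: [local model] hello
```

Swapping backends then means changing one config value, not touching the service layer.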

Setup / Build / Run

  • Build system(s): Python (pyproject).
  • Uses Poetry/Makefile workflows; profiles in settings.yaml control runtime behavior.
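A local-only profile override might look like the fragment below. This is a hedged sketch: the key names are assumptions, so check the project's settings reference for the real schema.

```yaml
# settings-local.yaml -- illustrative profile override (key names assumed)
llm:
  mode: local        # swap to another provider name to change backends
embedding:
  mode: local
server:
  port: 8001         # assumed default; confirm against the docs
```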