Purpose
Expose a private, extensible RAG platform that works offline and keeps data inside the execution environment.
Current state
Last touched: 2024-03-20.
Functionality and completeness: core ingestion and API flows are documented; deployment guidance is evolving.
Next steps
- Add baseline automated tests to cover critical flows.
- Add a CI pipeline for build/test/lint.
- Document the deployment/runtime environment (or add a Dockerfile).
- Document interfaces (CLI flags, API endpoints, file formats).
- Add structured logging and basic health checks (a logging sketch follows this list).
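Structured logging could start with the standard library alone. The sketch below is one way to emit JSON log lines; nothing here exists in the codebase yet, and the field names are illustrative.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])
logging.getLogger("private_gpt").info("structured logging enabled")
```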
Interfaces
Inputs
- settings.yaml profiles
- Document folders (documents for ingestion)
- Local model files
- Optional external model/provider keys
Outputs
- Embeddings persisted in the vector store
- Generated answers
- API responses and UI outputs
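Since profiles drive the other inputs, a first debugging step is often to inspect the active profile. A minimal, schema-agnostic sketch, assuming PyYAML and a settings.yaml at the repo root; the settings-<profile>.yaml naming is an assumption:

```python
from pathlib import Path

import yaml  # assumption: PyYAML is available

def load_profile(name: str | None = None) -> dict:
    """Load settings.yaml, or settings-<name>.yaml for a named profile."""
    filename = f"settings-{name}.yaml" if name else "settings.yaml"
    with Path(filename).open() as fh:
        return yaml.safe_load(fh)

if __name__ == "__main__":
    # Print only the top-level section names; the schema itself is not assumed.
    print(sorted(load_profile()))
```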
Reality to Action trace
- Reality Ingestion: contributes in this stage.
- Canonical Storage: contributes in this stage.
- Automation Engines: contributes in this stage.
- Human Interfaces: contributes in this stage.
- Operational Adoption: contributes in this stage.
Core workflow
Draft steps, inferred from the sections above (to be confirmed):
1. Load the active settings.yaml profile to select the LLM, embedding, and vector-store backends.
2. Ingest documents from the configured folders.
3. Chunk and embed the documents; persist the embeddings in the vector store.
4. Receive a query through the FastAPI server or the optional Gradio UI.
5. Embed the query and retrieve the most relevant chunks from the vector store.
6. Generate an answer with the LLM using the retrieved context.
7. Return the answer as an API response or UI output.
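End to end, the workflow can be driven over the local HTTP API. The sketch below is illustrative only: the port and endpoint paths are assumptions and should be checked against the routes the FastAPI server actually exposes.

```python
import requests

BASE = "http://localhost:8001"  # assumed default bind address

# 1. Ingest a document (endpoint path is an assumption; multipart upload
#    is the typical shape for file ingestion).
with open("docs/report.pdf", "rb") as fh:
    requests.post(f"{BASE}/v1/ingest/file", files={"file": fh}).raise_for_status()

# 2. Query through the OpenAI-compatible chat endpoint noted under Artifacts.
resp = requests.post(
    f"{BASE}/v1/chat/completions",
    json={"messages": [{"role": "user", "content": "Summarize the report."}]},
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```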
Artifacts
- OpenAI-compatible API schema for chat and completions
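Because the schema mirrors OpenAI's, the official Python client can be pointed at the local server. A sketch, assuming the server listens on localhost:8001; the model name is illustrative and the server may ignore it:

```python
from openai import OpenAI

# api_key must be non-empty for the client, but a local server need not check it.
client = OpenAI(base_url="http://localhost:8001/v1", api_key="not-needed")

reply = client.chat.completions.create(
    model="private-gpt",  # illustrative name
    messages=[{"role": "user", "content": "What do the ingested documents say?"}],
)
print(reply.choices[0].message.content)
```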
Operational notes
Constraints and scars
- Large model dependencies and vector stores require significant local resources.
Reliability posture
Failure modes and safe behavior: missing models or vector store failures prevent ingestion or query.
Idempotency / retries / batching: ingestion re-runs reindex documents; there is no built-in retry policy.
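Since re-running ingestion just reindexes documents, a caller-side retry wrapper is safe. A minimal sketch; ingest_file is a hypothetical function standing in for whatever the caller uses, and the backoff schedule is arbitrary:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 2.0):
    """Call fn, retrying with exponential backoff on any exception."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Safe because ingestion is idempotent (re-runs reindex the same documents):
# with_retries(lambda: ingest_file("docs/report.pdf"))  # ingest_file is hypothetical
```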
Observability
- Logs: application logs via FastAPI/Uvicorn and Python logging
- Metrics/health checks: none documented
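Since no health check is documented, below is a minimal FastAPI sketch of what one could look like; this endpoint does not exist in the codebase and would need real probes behind it.

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health() -> dict:
    # Replace with real probes: model loaded, vector store reachable, etc.
    return {"status": "ok"}
```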
Security and privacy
Designed to keep data local; avoid enabling external providers without strict governance. Local data and embeddings should be stored in protected directories.
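One way to act on the "protected directories" guidance, sketched with the standard library; the folder names come from the top-level layout listed below, and owner-only 0o700 is a conventional choice rather than a project requirement:

```python
import os
from pathlib import Path

# Restrict the data and model folders to the owning user (rwx for owner only).
for folder in ("local_data", "models"):
    path = Path(folder)
    if path.is_dir():
        os.chmod(path, 0o700)
```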
Dependencies
Upstream
- Optional OpenAI/Ollama/Gemini/SageMaker providers (config-driven)
Ownership
Owners: Josh Barton
Users: Josh Barton (owner)
privateGPT
Architecture & Major Components
High-level diagram (text):
- Entry/trigger -> core logic -> outputs (details per docs below)
- FastAPI server exposes high-level (RAG) and low-level (embeddings/retrieval) APIs.
- Components wrap LlamaIndex abstractions (LLM, embeddings, vector store) for swappable backends.
- Optional Gradio UI provides a browser-based test client.
Top-level folders: .github, fern, local_data, models, private_gpt, scripts, tests, tiktoken_cache
Key abstractions: server routers, service layers, component providers
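The component-provider abstraction can be pictured as a factory that returns a LlamaIndex backend chosen by configuration. A sketch, not the project's actual code: the backend classes are real llama-index integrations (each shipped in its own package), while the settings shape and default model name are assumptions.

```python
from llama_index.core.embeddings import BaseEmbedding

def embedding_component(settings: dict) -> BaseEmbedding:
    """Pick an embedding backend from config so backends stay swappable."""
    mode = settings.get("embedding", {}).get("mode", "local")  # assumed shape
    if mode == "openai":
        from llama_index.embeddings.openai import OpenAIEmbedding
        return OpenAIEmbedding()
    # Default: a local HuggingFace model keeps embeddings inside the environment.
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    return HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
```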
Setup / Build / Run
- Build system: Python project managed with Poetry (pyproject.toml).
- Uses Poetry/Makefile workflows; profiles in settings.yaml control runtime behavior.