Atlas project concept

Database-Ingester

Enable safe, repeatable CSV ingestion into SQL Server tables with backups and dry-run visibility. The tool loads its config, optionally backs up the target table, compares CSV rows against the table by primary key, then updates existing rows and inserts new ones (and optionally deletes rows missing from the CSV). Dry-run mode produces a detailed diff report without applying changes. Restore mode rehydrates the table from a backup CSV.

Type: Component
Lifecycle: Maintenance
Last touched: 2025-02-10
Visibility: Public

Purpose

Enable safe, repeatable CSV ingestion into SQL Server tables with backups and dry-run visibility.

Current state

Functionality and completeness: Core ingestion, backup, and restore features are documented; tests and CI are pending.

Next step

Add baseline automated tests to cover critical flows.
Add a CI pipeline for build, test, and lint.
Document the deployment/runtime environment (or add a Dockerfile).
Document interfaces (CLI flags, API endpoints, file formats).
Add structured logging and basic health checks.

Interfaces

Inputs

TOML config file
Input CSV file
Target MSSQL table

Outputs

Updated table rows
Backup CSVs
Log file entries

Reality to Action trace

Reality Ingestion

Contributes in this stage.

Canonical Storage

Contributes in this stage.

Automation Engines

Not in scope.

Human Interfaces

Contributes in this stage.

Operational Adoption

Contributes in this stage.

Core workflow

TBD. Document the 5-10 steps that define the core workflow.
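
Until the steps are written down, the primary-key comparison described in the project summary can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the project's actual diff engine: the `Row` type, the `Diff` struct, and the `diff_rows` and `index_by_key` functions are hypothetical names, and rows are modeled as simple column-name-to-string maps.

```rust
use std::collections::BTreeMap;

/// One row as column name -> value. Hypothetical representation for this sketch.
type Row = BTreeMap<String, String>;

/// Primary keys grouped by the action a live run would take.
#[derive(Debug, Default, PartialEq)]
struct Diff {
    inserts: Vec<String>, // keys found only in the CSV
    updates: Vec<String>, // keys found in both, with differing values
    deletes: Vec<String>, // keys found only in the table
}

/// Index rows by the value of their primary key column.
/// Rows that lack the key column are skipped in this sketch.
fn index_by_key(rows: &[Row], pk: &str) -> BTreeMap<String, Row> {
    rows.iter()
        .filter_map(|r| r.get(pk).map(|k| (k.clone(), r.clone())))
        .collect()
}

/// Compare CSV rows against table rows by primary key.
/// Deletes are only reported when `delete_missing` is set.
fn diff_rows(csv: &[Row], table: &[Row], pk: &str, delete_missing: bool) -> Diff {
    let csv_idx = index_by_key(csv, pk);
    let tbl_idx = index_by_key(table, pk);
    let mut diff = Diff::default();
    for (key, row) in &csv_idx {
        match tbl_idx.get(key) {
            None => diff.inserts.push(key.clone()),
            Some(existing) if existing != row => diff.updates.push(key.clone()),
            Some(_) => {} // identical row: nothing to do
        }
    }
    if delete_missing {
        for key in tbl_idx.keys() {
            if !csv_idx.contains_key(key) {
                diff.deletes.push(key.clone());
            }
        }
    }
    diff
}

/// Small helper to build a row from string pairs.
fn row(pairs: &[(&str, &str)]) -> Row {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

fn main() {
    let table = vec![
        row(&[("id", "1"), ("name", "Ada")]),
        row(&[("id", "2"), ("name", "Bob")]),
    ];
    let csv = vec![
        row(&[("id", "1"), ("name", "Ada L.")]),
        row(&[("id", "3"), ("name", "Cy")]),
    ];
    let diff = diff_rows(&csv, &table, "id", true);
    println!("{:?}", diff);
}
```

Keeping the diff as plain data, separate from the code that writes to the database, is one way to let dry-run and live runs share the same comparison logic.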

Artifacts

CSV headers must align with the target table's column names; the config must specify the primary key column.

Operational notes

Constraints and scars

Requires UnixODBC and the Microsoft SQL Server ODBC driver, installed and configured.
CSV headers must match target table columns (compared after case and whitespace normalization).
Large tables can make backup and restore operations slow.
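
The header constraint above can be illustrated with a small check. This is a sketch under assumptions rather than the tool's real validation: `normalize` and `check_headers` are hypothetical names, normalization is taken to mean trimming whitespace and lowercasing, and "match" is assumed to mean the two sets of names are equal (the source does not say whether a CSV may carry only a subset of columns).

```rust
use std::collections::BTreeSet;

/// Normalize a header or column name: trim whitespace and lowercase.
/// (Assumed meaning of "case/trim normalized".)
fn normalize(name: &str) -> String {
    name.trim().to_lowercase()
}

/// Compare CSV headers with table columns after normalization.
/// On mismatch, returns (columns missing from the CSV, headers not in the table).
fn check_headers(
    csv_headers: &[&str],
    table_columns: &[&str],
) -> Result<(), (Vec<String>, Vec<String>)> {
    let csv: BTreeSet<String> = csv_headers.iter().map(|h| normalize(h)).collect();
    let tbl: BTreeSet<String> = table_columns.iter().map(|c| normalize(c)).collect();
    if csv == tbl {
        return Ok(());
    }
    let missing: Vec<String> = tbl.difference(&csv).cloned().collect();
    let unexpected: Vec<String> = csv.difference(&tbl).cloned().collect();
    Err((missing, unexpected))
}

fn main() {
    // Differences in case and surrounding whitespace are tolerated.
    println!("{:?}", check_headers(&[" ID", "Name "], &["id", "name"]));
    // A real mismatch is reported so the run can abort before any write.
    println!("{:?}", check_headers(&["id", "nickname"], &["id", "name"]));
}
```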

Reliability posture

Failure modes and safe behavior: ODBC connection errors or column mismatches abort the run; dry-run avoids writes.
Idempotency, retries, and batching: Re-running with the same CSV yields no diffs; there are no automatic retries.
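
The dry-run and idempotency properties can be shown with a toy in-memory model. This is only a sketch: the `Table` alias, the `Change` enum, and the `plan` and `ingest` functions are hypothetical, the "table" is a map from primary key to a single string payload, and no database access or delete handling is modeled.

```rust
use std::collections::BTreeMap;

/// Simplified table: primary key -> row payload. Hypothetical model for this sketch.
type Table = BTreeMap<String, String>;

#[derive(Debug, PartialEq)]
enum Change {
    Insert(String), // key present only in the CSV
    Update(String), // key present in both, payload differs
}

/// Compute the changes needed to bring `table` in line with `csv`.
fn plan(csv: &Table, table: &Table) -> Vec<Change> {
    csv.iter()
        .filter_map(|(k, v)| match table.get(k) {
            None => Some(Change::Insert(k.clone())),
            Some(old) if old != v => Some(Change::Update(k.clone())),
            Some(_) => None,
        })
        .collect()
}

/// Report the planned changes; apply them only when `dry_run` is false.
fn ingest(csv: &Table, table: &mut Table, dry_run: bool) -> Vec<Change> {
    let changes = plan(csv, table);
    if !dry_run {
        for change in &changes {
            let key = match change {
                Change::Insert(k) | Change::Update(k) => k,
            };
            table.insert(key.clone(), csv[key].clone());
        }
    }
    changes
}

fn main() {
    let mut table: Table = [("1", "a"), ("2", "b")]
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    let csv: Table = [("1", "A"), ("3", "c")]
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();

    // Dry run: changes are reported, the table is left untouched.
    println!("dry run: {:?}", ingest(&csv, &mut table, true));
    // Live run applies the changes; a second run then finds nothing to do.
    ingest(&csv, &mut table, false);
    println!("second run: {:?}", ingest(&csv, &mut table, false));
}
```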

Observability

Logs: Written to stdout and a configurable log file via a Rust logging framework (log/tracing/env_logger); dry-run logs include row-level diffs.
Metrics/health checks: None documented; use logs and dry-run output to validate a run.

Security and privacy

Config contains DB credentials; restrict access and avoid committing configs to git.

Dependencies

Upstream

Microsoft SQL Server via ODBC

Ownership

Owners

Josh Barton

Users

Josh Barton (owner)


Architecture & Major Components

High-level diagram (text):
- Entry/trigger -> core logic -> outputs (details per docs below)
Entry points: src/main.rs
Top-level folders: src
Key abstractions: config loader, ODBC client, CSV reader/writer, diff engine

Setup / Build / Run

Build system(s): Cargo.
Install Rust, UnixODBC, and the Microsoft ODBC driver; configure odbcinst.ini / odbc.ini.
Provide a config.toml with DB, table, and logging settings; run in dry-run mode before live updates.