Atlas project concept

Database-Ingester

Enable safe, repeatable CSV ingestion into SQL Server tables with backups and dry-run visibility. Load config, optionally back up the target table, compare CSV rows by primary key, then update/insert (and optionally delete missing rows). Dry-run produces a detailed diff report without applying changes. Restore mode rehydrates the table from a backup CSV.

Type
Component
Lifecycle
Maintenance
Last touched
2025-02-10
Visibility
Public

Purpose

Enable safe, repeatable CSV ingestion into SQL Server tables with backups and dry-run visibility.

Current state

Last touched: 2025-02-10. Functionality and completeness: Core ingestion/backup/restore features are documented; tests and CI are pending.

Next step

  • Add baseline automated tests to cover critical flows.
  • Add a CI pipeline for build/test/lint.
  • Document the deployment/runtime environment (or add a Dockerfile).
  • Document interfaces (CLI flags, API endpoints, file formats).
  • Add structured logging and basic health checks.

Interfaces

Inputs
  • TOML config file
  • Input CSV file
  • Target MSSQL table
Outputs
  • Updated table rows
  • Backup CSVs
  • Log file entries

Reality to Action trace

Reality Ingestion

Contributes in this stage.

Canonical Storage

Contributes in this stage.

Automation Engines

Not in scope.

Human Interfaces

Contributes in this stage.

Operational Adoption

Contributes in this stage.

Core workflow

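Detailed steps are TBD. In outline:

  • Load the TOML config.
  • Optionally back up the target table to a CSV.
  • Compare the input CSV rows against the table by primary key.
  • In dry-run mode, produce a detailed diff report and stop without applying changes.
  • Otherwise, update changed rows, insert new rows, and optionally delete rows missing from the CSV.
  • In restore mode, rehydrate the table from a backup CSV instead.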
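
The compare-by-primary-key step from the project summary can be sketched roughly as below. This is a minimal illustration using only the standard library, not the project's actual diff engine: the Row alias, DiffPlan, index_by_key, plan_diff, and the row helper are all hypothetical names, and rows are assumed to be simple column-name to string-value maps.

```rust
use std::collections::BTreeMap;

/// One row keyed by column name (hypothetical representation).
type Row = BTreeMap<String, String>;

/// Primary keys grouped by the action a live run would take.
#[derive(Debug, Default, PartialEq)]
struct DiffPlan {
    inserts: Vec<String>, // keys present only in the CSV
    updates: Vec<String>, // keys present in both, with differing values
    deletes: Vec<String>, // keys present only in the table
}

/// Small helper to build a Row from (column, value) pairs.
fn row(pairs: &[(&str, &str)]) -> Row {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

/// Index rows by the value of the primary key column.
/// Fails if any row lacks that column.
fn index_by_key(rows: &[Row], pk: &str) -> Result<BTreeMap<String, Row>, String> {
    let mut indexed = BTreeMap::new();
    for r in rows {
        let key = r
            .get(pk)
            .ok_or_else(|| format!("row is missing primary key column '{}'", pk))?;
        indexed.insert(key.clone(), r.clone());
    }
    Ok(indexed)
}

/// Compare CSV rows against table rows by primary key.
/// Deletes are only planned when `delete_missing` is true.
fn plan_diff(
    table: &[Row],
    csv: &[Row],
    pk: &str,
    delete_missing: bool,
) -> Result<DiffPlan, String> {
    let table = index_by_key(table, pk)?;
    let csv = index_by_key(csv, pk)?;
    let mut plan = DiffPlan::default();
    for (key, csv_row) in &csv {
        match table.get(key) {
            None => plan.inserts.push(key.clone()),
            Some(existing) if existing != csv_row => plan.updates.push(key.clone()),
            Some(_) => {} // identical row: nothing to do
        }
    }
    if delete_missing {
        for key in table.keys() {
            if !csv.contains_key(key) {
                plan.deletes.push(key.clone());
            }
        }
    }
    Ok(plan)
}
```

Under these assumptions, a dry run would print the plan and stop, while a live run would translate each key into an INSERT, UPDATE, or DELETE. Comparing a table against a CSV that already matches it yields an empty plan, which is the "no diffs on re-run" property.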

Artifacts

  • CSV headers must align with table column names; the primary key column must be specified in the config.

Operational notes

Constraints and scars

  • Requires UnixODBC + Microsoft SQL Server ODBC driver installed and configured.
  • CSV headers must match target table columns (case/trim normalized).
  • Large tables can make backup/restore operations slow.
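
The header-matching constraint above (case/trim normalized) might look something like the following. This is only a sketch: normalize and check_headers are hypothetical names, and it assumes the CSV must carry exactly the table's columns, which the notes above do not state explicitly (the real tool may accept a subset).

```rust
use std::collections::BTreeSet;

/// Normalize a column name: trim surrounding whitespace, lowercase.
fn normalize(name: &str) -> String {
    name.trim().to_lowercase()
}

/// Return Ok(()) when the normalized CSV headers equal the normalized
/// table columns as sets; otherwise describe the mismatch.
fn check_headers(csv_headers: &[&str], table_columns: &[&str]) -> Result<(), String> {
    let csv: BTreeSet<String> = csv_headers.iter().map(|h| normalize(h)).collect();
    let table: BTreeSet<String> = table_columns.iter().map(|c| normalize(c)).collect();
    if csv == table {
        return Ok(());
    }
    let missing: Vec<&String> = table.difference(&csv).collect();
    let unexpected: Vec<&String> = csv.difference(&table).collect();
    Err(format!(
        "column mismatch: missing from CSV {:?}, not in table {:?}",
        missing, unexpected
    ))
}
```

Running a check like this before any comparison or write is one way to get the "column mismatches abort" behavior with a readable error message.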

Reliability posture

  • Failure modes and safe behavior: ODBC connection errors or column mismatches abort the run; dry-run avoids writes.
  • Idempotency, retries, and batching: re-running with the same CSV yields no diffs; there are no automatic retries.
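
One way to picture the dry-run and abort-on-error behavior is a single code path where only the write call is gated. The sketch below is an assumption about structure, not the project's code: Change, TableWriter, RecordingWriter, and run are hypothetical, and the in-memory writer stands in for the ODBC client.

```rust
/// A planned change, identified by primary key (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Change {
    Insert(String),
    Update(String),
    Delete(String),
}

/// Anything that can apply a change to the target table.
trait TableWriter {
    fn write(&mut self, change: &Change) -> Result<(), String>;
}

/// In-memory stand-in for a real database writer; records what it applied.
struct RecordingWriter {
    applied: Vec<Change>,
}

impl TableWriter for RecordingWriter {
    fn write(&mut self, change: &Change) -> Result<(), String> {
        self.applied.push(change.clone());
        Ok(())
    }
}

/// Walk the planned changes and return one report line per change.
/// In dry-run mode the writer is never called. In live mode the first
/// write error aborts the run; nothing is retried.
fn run(
    changes: &[Change],
    dry_run: bool,
    writer: &mut dyn TableWriter,
) -> Result<Vec<String>, String> {
    let mut report = Vec::new();
    for change in changes {
        if dry_run {
            report.push(format!("would apply: {:?}", change));
        } else {
            writer.write(change)?;
            report.push(format!("applied: {:?}", change));
        }
    }
    Ok(report)
}
```

Keeping the dry-run and live paths in the same loop means the dry-run report describes the same changes a live run would attempt.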

Observability

  • Logs: Written to stdout and a configurable log file via a Rust logging framework (log/tracing/env_logger); dry-run logs include row-level diffs.
  • Metrics/health checks: None documented; use logs and dry-run to validate.

Security and privacy

Config contains DB credentials; restrict access and avoid committing configs to git.
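
As one possible guard for the credential concern above, a tool could check the config file's permission bits before reading it. This is a Unix-only sketch using the standard library; is_accessible_by_others is a hypothetical helper, and nothing here implies the project currently performs such a check.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Return true if the file grants any permission to group or others,
/// i.e. its mode is looser than owner-only (for example 0o644 vs 0o600).
fn is_accessible_by_others(path: &Path) -> io::Result<bool> {
    let mode = fs::metadata(path)?.permissions().mode();
    Ok((mode & 0o077) != 0)
}
```

A caller might log a warning or refuse to start when this returns true. Keeping the config out of git is a separate step, typically handled by listing the file in .gitignore.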

Dependencies

Upstream
  • Microsoft SQL Server via ODBC

Ownership

Owners

Josh Barton

Users

Josh Barton (owner)

Database-Ingester

Architecture & Major Components

  • High-level diagram (text):

    • Entry/trigger -> core logic -> outputs (details per docs below)
  • Entry points: src/main.rs

  • Top-level folders: src

  • Key abstractions: config loader, ODBC client, CSV reader/writer, diff engine

Setup / Build / Run

  • Build system(s): Cargo.
  • Install Rust, UnixODBC, and the Microsoft ODBC driver; configure odbcinst.ini / odbc.ini.
  • Provide a config.toml with DB, table, and logging settings; run in dry-run mode before live updates.
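
For orientation, a config.toml covering the DB, table, and logging settings mentioned above might be shaped like this. Every section and key name is an assumption for illustration, since the real schema is not documented here, and dry-run or backup may in practice be CLI flags rather than config keys. Only the ODBC connection-string format (Driver=...;Server=...;Database=...;UID=...;PWD=...) follows the standard SQL Server driver convention.

```toml
# Hypothetical layout: all section and key names below are assumptions.

[database]
# Standard ODBC connection string; the driver name must match odbcinst.ini.
connection_string = "Driver={ODBC Driver 18 for SQL Server};Server=localhost;Database=mydb;UID=ingester;PWD=change-me;"

[ingest]
table = "dbo.customers"     # target MSSQL table
primary_key = "id"          # column used to match CSV rows to table rows
csv_path = "customers.csv"  # input CSV; headers must match table columns
backup = true               # back up the table to CSV before writing
delete_missing = false      # delete table rows absent from the CSV
dry_run = true              # report diffs only; start here before live runs

[logging]
file = "ingester.log"
```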