Atlas project development

BOUSD-AeriesDataExportFormatter-Extract

`run.sh` orchestrates query execution and sheet sync. SQL lives under `sql/` and is referenced by TOML configs in `conf.d/`. Inputs are configuration files (TOML/YAML/JSON/INI/CONF), database tables queried through SQL scripts, and CSV files; outputs are query results (reports/extracts) and CSV files.

Type
Field Tool
Lifecycle
Active
Last touched
2025-12-05
Visibility
Public

Purpose

`run.sh` orchestrates query execution and sheet sync. SQL lives under `sql/` and is referenced by TOML configs in `conf.d/`.

Current state

Last touched: 2025-12-05. Functionality and completeness: a scripted pipeline exists; scheduling and monitoring are not documented.

Next step

  • Add baseline automated tests to cover critical flows
  • Add a CI pipeline for build/test/lint
  • Document the deployment/runtime environment (or add a Dockerfile)
  • Document interfaces (CLI flags, API endpoints, file formats)
  • Add structured logging and basic health checks

Interfaces

Inputs
  • Configuration files (TOML/YAML/JSON/INI/CONF), Database tables (SQL scripts), CSV files
Outputs
  • Query results (reports/extracts), CSV files

Reality to Action trace

Reality Ingestion

Contributes to this stage.

Canonical Storage

Not in scope.

Automation Engines

Not in scope.

Human Interfaces

Not in scope.

Operational Adoption

Not in scope.

Core workflow

TBD. Document the 5-10 steps that define the core workflow.

Artifacts

  • CSV columns defined by SQL in `sql/`; sheet ranges in `run.sh` must match destination tab schemas
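The requirement that CSV columns line up with the destination sheet range can be checked before a sync. The following is a minimal sketch, not part of the repository: the helper names and the example range `Staff!A2:F` are hypothetical, and it only compares column counts, not column names or types.

```python
import csv
import io
import re


def column_index(letters: str) -> int:
    """Convert a column label such as 'A' or 'AB' to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def range_width(a1_range: str) -> int:
    """Number of columns covered by an A1 range such as 'Staff!A2:F'."""
    cells = a1_range.split("!")[-1]  # drop the optional tab name
    match = re.fullmatch(r"([A-Za-z]+)\d*:([A-Za-z]+)\d*", cells)
    if match is None:
        raise ValueError(f"unsupported range: {a1_range!r}")
    start, end = (column_index(group) for group in match.groups())
    return end - start + 1


def header_matches_range(csv_text: str, a1_range: str) -> bool:
    """True when the CSV header has exactly as many columns as the range."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return len(header) == range_width(a1_range)
```

A check like `header_matches_range(open("data/staff.csv").read(), "Staff!A2:F")` (file name hypothetical) could run before the sheet sync step, so a changed query fails fast instead of writing misaligned columns.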

Operational notes

Constraints and scars

  • Depends on fixed Google Sheet ranges and SQL output shape; sheet column changes require query/config updates.

Reliability posture

  • Failure modes and safe behavior: ODBC connection/auth failures or Google API permission errors stop the script and leave prior sheet data intact.
  • Idempotency / retries / batching behavior: Re-runs overwrite the same sheet ranges; there is no explicit retry/backoff beyond what the underlying tools provide.
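Because re-runs overwrite the same sheet ranges, retrying a failed run should be safe. The sketch below shows one way a wrapper could add exponential backoff around a command; it is an illustration rather than existing behavior, and `run_with_retry` with its parameters is a hypothetical name.

```python
import subprocess
import time


def run_with_retry(command, attempts=3, base_delay=2.0,
                   sleep=time.sleep, runner=subprocess.run):
    """Run command until it exits 0 or attempts run out; return the last exit code."""
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = runner(command).returncode
        if returncode == 0:
            return 0
        if attempt < attempts:
            # Wait base_delay, then 2x, then 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    return returncode
```

Calling `run_with_retry(["./run.sh"])` would retry the whole pipeline; wrapping individual steps instead would avoid repeating work that already succeeded. The `sleep` and `runner` parameters exist only so the function can be exercised without real delays or processes.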

Observability

  • Logs: stdout/stderr from `run.sh`, `mssql_to_csv`, and `google-sheet-download`; CSV artifacts in `data/` and `temp/` act as run outputs
  • Metrics/health checks: None documented; rely on exit codes and CSV outputs
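Since CSV outputs are the main signal that a run happened, a basic health check could flag artifacts that are empty or stale. This is a minimal sketch under that assumption; the function name and the age threshold are hypothetical, and it does not detect files that are missing altogether.

```python
import time
from pathlib import Path


def stale_or_empty(directory, max_age_seconds, now=None):
    """Return names of CSV files that are empty or older than max_age_seconds."""
    now = time.time() if now is None else now
    problems = []
    for path in sorted(Path(directory).glob("*.csv")):
        stat = path.stat()
        if stat.st_size == 0 or now - stat.st_mtime > max_age_seconds:
            problems.append(path.name)
    return problems
```

For example, `stale_or_empty("data", 24 * 3600)` would list any CSV in `data/` that has not been rewritten in the last day, which a scheduler could turn into a non-zero exit code.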

Security and privacy

Staff data exports are processed; protect CSV outputs and logs accordingly. Service account JSON and DB credentials must be stored securely and excluded from git. A service-account private key was detected at `BOUSD-AeriesDataExportFormatter-Extract/priv_key.json`; ensure it is excluded from git, docs, and CI.
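A small scan can help confirm that no service-account key sits inside a directory that is about to be committed or published. The sketch below relies on the standard layout of Google service-account key files, which carry `"type": "service_account"` and a `private_key` field; the function name is hypothetical and the check is a heuristic, not a full secret scanner.

```python
import json
from pathlib import Path


def find_service_account_keys(root):
    """Return relative paths of JSON files that look like service-account keys."""
    hits = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or not valid JSON: skip it
        if (isinstance(data, dict)
                and data.get("type") == "service_account"
                and "private_key" in data):
            hits.append(path.relative_to(root).as_posix())
    return hits
```

Any path this returns should also be covered by `.gitignore`; `git check-ignore -q <path>` exits with status 0 when the path is ignored.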

Dependencies

Upstream
  • Aeries SQL Server via ODBC
  • Google Sheets API via service account

Ownership

Owners

Josh Barton

Users

Josh Barton (owner)

BOUSD-AeriesDataExportFormatter-Extract

Architecture & Major Components

  • High-level diagram (text):
    • Entry/trigger -> core logic -> outputs (details per docs below)
  • Entry points: `BOUSD-AeriesDataExportFormatter-Extract/run.sh`
  • Top-level folders: `bin`, `conf.d`, `data`, `sql`, `temp`
  • Key abstractions: config-driven query execution, sheet range overwrite, per-extract CSV artifacts

Setup / Build / Run

  • Build system(s): Shell scripts plus Rust CLI binaries (Cargo in tool subdirectories).
  • Install unixODBC and the Microsoft ODBC driver for SQL Server; configure `odbcinst.ini` and `odbc.ini`.
  • Provide per-extract TOML configs under `conf.d/` (DSN, query, output paths).
  • Place the Google service-account key where the sheet tool expects it (default is `priv_key.json`).
  • Run `./run.sh` to execute the extract and sync pipeline.
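Before running the pipeline, the per-extract configs can be checked for the three pieces of information listed above (DSN, query, output path). The actual key names used by the tools are not documented here, so `dsn`, `query`, and `output` below are assumptions, as are the function names; treat this as a sketch to adapt.

```python
from pathlib import Path

# Hypothetical key names; adjust to whatever the real configs use.
REQUIRED_KEYS = ("dsn", "query", "output")


def missing_keys(config):
    """Return the required keys that are absent or empty in one parsed config."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def check_conf_dir(conf_dir="conf.d"):
    """Map each TOML file name to its missing keys (only files with problems)."""
    import tomllib  # standard library from Python 3.11 onward

    report = {}
    for path in sorted(Path(conf_dir).glob("*.toml")):
        with path.open("rb") as handle:  # tomllib requires a binary file
            missing = missing_keys(tomllib.load(handle))
        if missing:
            report[path.name] = missing
    return report
```

An empty result from `check_conf_dir()` would mean every config in `conf.d/` names all three values; it does not verify that the DSN resolves or that the SQL file exists.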