Atlas project production

BOUSD-ClassSize-Extract

Uses UnixODBC + Microsoft ODBC driver to query Aeries SQL Server. Uses a Google service account to update reporting sheets after each extract. It ingests Configuration files (TOML/YAML/JSON/INI/CONF), Database tables (SQL scripts), CSV files and produces Query results (reports/extracts), CSV files.

Type: Field Tool
Lifecycle: Active
Last touched: 2025-08-18
Visibility: Public

Purpose

Uses UnixODBC + Microsoft ODBC driver to query Aeries SQL Server. Uses a Google service account to update reporting sheets after each extract.

Current state

Last touched: 2025-08-18. Functionality and completeness: Scripted pipeline exists; scheduling/monitoring are not documented.

Next step

Add baseline automated tests to cover critical flows; Add CI pipeline for build/test/lint; Document deployment/runtime environment (or add Dockerfile); Document interfaces (CLI flags, API endpoints, file formats); Add structured logging and basic health checks.

Interfaces

Inputs

Configuration files (TOML/YAML/JSON/INI/CONF), Database tables (SQL scripts), CSV files

Outputs

Query results (reports/extracts), CSV files

Reality to Action trace

Reality Ingestion

Contributes in this stage.

Canonical Storage

Contributes in this stage.

Automation Engines

Not in scope.

Human Interfaces

Contributes in this stage.

Operational Adoption

Contributes in this stage.

Core workflow

TBD. Document the 5-10 steps that define the core workflow.

Artifacts

CSV columns defined by the SQL in `sql/`; sheet ranges in `run.sh` must match destination tab schemas

Operational notes

Constraints and scars

Depends on fixed Google Sheet ranges and SQL output shape; sheet column changes require query/config updates.

Reliability posture

Failure modes and safe behavior: ODBC connection/auth failures or Google API permission errors stop the script and leave prior sheet data intact. Idempotency / retries / batching behavior: Re-runs overwrite the same sheet ranges; no explicit retry/backoff beyond underlying tools.

Observability

Logs: stdout/stderr from `run.sh`, `mssql_to_csv`, and `google-sheet-download`
Metrics/health checks: None documented; rely on exit codes and CSV outputs
Logs: stdout/stderr from run.sh and bundled CLI tools; CSV artifacts in data/ and temp/ act as run outputs.

Security and privacy

Student/class size data is processed; protect CSV outputs and logs accordingly. Service account JSON and DB credentials must be stored securely and excluded from git. README includes internal hostnames and service-account identifiers; keep those details private. Sensitive secret material detected in BOUSD-ClassSize-Extract/priv_key.json; ensure it is excluded from docs and CI.

Dependencies

Upstream

Aeries SQL Server via ODBC
Google Sheets API via service account

Ownership

Owners

Josh Barton

Users

Josh Barton (owner)

BOUSD-ClassSize-Extract

Architecture & Major Components

High-level diagram (text):
- Entry/trigger -> core logic -> outputs (details per docs below)
Entry/trigger
core logic
outputs
Entry points: BOUSD-ClassSize-Extract/run.sh
Top-level folders: bin, conf.d, data, sql, temp
Key abstractions: Config-driven query execution, sheet range overwrite, per-extract CSV artifacts

Setup / Build / Run

Build system(s): Shell scripts plus Rust CLI binaries (Cargo in tool subdirectories).
Install UnixODBC and the Microsoft ODBC driver for SQL Server; configure odbcinst.ini and odbc.ini.
Provide per-extract TOML configs under conf.d/ (DSN, query, output paths).
Place the Google service-account key where the sheet tool expects it (default is priv_key.json).
Run ./run.sh to execute the extract and sync pipeline.