Phase 5: Alembic baseline, per-site deploy, ADRs to docs/adr

Migration runner ready and a sister site can deploy from a clean
checkout with one .env file.

ADRs relocated (migrations/adr/ -> docs/adr/):
- migrations/ is now Alembic territory, not docs.
- All cross-references updated: CLAUDE.md, docs/PLUGIN-HOOKS.md,
  docs/PLUGIN-QUICKSTART.md.

Alembic initialized (migrations/):
- env.py, script.py.mako, alembic.ini copied from Flask-Migrate
  templates so `flask db migrate` and `flask db upgrade` work without
  a one-time `flask db init` (which would clash with the existing
  migrations/ directory).
- Baseline migration generated via autogenerate, captures all 47
  tables (core models + 6 plugins) as the upgrade target. Ready for
  per-site `flask db upgrade` from an empty schema.

Deploy artifacts:
- Dockerfile: python:3.12-slim base, gunicorn server, non-root user,
  healthcheck against /api/auth/login. Single image bundles all six
  plugins; sites enable via `flask plugin install <name>`.
- docker-compose.yml: MySQL 8 + API container, healthcheck-gated
  startup, env-driven secrets that fail loud on missing values
  (`${SECRET_KEY:?}` form).
- .env.example: full env-var inventory with comments. Calls out
  required vs optional. Matches what ProductionConfig.validate
  enforces.

docs/DEPLOY.md:
- Step-by-step per-site runbook: clone, configure .env, bring up
  stack, run migrations, seed reference data, install plugins, create
  admin, front with TLS, backups, updates.
- Common-issues table.
- Cross-links to ADR-004 (per-site rationale), ADR-003 (plugin
  distribution), and the config source.

Skills:
- migrating-asset-schema: Alembic + one-shot data migration policy.
  Rules: additive first, renames are three steps, destructive ops
  need rollback, equipment migration filter per ADR-001 + ADR-005.
- hardening-flask-config: production validation, CORS allowlist
  policy, JWT cookie hardening, per-site deploy isolation per ADR-004.

CLAUDE.md updated to reflect the post-Phase-5 state.

No tests added this commit; the Alembic baseline is exercised by the
existing db.create_all-based test suite (tests do not touch the
migration runner; that's by design until per-plugin migrations land).
Test count unchanged: 101 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 17:56:19 -04:00
parent 8eb9362452
commit d4e3ac9fc8
19 changed files with 1503 additions and 41 deletions
--- a/docs/adr/ADR-006-collector-contract.md
+++ b/docs/adr/ADR-006-collector-contract.md
@@ -0,0 +1,131 @@
+# ADR-006: Plugin collector contract pattern
+
+- **Status:** ACCEPTED
+- **Date:** 2026-05-08
+- **Deciders:** cproudlock
+- **Supersedes:** none
+
+## Context
+
+PC inventory data was collected by PowerShell scripts pushing to `/api/collector/pc` (`shopdb/core/api/collector.py`, ~374 LOC). The endpoint is hardcoded for PCs: it accepts a fixed schema and writes to the legacy `Machine` model.
+
+Per ADR-001, `Machine` is being retired in favor of `Asset`. Per the project shift to PXE-driven imaging, PC inventory is moving to a new collection pipeline (PXE / GE-Enforce / manifest engine produces JSON about each PC). Other asset classes may want similar collector pipelines (printers via Zabbix, network gear via SNMP scan).
+
+This calls for a generalizable contract: any plugin that wants to accept external collector input declares a JSON schema, and the framework wires the endpoint, auth, and idempotency.
+
+## Decision
+
+`BasePlugin` gets one new optional hook:
+
+```python
+def get_collector_schema(self) -> Optional[dict]:
+    """Return JSON Schema describing the collector payload for this plugin.
+    Return None if the plugin does not accept collector input.
+
+    The schema must include:
+    - 'identityfield': name of the field that uniquely identifies an asset
+      across submissions (e.g., 'hostname' for PCs, 'macaddress' for network
+      devices). Used for idempotent upsert.
+    - 'fields': JSON Schema definitions for the rest of the payload.
+    """
+    return None
+```
+
+The plugin loader auto-registers an endpoint at `/api/collector/<pluginname>` for each plugin that returns a schema. Auth uses an API key, separate from JWT. Keys are supplied per plugin via env vars:
+
+- `COLLECTOR_API_KEY_<PLUGINNAME>` (preferred, plugin-specific)
+- `COLLECTOR_API_KEY` (fallback, shared)
+
+### Idempotent upsert
+
+The endpoint uses the `identityfield` to look up an existing `Asset` with the same identity. If one is found, it is updated; if not, a new one is inserted. Existing relationships are preserved on update.
+
+### Response contract
+
+```json
+{
+  "status": "ok",
+  "action": "created" | "updated" | "noop",
+  "assetid": 12345,
+  "identityvalue": "PC-1234",
+  "warnings": []
+}
+```
+
+### Audit logging
+
+Every collector submission produces an audit log entry: `{action, plugin, identityvalue, before/after diff}`. Audit retention follows each site's policy.
+
+### Schema discovery
+
+The framework exposes the registered schemas at `/api/collector/_schemas` (read-only, JWT-protected) so external collector authors can introspect what payloads are accepted by which plugins.
+
+## Concrete first user: computers plugin
+
+The `computers` plugin is the first to implement `get_collector_schema`. The PXE pipeline submits payloads that conform to its schema.
+
+Initial computers collector schema (sketch, finalized when plugin is built):
+
+```json
+{
+  "identityfield": "hostname",
+  "fields": {
+    "hostname": "string, required",
+    "macaddress": "string, optional, secondary identity",
+    "osname": "string",
+    "osversion": "string",
+    "lastboottime": "datetime",
+    "currentuser": "string",
+    "ipaddress": "string",
+    "memorygb": "number",
+    "cputype": "string",
+    "imagename": "string (PXE image deployed)",
+    "imageappliedat": "datetime",
+    "installedsoftware": "array of {name, version}"
+  }
+}
+```
+
+The PC re-image case is handled by the identity field: a freshly imaged PC keeps its hostname, so the existing `Asset` row is updated rather than duplicated. Existing `AssetRelationship` rows pointing at that PC (e.g., `controls` to a machine) are preserved across re-images.
+
+## Migration of the existing endpoint
+
+`shopdb/core/api/collector.py` (`/api/collector/pc`) is **deprecated** in v1 and **removed** before v1.0.
+
+Migration path:
+
+1. Implement `get_collector_schema` on the `computers` plugin. New endpoint `/api/collector/computers` is auto-registered.
+2. Run both endpoints in parallel for one cycle of PXE imaging across the floor. The PXE pipeline switches to `/api/collector/computers`.
+3. Remove `shopdb/core/api/collector.py` and the legacy blueprint registration.
+
+## Consequences
+
+### Positive
+
+- Generalizable across plugins. Sister sites adopting `printers`, `network`, etc. can wire their own collectors with no core change.
+- Identity-based idempotency makes PC re-imaging safe by default.
+- Audit logging is uniform across plugins.
+- Schema discovery enables external tools to validate before submission.
+
+### Negative
+
+- Plugin authors must write a JSON schema. Slight learning curve, but JSON Schema is widely understood and the framework can ship a few examples.
+- The `/api/collector/_schemas` endpoint plus per-plugin endpoints expand the public API surface; minor maintenance cost.
+
+### Neutral
+
+- API-key auth pattern stays as it is today (separate from JWT). Sites manage their own collector keys per plugin via env vars.
+
+## Alternatives considered
+
+1. **Keep `/api/collector/pc` and add new plugin-specific endpoints alongside.** This leaves two ways to send PC data and confuses plugin authors. Rejected.
+2. **Use JWT for collectors instead of API key.** Collectors are headless processes (PXE pipeline, scripts), not interactive users. JWT lifecycle (refresh tokens, expiry) is the wrong tool. API key is simpler. Rejected.
+3. **Plugins write directly to the database, no collector endpoint.** Skips audit logging and schema validation. Rejected.
+
+## References
+
+- `shopdb/core/api/collector.py` (legacy endpoint to be removed)
+- `shopdb/plugins/base.py` (`get_collector_schema` hook to be added)
+- ADR-001 (asset model the collectors target)
+- ADR-002 (collector schema is part of plugin contract; changes to the hook signature are major bumps)
+- The PXE project (`/home/camp/projects/pxe/`) which feeds the computers collector
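
The key lookup and idempotent upsert described in ADR-006 could be sketched roughly as follows. This is a minimal, framework-free illustration under stated assumptions, not the project's implementation: the function names `resolve_collector_key` and `upsert_asset`, the in-memory `store` standing in for the `Asset` table, upper-casing the plugin name to build the env var, and the rule that `noop` means "the payload changes nothing" are all hypothetical choices the ADR does not specify.

```python
import os
from typing import Mapping, Optional


def resolve_collector_key(
    pluginname: str, env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Prefer the plugin-specific key, then fall back to the shared key.

    Assumption: <PLUGINNAME> in the env var is the upper-cased plugin name.
    """
    specific = env.get(f"COLLECTOR_API_KEY_{pluginname.upper()}")
    return specific or env.get("COLLECTOR_API_KEY")


def upsert_asset(store: dict, schema: dict, payload: dict) -> dict:
    """Idempotent upsert keyed on schema['identityfield'].

    `store` is a hypothetical stand-in for the Asset table, mapping
    identity value -> {"assetid": int, "data": dict}.
    """
    identityvalue = payload[schema["identityfield"]]
    existing = store.get(identityvalue)

    if existing is None:
        # Not found: insert a new asset under this identity.
        existing = {"assetid": len(store) + 1, "data": dict(payload)}
        store[identityvalue] = existing
        action = "created"
    else:
        merged = {**existing["data"], **payload}
        if merged == existing["data"]:
            # Assumption: an unchanged submission is reported as "noop".
            action = "noop"
        else:
            # Found: update in place; the asset id (and anything that
            # references it) stays the same.
            existing["data"] = merged
            action = "updated"

    # Shape follows the response contract in the ADR.
    return {
        "status": "ok",
        "action": action,
        "assetid": existing["assetid"],
        "identityvalue": identityvalue,
        "warnings": [],
    }
```

Because the asset id is never reassigned on update, a re-imaged PC that keeps its hostname lands on the same row, which is the behaviour the ADR relies on to preserve relationships. A real implementation would also need schema validation, audit logging, and a constant-time key comparison, none of which are shown here.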