Phase 5: Alembic baseline, per-site deploy, ADRs to docs/adr

Migration runner ready and a sister site can deploy from a clean
checkout with one .env file.

ADRs relocated (migrations/adr/ -> docs/adr/):
- migrations/ is now Alembic territory, not docs.
- All cross-references updated: CLAUDE.md, docs/PLUGIN-HOOKS.md,
  docs/PLUGIN-QUICKSTART.md.

Alembic initialized (migrations/):
- env.py, script.py.mako, alembic.ini copied from Flask-Migrate
  templates so `flask db migrate` and `flask db upgrade` work without
  a one-time `flask db init` (which would clash with the existing
  migrations/ directory).
- Baseline migration generated via autogenerate, captures all 47
  tables (core models + 6 plugins) as the upgrade target. Ready for
  per-site `flask db upgrade` from an empty schema.

Deploy artifacts:
- Dockerfile: python:3.12-slim base, gunicorn server, non-root user,
  healthcheck against /api/auth/login. Single image bundles all six
  plugins; sites enable via `flask plugin install <name>`.
- docker-compose.yml: MySQL 8 + API container, healthcheck-gated
  startup, env-driven secrets that fail loud on missing values
  (`${SECRET_KEY:?}` form).
- .env.example: full env-var inventory with comments. Calls out
  required vs optional. Matches what ProductionConfig.validate
  enforces.

docs/DEPLOY.md:
- Step-by-step per-site runbook: clone, configure .env, bring up
  stack, run migrations, seed reference data, install plugins,
  create admin, front with TLS, backups, updates.
- Common-issues table.
- Cross-links to ADR-004 (per-site rationale), ADR-003 (plugin
  distribution), and the config source.

Skills:
- migrating-asset-schema: Alembic + one-shot data migration policy.
  Rules: additive first, renames are three steps, destructive ops
  need rollback, equipment migration filter per ADR-001 + ADR-005.
- hardening-flask-config: production validation, CORS allowlist
  policy, JWT cookie hardening, per-site deploy isolation per ADR-004.

CLAUDE.md updated to reflect the post-Phase-5 state. No tests added
this commit. The models captured by the Alembic baseline are exercised
by the existing db.create_all-based test suite, but the tests do not
run the migration runner itself; that's by design until per-plugin
migrations land.

Test count unchanged: 101 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cproudlock
2026-05-08 17:56:19 -04:00
parent 8eb9362452
commit d4e3ac9fc8
19 changed files with 1503 additions and 41 deletions

@@ -0,0 +1,173 @@
# ADR-001: Asset model is the platform contract
- **Status:** ACCEPTED
- **Date:** 2026-05-08
- **Deciders:** cproudlock
- **Supersedes:** none
## Context
shopdb-flask is being shaped as a framework that sister GE Aerospace facilities can adopt. The framework defines a stable core; sites install plugins for the asset classes they care about (equipment, computers, printers, measuring tools, network gear, etc.).
The codebase ran two parallel object models:
1. **Legacy `Machine` model** in `shopdb/core/models/machine.py`. Original schema inherited from the classic-ASP shopdb. Tables: `machines`, `pctypes`, `machinetypes`. Plugins like printers stored extension data via `PrinterData` keyed by `machineid`.
2. **New `Asset` model** in `shopdb/core/models/asset.py`. Generic asset abstraction with `AssetType`, `AssetStatus`, `AssetRelationship`. Plugins also exposed asset-based tables keyed by `assetid`.
For the framework to be adoptable, the platform contract has to be one model, documented, versioned, and stable.
## Decision
**`Asset` is the platform contract.** Plugin authors target the asset-based API. The `Machine` model and its dependents (`PCType`, `MachineType`, `PrinterData` keyed by `machineid`) are legacy and will be retired through a tracked migration.
### Platform contract surface
The following are the public, versioned surface. Plugin authors may depend on them. Breaking changes require a major version bump per ADR-002.
#### Models
- `Asset` (core entity)
- `AssetType` (asset classification, registered by plugins)
- `AssetStatus` (active / inactive / decommissioned / retired)
- `AssetRelationship` + `RelationshipType` (cross-plugin links)
- `Vendor`, `Location`, `LocationType`, `BusinessUnit`, `Model`, `OperatingSystem` (shared reference data)
#### Helpers
- `AuditLog` API: `audit_log(action, entitytype, entityid, ...)` for plugins to record audit entries with consistent schema
- `Setting` API: `plugin.get_setting(key)` and `plugin.set_setting(key, value)` for plugin-scoped config persisted via the core `Setting` model
#### Plugin contract
- `BasePlugin` ABC and its hooks (search, navigation, dashboard, relationships, collector schema)
#### Excluded from the contract for v1
- Event bus (`get_event_handlers` removed from `BasePlugin`). Add later if a real use case appears.
### Relationship types (seeded core values)
`RelationshipType` is seeded with three rows. Plugin authors do not add new types in v1.
| Type | Meaning | Boundary rule |
|------|---------|---------------|
| `partof` | Composition, siblings, sub-assemblies | Parent dies without it |
| `controls` | One asset has operational authority over another | PC commands a machine, measuring tool, etc. |
| `connectedto` | Network or data link without operational authority | Cable, switch port, NAS mount |
### AssetRelationship columns
```
AssetRelationship
- relationshipid PK
- sourceassetid FK to assets
- targetassetid FK to assets
- relationshiptypeid FK to relationshiptypes (one of partof, controls, connectedto)
- label text, free description (e.g., "ethernet PoE", "DNC feed")
- inheritsposition bool, true means resolved-position walk follows this edge
- propagatesthroughid FK to relationshiptypes, nullable, see propagation below
- notes text
- isactive bool
- createddate, modifieddate
```
Free-text `label` carries the domain nuance (`controls` with label "DNC feed" vs `controls` with label "operator workstation"). Avoids inflating the type list.
### Sibling propagation
When a relationship is created or deleted, the framework checks `RelationshipType.propagatesthroughid`. If set, it finds all assets related to the source via the propagation type and applies the same change.
Default seeding:
| RelationshipType | propagatesthroughid |
|------------------|---------------------|
| `partof` | null (the propagation rail itself) |
| `controls` | `partof` (controls relationships propagate across siblings) |
| `connectedto` | null (network paths don't propagate) |
Cycle protection: max walk depth 3, visited-set during traversal.
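A minimal sketch of the propagation rule above, assuming relationships are held as `(source, target, type)` tuples in a set. The names `related_assets` and `create_relationship` are hypothetical, not framework APIs. It reads "related to the source" literally; whether the target side is also expanded is not specified here.
```python
from collections import deque

# Default seeding from the table above: type -> type it propagates through.
PROPAGATES_THROUGH = {"partof": None, "controls": "partof", "connectedto": None}
MAX_WALK_DEPTH = 3  # cycle protection: bounded depth plus a visited set


def related_assets(edges, start, via_type, max_depth=MAX_WALK_DEPTH):
    """Return assets reachable from `start` over `via_type` edges (either direction)."""
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        asset, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not walk past the depth limit
        for src, tgt, rtype in edges:
            if rtype != via_type:
                continue
            if src == asset:
                neighbour = tgt
            elif tgt == asset:
                neighbour = src
            else:
                continue
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, depth + 1))
    visited.discard(start)
    return visited


def create_relationship(edges, source, target, rtype):
    """Add the edge, then mirror it for every asset related to the source."""
    new_edges = {(source, target, rtype)}
    via = PROPAGATES_THROUGH.get(rtype)
    if via is not None:
        for sibling in related_assets(edges, source, via):
            if sibling != target:
                new_edges.add((sibling, target, rtype))
    edges.update(new_edges)
    return new_edges
```
With `edges = {("a1", "a2", "partof")}`, creating `("a1", "x", "controls")` also creates `("a2", "x", "controls")`, while a `connectedto` edge stays on the single pair. Deletion would mirror the same walk.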
### Position resolution
Resolved map position for any asset follows this priority chain:
1. Asset-specific override (`assets.mapx`, `assets.mapy` if non-null)
2. Walk relationships where `inheritsposition=true`, ordered by relationship type priority (`partof` first, then `controls`), recursively resolve the related asset's position with cycle detection
3. Fall back to the asset's location coords (`locations.mapx`, `locations.mapy` via `assets.locationid`)
4. If none of the above, asset is "unplaced" and rendered in a tray, not on the map
API exposes resolved position with a source indicator: `{"mapx": 234, "mapy": 567, "positionsource": "self|related|location|none"}`.
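The priority chain can be sketched as below. This is an illustration under assumptions, not the framework's resolver: rows are plain dicts, the function name `resolve_position` is hypothetical, and an edge is assumed to let its source inherit the position of its target.
```python
TYPE_PRIORITY = {"partof": 0, "controls": 1}  # lower value is walked first
UNPLACED = {"mapx": None, "mapy": None, "positionsource": "none"}


def _coords(row):
    """Return (mapx, mapy) when both are set on the row, else None."""
    if row is None or row.get("mapx") is None or row.get("mapy") is None:
        return None
    return row["mapx"], row["mapy"]


def resolve_position(asset_id, assets, locations, relationships, _visited=None):
    visited = set() if _visited is None else _visited
    if asset_id in visited:
        return dict(UNPLACED)  # cycle detected: stop this branch
    visited.add(asset_id)
    asset = assets[asset_id]

    # 1. Asset-specific override.
    own = _coords(asset)
    if own is not None:
        return {"mapx": own[0], "mapy": own[1], "positionsource": "self"}

    # 2. Walk inheritsposition edges, partof before controls.
    edges = [r for r in relationships
             if r["source"] == asset_id and r["inheritsposition"]]
    edges.sort(key=lambda r: TYPE_PRIORITY.get(r["type"], len(TYPE_PRIORITY)))
    for edge in edges:
        found = resolve_position(edge["target"], assets, locations,
                                 relationships, visited)
        if found["positionsource"] != "none":
            return {"mapx": found["mapx"], "mapy": found["mapy"],
                    "positionsource": "related"}

    # 3. Fall back to the asset's location coordinates.
    place = _coords(locations.get(asset.get("locationid")))
    if place is not None:
        return {"mapx": place[0], "mapy": place[1], "positionsource": "location"}

    # 4. Unplaced: rendered in the tray.
    return dict(UNPLACED)
```
Note that a related asset's own location fallback counts as `related` for the asset that inherits from it, because step 2 resolves the related asset recursively before step 3 runs.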
### Hierarchical locations
`locations` is extended with `parentlocationid` (self-FK, nullable). Sites define their own location tree (cells, sub-cells, network closets, meeting rooms, labs).
`locationtypes` lookup is added with seeded values: `section`, `cell`, `subcell`, `meetingroom`, `lab`, `office`, `storage`, `hallway`, `networkcloset`, `building`. Sites can extend.
Asset-to-location is the existing `assets.locationid` FK only (single primary location for v1). Multi-location and transient placement are deferred.
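As an illustration of the self-referencing `parentlocationid`, the sketch below builds a root-to-leaf path for one location. The dict-based rows, the `name` field, and the helper `location_path` are assumptions made for the example.
```python
def location_path(location_id, locations):
    """Return location names from the root of the tree down to `location_id`."""
    path = []
    seen = set()
    current = location_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in location tree at id {current}")
        seen.add(current)
        row = locations[current]
        path.append(row["name"])
        current = row["parentlocationid"]  # None marks a root location
    path.reverse()
    return path


locations = {
    1: {"name": "Building 2", "parentlocationid": None},
    2: {"name": "Cell 14", "parentlocationid": 1},
    3: {"name": "Subcell 14-B", "parentlocationid": 2},
}
print(" / ".join(location_path(3, locations)))
```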
## Migration scope (narrowed)
Only one class of legacy data migrates: physical manufacturing equipment with a `machinenumber` (5-axis mills, lathes, broachers, heat treatment, CMMs, etc.).
Migration filter:
```sql
SELECT *
FROM legacy.machines
WHERE category = 'Equipment'
  AND machinenumber IS NOT NULL
```
Rows without `machinenumber`: **skipped.** Add manually post-migration if any matter.
Migrated rows become `assets` with `assettype='Equipment'` plus a row in the equipment plugin's `equipment` table. Legacy `machinetypeid` is preserved on the equipment row to enable later reclassification (some "Equipment" rows are actually measuring tools, see ADR-005).
Skipped from migration:
- PCs (rebuilt via PXE collector pipeline, see ADR-006)
- Printers, USB devices, network gear, notifications, KB articles (operational or re-collectable, decision per class deferred until needed)
## Consequences
### Positive
- Plugin authors at sister sites have a single, documented contract to target
- Cross-plugin features (relationships, search, map) work uniformly
- Three relationship types are easy to learn; free-text label captures nuance
- Sibling propagation handles dual-path machines without code changes per use case
- Position resolution gives users one map without per-plugin map code
### Negative
- Migration of equipment data must run cleanly; `migrating-asset-schema` skill owns the procedure
- Frontend `views/machines/` will be repointed or replaced; effort tracked in Phase 5
- Legacy `core/api/machines.py` (641 LOC) becomes deprecated; deletion or shim deferred until equipment is migrated
### Neutral
- `Machine`, `PCType`, `MachineType`, `PrinterData` remain in the schema until the migration completes, behind a deprecation notice. New code does not touch them.
## Alternatives considered
1. **Keep both models indefinitely.** Doubles test surface, confuses contract docs, breaks cross-plugin features. Rejected.
2. **Make `Machine` the contract; retire `Asset`.** `Machine` has shop-floor-specific assumptions that don't generalize. Rejected.
3. **Define a higher abstraction above both.** Yet another layer. `Asset` is already the abstraction. Rejected.
4. **More relationship types** (`controls`, `operates`, `monitors`, `dncfeeds`, `mountedon`, `connectedvia`, `inspects`, `siblingbay`). Eight types proved unwieldy. Collapsed to three with free-text labels and per-row inheritance/propagation flags.
## Open questions deferred
- Lifecycle relationship type (`replaces`, `replacedby`). Add when first needed.
- Multi-location asset placement, transient placement (calibration trip, off-site repair). Add when first needed.
- Map editing UI (drag locations onto floor plan). Phase 6 polish.
- Plugin-extensible relationship types beyond the three seeded. Add when a real cross-plugin case can't be expressed via labels.
## References
- `shopdb/core/models/asset.py`
- `shopdb/core/models/machine.py` (legacy, deprecated)
- `shopdb/plugins/base.py`
- ADR-002 (versioning of the surface)
- ADR-003 (plugin distribution)
- ADR-004 (deployment topology)
- ADR-005 (equipment vs measuring tools split)
- ADR-006 (collector contract pattern)

@@ -0,0 +1,60 @@
# ADR-002: Plugin contract versioning
- **Status:** ACCEPTED
- **Date:** 2026-05-08
- **Deciders:** cproudlock
- **Supersedes:** none
## Context
Once sister sites start writing their own plugins (or pulling community plugins), the framework's plugin contract becomes a public API. Without a versioning story, any change to `BasePlugin` or the core platform models can silently break installed plugins at remote sites.
The existing `BasePlugin` and `PluginMeta` already declare a `core_version` field (default `">=1.0.0"`), but it is not enforced anywhere. The plugin loader does not check it before instantiation.
## Decision
The framework adopts **semantic versioning for the plugin contract**, declared in two places:
1. **Framework version** (`shopdb/__init__.py`): a single `__contract_version__` constant. This is the version of the platform contract as defined in ADR-001. Bumped according to semver:
   - **Major**: breaking change to `BasePlugin` ABC, `PluginMeta` schema, or any model in the platform contract enumerated in ADR-001 (`Asset`, `AssetType`, `AssetStatus`, `AssetRelationship`, `RelationshipType`, `Vendor`, `Location`, `LocationType`, `BusinessUnit`, `Model`, `OperatingSystem`).
   - **Minor**: additive change (new optional hook, new field on a contract model with default).
   - **Patch**: bug fix, no contract surface change.
2. **Plugin requirement** (`plugins/<name>/manifest.json`): the existing `core_version` field, expressed as a semver range (e.g., `">=1.0.0,<2.0.0"`).
The plugin loader (`shopdb/plugins/loader.py`) checks `core_version` against `__contract_version__` at load time. Mismatch in dev = re-raise (fail loud). Mismatch in prod = log error, mark plugin as incompatible, exclude from registration.
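A rough sketch of what the load-time check could look like, using only the standard library. Everything named here (`satisfies`, `check_plugin`, `IncompatiblePluginError`, the `debug` flag) is hypothetical, and it handles only plain `MAJOR.MINOR.PATCH` versions with comma-separated comparison clauses. A real implementation would more likely lean on the `packaging` library's `SpecifierSet`.
```python
import logging
import operator
import re

CONTRACT_VERSION = "1.2.0"  # stands in for shopdb.__contract_version__

_CLAUSE = re.compile(r"^(>=|<=|==|>|<)\s*(\d+)\.(\d+)\.(\d+)$")
_OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
        ">": operator.gt, "<": operator.lt}


class IncompatiblePluginError(RuntimeError):
    """Raised in dev when a plugin's core_version excludes the contract version."""


def satisfies(version, spec):
    """True when `version` meets every comma-separated clause in `spec`."""
    current = tuple(int(part) for part in version.split("."))
    for clause in spec.split(","):
        match = _CLAUSE.match(clause.strip())
        if match is None:
            raise ValueError(f"unsupported version clause: {clause!r}")
        bound = tuple(int(group) for group in match.groups()[1:])
        if not _OPS[match.group(1)](current, bound):
            return False
    return True


def check_plugin(name, core_version, contract_version=CONTRACT_VERSION, debug=False):
    """Return True if compatible; in dev raise, in prod log and return False."""
    if satisfies(contract_version, core_version):
        return True
    message = (f"plugin {name!r} requires core {core_version}, "
               f"framework contract is {contract_version}")
    if debug:
        raise IncompatiblePluginError(message)  # dev: fail loud
    logging.getLogger(__name__).error(message)  # prod: report, then exclude
    return False
```
The loader would skip registration for any plugin where `check_plugin` returns `False`, which keeps the mismatch visible in the logs.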
The `__contract_version__` starts at **`1.0.0`** when ADR-001 is accepted and the `Machine` retirement migration is complete (whichever comes later). Until then, the framework is pre-1.0; plugins should declare `core_version: ">=0.1.0,<1.0.0"`.
## Consequences
### Positive
- Sister sites can pin a known-good framework version. They will not be silently broken when the framework is upgraded.
- Plugin authors know what counts as a breaking change because the contract surface is enumerated in ADR-001.
- The loader fails predictably: a mismatched plugin is reported, not silently disabled.
### Negative / cost
- Discipline required: every change to the contract surface must be classified (major / minor / patch). Adding a `version-bump` skill (or a check in code review) reduces the chance of mis-classification.
- `__contract_version__` becomes a coupling point. Forgetting to bump it after a breaking change lets incompatible plugins pass the load-time check and then fail at runtime, instead of being rejected when they load.
### Neutral
- Existing plugins (`plugins/printers/`, etc.) ship as part of the framework, so their `core_version` is always the current `__contract_version__`. The discipline matters mostly for external / sister-site plugins.
## Alternatives considered
1. **No versioning, just trust.** Works for an in-tree-only world. Fails the moment a sister site ships its own plugin. Rejected.
2. **Calendar versioning** (e.g., `2026.05.0`). Easier to bump, harder to communicate breaking changes. Rejected; semver is the industry standard for library-like contracts.
3. **Per-hook versioning.** Each hook has its own version. Too granular; plugins still couple to multiple hooks. Rejected.
## Open questions
- When does the framework declare `1.0.0`? Tied to ADR-001 (retirement of `Machine` in favor of `Asset`) and the framework being deemed "ready for sister sites". Best-effort target: end of Phase 5 in the refactor plan.
- Should `core_version` accept npm-style caret ranges (`^1.0.0`) or stick to PEP 440 version specifiers? Recommend PEP 440 / pip-style (`>=,<`) to match the Python ecosystem.
## References
- `shopdb/plugins/base.py` (PluginMeta declaration)
- `shopdb/plugins/loader.py` (where the version check belongs)
- ADR-001 (defines what is in the contract)

@@ -0,0 +1,68 @@
# ADR-003: Plugin distribution model
- **Status:** ACCEPTED
- **Date:** 2026-05-08
- **Deciders:** cproudlock
## Context
Sister sites adopting shopdb-flask need a way to:
1. Install the framework
2. Pick which plugins they want
3. Build their own plugins for site-specific equipment
4. Receive updates to both framework and plugins
Today, every plugin lives in the `plugins/` directory of the framework repo. There is no separation between framework code and plugin code, no install / uninstall, and no way for a site to develop a plugin without forking the whole repo.
Three viable distribution models:
| Model | How a site installs a plugin |
|---|---|
| **In-tree only** | Fork the framework repo, add plugin under `plugins/`, run their own deploy. No separation. |
| **Pip-installable plugins** | Each plugin published as a Python package. Site does `pip install shopdb-printers shopdb-network` etc. Discovery via Python entry points. Framework loads any installed plugin that registers itself. |
| **Git-based plugins** | Each plugin lives in its own git repo. Site clones / submodules into `plugins/<name>/`. Loader picks them up from the directory. |
## Decision
Use a **hybrid model** with two clearly labeled paths.
1. **Bundled plugins**: a small set of plugins ships with the framework, in-tree at `plugins/`. These are the reference implementations and the default install (printers, computers, network, equipment, usb, notifications). A site that wants only what's bundled needs no extra work.
2. **External plugins**: sister sites or third parties build plugins in their own git repos. The site running the framework drops the plugin into `plugins/<name>/` (clone, submodule, or symlink) and runs `flask plugin install <name>`. No pip packaging required for v1.
Pip-installable plugins (Python entry-point discovery) are deferred to v2. The complexity is not justified until at least two sites are running their own plugins.
## Consequences
### Positive
- v1 is simple: filesystem-based discovery (already implemented in `shopdb/plugins/loader.py`), works for both bundled and external plugins.
- Sites can develop plugins without changing the framework repo.
- The `plugins/` directory is already the canonical location, so no architectural change is needed.
### Negative / cost
- No automatic update path for external plugins. Sites must `git pull` in each plugin directory manually. Acceptable for v1; revisit when plugin count grows.
- Multiple plugin authors writing in parallel can collide on namespace (e.g., two plugins both registering an `AssetType` named "Equipment"). Need a naming policy: plugin names and asset-type names should be prefixed with the site or org if not in the bundled set.
### Neutral
- The existing in-tree pattern keeps working. This decision just formalizes it and clarifies the path for outside-the-tree plugins.
## Alternatives considered
1. **Pip-installable from day one.** Cleaner for the long term but adds packaging, entry-point registration, and CI steps. Premature for current scale (one site running, no sister-site plugins yet).
2. **In-tree only forever.** Forces every site to fork. Doesn't scale beyond two or three sites.
3. **Submodules only.** Forces git-submodule discipline on every adopting site. Submodules are notoriously fiddly. Rejected.
## Open questions
- For external plugins, should there be a manifest field (`source_url`) declaring where the plugin can be cloned from, so `flask plugin install` could pull it for the site? Defer; manual clone is fine for v1.
- Naming convention for non-bundled plugin directory names: prefix with the site (e.g., `gea-wjsf-shipping`)? Adopt if and when we hit a name collision.
## References
- `shopdb/plugins/loader.py` (filesystem discovery)
- `shopdb/plugins/cli.py` (plugin install / uninstall command)
- ADR-001 (defines what plugins target)
- ADR-002 (defines plugin version compatibility)

@@ -0,0 +1,69 @@
# ADR-004: Deployment topology (per-site instances)
- **Status:** ACCEPTED
- **Date:** 2026-05-08
- **Deciders:** cproudlock
## Context
shopdb-flask manages shop-floor inventory. If multiple GE Aerospace sites adopt it, the deployment can take one of two shapes:
| Model | How it works |
|---|---|
| **Per-site instances** | Each site runs its own Flask + MySQL + Vue stack. Each site has its own DB, its own users, its own enabled-plugin list, its own deploy. Sites are isolated. |
| **Multi-tenant single instance** | One central Flask + MySQL + Vue stack serves all sites. A `siteid` foreign key on every asset partitions data. Auth distinguishes which site a user belongs to. |
The codebase today is single-tenant per deployment. There is no `siteid` column, no tenant filter, no cross-site auth model. Plugins can be enabled / disabled but only globally for the running instance.
## Decision
**Per-site instances.** Each adopting site runs its own dedicated stack. The framework does not support multi-tenancy.
Each site:
- Owns its database (own credentials, own backup policy, own retention)
- Picks its own enabled plugins
- Configures its own JWT secret, CORS allowlist, Zabbix integration, Active Directory binding
- Deploys at its own cadence
The framework provides:
- A `Dockerfile` and `docker-compose.yml` template suitable for a single-site deploy
- A `.env.example` listing all required environment variables
- A `docs/DEPLOY.md` walking through a fresh-site install
## Consequences
### Positive
- Simpler code: no tenant filter on every query, no cross-tenant auth, no shared-state partitioning bugs.
- Sites are independent. A schema change at one site does not affect another. The blast radius of a plugin crash is limited to the site where it happens.
- Clear ownership: each site's IT team owns their own stack and data. Compliance and audit boundaries match operational boundaries.
- Aligns with how GE Aerospace sites already operate (independent IT, independent shop floors).
### Negative / cost
- No cross-site reporting out of the box. If GE corporate ever wants a fleet-wide view, it has to be built on top (e.g., a roll-up dashboard that queries each site's API). That layer is out of scope for the framework.
- Each site administers its own stack. Higher operational overhead than a single central instance, but each site already runs its own infrastructure.
- Updates require visiting each site's deploy. Fine for the current adoption model; revisit if dozens of sites adopt.
### Neutral
- No `siteid` column needed. The existence of one DB per site is the partition.
## Alternatives considered
1. **Multi-tenant single instance.** Lower operational overhead at scale, easier cross-site reporting, but adds significant code complexity and risk: every query needs a tenant filter, auth gets complex, schema migrations affect every site at once, and a bug at one site can leak data across sites. Rejected for v1; revisit if and only if more than five sites adopt and operational overhead becomes painful.
2. **Hybrid: per-site DB but central app server.** Adds the operational complexity of multi-tenancy without isolating the failure domain (one app crash = all sites down). Rejected.
## Open questions
- Should the framework provide an optional **read-only fleet roll-up** mode where a "central" instance can pull aggregate metrics from each site's API? Defer. Out of scope for v1.
- Backup strategy per site: framework recommendation, or each site decides? Framework should publish a recommended backup runbook (mysqldump + offsite copy) but not enforce.
- Auth federation: each site has its own user table, or sites can share an LDAP / SSO? Recommend documenting the LDAP config knob in `.env.example` so sites can plug in their own auth without code change.
## References
- `shopdb/config.py` (currently single-tenant, no `siteid`)
- ADR-001 (asset model is per-site, not cross-site)
- ADR-003 (plugin distribution per site)

@@ -0,0 +1,149 @@
# ADR-005: Equipment plugin scope vs measuringtools plugin
- **Status:** ACCEPTED
- **Date:** 2026-05-08
- **Deciders:** cproudlock
- **Supersedes:** none
## Context
ADR-001 narrowed the migration to physical manufacturing equipment with a `machinenumber`. In practice, the legacy `category='Equipment'` rows contain two distinct asset classes:
1. **Manufacturing machinery** (5-axis mills, lathes, broachers, heat treatment ovens). These produce parts.
2. **Metrology and inspection instruments** (CMMs, Keyence vision systems, wax-and-trace surface profilometers, GenSpec instruments). These measure parts.
Both share `Asset` properties (vendor, model, location, controller). They differ in domain fields (axes vs measurement accuracy, cycle time vs calibration interval, controller protocol vs measurement software).
Mixing them under one plugin pollutes the schema and confuses cross-plugin queries ("show me all measuring tools" requires an enumeration of measuring-instrument equipmenttype values, which scales badly).
## Decision
Two plugins, separate concerns, shared platform contract.
### `equipment` plugin
Tracks manufacturing machinery. Bundled, in-tree.
Schema (per ADR-001 contract):
```
equipment
- assetid FK to assets, PK
- equipmenttypeid FK to equipmenttypes (5-axis mill, lathe, broacher, heat treat, ...)
- vendorid FK to vendors (platform)
- modelid FK to models (platform)
- controllertypeid FK to controllertypes (equipment plugin)
- controllerosid FK to controlleros (equipment plugin)
- (other shared fields: spindle count, axes, max workpiece size, ...)
equipmenttypes (lookup, equipment plugin)
- equipmenttypeid, name (5-axis mill, lathe, broacher, heat treat, ...)
controllertypes (lookup, equipment plugin)
- controllertypeid, name (Fanuc 31i, Siemens 840D, Mitsubishi M70, Heidenhain TNC640, ...)
- vendorid (FK to vendors)
controlleros (lookup, equipment plugin - separate from PC OS)
- controllerosid, name (FAPT, VxWorks, embedded Windows, Linux RT, ...)
equipmentfocas (subtype, optional, present only when FOCAS-equipped)
- assetid PK, FK to equipment
- focasipaddress text
- focasport integer
- focasversion text
- focasmachinenumber text
equipmentclm (subtype, optional, present only when CLM-equipped)
- assetid PK, FK to equipment
- (CLM-specific: address, port, station ID - finalize when plugin is built)
equipmentmtconnect (subtype, optional, present only when MTConnect-equipped)
- assetid PK, FK to equipment
- mtconnectagenturl text
- mtconnectdevicename text
```
The `equipment.protocol` enum field is deliberately **not** included. Presence or absence of a subtype row indicates which protocol applies. Avoids a denormalized field that can drift out of sync.
### `measuringtools` plugin
Tracks metrology and inspection instruments. Bundled, in-tree (built in Phase 3-4 of the refactor as the first new plugin built using the framework scaffold).
Schema (initial draft, refined when plugin is built):
```
measuringtools
- assetid FK to assets, PK
- measuringtooltypeid FK to measuringtooltypes (CMM, vision system, profilometer, surface tester, ...)
- vendorid FK to vendors (platform)
- modelid FK to models (platform)
- measurementaxes integer (e.g., 3 for a 3-axis CMM)
- accuracyspec text (e.g., "+/-0.5um")
- calibrationintervaldays integer
- lastcalibrationdate date
- nextcalibrationdate date (computed)
- (other domain fields as needed)
measuringtooltypes (lookup, measuringtools plugin)
- measuringtooltypeid, name (CMM, vision system, surface profilometer, gage block, ...)
```
Future extension: subtype tables for measurement-software integrations (PC-DMIS, Keyence, GenSpec). Same pattern as equipment subtype tables.
### Subtype-table pattern (general)
Both plugins use the same pattern for protocol- or software-specific fields:
- Core plugin table carries shared, common fields
- Optional subtype tables (one per protocol or software) hold extension fields
- Each subtype table is keyed by `assetid` (PK), one-to-one with the parent
- Subtype row exists if and only if the asset uses that protocol or software
- Sister sites add new subtype tables for their own integrations without touching core
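To make the "row exists if and only if" rule concrete, here is a small sketch against an in-memory SQLite database. The table and column names follow the equipment schema above, trimmed to a few columns; the `name` column, the sample rows, and the helper `protocols_for` are assumptions for illustration.
```python
import sqlite3

# Protocol name -> subtype table, as listed in the equipment schema.
SUBTYPE_TABLES = {
    "focas": "equipmentfocas",
    "clm": "equipmentclm",
    "mtconnect": "equipmentmtconnect",
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE equipment (assetid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE equipmentfocas (
    assetid INTEGER PRIMARY KEY REFERENCES equipment(assetid),
    focasipaddress TEXT, focasport INTEGER);
CREATE TABLE equipmentclm (
    assetid INTEGER PRIMARY KEY REFERENCES equipment(assetid));
CREATE TABLE equipmentmtconnect (
    assetid INTEGER PRIMARY KEY REFERENCES equipment(assetid),
    mtconnectagenturl TEXT);
INSERT INTO equipment VALUES (1, 'mill'), (2, 'lathe'), (3, 'oven');
INSERT INTO equipmentfocas VALUES (1, '10.0.0.21', 8193);
INSERT INTO equipmentmtconnect VALUES (2, 'http://agent.example:5000');
""")


def protocols_for(conn, assetid):
    """Derive an asset's protocols from which subtype rows exist for it."""
    found = []
    for protocol, table in SUBTYPE_TABLES.items():
        # Table names come from the fixed mapping above, never from user input.
        row = conn.execute(
            f"SELECT 1 FROM {table} WHERE assetid = ?", (assetid,)
        ).fetchone()
        if row is not None:
            found.append(protocol)
    return found
```
Because the protocol is derived from row presence, there is no separate enum column that could drift out of sync with the subtype data.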
## Reclassification of legacy data
ADR-001's migration moves all legacy `category='Equipment' AND machinenumber IS NOT NULL` rows to `assets` with `assettype='Equipment'` and into the equipment plugin's `equipment` table. This includes both manufacturing machinery and measuring tools.
After the equipment migration, when the measuringtools plugin is built:
1. Build a mapping table: legacy `machinetypeid` values that are measuring tools (CMM type, Keyence type, etc.)
2. Run a reclassification script:
- For each `assets` row where the original `machinetypeid` is in the measuring-tool mapping
- Change `assets.assettype` to `'MeasuringTool'`
- Move the row from `equipment` to `measuringtools`
- Map domain fields where they differ (e.g., legacy `axes` field maps to `measurementaxes`)
3. Verify counts pre- and post-reclassification
4. Audit log entry per reclassified row
Reclassification is one-shot, run once, archived. Like the original migration script.
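The steps above might be scripted roughly as follows. This is a sketch on simplified tables, not the real migration: the function name `reclassify`, the reduced column set, and the shape of the returned audit entries are all assumptions.
```python
import sqlite3


def reclassify(conn, measuring_machinetypeids):
    """Move matching equipment rows to measuringtools inside one transaction."""
    ids = sorted(measuring_machinetypeids)
    if not ids:
        return []
    marks = ",".join("?" for _ in ids)
    audit = []
    with conn:  # commits on success, rolls back if an exception escapes
        eq_before = conn.execute("SELECT COUNT(*) FROM equipment").fetchone()[0]
        mt_before = conn.execute("SELECT COUNT(*) FROM measuringtools").fetchone()[0]
        rows = conn.execute(
            f"SELECT assetid, axes FROM equipment "
            f"WHERE machinetypeid IN ({marks}) ORDER BY assetid", ids
        ).fetchall()
        for assetid, axes in rows:
            conn.execute("UPDATE assets SET assettype = 'MeasuringTool' "
                         "WHERE assetid = ?", (assetid,))
            # Map the legacy axes field onto measurementaxes.
            conn.execute("INSERT INTO measuringtools (assetid, measurementaxes) "
                         "VALUES (?, ?)", (assetid, axes))
            conn.execute("DELETE FROM equipment WHERE assetid = ?", (assetid,))
            audit.append({"action": "reclassify", "assetid": assetid,
                          "from": "Equipment", "to": "MeasuringTool"})
        eq_after = conn.execute("SELECT COUNT(*) FROM equipment").fetchone()[0]
        mt_after = conn.execute("SELECT COUNT(*) FROM measuringtools").fetchone()[0]
        # Verify counts pre- and post-reclassification.
        if (eq_before - eq_after, mt_after - mt_before) != (len(rows), len(rows)):
            raise RuntimeError("row counts do not reconcile; rolling back")
    return audit


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assets (assetid INTEGER PRIMARY KEY, assettype TEXT);
CREATE TABLE equipment (assetid INTEGER PRIMARY KEY, machinetypeid INTEGER, axes INTEGER);
CREATE TABLE measuringtools (assetid INTEGER PRIMARY KEY, measurementaxes INTEGER);
INSERT INTO assets VALUES (1, 'Equipment'), (2, 'Equipment'), (3, 'Equipment');
INSERT INTO equipment VALUES (1, 10, 5), (2, 42, 3), (3, 10, 2);
""")
entries = reclassify(conn, {42})  # 42 is a made-up measuring-tool machinetypeid
```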
## Consequences
### Positive
- Manufacturing machinery and measuring tools are first-class plugins, each with appropriate domain fields
- Sister sites can install one or both depending on what they track
- Subtype-table pattern is the canonical example for protocol-specific data and extends naturally to other plugins
- Building `measuringtools` mid-refactor validates the plugin scaffold tooling against a real new plugin
### Negative
- Reclassification is a second migration step. Lower risk than the initial migration because it is data-only (no schema change beyond moving rows between two tables that share the same `assetid` link).
- Sites that adopt the framework before `measuringtools` ships need to either keep measuring tools in `equipment` (workable but suboptimal) or wait for the plugin
### Neutral
- Legacy `machinetypeid` is preserved on the equipment row during migration to enable reclassification
## Alternatives considered
1. **Single equipment plugin with sub-typed assets.** Use `equipment.equipmenttypeid` to discriminate manufacturing vs metrology. Rejected: domain fields differ enough that a single table is wide and full of NULLs.
2. **Split during the initial migration (build the mapping first).** Cleaner end state but requires the `measuringtools` plugin to exist before the migration runs, which delays Phase 5. Rejected.
3. **JSON blob for protocol data instead of subtype tables.** Considered for both plugins. Rejected: weak typing, awkward queries, no schema validation.
## References
- ADR-001 (Asset is platform contract)
- ADR-002 (versioning of the surface)
- `plugins/equipment/` (current placeholder)
- `plugins/computers/` (existing example of plugin pattern)

@@ -0,0 +1,131 @@
# ADR-006: Plugin collector contract pattern
- **Status:** ACCEPTED
- **Date:** 2026-05-08
- **Deciders:** cproudlock
- **Supersedes:** none
## Context
PC inventory data was collected by PowerShell scripts pushing to `/api/collector/pc` (`shopdb/core/api/collector.py`, ~374 LOC). The endpoint is hardcoded for PCs: it accepts a fixed schema and writes to the legacy `Machine` model.
Per ADR-001, `Machine` is being retired in favor of `Asset`. Per the project shift to PXE-driven imaging, PC inventory is moving to a new collection pipeline (PXE / GE-Enforce / manifest engine produces JSON about each PC). Other asset classes may want similar collector pipelines (printers via Zabbix, network gear via SNMP scan).
This calls for a generalizable contract: any plugin that wants to accept external collector input declares a JSON schema, and the framework wires the endpoint, auth, and idempotency.
## Decision
`BasePlugin` gets one new optional hook:
```python
from typing import Optional


# New optional hook on BasePlugin (shown without its enclosing class).
def get_collector_schema(self) -> Optional[dict]:
    """Return JSON Schema describing the collector payload for this plugin.

    Return None if the plugin does not accept collector input.

    The schema must include:
    - 'identityfield': name of the field that uniquely identifies an asset
      across submissions (e.g., 'hostname' for PCs, 'macaddress' for network
      devices). Used for idempotent upsert.
    - 'fields': JSON Schema definitions for the rest of the payload.
    """
    return None
```
Plugin loader auto-registers an endpoint at `/api/collector/<pluginname>` for each plugin returning a schema. Auth is API-key, separate from JWT. Per-plugin keys via env vars:
- `COLLECTOR_API_KEY_<PLUGINNAME>` (preferred, plugin-specific)
- `COLLECTOR_API_KEY` (fallback, shared)
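The key lookup order could be sketched like this. The helper names are hypothetical, and upper-casing the plugin name to form the env var suffix is an assumption about how `<PLUGINNAME>` is derived.
```python
import hmac
import os


def resolve_collector_key(plugin_name, environ=os.environ):
    """Prefer the plugin-specific key, fall back to the shared one."""
    specific = environ.get(f"COLLECTOR_API_KEY_{plugin_name.upper()}")
    return specific or environ.get("COLLECTOR_API_KEY")


def is_authorized(plugin_name, presented_key, environ=os.environ):
    """Constant-time comparison of the presented key with the expected key."""
    expected = resolve_collector_key(plugin_name, environ)
    if not expected or not presented_key:
        return False  # no key configured or none presented: reject
    return hmac.compare_digest(presented_key.encode(), expected.encode())
```
When a plugin-specific key is set, the shared key no longer works for that plugin, which keeps a leaked shared key from reaching plugins that opted into their own.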
### Idempotent upsert
The endpoint uses the `identityfield` to find an existing `Asset` for the same identity. Found = update. Not found = insert. Existing relationships are preserved on update.
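A simplified sketch of the upsert, with an in-memory dict standing in for the `Asset` table. The function `upsert_asset` and the record layout are hypothetical; the returned dict follows the response contract.
```python
def upsert_asset(store, schema, payload):
    """Insert or update the record keyed by the schema's identity field."""
    identity_field = schema["identityfield"]
    identity = payload.get(identity_field)
    if not identity:
        raise ValueError(f"payload is missing identity field {identity_field!r}")

    record = store.get(identity)
    if record is None:
        assetid = max((r["assetid"] for r in store.values()), default=0) + 1
        record = {"assetid": assetid, "fields": dict(payload), "relationships": []}
        store[identity] = record
        action = "created"
    elif all(record["fields"].get(key) == value for key, value in payload.items()):
        action = "noop"  # nothing in the payload differs from what is stored
    else:
        record["fields"].update(payload)  # relationships are left untouched
        action = "updated"

    return {"status": "ok", "action": action, "assetid": record["assetid"],
            "identityvalue": identity, "warnings": []}
```
Submitting the same payload twice yields `created` and then `noop`; a changed field yields `updated` with the same `assetid`.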
### Response contract
```json
{
  "status": "ok",
  "action": "created" | "updated" | "noop",
  "assetid": 12345,
  "identityvalue": "PC-1234",
  "warnings": []
}
```
### Audit logging
Every collector submission produces an audit log entry: `{action, plugin, identityvalue, before/after diff}`. Audit retention per site policy.
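One possible shape for the before/after diff, as a sketch. `diff_fields` and `build_audit_entry` are hypothetical helpers, and the `diff` key name is an assumption; the framework's `audit_log` API may structure this differently.
```python
def diff_fields(before, after):
    """Return only the keys whose values changed, with both values."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = {"before": old, "after": new}
    return changes


def build_audit_entry(action, plugin, identityvalue, before, after):
    """Assemble one audit record for a collector submission."""
    return {
        "action": action,
        "plugin": plugin,
        "identityvalue": identityvalue,
        "diff": diff_fields(before, after),
    }
```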
### Schema discovery
The framework exposes the registered schemas at `/api/collector/_schemas` (read-only, JWT-protected) so external collector authors can introspect what payloads are accepted by which plugins.
## Concrete first user: computers plugin
The `computers` plugin is the first to implement `get_collector_schema`. The PXE pipeline conforms.
Initial computers collector schema (sketch, finalized when plugin is built):
```json
{
  "identityfield": "hostname",
  "fields": {
    "hostname": "string, required",
    "macaddress": "string, optional, secondary identity",
    "osname": "string",
    "osversion": "string",
    "lastboottime": "datetime",
    "currentuser": "string",
    "ipaddress": "string",
    "memorygb": "number",
    "cputype": "string",
    "imagename": "string (PXE image deployed)",
    "imageappliedat": "datetime",
    "installedsoftware": "array of {name, version}"
  }
}
```
The PC re-image case is handled by the identity field: a freshly imaged PC keeps its hostname, so the existing `Asset` row is updated rather than duplicated. Existing `AssetRelationship` rows pointing at that PC (e.g., `controls` to a machine) are preserved across re-images.
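A pre-submission check against the sketch schema might look like the following. It is a simplified illustration under stated assumptions: `check_payload` is a hypothetical helper, only a subset of the sketch fields is reproduced, and reporting unknown fields as warnings (rather than rejecting them) is a guess at how the `warnings` array in the response contract could be populated.

```python
from typing import Any, Dict, List

# Subset of the sketch schema above, enough to demonstrate the check.
COMPUTERS_SCHEMA: Dict[str, Any] = {
    "identityfield": "hostname",
    "fields": {
        "hostname": "string, required",
        "macaddress": "string, optional, secondary identity",
        "osname": "string",
        "imagename": "string (PXE image deployed)",
    },
}


def check_payload(schema: Dict[str, Any], payload: Dict[str, Any]) -> List[str]:
    """Require the identity field; return warnings for fields not in the schema."""
    identity_field = schema["identityfield"]
    if not payload.get(identity_field):
        raise ValueError(f"missing identity field: {identity_field}")

    known = set(schema["fields"])
    # Assumption: unknown fields are tolerated and surfaced as warnings.
    return [f"unknown field ignored: {key}" for key in sorted(payload) if key not in known]
```

A re-imaged PC would submit the same `hostname` with a new `imagename`, pass this check, and then follow the update path instead of creating a second asset.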
## Migration of the existing endpoint
`shopdb/core/api/collector.py` (`/api/collector/pc`) is **deprecated** in v1 and **removed** before v1.0.
Migration path:
1. Implement `get_collector_schema` on the `computers` plugin. New endpoint `/api/collector/computers` is auto-registered.
2. Run both endpoints in parallel for one cycle of PXE imaging across the floor. PXE pipeline switches to `/api/collector/computers`.
3. Remove `shopdb/core/api/collector.py` and the legacy blueprint registration.
## Consequences
### Positive
- Generalizable across plugins. Sister sites adopting `printers`, `network`, etc. can wire their own collectors with no core change.
- Identity-based idempotency makes PC re-imaging safe by default.
- Audit logging is uniform across plugins.
- Schema discovery enables external tools to validate before submission.
### Negative
- Plugin authors must write a JSON schema. Slight learning curve, but JSON Schema is widely understood and the framework can ship a few examples.
- The `/api/collector/_schemas` endpoint plus per-plugin endpoints expand the public API surface; minor maintenance cost.
### Neutral
- API-key auth pattern stays as it is today (separate from JWT). Sites manage their own collector keys per plugin via env vars.
## Alternatives considered
1. **Keep `/api/collector/pc` and add new plugin-specific endpoints alongside.** This leaves two ways to send PC data and confuses plugin authors. Rejected.
2. **Use JWT for collectors instead of API key.** Collectors are headless processes (PXE pipeline, scripts), not interactive users. JWT lifecycle (refresh tokens, expiry) is the wrong tool. API key is simpler. Rejected.
3. **Plugins write directly to the database, no collector endpoint.** Skips audit logging and schema validation. Rejected.
## References
- `shopdb/core/api/collector.py` (legacy endpoint to be removed)
- `shopdb/plugins/base.py` (`get_collector_schema` hook to be added)
- ADR-001 (asset model the collectors target)
- ADR-002 (collector schema is part of plugin contract; changes to the hook signature are major bumps)
- The PXE project (`/home/camp/projects/pxe/`) which feeds the computers collector

docs/adr/README.md Normal file

@@ -0,0 +1,25 @@
# Architecture Decision Records
Each ADR captures a single architectural decision: the context, the decision itself, the consequences, and the alternatives considered. ADRs are immutable once accepted. Superseded ADRs stay in this folder with a pointer to the newer ADR.
## Status definitions
- **PROPOSED**: drafted, awaiting decision
- **ACCEPTED**: decision is in effect
- **SUPERSEDED**: replaced by a later ADR (link forward)
- **DEPRECATED**: no longer in effect, no replacement
## Index
| ADR | Title | Status |
|-----|-------|--------|
| [001](ADR-001-asset-as-platform-contract.md) | Asset model is the platform contract | ACCEPTED |
| [002](ADR-002-plugin-versioning.md) | Plugin contract versioning (semver) | ACCEPTED |
| [003](ADR-003-plugin-distribution.md) | Plugin distribution model | ACCEPTED |
| [004](ADR-004-deployment-topology.md) | Deployment topology (per-site instances) | ACCEPTED |
| [005](ADR-005-equipment-vs-measuringtools.md) | Equipment vs measuringtools plugin scope | ACCEPTED |
| [006](ADR-006-collector-contract.md) | Plugin collector contract pattern | ACCEPTED |
## Authoring
When proposing a new decision, copy the most recent ADR as a template, increment the number, and update this index. Do not edit accepted ADRs in place; supersede them with a new one.