# ADR-006: Plugin collector contract pattern

- **Status:** ACCEPTED
- **Date:** 2026-05-08
- **Deciders:** cproudlock
- **Supersedes:** none

## Context

PC inventory data was collected by PowerShell scripts pushing to `/api/collector/pc` (`shopdb/core/api/collector.py`, ~374 LOC). The endpoint is hardcoded for PCs: it accepts a fixed schema and writes to the legacy `Machine` model.

Per ADR-001, `Machine` is being retired in favor of `Asset`. Per the project shift to PXE-driven imaging, PC inventory is moving to a new collection pipeline (PXE / GE-Enforce / manifest engine produces JSON about each PC).

Other asset classes may want similar collector pipelines (printers via Zabbix, network gear via SNMP scan). This calls for a generalizable contract: any plugin that wants to accept external collector input declares a JSON schema, and the framework wires the endpoint, auth, and idempotency.

## Decision

`BasePlugin` gets one new optional hook:

```python
def get_collector_schema(self) -> Optional[dict]:
    """Return JSON Schema describing the collector payload for this plugin.

    Return None if the plugin does not accept collector input.

    The schema must include:
      - 'identityfield': name of the field that uniquely identifies an
        asset across submissions (e.g., 'hostname' for PCs, 'macaddress'
        for network devices). Used for idempotent upsert.
      - 'fields': JSON Schema definitions for the rest of the payload.
    """
    return None
```

Plugin loader auto-registers an endpoint at `/api/collector/` for each plugin returning a schema.

Auth is API-key, separate from JWT. Per-plugin keys via env vars:

- `COLLECTOR_API_KEY_` (preferred, plugin-specific)
- `COLLECTOR_API_KEY` (fallback, shared)

### Idempotent upsert

The endpoint uses the `identityfield` to find an existing `Asset` for the same identity. Found = update. Not found = insert. Existing relationships are preserved on update.

### Response contract

```json
{
  "status": "ok",
  "action": "created" | "updated" | "noop",
  "assetid": 12345,
  "identityvalue": "PC-1234",
  "warnings": []
}
```

### Audit logging

Every collector submission produces an audit log entry: `{action, plugin, identityvalue, before/after diff}`. Audit retention per site policy.

### Schema discovery

The framework exposes the registered schemas at `/api/collector/_schemas` (read-only, JWT-protected) so external collector authors can introspect what payloads are accepted by which plugins.

## Concrete first user: computers plugin

The `computers` plugin is the first to implement `get_collector_schema`. The PXE pipeline conforms.

Initial computers collector schema (sketch, finalized when plugin is built):

```json
{
  "identityfield": "hostname",
  "fields": {
    "hostname": "string, required",
    "macaddress": "string, optional, secondary identity",
    "osname": "string",
    "osversion": "string",
    "lastboottime": "datetime",
    "currentuser": "string",
    "ipaddress": "string",
    "memorygb": "number",
    "cputype": "string",
    "imagename": "string (PXE image deployed)",
    "imageappliedat": "datetime",
    "installedsoftware": "array of {name, version}"
  }
}
```

The PC re-image case is handled by the identity field: a freshly imaged PC keeps its hostname, so the existing `Asset` row is updated rather than duplicated. Existing `AssetRelationship` rows pointing at that PC (e.g., `controls` to a machine) are preserved across re-images.

## Migration of the existing endpoint

`shopdb/core/api/collector.py` (`/api/collector/pc`) is **deprecated** in v1 and **removed** before v1.0.

Migration path:

1. Implement `get_collector_schema` on the `computers` plugin. New endpoint `/api/collector/computers` is auto-registered.
2. Run both endpoints in parallel for one cycle of PXE imaging across the floor. PXE pipeline switches to `/api/collector/computers`.
3. Remove `shopdb/core/api/collector.py` and the legacy blueprint registration.

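
A pipeline switching to `/api/collector/computers` during the parallel-run step may want to check its payloads before submitting them. The sketch below shows one way to do that against the computers schema sketch above. It is illustrative only: `check_payload` and the `CHECKS` table are hypothetical names, the type rules are a hand translation of the informal field descriptions, and datetimes are assumed to be ISO 8601 strings, which the ADR does not specify.

```python
from datetime import datetime
from typing import Any, Callable, Dict, List


def _is_str(value: Any) -> bool:
    return isinstance(value, str)


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_datetime(value: Any) -> bool:
    # Assumption: datetimes arrive as ISO 8601 strings.
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def _is_software_list(value: Any) -> bool:
    return isinstance(value, list) and all(
        isinstance(item, dict) and {"name", "version"} <= item.keys() for item in value
    )


# Hand translation of the schema sketch's informal type descriptions.
CHECKS: Dict[str, Callable[[Any], bool]] = {
    "hostname": _is_str, "macaddress": _is_str, "osname": _is_str,
    "osversion": _is_str, "currentuser": _is_str, "ipaddress": _is_str,
    "cputype": _is_str, "imagename": _is_str,
    "memorygb": _is_number,
    "lastboottime": _is_datetime, "imageappliedat": _is_datetime,
    "installedsoftware": _is_software_list,
}


def check_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload looks acceptable."""
    problems: List[str] = []
    if not payload.get("hostname"):
        problems.append("hostname is required")
    for name, value in payload.items():
        check = CHECKS.get(name)
        if check is None:
            problems.append(f"unknown field: {name}")
        elif not check(value):
            problems.append(f"invalid value for {name}")
    return problems
```

Once the plugin publishes a finalized JSON Schema, a standard JSON Schema validator would be a better fit than hand-written checks like these.
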
## Consequences

### Positive

- Generalizable across plugins. Sister sites adopting `printers`, `network`, etc. can wire their own collectors with no core change.
- Identity-based idempotency makes PC re-imaging safe by default.
- Audit logging is uniform across plugins.
- Schema discovery enables external tools to validate before submission.

### Negative

- Plugin authors must write a JSON schema. Slight learning curve, but JSON Schema is widely understood and the framework can ship a few examples.
- The `/api/collector/_schemas` endpoint plus per-plugin endpoints expand the public API surface; minor maintenance cost.

### Neutral

- API-key auth pattern stays as it is today (separate from JWT). Sites manage their own collector keys per plugin via env vars.

## Alternatives considered

1. **Keep `/api/collector/pc` and add new plugin-specific endpoints alongside.** Two ways to send PC data, plugin authors confused. Rejected.
2. **Use JWT for collectors instead of API key.** Collectors are headless processes (PXE pipeline, scripts), not interactive users. JWT lifecycle (refresh tokens, expiry) is the wrong tool. API key is simpler. Rejected.
3. **Plugins write directly to the database, no collector endpoint.** Skips audit logging and schema validation. Rejected.

## References

- `shopdb/core/api/collector.py` (legacy endpoint to be removed)
- `shopdb/plugins/base.py` (`get_collector_schema` hook to be added)
- ADR-001 (asset model the collectors target)
- ADR-002 (collector schema is part of plugin contract; changes to the hook signature are major bumps)
- The PXE project (`/home/camp/projects/pxe/`) which feeds the computers collector