8 Commits

Author SHA1 Message Date
cproudlock
69a1682a7f webapp: imaging UX overhaul + image management CRUD
Imaging dashboard
- services/imaging_log_tail.py: parses dnsmasq leases, Apache access log,
  Samba per-host log files, and dnsmasq syslog (DHCP/TFTP). Synthesizes
  inferred sessions keyed by MAC for bays that have only touched the boot
  chain but not yet pushed to /imaging/status. Active window 90 min.
- imaging_status.list_sessions() merges inferred sessions into the dashboard
  list. Real client-pushed sessions win for the same MAC.
- imaging_status: stage_history field tracks every stage transition (capped
  30); sidecar .log file per serial records every log_lines push uncapped
  (read_full_log() caps detail-page response to 1 MB).
- delete_session/delete_all_sessions clean up sidecar .log too.
- New SSE endpoint /imaging/stream emits a session-list hash every 5s.
  Client fetches /imaging/tiles (HTML partial) on hash change and swaps
  #imaging-tiles innerHTML. Polling fallback at 15s if SSE drops.
- Tile-swap preserves scroll, filter input, expanded state via localStorage,
  and any LAPS input the operator is mid-pasting (swap skipped when a
  laps-input is focused).
- imaging.html: removed 15s location.reload(). Added live-status dot in
  header (gray idle / green SSE connected / red SSE lost).
- _imaging_tiles.html: shared partial used by both /imaging full render and
  /imaging/tiles SSE refresh. Inferred bays render with yellow border +
  log-inferred badge + no progress bar (stage_index inference is coarse).
- imaging_detail.html (new): per-bay forensics page at
  /imaging/session/<serial>. Session metadata grid, stage timeline table,
  full sidecar log with truncation indicator, Copy-support-summary button.
  Linked from each client-pushed tile.
- qr-render.js exposes window.renderAllQRs() so the SSE swap can re-render
  Intune device-ID QRs in the swapped-in tiles.
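
A minimal sketch of the merge rule and the session-list hash described above. It is not the code in imaging_status.py: the function names, the "mac" and "source" keys, and the choice of SHA-256 over a canonical JSON dump are all assumptions made for illustration.

```python
import hashlib
import json


def merge_sessions(pushed, inferred):
    """Combine both lists; a client-pushed session wins for the same MAC."""
    by_mac = {s["mac"].lower(): s for s in inferred}
    for s in pushed:
        by_mac[s["mac"].lower()] = s  # overwrite any inferred entry
    return sorted(by_mac.values(), key=lambda s: s["mac"].lower())


def session_list_hash(sessions):
    """Stable digest of the list; it changes only when session content changes."""
    payload = json.dumps(sessions, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def sse_event(digest):
    """Format one Server-Sent Events frame carrying the digest."""
    return f"data: {digest}\n\n"
```

A browser that receives a digest different from the last one it saw would then fetch /imaging/tiles; an unchanged digest costs no extra request.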

Image management
- services/image_registry.py: JSON registry of image types at
  {SAMBA_SHARE}/image-registry.json. Bootstraps from baked-in
  config.IMAGE_TYPES on first run. create/clone/delete/rename_friendly
  mutate the file then call reload() which rewrites config.IMAGE_TYPES +
  config.FRIENDLY_NAMES in place. Sidebar reflects on next request.
- app.py routes: /images/new, /images/<t>/clone, /images/<t>/delete (with
  optional content-wipe checkbox), /images/<t>/rename.
- dashboard.html: "+ New image type" button plus Clone/Delete per row, all
  in Bootstrap modals with confirmation copy.
- Clone copies Deploy/ tree but preserves symlinks to shared dirs
  (Out-of-box Drivers, Operating Systems, Packages) so disk usage stays low.
- Delete with content checked unlinks symlinks (does not follow into shared
  dirs).
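
The clone and delete behaviour above can be sketched with the standard library alone. The two function names are hypothetical and the real image_registry.py may be structured differently; the point is only that symlinks are recreated as links on copy and unlinked, not followed, on delete.

```python
import os
import shutil


def clone_deploy_tree(src, dst):
    """Copy src to dst, recreating symlinks as symlinks instead of copying targets."""
    shutil.copytree(src, dst, symlinks=True)


def wipe_deploy_tree(root):
    """Delete root; symlinked dirs are unlinked and their targets left untouched."""
    for name in os.listdir(root):
        path = os.path.join(root, name)
        if os.path.islink(path):
            os.unlink(path)      # remove the link only
        elif os.path.isdir(path):
            shutil.rmtree(path)  # rmtree unlinks nested symlinks without following them
        else:
            os.remove(path)
    os.rmdir(root)
```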

Driver / package upload + orphan adoption
- services/images.py: upload_driver, adopt_orphan, remove_orphans,
  upload_package. Filename sanitization blocks path traversal.
- app.py routes: /images/<t>/drivers/upload, /images/<t>/drivers/adopt,
  /images/<t>/drivers/orphans/delete, /images/<t>/packages/upload.
- image_config.html: Upload .zip button + modal on Drivers section. Orphan
  drivers card-footer rebuilt as interactive list with per-row Adopt inline
  form (family + destinationDir inputs) and bulk select+delete.
- Upload .zip on Packages section with optional destinationDir field that
  appends a packages.json entry.
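
One way filename sanitization can block path traversal, shown as a sketch. The names sanitize_upload_name and safe_join and the character whitelist are assumptions; the commit does not say how services/images.py implements the check.

```python
import os
import re


def sanitize_upload_name(raw):
    """Return a bare filename safe to join onto an upload directory, or raise."""
    # Treat both separators as path boundaries, then keep only the last part.
    name = raw.replace("\\", "/").split("/")[-1]
    # Whitelist conservative characters; anything else becomes "_".
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name).lstrip(".")
    if not name:
        raise ValueError(f"unusable filename: {raw!r}")
    return name


def safe_join(base_dir, raw):
    """Join the sanitized name and verify the result is still inside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, sanitize_upload_name(raw)))
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes upload directory")
    return target
```

Checking the resolved path as well as the name is deliberate redundancy: the second check still holds if the whitelist is later loosened.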

Configuration
- config.py: new env vars DNSMASQ_LEASES, APACHE_ACCESS_LOG, SAMBA_LOG_DIR,
  DNSMASQ_SYSLOG for the log-tailer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 13:21:06 -04:00
cproudlock
3aabd47571 imaging dashboard: add Clear all button + endpoint
New /imaging/delete_all endpoint wipes every per-bay JSON in IMAGING_DIR
via imaging_status.delete_all_sessions(). Template adds a "Clear all"
outline-danger button next to the count badge, shown only when the sessions
list is non-empty, with a confirm() prompt that names the count.

Deployed via scp + systemctl restart pxe-webapp on 172.16.9.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 15:58:31 -04:00
cproudlock
8cd0c147d8 imaging: renumber stages to be time-monotonic (1=WinPE, 7=Intune ID)
Previously the stage indices reflected logical milestones but not the
order they fire in. Run-ShopfloorSetup posted idx=1 (start) and idx=4
(PPKG) - but 09-Setup-Keyence (inside per-type loop) ran BETWEEN them
and posted idx=5/6. The dashboard then "regressed" from 6 back to 4
when PPKG fired, making it look stuck at the per-type-complete card.

New numbering matches actual execution order:

  1 - WinPE: PESetup / WIM apply              (startnet.cmd)
  2 - Run-ShopfloorSetup: starting            (Run-ShopfloorSetup.ps1)
  3 - 09-Setup-<Type>: starting               (per-type)
  4 - 09-Setup-<Type>: complete               (per-type)
  5 - Run-ShopfloorSetup: PPKG enrollment     (Run-ShopfloorSetup.ps1)
  6 - Run-ShopfloorSetup: handoff to Monitor  (Run-ShopfloorSetup.ps1)
  7 - Monitor-IntuneProgress: Intune Device ID captured

services/imaging_status.py rewind threshold reverts to stage_index <= 1
now that WinPE startnet posts idx=1.
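
To illustrate why the renumbering matters, the sketch below encodes the new table and flags any step where a posted index goes backwards. The STAGES dict mirrors the list above; the regressions helper is hypothetical. The old firing order 1, 5, 6, 4 is taken from the description at the top of this commit.

```python
STAGES = {
    1: "WinPE: PESetup / WIM apply",
    2: "Run-ShopfloorSetup: starting",
    3: "09-Setup-<Type>: starting",
    4: "09-Setup-<Type>: complete",
    5: "Run-ShopfloorSetup: PPKG enrollment",
    6: "Run-ShopfloorSetup: handoff to Monitor",
    7: "Monitor-IntuneProgress: Intune Device ID captured",
}
STAGE_TOTAL = len(STAGES)


def regressions(posted):
    """Return (previous, current) pairs where the posted index moved backwards."""
    return [(a, b) for a, b in zip(posted, posted[1:]) if b < a]


old_order = [1, 5, 6, 4]          # start, Keyence start/complete, then PPKG
new_order = sorted(STAGES)        # indices now follow execution order
```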

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:34:01 -04:00
cproudlock
e3f523eedd webapp/imaging: bump rewind threshold to stage_index <= 2
Reset trigger previously fired only when a new POST landed at idx <= 1,
which meant a reimage didn't reset the dashboard card until
Run-ShopfloorSetup ran post-PPKG (~10-20 min in). With the WinPE-phase
status push from startnet.cmd in commit 4e018fe firing at idx=2, that
earlier signal needs to count as a new-run marker too.

Threshold of 2 makes startnet.cmd the canonical reset point: within
seconds of PXE menu choice on the bay, the dashboard card flips from
the previous run's high-idx state back to "WinPE: PESetup / WIM apply"
+ fresh started_at.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:29:34 -04:00
cproudlock
4e018feaa0 webapp/imaging: rewind detection + WinPE-phase status push
services/imaging_status.py - if a new POST arrives with stage_index <= 1
that is lower than the cached stage_index, OR the previous run already
finished (status=succeeded|failed), reset the session: clear log_tail,
mint a fresh started_at, drop the status field so the in_progress
default re-applies. Preserves serial + records the previous run's
last_updated under previous_run_at for audit. Without this, a reimage
on the same bay would leave a stale 6/8 "succeeded" card visible until
the new run progressed past that index.
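
The reset rule can be sketched as follows. This is one reading of the paragraph above, not the code in services/imaging_status.py: it assumes the "finished" test applies to any incoming POST regardless of index, and the names is_rewind, apply_update, and REWIND_THRESHOLD are hypothetical.

```python
import time

REWIND_THRESHOLD = 1                 # stage_index <= 1 marks a new run in this commit
FINISHED = {"succeeded", "failed"}


def is_rewind(cached, incoming_index):
    """True when the incoming POST should start a fresh session."""
    if cached.get("status") in FINISHED:
        return True
    return (incoming_index <= REWIND_THRESHOLD
            and incoming_index < cached.get("stage_index", 0))


def apply_update(cached, update, now=None):
    """Merge update into cached, resetting first if a rewind is detected."""
    now = time.time() if now is None else now
    if cached and is_rewind(cached, update.get("stage_index", 0)):
        cached = {
            "serial": cached["serial"],                     # identity survives
            "previous_run_at": cached.get("last_updated"),  # audit trail
            "started_at": now,
            "log_tail": [],
        }                                                   # status dropped on purpose
    merged = {**cached, **update, "last_updated": now}
    merged.setdefault("status", "in_progress")
    return merged
```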

playbook/startnet.cmd - one-line PowerShell POST after the PXE menu
choice + enrollment-share mount, before the wait for PESetup.exe to start.
Captures BIOS serial via wmic, MAC via Get-NetAdapter, and posts:
  stage_index=2, current_stage="WinPE: PESetup / WIM apply".
Best-effort; try/catch swallows any network failure so a missing
webapp never blocks imaging. PXE clients will now appear on the
/imaging dashboard during WinPE phase instead of only post-PPKG.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:11:03 -04:00
cproudlock
9122b28c31 webapp: imaging progress dashboard + serial column on reports list
Adds end-to-end progress tracking for PXE imaging sessions and surfaces
each Blancco report's BIOS serial in the report list.

webapp:
  * services/imaging_status.py - JSON-per-serial state store under
    IMAGING_DIR (default /var/log/pxe-imaging). Atomic write via
    tempfile + rename. log_tail capped at 50 lines. Merges partial
    updates so clients can post just the current_stage tick.
  * config.py - new IMAGING_DIR env-overridable path.
  * services/csrf.py - explicit exempt list for machine-to-machine
    endpoints; /imaging/status is the first entry. Air-gapped LAN;
    trust-by-network for client posts.
  * app.py - four new routes:
      GET  /imaging               dashboard (renders all sessions)
      POST /imaging/status        client status push (JSON body)
      GET  /imaging/<serial>.json raw session JSON for ad-hoc polling
      POST /imaging/delete/<s>    clear a session from the dashboard
    Also parses each Blancco XML in the /reports list to surface
    system.serial + system.model columns.
  * templates/imaging.html - Bootstrap dashboard with per-session
    cards (state badge, progress bar, stage idx/total, mac, elapsed,
    log tail). meta http-equiv refresh=5 for auto-tick.
  * templates/base.html - new "Imaging Progress" nav entry.
  * templates/reports.html - Serial + Model columns added.
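
A sketch of the state-store pattern described for services/imaging_status.py: write to a temp file in the same directory, rename it over the target, and merge partial updates into whatever is already on disk. The function names and the incoming "log_lines" key are assumptions; only the 50-line cap and the tempfile + rename approach come from the commit text.

```python
import json
import os
import tempfile

LOG_TAIL_MAX = 50


def atomic_write_json(path, data):
    """Write JSON so readers see either the old file or the new one, never half."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic when tmp and path share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def merge_update(path, update):
    """Apply a partial update to the stored session and persist it."""
    try:
        with open(path, encoding="utf-8") as fh:
            state = json.load(fh)
    except FileNotFoundError:
        state = {}
    update = dict(update)                       # do not mutate the caller's dict
    new_lines = update.pop("log_lines", [])
    state.update(update)
    state["log_tail"] = (state.get("log_tail", []) + list(new_lines))[-LOG_TAIL_MAX:]
    atomic_write_json(path, state)
    return state
```

Creating the temp file next to the target, instead of in the system temp directory, is what keeps the rename on one filesystem.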

playbook:
  * shopfloor-setup/Shopfloor/lib/Send-PxeStatus.ps1 - new helper.
    Dot-source this then call Send-PxeStatus -Stage X -StageIndex N
    -StageTotal M from any stage script. BIOS serial via CIM, MAC via
    Get-NetAdapter, pctype + machinenumber from C:\Enrollment.
    Failures are swallowed to a local log so a network blip doesn't
    block imaging.
  * shopfloor-setup/Run-ShopfloorSetup.ps1 - dot-sources helper +
    posts at three coarse milestones (start, PPKG enrollment,
    handoff to Monitor-IntuneProgress).
  * shopfloor-setup/gea-shopfloor-keyence/09-Setup-Keyence.ps1 -
    posts at session start + after Install-FromManifest with
    succeeded/failed status derived from $rc. Other 09-Setup-*.ps1
    scripts can follow the same pattern.

ID is BIOS serial (stable across WinPE -> Windows transition and
across reboots, unlike hostname which is random pre-PPKG). Operator
already knows the serial of the bay they imaged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 10:07:18 -04:00
cproudlock
974accf98a blancco: fix silent prefs fallback, suspend trap, display blank + add View
End-to-end fixes for Blancco Drive Eraser PXE flow uncovered by chasing
"reports never reach SMB share" across two air-gapped sites:

playbook/blancco-init.sh:
  * Drop silent || true on wget of preferences.xml + config.xml. Fail
    loud with shell-drop if download or marker grep fails. Background:
    airootfs /opt/scripts/validate_preferences.sh restores
    /albus/preferences.save (factory defaults, empty network_share) if
    xmllint fails. wget failure made every report silently land nowhere.
  * Clobber /albus/preferences.save with the same served file so even if
    the validator fallback fires, the SMB target survives.
  * Bind-mount /dev/null over /sys/power/{state,disk,mem_sleep,autosleep}
    before switch_root. Albus's license-retry path writes /sys/power/state
    directly (bypassing systemd targets); this is the last-line block.
  * /dev/null symlinks for sleep/suspend/hibernate systemd targets in the
    airootfs overlay + logind drop-in with IdleAction/Handle*=ignore.
    Three independent layers because cmdline systemd.mask alone is bypassed
    by direct /sys/power/state writes.
  * xinitrc.d/00-no-screen-blank.sh runs xset s off -dpms + setterm
    -blank 0 -powerdown 0 so the Blancco GUI doesn't blank during long
    erasures.
  * Removed the 20-failsafeDriver.conf "modesetting" pin. modesetting
    needs DRM/KMS which we disable on kernel cmdline; "vesa" also failed
    on NVIDIA. With the pin gone Xorg auto-picks fbdev which uses the
    kernel framebuffer from vga=normal - works across Intel, AMD, and
    older NVIDIA without nouveau.

playbook/pxe_server_setup.yml:
  * dnsmasq.conf: explicit empty-value dhcp-option=3 + dhcp-option=6.
    Without them, dnsmasq defaults to sending its own IP as router AND
    DNS. Commenting the configured-value lines did NOT disable the push
    (root cause of "wired keeps picking up 10.9.100.1 as gateway").
  * Split the Blancco config.img extraction and preferences.xml deploy
    into separate tasks. The previous shell-with-creates: gate caused
    playbook re-runs to skip the prefs deploy entirely after first run.
  * Added a validation task that runs python3 xml parse + grep on the
    deployed preferences.xml to fail the playbook at deploy time if the
    SMB markers are missing.
  * Added Environment=TZ=America/New_York to the pxe-webapp systemd
    service so report mtimes and audit log render in Eastern time even
    if the Python process is started before timedatectl converges.

webapp:
  * services/blancco_report.py: parse Blancco's XML report format
    (recursive <entries name="..."> walker) into a friendly dict.
  * templates/report_view.html: Bootstrap "Drive Erasure Certificate"
    layout - hero summary, customer + system cards, per-drive cards with
    step-by-step erasure timeline, document signing footer with
    integrity hash detail.
  * /reports/view/<filename> route + View button on the reports list
    (XML reports only; PDFs still download).
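
A recursive walker over nested <entries name="..."> elements might look like the sketch below. It assumes leaf values live in child elements that carry a name attribute and text content, and that repeated names (for example several drives) should collect into a list; the actual services/blancco_report.py and the real report schema may differ.

```python
import xml.etree.ElementTree as ET


def walk_entries(node):
    """Turn nested <entries name="..."> containers into a plain dict."""
    result = {}
    for child in node:
        name = child.get("name")
        if name is None:
            continue                             # skip unnamed elements
        if child.tag == "entries":
            value = walk_entries(child)          # container: recurse
        else:
            value = (child.text or "").strip()   # leaf: take its text
        if name in result:
            # Repeated name: promote to a list and append.
            if not isinstance(result[name], list):
                result[name] = [result[name]]
            result[name].append(value)
        else:
            result[name] = value
    return result


def parse_report(xml_text):
    """Parse a report string and walk it from the root element."""
    return walk_entries(ET.fromstring(xml_text))
```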

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 07:38:54 -04:00
cproudlock
c16a4f23b4 webapp: extract service layer (config.py + services/) from app.py
Phase 1a of a multi-session refactor toward a clean blueprint
structure. Pulls the helper code that lived alongside the routes in
the 1621-line app.py into focused modules. app.py is now 625 lines
of mostly routes plus a small Flask wiring header. Behaviour is
unchanged: smoke-tested against the 8 main GET routes (200 OK).

New modules:

- config.py            env vars + IMAGE_TYPES + FRIENDLY_NAMES +
                       SHARED_DEPLOY_* taxonomy + unattend XML
                       namespaces.
- services/audit.py    audit log file handler + audit() helper.
- services/csrf.py     session CSRF token + before_request validator
                       wired via init_csrf(app).
- services/fs.py       image_root / deploy_path / unattend_path /
                       control_path / tools_path + load_json /
                       save_json + resolve_destination.
- services/system.py   service_status / find_usb_mounts /
                       find_upload_sources.
- services/images.py   image_status + load_image_config.
- services/deploy.py   import_deploy + _merge_tree +
                       _replace_with_symlink + allowed_import_source.
- services/unattend.py parse_unattend / build_unattend_xml /
                       extract_form_data and the qn / qwcm / settings
                       pass helpers.
- services/wim.py      extract_startnet / update_startnet / list_files
                       wrapping wimextract / wimupdate / wimdir.

Endpoint names kept stable (dashboard, clonezilla_backups, etc.) so
existing url_for(...) calls in templates are unchanged. Phase 1b
(Flask blueprints with ".endpoint" naming) deferred to a future
session because it requires updating ~30 url_for sites in templates
and is mostly cosmetic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 18:25:32 -04:00