Commit Graph

23 Commits

Author SHA1 Message Date
cproudlock
69a1682a7f webapp: imaging UX overhaul + image management CRUD
Imaging dashboard
- services/imaging_log_tail.py: parses dnsmasq leases, Apache access log,
  Samba per-host log files, and dnsmasq syslog (DHCP/TFTP). Synthesizes
  inferred sessions keyed by MAC for bays that have only touched the boot
  chain but not yet pushed to /imaging/status. Active window 90 min.
- imaging_status.list_sessions() merges inferred sessions into the dashboard
  list. Real client-pushed sessions win for the same MAC.
- imaging_status: stage_history field tracks every stage transition (capped
  30); sidecar .log file per serial records every log_lines push uncapped
  (read_full_log() caps detail-page response to 1 MB).
- delete_session/delete_all_sessions clean up sidecar .log too.
- New SSE endpoint /imaging/stream emits a session-list hash every 5s.
  Client fetches /imaging/tiles (HTML partial) on hash change and swaps
  #imaging-tiles innerHTML. Polling fallback at 15s if SSE drops.
- Tile-swap preserves scroll, filter input, expanded state via localStorage,
  and any LAPS input the operator is mid-pasting (swap skipped when a
  laps-input is focused).
- imaging.html: removed 15s location.reload(). Added live-status dot in
  header (gray idle / green SSE connected / red SSE lost).
- _imaging_tiles.html: shared partial used by both /imaging full render and
  /imaging/tiles SSE refresh. Inferred bays render with yellow border +
  log-inferred badge + no progress bar (stage_index inference is coarse).
- imaging_detail.html (new): per-bay forensics page at /imaging/session/
  <serial>. Session metadata grid, stage timeline table, full sidecar log
  with truncation indicator, Copy-support-summary button. Linked from each
  client-pushed tile.
- qr-render.js exposes window.renderAllQRs() so the SSE swap can re-render
  Intune device-ID QRs in the swapped-in tiles.
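
The merge rule and the change-detection hash described above could look roughly like the sketch below. It is not the code from imaging_status.py; the function names, the "inferred" flag and the dict shape are assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, Iterable, List


def normalize_mac(mac: str) -> str:
    """Lowercase and use ':' separators so both log formats compare equal."""
    return mac.strip().lower().replace("-", ":")


def merge_sessions(real: Iterable[dict], inferred: Iterable[dict]) -> List[dict]:
    """Merge log-inferred sessions into client-pushed ones, keyed by MAC.

    Inferred entries are inserted first so that a client-pushed session
    for the same MAC overwrites them.
    """
    by_mac: Dict[str, dict] = {}
    for s in inferred:
        by_mac[normalize_mac(s["mac"])] = {**s, "inferred": True}
    for s in real:
        by_mac[normalize_mac(s["mac"])] = {**s, "inferred": False}
    return [by_mac[mac] for mac in sorted(by_mac)]


def session_list_hash(sessions: List[dict]) -> str:
    """Stable digest of the session list; changes only when content changes."""
    payload = json.dumps(sessions, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A stream endpoint would then only need to send session_list_hash(...) every few seconds, and the client refetches the tiles when the value differs from the last one it saw.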

Image management
- services/image_registry.py: JSON registry of image types at
  {SAMBA_SHARE}/image-registry.json. Bootstraps from baked-in
  config.IMAGE_TYPES on first run. create/clone/delete/rename_friendly
  mutate the file then call reload() which rewrites config.IMAGE_TYPES +
  config.FRIENDLY_NAMES in place. Sidebar reflects on next request.
- app.py routes: /images/new, /images/<t>/clone, /images/<t>/delete (with
  optional content-wipe checkbox), /images/<t>/rename.
- dashboard.html: + New image type button + Clone/Delete per row, all in
  Bootstrap modals with confirmation copy.
- Clone copies Deploy/ tree but preserves symlinks to shared dirs (Out-of-
  box Drivers, Operating Systems, Packages) so disk usage stays low.
- Delete with content checked unlinks symlinks (does not follow into shared
  dirs).
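
A minimal sketch of the clone and delete behaviour described above, using only the standard library. The function names and the file names in demo() are hypothetical; the real image_registry.py may be structured differently.

```python
import os
import shutil
import tempfile
from pathlib import Path


def clone_deploy_tree(src: Path, dst: Path) -> None:
    """Copy a Deploy/ tree; symlinks are recreated as links, not expanded."""
    shutil.copytree(src, dst, symlinks=True)


def delete_deploy_tree(root: Path) -> None:
    """Remove a cloned tree.

    shutil.rmtree unlinks any symlink it meets inside the tree instead of
    descending into the link target, so shared directories are untouched.
    """
    shutil.rmtree(root)


def demo() -> dict:
    """Build a throwaway tree with one shared dir, clone it, delete the clone."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        shared = base / "shared" / "Packages"
        shared.mkdir(parents=True)
        (shared / "big.bin").write_bytes(b"x" * 1024)

        src = base / "src" / "Deploy"
        src.mkdir(parents=True)
        (src / "example.txt").write_text("per-image file")
        os.symlink(shared, src / "Packages", target_is_directory=True)

        dst = base / "clone" / "Deploy"
        clone_deploy_tree(src, dst)
        link_kept = (dst / "Packages").is_symlink()

        delete_deploy_tree(dst)
        return {
            "link_kept": link_kept,
            "clone_gone": not dst.exists(),
            "shared_intact": (shared / "big.bin").exists(),
        }
```

Because the link itself is copied, the clone costs only the size of the per-image files, and deleting the clone leaves the shared payload in place.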

Driver / package upload + orphan adoption
- services/images.py: upload_driver, adopt_orphan, remove_orphans,
  upload_package. Filename sanitization blocks path traversal.
- app.py routes: /images/<t>/drivers/upload, /images/<t>/drivers/adopt,
  /images/<t>/drivers/orphans/delete, /images/<t>/packages/upload.
- image_config.html: Upload .zip button + modal on Drivers section. Orphan
  drivers card-footer rebuilt as interactive list with per-row Adopt inline
  form (family + destinationDir inputs) and bulk select+delete.
- Upload .zip on Packages section with optional destinationDir field that
  appends a packages.json entry.
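
One way the filename sanitization mentioned above might work is sketched here. The function name, the allowed character set and the .zip-only rule are assumptions; the commit does not say how services/images.py does it, or whether it relies on a helper such as Werkzeug's secure_filename.

```python
import re
from pathlib import PureWindowsPath

# Anything outside this set is replaced with "_" (assumed policy).
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_upload_name(raw: str) -> str:
    """Reduce a client-supplied filename to a safe basename ending in .zip.

    PureWindowsPath treats both "/" and "\\" as separators, so taking
    .name drops any directory part regardless of the client OS.
    """
    name = PureWindowsPath(raw).name
    # Replace unsafe characters, then drop leading dots ("..", hidden files).
    name = _UNSAFE.sub("_", name).lstrip(".")
    if not name or not name.lower().endswith(".zip"):
        raise ValueError(f"rejected upload filename: {raw!r}")
    return name
```

The sanitized name should still be joined to a fixed upload directory on the server side; the sketch only guarantees that the name itself carries no path components.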

Configuration
- config.py: new env vars DNSMASQ_LEASES, APACHE_ACCESS_LOG, SAMBA_LOG_DIR,
  DNSMASQ_SYSLOG for the log-tailer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 13:21:06 -04:00
cproudlock
b57ba0fb6f webapp: add CSRF token to imaging Clear-all form
The dashboard Clear-all button posts to /imaging/delete-all but the form
was missing the hidden _csrf_token input that the rest of the webapp's
POST forms include, so the endpoint would reject the request when CSRF
enforcement is active.
2026-05-24 07:04:20 -04:00
cproudlock
3aabd47571 imaging dashboard: add Clear all button + endpoint
New /imaging/delete_all endpoint wipes every per-bay JSON in IMAGING_DIR
via imaging_status.delete_all_sessions(). Template adds "Clear all"
outline-danger button next to the count badge, gated on sessions list
non-empty, with confirm() prompt naming the count.

Deployed via scp + systemctl restart pxe-webapp on 172.16.9.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 15:58:31 -04:00
cproudlock
d8c64bef2b Add conditional BIOS-update sub-stage on idx=1
winpe-status-push.ps1 now accepts -CurrentStage / -StageIndex
params so callers can override the default "WinPE: PESetup / WIM
apply" string. Backwards compatible.

startnet.cmd: after the existing initial WinPE status push,
inspect $BIOS_STATUS for the "->" marker that check-bios.cmd
writes when an update was actually applied or staged. If present,
fire a second idx=1 push with stage="WinPE: BIOS firmware update -
<status>". No-op for clean "up to date" / "no update in catalog"
runs.

imaging.html: at stage_idx=1 with "bios" in current_stage, swap
friendly label to "Updating BIOS firmware" with a do-NOT-power-off
hint. Bays without firmware updates show the default "Booting from
PXE" label as before.

boot.wim startnet.cmd updated via wimupdate so live PXE clients
pick it up at next boot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 10:26:31 -04:00
cproudlock
3385bc87aa Monitor + imaging: per-phase sub-stages within idx=7
Monitor's Get-Snapshot already tracks Phase 1-5 (Intune Registration,
Device Configuration, Software Deployment, Credentials, Lockdown).
The webapp dashboard only saw a single idx=7 push for the entire
post-PPKG / pre-lockdown window, so the friendly label couldn't
reflect "where is this bay actually". Operator looking at the
dashboard had no idea whether to assign category or hit ARTS for
lockdown next.

Monitor now pushes additional idx=7 entries as it crosses Phase
boundaries:
 - On DeviceId capture: "Intune Device ID captured" (existing)
 - On Phase 2 done (SFLD policy delivered = category was assigned):
   "Phase 2 SFLD policy delivered (device configuration)"
 - On Phase 1-4 all complete: "Phases 1-4 complete - ready for
   lockdown (ARTS request)"
 - On lockdown done: idx=8 (existing)

imaging.html maps the stage_string substring to friendly labels:
 - default idx=7         -> "Registered - assign category"
 - 'sfld policy' / 'phase 2' -> "Phase 2 - device configuration"
 - 'credentials' / 'phase 4' -> "Phase 3 / 4 - DSC + credentials"
 - 'ready for lockdown' / 'request lockdown' -> "Ready - request
                              lockdown" (hint: click ARTS request)

Operator now knows exactly when to act vs when to wait.
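
The substring-to-label mapping listed above is implemented in the imaging.html template; the Python sketch below only illustrates the lookup logic. The rule order (lockdown checked first) and the function name are assumptions.

```python
# (substrings to look for, friendly label), checked top to bottom.
IDX7_RULES = [
    (("ready for lockdown", "request lockdown"), "Ready - request lockdown"),
    (("credentials", "phase 4"), "Phase 3 / 4 - DSC + credentials"),
    (("sfld policy", "phase 2"), "Phase 2 - device configuration"),
]

DEFAULT_IDX7_LABEL = "Registered - assign category"


def friendly_idx7_label(current_stage: str) -> str:
    """Map the raw idx=7 stage string to an operator-facing label."""
    text = current_stage.lower()
    for needles, label in IDX7_RULES:
        if any(needle in text for needle in needles):
            return label
    return DEFAULT_IDX7_LABEL
```

With the three push strings named in this commit, "Intune Device ID captured" falls through to the default, the Phase 2 string maps to the device-configuration label, and the "Phases 1-4 complete" string maps to the lockdown label.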

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 07:39:20 -04:00
cproudlock
8debc4ddb3 imaging: LAPS input always visible, not gated on intune_device_id
Was hiding LAPS QR section until idx=7 pushed with a DeviceId.
Operator couldn't paste a password if Monitor hadn't gotten around
to capturing the DeviceId yet. The QR encoding doesn't depend on
DeviceId - it's just the password being encoded - so the section is
useful any time the bay is past the LAPS reboot.

Drop the {% if s.intune_device_id %} gate. LAPS section now appears
in every expanded tile.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 07:30:48 -04:00
cproudlock
036090348c imaging: persist tile expanded state across page refresh
localStorage-backed set of serials at key 'imaging-expanded'. On
DOMContentLoaded, walk each .imaging-card; if its data-serial is in
the set, set card.open=true. On every <details> toggle, update the
set. Refresh no longer collapses the tile the operator was looking
at.

Per-browser state (localStorage), no server round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 07:29:45 -04:00
cproudlock
b9f66687ac imaging: search now also matches stage name, stage-N, and status
Extended the data-filter attribute to include:
 - friendly stage label (e.g. "awaiting intune lockdown")
 - "stage-N" token (e.g. type "stage-7" to find idx=7 bays)
 - status string (in_progress / succeeded / failed)

Use cases:
 - Find all bays waiting on lockdown: type "lockdown"
 - Find all bays at the same stage: "stage-7"
 - Find failed bays: "failed"
 - Find succeeded bays: "succeeded"
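
A sketch of how the data-filter text and the match could be built, assuming the session is a dict with the field names used elsewhere in this log. The helper names are hypothetical, and in the dashboard the match itself runs in the browser.

```python
from typing import Optional


def build_filter_text(session: dict, friendly_label: Optional[str] = None) -> str:
    """Concatenate every searchable field into one lowercase string."""
    parts = [
        session.get("serial"),
        session.get("hostname"),
        session.get("pctype"),
        session.get("machinenumber"),
        session.get("intune_device_id"),
        friendly_label,
        session.get("status"),
    ]
    if session.get("stage_index") is not None:
        parts.append(f"stage-{session['stage_index']}")
    # Skip empty fields so missing values do not add stray tokens.
    return " ".join(str(p).lower() for p in parts if p)


def matches(filter_text: str, query: str) -> bool:
    """Case-insensitive substring match; an empty query matches everything."""
    return query.strip().lower() in filter_text
```

Since the match is a plain substring test, "stage-7" would also match a hypothetical "stage-70" token; with eight stages that does not come up.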

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 07:28:08 -04:00
cproudlock
65eeead5a0 imaging: collapsible tile + ARTS link + reword stage 7
Tile is now <details>. Always-visible summary:
  - QR (96px)
  - serial / hostname / pctype / machine# / status badge
  - friendly stage label + N/M badge + pct
  - progress bar

Click to expand. Body shows:
  - friendly stage hint
  - Intune device id row with [copy] [set category] [ARTS request]
  - metadata one-liner (started / last / MAC / raw current_stage)
  - error banner (if any)
  - LAPS password QR generator
  - log tail
  - Clear button

ARTS button links to https://arts.dw.geaerospace.net/requests/type
for kicking off a new lockdown request (Intune-side step happens
externally; this is a deep-link for convenience).

Stage 7 wording: "Awaiting Intune lockdown" (was "awaiting category /
lockdown" - confusing when category was already assigned). Hint
explicitly mentions category check for cases where it isn't yet set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 07:27:35 -04:00
cproudlock
a9a7478d5a imaging: organize tile metadata into deterministic rows
Previously last_updated, MAC, started, Intune device id, and the
raw current_stage string were sprinkled around the card in
hard-to-track positions. Reorganized:

Row 1 (header): serial | hostname | pctype | machine# | status badge
Row 2 (stage):  friendly label + N/M badge | pct% (right)
Row 3:          full-width progress bar
Row 4:          friendly hint (optional)
Row 5:          Intune device id + copy + set-category (optional)
Row 6:          started ... last ... MAC ... raw current_stage

Each row consistent across all cards regardless of which fields
are populated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 20:16:34 -04:00
cproudlock
1c361e138b imaging: compact tile + search filter + stage 7 label tweak
Tile shrunk for fleet density:
 - QR: 160px -> 96px
 - Drop big h4 for serial, use fs-6 strong instead
 - DeviceId + buttons + MAC + started time consolidated into one
   small grey row instead of three separate sections
 - Progress bar 1.2rem -> 0.7rem
 - mb-4 -> mb-2 between cards
 - card-body py-2 for tighter vertical rhythm

Search:
 - Sticky search input above the card list
 - Filters live on serial, hostname, pctype, machinenumber,
   intune_device_id via lowercase substring match on a data-filter
   attribute
 - Visible-count badge updates as you type ("3/12")
 - Auto-refresh paused while query has text or while input is focused

Stage 7 label: was "assign category" only, now "awaiting category /
lockdown" to reflect that bays past category assignment are still
waiting on the Intune-driven LAPS-prompt reboot before lockdown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 20:14:51 -04:00
cproudlock
ca647cb690 imaging: redesign tile + LAPS persist + 15s refresh
Tile redesign:
 - QR (or placeholder if not yet captured) on the left as a fixed 160px block
 - Right side: header (serial / hostname / pctype / machinenumber / status)
   then stage label as a big h4 with stage badge + % on the same row,
   then full-width progress bar, then friendly stage hint
 - Intune device id row with copy + set-category buttons consolidated
   under the progress section
 - Footer one-liner: started / last / MAC / raw current_stage (small grey)
 - LAPS QR + log tail still expandable below
 - shadow-sm for visual lift, no card-header line splitting

LAPS persist: POST password to /imaging/<serial>/laps so it survives
the dashboard refresh. Auto-renders QR on page load if the session
already has a stored password. Clear button POSTs empty string to
wipe server-side. No more 60s auto-clear - stays until cleared (or
daily server reset).

Refresh: 5s -> 15s. Reduces polling jitter + gives the eye time to
read before page flickers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 20:11:58 -04:00
cproudlock
5f322d1110 imaging: operator-friendly stage labels per bay card
Was showing the raw push label (e.g. "Run-ShopfloorSetup: handoff to
Monitor-IntuneProgress") which only makes sense if you know the
playbook internals. Added a stage_index -> (label, hint) lookup table:

  1  Booting from PXE                    WinPE loaded
  2  Configuring Windows                 First boot baseline scripts
  3  Installing apps                     09-Setup-<pctype>
  4  Apps installed                      preparing for enrollment
  5  Enrolling in Intune                 PPKG + AAD/Intune join
  6  Waiting on first Intune sync        post-PPKG settle (~120s)
  7  Registered - assign category        idx=7 with QR + set-category btn
  8  Imaging complete                    lockdown applied
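
Expressed as data, the table above could be held in a dict like the one below. This is a sketch: the names are hypothetical, and falling back to the raw stage string for an unknown index is an assumption.

```python
from typing import Tuple

# stage_index -> (friendly label, one-line hint), copied from the table above.
STAGE_LABELS = {
    1: ("Booting from PXE", "WinPE loaded"),
    2: ("Configuring Windows", "First boot baseline scripts"),
    3: ("Installing apps", "09-Setup-<pctype>"),
    4: ("Apps installed", "preparing for enrollment"),
    5: ("Enrolling in Intune", "PPKG + AAD/Intune join"),
    6: ("Waiting on first Intune sync", "post-PPKG settle (~120s)"),
    7: ("Registered - assign category", "idx=7 with QR + set-category btn"),
    8: ("Imaging complete", "lockdown applied"),
}


def describe_stage(stage_index: int, raw_stage: str) -> Tuple[str, str]:
    """Return (label, hint); unknown indexes fall back to the raw string."""
    return STAGE_LABELS.get(stage_index, (raw_stage, ""))
```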

Friendly label + one-line hint shown bold, raw stage string shown
underneath in small monospace for techs who want the playbook
breadcrumb. Stage index/total folded into a badge next to the
"Current stage" header so it doesn't need its own column.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 20:08:05 -04:00
cproudlock
1b7e1bfee4 imaging: pause page auto-refresh while a LAPS QR is showing
meta http-equiv=refresh fires every 5s and reloads the entire page,
wiping the LAPS QR state mid-scan. Replaced the meta tag with a
JS-driven setTimeout(location.reload, 5000) so renderLapsQR() can
clearTimeout it. Reload resumes when the QR is cleared (manual or
60s auto). Multi-bay safety: only resumes if no other bay still has
a QR rendered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 19:50:24 -04:00
cproudlock
d5398bdd74 imaging: LAPS-password-to-QR generator per bay card
Per-bay <details> section with:
 - Input field for LAPS password (paste from Intune portal manually,
   since deep-link to LAPS blade needs AAD objectId we can't obtain)
 - Make QR button generates a client-side QR from the input
 - QR displayed below at 280px with 4-cell quiet zone
 - Auto-clears input + QR after 60s with live countdown
 - Manual Clear button
 - Enter key on the input also triggers QR generation

Password never POSTs to server, never logged, never persists past the
60s window. Generated using the same qrcode-generator lib already
loaded for the device-id QR. Scan with a USB barcode scanner plugged
into the bay (HID keyboard mode) -> password types into bay login
field. Faster than reading off the Intune portal letter-by-letter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 19:48:43 -04:00
cproudlock
cdb6655e4a imaging: drop LAPS deep-link, keep only category
LAPS retrieval blade is keyed on AAD object id, not aadDeviceId /
mdmDeviceId. We capture aadDeviceId from dsregcmd; resolving to
objectId would require a Graph API call with Device.Read.All which
we don't have at WJ. Removed the LAPS button - operator goes to
Intune portal manually for LAPS as before.

set-category button stays - aadDeviceId works for that blade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 19:47:46 -04:00
cproudlock
74ba3d1339 imaging: deep-link buttons for Set Category + LAPS per bay
Two buttons next to the Intune device id on each bay card:
 - "set category" -> portal.azure.us Intune device blade properties
   via aadDeviceId/{deviceId}
 - "LAPS" -> intune.microsoft.us encryptionKeys blade via
   mdmDeviceId/{deviceId}

Both use the dsregcmd DeviceId we already capture - no Graph API
lookup or objectId resolution needed. One click from the dashboard
takes the tech to the right page for category assignment or LAPS
retrieval.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 19:20:10 -04:00
cproudlock
ce604adcda Renumber PXE LAN from 10.9.100.0/24 to 172.16.9.0/24
Single-site bay-stuck issue at WJ: GE Intune Report IP script filters
Get-NetIPAddress on StartsWith("10.") and posts everything matching
to the GE Tines webhook. Bays at WJ get the PXE LAN 10.9.100.x IP
captured and reported -> GE backend tags bays as on a non-corp 10.x
subnet -> dynamic group eligibility for SFLD policy never matches.
Other GE sites work because their PXE LANs aren't on 10.x at all.

Renumber PXE LAN to RFC1918 172.16.9.0/24 so the GE filter naturally
skips wired PXE addresses without any disable-NIC dance.
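
The effect of the renumber on a StartsWith("10.") style filter can be illustrated as follows. The GE script itself is PowerShell and is not reproduced here; the function name and the sample addresses are hypothetical.

```python
import ipaddress

OLD_PXE_LAN = ipaddress.ip_network("10.9.100.0/24")
NEW_PXE_LAN = ipaddress.ip_network("172.16.9.0/24")


def picked_up_by_prefix_filter(addresses):
    """Mimic a plain string-prefix filter on dotted-quad addresses."""
    return [a for a in addresses if a.startswith("10.")]


def on_pxe_lan(address: str) -> bool:
    """True if the address sits on either the old or the new PXE LAN."""
    ip = ipaddress.ip_address(address)
    return ip in OLD_PXE_LAN or ip in NEW_PXE_LAN
```

Both ranges are private address space, but only the old one begins with "10.", so a bay holding a 172.16.9.x lease no longer has its PXE address reported.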

Server-side already in flight (netplan dual-bound, dnsmasq scope +
boot URL repointed, blancco preferences + grub.cfg + iPXE GetPxeScript
all sed'd to 172.16.9.1). This commit is the playbook / scripts /
docs side: 109 hits across 35 files sed'd in one shot.

After this lands + boot.wim is rebuilt + bays renumber off DHCP,
the 10.9.100.1 binding will be dropped from netplan as the final
cleanup step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 16:30:32 -04:00
cproudlock
0eb52c6a9e imaging: copy button HTTP fallback (execCommand)
navigator.clipboard.writeText is gated on isSecureContext - HTTPS
or localhost only. PXE dashboard is served over plain HTTP
(10.9.100.1:9009), so navigator.clipboard was undefined and the call
threw synchronously before .catch could run - user saw nothing. Wrap
clipboard write in
copyText() that prefers the modern API and falls back to the
classic invisible-textarea + document.execCommand('copy') path
which works on HTTP. Visual flash logic moved into flashCopied()
for reuse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 18:30:16 -04:00
cproudlock
6275a6a2b3 imaging: add visual feedback to device-id copy button
Click effect: button flashes green with "copied!" text and 1.15x
scale pulse, reverts after 1.2s. Failure case (clipboard API blocked
or HTTP context) shows red "failed" for 1.5s. Handler moved out of
inline onclick into a single delegated click listener at the doc
level so future copy buttons just need the .copy-btn class +
data-copy-text attribute.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 18:29:00 -04:00
cproudlock
59dbd64e37 Fix Report IP glob (.LOG not .txt) + add device-id copy button
Field bay surfaced two bugs in one diag dump (mdm-diag-F907T5X3 -
6PPSF24):

1. GE Proactive Remediation Report IP actually writes
   GE_Report_IP_Address_2_5.LOG (uppercase .LOG), not the .txt I
   assumed. Globs in two places had .txt filter -> never matched ->
   Phase 1 stuck IN PROGRESS forever even after the file landed and
   wired-NIC re-enable never fired. Drop extension from both globs
   in Monitor-IntuneProgress.ps1 (idx=7 push gate + p1Done check).

2. The "GE Re-enable Wired NICs" SYSTEM task registered by
   Run-ShopfloorSetup was polling Autologon_Remediation.log for
   "Autologon set for ShopFloor" - a lockdown-time signal. Re-enable
   needs to fire at Report-IP time (well before lockdown) so that
   Monitor can push idx=7 with the QR before the Intune-triggered
   LAPS-prompt reboot. Repoint the SYSTEM task's poll to
   C:\Logs\GE_Report_IP_Address* (any extension).
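
The glob mismatch from item 1 can be reproduced with a small sketch. PowerShell wildcard matching is case-insensitive by default, which is mimicked here by lowercasing both sides; the patterns are paraphrased from the description above rather than copied from Monitor-IntuneProgress.ps1.

```python
from fnmatch import fnmatchcase

REPORT_FILE = "GE_Report_IP_Address_2_5.LOG"

OLD_PATTERN = "GE_Report_IP_Address*.txt"  # assumed shape of the old glob
NEW_PATTERN = "GE_Report_IP_Address*"      # extension dropped


def wildcard_match(name: str, pattern: str) -> bool:
    """Case-insensitive wildcard match, similar to PowerShell's default."""
    return fnmatchcase(name.lower(), pattern.lower())
```

wildcard_match(REPORT_FILE, OLD_PATTERN) is False because of the extension, not the case, while the extension-less pattern matches the .LOG file and would also match a .txt one.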

Plus minor UX: copy button next to the Intune device ID on
/imaging dashboard so techs can grab the GUID without having to
double-click-select the <code>.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 18:27:26 -04:00
cproudlock
a8d38f6117 imaging: load Send-PxeStatus at script scope + bump QR size to 160px
Monitor-IntuneProgress.ps1: the previous Ensure-SendPxeStatus function
ran '. $lib' from inside the function body. PowerShell's dot-source-
inside-function semantics put the imported Send-PxeStatus into the
function's LOCAL scope, not the script scope. By the time Get-Phase1
called Get-Command Send-PxeStatus, the function had already returned
and Send-PxeStatus was out of scope - silently never invoked, no log
entry at all (success or failure). Diagnostic confirmed: bay had
DeviceId in dsregcmd, manual Send-PxeStatus from operator prompt
fired idx=7 cleanly with QR rendered, but Monitor's automatic call
never showed up in C:\Logs\send-pxe-status.log.

Fix: dot-source at script top-level (outside any function). Then
Send-PxeStatus is in script scope where every function in the file
can call it. Keep Ensure-SendPxeStatus as a no-op stub for any caller
still invoking it.

imaging.html: bump QR data-qr-size from 56 to 160 px. A 36-char UUID
at ECC M needs ~29x29 modules; at 56px each module was ~1.5px which
is too tight for a phone camera to lock onto from typical distance.
160 px gives ~5 px/module which scans cleanly.
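
The module-size arithmetic above can be checked with a short sketch. The byte-mode capacities at ECC level M for versions 1-4 come from the QR specification tables. Whether data-qr-size includes the quiet zone is not stated in the commit, so the quiet zone is left as a parameter.

```python
# Byte-mode capacity at error-correction level M, QR versions 1-4.
BYTE_CAPACITY_M = {1: 14, 2: 26, 3: 42, 4: 62}


def smallest_version(n_bytes: int) -> int:
    """Smallest QR version (1-4) that fits n_bytes at ECC level M."""
    for version in sorted(BYTE_CAPACITY_M):
        if n_bytes <= BYTE_CAPACITY_M[version]:
            return version
    raise ValueError("payload needs a version above 4")


def modules_per_side(version: int) -> int:
    """A version-v symbol is 17 + 4*v modules wide."""
    return 17 + 4 * version


def px_per_module(size_px: int, modules: int, quiet_zone: int = 0) -> float:
    """Pixels per module when the symbol plus quiet zone fills size_px."""
    return size_px / (modules + 2 * quiet_zone)
```

A 36-character UUID needs version 3, i.e. 29 modules per side. With a 4-module quiet zone, 56 px works out to about 1.5 px per module and 160 px to about 4.3; without a quiet zone, 160 px gives about 5.5.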

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 15:41:52 -04:00
cproudlock
9122b28c31 webapp: imaging progress dashboard + serial column on reports list
Adds end-to-end progress tracking for PXE imaging sessions and surfaces
each Blancco report's BIOS serial in the report list.

webapp:
  * services/imaging_status.py - JSON-per-serial state store under
    IMAGING_DIR (default /var/log/pxe-imaging). Atomic write via
    tempfile + rename. log_tail capped at 50 lines. Merges partial
    updates so clients can post just the current_stage tick.
  * config.py - new IMAGING_DIR env-overridable path.
  * services/csrf.py - explicit exempt list for machine-to-machine
    endpoints; /imaging/status is the first entry. Air-gapped LAN;
    trust-by-network for client posts.
  * app.py - four new routes:
      GET  /imaging               dashboard (renders all sessions)
      POST /imaging/status        client status push (JSON body)
      GET  /imaging/<serial>.json raw session JSON for ad-hoc polling
      POST /imaging/delete/<s>    clear a session from the dashboard
    Also parses each Blancco XML in the /reports list to surface
    system.serial + system.model columns.
  * templates/imaging.html - Bootstrap dashboard with per-session
    cards (state badge, progress bar, stage idx/total, mac, elapsed,
    log tail). meta http-equiv refresh=5 for auto-tick.
  * templates/base.html - new "Imaging Progress" nav entry.
  * templates/reports.html - Serial + Model columns added.
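
A minimal sketch of the state-store behaviour described for services/imaging_status.py: merge a partial update, cap log_tail at 50 lines, and write atomically via a temp file plus rename. The function name and the exact field handling are assumptions, and a real implementation would also validate the serial before using it in a path.

```python
import json
import os
import tempfile
from pathlib import Path

LOG_TAIL_MAX = 50


def update_session(imaging_dir: Path, serial: str, patch: dict) -> dict:
    """Merge a partial update into <serial>.json and write it atomically."""
    path = imaging_dir / f"{serial}.json"
    if path.exists():
        state = json.loads(path.read_text())
    else:
        state = {"serial": serial, "log_tail": []}

    patch = dict(patch)  # do not mutate the caller's dict
    new_lines = patch.pop("log_lines", [])
    state.update(patch)
    state["log_tail"] = (state.get("log_tail", []) + list(new_lines))[-LOG_TAIL_MAX:]

    # Write to a temp file in the same directory, then rename over the
    # target so readers never observe a half-written JSON file.
    fd, tmp = tempfile.mkstemp(dir=imaging_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(state, fh)
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return state


def demo():
    """Two pushes for one bay: a full one, then just a stage tick."""
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        update_session(root, "SER123", {
            "current_stage": "start",
            "mac": "aa:bb:cc:00:11:22",
            "log_lines": [f"line {i}" for i in range(60)],
        })
        state = update_session(root, "SER123", {"current_stage": "PPKG enrollment"})
        leftovers = [p.name for p in root.iterdir() if p.suffix == ".tmp"]
        return state, leftovers
```

The second push carries only current_stage, yet the MAC and the capped log tail from the first push are preserved, which is what lets clients post just the stage tick.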

playbook:
  * shopfloor-setup/Shopfloor/lib/Send-PxeStatus.ps1 - new helper.
    Dot-source this then call Send-PxeStatus -Stage X -StageIndex N
    -StageTotal M from any stage script. BIOS serial via CIM, MAC via
    Get-NetAdapter, pctype + machinenumber from C:\Enrollment.
    Failures are swallowed to a local log so a network blip doesn't
    block imaging.
  * shopfloor-setup/Run-ShopfloorSetup.ps1 - dot-sources helper +
    posts at three coarse milestones (start, PPKG enrollment,
    handoff to Monitor-IntuneProgress).
  * shopfloor-setup/gea-shopfloor-keyence/09-Setup-Keyence.ps1 -
    posts at session start + after Install-FromManifest with
    succeeded/failed status derived from $rc. Other 09-Setup-*.ps1
    scripts can follow the same pattern.

ID is BIOS serial (stable across WinPE -> Windows transition and
across reboots, unlike hostname which is random pre-PPKG). Operator
already knows the serial of the bay they imaged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 10:07:18 -04:00