Commit Graph

128 Commits

Author SHA1 Message Date
cproudlock
e9fc284dcb Restore-UDCData: mount share with SFLD creds instead of raw UNC from SYSTEM
Symptom: every Restore-UDCData log entry showed bay-level files as 'absent'
even when they actually existed on the share - on a device where another
PC's run had successfully consumed and migrated the same backup. Endless
'no work this cycle' loop on the device that should have done the consume.

Cause: script ran as NT AUTHORITY\SYSTEM (manifest engine on logon).
SYSTEM authenticates to remote SMB as the COMPUTER ACCOUNT
(DOMAIN\HOSTNAME$), not as a user. The SFLD share's ACL grants top-level
enumeration to authenticated computers (so Test-Path on share root +
bay dir returned True) but file-level read only to the SFLD user. With
no explicit user creds, Test-Path on bay-level files returns False -
indistinguishable from 'file not found' - so the script silently logged
'absent' on files that actually exist. A different PC with proper creds
consumed bay 3207 first; ours kept polling forever.

Update-MachineNumber.ps1's branch already worked around this by calling
Mount-SFLDShare (Restore-EDncReg.ps1's helper that reads
HKLM:\SOFTWARE\GE\SFLD\Credentials\* and net-use's the share with the
SFLD user identity).

Fix: Restore-UDCData.ps1 now does the same. Replaces raw-UNC Test-Path
polling with Mount-SFLDShare, probes via the W: drive letter, and
unmounts on every exit path. If creds are missing in registry the script
fails fast with a clear ERROR rather than masquerading as 'no backup'.
2026-05-01 11:50:04 -04:00
cproudlock
a72db8af5e Add Install-UDCWebServerConfig.cmd for ongoing manifest enforcement
Wrapper invoked by Install-FromManifest.ps1 (Type=CMD) when Hash detection
on C:\ProgramData\UDC\udc_webserver_settings.json fails. Mirrors the
Install-eMxInfo.cmd pattern: copies the colocated json from the SFLD
machineapps share into ProgramData\UDC.

Manifest entry (with DetectionValue 4E04A865...DEA3) goes in
machineapps-manifest.json on the SFLD share - separate from this repo.
2026-04-30 12:58:00 -04:00
cproudlock
75b85bfde6 Update-MachineNumber: pull per-bay udc_settings.json from SFLD on placeholder->real
When the tech transitions a 9999-placeholder PC to its real machine number,
also restore the per-bay udc_settings_<num>.json from
\\tsgwp00525\shared\spc\udc\settings_backups\. PXE-time preinstall can't reach
this share (no SFLD creds yet), so 00-PreInstall uses the local C:\Enrollment
mirror; post-config the share is reachable, so the renumber path goes direct
to the canonical source.

Adds udcSettingsSharePath to site-config.json under Standard-Machine.

Bundles in prior uncommitted work in the same file: ntlars reg restore,
UDC data restore (CurrentData.json + ArchivedData/), MTConnect Devices.xml
inline rewrite + service restart, and one-shot consume of per-bay UDC
backup -> migrated/<timestamp>/.
2026-04-30 12:34:53 -04:00
cproudlock
6e9053b83c 00-PreInstall: pre-stage udc_webserver_settings.json + firewall/NetFx3 hardening
Add staging block that copies udc_webserver_settings.json from the enrollment
share to C:\ProgramData\UDC during preinstall, mirroring the existing
udc_settings.json pattern. New PCs were imaging without UDC web server
config because the file was never wired into the imaging flow (only the
remote-maintenance task in powershell/remote-execution touched it).

Also folds in two prior uncommitted hardening blocks in the same script:
firewall NotifyOnListen=False (suppress Oracle OUI's listen-port prompt)
and NetFx3 pre-enable (Oracle 11.2's welcome path needs .NET 3.5).
2026-04-30 12:16:41 -04:00
cproudlock
4f4f1f43e8 Restore-UDCData: handle ArchivedData-only backups (no CurrentData.json)
Production case: bay 3207 had ArchivedData\ on the share with full
production records but no CurrentData.json at the bay root. The previous
Restore logic treated CurrentData.json as the marker for "valid backup"
and exited early when absent, so the script silently no-op'd every cycle
even though there was real archive data ready to restore.

Asymmetric with Backup-UDCData.ps1, which already handles missing
CurrentData.json gracefully (it copies whatever exists). Possible causes
of CurrentData.json absence in a backup: source PC had no live UDC
session at backup time (UDC inactive / not recording), backup partially
failed for that one file (no Backup-side log to confirm without rerun).

Either way, an ArchivedData-only backup is still a valid backup.

Behavior change:
- Early-exit only when BOTH CurrentData.json AND ArchivedData\ are
  absent. Otherwise proceed with whatever exists.
- Copy step for CurrentData.json wrapped in srcCurExists guard.
- consumeOk now requires: every present source successfully copied,
  AND at least one thing was actually copied.
- Move-to-migrated wraps CurrentData.json move in Test-Path guard
  (was already guarded for ArchivedData).
- restore.manifest.json gains CurrentDataPresent and ArchivedDataPresent
  booleans so future audits can see which side actually restored.
- UDC relaunch now fires when EITHER copy succeeded (was only on
  CurrentData.json copy).

Verbose logs now distinguish three cases at the early-exit:
- Both absent: "no work to do this cycle" (the 99% path)
- Only ArchivedData\: WARN with explanation, proceed
- Only CurrentData.json: WARN with explanation, proceed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:49:38 -04:00
cproudlock
7be5518fd7 Fix v2 imaging: copy common/ at imaging time + use $setupDir not $PSScriptRoot
Two bugs that have been silently masking GE-Enforce registration since
Stage 2a landed 2026-04-22, surfaced when v1 enforcers were retired
(commit 0badfc1) and could no longer cover for the missing v2 registration.

Bug 1: startnet.cmd at imaging time only xcopied Shopfloor\ and the
PCTYPE-specific dir from the imaging share to W:\Enrollment\shopfloor-setup\.
common\ was never copied. v1 dispatchers lived per-pctype and rode in via
the %PCTYPE% xcopy, so this was never noticed. v2's GE-Enforce.ps1 +
Register-GEEnforce.ps1 + lib\Install-FromManifest.ps1 all live in common\
and got skipped at imaging entirely.

Fix: add a third xcopy block for common\, mirroring the Shopfloor\ block
above it. Applies to playbook/startnet.cmd and startnet-template.cmd.

Bug 2: Run-ShopfloorSetup.ps1 line 288 set $commonSetupDir via
'Join-Path $PSScriptRoot common'. Run-ShopfloorSetup.ps1 lives at
C:\Enrollment\Run-ShopfloorSetup.ps1 (xcopied by startnet.cmd), so
$PSScriptRoot resolves to C:\Enrollment, and $commonSetupDir resolved
to C:\Enrollment\common - which is NOT where common\ lives even after
the bug 1 fix (correct path is C:\Enrollment\shopfloor-setup\common\).
The Test-Path -LiteralPath check on Register-GEEnforce.ps1 returned
false silently and GE-Enforce never registered.

Same bug existed for Register-MapSfldShare on line 321.

Fix: $PSScriptRoot -> $setupDir for both. $setupDir was already defined
on line 51 as Join-Path $enrollDir "shopfloor-setup", which is the path
the rest of the script uses consistently.

Pre-v1-cleanup, v1's per-pctype enforcer registrations on lines 322-357
(now deleted) ran independently and covered the gap, so PCs ended up
with v1 enforcers and the user thought v2 was running. Post-cleanup,
this bug means nothing gets registered.

PXE server has been patched directly: boot.wim re-baked with the new
startnet.cmd, /srv/samba/enrollment/shopfloor-setup/Run-ShopfloorSetup.ps1
replaced. New PXE-imaged PCs from this point forward will register
GE-Enforce correctly.

For PCs imaged before this fix: run Deploy-GEEnforce.ps1 from the SFLD
share's _meta/runtime/ to retrofit. Same one-liner used for promoting
v1 PCs to v2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:31:15 -04:00
cproudlock
26ecd0da0a 01-eDNC.ps1: match both eDNC-*.msi and eDNC_*.msi naming styles
Vendor stamps the 6.4.5 MSI as eDNC_6-4-5.msi (underscore + hyphens) but
prior versions shipped as eDNC-6.4.3.msi (dash). The previous filter only
matched the dash form so the imaging-time install would skip 6.4.5 outright.
Filter is now eDNC*.msi which catches both. Imaging dir is expected to hold
exactly one version at a time; rollback to a prior version is handled
post-imaging via the SFLD share's standard-machine/apps/ alternates, not by
keeping multiple MSIs in the imaging path.

Also updated the Write-Warning fallback to mention the new filter pattern.

PXE server's /srv/samba/enrollment/shopfloor-setup/Standard/eDNC/ has been
swapped 6.4.3 -> 6.4.5 alongside this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 13:45:38 -04:00
cproudlock
6b3690e286 Restore-UDCData: verbose per-cycle logging + share-reachability retry
Two production-debuggability gaps closed.

1. Logging is now always-on. The previous version exited silently on the
   common no-op paths (no UDC installed, no backup waiting, share not
   reachable), leaving zero log evidence when techs reported "restore
   didn't happen". New behavior writes a header + identity + share-path
   + decision-point line to a single rotating log file every cycle.
   Errors include exception type, position, and full ScriptStackTrace.
   Log lives at C:\Logs\UDC\Restore-UDCData.log with a 1 MB cap and
   one-generation rotation to .old.log.

2. Share-reachability is now polled instead of probed once. The SFLD
   share over the SMB redirector takes 20-60 s to become reachable
   from SYSTEM context after a cold logon, especially on the first
   GE-Enforce cycle of the boot. The old single Test-Path returned
   false in that window and the script silently exited, missing the
   backup. New behavior polls Test-Path on the share root every 3 s
   for up to 60 s (both tunable via -ShareTimeoutSec / -SharePollSec)
   before deciding "no backup". If the share never comes up in that
   window the script exits 1 instead of 0 so the dispatcher logs a
   visible failure.

Both behaviors propagated to the host staging copy at
/home/camp/pxe-images/Restore-UDCData.ps1 and to the v2 share-staged
copy at tsgwp00525-v2/.../standard-machine/scripts/Restore-UDCData.ps1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:49:04 -04:00
cproudlock
e169f8d0f5 Standard-Machine: UDC backup/restore use ArchivedData (not ArchiveData)
UDC's per-bay archive directory is C:\ProgramData\UDC\ArchivedData, not
ArchiveData. The previous spelling was a typo introduced when the scripts
were first written; it would have meant Backup-UDCData.ps1 found no archive
content (silent zero-file backups), and Restore-UDCData.ps1 wrote into a
location UDC does not read from.

Path swap is straight string replacement across both scripts plus the .bat
wrapper's usage comment. Manifest field names in backup.manifest.json /
restore.manifest.json (ArchivedDataPresent, ArchivedDataFiles,
ArchivedDataBytes) updated to match.

Update-MachineNumber.ps1's parallel UDC-restore branch (still uncommitted
in a prior workstream) has the same fix in the working tree, captured in
that branch's eventual commit.

The v2 share-staged copy at tsgwp00525-v2\standard-machine\scripts\
Restore-UDCData.ps1 also got the fix and is ready for push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:45:39 -04:00
cproudlock
0badfc1983 Retire v1 per-pctype enforcers; GE-Enforce is the sole dispatcher
Stage 2a (GE-Enforce.ps1, landed 2026-04-22) is now the only ongoing-update
enforcer. The legacy per-pctype tasks (Machine-Enforce, Common-Enforce,
CMM-Enforce, Keyence-Enforce, Acrobat-Enforce) were kept as transition
belt-and-suspenders; with retrofitted PCs handled, the v1 path is dead and
gets removed entirely.

Deleted (13 files):
  Standard/{Machine-Enforce,Register-MachineEnforce}.ps1
  Standard/machineapps-manifest.template.json
  common/{Common-Enforce,Acrobat-Enforce,Register-CommonEnforce,Register-AcrobatEnforce}.ps1
  common/common-apps-manifest.template.json
  CMM/CMM-Enforce.ps1
  Keyence/Keyence-Enforce.ps1
  {CMM,Keyence,Standard}/lib/Install-FromManifest.ps1 (orphan dups of common/lib)

Trimmed:
  Run-ShopfloorSetup.ps1: dropped the legacy register-* invocations (Common,
    Machine) and the transition-period comment. Sole enforcer registration
    is now Register-GEEnforce.
  09-Setup-Keyence.ps1: keeps imaging-time install (step 1); removes the
    enforcer staging (step 2) and scheduled-task registration (step 3).
    Library lookup repointed to common/lib/Install-FromManifest.ps1.
  09-Setup-CMM.ps1: same treatment - keeps .NET 3.5 enable, install,
    PC-DMIS ACL grants, and bootstrap cleanup. Library repointed to common/lib.
  cmm-manifest.json + keyence-manifest.json: _comment fields updated to
    reflect imaging-time-only role (ongoing enforcement now goes through
    the v2 share manifests via GE-Enforce).

Verified clean: no orphan references to *-Enforce.ps1 / Register-*Enforce.ps1
/ machineapps-manifest / common-apps-manifest in any code path that runs.
A few historical mentions remain in unmodified header comments (GE-Enforce.ps1,
Deploy-GEEnforce.ps1, Monitor-IntuneProgress.ps1) describing what the new
dispatcher replaced; left as historical context.

Run-ShopfloorSetup.ps1 also picks up an unrelated 1-line hunk adding
SetShopfloorAutoLogon.bat to the desktop-copy list (already in the working
tree from a prior session). The file itself is not yet tracked; the
desktop-copy step is Test-Path-guarded so this is harmless until the
.bat is committed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:55:40 -04:00
cproudlock
8564a37541 Standard-Machine: UDC data backup + restore scripts for PC swap workflow
Backup-UDCData.bat / Backup-UDCData.ps1: tech-runnable, UAC-self-elevating.
Run on the OLD PC before retirement; reads bay number from
udc_settings.json, copies CurrentData.json + ArchiveData/ to
\\tsgwp00525\...\backup\udc\<bay>\, drops backup.manifest.json. Refuses
the 9999 placeholder so backups never collide across PCs.

Restore-UDCData.ps1: idempotent, designed for the manifest engine. 99%
of cycles silent no-op (sub-second, zero side effects); 1% (cycle after
a backup lands at this PC's bay) restores files locally, moves consumed
backup to <bay>\migrated\<timestamp>\, writes restore.manifest.json,
relaunches UDC. Round-trip + no-op fast path verified end-to-end on the
win11 analyzer VM. Already wired into the Standard-Machine GE-Enforce
manifest at standard-machine\manifest.json on the v2 share.

Complementary to the placeholder-to-real branch in Update-MachineNumber.ps1:
that branch covers the 9999 -> real flow, this one covers the
pre-imaged-then-swapped flow where Update-MachineNumber already ran
before any backup existed. Both safely no-op if the other consumed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:27:20 -04:00
cproudlock
70f176650b Blancco: playbook now produces working Ubuntu-kernel initramfs out of the box
Companion to the previous commit (4550d43). Three files that should have
been in the same commit but got left out of `git add`:

- .gitignore: negate rule for boot-tools/blancco/grub-blancco.cfg so the
  tracked cfg (source of truth for grubx64.efi rebuilds) survives
  the blanket boot-tools/ ignore.

- playbook/blancco-init.sh: rewritten for modprobe-with-deps, full NIC
  driver coverage, set -x trace to /dev/console, dmesg + PCI-device +
  /proc/modules dump + interactive shell on "no NIC after 60s".
  Replaces the narrow insmod-loop version that silently hung on
  unsupported NICs.

- playbook/pxe_server_setup.yml "Build Blancco PXE initramfs" task now
  sweeps the full drivers/net/ tree (ethernet + phy + mdio + usb + fddi
  + wan) plus overlay / squashfs / loop / ptp / libphy / mii deps, runs
  depmod to regenerate modules.dep inside the initramfs (required for
  modprobe dependency resolution), and symlinks the full applet list
  blancco-init.sh needs (modprobe, insmod, dmesg, find, env, etc).
  Result: ~20 MB initramfs vs the old 2 MB narrow build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 18:08:57 -04:00
cproudlock
03ed694671 pxe_server_setup: close three playbook gaps from 2026-04-22 session
1. Deploy gea-standard / gea-engineer FlatUnattendW10.xml
   The playbook only copied the shopfloor variant before; standard and
   engineer's unattend was hand-staged on the running servers. New task
   loops the repo's playbook/FlatUnattendW10.xml into Deploy/ for each
   entry in standard_types (new var covering gea-standard, gea-engineer,
   ge-standard, ge-engineer). force: yes because repo drift vs deployed
   copy is what produced the Win10/Win11 search-cleanup regression
   earlier this session (d49f516).

2. Deploy Oracle Client 11.2 preinstall payload
   preinstall.json now leads with Oracle 11.2 (commit 3a29784). The CMD
   wrapper is tracked in the repo at playbook/preinstall/oracle/; the
   686 MB Oracle_OracleDatabase_11r2_V03.zip is too large to commit and
   rides on USB under oracle/ alongside BIOS exes. Three tasks:
   mkdir staging dir, copy CMD from usb_mount, copy zip from usb_root
   with a soft-fail + warning if absent.

3. No change needed for sync-preinstall.sh — Oracle 10.2.0.3 flat
   installer was already dropped in 9235d19.

YAML lints clean. Fresh server built from this commit will bring up
Blancco-agnostic imaging paths correctly; Blancco-specific gaps
(grubx64.efi native-vs-slim, narrow kexec-initrd driver tree,
narrow blancco-init.sh) are still deferred per earlier "option B"
decision and remain server-side-pinned only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:59:08 -04:00
cproudlock
d49f516b16 FlatUnattendW10: catch repo up to deployed state + Win10/Win11 search cleanup
Two things in one pass because the repo copy was 162 lines behind the
deployed one already:

1. Sync repo to the currently-deployed FlatUnattendW10.xml baseline
   (Java JRE 8 u441 + Java auto-update pins + Cortana/Bing/Search
   disable block that had been added on-server but never committed).

2. Prune three ineffective registry entries and replace the Bing
   suppression with a documented equivalent that works on both Win10
   and Win11:
   - DROP #32  HKLM\...\Search\CortanaEnabled=0
               Undocumented at HKLM (the real key is HKCU). No effect.
   - DROP #37  AllowCortanaAboveLock=0
               Deprecated per AboveLock Policy CSP. Cortana app was
               removed from Win11 in Canary 25967 anyway.
   - REPLACE #34  BingSearchEnabled (HKLM, undocumented) with
                  DisableSearchBoxSuggestions=1 written into the
                  Default User hive so every new account inherits it.
                  This is the Microsoft-documented kill-switch for
                  Bing / web results in Start-menu search on both
                  Win10 and Win11.

Validated XML well-formed (xmllint + Python ET). RunSynchronous orders
remain unique and ascending after the deletions. Deployed to both PXE
servers under /srv/samba/winpeapps/{gea-engineer,gea-standard}/Deploy/
with timestamped .pre-winsearch-cleanup-* backups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:55:16 -04:00
cproudlock
3a29784124 preinstall: add Oracle Client 11.2 as first application
Oracle 11.2 is now installed at image build time (preinstall) rather
than deferred to the runtime enforcer. eDNC / NTLARS / UDC / CMM tooling
all link against the Oracle home, so shipping an image without Oracle
means the first-boot experience is broken until the enforcer completes.

Uses the EXE-launches-CMD trick (same pattern as OpenText Setup-OpenText
.cmd) since the preinstall runner only knows MSI/EXE. The wrapper itself
accepts the zip next to the .cmd (preinstall flat layout) OR under
..\apps\ (SFLD share layout for the runtime enforcer), so one script
serves both paths. First entry in Applications[] so downstream apps see
Oracle already present.

The 686 MB Oracle_OracleDatabase_11r2_V03.zip lives outside git and is
pushed to /srv/samba/enrollment/pre-install/installers/oracle/ separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:19:45 -04:00
cproudlock
9235d19e55 preinstall: drop Oracle 10.2.0.3, defer to runtime enforcer for 11.2
10.2.0.3 was being installed into every Standard/CMM/Genspect/Keyence/
WaxAndTrace/Display image at build time. Oracle 11.2 is now installed
and version-enforced at runtime by the GE-Enforce common manifest
entry (Install-Oracle11r2.cmd), so baking 10.2 into the image creates
drift: 10.2 on disk, 11.2 expected by registry. Remove from preinstall.json
and drop the flat installer from sync-preinstall.sh so new builds come
up clean and the enforcer does the install on first boot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:15:01 -04:00
cproudlock
f7a14f6804 blancco-preferences: disable wired LAN so WiFi INTERNETACCESS takes default route
The config.img internal preferences.xml has <network><enabled>false</enabled>
(set up by b7cd097 for native-kernel BMC licensing over WiFi). The sibling
playbook/blancco-preferences.xml was left at true, so the Ubuntu-kernel
switch_root path downloaded a preferences.xml that re-enabled wired LAN,
killing the WiFi default route and preventing BMC reach.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 15:52:44 -04:00
cproudlock
c918dea9d1 Revert all Blancco changes from this session
User reports Blancco was working before our mirror/session activity
today - then my attempted fixes (grubx64.efi rebuild, kexec-initrd
driver sweep, verbose blancco-init.sh) made it worse:

  - First attempt (narrow igc driver add) did not help because the
    switch-root path was not the one actually loaded by grubx64.efi's
    embedded config.
  - Second attempt (swapped grub embedded config to Ubuntu-kernel path)
    got further, but then kexec-initrd modules failed on insmod.
  - Third attempt (full ethernet tree sweep) pulled in broken ancient
    drivers (winbond-840, w5100-spi, xirc2ps_cs) that failed with
    unknown-symbol errors and prevented good drivers from loading.

Full revert: .gitignore, blancco-init.sh, pxe_server_setup.yml back to
the pre-session commit 6dcf832 state. Removes boot-tools/blancco/grub-
blancco.cfg from git (it was only added this session).

Runtime on both PXE servers was also restored: grubx64.efi and
kexec-initrd.img reverted from the .bak files taken before each
modification this session.

Whatever was there before today is now restored byte-for-byte on both
servers. If there is still a Blancco boot issue on specific modern
hardware that the user needs to fix, we will diagnose that narrowly
against the actual failure mode on that specific machine, not by
making sweeping preemptive changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:33:49 -04:00
cproudlock
d7ec6a2b5f Blancco: sweep full NIC driver tree into kexec-initrd + verbose init
Previous approach listed ~6 specific drivers (e1000e, igb, tg3, bnx2,
bnxt_en, b44) and silenced insmod errors (2>/dev/null). On modern Dell
fleet (Latitude 5330/5440, Pro-series, newer OptiPlex) this missed
igc (Intel I225/I226) entirely, and for the drivers we did include,
dependency modules they need at insmod time (libeth, libie, dca,
i2c-algo-bit, macsec, mii, libphy, ptp, ...) were never bundled.
insmod does not resolve dependencies, so NIC drivers that need
helpers failed to load silently.

playbook/pxe_server_setup.yml (kexec-initrd build):
  - Sweep the whole drivers/net/ethernet tree (~170 drivers, all
    vendors, ~15 MB total). Drivers for hardware not present skip
    without binding.
  - Add common helper dirs: drivers/net/{phy,mdio}, drivers/i2c/algos,
    drivers/dca, drivers/ptp, net/macsec, drivers/ssb.
  - overlay.ko kept.

playbook/blancco-init.sh:
  - Load helpers BEFORE main NIC drivers (libeth/libie, dca,
    i2c-algo-bit, macsec, mii, ssb, libphy, mdio*, phy*, ptp*),
    then iterate remaining modules.
  - Remove 2>/dev/null on insmod so actual failures surface on the
    boot console.
  - Print kernel version + /sys/class/net before/after driver load,
    plus dmesg grep for NIC driver activity.
  - On "no interface found" failure, dump dmesg tail and drop to a
    busybox shell for manual debug rather than just hanging.

Separate from this commit but related: kexec-initrd.img on both PXE
servers (.1 and .2) was rebuilt inline with these changes. Pre-rebuild
binary kept as kexec-initrd.img.bak-<timestamp>.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:29:35 -04:00
cproudlock
1886857c0f Use [Environment]::MachineName instead of $env:COMPUTERNAME
Live kernel NetBIOS name instead of the PowerShell process-env cache.

$env:COMPUTERNAME is populated when PowerShell starts and does not
update if the PC gets renamed (common on Intune-managed Autopilot /
AADJ devices that come up with a DESKTOP-XXXXXXXX name and get
renamed by policy post-imaging). Until the next reboot, the env var
stays stale while 'hostname.exe' already reports the new name.

That mismatch showed up live on the first production retrofit: the
status.json was written under _outputs/logs/DESKTOP-XXXXXXXX/
instead of under the device's current name, and the
TargetHostnames filter and monitor drift-check would likewise see
the stale name.

[Environment]::MachineName reads from the kernel on each call, so
it always returns the current NetBIOS name. Swapped at all five
callsites in GE-Enforce.ps1, Register-GEEnforce.ps1, and
Install-FromManifest.ps1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:51:05 -04:00
cproudlock
ba03f63465 Register-GEEnforce: use SHA-256 instead of MD5 for per-PC jitter offset
FIPS-enforced PCs (System cryptography GPO) reject non-approved
algorithms at the .NET crypto API level. MD5 throws
"This implementation is not part of the Windows Platform FIPS
validated cryptographic algorithms" on .Create(), which aborts
Register-GEEnforce before the scheduled task is built.

SHA-256 is FIPS 180-4 approved and its default .NET provider is
validated, so SHA256.Create() works under FIPS mode. Functionally
equivalent for the 0-4 minute modulo we need for jitter.

Hit this live on the first production retrofit. Enforcer runtime
files were copied and legacy tasks were unregistered, but the new
task creation aborted. Rerunning Deploy-GEEnforce.ps1 is idempotent
and recovers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:39:26 -04:00
cproudlock
70a36711c3 Install-FromManifest: InUseCheck ForceClose / CloseAndReopen
Adds partial Stage 2b support for InUseCheck entries in manifests. When
an entry declares InUseCheck.Behavior = ForceClose or CloseAndReopen and
the listed processes are running at install time, the lib now:

  1. Calls CloseMainWindow() on each matching Process handle (polite WM_CLOSE).
  2. Waits GracefulCloseTimeoutSec (default 10) for exit.
  3. Hard-kills the process if it did not exit gracefully.
  4. Proceeds with the install.

"CloseAndReopen" is currently treated the same as ForceClose - no reopen
happens today. Stage 2b will add the user-session scheduled-task trick
to relaunch the closed app in the logged-in user's session. In practice
for the 24/7 ShopFloor persistent-user pattern the operator relaunches
the app manually (or the app is registered as Startup and reopens on
the next reboot), which is acceptable.

Concrete impact: the eDNC entry in standard-machine/manifest.json lists
InUseCheck.Processes = DncMain + NTLARS with Behavior=CloseAndReopen. On
a retrofit or upgrade cycle that finds eDNC 6.4.3 needs to go to 6.4.5,
the lib now force-closes DncMain and NTLARS before msiexec rather than
risking Restart Manager silently scheduling a pending-file-replace that
does not actually upgrade until the next reboot (which on a 24/7 PC
might be never).

Verified in the Win11 analyzer VM against manifests declaring InUseCheck
on eDNC - logs show "InUseCheck: DncMain (PID ...) asked to close"
followed by either graceful exit or the force-kill path, then install
proceeds without 3010.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:17:34 -04:00
cproudlock
eb68793e79 Stage 2a: unified GE-Enforce framework + share-root mirror
Consolidates per-type enforcers (CMM, Keyence, Machine, Common, Acrobat)
into one dispatcher driven by pc-type.txt + site-config and a share-side
manifest layout. Same share is now the single source of truth for routine
software updates without re-imaging.

Runtime:
  common/GE-Enforce.ps1           SYSTEM scheduled task. Reads
                                   common/manifest.json plus optional
                                   <pcType>/manifest.json and
                                   <pcType-subType>/manifest.json.
                                   Dispatches each entry through the lib.
                                   Writes _outputs/logs/<hostname>/status.json
                                   on the share after each cycle for fleet
                                   monitoring.
  common/Register-GEEnforce.ps1   Task registration. Triggers: AtLogOn +
                                   every 5 min (jittered per-PC from
                                   hostname hash) + daily at 05:45,
                                   13:45, 21:45 EST shift windows.
                                   Unregisters legacy per-type tasks on
                                   install so the two coexist at most for
                                   the duration of a single enforce cycle.
  common/Deploy-GEEnforce.ps1     Retrofit helper for already-imaged PCs
                                   (admin-run; copies runtime + registers
                                   task + optional immediate trigger).

Library (common/lib/Install-FromManifest.ps1):
  - New Type values: PS1, BAT, File, Registry, INF
  - New DetectionMethod values: Always, MarkerFile, ValueMatches, pnputil
  - TargetHostnames filter (exact + -like wildcards, ANDed with PCTypes)
  - Schema version check (logs WARN on manifest newer than lib MAJOR)
  - Auto-writes MarkerFile on successful one-shot PS1/BAT/CMD runs
  - MSI log scan on failure surfaces meaningful install errors
  - Lib version bumped 2.0 -> 2.1 for TargetHostnames

Observability:
  common/monitor-fleet-status.py  Scans _outputs/logs/*/status.json for
                                   stale check-ins, failed scopes, and
                                   version drift. Respects scope (dir-name),
                                   PCTypes, and TargetHostnames filters so
                                   entries excluded from a PC do not
                                   false-flag as drift.

Regression harness:
  common/test/                    Parameterized VM harness + README
                                   covering every action type plus
                                   rollback, bad/missing SFLD creds, and
                                   schema versioning.

Imaging integration:
  Run-ShopfloorSetup.ps1 now stages GE-Enforce.ps1 and lib to
  C:\Program Files\GE\Shopfloor\ and invokes Register-GEEnforce.ps1
  at the end of setup. Legacy Register-CommonEnforce invocation is
  kept for the transition; it and the legacy per-type enforcer files
  are dead code once Register-GEEnforce runs and will be removed in a
  dedicated cleanup pass.

Standard-Machine manifest:
  eDNC entry bumped 6.4.3 -> 6.4.5. DetectionValue pinned to the
  4-part FileVersion 6.4.5.0 verified against a fresh install in the
  Win11 analyzer VM. UDC DetectionValue pinned to 1.0.34 (registry
  stores 3-part for UDC; verified live).

scripts/mirror-from-gold.sh:
  Restructured around share-root rsyncs (one pass per Samba share)
  to close gaps in the prior per-subdir layout: winpeapps/_shared/
  Applications (7.5 GB of Adobe + fonts + Java + Office + OpenText
  + printdrivers + wireless + Zscaler), additional winpeapps image
  types, and enrollment flat-layout root files. Adds
  --skip-clonezilla and --skip-reports.

Verified end-to-end in the Win11 analyzer VM:
  - Every action Type and DetectionMethod round-tripped
  - PCTypes filter (Oracle excluded on Shopfloor, Firefox included
    on Shopfloor and DESKTOP-*, excluded elsewhere)
  - TargetHostnames filter (exact, wildcard, no-match)
  - Upgrade path: XML hash bump + fleet re-copy
  - Rollback path: history-archive restore propagates via enforcer,
    fleet converges back without per-PC intervention
  - Status writeback + monitor script drift detection
  - Graceful degradation on bad creds, missing creds, share
    unreachable (all exit 0, log clearly, retry next cycle)

Not in this commit (follow-ups):
  - Retire legacy per-type *-Enforce.ps1 files and simplify
    09-Setup-*.ps1 scripts (coordinated multi-file cleanup)
  - Stage 2b: InUseCheck close-and-reopen, ApplyMode gating,
    UpdateWindow, .apply-now.txt sentinel, BITS pre-staging,
    1618 mutex retry, PostInstallCheck, Uninstall action
  - Management app (manifest CRUD + deploy + rollback + fleet view)
  - ShopFloor autologon persistence bug (deferred for next imaging
    attempt with live registry evidence)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 11:19:23 -04:00
cproudlock
6dcf832ace Keyence ongoing-update enforcer (tsgwp00525 share pattern)
Adds a CMM-style logon enforcer so VR-6000 updates push fleet-wide
without re-imaging.

- keyence-manifest.json: declares VR-6000 MSI (ProductCode-keyed) and
  KEYENCE VR USB driver (pnputil-keyed). Single source of truth for
  both imaging-time and ongoing-enforcement paths.
- lib/Install-FromManifest.ps1: forked from CMM/lib; adds DetectionMethod
  "pnputil" (regex-matches `pnputil /enum-drivers` output) and Type
  "INF" (invokes `pnputil /add-driver /install`). Everything else
  unchanged so CMM-style error parsing + MSI log scanning carry over.
- Keyence-Enforce.ps1: forked from CMM-Enforce.ps1. SYSTEM scheduled
  task, logon trigger, mounts tsgwp00525 SFLD share with creds from
  HKLM:\SOFTWARE\GE\SFLD\Credentials (provisioned by Azure DSC),
  hands off to Install-FromManifest against the share manifest.
- 09-Setup-Keyence.ps1: rewritten around the manifest. Runs
  Install-FromManifest at imaging time, stages runtime scripts to
  C:\Program Files\GE\Keyence, registers "GE Keyence Enforce"
  scheduled task. Idempotent.
- site-config.json: add keyenceSharePath to the Keyence profile
  pointing at \\tsgwp00525\shared\dt\shopfloor\keyence\machineapps.

To push a new VR-6000 version: drop the new MSI + updated manifest on
the tsgwp00525 share, every Keyence PC upgrades on next logon.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:16:20 -04:00
cproudlock
22c59b889e Keyence VR-6000 Series Software + USB driver deployment
- shopfloor-setup/Keyence/09-Setup-Keyence.ps1: populate placeholder with
  MSI install via msiexec and driver install via pnputil. Idempotent on
  ProductCode {058E7194-...} and DriverStore entry. Logs to C:\Logs\Keyence\.
- shopfloor-setup/Keyence/installers/VR-6000 Series Software.msi: main
  product (1.7 MB; pulled from Keyence6000.exe Inno wrapper's Windows
  Installer cache, built with InstallShield 2019).
- shopfloor-setup/Keyence/drivers/: KEYENCE VR Series USB driver
  package (.inf + .cat + Wdf/WinUsb co-installers). 2.7 MB, pulled from
  DriverStore\FileRepository\keyence_vr_series.inf_amd64_b5e5eb0924d7b4ce.
- preinstall.json: add VC++ 2013 x64 Min + Add entries (PCTypes: ["*"])
  as prereqs for VR-6000. GUIDs {A749D8E6-B613-...} and {929FBD26-9020-...}.

Staging footprint for non-Keyence PCs is unchanged (the 4.4 MB Keyence
payload lives under shopfloor-setup/Keyence/ which startnet.cmd only
xcopies for PCTYPE=Keyence). Rollout still requires dropping the two
VC++ 2013 x64 MSIs into \$PXE_IMAGES_DIR/dependencies/vcredist/2013-x64-{min,add}/
on the workstation running sync-preinstall.sh.

Rationale for bundling the MSI + driver locally rather than running
Keyence6000.exe: the Inno wrapper calls an InstallShield child (Setup.exe)
without silent flags, which hangs indefinitely in session 0 during
automated imaging. msiexec + pnputil from the extracted bundle runs
fully non-interactive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:02:34 -04:00
cproudlock
eca9ee2b36 startnet.cmd: stage udc-backups to WinPE; mirror-from-gold: taxonomy layout
- playbook/startnet.cmd + startnet-template.cmd: after preinstall staging,
  xcopy Y:\pre-install\udc-backups to W:\PreInstall\udc-backups so UDC
  settings JSONs are available during image deployment. Harvested from
  live gold where this block existed but was never committed.

- scripts/mirror-from-gold.sh: update source paths to current taxonomy
  layout (pre-install/, installers-post/, blancco/, config/) and add
  ppkgs/, scripts/, shopfloor-setup/ sections. Added --delete for exact
  mirror semantics. Used to seed the spare PXE server at 10.9.100.2 on
  2026-04-16 from gold at 10.9.100.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 21:37:23 -04:00
cproudlock
6e85e19c85 S: drive mapping via HKLM\Run, autologon-count non-intervention, Phase 4 no-scripts handling
- Register-MapSfldShare.ps1: swap scheduled task for HKLM\Run entry. Task with -GroupId runs in session 0 with no HKCU, so /persistent:yes fails and the drive mapping isn't visible to Explorer. Run key fires at Explorer startup in the interactive user's session with full token + HKCU. Unregisters legacy 'GE Shopfloor Map S: Drive' task for PCs already imaged.
- Run-ShopfloorSetup.ps1: stop bumping AutoLogonCount (99 at start, 4 at end). Windows decrements per-logon and at 0 clears AutoAdminLogon + DefaultPassword, which nukes the lockdown-configured ShopFloor autologon. Re-enable-wired-NICs task now gates on Autologon_Remediation.log 'Autologon set for ShopFloor' instead of SFLD creds, so wired stays off through the whole Intune+DSC+lockdown chain.
- Monitor-IntuneProgress.ps1: Phase 4 treats 'no custom scripts' as COMPLETE when DSC install is done (was WAITING, which stalled the state machine on PC types without scripts). Push retrigger out to 15min when entering lockdown-wait so a stale 5min retrigger doesn't fire mid-Remediation. Removed the AutoLogonCount delete in Invoke-SetupComplete since we no longer set it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 17:42:22 -04:00
cproudlock
2ab6055125 Fix ShopFloor autologon persistence, S: drive mapping, sync throttle
AutoLogonCount depletion:
  Run-ShopfloorSetup set AutoLogonCount=4 for SupportUser. Windows
  decrements per-logon; at 0 it clears AutoAdminLogon + DefaultPassword,
  nuking the lockdown-configured ShopFloor autologon. Fix: delete
  AutoLogonCount in Invoke-SetupComplete before the lockdown reboot.
  ShopFloor's Autologon.exe-set config persists indefinitely.

Sync_intune window on ShopFloor:
  The marker-check path used 'exit 0' but the task runs with -NoExit,
  leaving a dangling PowerShell window on every ShopFloor logon. Fix:
  [Environment]::Exit(0) kills the host outright, defeating -NoExit.

S: drive mapping:
  Vendor ConsumeCredentials.ps1 calls New-StoredCredential -Persist
  LocalMachine (needs admin) before net use. ShopFloor is non-admin so
  cred-store fails silently and net use has no auth. Fix: new
  Map-SfldShare.ps1 reads HKLM creds and passes them inline to
  net use /user: -- no Credential Manager needed, works as Limited.
  Register-MapSfldShare updated to stage + reference our script.

Wired NIC re-enable:
  SYSTEM task polls for SFLD creds (Phase 5), re-enables wired NICs,
  self-deletes. Replaces the broken Enable-NetAdapter in Monitor
  (Limited principal can't enable NICs). No-WiFi devices unaffected
  (migrate-to-wifi never disables, re-enable is a no-op).

Sync throttle:
  15 min retrigger when only waiting for lockdown (was 5 min for all
  phases). Avoids interrupting the Intune Remediation script.

Defect Tracker path:
  All references corrected to C:\Program Files (x86)\WJF_Defect_Tracker.

QR code retry:
  Build-QRCodeText retried every poll cycle until DeviceId appears
  (was single-shot that could miss the dsregcmd timing window).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:29:02 -04:00
cproudlock
f73f999938 Unified Common-Enforce for cross-type apps, add WJF Defect Tracker
Replaces the Acrobat-only enforcer with a generic Common-Enforce that
handles all cross-PC-type apps from one manifest + one scheduled task
on the SFLD share at \\tsgwp00525\shared\dt\shopfloor\common\apps\.

Renames:
  Acrobat-Enforce.ps1        -> Common-Enforce.ps1
  Register-AcrobatEnforce    -> Register-CommonEnforce
  acrobat-manifest.json      -> common-apps-manifest.json
  common.acrobatSharePath    -> common.commonAppsSharePath
  'GE Acrobat Enforce' task  -> 'GE Common Apps Enforce' task
  C:\Program Files\GE\Acrobat -> C:\Program Files\GE\CommonApps

Register-CommonEnforce cleans up the legacy 'GE Acrobat Enforce' task
if present from a prior image.

WJF Defect Tracker (replaces ClickOnce):
  - Added to preinstall.json (PCTypes=*, fleet-wide imaging-time install)
  - MSI staged on PXE at pre-install/installers/
  - Added to common-apps-manifest with FileVersion detection on
    C:\Program Files\WJF_Defect_Tracker\Defect_Tracker.exe
  - site-config + 06-OrganizeDesktop: shortcut changed from ClickOnce
    'existing' to exe-path pointing at the MSI-installed binary
  - Update workflow: drop new MSI on share, bump DetectionValue

CMM 09-Setup-CMM: added goCMM + DODA to the ACL grant list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 11:13:05 -04:00
cproudlock
1a5feefb01 CMM: grant Users Modify on PC-DMIS install dirs for non-admin launch
PC-DMIS writes settings, probe configs, and measurement data to its own
Program Files install directory at runtime. Without Modify permission
for BUILTIN\Users, non-admin accounts (ShopFloor) get a UAC elevation
prompt on every launch. The "run as admin once" workaround can't be
automated because PC-DMIS shows a license dialog on first run that
blocks silently.

Fix: grant BUILTIN\Users Modify with inheritance on:
  - C:\Program Files\Hexagon\PC-DMIS 2016.0 64-bit
  - C:\Program Files\Hexagon\PC-DMIS 2019 R2 64-bit
  - C:\ProgramData\Hexagon

Runs as Step 2.5 in 09-Setup-CMM.ps1 after Install-FromManifest
completes. If the exe also has an embedded requireAdministrator manifest
(separate from the file-permission issue), that will need an additional
fix after testing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 10:38:24 -04:00
cproudlock
ac23759486 UDC firewall rules + Acrobat Reader as default PDF viewer
- Pre-create Windows Firewall inbound-allow rules for UDC.exe and
  MTConnect agent.exe before UDC_Setup.exe runs, suppressing the
  interactive "allow through firewall?" dialogs during silent install.

- Set Adobe Acrobat Reader (Acrobat.Document.DC) as the default .pdf
  handler via dism /import-defaultappassociations. Runs in
  03-ShellDefaults.ps1 so the OEMDefaultAssociations.xml is in place
  before ShopFloor's profile is created on first logon. Edge no longer
  claims .pdf on new profiles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:18:44 -04:00
cproudlock
85e74e5dd1 UDC settings: pre-stage from server backups, fix arg format, action prompts
Root cause found via decompiling UDC_Setup.exe: it never writes
udc_settings.json from CLI args. Instead it pulls
Settings_Backups\udc_settings_<num>.json from \\tsgwp00525\shared\SPC\UDC
-- which is unreachable at imaging time (no SFLD creds yet). Silent
File.Exists() false, settings never copy, UDC lands on Evendale defaults.

Fix: stage 80 udc_settings_*.json backups under
shopfloor-setup/Standard/udc-backups/ (same tree as ntlars-backups,
xcopy'd to C:\Enrollment\ by existing startnet.cmd). 00-PreInstall
pre-creates C:\ProgramData\UDC\udc_settings.json from the matching
backup BEFORE UDC_Setup.exe runs. Installer's server-side copy silently
fails (unreachable), our pre-staged file survives.

Also:
- preinstall.json UDC InstallArgs corrected: "West Jefferson" -9999
  (quoted spaced site + dash-prefixed number, confirmed via decompile)
- Update-MachineNumber.ps1 UDC.exe relaunch: quoted site + dash number
- Monitor-IntuneProgress: action prompts (Select Device Category after
  Phase 1; Initiate ARTS Lockdown after Phase 5/creds), Display flow
  (3-phase: Registration -> Config -> Lockdown), Phase 6 IME-based
  lockdown detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 08:44:34 -04:00
cproudlock
db55bd772a sync_intune: professional UI, IME-based lockdown detection
UI overhaul:
  Replaced the 30+ line checkbox-per-sub-item view with a clean
  6-line phase summary styled for GE Aerospace branding. Each phase
  shows one colored status tag: [COMPLETE] green, [IN PROGRESS] cyan,
  [WAITING] gray, [FAILED] red. Action hint for Phase 2 (device
  category assignment) in yellow. QR code + Device ID below.

Phase 6 lockdown detection:
  Replaced DefaultUserName + admin-rename checks (which pass at PPKG
  time, way too early) with Intune Remediation log artifacts:
  - Autologon_Remediation.log: "Autologon set for ShopFloor"
  - Autologon_Detection.log: "matches the expected value: 1"
  These only exist after the Intune Remediation cycle actually fires
  post-enrollment, making Phase 6 a true end-of-chain signal.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:35:22 -04:00
cproudlock
a4de11814d Force-Lockdown.bat + S: drive logon mapper for ShopFloor end-user
Force-Lockdown.bat (SupportUser desktop):
  Vendor escape hatch when Intune Lockdown push hasn't applied within
  ~30 minutes. Self-elevates via UAC, prompts for typed YES confirmation
  that an ARTS request is in place, then runs sfld_autologon.ps1.

Register-MapSfldShare.ps1 (every PC type):
  The SFLD vendor's 'SFLD - Consume Credentials' scheduled task is
  principal-restricted (admin-only) so it fires for SupportUser logon
  but not for ShopFloor logon -- ShopFloor lands at the desktop with
  no S: drive and no way to reach \\tsgwp00525\shared. Workaround:
  register a parallel 'GE Shopfloor Map S: Drive' AtLogOn task with
  Principal=BUILTIN\Users + RunLevel=Limited that invokes the vendor's
  C:\ProgramData\SFLD\CredentialManager\ConsumeCredentials.ps1 in the
  interactive user's session. Vendor script handles cred-store + net use
  end to end; we just give it a wider trigger principal. Cross-PC-type
  because every shopfloor account needs S:.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 18:31:18 -04:00
cproudlock
a334a56f1e WiFi detection: widen regex to catch hyphen-less 'WiFi' + 802.11
Realtek RTL8852BE describes itself as 'Realtek RTL8852BE WiFi 6 802.11ax
PCIe Adapter' -- no hyphen in 'WiFi' -- which the previous regex
'Wi-Fi|Wireless' rejected. migrate-to-wifi.ps1's gate then exited 0
silently and neither wired NIC got disabled, leaving the imaging chain
running over PXE ethernet for the entire PPKG phase.

New regex Wi-?Fi|Wireless|WLAN|802\.11 covers:
  - Wi-Fi (Intel-style with hyphen)
  - WiFi (Realtek-style without hyphen)
  - Wireless (Intel Wireless-AC, Killer Wireless)
  - WLAN (some Realtek/MediaTek variants)
  - 802.11 (vendor-agnostic spec reference, fallback)

Applied in two callers:
- migrate-to-wifi.ps1 (3 occurrences: gate + disable + re-enable on timeout)
- Monitor-IntuneProgress.ps1 (re-enable wired on sync_intune startup)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 18:03:19 -04:00
cproudlock
c23b803dc6 sync_intune: align Phase 3/5/6 columns; ignore benign 'Failed: 0' tails
Cosmetic + accuracy fixes spotted on the live test PC:

- Phase 3 deploy/install lines had a stray double-space after the
  checkbox; Phase 5 'Share creds present in HKLM' and Phase 6
  'Administrator renamed' had wider misalignment. All four lines
  collapsed to single-space-after-checkbox so the column lines up
  with the rest of the table.

- Phase 4 status detector was greping the last 30 lines of each
  Install-*.log for /(?i)\b(ERROR|Failed|exception)\b/. That hit
  benign summary lines like 'Failed: 0' or 'Errors:    0' and
  marked successful runs as failed (Install-VCRedists.ps1 was the
  trigger -- 8/8 'Already installed - skipping' but the summary
  contained 'Failed: 0' and Phase 4 said FAILED). Tightened the
  regex to also exclude /\b(ERROR|Failed|Failures|Errors|Exceptions?)\s*[:=]\s*0\b/
  so the keyword has to be next to a non-zero value (or the
  vocabulary 'Exit code 1603 - FAILED' style still trips correctly).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:53:15 -04:00
cproudlock
2db35c2976 UDC: correct CLI arg signature to compact site + dash-prefixed machine#
UDC_Setup.exe and UDC.exe expect:
  UDC_Setup.exe WestJefferson -7605

Not the spaced-quoted positional pair we'd been passing:
  UDC_Setup.exe "West Jefferson" 7605

The wrong format meant UDC ignored both args, fell back to defaults
(Site=Evendale, MachineNumber=blank). Combined with the kill-after-detect
window, neither value got persisted to udc_settings.json regardless of
whether UDC.exe was given time to write.

Changes:
- preinstall.json: UDC InstallArgs now "WestJefferson -9999"
- 00-PreInstall-MachineApps.ps1: site override now matches/replaces
  the compact 'WestJefferson' token (not 'West Jefferson') and uses
  siteNameCompact from site-config; targetNum extraction regex updated
  to '-(\d+)$' for the new dash-prefix format
- Update-MachineNumber.ps1: UDC.exe relaunch now passes positional
  compact-site + dash-prefixed number instead of -site/-machine flags

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:47:57 -04:00
cproudlock
14d103a248 run-enrollment: switch provtool /source from BPRT to PSCmdlet
BPRT was stopping after the first RestartRequired=true command (DotNet35).
Test image captured 2026-04-15 showed 3 of 21 PPKG commands ran (PPKG
Version Check, Lock Screen, DotNet35) before provtool exited 0 leaving
Office / Chrome / Tanium / Activate-Windows / Enable-DeviceLockdown /
Hide-SupportUser / 12 more scripts unexecuted. Symptom: criticalChecks
said EntraID NOT joined (wrong -- it was), sessions.json showed a
'LogonIdleTask' session perpetually 'Not started', and the resulting PC
was missing most of its fleet software.

BPRT is the OOBE runtime source -- it expects the OOBE engine to own the
post-DotNet35 reboot + resume. In our post-autounattend context there is
no OOBE engine, so restart-required commands stall the pipeline. PSCmdlet
is the source Install-ProvisioningPackage uses internally and has the
correct resume semantics for post-OOBE application.

The original motivation for BPRT (avoiding the 180s PowerShell timeout)
does not apply because we invoke provtool.exe directly, not via the
Install-ProvisioningPackage cmdlet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:07:12 -04:00
cproudlock
8528a1bcae Install-FromManifest: add FileVersion detection for version-pinned upgrades
File-existence detection on NTLARS.exe couldn't tell eDNC 6.4.3 from 6.4.4
(both installers leave the same binary in place), so the enforcer skipped
upgrades. FileVersion compares the vendor-stamped FileVersion field on a
named binary against the manifest's DetectionValue with exact-string match.

Added to all three lib copies (common, Standard, CMM). Standard manifest
template flipped to FileVersion against DncMain.exe -- the eDNC main
binary is more reliably version-stamped than the bundled NTLARS sub-tool.

Update workflow now: drop the new vendor MSI on the SFLD share, bump
Installer + DetectionValue in machineapps-manifest.json, next user logon
runs Machine-Enforce which detects mismatch and installs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:15:32 -04:00
cproudlock
a6648c5a40 sync_intune: full lifecycle gate, lockdown phase, creds verification
Add Phase 6 (Lockdown) and tighten Phase 5 so the 5-min Intune sync loop
doesn't declare success until the device is genuinely operator-ready.

- Phase 6 watches two HKLM-level signals confirmed in the 2026-04-15
  pre/post lockdown state diff: Winlogon\DefaultUserName flipped to
  'ShopFloor', and local Administrator renamed to 'SFLDAdmin'. Both land
  via MDM PolicyCSP after DSCInstall.log finishes.

- Phase 5 was just checking that the Consume Credentials scheduled task
  existed; that only proves DSC scheduled it. Now also verifies creds
  actually landed under HKLM:\SOFTWARE\GE\SFLD\Credentials\* with
  TargetHost+Username+Password populated -- which is what Machine/Acrobat/
  CMM-Enforce actually consume.

- Final completion gate: DscInstallComplete && CredsPopulated &&
  LockdownComplete (was just DscInstallComplete). Display PCs unchanged --
  they exit early via the no-DSC Phase 1 path.

- Invoke-SetupComplete now issues shutdown /r /t 10 in AsTask mode after
  writing the sync-complete marker and running the Configure-PC machine#
  prompt. Next boot triggers ShopFloor autologon, which materializes the
  ShopFloor profile from C:\Users\Default (where 03-ShellDefaults already
  baked in TaskbarAl=0, etc.).

- Phase 1->2 gap (waiting for tech to assign device category in Intune
  portal) now shows an explicit ACTION hint instead of empty checkboxes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:01:52 -04:00
cproudlock
6db170bf54 Shell defaults + eDNC reg restore from machine-number backups
- 03-ShellDefaults.ps1: Default-User TaskbarAl=0 (left), HKLM policies to
  hide Start Recommended section, kill Bing web search + suggestions,
  disable Cortana. LTSC-honoured; runs fleet-wide via baseline loop.

- ntlars-backups/: 147 per-machine eDNC registry backups renamed to
  flat <MachineNumber>.reg scheme. Historical off-by-one entries from
  the original dump rewritten to match CSV-target MachineNo.

- Standard/03-RestoreEDncConfig.ps1: at imaging time, if tech typed a
  real machine number at PXE (not 9999), import <num>.reg from the local
  staged copy. Restores eFocas IP, PPDCS serial, Hssb relays -- not just
  the bare MachineNo. Skipped on Timeclock / 9999 / missing backup.

- Update-MachineNumber.ps1: when tech later sets a real number from 9999,
  pull <num>.reg from tsgwp00525 SFLD share (ntlarsBackupSharePath in
  site-config) and reg-import it before writing the new MachineNo.

- Restore-EDncReg.ps1: shared helper (Mount-SFLDShare + Import-EDncRegBackup)
  used by both callers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 15:42:21 -04:00
cproudlock
67845372b2 Harvest provtool diagnostics, enable ETW channel, skip Timeclock machine#
run-enrollment.ps1:
- Enable Provisioning-Diagnostics-Provider/Admin event log before invoking
  provtool (was disabled by default; no diagnostics survived early runs).
- After provtool returns, copy C:\ProgramData\Microsoft\Provisioning\*
  into C:\Logs\PPKG\ and snapshot HKLM\...\Sessions\* as
  provisioning-sessions.json, plus export the Admin event channel to
  Provisioning-Diagnostics-Admin.evtx. Gives us reviewable state
  without relying on provtool's failure-only diagnostic bundle.
- provtool arg order is positional path + /quiet + /source BPRT (verified
  against ProvEventLog from the PS cmdlet internal call).

startnet.cmd / startnet-template.cmd:
- Standard-Timeclock sub-type skips the machine-number prompt. Timeclock
  PCs do not use a machine number so forcing a prompt wasted tech time
  and left MACHINENUM at the 9999 default anyway. Machine sub-type is
  unaffected.
2026-04-15 14:22:43 -04:00
cproudlock
cc9aad0ea1 Install-FromManifest: add Hash detection for content-versioned files
Needed for eMxInfo.txt (site-specific eDNC config). The file has no
DisplayVersion in the registry and no canonical MSI; we ship it as a
standalone secret on the SFLD share and key drift correction off its
SHA256. When the yearly replacement drops, bump the hash in
machineapps-manifest.json and every Standard-Machine PC catches up on
next logon.

Patched Install-FromManifest in all three copies (CMM, common, Standard)
for consistency. Also adds the eMxInfo.txt entry to the Standard
machineapps-manifest template and an Install-eMxInfo.cmd template that
copies the file into both 32/64-bit eDNC Program Files paths.
2026-04-15 12:37:35 -04:00
cproudlock
3ef981f19e Add Standard-Machine logon enforcer for UDC/eDNC/NTLARS
Reason: Intune DSC's main-category YAML was pushing these to every main
device, including Timeclocks - DSC has no awareness of our pc-subtype
distinction. After UDC/eDNC/NTLARS are removed from the DSC YAML, ongoing
version drift would no longer be corrected. This enforcer replaces that,
scoped correctly by subtype.

Structure mirrors CMM (CMM-Enforce.ps1) and common (Acrobat-Enforce.ps1):
- Machine-Enforce.ps1: SYSTEM logon task; mounts SFLD share with HKLM-
  backed creds; hands off to Install-FromManifest.
- machineapps-manifest.template.json: repo reference; authoritative copy
  lives on the share at \\tsgwp00525.wjs.geaerospace.net\shared\dt\
  shopfloor\main\machineapps\machineapps-manifest.json.
- Register-MachineEnforce.ps1: idempotent setup; stages scripts to
  C:\Program Files\GE\MachineApps and registers the task.
- lib/Install-FromManifest.ps1: copy of the common/ version (already has
  Type=CMD support).

Sub-type gating belt-and-suspenders:
- Run-ShopfloorSetup.ps1 only calls Register-MachineEnforce when
  $pcType -eq "Standard" -and $pcSubType -eq "Machine".
- Machine-Enforce.ps1 itself re-reads pc-subtype.txt and exits early if
  not "Machine", so a mistakenly-deployed copy no-ops.

site-config.json:
- Added "machineappsSharePath" to Standard-Machine pcProfile.

Drive letter U: to stay clear of CMM (S:) and Acrobat (T:) enforcers
that may run concurrently at logon.

Update workflow:
  drop new UDC/eDNC/NTLARS installer on the SFLD share,
  bump DetectionValue in machineapps-manifest.json,
  every Machine PC catches up on next user logon.
2026-04-15 12:16:17 -04:00
cproudlock
8848fca88a Add Acrobat Reader logon enforcer (cross-PC-type), provtool.exe arg fix
Acrobat Reader enforcement:
- playbook/shopfloor-setup/common/ is the cross-PC-type staging dir. Mirrors
  CMM/ structure (enforce script + its Install-FromManifest copy + manifest
  template + register script).
- Acrobat-Enforce.ps1 runs as SYSTEM on every logon, reads
  acrobatSharePath from site-config.common, mounts the SFLD share with
  the same HKLM-backed credential lookup CMM-Enforce uses, hands the
  acrobat-manifest.json from the share to Install-FromManifest.
- Install-FromManifest extended with Type=CMD so it can invoke vendor-
  supplied .cmd wrappers (Install-AcroReader.cmd does a two-step MSI+MSP
  install that does not fit MSI/EXE types cleanly). cmd.exe /c wraps it
  because UseShellExecute=false cannot launch .cmd directly.
- Register-AcrobatEnforce.ps1 stages scripts to C:\Program Files\GE\Acrobat
  and registers "GE Acrobat Enforce" scheduled task. Called from
  Run-ShopfloorSetup.ps1 right before the enrollment (PPKG) step so it
  applies to every PC type, not just CMM.
- acrobat-manifest.template.json is the repo reference; the authoritative
  copy lives on the SFLD share at
  \\tsgwp00525.wjs.geaerospace.net\shared\dt\shopfloor\common\acrobat\
  Bumping Acrobat updates = drop new MSP on share, bump DetectionValue in
  manifest; enforcer catches every PC on next logon.
- site-config.json: add "common": { "acrobatSharePath": ... }. Uses a
  new top-level block rather than a PC-type-specific one since Acrobat
  applies everywhere.

Initial install still happens via the preinstall flow
(Install-AcroReader.cmd during WinPE). The enforcer is the ongoing-
updates side; on a freshly-imaged PC detection passes and it no-ops.

Also in this commit:
- run-enrollment.ps1: provtool.exe argument syntax fix. First test
  returned 0x80004005 E_FAIL in 1s because /ppkg: and /log: are not
  valid provtool flags; the cmdlet's internal call used positional
  path + /quiet + /source. Switched to that syntax.
2026-04-15 09:24:13 -04:00
cproudlock
0292bc01ad Auto-flush stale SMB/conntrack state on DHCP lease, one-source PPKG model
Three changes that go together so a re-image never hits "System error 53":

1. dnsmasq dhcp-script hook (playbook/pxe-server-helpers/pxe-dhcp-hook.sh)
   Fires on every add/del lease event. Runs conntrack -D and ss -K for the
   client IP so any stale ESTABLISHED SMB session from a previous boot is
   cleared before the client reconnects. Runs as root (dnsmasq default).
   Wired into /etc/dnsmasq.conf via dhcp-script= directive in the playbook.

2. One-source PPKG (playbook/startnet.cmd + startnet-template.cmd)
   The 5 per-Office PPKG copies were bit-for-bit identical; only the
   filename differs because BPRT parses Office and Region out of the name.
   Store one source file (e.g. GCCH_Prod_SFLD_v4.11.ppkg) and construct
   the BPRT-tagged target filename at menu-selection time from variables:
     SOURCE_PPKG / PPKG_VER / PPKG_EXP / REGION / OFFICE
   copy /Y "Y:\ppkgs\%SOURCE_PPKG%" "W:\Enrollment\%PPKG%"
   Bumped PPKG_VER v4.10 -> v4.11 and PPKG_EXP 20260430 -> 20270430.
   Saves ~30G on disk per version.

3. run-enrollment.ps1 already committed in 5a9c3db uses provtool.exe
   directly (no PowerShell cmdlet 180s timeout). Included here because it
   is part of the same end-to-end PPKG path.
2026-04-15 09:03:16 -04:00
cproudlock
5a9c3db7af run-enrollment.ps1: invoke provtool.exe directly, skip PowerShell cmdlet timeout
Observed today on E8FHGDB4: Install-ProvisioningPackage timed out after
the PowerShell cmdlet's hardcoded 180s limit on a 7.6 GB GCCH v4.10
PPKG. The catch-block fell through to Add-ProvisioningPackage, which
returned "success" but the PPKG diagnostic bundle showed the child
provtool.exe was called with empty packagePathsToAdd (session created,
State=Not started, RebootCount=0). The PC was named, OOBE-completed,
and BPRT apps ran, but the bulk enrollment never applied - PC was not
Entra-joined.

Microsoft Docs GitHub issue 502 confirms the 180s cmdlet timeout is
hardcoded with no configuration option. Quest KB 4376269 suggests
rebuilding the PPKG with the latest Windows Configuration Designer,
but that is upstream and not under our control per PPKG.

Switch to Start-Process -FilePath provtool.exe -Wait. The wait is on
the actual child process, no caller-side timeout. provtool.exe is
what the cmdlet was invoking anyway; we just bypass the wrapper that
imposes the limit.

Sources:
  https://support.quest.com/on-demand-migration/kb/4376269
  https://github.com/MicrosoftDocs/windows-powershell-docs/issues/502
  https://learn.microsoft.com/en-us/windows/configuration/provisioning-packages/provisioning-apply-package
2026-04-15 08:35:35 -04:00
cproudlock
d6776f7c7f Reorganize repo, enrollment share taxonomy, Blancco USB-build fixes, v4.10 PPKGs
Workstation reorganization:
- All build/deploy/helper scripts moved into scripts/ (paths updated to use
  REPO_ROOT instead of SCRIPT_DIR so they resolve sibling dirs from the new
  depth)
- New config/ directory placeholder for site-specific overrides
- Removed stale: mok-keys/, test-vm.sh, test-lab.sh, setup-guide-original.txt,
  unattend/ (duplicate of moved playbook/FlatUnattendW10.xml)
- README.md and SETUP.md structure listings updated, dead "Testing with KVM"
  section removed
- .claude/ gitignored

Enrollment share internal taxonomy (forward-looking; existing servers
unaffected since they keep their current boot.wim with flat paths):
- Single SMB share kept (WinPE only mounts one Y: drive), but content now
  organised into ppkgs/, scripts/, config/, shopfloor-setup/, pre-install/{bios,
  installers}, installers-post/cmm/, blancco/, logs/
- README.md deployed to share root explaining each subdir
- New playbook tasks deploy site-config.json + wait-for-internet.ps1 +
  migrate-to-wifi.ps1 explicitly (were ad-hoc on legacy servers)
- BIOS subdir moved into pre-install/bios/, preinstall/ renamed to pre-install/
- startnet.cmd + startnet-template.cmd updated with new Y:\subdir\ paths
- Bumped GCCH PPKG references v4.9 -> v4.10

Blancco USB-build fixes (so next fresh USB install boots Blancco end-to-end
without the manual fixup we did against GOLD):
- grub-blancco.cfg: kernel/initrd switched HTTP -> TFTP (GRUB's HTTP module
  times out on multi-MB files); added modprobe.blacklist=iwlwifi,iwlmvm,btusb
  (WiFi drivers hang udev on Intel business PCs)
- grubx64.efi rebuilt from updated cfg
- Playbook task added to create /srv/tftp/blancco/ symlinks pointing at the
  HTTP-served binaries

run-enrollment.ps1: OOBEComplete is now set AFTER PPKG install (Win11 22H2+
hangs indefinitely if OOBEComplete is set before the bulk-enrollment PPKG runs).

Also includes deploy-bios.sh / pull-bios.sh / busybox-static / models.txt
that were sitting untracked at the repo root.
2026-04-14 16:01:02 -04:00
cproudlock
d14c240b48 Change dnsmasq-restart cron delay from 30s to 15s
Task name already said "15s after reboot" but content had sleep 30.
Align content with name; faster recovery from systemd-resolved race at boot.
2026-04-14 13:01:38 -04:00
cproudlock
ade2f3b5ff Fix USB install reliability: bash, LV resize, deps, idempotency
- autoinstall/user-data: move lvextend/growpart/pvresize BEFORE playbook
  so 130GB of drivers+PPKGs fits during first-boot copy. Use
  tr -d "[:space:]" to avoid breaking outer bash -c single-quote wrap.
- playbook: add executable: /bin/bash to Dell driver deploy (process
  substitution) and Blancco initramfs builder (brace expansion).
- playbook: make "Ensure Samba user for Blancco reports" idempotent via
  pdbedit check so re-runs don't abort the play.
- download-packages.sh: also download dist-upgrade package set. Explicit
  --simulate misses transitive version bumps (e.g. gnupg 17.4 needs
  matching gpgv 17.4) causing offline dpkg "dependency problems" when
  ISO baseline is older than noble-updates.
2026-04-14 12:57:28 -04:00