Commit Graph

5 Commits

Author SHA1 Message Date
cproudlock
e9fc284dcb Restore-UDCData: mount share with SFLD creds instead of raw UNC from SYSTEM
Symptom: every Restore-UDCData log entry reported the bay-level files as
'absent' even though they existed on the share. Another PC's run
successfully consumed and migrated the same backup, while the device that
should have done the consume looped on 'no work this cycle' indefinitely.

Cause: script ran as NT AUTHORITY\SYSTEM (manifest engine on logon).
SYSTEM authenticates to remote SMB as the COMPUTER ACCOUNT
(DOMAIN\HOSTNAME$), not as a user. The SFLD share's ACL grants top-level
enumeration to authenticated computers (so Test-Path on share root +
bay dir returned True) but file-level read only to the SFLD user. With
no explicit user creds, Test-Path on bay-level files returns False -
indistinguishable from 'file not found' - so the script silently logged
'absent' on files that actually exist. A different PC with proper creds
consumed bay 3207 first; ours kept polling forever.
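
The failure mode above comes down to a probe that collapses "access
denied" and "not found" into a single False. As a language-neutral
illustration only (Python, not the PowerShell used by the scripts; the
function name and return labels are hypothetical), a probe can keep the
two outcomes apart like this:

```python
import os


def probe_file(path, stat=os.stat):
    """Classify a path as 'present', 'absent', or 'access-denied'.

    `stat` is injectable so the behavior can be exercised without a real
    share; by default it is os.stat.
    """
    try:
        stat(path)
    except FileNotFoundError:
        # The path genuinely does not exist.
        return "absent"
    except PermissionError:
        # The path may exist, but this identity is not allowed to see it.
        return "access-denied"
    return "present"
```

Logging "access-denied" separately would have pointed at the credential
problem instead of a missing backup.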

Update-MachineNumber.ps1's branch already worked around this by calling
Mount-SFLDShare, the helper from Restore-EDncReg.ps1 that reads
HKLM:\SOFTWARE\GE\SFLD\Credentials\* and maps the share via net use under
the SFLD user identity.

Fix: Restore-UDCData.ps1 now does the same. Replaces raw-UNC Test-Path
polling with Mount-SFLDShare, probes via the W: drive letter, and
unmounts on every exit path. If creds are missing in registry the script
fails fast with a clear ERROR rather than masquerading as 'no backup'.
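
The shape of the fix (fail fast on missing creds, mount, probe, unmount
on every exit path) can be sketched as follows. This is a minimal
Python illustration, not the PowerShell in Restore-UDCData.ps1; the four
callables and the exception class are hypothetical stand-ins for the
registry read, Mount-SFLDShare, the unmount, and the file checks.

```python
class MissingCredentialsError(RuntimeError):
    """Raised when no share credentials are available (fail fast)."""


def probe_with_mount(load_creds, mount, unmount, probe):
    """Mount the share, run the probe, and always unmount afterwards.

    All four arguments are hypothetical callables supplied by the caller.
    """
    creds = load_creds()
    if not creds:
        # Surface the auth problem instead of reporting "no backup".
        raise MissingCredentialsError("share credentials not found")
    drive = mount(creds)
    try:
        return probe(drive)
    finally:
        # Runs on normal return and when probe() raises.
        unmount(drive)
```

The try/finally is what guarantees the unmount on every exit path once
the mount has succeeded.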
2026-05-01 11:50:04 -04:00
cproudlock
4f4f1f43e8 Restore-UDCData: handle ArchivedData-only backups (no CurrentData.json)
Production case: bay 3207 had ArchivedData\ on the share with full
production records but no CurrentData.json at the bay root. The previous
Restore logic treated CurrentData.json as the marker for "valid backup"
and exited early when absent, so the script silently no-op'd every cycle
even though there was real archive data ready to restore.

Asymmetric with Backup-UDCData.ps1, which already handles missing
CurrentData.json gracefully (it copies whatever exists). Possible causes
of CurrentData.json absence in a backup: the source PC had no live UDC
session at backup time (UDC inactive / not recording), or the backup
partially failed for that one file (no Backup-side log to confirm
without a rerun).

Either way, an ArchivedData-only backup is still a valid backup.

Behavior change:
- Early-exit only when BOTH CurrentData.json AND ArchivedData\ are
  absent. Otherwise proceed with whatever exists.
- Copy step for CurrentData.json wrapped in srcCurExists guard.
- consumeOk now requires: every present source successfully copied,
  AND at least one thing was actually copied.
- Move-to-migrated wraps CurrentData.json move in Test-Path guard
  (was already guarded for ArchivedData).
- restore.manifest.json gains CurrentDataPresent and ArchivedDataPresent
  booleans so future audits can see which side actually restored.
- UDC relaunch now fires when EITHER copy succeeded (was only on
  CurrentData.json copy).

Verbose logs now distinguish three cases at the early-exit:
- Both absent: "no work to do this cycle" (the 99% path)
- Only ArchivedData\: WARN with explanation, proceed
- Only CurrentData.json: WARN with explanation, proceed
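
The early-exit cases and the consumeOk rule above can be sketched as
below. This is an illustrative Python rendering, not the script's
PowerShell; the function names, case labels, and dict keys are
hypothetical.

```python
def classify_backup(cur_exists, arch_exists):
    """Map the two presence flags to the early-exit case."""
    if not cur_exists and not arch_exists:
        return "no-work"          # the common path: nothing to restore
    if cur_exists and arch_exists:
        return "full"
    # Exactly one side is present: warn, then proceed with what exists.
    return "archived-only" if arch_exists else "current-only"


def consume_ok(present, copied):
    """True when every present source was copied and at least one was.

    `present` and `copied` are dicts of source name -> bool.
    """
    attempted = [name for name, is_present in present.items() if is_present]
    return bool(attempted) and all(copied.get(name, False) for name in attempted)
```

For an ArchivedData-only backup, `classify_backup(False, True)` gives
"archived-only", and `consume_ok` passes as long as the archive copy
succeeded.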

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:49:38 -04:00
cproudlock
6b3690e286 Restore-UDCData: verbose per-cycle logging + share-reachability retry
Two production-debuggability gaps closed.

1. Logging is now always-on. The previous version exited silently on the
   common no-op paths (no UDC installed, no backup waiting, share not
   reachable), leaving zero log evidence when techs reported "restore
   didn't happen". New behavior writes a header + identity + share-path
   + decision-point line to a single rotating log file every cycle.
   Errors include exception type, position, and full ScriptStackTrace.
   Log lives at C:\Logs\UDC\Restore-UDCData.log with a 1 MB cap and
   one-generation rotation to .old.log.

2. Share-reachability is now polled instead of probed once. The SFLD
   share over the SMB redirector takes 20-60 s to become reachable
   from SYSTEM context after a cold logon, especially on the first
   GE-Enforce cycle of the boot. The old single Test-Path returned
   false in that window and the script silently exited, missing the
   backup. New behavior polls Test-Path on the share root every 3 s
   for up to 60 s (both tunable via -ShareTimeoutSec / -SharePollSec)
   before deciding "no backup". If the share never comes up in that
   window the script exits 1 instead of 0 so the dispatcher logs a
   visible failure.
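
The polling in item 2 can be sketched as follows. This is a minimal
Python illustration under stated assumptions, not the script's
PowerShell: `is_reachable` is a hypothetical callable standing in for
the Test-Path on the share root, and the defaults mirror the 60 s / 3 s
values above.

```python
import time


def wait_for_share(is_reachable, timeout_sec=60.0, poll_sec=3.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll is_reachable() until it returns True or the timeout elapses.

    Returns True as soon as the share answers, False if it never does.
    `clock` and `sleep` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_sec
    while True:
        if is_reachable():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_sec)
```

A caller would map a False result to a non-zero exit code so the
failure is visible, rather than treating it as "no backup".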

Both behaviors were propagated to the host staging copy at
/home/camp/pxe-images/Restore-UDCData.ps1 and to the v2 share-staged
copy at tsgwp00525-v2/.../standard-machine/scripts/Restore-UDCData.ps1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:49:04 -04:00
cproudlock
e169f8d0f5 Standard-Machine: UDC backup/restore use ArchivedData (not ArchiveData)
UDC's per-bay archive directory is C:\ProgramData\UDC\ArchivedData, not
ArchiveData. The previous spelling was a typo introduced when the scripts
were first written; it would have meant Backup-UDCData.ps1 found no archive
content (silent zero-file backups), and Restore-UDCData.ps1 wrote into a
location UDC does not read from.

Path swap is straight string replacement across both scripts plus the .bat
wrapper's usage comment. Manifest field names in backup.manifest.json /
restore.manifest.json (ArchivedDataPresent, ArchivedDataFiles,
ArchivedDataBytes) updated to match.

Update-MachineNumber.ps1's parallel UDC-restore branch (still uncommitted
from a prior workstream) has the same fix in the working tree; it will be
captured in that branch's eventual commit.

The v2 share-staged copy at tsgwp00525-v2\standard-machine\scripts\
Restore-UDCData.ps1 also got the fix and is ready for push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:45:39 -04:00
cproudlock
8564a37541 Standard-Machine: UDC data backup + restore scripts for PC swap workflow
Backup-UDCData.bat / Backup-UDCData.ps1: tech-runnable, UAC-self-elevating.
Run on the OLD PC before retirement; reads bay number from
udc_settings.json, copies CurrentData.json + ArchiveData/ to
\\tsgwp00525\...\backup\udc\<bay>\, drops backup.manifest.json. Refuses
the 9999 placeholder so backups never collide across PCs.

Restore-UDCData.ps1: idempotent, designed for the manifest engine. 99%
of cycles are a silent no-op (sub-second, zero side effects); the other
1% (the cycle after a backup lands at this PC's bay) restores files
locally, moves the consumed backup to <bay>\migrated\<timestamp>\, writes
restore.manifest.json, and relaunches UDC. Round-trip + no-op fast path
verified end-to-end on the
win11 analyzer VM. Already wired into the Standard-Machine GE-Enforce
manifest at standard-machine\manifest.json on the v2 share.

Complementary to the placeholder-to-real branch in Update-MachineNumber.ps1:
that branch covers the 9999 -> real flow, this one covers the
pre-imaged-then-swapped flow where Update-MachineNumber already ran
before any backup existed. Both safely no-op if the other consumed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:27:20 -04:00