pxe-server/playbook/shopfloor-setup/Standard/Restore-UDCData.ps1
cproudlock 4f4f1f43e8 Restore-UDCData: handle ArchivedData-only backups (no CurrentData.json)
Production case: bay 3207 had ArchivedData\ on the share with full
production records but no CurrentData.json at the bay root. The previous
Restore logic treated CurrentData.json as the marker for "valid backup"
and exited early when absent, so the script silently no-op'd every cycle
even though there was real archive data ready to restore.

This was asymmetric with Backup-UDCData.ps1, which already handles a
missing CurrentData.json gracefully (it copies whatever exists). Possible
causes of CurrentData.json being absent from a backup: the source PC had
no live UDC session at backup time (UDC inactive / not recording), or the
backup partially failed for that one file (there is no Backup-side log to
confirm this without a rerun).

Either way, an ArchivedData-only backup is still a valid backup.

Behavior change:
- Early-exit only when BOTH CurrentData.json AND ArchivedData\ are
  absent. Otherwise proceed with whatever exists.
- Copy step for CurrentData.json wrapped in srcCurExists guard.
- consumeOk now requires: every present source successfully copied,
  AND at least one thing was actually copied.
- Move-to-migrated wraps CurrentData.json move in Test-Path guard
  (was already guarded for ArchivedData).
- restore.manifest.json gains CurrentDataPresent and ArchivedDataPresent
  booleans so future audits can see which side actually restored.
- UDC relaunch now fires when EITHER copy succeeded (was only on
  CurrentData.json copy).

Verbose logs now distinguish three cases at the early-exit:
- Both absent: "no work to do this cycle" (the 99% path)
- Only ArchivedData\: WARN with explanation, proceed
- Only CurrentData.json: WARN with explanation, proceed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:49:38 -04:00
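
The consumeOk rule from the commit message can be checked in isolation. The sketch below is illustrative only: Test-ConsumeOk is a hypothetical helper name that does not exist in the script, and it simply restates the three conditions (every present source copied, and at least one thing copied) as a function so the cases can be exercised without touching a share.

```powershell
# Hypothetical helper (not part of Restore-UDCData.ps1): restates the
# consumeOk rule so the individual cases can be tried by hand.
function Test-ConsumeOk {
    param(
        [bool]$SrcCurExists,
        [bool]$SrcArcExists,
        [bool]$CopiedCur,
        [bool]$CopiedArc
    )
    # A source is "satisfied" if it was copied, or if it was never there.
    $curSatisfied    = $CopiedCur -or (-not $SrcCurExists)
    $arcSatisfied    = $CopiedArc -or (-not $SrcArcExists)
    # Guard against consuming when nothing was actually restored.
    $copiedSomething = $CopiedCur -or $CopiedArc
    return ($curSatisfied -and $arcSatisfied -and $copiedSomething)
}

# ArchivedData-only backup (the bay 3207 case), copy succeeded -> consume.
Test-ConsumeOk -SrcCurExists $false -SrcArcExists $true -CopiedCur $false -CopiedArc $true   # True

# Both present, ArchivedData copy failed -> leave the backup for retry.
Test-ConsumeOk -SrcCurExists $true -SrcArcExists $true -CopiedCur $true -CopiedArc $false    # False

# Nothing present, nothing copied -> never consume.
Test-ConsumeOk -SrcCurExists $false -SrcArcExists $false -CopiedCur $false -CopiedArc $false # False
```

The third condition matters only as a safety net: the early exit already covers the "both absent" case, so it guards against future edits that might reach the consume step with nothing copied.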

331 lines
14 KiB
PowerShell

# Restore-UDCData.ps1 - Idempotent UDC data restore for the manifest engine.
#
# Triggered by the GE Shopfloor Enforce scheduled task (runs as SYSTEM, every
# user logon + every 5 min). Standard-machine manifest entry uses
# DetectionMethod=Always so this fires every cycle; the script self-decides
# whether there's actually any work to do.
#
# CONTRACT:
#   - 99% of cycles: no backup waiting -> exit 0 in ~1 second, ~5 lines of log
#   - 1 cycle (the one after Backup-UDCData lands a backup for this PC's bay):
#     stop UDC, copy whichever of CurrentData.json / ArchivedData/ exist to
#     C:\ProgramData\UDC, move the consumed backup to
#     <bay>\migrated\<timestamp>\, write restore.manifest.json, restart UDC.
#     After this, root is empty so the check returns "no backup waiting"
#     again on subsequent cycles.
#
# DESIGNED FOR THE SWAP WORKFLOW:
#   New PC gets pre-imaged with real machine number + locked down, sits in
#   storage. Days/weeks later, tech runs Backup-UDCData on old PC -> backup
#   lands on share. Tech swaps PCs. New PC powers on at the bay -> ShopFloor
#   autologon -> manifest engine fires this script -> backup detected ->
#   restored -> UDC opens with prior history intact.
#
# Replaces the placeholder->real trigger in Update-MachineNumber.ps1 for the
# pre-imaged-then-swapped case (where the trigger fired at imaging time, before
# the backup existed). Update-MachineNumber.ps1's branch still handles the
# secondary case (tech used 9999 placeholder + sets number at bay) - both
# triggers safely no-op if the other already consumed the backup.
#
# LOGGING:
#   Single rotating log at C:\Logs\UDC\Restore-UDCData.log (1 MB cap, rotated
#   to .old.log on overflow). Every cycle writes a header line so even the
#   silent no-op path leaves a trace. Errors include full exception type,
#   position, and stack trace.
[CmdletBinding()]
param(
    [string]$BackupShareRoot = '\\tsgwp00525.wjs.geaerospace.net\shared\dt\shopfloor\backup\udc',
    [string]$UdcDataDir = 'C:\ProgramData\UDC',
    [string]$UdcExePath = 'C:\Program Files\UDC\UDC.exe',
    [string]$UdcSettingsPath = 'C:\ProgramData\UDC\udc_settings.json',
    [string]$Site = 'West Jefferson',
    # Share can take 20-60s to become reachable from SYSTEM context after a
    # cold boot or fresh logon. Retry until then before deciding "no backup".
    [int]$ShareTimeoutSec = 60,
    [int]$SharePollSec = 3
)
$ErrorActionPreference = 'Continue'
# -- Logging setup --------------------------------------------------------
$logDir = 'C:\Logs\UDC'
try {
    if (-not (Test-Path $logDir)) { New-Item -Path $logDir -ItemType Directory -Force | Out-Null }
} catch { $logDir = $env:TEMP }
$logFile = Join-Path $logDir 'Restore-UDCData.log'
$logFileMaxBytes = 1MB
# Rotate log file if oversized (keeps one prior generation)
try {
    if ((Test-Path $logFile) -and ((Get-Item $logFile).Length -gt $logFileMaxBytes)) {
        $rotated = Join-Path $logDir 'Restore-UDCData.old.log'
        if (Test-Path $rotated) { Remove-Item $rotated -Force -ErrorAction SilentlyContinue }
        Rename-Item -Path $logFile -NewName 'Restore-UDCData.old.log' -Force -ErrorAction SilentlyContinue
    }
} catch {}
function Log {
    param([string]$Msg, [string]$Level = 'INFO')
    $ts = Get-Date -Format 'yyyy-MM-dd HH:mm:ss.fff'
    $line = "[$ts][$Level] $Msg"
    try { Add-Content -LiteralPath $logFile -Value $line -ErrorAction SilentlyContinue } catch {}
    Write-Host $line
}
function LogErr {
    param($Err)
    if (-not $Err) { return }
    $exType = if ($Err.Exception) { $Err.Exception.GetType().FullName } else { '<no exception>' }
    $exMsg = if ($Err.Exception) { $Err.Exception.Message } else { "$Err" }
    Log " exception: $exType - $exMsg" 'ERROR'
    if ($Err.InvocationInfo -and $Err.InvocationInfo.PositionMessage) {
        $pos = ($Err.InvocationInfo.PositionMessage -replace "`r?`n", ' | ')
        Log " at: $pos" 'ERROR'
    }
    if ($Err.ScriptStackTrace) {
        $st = ($Err.ScriptStackTrace -replace "`r?`n", ' | ')
        Log " stack: $st" 'ERROR'
    }
    if ($Err.Exception -and $Err.Exception.InnerException) {
        Log " inner: $($Err.Exception.InnerException.Message)" 'ERROR'
    }
}
Log '==============================================='
Log "Restore-UDCData starting (PID $PID)"
Log "Hostname: $env:COMPUTERNAME"
try {
    $whoami = [System.Security.Principal.WindowsIdentity]::GetCurrent().Name
} catch { $whoami = '<unknown>' }
Log "User identity: $whoami"
Log "PowerShell version: $($PSVersionTable.PSVersion)"
Log "BackupShareRoot: $BackupShareRoot"
Log "UdcDataDir: $UdcDataDir"
Log "UdcSettingsPath: $UdcSettingsPath"
Log "ShareTimeoutSec: $ShareTimeoutSec SharePollSec: $SharePollSec"
# -- Resolve local machine number ----------------------------------------
if (-not (Test-Path -LiteralPath $UdcSettingsPath)) {
    Log "udc_settings.json not present - UDC not installed yet, no work to do."
    Log 'Exit 0.'
    exit 0
}
try {
    $json = Get-Content -LiteralPath $UdcSettingsPath -Raw -ErrorAction Stop | ConvertFrom-Json -ErrorAction Stop
    $mn = $json.GeneralSettings.MachineNumber
    Log "Resolved MachineNumber from udc_settings: $mn"
} catch {
    Log "Failed to parse $UdcSettingsPath" 'ERROR'
    LogErr $_
    Log 'Exit 0.'
    exit 0
}
if (-not $mn -or $mn -eq '9999' -or $mn -notmatch '^\d+$') {
    Log "Machine number is placeholder/empty/non-numeric ('$mn'). Update-MachineNumber.ps1's branch will catch the placeholder->real transition. No work to do."
    Log 'Exit 0.'
    exit 0
}
# -- Wait for the share to be reachable ---------------------------------
# When this script runs early in a logon (e.g. via GE-Enforce on autologon),
# the SFLD share via the SMB redirector can take 20-60 seconds to become
# reachable, especially in SYSTEM context where the credential is the
# computer account. Poll until reachable or timeout before deciding "no backup".
Log "Polling share root for reachability: $BackupShareRoot"
$shareReachable = $false
$sw = [Diagnostics.Stopwatch]::StartNew()
while ($sw.Elapsed.TotalSeconds -lt $ShareTimeoutSec) {
    if (Test-Path -LiteralPath $BackupShareRoot) {
        $shareReachable = $true
        break
    }
    Start-Sleep -Seconds $SharePollSec
}
$sw.Stop()
if ($shareReachable) {
    Log ("Share reachable after {0:N1} s" -f $sw.Elapsed.TotalSeconds)
} else {
    Log "Share NOT reachable after $ShareTimeoutSec s. Cannot probe for backup. Exiting non-zero so the dispatcher logs a failure." 'ERROR'
    Log 'Exit 1.'
    exit 1
}
# -- Probe for a waiting backup ------------------------------------------
$bayDir = Join-Path $BackupShareRoot $mn
$srcCur = Join-Path $bayDir 'CurrentData.json'
$srcArc = Join-Path $bayDir 'ArchivedData'
Log "Probing backup paths for bay $mn"
Log " bayDir: $bayDir"
$bayDirExists = Test-Path -LiteralPath $bayDir
Log " bayDir exists: $bayDirExists"
$srcCurExists = Test-Path -LiteralPath $srcCur
Log " CurrentData.json src: $(if ($srcCurExists) { 'present' } else { 'absent' }) - $srcCur"
$srcArcExists = Test-Path -LiteralPath $srcArc
Log " ArchivedData/ src: $(if ($srcArcExists) { 'present' } else { 'absent' }) - $srcArc"
if (-not $srcCurExists -and -not $srcArcExists) {
    Log "No backup waiting for bay $mn (neither CurrentData.json nor ArchivedData\ at bay root) - no work to do this cycle."
    Log 'Exit 0.'
    exit 0
}
if (-not $srcCurExists) {
    Log "Partial backup waiting (ArchivedData\ present, CurrentData.json absent). Will restore ArchivedData\ only. Source PC may have had no live UDC session at backup time, or backup partially failed." 'WARN'
}
if (-not $srcArcExists) {
    Log "Partial backup waiting (CurrentData.json present, ArchivedData\ absent). Will restore CurrentData.json only." 'WARN'
}
# -- We have a backup. Restore. ------------------------------------------
Log "Backup waiting at $bayDir - proceeding with restore"
# Stop UDC.exe so CurrentData.json isn't locked
$udcProcs = @(Get-Process UDC -ErrorAction SilentlyContinue)
Log "UDC processes currently running: $($udcProcs.Count)"
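foreach ($p in $udcProcs) {
    try {
        Log " stopping UDC.exe PID $($p.Id)"
        $p.Kill()
        # WaitForExit returns $false on timeout; only report "stopped" when
        # the process has actually gone away.
        if ($p.WaitForExit(5000)) {
            Log " stopped"
        } else {
            Log " UDC.exe PID $($p.Id) still running 5 s after Kill()" 'WARN'
        }
    } catch {
        Log " could not stop UDC.exe PID $($p.Id)" 'WARN'
        LogErr $_
    }
}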
Start-Sleep -Milliseconds 500
# Ensure local UDC data dir exists
if (-not (Test-Path -LiteralPath $UdcDataDir)) {
    Log "Creating local UDC data dir: $UdcDataDir"
    try {
        New-Item -Path $UdcDataDir -ItemType Directory -Force | Out-Null
    } catch {
        Log "Failed to create $UdcDataDir - cannot continue" 'ERROR'
        LogErr $_
        Log 'Exit 1.'
        exit 1
    }
}
$localCur = Join-Path $UdcDataDir 'CurrentData.json'
$localArc = Join-Path $UdcDataDir 'ArchivedData'
# Copy CurrentData.json (only if present at source)
$copiedCur = $false
if ($srcCurExists) {
    Log "Copying CurrentData.json"
    Log " src: $srcCur"
    Log " dst: $localCur"
    try {
        Copy-Item -LiteralPath $srcCur -Destination $localCur -Force -ErrorAction Stop
        $copiedCur = $true
        $sz = (Get-Item -LiteralPath $localCur).Length
        Log " OK ($sz bytes)"
    } catch {
        Log " FAILED" 'ERROR'
        LogErr $_
    }
} else {
    Log "CurrentData.json not present in backup - skipping that copy step"
}
# Copy ArchivedData/
$copiedArc = $false
$arcFiles = 0
$arcBytes = 0
if ($srcArcExists) {
    Log "Copying ArchivedData/"
    Log " src: $srcArc"
    Log " dst: $localArc"
    try {
        if (Test-Path -LiteralPath $localArc) {
            Log " removing existing $localArc"
            # Stop on failure: copying onto a surviving directory would nest
            # the source as ArchivedData\ArchivedData instead of replacing it.
            Remove-Item -LiteralPath $localArc -Recurse -Force -ErrorAction Stop
        }
        Copy-Item -LiteralPath $srcArc -Destination $localArc -Recurse -Force -ErrorAction Stop
        $arcItems = Get-ChildItem -LiteralPath $localArc -Recurse -File -ErrorAction SilentlyContinue
        $arcFiles = @($arcItems).Count
        # Cast so an empty archive yields 0 rather than $null in the manifest.
        $arcBytes = [long](($arcItems | Measure-Object Length -Sum).Sum)
        $copiedArc = $true
        Log " OK ($arcFiles files, $arcBytes bytes)"
    } catch {
        Log " FAILED" 'ERROR'
        LogErr $_
    }
} else {
    Log "ArchivedData/ not present in backup - skipping that copy step"
}
# One-shot consumption: only consume when every present source has been
# successfully copied. If a source was absent we don't fault on it; if a
# source was present but copy failed, we leave the live backup for retry.
# Must have copied at least one thing to consume.
$consumeOk = (
    ($copiedCur -or -not $srcCurExists) -and
    ($copiedArc -or -not $srcArcExists) -and
    ($copiedCur -or $copiedArc)
)
Log "consumeOk=$consumeOk (copiedCur=$copiedCur, copiedArc=$copiedArc, srcCurExists=$srcCurExists, srcArcExists=$srcArcExists)"
if ($consumeOk) {
    try {
        # 'Z' in the format string is a literal, not a UTC conversion, so
        # convert explicitly to keep the suffix truthful.
        $stamp = (Get-Date).ToUniversalTime().ToString('yyyy-MM-ddTHH-mm-ssZ')
        $migDir = Join-Path $bayDir 'migrated'
        $migStamp = Join-Path $migDir $stamp
        Log "Moving consumed backup to $migStamp"
        if (-not (Test-Path -LiteralPath $migDir)) { New-Item -ItemType Directory -Path $migDir -Force | Out-Null }
        if (-not (Test-Path -LiteralPath $migStamp)) { New-Item -ItemType Directory -Path $migStamp -Force | Out-Null }
        if (Test-Path -LiteralPath $srcCur) {
            Move-Item -LiteralPath $srcCur -Destination (Join-Path $migStamp 'CurrentData.json') -Force -ErrorAction Stop
            Log " moved CurrentData.json"
        }
        if (Test-Path -LiteralPath $srcArc) {
            Move-Item -LiteralPath $srcArc -Destination (Join-Path $migStamp 'ArchivedData') -Force -ErrorAction Stop
            Log " moved ArchivedData/"
        }
        $bakManifest = Join-Path $bayDir 'backup.manifest.json'
        if (Test-Path -LiteralPath $bakManifest) {
            # Best effort: a leftover manifest does not block consumption,
            # but only log success if the file actually moved.
            Move-Item -LiteralPath $bakManifest -Destination (Join-Path $migStamp 'backup.manifest.json') -Force -ErrorAction SilentlyContinue
            if (Test-Path -LiteralPath $bakManifest) {
                Log " could not move backup.manifest.json (left in place)" 'WARN'
            } else {
                Log " moved backup.manifest.json"
            }
        }
        $restoreManifest = [ordered]@{
            RestoredAt = (Get-Date -Format 'o')
            DestinationHostname = $env:COMPUTERNAME
            DestinationUser = $whoami
            MachineNumber = $mn
            CurrentDataPresent = $copiedCur
            CurrentDataBytes = if ($copiedCur) { (Get-Item -LiteralPath $localCur).Length } else { 0 }
            ArchivedDataPresent = $copiedArc
            ArchivedDataFiles = $arcFiles
            ArchivedDataBytes = $arcBytes
            RestoredVia = 'Restore-UDCData.ps1 (manifest engine, on logon)'
        }
        $restoreManifest | ConvertTo-Json | Set-Content -Path (Join-Path $migStamp 'restore.manifest.json') -Encoding UTF8 -ErrorAction Stop
        Log " wrote restore.manifest.json"
        Log "Backup consumed -> migrated\$stamp\"
    } catch {
        Log "Move-to-migrated FAILED (data IS restored locally; live backup remains, next cycle will retry consumption)" 'ERROR'
        LogErr $_
    }
} else {
    Log "Restore incomplete - leaving live backup at $bayDir for retry next cycle." 'WARN'
}
# Relaunch UDC with the current machine number args. UDC's vendor autostart in
# HKLM\Run will also fire on the next user logon, so this is belt-and-suspenders
# for the same-session case (e.g. tech is at the keyboard during the restore).
if ((Test-Path -LiteralPath $UdcExePath) -and ($copiedCur -or $copiedArc)) {
    Log "Relaunching UDC.exe: `"$Site`" -$mn"
    try {
        Start-Process -FilePath $UdcExePath -ArgumentList @("`"$Site`"", "-$mn")
        Log " relaunched"
    } catch {
        Log " UDC relaunch FAILED" 'WARN'
        LogErr $_
    }
}
Log 'Exit 0.'
Log '==============================================='
exit 0
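
The commit message says the CurrentDataPresent and ArchivedDataPresent fields exist so future audits can see which side was restored. A minimal sketch of such an audit follows, assuming the <bay>\migrated\<timestamp>\restore.manifest.json layout written by the script above. Get-UdcRestoreAudit is a hypothetical name and the output shape is an assumption; nothing like it ships in this repo.

```powershell
# Hypothetical audit helper (an assumption, not part of this repo): lists
# what each past restore for one bay actually brought back.
function Get-UdcRestoreAudit {
    param([Parameter(Mandatory)][string]$BayDir)

    $migDir = Join-Path $BayDir 'migrated'
    if (-not (Test-Path -LiteralPath $migDir)) { return }

    Get-ChildItem -LiteralPath $migDir -Directory | ForEach-Object {
        $manifestPath = Join-Path $_.FullName 'restore.manifest.json'
        if (Test-Path -LiteralPath $manifestPath) {
            $m = Get-Content -LiteralPath $manifestPath -Raw | ConvertFrom-Json
            # Manifests written before this change have no *Present fields;
            # they surface as $null here rather than being guessed at.
            [pscustomobject]@{
                Stamp               = $_.Name
                MachineNumber       = $m.MachineNumber
                CurrentDataPresent  = $m.CurrentDataPresent
                ArchivedDataPresent = $m.ArchivedDataPresent
                ArchivedDataFiles   = $m.ArchivedDataFiles
            }
        }
    }
}
```

A call such as `Get-UdcRestoreAudit -BayDir (Join-Path $BackupShareRoot '3207') | Format-Table` would then show one row per consumed backup, with ArchivedData-only restores visible as CurrentDataPresent = False.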