PXE server: fix WinPE re-image SMB connection loss
WinPE clients re-imaging the same machine hit "System error 53 - network path not found" on the second attempt. systemctl restart smbd did not help; only a full server power cycle cleared the state.

Root cause is kernel nf_conntrack: the default TCP ESTABLISHED timeout is 5 days (432000s). When a client from the first WinPE run reboots abnormally, its session leaves an ASSURED ESTABLISHED entry behind, and ufw's state-tracking rules then mis-classify the client's new SYN against that stale entry.

Fix applied in three layers:

- /etc/sysctl.d/99-pxe-conntrack.conf drops the TCP ESTABLISHED timeout to 1 hour and shortens the half-closed states to 30s each.
- smb.conf gains socket options TCP_NODELAY SO_KEEPALIVE IPTOS_LOWDELAY plus keepalive = 30 (seconds) and deadtime = 5 (minutes). Active sessions refresh the conntrack timer every 30s via keepalives so they never age out; dead ones expire in an hour.
- /usr/local/sbin/smb-diag.sh snapshots kernel + Samba state for remote diagnosis; /usr/local/sbin/smb-soft-reset.sh walks a progressive recovery (nmbd/smbd restart, conntrack flush, arp flush, ss -K) as an alternative to power-cycling.

The conntrack package is added to download-packages.sh and to the playbook verify list so the offline .deb bundle ships with it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
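For orientation, the first two layers can be sketched as a small script that prints the config fragments. This is only an illustration, not the committed files: the helper names (seconds_to_days, render_conntrack_sysctl, render_smb_global) are hypothetical, and the choice of fin_wait, close_wait and time_wait as the "half-closed states" is an assumption, since the commit message does not name the sysctl keys. The smb.conf values are taken directly from the message.

```shell
#!/bin/bash
# Sketch only: prints the two config fragments the commit message describes.
# ASSUMPTION: which sysctl keys count as "half-closed states" is not stated
# in the commit; fin_wait, close_wait and time_wait are a guess.

ESTABLISHED_DEFAULT=432000   # kernel default quoted in the commit message
ESTABLISHED_NEW=3600         # "1 hour"
HALF_CLOSED_NEW=30           # "30s each"

# Convert seconds to whole days, to sanity-check the "5 days" figure.
seconds_to_days() {
    echo $(( $1 / 86400 ))
}

# Print a sysctl.d fragment using the assumed key selection.
render_conntrack_sysctl() {
    local prefix=net.netfilter.nf_conntrack_tcp_timeout
    local state
    echo "${prefix}_established = ${ESTABLISHED_NEW}"
    for state in fin_wait close_wait time_wait; do
        echo "${prefix}_${state} = ${HALF_CLOSED_NEW}"
    done
}

# Print the smb.conf settings named in the commit message.
# keepalive is in seconds, deadtime is in minutes.
render_smb_global() {
    cat <<'EOF'
[global]
    socket options = TCP_NODELAY SO_KEEPALIVE IPTOS_LOWDELAY
    keepalive = 30
    deadtime = 5
EOF
}

echo "default ESTABLISHED timeout: $(seconds_to_days "$ESTABLISHED_DEFAULT") days"
render_conntrack_sysctl
render_smb_global
```

The point of the pairing is that the 30-second keepalive interval is far below the 3600-second conntrack timeout, so a live session keeps resetting its timer while an abandoned one is dropped within the hour.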
85
playbook/pxe-server-helpers/smb-diag.sh
Executable file
@@ -0,0 +1,85 @@
#!/bin/bash
# smb-diag.sh - snapshot Samba + kernel network state so a future failure
# can be diagnosed remotely. Run this on the PXE server BEFORE power-cycling
# when a WinPE re-image client is getting "cannot connect" errors.
#
# Output: /tmp/smb-diag-<timestamp>.log (pastebin-friendly)
#
# Captures: smbd processes, open SMB sessions, port 445 TCP sockets,
# conntrack, arp, bridge fdb, dnsmasq leases, recent smbd logs.

set -o pipefail

TS=$(date +%Y%m%d-%H%M%S)
OUT=/tmp/smb-diag-$TS.log

exec > >(tee "$OUT") 2>&1

echo "=============================================================="
echo "SMB diagnostic snapshot - $(date)"
echo "=============================================================="

echo
echo "### uptime / kernel ###"
uptime
uname -r

echo
echo "### interfaces + bridge state ###"
ip -brief addr
echo
bridge link show 2>/dev/null
echo
bridge fdb show 2>/dev/null | head -30

echo
echo "### smbd process tree ###"
pstree -p $(systemctl show -p MainPID --value smbd 2>/dev/null) 2>/dev/null
echo
ps -eo pid,ppid,state,command | grep -E 'smbd|nmbd' | grep -v grep

echo
echo "### systemctl status ###"
systemctl is-active smbd nmbd dnsmasq apache2

echo
echo "### smbstatus ###"
smbstatus 2>&1 | head -40

echo
echo "### port 445 sockets ###"
ss -tnp 2>/dev/null | grep :445

echo
echo "### conntrack entries for PXE subnet ###"
if command -v conntrack >/dev/null 2>&1; then
    conntrack -L 2>&1 | grep -E '10\.9\.100' | head -30
    echo "total conntrack entries: $(conntrack -C 2>&1)"
else
    echo "conntrack tool not installed"
fi

echo
echo "### arp / neighbour table for PXE subnet ###"
ip neigh show 2>/dev/null | grep -E '10\.9\.100|br-pxe'

echo
echo "### dnsmasq DHCP leases ###"
cat /var/lib/misc/dnsmasq.leases 2>/dev/null | head -20

echo
echo "### recent smbd log files ###"
ls -la /var/log/samba/ 2>/dev/null | head -20

echo
echo "### recent smbd auth / status errors (all machine logs) ###"
grep -hE 'NT_STATUS|error|denied' /var/log/samba/log.*.log 2>/dev/null | tail -30

echo
echo "### last 20 lines of smbd master log ###"
tail -20 /var/log/samba/log.smbd 2>/dev/null

echo
echo "=============================================================="
echo "Snapshot saved to $OUT"
echo "=============================================================="
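smb-diag.sh only observes; the recovery half of the third layer, smb-soft-reset.sh, is not part of this file. The following is a minimal sketch of the step order the commit message lists (nmbd/smbd restart, conntrack flush, arp flush, ss -K), not the committed script. The run and soft_reset helpers, the DRY_RUN and PXE_BRIDGE variables, and the exact filters passed to conntrack, ip and ss are all assumptions; br-pxe is borrowed from the neighbour-table grep in smb-diag.sh. A genuinely progressive script would presumably check connectivity between steps and stop at the first one that helps, which this sketch does not do.

```shell
#!/bin/bash
# Hypothetical sketch of a progressive SMB soft reset; NOT the committed
# smb-soft-reset.sh. Defaults to dry-run: commands are printed, not executed.

DRY_RUN=${DRY_RUN:-1}
PXE_BRIDGE=${PXE_BRIDGE:-br-pxe}   # assumed bridge name, as grepped in smb-diag.sh

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "DRY-RUN: $*"
    else
        "$@"
    fi
}

# Steps in the order the commit message lists them, least disruptive first.
soft_reset() {
    # 1. Restart the Samba daemons.
    run systemctl restart nmbd smbd
    # 2. Delete tracked TCP connections whose destination port is 445.
    run conntrack -D -p tcp --dport 445
    # 3. Flush neighbour (ARP) entries learned on the PXE bridge.
    run ip neigh flush dev "$PXE_BRIDGE"
    # 4. Forcibly close server-side TCP sockets on port 445
    #    (ss -K needs kernel support for socket destroy).
    run ss -K -t sport = :445
}

soft_reset
```

Running it as-is prints the four commands without touching the system; setting DRY_RUN=0 as root would execute them, so it is worth capturing a smb-diag.sh snapshot first.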