Blancco: sweep full NIC driver tree into kexec-initrd + verbose init

The previous approach listed six specific drivers (e1000e, igb, tg3,
bnx2, bnxt_en, b44) and silenced insmod errors (2>/dev/null). On the
modern Dell fleet (Latitude 5330/5440, Pro-series, newer OptiPlex) this
missed igc (Intel I225/I226) entirely, and for the drivers we did
include, the dependency modules they need at insmod time (libeth, libie,
dca, i2c-algo-bit, macsec, mii, libphy, ptp, ...) were never bundled.
insmod does not resolve dependencies, so NIC drivers that need
helpers failed to load, and the suppressed stderr hid the failure.
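
To check which helpers a driver needs before bundling it, the dependency
list can be read from modules.dep. The following is a minimal sketch,
assuming the depmod-generated file under /lib/modules/$(uname -r);
deps_of is a hypothetical helper and is not part of this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of this commit).
# deps_of MODULE [MODULES_DEP]
# Print the basenames of the modules that MODULE depends on, one per
# line, by reading a modules.dep file. Each line of that file has the
# form "path/to/mod.ko[.ext]: path/to/dep1.ko[.ext] path/to/dep2.ko[.ext]".
deps_of() {
    mod="$1"
    depfile="${2:-/lib/modules/$(uname -r)/modules.dep}"
    awk -v mod="$mod" '
        {
            # Field 1 is the module path followed by a colon.
            target = $1
            sub(/:$/, "", target)
            n = split(target, parts, "/")
            base = parts[n]
            sub(/\.ko(\..*)?$/, "", base)   # strip .ko, .ko.zst, .ko.xz, ...
            if (base != mod) next
            # Remaining fields are the dependency paths.
            for (i = 2; i <= NF; i++) {
                m = split($i, p, "/")
                dep = p[m]
                sub(/\.ko(\..*)?$/, "", dep)
                print dep
            }
        }
    ' "$depfile"
}
```

Usage: `deps_of igc` prints whatever modules.dep lists for igc on the
build host. The match is on the file basename, so a module whose file
name uses hyphens must be queried with hyphens. `modinfo -F depends
<module>` is another way to inspect a module's declared dependencies.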

playbook/pxe_server_setup.yml (kexec-initrd build):
  - Sweep the whole drivers/net/ethernet tree (~170 drivers, all
    vendors, ~15 MB total). Drivers for hardware that is not present
    bind to nothing.
  - Add common helper dirs: drivers/net/{phy,mdio}, drivers/i2c/algos,
    drivers/dca, drivers/ptp, net/macsec, drivers/ssb.
  - overlay.ko kept.

playbook/blancco-init.sh:
  - Load helpers BEFORE main NIC drivers (libeth/libie, dca,
    i2c-algo-bit, macsec, mii, ssb, libphy, mdio*, phy*, ptp*),
    then iterate remaining modules.
  - Remove 2>/dev/null on insmod so actual failures surface on the
    boot console.
  - Print kernel version + /sys/class/net before/after driver load,
    plus dmesg grep for NIC driver activity.
  - On "no interface found" failure, dump dmesg tail and drop to a
    busybox shell for manual debug rather than just hanging.
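
The load order and failure handling described above can be sketched as
follows. This is an illustration under stated assumptions, not the
actual blancco-init.sh: the function names, the flat module directory
argument, and the 40-line dmesg tail are hypothetical; the helper list
mirrors the one in this message.

```shell
#!/bin/sh
# Sketch only; names below are hypothetical, not taken from blancco-init.sh.
HELPERS="libeth libie dca i2c-algo-bit macsec mii ssb libphy"
HELPER_PREFIXES="mdio phy ptp"

# Print module paths in load order: named helpers, then prefix-matched
# helpers (mdio*, phy*, ptp*), then every remaining *.ko in the directory.
ordered_modules() {
    dir="$1"
    seen=" "
    for name in $HELPERS; do
        f="$dir/$name.ko"
        [ -f "$f" ] || continue
        echo "$f"
        seen="$seen$f "
    done
    for prefix in $HELPER_PREFIXES; do
        for f in "$dir/$prefix"*.ko; do
            [ -f "$f" ] || continue          # unmatched glob stays literal
            case "$seen" in *" $f "*) continue ;; esac
            echo "$f"
            seen="$seen$f "
        done
    done
    for f in "$dir"/*.ko; do
        [ -f "$f" ] || continue
        case "$seen" in *" $f "*) continue ;; esac
        echo "$f"
    done
}

# List non-loopback interfaces under a sysfs net directory.
list_ifaces() {
    for p in "${1:-/sys/class/net}"/*; do
        [ -e "$p" ] || continue
        n=${p##*/}
        [ "$n" = "lo" ] || echo "$n"
    done
}

# Load everything, leaving insmod's stderr visible on the console.
load_nic_drivers() {
    echo "kernel: $(uname -r)"
    echo "interfaces before: $(list_ifaces | tr '\n' ' ')"
    ordered_modules "$1" | while read -r f; do
        insmod "$f" || echo "insmod failed: $f"
    done
    echo "interfaces after: $(list_ifaces | tr '\n' ' ')"
    if [ -z "$(list_ifaces)" ]; then
        dmesg | tail -n 40    # show recent kernel messages
        exec sh               # drop to a shell for manual debugging
    fi
}
```

In an init script this would be invoked as `load_nic_drivers /lib/modules`,
assuming the initrd keeps all decompressed modules in that one directory.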

Separate from this commit but related: kexec-initrd.img on both PXE
servers (.1 and .2) was rebuilt in line with these changes. The
pre-rebuild binary is kept as kexec-initrd.img.bak-<timestamp>.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cproudlock
2026-04-22 13:29:35 -04:00
parent 2ac88a6c1b
commit d7ec6a2b5f
2 changed files with 63 additions and 14 deletions

@@ -671,16 +671,37 @@
     ln -sf busybox "$WORK/bin/$cmd"
 done
-# NIC drivers (common server NICs)
+# NIC drivers: sweep the whole drivers/net/ethernet tree. The
+# earlier targeted list (e1000e, igb, tg3, bnx2, bnxt_en, b44)
+# missed igc (Intel I225/I226 on modern Dell Latitude 5330/5440,
+# Pro-series), plus helper modules (libeth, libie, dca,
+# i2c-algo-bit, macsec) needed as dependencies by the main
+# drivers. insmod does not resolve deps; bundling the full
+# tree + helpers is the reliable way to cover any NIC.
 KVER=$(uname -r)
-KMOD="/lib/modules/$KVER/kernel/drivers/net/ethernet"
-for drv in intel/e1000e/e1000e.ko.zst intel/igb/igb.ko.zst broadcom/tg3.ko.zst broadcom/bnx2.ko.zst broadcom/bnxt/bnxt_en.ko.zst broadcom/b44.ko.zst; do
-    if [ -f "$KMOD/$drv" ]; then
-        zstd -d "$KMOD/$drv" -o "$WORK/lib/modules/$(basename ${drv%.zst})" 2>/dev/null
-    fi
+ETH="/lib/modules/$KVER/kernel/drivers/net/ethernet"
+find "$ETH" -name "*.ko.zst" -type f 2>/dev/null | while read -r src; do
+    zstd -d "$src" -o "$WORK/lib/modules/$(basename ${src%.zst})" 2>/dev/null
 done
-# Overlay module
+# Helper modules (PHY, MDIO, I2C, DCA, PTP, macsec, ssb) - loaded
+# first in blancco-init.sh before the main NIC drivers.
+for helper_dir in \
+    "/lib/modules/$KVER/kernel/drivers/net/phy" \
+    "/lib/modules/$KVER/kernel/drivers/net/mdio" \
+    "/lib/modules/$KVER/kernel/drivers/i2c/algos" \
+    "/lib/modules/$KVER/kernel/drivers/dca" \
+    "/lib/modules/$KVER/kernel/drivers/ptp" \
+    "/lib/modules/$KVER/kernel/net/macsec" \
+    "/lib/modules/$KVER/kernel/drivers/ssb" \
+; do
+    [ -d "$helper_dir" ] || continue
+    find "$helper_dir" -name "*.ko.zst" -type f 2>/dev/null | while read -r src; do
+        zstd -d "$src" -o "$WORK/lib/modules/$(basename ${src%.zst})" 2>/dev/null
+    done
+done
+# Overlay module (switch_root overlay mount)
 OVMOD="/lib/modules/$KVER/kernel/fs/overlayfs/overlay.ko.zst"
 if [ -f "$OVMOD" ]; then
     zstd -d "$OVMOD" -o "$WORK/lib/modules/overlay.ko" 2>/dev/null