“When the network dies at 2am, you learn to love ethtool. And kernel pinning. And backups.” - Homelab incident retrospective
“Game over, man! Game over!” - Aliens. When the NIC goes dark mid-backup, it feels that way. But there’s a fix.
Proxmox hosts with Intel I217/I219 NICs (e1000e driver) can lose network connectivity under load. Backups freeze. VMs go dark. Unplugging and replugging the cable sometimes brings it back. This is a kernel regression - not your hardware.
“The network just… stopped. Ping works from the switch but not from my laptop.” - Every homelab operator at 2am
Symptoms#
| What you see | When it happens |
|---|---|
| Network interface shows UP but no traffic | During or after backups |
| SSH hangs, web UI unreachable | Under sustained load (sync jobs, VM migrations) |
| dmesg shows e1000e: Detected Hardware Unit Hang | Sporadically |
| Unplug/replug cable restores connectivity | Until next heavy traffic |
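To confirm the dmesg symptom without scrolling through the log by hand, the hang messages can be counted. This is a minimal sketch: the function name count_unit_hangs is made up for illustration, and it only matches the message text quoted in the table above.

```shell
#!/usr/bin/env bash
# Count "Detected Hardware Unit Hang" lines in kernel log text read from stdin.
# The function name is illustrative; the message text comes from the table above.
count_unit_hangs() {
    # grep -c prints the number of matching lines. It exits 1 when there are
    # none, so "|| true" keeps the helper usable under "set -e".
    grep -c 'Detected Hardware Unit Hang' || true
}

# Typical use on the host (kernel messages from the current boot):
#   journalctl -k -b --no-pager | count_unit_hangs
#   dmesg | count_unit_hangs
```

A count above zero after a backup run is a strong hint that you are hitting this issue rather than a cabling or switch problem.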
Affected NICs: Intel I217-LM, I219-V, I219-LM, and similar onboard gigabit controllers. The e1000e kernel driver has regressions in Proxmox kernels 6.8.12-9-pve and newer (including 6.17.x). Intel NICs that use the igb driver, such as a PCIe I350, are not affected.
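To check whether a host falls in that kernel range, the running version can be compared against the first affected release. A minimal sketch, assuming GNU sort (for -V) and taking 6.8.12-9-pve from this article as the threshold; is_affected_kernel is an illustrative name. It compares version strings only and cannot tell whether a later kernel has picked up a fix.

```shell
#!/usr/bin/env bash
# Return 0 if the given kernel version is 6.8.12-9-pve or newer.
# The threshold is the one named in this article, not something queried
# from the system.
is_affected_kernel() {
    local running="$1" first_bad="6.8.12-9-pve"
    # sort -V orders version strings. If first_bad sorts first (or both are
    # equal), then running >= first_bad.
    [ "$(printf '%s\n%s\n' "$first_bad" "$running" | sort -V | head -n1)" = "$first_bad" ]
}

# Typical use on the host:
#   if is_affected_kernel "$(uname -r)"; then echo "kernel is in the affected range"; fi
```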
Root Cause#
A regression in the e1000e driver’s offload handling. Under load, traffic passing through the segmentation and receive offloads (TSO, GSO, GRO) can hang the NIC. The driver still reports the interface as up, but the hardware has effectively frozen.
Two Fix Options#
| Option | Approach | Pros | Cons |
|---|---|---|---|
| A. ethtool workaround | Disable offload features on the affected interface | Stays on latest kernel; widely confirmed | Slight CPU overhead |
| B. Kernel pinning | Boot from an older kernel (e.g. 6.8.12-8-pve) | No config change to NIC | Stuck on old kernel until upstream fix |
Recommendation: Try Option A first. It works for most people and lets you keep current security patches.
Option A: ethtool Workaround (Recommended)#
Step 1: Identify the interface#
SSH to your Proxmox host:
# Find the I217/I219 (e1000e) interface
lspci -k | grep -A3 -i "e1000e\|219\|217"
# Interface name (often eno1 or enp0s31f6)
ip link show
The interface in use will typically be eno1 or enp0s31f6. In my case it was eno1 on a Dell OptiPlex 7080 with I219-LM.
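If the lspci output is ambiguous, sysfs shows which driver each interface is bound to. The sketch below lists interfaces bound to a given driver. The function name and the optional second argument (an alternative sysfs root, handy for testing) are illustrative, and it assumes the usual /sys/class/net/<iface>/device/driver layout.

```shell
#!/usr/bin/env bash
# Print the names of network interfaces bound to a given kernel driver.
# $1 = driver name (e.g. e1000e), $2 = sysfs net directory (optional).
ifaces_using_driver() {
    local driver="$1" sysroot="${2:-/sys/class/net}" dev link
    for dev in "$sysroot"/*; do
        link="$dev/device/driver"
        # Virtual interfaces (lo, bridges, tap devices) have no device/driver link.
        [ -e "$link" ] || continue
        # The link resolves to .../drivers/<name>; compare the last path component.
        if [ "$(basename "$(readlink -f "$link")")" = "$driver" ]; then
            basename "$dev"
        fi
    done
}

# Typical use on the host:
#   ifaces_using_driver e1000e
```

Whatever name it prints is the one to use in place of eno1 in the following steps.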
Step 2: Apply immediately (no reboot)#
sudo ethtool -K eno1 gso off gro off tso off tx off rx off rxvlan off txvlan off sg off
Replace eno1 with your interface name. Connectivity stays up; the fix takes effect instantly.
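If you manage more than one host, it can help to wrap the command so the feature list lives in one place and can be previewed before it touches the NIC. A minimal sketch: disable_offloads, OFFLOAD_FEATURES and the DRY_RUN switch are illustrative names, and the feature list is copied from the command in this step.

```shell
#!/usr/bin/env bash
# Offload features to switch off, copied from the ethtool command in this step.
OFFLOAD_FEATURES="gso gro tso tx rx rxvlan txvlan sg"

# Build and run "ethtool -K <iface> <feature> off ...".
# With DRY_RUN=1 the command is only printed.
disable_offloads() {
    local iface="$1" args="" feature
    for feature in $OFFLOAD_FEATURES; do
        args="$args $feature off"
    done
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ethtool -K $iface$args"
    else
        # Word splitting of $args is intended here.
        # shellcheck disable=SC2086
        ethtool -K "$iface" $args
    fi
}

# Preview, then apply (applying needs root):
#   DRY_RUN=1 disable_offloads eno1
#   disable_offloads eno1
```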
Step 3: Make it persistent#
Edit /etc/network/interfaces and add a post-up hook under the interface block:
# /etc/network/interfaces
iface eno1 inet manual
    post-up ethtool -K eno1 gso off gro off tso off tx off rx off rxvlan off txvlan off sg off

auto vmbr0
iface vmbr0 inet static
    address 192.168.2.131/24
    gateway 192.168.2.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

If eno1 is already a bridge port (as above), the post-up runs when the interface is brought up. Reboot to verify it survives a restart.
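Before rebooting, it is worth confirming that the hook actually made it into the file and is not commented out. The check below is a sketch: has_offload_hook is an illustrative name, and it only looks for an uncommented post-up line that calls ethtool -K on the given interface, without validating the rest of the stanza.

```shell
#!/usr/bin/env bash
# Return 0 if the interfaces file has an active "post-up ... ethtool -K <iface>" line.
# $1 = interface name, $2 = file to check (defaults to /etc/network/interfaces).
has_offload_hook() {
    local iface="$1" file="${2:-/etc/network/interfaces}"
    # Allow leading whitespace and an optional absolute path in front of ethtool.
    # Commented-out lines start with "#" and therefore do not match.
    grep -Eq "^[[:space:]]*post-up[[:space:]]+(/[^[:space:]]*/)?ethtool[[:space:]]+-K[[:space:]]+${iface}([[:space:]]|\$)" "$file"
}

# Typical use on the host:
#   has_offload_hook eno1 && echo "hook present" || echo "hook missing"
```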
eno1 is often the bridge slave for vmbr0. The post-up runs during ifup vmbr0 before the bridge attaches. No need to modify the bridge block - just add the post-up to the physical interface.
Step 4: Verify#
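The check in this step can also be scripted so it is easy to rerun after a reboot. The sketch below reads ethtool -k output on stdin and prints any of the three offloads that is not reported as off, so no output means the workaround is active. The function name offloads_still_on is illustrative.

```shell
#!/usr/bin/env bash
# Read `ethtool -k <iface>` output on stdin and print each of the three
# offloads checked in this step that is not reported as "off".
offloads_still_on() {
    awk -F': ' '
        $1 == "tcp-segmentation-offload" ||
        $1 == "generic-segmentation-offload" ||
        $1 == "generic-receive-offload" {
            split($2, state, " ")      # drop suffixes such as "[fixed]"
            if (state[1] != "off") print $1
        }'
}

# Typical use on the host:
#   ethtool -k eno1 | offloads_still_on
```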
sudo ethtool -k eno1 | grep -E 'generic-segmentation-offload|generic-receive-offload|tcp-segmentation-offload'
Expected output:
tcp-segmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
Option B: Kernel Pinning#
If the ethtool workaround doesn’t help (or you prefer not to change NIC settings):
# List available kernels
proxmox-boot-tool kernel list
# Pin to previous working kernel (e.g. 6.8.12-8-pve)
proxmox-boot-tool kernel pin 6.8.12-8-pve
# One-time test (next boot only)
proxmox-boot-tool kernel pin 6.8.12-8-pve --next-boot
# Unpin when fixed upstream
proxmox-boot-tool kernel unpin
Reboot after pinning. You’ll stay on that kernel until you unpin.
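After the reboot, confirm that the host really came up on the pinned kernel. A minimal sketch: running_kernel_is is an illustrative name, and the second argument exists only so the check can be exercised without rebooting.

```shell
#!/usr/bin/env bash
# Compare the running kernel with the one you pinned.
# $1 = expected version, $2 = running version (defaults to `uname -r`).
running_kernel_is() {
    local expected="$1" running="${2:-$(uname -r)}"
    if [ "$running" = "$expected" ]; then
        echo "OK: running $running"
    else
        echo "MISMATCH: running $running, expected $expected"
        return 1
    fi
}

# Typical use on the host after rebooting:
#   running_kernel_is 6.8.12-8-pve
```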
--next-boot is temporary. Use proxmox-boot-tool kernel pin <name> without flags to make it permanent. Otherwise you’ll boot back into the broken kernel after the next reboot.
Why This Works#
“Sometimes the fix is ’turn off the optimization.’ The kernel has bugs. Workarounds are valid engineering.” - Kernel regression survivor
The ethtool fix: TSO hands TCP segmentation to the NIC, while GSO and GRO are the kernel’s generic counterparts that batch segmentation and receive coalescing around the driver. The command also turns off checksum, scatter-gather, and VLAN offloads. The e1000e driver’s offload path has a bug - under load it can hang the hardware. With these features off, the kernel does the work in software. Slightly more CPU, but the NIC stays stable.
Kernel pinning: The regression was introduced in a specific kernel release. Older kernels don’t have the bad code path. Pinning avoids the problem entirely at the cost of not getting newer kernel fixes.
Real-World Result#
On a Dell OptiPlex 7080 (I219-LM) running Proxmox 8.x with kernel 6.17.9-1-pve:
- Before: Network died during backups; required cable unplug/replug.
- After: ethtool workaround applied + persistent config. Backups run without interruption.
No kernel downgrade. No hardware replacement. One post-up line.
Related#
- Automated Proxmox Backups with Proxmox Backup Server - When your backups do work, here’s how to automate them
- Proxmox Automated Install - Reproducible installs for new nodes