How to Fix ESXi Host Not Responding and Disconnections from vCenter: Step-by-Step Troubleshooting Guide
Last updated: July 23, 2025
Introduction
ESXi hosts that stop responding or intermittently disconnect from vCenter Server can be frustrating and disruptive, especially in production environments. Affected hosts typically show a status of “Not Responding” in the vSphere Client, causing alarms and failed tasks. Most of the time, this results from lost heartbeat communication over UDP port 902.
In this article, we’ll explore the causes of these disconnections, walk you through step-by-step troubleshooting, highlight useful monitoring tools, and share a real-world scenario to help you resolve this issue confidently.
What Triggers the “Not Responding” State?
vCenter Server monitors the availability of ESXi hosts using a heartbeat mechanism. Each ESXi host sends a heartbeat every 10 seconds over UDP port 902. If vCenter doesn’t receive at least one heartbeat within a 60-second window, it flags the host as “Not Responding”.
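To make the timing concrete, here is a minimal Python sketch of the rule described above: a host is flagged when no heartbeat arrives within the timeout window. The function name and the idea of feeding it arrival timestamps (for example, taken from a packet capture) are illustrative assumptions; this is not how vCenter itself implements the check.

```python
NOT_RESPONDING_TIMEOUT_S = 60  # default window stated in this article


def find_not_responding_windows(arrivals, timeout=NOT_RESPONDING_TIMEOUT_S):
    """Return (last_seen, next_seen) pairs where the silence between two
    consecutive heartbeat arrivals is longer than the timeout.

    `arrivals` is a list of arrival times in seconds (hypothetical input).
    """
    ordered = sorted(arrivals)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > timeout:
            gaps.append((previous, current))
    return gaps


if __name__ == "__main__":
    # Heartbeats every 10 s, then 90 s of silence, then recovery.
    sample = [0, 10, 20, 30, 120, 130]
    print(find_not_responding_windows(sample))
```

With the sample data, the 90-second silence between t=30 and t=120 is the only gap reported. Passing `timeout=120` to the same call reports nothing, which is the effect a longer timeout has on short outages.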
You may notice log entries like this in /var/log/vmware/vpxd/vpxd.log:
[YYYY-MM-DD T hh:mm:ss verbose 'App'] [VpxdIntHost] Missed 2 heartbeats for host esxi.example.com
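If the log is large, a short script can summarize which hosts are missing heartbeats. The sketch below assumes the message text matches the sample line above ("Missed N heartbeats for host NAME"); the function name is hypothetical, and the pattern may need adjusting for your vCenter version.

```python
import re

# Assumption: the log message matches the sample line shown in this article.
MISSED_RE = re.compile(r"Missed (\d+) heartbeats? for host (\S+)")


def worst_missed_per_host(lines):
    """Map each host name to the highest missed-heartbeat count seen."""
    worst = {}
    for line in lines:
        match = MISSED_RE.search(line)
        if match:
            count, host = int(match.group(1)), match.group(2)
            worst[host] = max(worst.get(host, 0), count)
    return worst


if __name__ == "__main__":
    sample = [
        "[YYYY-MM-DD T hh:mm:ss verbose 'App'] [VpxdIntHost] "
        "Missed 2 heartbeats for host esxi.example.com",
        "an unrelated log line",
    ]
    print(worst_missed_per_host(sample))
```

To run it against the real log, pass an open file object instead of the sample list, for example `worst_missed_per_host(open("/var/log/vmware/vpxd/vpxd.log", errors="replace"))`.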
Common causes include:
- Firewall blocking UDP port 902
- Network congestion or packet drops
- Issues in ESXi host management agents (vpxa and hostd)
- High CPU or memory usage on vCenter, which creates a processing bottleneck
Step-by-Step Troubleshooting Guide
Step 1: Verify Heartbeat Transmission from ESXi Host
Make sure the SSH service (TSM-SSH) is started on the host from the ESXi web client, then SSH into the ESXi host and run the following command:
tcpdump-uw dst host <vcenter_ip_address> and udp port 902
(Replace <vcenter_ip_address> with the actual vCenter IP address.)
If packets appear in the output, the host is sending heartbeat packets to vCenter.
Step 2: Verify Heartbeat Receiving on vCenter Server
From a shell on the vCenter Server, run:
tcpdump src host <esxi_host_ip_address> and udp port 902
(Replace <esxi_host_ip_address> with the actual ESXi host IP address.)
If packets appear in the output, vCenter is receiving the heartbeats. If no packets appear, traffic is likely being blocked in transit, for example because UDP port 902 is not allowed through a firewall. In that case, the port needs to be allowed through the firewall (bi-directionally or uni-directionally, depending on your policy).
Interpret the Results
- Heartbeats sent but not received: Check intermediate firewalls, ACLs, or routers filtering UDP 902.
- Heartbeats not sent: Inspect the ESXi management agents (vpxa and hostd) and logs such as /var/log/hostd.log.
- Heartbeats sent and received: Investigate vCenter performance or internal processing issues, and check the vpxd logs on vCenter around the date and time the issue occurred.
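The three outcomes above can be written down as a small decision function, which is handy if you script the captures across many hosts. This is only a sketch of the checklist in this article; the function name and its two boolean inputs are hypothetical.

```python
def interpret_capture(sent_from_host: bool, received_by_vcenter: bool) -> str:
    """Map the two tcpdump observations to the next troubleshooting step."""
    # Checked first: if nothing leaves the host, the vCenter-side capture
    # cannot tell us anything about the network path.
    if not sent_from_host:
        return "Inspect ESXi management agents and logs such as /var/log/hostd.log"
    if not received_by_vcenter:
        return "Check intermediate firewalls, ACLs, or routers filtering UDP 902"
    return "Investigate vCenter performance and the vpxd logs"


if __name__ == "__main__":
    # Host sends heartbeats, but vCenter never sees them.
    print(interpret_capture(sent_from_host=True, received_by_vcenter=False))
```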
Temporary Workaround: Increase Heartbeat Timeout
To prevent false disconnections while troubleshooting, increase the timeout window in vCenter:
Steps to increase Heartbeat Timeout in vSphere Client
- Log in to the vSphere Web Client.
- Select your vCenter Server from the inventory.
- Go to Configure > Advanced Settings.
- Click Edit and add a new key-value pair:
Key: config.vpxd.heartbeat.notRespondingTimeout
Value: 120
- Click OK to save the change.
- Restart the vCenter Server service (vmware-vpxd):
service-control --stop vmware-vpxd && service-control --start vmware-vpxd
Important: This change is a temporary workaround. It gives more time for heartbeats to be received but doesn’t fix the actual network problem. Be sure to investigate and fix the root cause.
Monitoring Tools for Detecting Disconnections
To proactively monitor your environment, consider these tools:
- VMware Aria Operations (formerly vRealize Operations Manager, vROps): Offers built-in alerts and dashboards for host connectivity issues, along with advanced correlation and AI-driven insights.
- Pingdom / Zabbix / PRTG: Use for network-layer heartbeat and uptime monitoring, as well as web and service monitoring.
- vRealize Log Insight (now VMware Aria Operations for Logs): Ideal for parsing logs to detect heartbeat drops and service restarts.
Real-World Example: Firewall Blocked UDP Port 902 at a Branch Office
One of our customers faced a recurring issue where ESXi hosts at a remote branch office kept disconnecting from the main vCenter. After validating heartbeat transmission using tcpdump, they found packets were sent from the host but never received by vCenter.
A deep-dive into their network path revealed that a recently updated firewall at the branch site was dropping outbound UDP traffic on port 902. Once an exception was added to allow that traffic, the problem resolved immediately, and host stability returned.
vCenter Server Under High Resource Utilization?
If heartbeat packets are reaching vCenter but the host still disconnects, the bottleneck might be on vCenter itself. Here’s what to check:
- Check CPU, memory, and disk I/O usage on the VCSA.
- Review the PostgreSQL DB size and health.
- Verify that the vCenter VM has sufficient resources assigned (recommend at least 16 GB RAM for small to mid environments).
- Ensure no snapshot is lingering on the vCenter VM.
Tools like top and df -h on the VCSA, and esxtop on the ESXi host running the vCenter VM, can help identify performance bottlenecks.
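For a quick scripted check of the same signals, the sketch below reads disk usage and the 1-minute load average using only the Python standard library. It assumes a Python 3 interpreter is available on the machine you run it on; the function names are hypothetical, and the disk figure is only an approximation of what df reports.

```python
import os
import shutil


def disk_usage_percent(path="/"):
    """Return the used space on the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return round(usage.used / usage.total * 100, 1)


def load_per_cpu():
    """Return the 1-minute load average divided by the CPU count (Unix only)."""
    load_1min = os.getloadavg()[0]
    return load_1min / (os.cpu_count() or 1)


if __name__ == "__main__":
    print(f"/ is {disk_usage_percent('/')}% full")
    # A sustained value well above 1.0 suggests the CPUs are saturated.
    print(f"load per CPU: {load_per_cpu():.2f}")
```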
Final Thoughts
Frequent ESXi host disconnections can degrade performance, trigger alarms, and cause failed VM operations. While increasing heartbeat timeouts offers a temporary shield, long-term resolution requires a systematic approach, from confirming network paths to analyzing vCenter internals.
Use this guide as your go-to checklist. Combine it with monitoring tools and regular log analysis to keep your environment stable and reliable.
If you still cannot identify the issue, open a case in the Broadcom Support Portal.
If you are not sure how to create a case with Broadcom, refer to the official Broadcom article for guidance.
Tip: Always document firewall and network changes in shared knowledge bases to avoid surprises after updates.