This playbook gives support teams a shared diagnostic order of operations. The goal is to eliminate guesswork, capture evidence early and escalate only after the fast checks are complete.
Triage order
- Confirm the symptom by documenting what is failing, when it started and whether the issue is total or intermittent.
- Check power and physical links before moving into protocol analysis.
- Review recent changes such as firmware upgrades, switch work, address plan changes or maintenance windows.
- Pull device and gateway logs with timestamps that cover the first known failure window.
- Test the data path step by step from device to gateway to upstream application.
Minimum evidence set
- Device serial number, firmware revision and location.
- IP settings, protocol role and peer endpoint details.
- Photos or screenshots of LEDs, alarms and topology notes.
- Relevant log excerpts with exact timestamps.
Protocol-specific checks
For polling protocols, verify address ranges, unit IDs and timeouts. For publish-subscribe paths, confirm broker reachability, topic permissions and message freshness. For remote access faults, confirm both tunnel state and downstream asset reachability.
Escalation trigger
Escalate when the basic checks are complete, evidence is captured and the remaining question requires engineering access, firmware analysis or a reproducible lab test. A complete escalation package saves hours later.