IP SLA with Syslog Alerting

Most network outages are not detected by the router that experiences them — they are detected by an end user who calls the help desk, or by a NOC engineer refreshing a dashboard. IP SLA (IP Service Level Agreements) changes this by turning the router itself into an active probe: it continuously sends synthetic test traffic to a target, measures reachability and performance, and declares the target reachable or unreachable based on those measurements. Object Tracking watches the IP SLA result and translates probe outcomes into a binary state — Up or Down. EEM applets subscribe to that state and fire the moment a transition occurs, generating syslog alerts, capturing diagnostics, and sending email notifications — all before any human has opened a dashboard.

This lab assembles that complete monitoring pipeline on a single router. It covers four progressively more capable probe types: ICMP echo for basic reachability, HTTP for application-layer connectivity, UDP jitter for voice-quality monitoring, and DNS for resolver availability. Each probe is wired to a tracking object and an EEM applet that generates an alert on failure and a recovery notification when the target returns. The result is a lightweight, self-contained WAN monitoring system that requires no external NMS, no SNMP collector, and no subscription.

Before starting, ensure you understand IP SLA probe types and tracking at IP SLA Configuration & Tracking. For the EEM applet architecture used in this lab, see EEM — Embedded Event Manager Scripting. For forwarding alerts to a central syslog server, see Syslog Configuration and Syslog Server Configuration. For understanding syslog severity levels referenced in the EEM actions, see Syslog Severity Levels.

1. IP SLA + Track + EEM — The Full Pipeline

How the Three Components Work Together

  ┌─────────────────────────────────────────────────────────────┐
  │  LAYER 1: IP SLA PROBE                                      │
  │  Sends synthetic test traffic to the target on a schedule   │
  │  Records: RTT, packet loss, jitter, return code             │
  │  Declares: operation success or failure                     │
  │                                                             │
  │  ip sla 1                                                   │
  │   icmp-echo 203.0.113.1 source-interface GigabitEthernet0/0 │
  │   frequency 30                                              │
  │  ip sla schedule 1 life forever start-time now             │
  └───────────────────────┬─────────────────────────────────────┘
                          │ passes result (success/fail)
                          ▼
  ┌─────────────────────────────────────────────────────────────┐
  │  LAYER 2: OBJECT TRACKING                                   │
  │  Watches the IP SLA operation result                        │
  │  Maintains binary state: Up (probe succeeding) or           │
  │                          Down (probe failing)               │
  │  Applies reachability or threshold criteria                 │
  │                                                             │
  │  track 1 ip sla 1 reachability                             │
  └───────────────────────┬─────────────────────────────────────┘
                          │ notifies subscribers on state change
                          ▼
  ┌─────────────────────────────────────────────────────────────┐
  │  LAYER 3: EEM APPLET — event track 1 state down            │
  │  Fires the instant track 1 transitions to Down              │
  │  Actions:                                                   │
  │   1. Send syslog CRITICAL alert to log buffer + server      │
  │   2. Capture show ip sla statistics to flash                │
  │   3. Capture show ip route to flash                         │
  │   4. Send email to NOC team                                 │
  │  Companion applet on state up:                              │
  │   1. Send syslog NOTICE — target recovered                  │
  └─────────────────────────────────────────────────────────────┘
                          │
                          ▼
              Syslog server / NOC inbox
              (alert arrives within seconds
               of probe declaring target down)
  
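
The three layers reduce to a simple pattern in ordinary code: a probe emits pass/fail results, a tracker collapses them into a binary state, and a handler fires only on state transitions. A minimal Python sketch of that pipeline (illustrative only — this is not router code, and all names are invented):

```python
# Minimal model of the probe -> track -> applet pipeline (illustrative only).

def run_pipeline(probe_results):
    """probe_results: iterable of booleans (True = probe succeeded).
    Returns the list of alerts an EEM-style handler would emit."""
    alerts = []
    state = "Up"                      # tracking object starts Up
    for ok in probe_results:
        new_state = "Up" if ok else "Down"
        if new_state != state:        # applet fires only on transitions
            state = new_state
            if state == "Down":
                alerts.append("CRITICAL: target UNREACHABLE")
            else:
                alerts.append("NOTICE: target recovered")
    return alerts

# Three consecutive failures generate ONE alert, not three:
print(run_pipeline([True, True, False, False, False, True]))
# -> ['CRITICAL: target UNREACHABLE', 'NOTICE: target recovered']
```

This is why the tracking layer exists at all: the probe produces a result every cycle, but the NOC only wants to hear about changes.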

IP SLA Probe Types Covered in this Lab

  ICMP Echo (icmp-echo)
    Measures:   round-trip time (RTT) and reachability to any IP-addressable target
    Responder:  not required — the target only needs to answer ICMP (ping)
    Use case:   WAN gateway reachability, ISP monitoring, basic link health

  UDP Jitter (udp-jitter)
    Measures:   RTT, jitter (delay variation), packet loss, and out-of-order delivery — full VoIP quality metrics
    Responder:  required — the Cisco IP SLA Responder must be enabled on the target router
    Use case:   voice quality monitoring, MPLS SLA verification, QoS validation

  HTTP (http get)
    Measures:   HTTP GET response time and HTTP return code — application-layer reachability
    Responder:  not required — any web server
    Use case:   web application availability, DNS + HTTP end-to-end testing

  DNS (dns)
    Measures:   DNS resolution time and success/failure for a specific hostname
    Responder:  not required — any standard DNS server
    Use case:   DNS resolver availability monitoring, split-horizon DNS validation

  TCP Connect (tcp-connect)
    Measures:   TCP three-way handshake completion time to a specific IP and port
    Responder:  not required — any TCP server (port 80, 443, 22, etc.)
    Use case:   application port availability — verify SSH, HTTPS, or custom services are accepting connections
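
A tcp-connect probe is easy to reproduce from any host for a quick spot check: time a TCP three-way handshake to a port and treat a timeout or refusal as failure. A Python stdlib sketch (the target address is an example, not from this lab):

```python
import socket
import time

def tcp_connect_probe(host, port, timeout_s=5.0):
    """Time a TCP handshake, mimicking an IP SLA tcp-connect operation.
    Returns (ok, rtt_ms); rtt_ms is None on failure."""
    start = time.monotonic()
    try:
        # create_connection completes the three-way handshake or raises
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, (time.monotonic() - start) * 1000.0
    except OSError:           # covers timeout, refused, unreachable
        return False, None

# 192.0.2.0/24 is TEST-NET-1 (blackholed): expect (False, None)
print(tcp_connect_probe("192.0.2.1", 80, timeout_s=1.0))
```

Like the router probe, this verifies the service is accepting connections on the port — strictly more information than a successful ping.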

Tracking Object Types

  Reachability (track N ip sla N reachability)
    Goes Down when: the IP SLA operation returns a failure — timeout, unreachable, or another non-OK return code
    Use case:       binary up/down monitoring — did the probe succeed or fail?

  State (track N ip sla N state)
    Goes Down when: the operation's return code is anything other than OK — typically OverThreshold, when the measured value exceeds the configured threshold
    Use case:       threshold-based monitoring — did RTT exceed the configured threshold?

2. Lab Topology & Monitoring Plan

NetsTuts_R1 is a dual-WAN edge router with a primary ISP (Gi0/0) and backup ISP (Gi0/1). Five monitoring probes will be deployed: a WAN gateway ICMP probe on each ISP, a UDP jitter probe to the branch office, an HTTP probe to a critical internal web server, and a DNS probe against the internal resolver. Each probe is wired to a dedicated tracking object; the four outage-critical targets also get a pair of EEM applets (down alert + up recovery).

  ┌─────────────────────────────────────────────────────────────────┐
  │                       NetsTuts_R1                               │
  │  Gi0/0 ── 203.0.113.2 ── ISP-A Gateway: 203.0.113.1           │
  │  Gi0/1 ── 198.51.100.2 ── ISP-B Gateway: 198.51.100.1         │
  │  Gi0/2 ── 10.0.0.1/24  ── LAN                                  │
  └─────────────────────────────────────────────────────────────────┘

  IP SLA Monitoring Plan:
  ┌──────┬──────────────┬────────────────────────────────┬─────────┐
  │ SLA# │ Probe Type   │ Target                         │ Track # │
  ├──────┼──────────────┼────────────────────────────────┼─────────┤
  │  1   │ ICMP echo    │ ISP-A Gateway  203.0.113.1     │    1    │
  │  2   │ ICMP echo    │ ISP-B Gateway  198.51.100.1    │    2    │
  │  3   │ UDP jitter   │ Branch Router  10.10.0.1       │    3    │
  │  4   │ HTTP get     │ Web Server     10.0.0.50       │    4    │
  │  5   │ DNS          │ DNS Server     10.0.0.53       │    5    │
  └──────┴──────────────┴────────────────────────────────┴─────────┘

  EEM Applet Pairs (down alert + up recovery):
  ┌─────────────────────────────┬────────────────────────────────┐
  │ ISPA-GW-DOWN / ISPA-GW-UP   │ Track 1 state transitions      │
  │ ISPB-GW-DOWN / ISPB-GW-UP   │ Track 2 state transitions      │
  │ BRANCH-JITTER-DOWN / -UP    │ Track 3 state transitions      │
  │ WEBSERVER-DOWN / -UP        │ Track 4 state transitions      │
  └─────────────────────────────┴────────────────────────────────┘
  

3. Step 1 — EEM Prerequisites and Environment Variables

Configure the global EEM prerequisites before writing any applets. These settings are shared across all four monitoring pairs.

NetsTuts_R1>en
NetsTuts_R1#conf t

! ══════════════════════════════════════════════════════════
! EEM CLI execution user — required for action cli command
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#username eem-user privilege 15 secret EEM$ecret99
NetsTuts_R1(config)#event manager session cli username eem-user

! ══════════════════════════════════════════════════════════
! EEM environment variables — centralised parameters
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#event manager environment _hostname     NetsTuts_R1
NetsTuts_R1(config)#event manager environment _email_server 10.0.0.25
NetsTuts_R1(config)#event manager environment _email_from   [email protected]
NetsTuts_R1(config)#event manager environment _email_to     [email protected]
NetsTuts_R1(config)#event manager environment _log_dir      flash:/sla-logs/

! ── Exit config mode, then create the log directory on flash
NetsTuts_R1(config)#end
NetsTuts_R1#mkdir flash:/sla-logs
Create directory filename [sla-logs]? [Enter]
Created dir flash:/sla-logs

! ── Verify NTP is synchronised — timestamps must be accurate
! ── for syslog correlation during outage post-mortems ─────
NetsTuts_R1#show ntp status | include Clock
Clock is synchronized, stratum 2, reference is 10.0.0.200
  
Accurate timestamps in syslog are essential for correlating IP SLA alerts with other events in the network — a syslog alert with the wrong time is worse than no alert because it actively misleads the investigation. Confirm NTP is synchronised with show ntp status before deploying probes. For NTP configuration, see NTP Synchronisation. For the full EEM prerequisites explanation, see EEM — Embedded Event Manager Scripting.

4. Step 2 — ICMP Echo Probes for WAN Gateway Monitoring

ICMP echo probes send periodic pings from a specific source interface to the ISP gateway. Using source-interface ensures the probe tests the actual WAN path and is sourced from the correct interface — not just the best-path reachability from the router's routing table.

! ══════════════════════════════════════════════════════════
! IP SLA 1 — ISP-A gateway reachability (primary WAN)
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 1
NetsTuts_R1(config-ip-sla)# icmp-echo 203.0.113.1 source-interface GigabitEthernet0/0
NetsTuts_R1(config-ip-sla-echo)#  frequency 30
NetsTuts_R1(config-ip-sla-echo)#  timeout 5000
NetsTuts_R1(config-ip-sla-echo)#  threshold 2000
NetsTuts_R1(config-ip-sla-echo)#  tag ISP-A-GW-MONITOR
NetsTuts_R1(config-ip-sla-echo)#exit
NetsTuts_R1(config)#ip sla schedule 1 life forever start-time now

! ══════════════════════════════════════════════════════════
! IP SLA 2 — ISP-B gateway reachability (backup WAN)
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 2
NetsTuts_R1(config-ip-sla)# icmp-echo 198.51.100.1 source-interface GigabitEthernet0/1
NetsTuts_R1(config-ip-sla-echo)#  frequency 30
NetsTuts_R1(config-ip-sla-echo)#  timeout 5000
NetsTuts_R1(config-ip-sla-echo)#  threshold 2000
NetsTuts_R1(config-ip-sla-echo)#  tag ISP-B-GW-MONITOR
NetsTuts_R1(config-ip-sla-echo)#exit
NetsTuts_R1(config)#ip sla schedule 2 life forever start-time now
  
The key parameters for a WAN gateway probe: frequency 30 sends one probe every 30 seconds — a good balance between detection speed and ICMP overhead on the WAN link. timeout 5000 declares the probe a failure if no response is received within 5,000 milliseconds (5 seconds). threshold 2000 marks the RTT as over-threshold when it exceeds 2,000 ms — this feeds the track N ip sla N state tracking type, which alerts on latency degradation even before complete packet loss occurs. source-interface is critical — without it, IOS sources the probe from the best available exit interface. If the WAN link fails but a LAN route to the gateway still exists (rare but possible), the probe would succeed via the LAN — giving a false healthy result for the WAN interface specifically.
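
The relationship between timeout and threshold is worth internalising: timeout decides pass/fail, threshold only decides over-threshold. A sketch of that classification logic (illustrative — a simplified model, not IOS internals):

```python
def classify_probe(rtt_ms, timeout_ms=5000, threshold_ms=2000):
    """Classify one probe result the way IP SLA return codes work:
    no reply within the timeout -> failure; a reply slower than the
    threshold -> success, but flagged over-threshold; otherwise OK.
    rtt_ms=None means no reply arrived at all."""
    if rtt_ms is None or rtt_ms > timeout_ms:
        return "Timeout"        # reachability tracking goes Down
    if rtt_ms > threshold_ms:
        return "OverThreshold"  # state tracking goes Down; reachability stays Up
    return "OK"

assert classify_probe(8) == "OK"             # healthy WAN
assert classify_probe(2500) == "OverThreshold"  # slow but answering
assert classify_probe(None) == "Timeout"     # no reply at all
```

This is the split that Step 6 exploits: one probe tracked on reachability for outages, a second tracked on state for degradation.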

Tracking Objects for ICMP Probes

! ── Track 1: reachability of ISP-A gateway ────────────────
NetsTuts_R1(config)#track 1 ip sla 1 reachability
NetsTuts_R1(config-track)# delay down 10 up 10
NetsTuts_R1(config-track)#exit

! ── Track 2: reachability of ISP-B gateway ────────────────
NetsTuts_R1(config)#track 2 ip sla 2 reachability
NetsTuts_R1(config-track)# delay down 10 up 10
NetsTuts_R1(config-track)#exit
  
The delay down 10 up 10 setting introduces a 10-second hold-down in both directions. Without it, a single dropped probe (possible during a momentary burst of congestion or a brief ICMP rate-limit on the ISP router) would immediately flip the tracking object to Down and fire the EEM alert. With delay down 10, the tracking object transitions to Down only if the failure condition still holds 10 seconds after the probe first reports it — enough to absorb a one-off timeout without materially delaying the alert for a real outage. The delay up 10 prevents a flapping link from generating rapid successive Down/Up alert pairs before it has truly stabilised. For detailed IP SLA and tracking configuration, see IP SLA Configuration & Tracking.
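
The damping effect of delay down/up can be simulated: a blip shorter than the hold-down never surfaces as a state change, while a sustained failure is reported exactly once. A Python sketch of the logic (timings and sampling are illustrative, not IOS internals):

```python
def damped_transitions(samples, delay_down_s, delay_up_s, period_s=1):
    """samples: probe pass/fail per period_s seconds, starting Up.
    A raw state change is reported only after persisting for the
    configured delay -- mirroring 'delay down X up Y' on a track."""
    reported, pending, pending_for = "Up", None, 0
    events = []
    for t, ok in enumerate(samples):
        raw = "Up" if ok else "Down"
        if raw == reported:                 # condition cleared: cancel hold-down
            pending, pending_for = None, 0
            continue
        if raw != pending:                  # new divergence: start the timer
            pending, pending_for = raw, 0
        pending_for += period_s
        delay = delay_down_s if raw == "Down" else delay_up_s
        if pending_for >= delay:            # held long enough: report it
            reported = raw
            events.append((t * period_s, raw))
            pending, pending_for = None, 0
    return events

# A 3-second blip is absorbed entirely:
blip = [True] * 5 + [False] * 3 + [True] * 20
assert damped_transitions(blip, delay_down_s=10, delay_up_s=10) == []

# A sustained outage yields one Down and one Up event:
outage = [True] * 5 + [False] * 30 + [True] * 30
print(damped_transitions(outage, delay_down_s=10, delay_up_s=10))
```

The trade-off is explicit in the model: a larger delay suppresses more noise but adds exactly that many seconds to detection time.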

5. Step 3 — UDP Jitter Probe for Voice Quality Monitoring

UDP jitter probes measure the metrics that matter for voice and video quality: jitter (delay variation), packet loss, and out-of-order delivery. Unlike ICMP echo, UDP jitter requires a Cisco IP SLA Responder on the target device. The responder timestamps probe packets on receipt and transmit, removing its own processing delay from the measurement; with both routers' clocks NTP-synchronised, this also yields meaningful one-way delay figures in each direction.

Configure the Responder on the Branch Router

! ── On Branch_Router — enable IP SLA Responder ────────────
Branch_Router>en
Branch_Router#conf t
Branch_Router(config)#ip sla responder
! ── No per-port binding needed: "ip sla responder" alone is
! ── sufficient for udp-jitter, because the control protocol
! ── negotiates the test port with the probe at run time ────
Branch_Router(config)#end
Branch_Router#wr
  

Configure the UDP Jitter Probe on NetsTuts_R1

! ══════════════════════════════════════════════════════════
! IP SLA 3 — UDP jitter to branch (voice quality)
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 3
NetsTuts_R1(config-ip-sla)# udp-jitter 10.10.0.1 5000 source-ip 10.0.0.1 \
   source-port 5001 num-packets 20 interval 20
NetsTuts_R1(config-ip-sla-jitter)#  frequency 60
NetsTuts_R1(config-ip-sla-jitter)#  timeout 5000
NetsTuts_R1(config-ip-sla-jitter)#  threshold 100
NetsTuts_R1(config-ip-sla-jitter)#  tag BRANCH-VOIP-MONITOR
NetsTuts_R1(config-ip-sla-jitter)#exit
NetsTuts_R1(config)#ip sla schedule 3 life forever start-time now

! ── Track 3: state of jitter probe ───────────────────────
! ── Uses "state" not "reachability" — detects threshold
! ── violations even without complete packet loss ──────────
NetsTuts_R1(config)#track 3 ip sla 3 state
NetsTuts_R1(config-track)# delay down 15 up 30
NetsTuts_R1(config-track)#exit
  
The UDP jitter probe sends num-packets 20 test packets spaced interval 20 milliseconds apart — simulating a short burst of RTP voice packets. threshold 100 marks the operation over-threshold when the average RTT exceeds 100 ms; the ITU-T G.114 recommendation for one-way voice delay is 150 ms, so a 100 ms round-trip threshold provides early warning before voice quality audibly degrades. The probe also calculates a MOS (Mean Opinion Score), visible in show ip sla statistics details — a MOS below roughly 3.5 is generally considered unacceptable for business VoIP, and per-metric reactions (MOS, jitter, packet loss) can additionally be configured with ip sla reaction-configuration. For QoS configuration that protects voice traffic on the WAN link, see QoS Overview. delay up 30 is longer than delay down 15 because voice quality must hold steady for 30 seconds before recovery is declared — a brief improvement followed by another degradation would otherwise generate rapid Down/Up alert pairs.
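
Jitter in this context means inter-packet delay variation: the difference between consecutive one-way delays, so 20 packets yield 19 jitter samples. A Python sketch of the computation on an invented delay series (numbers are made up for illustration):

```python
def jitter_stats(one_way_delays_ms):
    """Per-direction jitter as IP SLA reports it: the absolute
    difference between consecutive packet delays
    (N packets -> N-1 jitter samples)."""
    samples = [abs(b - a)
               for a, b in zip(one_way_delays_ms, one_way_delays_ms[1:])]
    return {
        "samples": len(samples),
        "min": min(samples),
        "avg": round(sum(samples) / len(samples), 2),
        "max": max(samples),
    }

# 20 packets sent 20 ms apart, one-way delays wobbling around 11 ms:
delays = [11, 11, 12, 10, 11, 11, 13, 11, 11, 12,
          11, 11, 11, 14, 11, 11, 11, 12, 11, 11]
print(jitter_stats(delays))
```

A stable link shows a low average with the occasional small spike, exactly like the Min/Avg/Max jitter lines in the verification output later in this lab.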

6. Step 4 — HTTP and DNS Application Probes

ICMP and UDP probes test Layer 3/4 connectivity. HTTP and DNS probes test the application layer — a server can be pingable while its web service or DNS resolver is down. These probes catch application failures that ICMP monitoring misses entirely.

! ══════════════════════════════════════════════════════════
! IP SLA 4 — HTTP GET to internal web server
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 4
NetsTuts_R1(config-ip-sla)# http get http://10.0.0.50/health source-ip 10.0.0.1
NetsTuts_R1(config-ip-sla-http)#  frequency 60
NetsTuts_R1(config-ip-sla-http)#  timeout 10000
NetsTuts_R1(config-ip-sla-http)#  threshold 5000
NetsTuts_R1(config-ip-sla-http)#  tag WEBSERVER-HTTP-MONITOR
NetsTuts_R1(config-ip-sla-http)#exit
NetsTuts_R1(config)#ip sla schedule 4 life forever start-time now

! ── Track 4: reachability of HTTP probe ───────────────────
NetsTuts_R1(config)#track 4 ip sla 4 reachability
NetsTuts_R1(config-track)# delay down 15 up 15
NetsTuts_R1(config-track)#exit

! ══════════════════════════════════════════════════════════
! IP SLA 5 — DNS resolution test
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 5
NetsTuts_R1(config-ip-sla)# dns netstuts.com name-server 10.0.0.53 \
   source-ip 10.0.0.1
NetsTuts_R1(config-ip-sla-dns)#  frequency 60
NetsTuts_R1(config-ip-sla-dns)#  timeout 5000
NetsTuts_R1(config-ip-sla-dns)#  tag DNS-RESOLVER-MONITOR
NetsTuts_R1(config-ip-sla-dns)#exit
NetsTuts_R1(config)#ip sla schedule 5 life forever start-time now

! ── Track 5: reachability of DNS probe ───────────────────
NetsTuts_R1(config)#track 5 ip sla 5 reachability
NetsTuts_R1(config-track)# delay down 15 up 15
NetsTuts_R1(config-track)#exit
  
The HTTP probe fetches /health — a lightweight status endpoint that returns HTTP 200 if the application is running. A full page fetch would work but wastes bandwidth on every probe cycle. If the web server is responding but the application is down (returning HTTP 500), the IP SLA HTTP probe detects this because it checks for a successful HTTP response code — not just TCP connectivity. The DNS probe resolves netstuts.com against the specific internal DNS server 10.0.0.53 — it will fail if the DNS server is unreachable or if the resolver cannot resolve the name, but will succeed even if external DNS is down (as long as the internal resolver is healthy). Track 5 is left without a dedicated applet pair in this lab for brevity — a DNS-RESOLVER-DOWN / -UP pair would follow exactly the same pattern as the web server applets in Step 5. For static routing configuration that uses tracking objects for WAN failover, see Static Routing Configuration.
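
The same application-layer check can be reproduced from any monitoring host: fetch the endpoint and require a 2xx status, not merely a completed TCP connection. A Python stdlib sketch (the commented-out URL is this lab's example address):

```python
import time
import urllib.error
import urllib.request

def http_health_probe(url, timeout_s=10.0):
    """Return (ok, status, elapsed_ms). ok requires an HTTP 2xx --
    a reachable server answering 500 still counts as a failure,
    matching the HTTP probe's return-code check."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            elapsed = (time.monotonic() - start) * 1000
            return 200 <= resp.status < 300, resp.status, elapsed
    except urllib.error.HTTPError as e:        # server answered, but not 2xx
        return False, e.code, (time.monotonic() - start) * 1000
    except (urllib.error.URLError, OSError):   # no TCP/HTTP answer at all
        return False, None, (time.monotonic() - start) * 1000

# ok, status, ms = http_health_probe("http://10.0.0.50/health")
```

The three outcomes map directly onto troubleshooting tiers: 2xx (healthy), non-2xx (server up, application broken), no answer (server or network down).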

7. Step 5 — EEM Applets: Down Alert and Up Recovery Pairs

Each monitoring target needs two applets — one that fires when the tracking object goes Down (alert) and one that fires when it returns to Up (recovery notification). Without the recovery applet, the NOC team has no automated confirmation that an outage has ended and must manually verify resolution.

ISP-A Gateway — Down and Up Applets

! ══════════════════════════════════════════════════════════
! ISPA-GW-DOWN — fires when track 1 transitions to Down
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#event manager applet ISPA-GW-DOWN
NetsTuts_R1(config-applet)# description "Alert: ISP-A gateway unreachable"
NetsTuts_R1(config-applet)# event track 1 state down maxrun 90

! ── ACTION 1: Critical syslog alert ──────────────────────
NetsTuts_R1(config-applet)# action 1.0 syslog priority critical \
   msg "*** OUTAGE *** ISP-A gateway 203.0.113.1 UNREACHABLE on $_hostname — SLA probe failing"

! ── ACTION 2: Capture SLA statistics at moment of failure ─
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
   "show ip sla statistics 1 | redirect flash:/sla-logs/ispa-failure.txt"

! ── ACTION 3: Capture interface state ────────────────────
NetsTuts_R1(config-applet)# action 3.0 cli command \
   "show interfaces GigabitEthernet0/0 | redirect flash:/sla-logs/ispa-intf.txt"

! ── ACTION 4: Capture routing table — confirm failover ────
NetsTuts_R1(config-applet)# action 4.0 cli command \
   "show ip route | redirect flash:/sla-logs/ispa-route.txt"

! ── ACTION 5: Email NOC team ──────────────────────────────
NetsTuts_R1(config-applet)# action 5.0 mail server "$_email_server" \
   to "$_email_to" \
   from "$_email_from" \
   subject "*** OUTAGE: ISP-A Gateway DOWN on $_hostname ***" \
   body "ALERT: IP SLA probe 1 reports ISP-A gateway 203.0.113.1 \
   is UNREACHABLE from $_hostname. \
   Diagnostics saved to flash:/sla-logs/. \
   Verify routing failover to ISP-B is active. \
   Check show ip route on the router."

NetsTuts_R1(config-applet)#exit

! ══════════════════════════════════════════════════════════
! ISPA-GW-UP — fires when track 1 transitions back to Up
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#event manager applet ISPA-GW-UP
NetsTuts_R1(config-applet)# description "Recovery: ISP-A gateway reachable again"
NetsTuts_R1(config-applet)# event track 1 state up maxrun 60

! ── ACTION 1: Informational syslog — recovery ─────────────
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
   msg "*** RECOVERY *** ISP-A gateway 203.0.113.1 REACHABLE again on $_hostname"

! ── ACTION 2: Capture SLA statistics after recovery ───────
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
   "show ip sla statistics 1 | redirect flash:/sla-logs/ispa-recovery.txt"

! ── ACTION 3: Email recovery notification ─────────────────
NetsTuts_R1(config-applet)# action 3.0 mail server "$_email_server" \
   to "$_email_to" \
   from "$_email_from" \
   subject "RECOVERY: ISP-A Gateway restored on $_hostname" \
   body "RECOVERY: IP SLA probe 1 reports ISP-A gateway 203.0.113.1 \
   is now REACHABLE from $_hostname. \
   Verify primary routing has been restored. \
   Check routing table to confirm ISP-A routes are active."

NetsTuts_R1(config-applet)#exit
  

ISP-B Gateway — Down and Up Applets

NetsTuts_R1(config)#event manager applet ISPB-GW-DOWN
NetsTuts_R1(config-applet)# description "Alert: ISP-B gateway unreachable"
NetsTuts_R1(config-applet)# event track 2 state down maxrun 90
NetsTuts_R1(config-applet)# action 1.0 syslog priority critical \
   msg "*** OUTAGE *** ISP-B gateway 198.51.100.1 UNREACHABLE on $_hostname"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
   "show ip sla statistics 2 | redirect flash:/sla-logs/ispb-failure.txt"
NetsTuts_R1(config-applet)# action 3.0 cli command \
   "show interfaces GigabitEthernet0/1 | redirect flash:/sla-logs/ispb-intf.txt"
NetsTuts_R1(config-applet)# action 4.0 mail server "$_email_server" \
   to "$_email_to" \
   from "$_email_from" \
   subject "OUTAGE: ISP-B Gateway DOWN on $_hostname" \
   body "IP SLA probe 2 reports ISP-B gateway 198.51.100.1 \
   is UNREACHABLE from $_hostname."
NetsTuts_R1(config-applet)#exit

NetsTuts_R1(config)#event manager applet ISPB-GW-UP
NetsTuts_R1(config-applet)# description "Recovery: ISP-B gateway reachable again"
NetsTuts_R1(config-applet)# event track 2 state up maxrun 30
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
   msg "*** RECOVERY *** ISP-B gateway 198.51.100.1 REACHABLE again on $_hostname"
NetsTuts_R1(config-applet)# action 2.0 mail server "$_email_server" \
   to "$_email_to" \
   from "$_email_from" \
   subject "RECOVERY: ISP-B Gateway restored on $_hostname" \
   body "IP SLA probe 2 reports ISP-B gateway 198.51.100.1 \
   is REACHABLE from $_hostname."
NetsTuts_R1(config-applet)#exit
  

Branch Jitter — Down and Up Applets

NetsTuts_R1(config)#event manager applet BRANCH-JITTER-DOWN
NetsTuts_R1(config-applet)# description "Alert: Branch voice quality degraded"
NetsTuts_R1(config-applet)# event track 3 state down maxrun 90
NetsTuts_R1(config-applet)# action 1.0 syslog priority critical \
   msg "*** VOIP DEGRADED *** Branch UDP jitter probe failing on $_hostname — check WAN QoS"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
   "show ip sla statistics 3 details | redirect flash:/sla-logs/branch-jitter.txt"
NetsTuts_R1(config-applet)# action 3.0 cli command \
   "show policy-map interface GigabitEthernet0/0 | \
   redirect flash:/sla-logs/branch-qos.txt"
NetsTuts_R1(config-applet)# action 4.0 mail server "$_email_server" \
   to "$_email_to" \
   from "$_email_from" \
   subject "VOIP QUALITY ALERT: Branch jitter threshold exceeded on $_hostname" \
   body "UDP jitter probe 3 to branch (10.10.0.1) reports threshold \
   violation on $_hostname. VoIP quality may be degraded. \
   Check WAN QoS policy and interface utilisation."
NetsTuts_R1(config-applet)#exit

NetsTuts_R1(config)#event manager applet BRANCH-JITTER-UP
NetsTuts_R1(config-applet)# description "Recovery: Branch voice quality restored"
NetsTuts_R1(config-applet)# event track 3 state up maxrun 30
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
   msg "*** VOIP RESTORED *** Branch jitter probe back in threshold on $_hostname"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
   "show ip sla statistics 3 | redirect flash:/sla-logs/branch-jitter-recovery.txt"
NetsTuts_R1(config-applet)#exit
  

Web Server — Down and Up Applets

NetsTuts_R1(config)#event manager applet WEBSERVER-DOWN
NetsTuts_R1(config-applet)# description "Alert: Internal web server HTTP probe failing"
NetsTuts_R1(config-applet)# event track 4 state down maxrun 60
NetsTuts_R1(config-applet)# action 1.0 syslog priority critical \
   msg "*** OUTAGE *** Web server HTTP probe FAILING on $_hostname — http://10.0.0.50"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
   "show ip sla statistics 4 | redirect flash:/sla-logs/webserver-failure.txt"
NetsTuts_R1(config-applet)# action 3.0 mail server "$_email_server" \
   to "$_email_to" \
   from "$_email_from" \
   subject "OUTAGE: Web server (10.0.0.50) DOWN on $_hostname" \
   body "IP SLA HTTP probe 4 cannot reach http://10.0.0.50/health. \
   Server may be down or the application is not responding. \
   Escalate to the application team."
NetsTuts_R1(config-applet)#exit

NetsTuts_R1(config)#event manager applet WEBSERVER-UP
NetsTuts_R1(config-applet)# description "Recovery: Web server HTTP responding again"
NetsTuts_R1(config-applet)# event track 4 state up maxrun 30
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
   msg "*** RECOVERY *** Web server HTTP probe SUCCEEDING on $_hostname"
NetsTuts_R1(config-applet)# action 2.0 mail server "$_email_server" \
   to "$_email_to" \
   from "$_email_from" \
   subject "RECOVERY: Web server (10.0.0.50) restored on $_hostname" \
   body "IP SLA HTTP probe 4 reports http://10.0.0.50/health is \
   responding successfully. Application appears to be restored."
NetsTuts_R1(config-applet)#exit

NetsTuts_R1(config)#end
NetsTuts_R1#wr
  
The event track N state up recovery applet is as important as the down alert. Without it, the NOC team must either manually poll the router for track state or wait for the next monitoring cycle on their NMS to confirm resolution. An automated recovery notification closes the incident loop: the on-call engineer receives the down alert, works the issue, and receives the recovery alert — no manual verification step needed. For OSPF deployments where the tracking object also controls route injection or redistribution, recovery is especially critical to confirm the primary route has been re-advertised — see OSPF Single-Area Configuration. For HSRP/FHRP integration with tracking, see FHRP — HSRP, VRRP & GLBP and HSRP.
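
On the receiving side, the down/up message pairs make automated incident handling straightforward: a collector opens an incident on an OUTAGE line and closes it on the matching RECOVERY line. A minimal Python sketch (the message formats follow the applets above; the correlation logic itself is illustrative):

```python
import re

# Patterns keyed to the applet messages: "<subject> UNREACHABLE" opens
# an incident, "<subject> REACHABLE" closes it.
OUTAGE = re.compile(r"\*\*\* OUTAGE \*\*\* (?P<what>.+?) UNREACHABLE")
RECOVERY = re.compile(r"\*\*\* RECOVERY \*\*\* (?P<what>.+?) REACHABLE")

def correlate(lines):
    """Return the set of incidents still open after processing the log."""
    open_incidents = set()
    for line in lines:
        if m := OUTAGE.search(line):
            open_incidents.add(m["what"])
        elif m := RECOVERY.search(line):
            open_incidents.discard(m["what"])
    return open_incidents

log = [
    "*** OUTAGE *** ISP-A gateway 203.0.113.1 UNREACHABLE on NetsTuts_R1 — SLA probe failing",
    "*** OUTAGE *** ISP-B gateway 198.51.100.1 UNREACHABLE on NetsTuts_R1",
    "*** RECOVERY *** ISP-A gateway 203.0.113.1 REACHABLE again on NetsTuts_R1",
]
print(correlate(log))   # only the ISP-B incident remains open
```

This only works because the down and up applets use symmetric, machine-parseable message prefixes — a good reason to keep the *** OUTAGE *** / *** RECOVERY *** convention consistent across all applet pairs.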

8. Step 6 — Advanced: RTT Threshold Alerting (Degradation Before Failure)

Reachability probes alert only when a target becomes completely unreachable. RTT threshold alerting goes further — it generates an alert when the link is still up but latency has degraded to a level that impacts applications. This gives the NOC team early warning before users start complaining.

! ══════════════════════════════════════════════════════════
! IP SLA 6 — ISP-A with RTT threshold alerting
! Alerts if RTT exceeds 100ms even if probe still succeeds
! ══════════════════════════════════════════════════════════
NetsTuts_R1(config)#ip sla 6
NetsTuts_R1(config-ip-sla)# icmp-echo 203.0.113.1 source-interface GigabitEthernet0/0
NetsTuts_R1(config-ip-sla-echo)#  frequency 30
NetsTuts_R1(config-ip-sla-echo)#  timeout 5000
NetsTuts_R1(config-ip-sla-echo)#  threshold 100
NetsTuts_R1(config-ip-sla-echo)#  tag ISP-A-LATENCY-MONITOR
NetsTuts_R1(config-ip-sla-echo)#exit
NetsTuts_R1(config)#ip sla schedule 6 life forever start-time now

! ── Track 6 on STATE (not reachability)
! ── "state" fires when probe is over-threshold, even if
! ── the probe technically succeeds (not a timeout)
NetsTuts_R1(config)#track 6 ip sla 6 state
NetsTuts_R1(config-track)# delay down 20 up 30
NetsTuts_R1(config-track)#exit

! ── EEM applet for latency degradation alert ──────────────
NetsTuts_R1(config)#event manager applet ISPA-LATENCY-HIGH
NetsTuts_R1(config-applet)# description "Alert: ISP-A RTT above 100ms threshold"
NetsTuts_R1(config-applet)# event track 6 state down
NetsTuts_R1(config-applet)#  maxrun 60
NetsTuts_R1(config-applet)# action 1.0 syslog priority warning \
   msg "*** LATENCY WARNING *** ISP-A RTT exceeded 100ms threshold on $_hostname — link degraded"
NetsTuts_R1(config-applet)# action 2.0 cli command "enable"
NetsTuts_R1(config-applet)# action 2.1 cli command \
   "show ip sla statistics 6 details | redirect flash:/sla-logs/ispa-latency.txt"
NetsTuts_R1(config-applet)#exit

NetsTuts_R1(config)#event manager applet ISPA-LATENCY-NORMAL
NetsTuts_R1(config-applet)# description "Recovery: ISP-A RTT back below threshold"
NetsTuts_R1(config-applet)# event track 6 state up
NetsTuts_R1(config-applet)#  maxrun 30
NetsTuts_R1(config-applet)# action 1.0 syslog priority notice \
   msg "*** LATENCY NORMAL *** ISP-A RTT back below 100ms threshold on $_hostname"
NetsTuts_R1(config-applet)#exit
  
The distinction between track N ip sla N reachability and track N ip sla N state is subtle but important for latency alerting. Reachability is binary: the probe either gets a response within the timeout window or it does not. A probe that takes 4,900 ms to respond (still within the 5,000 ms timeout) is considered "reachable" — no alert fires even though the WAN is almost unusable. State incorporates the threshold value: the probe is considered over-threshold when the RTT exceeds the configured threshold, and the tracking object transitions to Down even though technically the probe is still receiving responses. This pattern — reachability probe for outage alerting, state probe for degradation alerting — gives two distinct alert tiers: Warning (high latency) and Critical (complete outage).
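
The two-tier design reduces to a small decision table: which tracking types go Down for a given RTT? A sketch covering the 4,900 ms case described above (illustrative model, not IOS internals):

```python
def tiers(rtt_ms, timeout_ms=5000, threshold_ms=100):
    """Which tracking types would go Down for a given RTT?
    Reachability reacts only to a timeout; state also reacts
    to an over-threshold RTT -- the two alert tiers."""
    if rtt_ms is None or rtt_ms > timeout_ms:
        return ["reachability", "state"]    # complete outage: Critical
    if rtt_ms > threshold_ms:
        return ["state"]                    # degradation only: Warning
    return []                               # healthy: both tracks stay Up

assert tiers(8) == []                               # healthy baseline
assert tiers(4900) == ["state"]                     # "reachable", yet degraded
assert tiers(None) == ["reachability", "state"]     # timeout: full outage
```

The 4,900 ms case is the whole argument for probe 6: reachability tracking alone would report this link as perfectly healthy.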

9. Verification

show ip sla statistics — Per-Probe Results

NetsTuts_R1#show ip sla statistics

IPSLAs Latest Operation Statistics

IPSLA operation id: 1
        Latest RTT: 8 milliseconds
Latest operation start time: 14:35:30 UTC Wed Oct 16 2024
Latest operation return code: OK
Number of successes: 142
Number of failures: 0
Operation time to live: Forever

IPSLA operation id: 2
        Latest RTT: 12 milliseconds
Latest operation start time: 14:35:33 UTC Wed Oct 16 2024
Latest operation return code: OK
Number of successes: 141
Number of failures: 0
Operation time to live: Forever

IPSLA operation id: 3
        Latest RTT: 24 milliseconds
Latest operation start time: 14:35:00 UTC Wed Oct 16 2024
Latest operation return code: OK
Number of successes: 71
Number of failures: 0
Operation time to live: Forever

IPSLA operation id: 4
        Latest RTT: 87 milliseconds
Latest operation start time: 14:35:00 UTC Wed Oct 16 2024
Latest operation return code: OK
Number of successes: 70
Number of failures: 0
Operation time to live: Forever
  
Every probe in this output reports Latest operation return code: OK with zero failures — all monitored targets are currently reachable and within thresholds (operations 5 and 6 are not shown here). The RTT values (8 ms ISP-A, 12 ms ISP-B, 24 ms branch jitter, 87 ms HTTP) establish the baseline for each target. When an outage occurs, the failing operation instead shows return code: Timeout and an incrementing failure count.

show ip sla statistics details — Rich Jitter Metrics

NetsTuts_R1#show ip sla statistics 3 details

IPSLAs Latest Operation Statistics

IPSLA operation id: 3
Type of operation: UDP Jitter
        Latest RTT: 24 ms
Latest operation start time: 14:35:00 UTC Wed Oct 16 2024
Latest operation return code: OK
RTT Values:
        Number Of RTT: 20        RTT Min/Avg/Max: 22/24/31 milliseconds
Latency one-way time:
        Number of Latency one-way Samples: 20
        Source to Destination Latency one way Min/Avg/Max: 9/11/14 ms
        Destination to Source Latency one way Min/Avg/Max: 12/13/17 ms
Jitter Time:
        Num of SD Jitter Samples: 19
        Num of DS Jitter Samples: 19
        Source to Destination Jitter Min/Avg/Max: 0/1/4 ms
        Destination to Source Jitter Min/Avg/Max: 0/1/3 ms
Packet Loss Values:
        Loss Source to Destination: 0      Loss Destination to Source: 0
        Out Of Sequence: 0                 Tail Drop: 0
        Skipped: 0                         Late Arrival: 0
Voice Score Values:
        Calculated Planning Impairment Factor (ICPIF): 0
        MOS score: 4.40
Number of successes: 71
Number of failures: 0
Operation time to live: Forever
  
The UDP jitter statistics output shows every voice-quality metric in detail: MOS score: 4.40 (above the 3.60 threshold — excellent quality), one-way latency split into source-to-destination and destination-to-source components, per-direction jitter, and packet loss. This level of detail is impossible to obtain from a simple ICMP echo probe. When the EEM alert fires for this probe, the diagnostic file saved to flash contains this full output at the exact moment of the degradation — invaluable for pinpointing whether the issue is asymmetric (one direction only), latency-only without packet loss, or complete loss.
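As a hedged sketch, a udp-jitter operation along the following lines produces the metrics shown above; the target address, port, and packet count are illustrative, and the MOS/ICPIF fields only appear when a codec keyword is configured:

ip sla 3
 udp-jitter 10.20.0.1 5000 codec g711alaw codec-numpackets 20
 frequency 60
ip sla schedule 3 life forever start-time now
!
! on the target router -- without this the probe returns Busy / No Connection
ip sla responder

The codec keyword is what tells IOS to compute ICPIF and MOS from the measured delay, jitter, and loss.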

show track — Tracking Object States

NetsTuts_R1#show track

Track 1
  IP SLA 1 Reachability
  Reachability is Up
    2 changes, last change 00:47:23
  Latest operation return code: OK
  Latest RTT (millisecs) 8
  Tracked by:
    ISPA-GW-DOWN (EEM)
    ISPA-GW-UP   (EEM)

Track 2
  IP SLA 2 Reachability
  Reachability is Up
    1 change, last change 02:15:44

Track 3
  IP SLA 3 State
  State is Up
    3 changes, last change 00:12:05

Track 4
  IP SLA 4 Reachability
  Reachability is Up
    1 change, last change 04:30:10

Track 5
  IP SLA 5 Reachability
  Reachability is Up
    1 change, last change 04:30:12

Track 6
  IP SLA 6 State
  State is Up
    4 changes, last change 00:05:32
  
show track is the single most important operational command for this monitoring system. It shows the current state of every tracking object, the number of state changes (a high change count on track 6 suggests recurring latency bursts), and the time since the last state change. Track 1 shows 2 changes — the link was down and has since recovered, which should correlate with the ISPA-GW-DOWN and ISPA-GW-UP alerts in the syslog. The Tracked by: EEM lines confirm the applets are registered against this tracking object.

show ip sla statistics aggregated — Historical Performance

NetsTuts_R1#show ip sla statistics aggregated 1

IPSLAs Aggregated Statistics

IPSLA operation id: 1
Start Time Index: 14:00:00 UTC Wed Oct 16 2024
        Aggregation interval: 900 seconds (15 minutes)

  Round-Trip-Time (RTT) Values
        Num of Measurements: 30     Min RTT: 7 ms
        Max RTT: 145 ms             Avg RTT: 9 ms
        Over thresholds: 2          (2 probes exceeded 2000ms threshold)

  Number of successes: 28
  Number of failures: 2
  Completion Time: 14:15:00 UTC Wed Oct 16 2024
  
show ip sla statistics aggregated shows the last 15-minute (configurable) window of probe results. This reveals intermittent problems that the instantaneous show ip sla statistics misses — in this example, 2 of 30 probes in the last 15 minutes failed (Number of failures: 2) and 2 exceeded the 2,000 ms threshold. The current probe shows OK, but the aggregated history shows the link is experiencing intermittent connectivity issues. This is the difference between point-in-time monitoring and trend analysis.
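The aggregation window is tunable per operation. A hedged sketch using the enhanced-history subcommand (the interval and bucket count are illustrative values, not this lab's configuration):

ip sla 1
 ! keep 100 aggregation buckets of 900 seconds (15 minutes) each
 history enhanced interval 900 buckets 100

Larger bucket counts extend how far back the aggregated trend data reaches at the cost of a little memory.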

show logging — Confirm Alert Flow

NetsTuts_R1#show logging | include SLA|OUTAGE|RECOVERY|HA_EM

Oct 16 14:32:01: %TRACK-6-STATE: 1 ip sla 1 reachability Up->Down
Oct 16 14:32:11: %HA_EM-2-LOG: ISPA-GW-DOWN: *** OUTAGE *** ISP-A gateway \
   203.0.113.1 UNREACHABLE on NetsTuts_R1 — SLA probe failing
Oct 16 14:32:12: %HA_EM-6-LOG: ISPA-GW-DOWN: diagnostic files saved to \
   flash:/sla-logs/
Oct 16 14:38:45: %TRACK-6-STATE: 1 ip sla 1 reachability Down->Up
Oct 16 14:38:55: %HA_EM-5-LOG: ISPA-GW-UP: *** RECOVERY *** ISP-A gateway \
   203.0.113.1 REACHABLE again on NetsTuts_R1
  
The syslog shows the complete event timeline: at 14:32:01 the tracking object transitions Up->Down (an IOS-generated message — by this point the delay down 10 hold-down has already elapsed, so the probe actually began failing roughly 10 seconds earlier). The EEM applet ISPA-GW-DOWN fires on that transition, and its CRITICAL alert appears at 14:32:11 once its diagnostic-collection actions have completed. At 14:38:45 the tracking object transitions Down->Up and the ISPA-GW-UP recovery applet fires. The outage lasted approximately 6 minutes and 44 seconds — a timeline now permanently recorded in the syslog, while the diagnostic files on flash allow post-mortem analysis of the router's state at the moment of failure. For forwarding these alerts to a central server, see Syslog Server Configuration.
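The applet pair that produces this alert/recovery sequence looks roughly like the following; the full lab versions also save diagnostics to flash and send email, so only the syslog actions are sketched here:

event manager applet ISPA-GW-DOWN
 event track 1 state down
 action 1.0 syslog priority critical msg "*** OUTAGE *** ISP-A gateway 203.0.113.1 UNREACHABLE on NetsTuts_R1 - SLA probe failing"
!
event manager applet ISPA-GW-UP
 event track 1 state up
 action 1.0 syslog priority notifications msg "*** RECOVERY *** ISP-A gateway 203.0.113.1 REACHABLE again on NetsTuts_R1"

The priority keywords map directly to the severities seen in the log: critical produces %HA_EM-2-LOG, notifications produces %HA_EM-5-LOG.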

Verification Command Summary

  • show ip sla statistics: all probes — latest RTT, return code (OK/Timeout/Error), success and failure counts since last reset. Use for an instant health check: confirm all probes return OK with zero or low failure counts.
  • show ip sla statistics [N] details: single probe in full detail — per-direction RTT, jitter, packet loss, MOS, threshold violations. Use for a deep-dive on a specific probe, especially UDP jitter, to verify voice-quality metrics.
  • show ip sla statistics aggregated [N]: 15-minute aggregated window — min/avg/max RTT, total successes/failures, over-threshold count. Use to identify intermittent issues the current probe misses; reveals patterns over time.
  • show ip sla configuration [N]: full probe configuration — target IP, source, frequency, timeout, threshold, tag, schedule. Use to verify the probe is configured correctly: source interface, frequency, and threshold values.
  • show track: all tracking objects — current state (Up/Down), change count, last change time, EEM subscribers. Use to confirm tracking objects are Up and EEM applets are registered; a high change count indicates a flapping probe.
  • show track [N]: single tracking object detail — linked IP SLA operation, reachability/state type, delay settings. Use to verify the delay down/up settings and confirm the linked IP SLA operation number.
  • show event manager policy registered: all EEM applets — name, event type (track), registered track number, registration time. Use to confirm all down and up applets are registered against the correct track object numbers.
  • show event manager history events: EEM execution history — applet name, event type, execution time. Use to verify applets fired when expected; cross-reference with show logging timestamps.
  • show logging | include TRACK|HA_EM: all tracking state changes and EEM syslog actions in the log buffer. Use to build the complete timeline, correlating track state transitions with EEM alert timestamps.
  • dir flash:/sla-logs/: diagnostic files written by EEM actions. Use to confirm files are being created at failure events; review captured output with more flash:/sla-logs/[file].txt.

10. Troubleshooting IP SLA + EEM Monitoring

  • Probe shows continuous failures but target is reachable
    Symptom: show ip sla statistics 1 shows return code: Timeout and an incrementing failure count even though manual pings to the target succeed.
    Cause: The probe is not using source-interface and is being sourced from a different interface than intended, or the target device rate-limits ICMP — the frequent periodic probes get dropped while occasional manual pings slip through. Alternatively, the ISP gateway blocks ICMP from some source IPs (such as the router's loopback) but not others.
    Fix: Add source-interface GigabitEthernet0/0 to the probe configuration — this forces the probe to use the WAN interface IP as its source, matching the exact path to the gateway. Verify the source IP actually in use with show ip sla configuration 1. If rate-limiting is the issue, increase the frequency to 60 seconds to reduce the ICMP rate, or switch to a tcp-connect probe on a port the gateway accepts.

  • EEM applet does not fire when track state changes
    Symptom: show track shows the track is Down, but show event manager history events shows no execution of the ISPA-GW-DOWN applet.
    Cause: The applet is registered against the wrong track object number — for example, the applet says event track 2 state down but the tracking object for ISP-A is track 1.
    Fix: Run show event manager policy registered and confirm the applet shows the correct track number. Run show track 1 and note the tracking type (Reachability vs State) — note that event track N state down works with both reachability and state tracking objects; "state down" simply means the object transitioned to Down, regardless of tracking type. Re-check that the track number in the applet matches the number shown in show track.

  • Track object flapping — repeated Down/Up transitions
    Symptom: show track 1 shows a very high change count (50+ changes in an hour); the EEM applet fires repeatedly, flooding syslog and the NOC inbox with alternating OUTAGE/RECOVERY emails.
    Cause: The track delay values are too short — a single dropped probe immediately triggers a Down transition, the next successful probe triggers Up, and so on. Or the WAN link is genuinely unstable (physical-layer issue, ISP congestion).
    Fix: Increase the track delay values: under track 1 ip sla 1 reachability, configure delay down 30 up 60. This requires 30 consecutive seconds of failure before declaring Down (approximately one missed probe at frequency 30) and 60 seconds of continuous success before declaring recovery. Add ratelimit 600 to the EEM event clause as additional protection. Investigate the underlying WAN stability separately with show ip sla statistics aggregated 1 to see the failure pattern.

  • UDP jitter probe shows return code: Busy or No Connection
    Symptom: show ip sla statistics 3 shows a return code other than OK — specifically Busy, No Connection, or Timeout.
    Cause: Busy means the responder is not enabled or not listening on the configured port on the target router. No Connection means IP connectivity exists but the responder is not accepting UDP on port 5000. Timeout means no response at all — possible if the probe can reach the router but the responder is not running.
    Fix: Verify the responder on the branch router: SSH to Branch_Router and run show ip sla responder — it should show the UDP responder listening on port 5000. If not, reconfigure: ip sla responder and ip sla responder udp-echo ipaddress 10.10.0.1 port 5000. Confirm the ACL on the branch router does not block UDP 5000, and verify the probe's source port does not conflict with other probes (each probe needs a unique source port).

  • IP SLA probe stops running after a reload
    Symptom: After a router reload, show ip sla statistics shows old data but no new measurements — the probe is not generating new results.
    Cause: The ip sla schedule was configured with a specific start time in the past (start-time 14:00:00) rather than start-time now or start-time after 0:0:5. After a reload, IOS sees the start time has already passed and does not restart the schedule. Alternatively, the schedule was configured with a life value that has expired.
    Fix: Reconfigure the schedule: ip sla schedule 1 life forever start-time now. The life forever ensures the probe never expires; start-time now restarts it immediately. Verify afterwards: show ip sla statistics 1 should show the Latest operation start time updating every frequency seconds.

  • Alert fires for a target in a planned maintenance window
    Symptom: A server is being patched and taken offline deliberately. The monitoring system fires OUTAGE alerts and emails the NOC every 30 seconds during the maintenance window — flooding the team with false positives they must manually suppress.
    Cause: No maintenance-mode mechanism is built into IP SLA + EEM monitoring by default. The tracking object transitions Down the moment the probe fails, regardless of whether the outage is planned or unplanned.
    Fix: For planned maintenance, temporarily suspend the IP SLA schedule: no ip sla schedule 4 before the maintenance window, then ip sla schedule 4 life forever start-time now after. Alternatively, define an EEM environment variable as a maintenance flag (event manager environment _maintenance 1) and add a conditional check at the top of the applet — action 0.5 if $_maintenance eq "1", then action 0.6 exit and action 0.7 end — to skip all further actions while maintenance mode is active.
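The maintenance-flag variant of an applet might be sketched as follows; the applet name, track number, and message text are illustrative:

! set to 1 before the maintenance window, back to 0 afterwards
event manager environment _maintenance 1
!
event manager applet SERVER-DOWN
 event track 4 state down
 ! skip all alerting actions while the maintenance flag is set
 action 0.5 if $_maintenance eq "1"
 action 0.6  exit
 action 0.7 end
 action 1.0 syslog priority critical msg "*** OUTAGE *** server 10.0.0.50 UNREACHABLE"

Remember that an EEM applet if clause must be closed with a matching end action, as shown.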

Key Points & Exam Tips

  • The complete IP SLA alerting pipeline has three layers: IP SLA probe (sends synthetic test traffic and measures results), Object Tracking (translates probe results into a binary Up/Down state with configurable delay), and EEM applet (fires on state transitions and executes alert actions). All three layers must be correctly configured for automated alerting to work.
  • Always configure source-interface on WAN gateway probes. Without it, IOS sources the probe from the best available exit interface — a failed WAN link that still has an alternate path may produce false-healthy results if the probe routes around the failure instead of through it.
  • The track delay down [seconds] up [seconds] command prevents false alerts from single dropped probes. delay down requires the probe to fail continuously for N seconds before declaring Down; delay up requires continuous success for N seconds before declaring recovery. Size these to be slightly longer than one probe cycle at the configured frequency.
  • There are two tracking types for IP SLA objects: reachability (Down when probe times out — complete failure only) and state (Down when probe result exceeds the configured threshold — fires on latency degradation even without packet loss). Use both together for two-tier alerting: Warning on degradation, Critical on outage.
  • Always deploy paired applets — one on event track N state down and one on event track N state up. The down applet alerts on outage; the up applet confirms recovery. Without the recovery applet, the NOC team must manually verify resolution — defeating the purpose of automated monitoring.
  • UDP jitter probes require the Cisco IP SLA Responder (ip sla responder) on the target device. The responder uses hardware timestamps for precise one-way delay measurement. Without the responder, UDP jitter probes return Busy or No Connection return codes.
  • show ip sla statistics shows the current probe result (latest RTT, return code, cumulative success/failure counts). show ip sla statistics aggregated shows the historical 15-minute window including min/avg/max RTT and over-threshold counts — essential for identifying intermittent problems that the instantaneous view misses.
  • show track is the primary operational command — it shows current state, change count, last change time, and which EEM applets are subscribed. A high change count on a tracking object indicates a flapping probe — investigate both the underlying path stability and the track delay values.
  • IP SLA schedules configured with a past start-time do not restart automatically after a reload. Always use start-time now or start-time after 0:0:5 with life forever to ensure probes survive router reloads.
  • On the exam: know the three IP SLA probe types and whether they require a responder (ICMP — no; UDP jitter — yes; HTTP/DNS — no), the two tracking types (reachability vs state), the delay down/up purpose, and the EEM event track N state down/up syntax. For traffic-volume monitoring alongside SLA alerting, see NetFlow Configuration.
Next Steps: For the EEM applet architecture that this lab depends on, see EEM — Embedded Event Manager Scripting. For IP SLA probe configuration, Object Tracking, and tracked static routes in detail, see IP SLA Configuration & Tracking. For forwarding EEM syslog alerts to a central server for long-term storage and correlation, see Syslog Configuration and Syslog Server Configuration. For SNMP-based monitoring that complements IP SLA with threshold traps, see SNMP v2c & v3 Configuration and SNMP Traps. For Python-based multi-device monitoring that can poll IP SLA statistics across an entire network, see Python Netmiko Show Commands. For traffic capture at the moment of a detected outage, see SPAN & RSPAN Port Mirroring.

TEST WHAT YOU LEARNED

1. Why is source-interface GigabitEthernet0/0 critical on a WAN gateway ICMP echo probe, and what false result does omitting it risk?

Correct answer is B. This is the most important practical detail in WAN gateway monitoring with IP SLA. The false-healthy scenario is not theoretical — it happens in production whenever a dual-WAN router has a default route that can reach the ISP-A gateway address via either WAN link. When ISP-A fails and the router's default route fails over to ISP-B, the probe without source-interface happily sends its ICMP packets out ISP-B toward the ISP-A gateway IP. If the ISP-A gateway is still reachable from ISP-B's network (which it often is since both ISPs typically peer with the same internet backbone), the probe returns OK. The monitoring system reports no outage, no EEM alert fires, and the NOC team has no idea ISP-A is down — until someone runs a traceroute and sees all traffic routing through ISP-B. The source-interface binding creates a hard dependency: the probe can only send packets out GigabitEthernet0/0 and can only receive the reply on GigabitEthernet0/0. If that interface is down, IOS cannot send the probe at all and the operation returns an immediate failure.

2. What is the difference between track N ip sla N reachability and track N ip sla N state, and when should each be used for alerting?

Correct answer is D. This distinction is fundamental to building a useful monitoring system. Consider a WAN link where the RTT is normally 8 ms. Due to ISP congestion the RTT rises to 300 ms — too high for VoIP or real-time applications. ICMP probes are still receiving responses (300 ms is within a 5,000 ms timeout), so reachability remains Up and no alert fires. Meanwhile, users on VoIP calls are experiencing choppy audio and the NOC team has no alert. state tracking solves this: with threshold 100 in the probe configuration, a 300 ms RTT puts the operation over-threshold, the state tracking object transitions Down, and the EEM applet fires a Warning alert. The network is not down, but it is degraded. Running both tracking objects (reachability for Critical alerting, state for Warning alerting) gives the two-tier alert system that enterprise NOCs need: Warning = investigate and prepare, Critical = immediate response required. The threshold value in the IP SLA probe configuration is what the state tracking type uses for its comparison — setting an appropriate threshold for the application being monitored is the key design decision.

3. A UDP jitter probe to the branch router returns return code: No Connection. The branch router is reachable via ping. What is the most likely cause?

Correct answer is C. The return code No Connection specifically means the TCP/UDP connection to the responder port was refused or not answered — the network path exists (the router is pingable, proving Layer 3 reachability) but the UDP port is not listening. This is exactly the behaviour you get when the IP SLA Responder is not configured on the target: UDP packets arrive at the branch router on port 5000, but no process is listening on that port, so the OS returns an ICMP Port Unreachable, which IP SLA interprets as No Connection. ICMP probes do not need a responder because ICMP echo is handled natively by every IP stack — any router or server will respond to ping by default. UDP jitter is different: it sends custom-formatted UDP packets to a specific port that only the Cisco IP SLA Responder knows how to process and timestamp. Without the responder, the probe cannot measure jitter or one-way delay because there is no endpoint to apply hardware timestamps. Diagnose by SSH'ing to the branch router and running show ip sla responder — if this shows no responders configured or not enabled, add the configuration: ip sla responder (global enable) and ip sla responder udp-echo ipaddress 10.10.0.1 port 5000.
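On the branch router, the responder configuration and its verification would look like this (addresses and port taken from the lab scenario):

Branch_Router(config)# ip sla responder
Branch_Router(config)# ip sla responder udp-echo ipaddress 10.10.0.1 port 5000
Branch_Router(config)# end
Branch_Router# show ip sla responder

The second line permanently opens UDP 5000 for the named source; with only the global ip sla responder, the port is negotiated dynamically via the responder control protocol.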

4. show track 1 shows 87 changes in the last hour for an IP SLA reachability probe with frequency 30. What does this indicate and how should it be addressed?

Correct answer is A. The change count in show track is one of the most useful diagnostic indicators in the IP SLA + tracking system. A tracking object on a healthy WAN link should show very few changes — typically 1–2 (one initial Up transition when the probe starts, possibly one Down/Up pair during a real outage). A high change count (especially 50+ per hour) is always a signal that something needs attention. The diagnosis path is: first determine whether the probe failures are real or artefactual by examining show ip sla statistics aggregated — this reveals the actual probe success/failure rate. If probe failures are rare (suggesting delay values are too short and single dropped probes flip the state), increase both delay down and delay up. If probe failures are frequent, the WAN path is genuinely unstable. In either case, adding ratelimit to the EEM applets is important to prevent a flapping WAN from generating hundreds of alternating OUTAGE/RECOVERY emails per hour. The ratelimit does not fix the underlying problem but protects the NOC from alert fatigue while the investigation proceeds.

5. After a router reload, show ip sla statistics shows the last measurement was taken before the reload and no new measurements are being generated. What is the cause and fix?

Correct answer is D. This is a common production gotcha. When ip sla schedule N life forever start-time 09:00:00 is saved in the configuration, IOS saved the absolute start time. After a reload at 14:00:00, IOS reads the startup-config, sees start-time 09:00:00, calculates that 09:00:00 was 5 hours ago, and does not restart the schedule — the start time has already passed. The probe is configured correctly, it is in the running-config, but it is not running. The symptom is exactly as described: show ip sla statistics shows old pre-reload data with no new measurements. The fix is immediate: ip sla schedule N life forever start-time now. The best practice that prevents this entirely is to always write schedules with start-time now — this means "start immediately when this config line is processed," which works correctly both when first configured and after every subsequent reload. start-time after 0:0:5 (start 5 seconds after the configuration is processed) is useful when configuring multiple probes in a block, ensuring all probes start cleanly after the configuration has been fully applied.
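The two reload-safe schedule forms side by side (operation numbers are illustrative):

! starts immediately, and again on every reload
ip sla schedule 1 life forever start-time now
!
! starts 5 seconds after the config is processed; useful when
! scheduling several probes in one configuration block
ip sla schedule 2 life forever start-time after 0:0:5

Either form avoids the stale-absolute-start-time trap because the start is always computed relative to when the line is processed.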

6. An ICMP echo probe has threshold 2000 and timeout 5000. An event track N state down EEM applet monitors this probe with track N ip sla N state. When does the tracking object go Down and when does it stay Up?

Correct answer is B. Understanding the interaction between threshold and timeout in the context of state vs reachability tracking is a core concept for this lab. The timeout defines the maximum wait time for a probe response — if no response arrives within this window, the probe returns Timeout and the operation is a failure regardless of tracking type. The threshold defines the RTT value above which the operation is considered "over threshold" — the probe received a response but it took longer than expected. With track N ip sla N state: the tracking object goes Down when either condition occurs — timeout (complete failure) OR over-threshold (response received but RTT > 2,000 ms). With track N ip sla N reachability: the tracking object only goes Down on complete timeout — a response received at 4,900 ms (below the 5,000 ms timeout but far above the 2,000 ms threshold) keeps reachability Up. This is why the two tracking types serve different monitoring purposes and why combining both (as done in Step 6 of this lab) provides the two-tier Warning + Critical alerting model.

7. Why does an IP SLA + EEM monitoring system require a paired applet (one for state down and one for state up), and what operational problem occurs without the recovery applet?

Correct answer is C. The paired applet pattern is about closing the incident loop — the complete lifecycle of a network event notification should include both the opening of the incident (Down alert) and the confirmation of resolution (Up recovery). Without the recovery applet, the NOC team receives an alert at 2 AM, the on-call engineer wakes up and investigates, the issue resolves itself at 2:15 AM (perhaps a brief ISP congestion spike), but the engineer has no automated indication that the problem is gone. They spend additional time manually verifying the link is stable before they can go back to sleep. With the recovery applet, the engineer's phone receives the recovery notification at 2:15 AM and the incident is confirmed closed without a manual check. At a NOC scale, this matters enormously: managing 50 simultaneous alerts without recovery notifications means 50 ongoing manual verification tasks. With recovery notifications, each alert is automatically paired with a resolution and the NOC dashboard accurately reflects the current state of the network. Options A, B, and D describe incorrect technical behaviour — the EEM scheduler and tracking system work correctly regardless of whether paired applets are configured; the gap is purely operational.

8. How does an HTTP probe detect an application failure that an ICMP echo probe would miss, and give a specific scenario where this distinction matters?

Correct answer is A. This is the fundamental reason for deploying application-layer probes alongside network-layer probes. The OSI model tells us that each layer can fail independently: a server can have a healthy physical link (Layer 1), healthy IP stack (Layer 3), and successful ICMP responses (Layer 3 ping), while its application (Layer 7) is completely broken. ICMP probes test exactly up to Layer 3 — they tell you whether packets can reach the IP address. HTTP probes test up to Layer 7 — they tell you whether the application is responding correctly. The specific scenario in option A is extremely common: web application crashes, database connection failures, Apache/Nginx process hangs, out-of-memory conditions that crash the application but not the OS — all of these cause application unavailability while the server continues to respond to ping. Monitoring only with ICMP would report these servers as healthy. The IP SLA HTTP probe to a lightweight /health endpoint (which the application itself must serve successfully) catches all of these. Option C describes the DNS probe, not a benefit of HTTP over ICMP specifically. The HTTP probe configured in this lab uses an IP address (http://10.0.0.50), not a hostname, so DNS resolution is not involved.
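A hedged sketch of an HTTP GET probe against a lightweight health endpoint as described above; the /health path and the timing values are illustrative (the lab's probe targets http://10.0.0.50):

ip sla 4
 http get http://10.0.0.50/health
 frequency 60
 timeout 10000
ip sla schedule 4 life forever start-time now

A Timeout or error return code here fires the alert even while the same server still answers ICMP echo.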

9. A production router is monitoring six targets with IP SLA probes. During a planned maintenance window, the primary WAN link is taken down for hardware replacement. What is the correct procedure to prevent the monitoring system from sending false-positive OUTAGE alerts during the maintenance window?

Correct answer is D. This is a real operational challenge — automated monitoring systems must have a mechanism for acknowledging planned downtime without permanently disabling monitoring. The EEM system has no built-in concept of "maintenance mode" (unlike dedicated NMS platforms like SolarWinds or Nagios which have scheduled downtime windows). The two practical approaches reflect a trade-off: suspending the IP SLA schedule (no ip sla schedule 1) is the simplest approach — with no active schedule, the probe does not send packets, no failures are recorded, and the tracking object state does not change (it stays in whatever state it was when the schedule was stopped). The maintenance flag approach is more sophisticated and preserves probe measurement continuity (statistics accumulate the entire time the probe is running), but requires the EEM applet to be written with conditional logic from the start. Neither deleting applets (option A, operationally risky and error-prone to recreate) nor changing timeouts (option B, doesn't actually prevent state transitions — just makes failures happen faster) is appropriate for a production environment. The key operational discipline is to always restore monitoring after maintenance: re-enable the schedule or clear the maintenance flag, and verify with show ip sla statistics and show track that probes are running and tracking objects are Up.

10. show ip sla statistics aggregated 1 shows 28 successes and 2 failures in the last 15 minutes, with max RTT of 145 ms. show ip sla statistics 1 shows the current probe as OK with RTT of 8 ms. What does this reveal and what action should be taken?

Correct answer is C. This scenario illustrates precisely why show ip sla statistics aggregated is more valuable than the instantaneous show ip sla statistics for operational monitoring. The current probe shows OK — if you looked only at the instantaneous statistics, you would conclude the link is perfectly healthy. But the aggregated window tells a completely different story: 2 of 30 probes failed (6.7% packet loss) and the max RTT spiked to 145 ms — 17 times higher than the normal 8 ms baseline. This pattern is characteristic of intermittent ISP congestion, a degrading physical connection (cable, SFP, or patch panel), or CEF/switching performance issues on either end. Because the current probe happens to be in a good moment, the instantaneous view is misleading. The max RTT of 145 ms is particularly telling — even if the EEM alert did not fire (the reachability probe stayed Up and the RTT didn't exceed the 2,000 ms threshold), this peak represents a period where VoIP calls on this link would have experienced significant degradation. Clearing statistics (option D) would destroy valuable baseline data — never clear probe statistics on a link that you suspect is having problems, as the historical data is evidence for the ISP trouble ticket.