Network Automation Overview

1. Why Network Automation Matters

For decades, network engineers managed infrastructure almost exclusively through the CLI — logging into each device individually, typing commands one at a time, and verifying results manually. In a network of ten devices, this approach is manageable. In a modern enterprise with hundreds of routers and switches — or a cloud provider with tens of thousands — it becomes the primary bottleneck for every change, every deployment, and every troubleshooting task.

Network automation is the use of software, scripts, and tools to perform network tasks — configuration, monitoring, testing, and compliance checking — without direct manual intervention for each device. Automation does not replace the network engineer's judgement; it amplifies it, allowing one engineer to manage and configure hundreds of devices as quickly as they could previously manage one.

Business Driver | The Problem Without Automation | The Solution With Automation
Speed of deployment | A new office branch requires 2–3 days of manual CLI configuration across 5–10 devices | A playbook runs in minutes, with all devices configured identically and simultaneously
Scale | A security policy update across 500 switches takes weeks of CLI work across rotating shifts | One script iterates over all 500 devices and pushes the change in under an hour
Consistency | Each engineer applies configurations differently, producing "snowflake" devices with subtle differences that cause mysterious failures | Every device receives identical configuration from the same template, with no variation
Human error reduction | Typos, missed commands, and wrong interface numbers are common in manual CLI; one mistake can cause an outage | Tested, version-controlled code applies configurations the same way every time
Auditability | Who changed what, when, and why? Often unknown, recorded only in informal notes or tribal knowledge | All changes tracked in version control (Git): full history, rollback capability, peer review
Compliance | Verifying that 300 devices meet a security standard requires manual review of each | A compliance script checks all devices against the standard in minutes and generates a report

Related pages: Ansible for Network Automation | Python for Networking | REST API Overview | NETCONF & RESTCONF | JSON, XML & YANG | Controller-Based Networking | Northbound & Southbound APIs | Ansible IOS Configuration Lab | Python Netmiko Lab

2. Traditional CLI Management — The Manual Model

The traditional approach to network management is engineer-driven, device-by-device, command-by-command. Understanding its limitations is essential for appreciating why automation exists.

How Traditional CLI Management Works

  Traditional CLI workflow for a VLAN change on 50 switches:

  For each switch (50 repetitions):
  1. Open PuTTY / SecureCRT
  2. SSH to the switch IP address
  3. Authenticate (username + password)
  4. Enter privileged mode: enable
  5. Enter config mode: configure terminal
  6. Type the VLAN commands:
       vlan 100
       name Finance
       interface range Gi0/1 - 24
       switchport access vlan 100
  7. Save: write memory
  8. Verify: show vlan brief
  9. Log out
  10. Document in a spreadsheet (maybe)

  Time per switch: ~10 minutes
  Total time (50 switches): ~500 minutes ≈ 8+ hours
  Error probability: High — typing 50 × 4 commands under time pressure

  Real-world risk: Steps 3, 4, and 6 all involve manual typing, and step 6 carries the values that matter.
  On switch 23, the engineer types "vlan 100" as "vlan 10" (missing a digit).
  Switch 23 is silently misconfigured — Finance users on that switch
  cannot communicate with the rest of Finance.
  Discovery: 45 minutes after the change, when users complain.
  Diagnosis: 30 more minutes to identify which switch.
  Fix: 10 minutes.
  Total outage: ~1.5 hours caused by one typo.

Traditional CLI Limitations

Limitation | Real-World Impact
Sequential execution | Changes are applied one device at a time with no parallelism; large deployments take proportionally longer with every device added
No built-in rollback | If a change causes problems, reverting requires logging back into every device and manually undoing each command, in a crisis and under pressure
Configuration drift | Over time, devices accumulate undocumented one-off changes; the running-config no longer matches any standard and troubleshooting becomes unpredictable. See show running-config.
No version control | Who made a change two months ago and why? No audit trail means no accountability and no root cause analysis for incidents
Human fatigue | Repetitive CLI work increases error probability over time; the 47th switch is more likely to have a typo than the 1st
Knowledge dependency | Tribal knowledge: only the engineer who built the network knows all its quirks, a bus factor of one

3. The Automation Model — How It Changes Everything

Automation replaces the manual, error-prone, sequential CLI workflow with a programmatic, repeatable, parallel approach. The network engineer's role shifts from typing commands to writing and maintaining the code that types the commands.

Automated Workflow for the Same VLAN Change

  Automated VLAN change on 50 switches using Ansible:

  Step 1 — Engineer writes (once) a playbook:
  ---
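  - name: Add Finance VLAN to access switches
    hosts: all_switches
    gather_facts: false
    tasks:
      - name: Configure VLAN 100
        cisco.ios.ios_vlans:             # resource module that supersedes the older ios_vlan
          config:
            - vlan_id: 100
              name: Finance
          state: merged

      - name: Assign ports to VLAN 100
        cisco.ios.ios_l2_interfaces:
          config:
            - name: GigabitEthernet0/1   # add one entry per access port
              mode: access
              access:
                vlan: 100
          state: merged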

  Step 2 — Engineer runs the playbook (once):
  $ ansible-playbook vlan_change.yml -i inventory/switches.ini

  Step 3 — Ansible connects to ALL 50 switches in PARALLEL
           (Ansible runs 5 forks by default; raise the limit with -f / --forks)
           Applies identical configuration to each
           Verifies the change succeeded on each device
           Reports pass/fail per device

  Step 4 — Results:
  PLAY RECAP:
  switch01: ok=2 changed=2 unreachable=0 failed=0
  switch02: ok=2 changed=2 unreachable=0 failed=0
  ...
  switch50: ok=2 changed=2 unreachable=0 failed=0

  Time: ~3 minutes total (parallel execution)
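  Human error: Greatly reduced. The VLAN number 100 is defined once in the playbook,
               so per-device typos disappear. A wrong value in the playbook would
               still reach every switch, which is why review and testing matter.
  Rollback: Re-run the playbook with the VLAN task set to remove the VLAN
            ("state: absent" for ios_vlan, "state: deleted" for the ios_vlans
            resource module) to undo the change on all 50 switches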
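
The parallel step in the workflow above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a description of how Ansible is implemented: `push_vlan` is a hypothetical stand-in that sleeps instead of opening an SSH session, and the worker count of 10 is an arbitrary choice that plays a role similar to Ansible's forks setting.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def push_vlan(host):
    """Hypothetical stand-in for one device session: sleeps instead of using SSH."""
    time.sleep(0.05)  # simulated per-device round trip
    return host, "ok"


def run_parallel(hosts, workers=10):
    """Run push_vlan against every host, at most `workers` at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(push_vlan, hosts))


hosts = [f"switch{n:02d}" for n in range(1, 51)]

start = time.perf_counter()
results = run_parallel(hosts)
elapsed = time.perf_counter() - start

# Per-device pass/fail summary, similar in spirit to a play recap
failed = [host for host, status in results.items() if status != "ok"]
print(f"{len(results)} devices, {len(failed)} failed, {elapsed:.2f}s elapsed")
```

Run one after another, the same 50 simulated sessions would take about 2.5 seconds (50 × 0.05 s); with ten workers they finish in roughly a tenth of that, which is the effect the workflow relies on.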

CLI vs Automation — Direct Comparison

Aspect | Traditional CLI | Automation
Execution speed | Sequential: one device at a time | Parallel: all devices simultaneously
Change time (50 devices) | ~8 hours manual work | ~3–5 minutes
Error source | Human typing: typos, wrong device, wrong values | Code logic: tested before deployment; same every time
Consistency | Varies by engineer, shift, fatigue level | Identical on every device from the same template
Rollback | Manual: re-login and undo each command | Automated: re-run the playbook with the previous state
Documentation | Post-change spreadsheet (sometimes skipped) | Code is the documentation; version-controlled in Git
Repeatability | Different outcome possible each run | Idempotent: same result every time regardless of current state
Knowledge transfer | Tribal: locked in the engineer's head | Code is readable, shareable, peer-reviewable

4. Key Benefits of Network Automation

4.1 Speed and Agility

Automation collapses what used to be hours or days of work into minutes. New branch deployments, firewall rule updates, VLAN changes, firmware upgrades — all can be executed across the entire network in a fraction of the time. This directly translates to faster time-to-market for new services and quicker incident recovery.

  Speed comparison examples:

  Task: Deploy a new OSPF area across 12 routers
  Manual CLI:      ~4 hours (20 min per router)
  Python script:   ~8 minutes (parallel SSH via Netmiko)
  Ansible:         ~4 minutes (parallel, idempotent)

  Task: Rotate NTP server IP on 200 switches
  Manual CLI:      ~33 hours of work (~10 min per switch), about four working days
  Ansible:         ~6 minutes

  Task: Audit 300 devices for password complexity compliance
  Manual CLI:      ~50 hours (10 min per device review)
  Python script:   ~12 minutes (automated output parsing)
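
The manual figures in this comparison are simple multiplication, which a few lines of Python make explicit. This is a small sketch; the function name `manual_hours` and the eight-hour working day are assumptions made for illustration.

```python
def manual_hours(devices, minutes_per_device):
    """Total hands-on time in hours when devices are handled one after another."""
    return devices * minutes_per_device / 60


tasks = [
    ("Deploy OSPF area", 12, 20),
    ("Rotate NTP server", 200, 10),
    ("Audit password policy", 300, 10),
]

for name, devices, minutes in tasks:
    hours = manual_hours(devices, minutes)
    print(f"{name}: {hours:.1f} h manual ({hours / 8:.1f} eight-hour days)")
```

Because manual effort grows linearly with device count, doubling the number of switches doubles the hours, while a parallel run grows far more slowly.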

4.2 Consistency and Standardisation

Consistency is perhaps the most underrated benefit of automation. When configurations are applied by code rather than humans, every device receives exactly the same configuration — no more "snowflake" devices with subtle variations that cause mysterious network behaviour.

  Configuration drift problem (manual management):

  Switch A: built by Engineer 1, Monday morning (alert and careful)
  Switch B: built by Engineer 2, Friday afternoon (tired, rushed)
  Switch C: built by Engineer 1 after an outage (under pressure)

  All "identical" configurations — until:
  Switch A: spanning-tree mode rapid-pvst (correct)
  Switch B: spanning-tree mode pvst (old mode, never updated)
  Switch C: no spanning-tree mode command (present in the template, never applied)

  Result: inconsistent STP behaviour → intermittent L2 loop → outage
  Root cause: 3 hours to diagnose because "all configs should be the same"

  Automation solution:
  Template defines: spanning-tree mode rapid-pvst
  Every switch receives exactly this line — no variation, no exceptions.
  Compliance check script detects any device that deviates from the template.
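
A compliance check of this kind can be sketched as follows. The device names and config snippets are made-up sample data, and `find_drift` is a hypothetical helper; a real script would first retrieve each running-config from the device.

```python
REQUIRED_LINES = ["spanning-tree mode rapid-pvst"]

# Made-up running-config excerpts standing in for retrieved device output
running_configs = {
    "switch-a": "hostname switch-a\nspanning-tree mode rapid-pvst\n",
    "switch-b": "hostname switch-b\nspanning-tree mode pvst\n",
    "switch-c": "hostname switch-c\n",
}


def find_drift(configs, required):
    """Return {host: [missing required lines]} for every non-compliant device."""
    report = {}
    for host, text in configs.items():
        present = {line.strip() for line in text.splitlines()}
        missing = [line for line in required if line not in present]
        if missing:
            report[host] = missing
    return report


drift = find_drift(running_configs, REQUIRED_LINES)
for host, missing in drift.items():
    print(f"{host}: missing {missing}")
```

Exact line matching is deliberately strict: switch-b is flagged even though it has a spanning-tree mode line, because the mode differs from the template.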

4.3 Reduced Human Error

The most dangerous moment in traditional network management is the change window, when engineers under time pressure make manual configuration changes. Industry outage reports regularly cite human error as a leading cause of network outages.

Error Type | Manual CLI Risk | Automation Mitigation
Typo in command | High: typing 50+ commands under time pressure | Eliminated: code is typed once, tested, then executed
Wrong device | Common: multiple SSH sessions open, accidentally typing in the wrong window | Eliminated: automation targets devices by inventory file, not by open session
Missed step | Frequent, especially late in a long change window | Eliminated: all steps are in the playbook/script; none can be skipped
Wrong value | Possible: VLAN 100 entered as 10, /24 as /25 | Reduced: value is defined once in variables; input validation can catch format errors
Untested change | Common: "I know this command, I don't need to test it" | Reduced: automation code is typically tested in a lab environment before production deployment
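
The input validation mentioned in the "Wrong value" row might look like the sketch below. The function names are hypothetical; the VLAN range 1–4094 and the standard-library `ipaddress` module are the only facts relied on.

```python
import ipaddress


def validate_vlan_id(value):
    """Return the VLAN ID as an int, or raise ValueError if outside 1-4094."""
    vlan_id = int(value)
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID {vlan_id} is outside 1-4094")
    return vlan_id


def validate_interface(cidr, expected_prefix):
    """Parse 'address/prefix' and check the prefix length against the design."""
    interface = ipaddress.ip_interface(cidr)  # raises ValueError on bad syntax
    actual = interface.network.prefixlen
    if actual != expected_prefix:
        raise ValueError(f"{cidr}: expected /{expected_prefix}, got /{actual}")
    return interface


print(validate_vlan_id("100"))
print(validate_interface("10.1.1.1/24", expected_prefix=24))
```

A range check cannot tell VLAN 10 from VLAN 100, since both are valid IDs. Catching that kind of mistake still depends on defining the value once and having a second person review it.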

4.4 Idempotency — Run It Again Safely

A well-designed automation tool is idempotent — running the same playbook or script multiple times produces the same result as running it once. If a device already has the correct configuration, nothing changes. If it does not, the configuration is applied. This means automation can be run both as a deployment tool and as a continuous compliance checker.

  Idempotency example (Ansible):

  First run (device not configured):
  Task: "Configure VLAN 100" → changed=1 (VLAN was missing, now added)

  Second run (device already configured):
  Task: "Configure VLAN 100" → ok=1, changed=0 (VLAN already exists, no action)

  Benefit: The same playbook can be run:
  - As a change deployment (first run applies the change)
  - As a daily compliance check (subsequent runs verify nothing has drifted)
  - As a disaster recovery step (after a device failure, re-run restores config)

  Contrast with manual CLI:
  Running "vlan 100" a second time on a Cisco switch is harmless.
  But running a full manual change script twice might add duplicate
  ACL entries, double the number of routes, or corrupt a config.
  Automation tools check current state before acting.
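
The check-before-act pattern can be reduced to a few lines. In this sketch a dictionary stands in for the device's VLAN table, and both function names are hypothetical.

```python
def ensure_vlan(device_vlans, vlan_id, name):
    """Idempotent: compare current state first; return True only if a change was made."""
    if device_vlans.get(vlan_id) == name:
        return False
    device_vlans[vlan_id] = name
    return True


def append_acl_entry(acl, entry):
    """Not idempotent: blindly appends, so a second run duplicates the entry."""
    acl.append(entry)


vlans = {1: "default"}
first_run = ensure_vlan(vlans, 100, "Finance")   # VLAN missing, so it is added
second_run = ensure_vlan(vlans, 100, "Finance")  # already present, nothing to do

acl = []
append_acl_entry(acl, "permit tcp any any eq 443")
append_acl_entry(acl, "permit tcp any any eq 443")

print(first_run, second_run, len(acl))
```

The return value of `ensure_vlan` plays the role of the changed/ok distinction: a `True` on a routine re-run is a signal that the device had drifted.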

5. The Automation Stack — Layers of Tools

Network automation uses a layered set of tools and technologies. Understanding where each layer fits helps clarify which tool is appropriate for which problem.

  Network automation technology stack:

  ┌─────────────────────────────────────────────────────────────────────┐
  │  Layer 4: Orchestration & Intent-Based Platforms                   │
  │  Cisco DNA Center, NSO, Ansible Tower, Terraform                   │
  │  "Describe what you WANT — platform figures out HOW"               │
  ├─────────────────────────────────────────────────────────────────────┤
  │  Layer 3: Automation Frameworks                                     │
  │  Ansible, Nornir, Salt, Chef, Puppet                                │
  │  "Describe the desired state — framework handles parallelism"       │
  ├─────────────────────────────────────────────────────────────────────┤
  │  Layer 2: Scripting & Libraries                                    │
  │  Python + Netmiko, NAPALM, Paramiko, ncclient                      │
  │  "Write code to connect, send commands, parse output"               │
  ├─────────────────────────────────────────────────────────────────────┤
  │  Layer 1: Programmatic Device Interfaces                            │
  │  REST API, NETCONF (SSH), RESTCONF (HTTPS), gRPC, SNMP              │
  │  "How software talks TO the device — the protocol layer"            │
  └─────────────────────────────────────────────────────────────────────┘
  Top of stack = higher-level (more abstraction); bottom of stack = lower-level (more control)

6. Ansible — Agentless Network Automation

Ansible is one of the most widely adopted network automation frameworks. It is agentless: no software needs to be installed on the managed network devices. Ansible connects via SSH (or over NETCONF or an HTTP API on devices that support them) and applies configuration using YAML-based playbooks. It is maintained by Red Hat and has a large ecosystem of network modules for Cisco IOS, IOS XE, NX-OS, ASA, Juniper Junos, Arista EOS, and many others.

Ansible Key Concepts

Concept | Description
Inventory | A file listing all managed devices and their groupings. Can be static (INI or YAML file) or dynamic (auto-discovered from DNS, CMDBs, or cloud APIs).
Playbook | A YAML file defining what tasks to perform on which hosts. Human-readable, version-controllable, and reusable.
Task | A single action within a playbook, e.g., "configure OSPF on interface Gi0/0". Each task calls a module.
Module | Pre-written code for a specific action, e.g., ios_config (send CLI commands to Cisco IOS), ios_vlans (manage VLANs), ios_bgp_global (configure BGP). Abstracts the device specifics.
Role | A reusable, structured collection of tasks, variables, and templates for a common function, e.g., a "base_router" role that applies NTP, syslog, AAA, and SSH hardening to any router.
Variable | Separates data (VLAN IDs, IP addresses, names) from logic (what to do with that data). Variables can be per-host, per-group, or global.
Template (Jinja2) | A configuration template with placeholders that Ansible fills in per-device using variables; generates device-specific configurations from a single template file.
Agentless | No software installed on managed devices; Ansible uses SSH or HTTPS. Compare with Puppet/Chef, which typically require an agent daemon running on each managed node.
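
The Variable and Template concepts can be illustrated without Ansible installed. The sketch below swaps in Python's standard-library `string.Template` as a stand-in for Jinja2 (so placeholders are written `$name` rather than `{{ name }}`), and the host names and variable values are made up. The merge order is a simplification of Ansible's variable precedence: host values override group values.

```python
from string import Template

# Stand-in for a Jinja2 template: $name placeholders instead of {{ name }}
NTP_TEMPLATE = Template(
    "hostname $hostname\n"
    "ntp server $ntp_server\n"
    "ntp source $ntp_source"
)

group_vars = {"ntp_server": "10.0.0.1", "ntp_source": "Loopback0"}
host_vars = {
    "r1": {"hostname": "r1"},
    "r2": {"hostname": "r2", "ntp_server": "10.0.0.2"},  # per-host override
}


def render_config(host):
    """Merge group and host variables (host wins), then fill in the template."""
    variables = {**group_vars, **host_vars[host]}
    return NTP_TEMPLATE.substitute(variables)


for host in host_vars:
    print(render_config(host), end="\n\n")
```

One template file plus a small set of variables yields a distinct configuration per device, which is what keeps data separate from logic.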

Simple Ansible Playbook Example

  # File: configure_ntp.yml
  ---
  - name: Configure NTP on all routers
    hosts: routers                    # targets the "routers" group in inventory
    gather_facts: no
    vars:
      ntp_server: 10.0.0.1

    tasks:
      - name: Set NTP server
        cisco.ios.ios_config:
          lines:
            - ntp server {{ ntp_server }}  # Jinja2 variable substitution

      - name: Set NTP source interface
        cisco.ios.ios_config:
          lines:
            - ntp source Loopback0

      - name: Verify NTP status
        cisco.ios.ios_command:
          commands:
            - show ntp status
        register: ntp_output

      - name: Display NTP verification result
        debug:
          var: ntp_output.stdout_lines

  # Run with: ansible-playbook configure_ntp.yml -i inventory.ini

See: Ansible for Network Automation | Ansible IOS Configuration Lab | Jinja2 Config Generation Lab

7. Python for Network Automation

Python is the dominant scripting language for network automation. It is easy to learn, has a rich ecosystem of networking libraries, and is supported by virtually every major network vendor. Python enables automation at every level — from simple scripts that connect to one device and collect output, to complex applications that orchestrate thousands of devices.

Key Python Networking Libraries

Library | Purpose | Best For
Netmiko | Multi-vendor SSH connection library; simplifies connecting to Cisco, Juniper, Arista, and many other devices via SSH and sending CLI commands | CLI-based automation on devices that do not have APIs; screen-scraping device output
Paramiko | Low-level SSH implementation in Python and the foundation that Netmiko is built on. Full SSH control. | Custom SSH automation when Netmiko's abstractions are insufficient
NAPALM | Network Automation and Programmability Abstraction Layer with Multivendor support; provides a unified API across Cisco IOS, IOS XE, NX-OS, Juniper, Arista | Vendor-agnostic automation; configuration compliance; state validation
ncclient | Python library for NETCONF (RFC 6241); enables structured XML-based configuration and state retrieval over SSH | NETCONF-capable devices (Cisco IOS XE, Juniper, etc.); structured configuration management
requests | Standard Python HTTP library; sends GET/POST/PUT/DELETE requests to REST APIs (RESTCONF, Cisco DNA Center, Meraki API, etc.) | REST API automation on any device or platform with an HTTP-based API
TextFSM / TTP | Template-based text parsers; convert unstructured CLI output (show commands) into structured data (Python dictionaries/lists) | Parsing legacy CLI output when no API is available; extracting values from show command text

Simple Python Netmiko Example

  # Connect to multiple routers and collect interface status
  from getpass import getpass

  from netmiko import ConnectHandler

  # Prompt once instead of hard-coding credentials in the script
  password = getpass("Device password: ")

  # Device definitions (lab addresses)
  devices = [
      {"device_type": "cisco_ios", "host": "192.168.1.1",
       "username": "admin", "password": password},
      {"device_type": "cisco_ios", "host": "192.168.1.2",
       "username": "admin", "password": password},
  ]

  results = {}

  for device in devices:
      # The context manager closes the SSH session even if a command fails
      with ConnectHandler(**device) as connection:
          results[device["host"]] = connection.send_command(
              "show ip interface brief",
              use_textfsm=True,     # parse output with TextFSM (needs ntc-templates)
          )

  # Print structured results (list of dicts, not raw text).
  # Key names come from the installed ntc-templates release and may differ
  # between versions (for example 'intf' versus 'interface'), so try both.
  for host, interfaces in results.items():
      print(f"\n--- {host} ---")
      for intf in interfaces:
          name = intf.get("intf") or intf.get("interface")
          print(f"  {name}: {intf['status']} / {intf['proto']}")

  # Output:
  # --- 192.168.1.1 ---
  #   GigabitEthernet0/0: up / up
  #   GigabitEthernet0/1: down / down
  # --- 192.168.1.2 ---
  #   GigabitEthernet0/0: up / up
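
What a parser such as TextFSM does with `use_textfsm=True` can be approximated with a regular expression. The sketch below is a stand-in written with the standard-library `re` module, not TextFSM itself; the sample output is made up, and the field names simply mirror the ones used in the example above.

```python
import re

# Made-up sample of raw CLI text, as a device would return it
RAW_OUTPUT = """\
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet0/0     192.168.1.1     YES NVRAM  up                    up
GigabitEthernet0/1     unassigned      YES NVRAM  administratively down down
Loopback0              10.0.0.1        YES manual up                    up
"""

# One named group per column we care about; the header row does not match
ROW = re.compile(
    r"^(?P<intf>\S+)\s+(?P<ipaddr>\S+)\s+\w+\s+\w+\s+"
    r"(?P<status>administratively down|up|down)\s+(?P<proto>up|down)[ \t]*$",
    re.MULTILINE,
)


def parse_interfaces(text):
    """Turn 'show ip interface brief' text into a list of dictionaries."""
    return [match.groupdict() for match in ROW.finditer(text)]


rows = parse_interfaces(RAW_OUTPUT)
for row in rows:
    print(f"{row['intf']}: {row['status']} / {row['proto']}")
```

Hand-written patterns like this are brittle across platforms and software versions, which is the main reason to prefer maintained templates or a structured API when one is available.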

See: Python for Networking | Python Script Examples | Python Netmiko Lab | Python NAPALM Lab

8. REST APIs — Programmatic Device Interfaces

A REST API (Representational State Transfer Application Programming Interface) is a web-based interface that lets software interact with a device or platform using standard HTTP methods. Modern network platforms such as Cisco IOS XE, Cisco Catalyst Center, and Cisco Meraki expose REST APIs that allow configuration and monitoring without any CLI at all.

REST API Key Concepts

Concept | Description
HTTP Methods | GET (retrieve data), POST (create new resource), PUT (replace/update resource), PATCH (partial update), DELETE (remove resource); these map directly to CRUD operations
Endpoint (URI) | The URL path to a specific resource, e.g., https://router/restconf/data/ietf-interfaces:interfaces targets the interfaces resource
JSON / XML | Data formats used in API request and response bodies. JSON is preferred for modern APIs: human-readable and easily parsed by Python
Authentication | Basic auth (username:password), API tokens, or OAuth2; credentials are included in HTTP headers, not typed at a CLI
Status Codes | HTTP response codes: 200 OK, 201 Created, 204 No Content, 400 Bad Request, 401 Unauthorized, 404 Not Found, 500 Server Error
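
A script usually branches on the status code class rather than on each individual code. The sketch below uses the standard-library `http.HTTPStatus` enum for the reason phrases; the function name and category labels are illustrative assumptions.

```python
from http import HTTPStatus


def classify_response(status_code):
    """Map an HTTP status code to (category, reason phrase)."""
    phrase = HTTPStatus(status_code).phrase
    if 200 <= status_code < 300:
        category = "success"
    elif 400 <= status_code < 500:
        category = "client error"   # fix the request or the credentials
    elif 500 <= status_code < 600:
        category = "server error"   # problem on the device or platform side
    else:
        category = "other"
    return category, phrase


for code in (200, 201, 204, 400, 401, 404, 500):
    print(code, *classify_response(code))
```

The distinction matters for retries: a 4xx response will generally fail again until the request is corrected, while a 5xx response may succeed later without any change on the client side.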

REST API Example — RESTCONF on Cisco IOS XE

  RESTCONF is the REST-based management interface on Cisco IOS XE.
  Base URL: https://<device-ip>/restconf/
  Headers:  Content-Type: application/yang-data+json
            Accept: application/yang-data+json
  Auth:     Basic authentication (admin credentials)

  Example: GET all interfaces using Python requests library:

  import requests
  import urllib3

  # Lab only: the device uses a self-signed certificate, so silence the warning
  urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

  url = "https://192.168.1.1/restconf/data/ietf-interfaces:interfaces"
  headers = {
      "Content-Type": "application/yang-data+json",
      "Accept": "application/yang-data+json"
  }
  auth = ("admin", "C1sco12345")

  response = requests.get(url, headers=headers, auth=auth, verify=False, timeout=10)

  if response.status_code == 200:
      data = response.json()
      interfaces = data["ietf-interfaces:interfaces"]["interface"]
      for intf in interfaces:
          print(f"Interface: {intf['name']}, Type: {intf['type']}")
  else:
      print(f"Error: {response.status_code}")

  # Output:
  # Interface: GigabitEthernet1, Type: iana-if-type:ethernetCsmacd
  # Interface: GigabitEthernet2, Type: iana-if-type:ethernetCsmacd
  # Interface: Loopback0,        Type: iana-if-type:softwareLoopback

  Example: PUT (configure) a new interface IP address via RESTCONF:

  payload = {
    "ietf-interfaces:interface": {
      "name": "GigabitEthernet2",
      "description": "WAN Link",
      "type": "iana-if-type:ethernetCsmacd",
      "enabled": True,
      "ietf-ip:ipv4": {
        "address": [{
          "ip": "203.0.113.1",
          "prefix-length": 30
        }]
      }
    }
  }

  url = "https://192.168.1.1/restconf/data/ietf-interfaces:interfaces/interface=GigabitEthernet2"
  response = requests.put(url, headers=headers, auth=auth,
                           json=payload, verify=False)
  print(f"Status: {response.status_code}")   # 204 = existing resource replaced; 201 = resource created
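
One detail is easy to miss when building RESTCONF URLs by hand: the list key in the path (here, the interface name) has to be percent-encoded, so a name containing slashes cannot be pasted in as-is. The helper below is a hypothetical sketch using only the standard library; the host address is the lab address from the examples above.

```python
from urllib.parse import quote


def interface_url(host, name):
    """Build the RESTCONF URL for one interface, percent-encoding the key."""
    key = quote(name, safe="")  # safe="" so that "/" is encoded as %2F
    return (
        f"https://{host}/restconf/data/"
        f"ietf-interfaces:interfaces/interface={key}"
    )


print(interface_url("192.168.1.1", "GigabitEthernet2"))
print(interface_url("192.168.1.1", "GigabitEthernet0/0/1"))
```

Without the encoding, each slash in a name like GigabitEthernet0/0/1 would be read as a path separator, and the request would point at a different resource than the one intended.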

See: REST API Overview | REST API Methods | NETCONF & RESTCONF | Postman & API Testing

9. NETCONF and RESTCONF

NETCONF and RESTCONF are the two standard programmatic management protocols defined for modern network devices. Unlike CLI (which returns unstructured text), both protocols exchange structured data (XML or JSON) modelled using YANG data models.

Feature | NETCONF | RESTCONF
Transport | SSH (port 830) | HTTPS (port 443)
Data format | XML | JSON or XML
Data model | YANG | YANG
Operations | get, get-config, edit-config, copy-config, delete-config, lock, unlock, commit | HTTP GET, POST, PUT, PATCH, DELETE
Transactions | Full candidate/running config transactions with commit/rollback; atomic changes | Per-request; no multi-step transaction support
Use case | Service provider automation, complex multi-step configuration changes, strong consistency requirements | Simpler REST-style automation, developer-friendly, easy integration with web tools
RFC | RFC 6241 | RFC 8040

  YANG data model: the common language

  YANG (Yet Another Next Generation) is the data modelling language used to
  describe the structure and constraints of network device configuration and
  state. NETCONF and RESTCONF both use YANG models to define what can be
  configured and what format the data must be in.

  Example YANG model concept (simplified):
  module ietf-interfaces {
    list interface {
      key "name";
      leaf name       { type string; }
      leaf description{ type string; }
      leaf enabled    { type boolean; }
      container ipv4  {
        list address  {
          leaf ip     { type inet:ipv4-address; }
          leaf prefix-length { type uint8; }
        }
      }
    }
  }

  This YANG model defines exactly what an interface object looks like —
  the same structure appears in NETCONF XML and RESTCONF JSON responses.
  The NMS and the device speak the same structured language.
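
The point that one YANG-modelled structure appears in both encodings can be shown by serialising the same Python dictionary twice. This is a simplified sketch with flat leaves only; the two function names are hypothetical, while the XML namespace is the one published for the ietf-interfaces module.

```python
import json
import xml.etree.ElementTree as ET

IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"
ET.register_namespace("", IF_NS)  # serialise with a default namespace, no prefix

interface = {"name": "GigabitEthernet2", "description": "WAN Link", "enabled": True}


def to_restconf_json(data):
    """JSON encoding: the top-level key is prefixed with the module name."""
    return json.dumps({"ietf-interfaces:interface": data}, indent=2)


def to_netconf_xml(data):
    """XML encoding: the module is identified by its XML namespace."""
    root = ET.Element(f"{{{IF_NS}}}interface")
    for leaf, value in data.items():
        child = ET.SubElement(root, f"{{{IF_NS}}}{leaf}")
        # YANG booleans are lowercase "true"/"false" in XML
        child.text = str(value).lower() if isinstance(value, bool) else str(value)
    return ET.tostring(root, encoding="unicode")


print(to_restconf_json(interface))
print(to_netconf_xml(interface))
```

The leaf names are identical in both outputs; only the way the module is identified differs (a name prefix in JSON, a namespace in XML).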

See: NETCONF & RESTCONF Overview | JSON, XML & YANG | RESTCONF Basics Lab | NETCONF ncclient Python Lab

10. Infrastructure as Code (IaC) and Version Control

Infrastructure as Code (IaC) is the practice of defining and managing network infrastructure through code — stored in files, version-controlled in Git, peer-reviewed, and deployed programmatically. IaC brings software engineering discipline to network management.

IaC Principles Applied to Networking

IaC Principle | Network Application
Single source of truth | The Git repository holds the authoritative network configuration, not the running-config on each device. What is in Git is what should be on the network.
Version control | Every change to a playbook, template, or variable file is committed to Git with a message explaining the change. Any version can be retrieved or rolled back to.
Code review | Network changes are submitted as Git pull requests and reviewed by peers before deployment, the same process as software code review. Catches errors before production.
CI/CD pipelines | Continuous Integration/Continuous Deployment: when a change is merged to Git, a pipeline automatically tests the change in a lab environment and deploys to production if tests pass.
Declarative configuration | Define the desired end state ("this VLAN should exist") rather than the procedure ("run these commands"). The automation tool figures out how to get from the current state to the desired state.

  Git workflow for a network change:

  1. Engineer creates a branch: git checkout -b add-vlan-100
  2. Edits the variable file: vlans.yml → adds "100: Finance"
  3. Dry-runs the change in a lab environment: ansible-playbook vlan.yml --check
     (check mode reports what would change without applying it)
  4. Commits: git commit -m "Add Finance VLAN 100 to access layer"
  5. Pushes and creates a Pull Request for peer review
  6. Colleague reviews, approves, and merges to main branch
  7. CI/CD pipeline triggers → runs playbook against production
  8. All 50 switches updated in 3 minutes, tested, and verified
  9. Full audit trail in Git: who changed what, when, and why
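
The review and dry-run steps both come down to comparing what is on the device with what the repository says should be there. The sketch below shows that comparison with the standard-library `difflib` module; the config lines are made up, and this is an illustration of the idea, not how Ansible's check mode is implemented.

```python
import difflib

# Made-up excerpts: current device state versus the state defined in Git
running = ["vlan 10", " name Users", "vlan 20", " name Voice"]
intended = running + ["vlan 100", " name Finance"]

diff = list(
    difflib.unified_diff(
        running, intended, fromfile="running", tofile="intended", lineterm=""
    )
)
print("\n".join(diff))

# Lines the change would add, excluding the "+++ intended" file header
added = [line[1:] for line in diff if line.startswith("+") and not line.startswith("+++")]
removed = [line[1:] for line in diff if line.startswith("-") and not line.startswith("---")]
```

A pipeline could refuse to deploy when `removed` is non-empty for a change that is only supposed to add a VLAN, turning reviewer intent into an automated gate.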

11. Controller-Based Networking and Intent-Based Networking

Modern network architectures separate the control plane (intelligence — deciding what to do) from the data plane (forwarding — doing it). A central controller handles all policy decisions and pushes configuration to devices via southbound APIs. This is controller-based networking. Intent-based networking (IBN) takes this a step further: the administrator declares high-level intent ("all IoT devices must be isolated from the corporate network") and the controller translates that intent into device-level configurations automatically.

  Controller-based networking architecture:

  ┌──────────────────────────────────────────────────────────────────┐
  │  Administrator / Orchestration System                            │
  │  (defines policy / intent)                                       │
  └─────────────────────┬────────────────────────────────────────────┘
                         │ Northbound API (REST API)
                         ▼
  ┌──────────────────────────────────────────────────────────────────┐
  │  Controller (e.g., Cisco Catalyst Center / DNA Center)           │
  │  - Translates intent into configuration                          │
  │  - Maintains network topology view                               │
  │  - Distributes policy to all devices                             │
  └──────┬───────────────────────────────────────────────────┬───────┘
         │ Southbound API                                     │
         │ (NETCONF, RESTCONF, OpenFlow, CLI via SSH)         │
         ▼                                                    ▼
  [Router / Switch A]                                [Router / Switch B]
  Data plane: forwards traffic                       Data plane: forwards traffic
  Control plane: removed or minimal                  Control plane: removed or minimal

API Type | Direction | Purpose | Examples
Northbound API | Between controller and management applications | Allows orchestration tools, custom apps, and administrators to communicate with the controller | Cisco DNA Center REST API, OpenDaylight REST API
Southbound API | Between controller and network devices | Controller pushes configuration and policy to devices | NETCONF, RESTCONF, OpenFlow, OVSDB, CLI over SSH
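
The translation a controller performs between the two APIs can be pictured with a toy example. Everything here is an assumption made for illustration: the intent format, the inventory, and the `translate_intent` function are hypothetical, and a real controller also handles topology, policy enforcement, and verification.

```python
# What an administrator might submit over the northbound API (hypothetical format)
intent = {"segment": "IoT", "vlan_id": 300}

# The controller's view of which ports carry devices in that segment (made-up data)
inventory = {
    "switch-a": ["GigabitEthernet0/10", "GigabitEthernet0/11"],
    "switch-b": ["GigabitEthernet0/5"],
}


def translate_intent(intent, inventory):
    """Expand one high-level intent into per-device configuration lines."""
    plans = {}
    for device, ports in inventory.items():
        lines = [f"vlan {intent['vlan_id']}", f" name {intent['segment']}"]
        for port in ports:
            lines += [
                f"interface {port}",
                " switchport mode access",
                f" switchport access vlan {intent['vlan_id']}",
            ]
        plans[device] = lines
    return plans


# What would then be pushed to each device over the southbound API
for device, lines in translate_intent(intent, inventory).items():
    print(device, lines)
```

The administrator states the segment once; the per-device detail is derived, so adding a third switch to the inventory changes the output without changing the intent.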

See: Controller-Based Networking | Northbound & Southbound APIs

12. Automation Tools Comparison

Tool | Type | Language | Agent Required? | Best For
Ansible | Configuration management framework | YAML playbooks (Python under the hood) | No, agentless (SSH) | Network configuration, multi-vendor, large scale
Python + Netmiko | Scripting library | Python | No, SSH | Custom scripts, CLI-based devices without APIs
Python + NAPALM | Multi-vendor abstraction library | Python | No, SSH / API | Vendor-agnostic state retrieval and compliance
RESTCONF + requests | REST API client | Python (or any HTTP client) | No, HTTPS | Modern IOS XE devices with REST API support
NETCONF + ncclient | Structured config protocol | Python + XML | No, SSH | Transaction-safe config on NETCONF-enabled devices
Terraform | Infrastructure provisioning (IaC) | HCL (declarative) | No, API | Cloud networking, data centre provisioning
Cisco DNA Center | Intent-based network controller | GUI + REST API | No (controller-based) | Enterprise campus automation and assurance

13. Network Automation Summary — Key Facts

Topic | Key Fact
Primary automation benefit | Speed, consistency, reduced human error, and scalability; changes applied to hundreds of devices in minutes
Idempotency | Running the same automation multiple times produces the same result, so it is safe to re-run for compliance checks
Ansible characteristics | Agentless (SSH), YAML playbooks, idempotent modules, large network module ecosystem for multi-vendor support
Python key libraries | Netmiko (CLI/SSH), NAPALM (multi-vendor abstraction), ncclient (NETCONF), requests (REST API)
REST API methods | GET (read), POST (create), PUT (replace), PATCH (update), DELETE (remove); HTTP-based, JSON/XML responses
NETCONF transport/port | SSH port 830; XML data; YANG models; atomic transactions
RESTCONF transport/port | HTTPS port 443; JSON or XML; YANG models; REST operations
YANG | Data modelling language used by both NETCONF and RESTCONF to define the structure of configuration and state data
Northbound API | Between management applications and the controller; typically REST (e.g., DNA Center API). See Northbound & Southbound APIs.
Southbound API | Between controller and network devices: NETCONF, RESTCONF, OpenFlow, or CLI over SSH
Infrastructure as Code | Network config stored as code in Git: version controlled, peer-reviewed, CI/CD deployed
Configuration drift | Gradual divergence of device configs from the standard; automation helps prevent drift and compliance scripts detect it

14. Network Automation Quiz

1. What does it mean for an automation tool to be idempotent, and why is this property valuable for network management?

Answer: Idempotency is one of the most important properties of well-designed automation tools like Ansible. Before making a change, the tool checks the current state of the device. If the device already matches the desired state (e.g., VLAN 100 already exists), no action is taken and the task reports "ok" rather than "changed." This means the same playbook serves multiple purposes: it deploys configuration on new devices, verifies compliance on existing devices (if it shows "changed," something has drifted), and restores configuration after failures. Without idempotency, running a script twice might add duplicate ACL entries or create loops in the configuration.

2. What is a key advantage of Ansible over traditional CLI management for a change that affects 100 network devices?

Answer: agentless operation and parallel execution. These are Ansible's two most significant advantages over manual CLI. Agentless means no software needs to be pre-installed on network devices; Ansible connects via SSH (or API) on demand, which is essential for network devices where installing agents is not possible. Parallel execution means Ansible works on many target devices at once rather than one after another (the number handled concurrently is set by the forks option). An engineer working manually on 100 devices at 10 minutes each takes over 16 hours. Ansible running against 100 devices with enough forks completes in roughly the same time as configuring a single device. Ansible also supports Cisco IOS, NX-OS, IOS XE, Juniper, Arista, and many other vendors, making it well suited to multi-vendor environments.

3. What is configuration drift, and how does automation address it?

Correct answer is D. Configuration drift is a pervasive problem in manually managed networks. It happens when engineers make ad-hoc changes ("just this once"), troubleshooting commands are left in the configuration, different engineers apply slightly different standards, or emergency changes are made without full documentation. Over months and years, devices accumulate subtle differences from each other and from any standard template. This causes mysterious failures ("why does switch 47 behave differently from switch 46?"). Automation addresses drift in two ways: prevention (automated deployment from templates means no manual variation is introduced) and detection (idempotent compliance scripts regularly re-apply the desired state and report any device that required changes — indicating drift has occurred).
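Drift detection can be illustrated by comparing a device's running configuration with the standard. The sketch below is deliberately simplified: it compares flat lists of lines and ignores the hierarchy of a real IOS configuration, and the function name find_drift and the sample lines are hypothetical.

```python
def find_drift(desired: list[str], running: list[str]) -> dict[str, list[str]]:
    """Return lines missing from the device and lines present but not in the standard."""
    desired_set, running_set = set(desired), set(running)
    return {
        "missing": [line for line in desired if line not in running_set],
        "unexpected": [line for line in running if line not in desired_set],
    }


# Hypothetical hardening standard and one switch's running configuration.
standard = ["service password-encryption", "no ip http server", "ip ssh version 2"]
switch_47 = [
    "service password-encryption",
    "ip ssh version 2",
    "ip http server",               # ad-hoc change made "just this once"
    "logging console debugging",    # left behind after troubleshooting
]

report = find_drift(standard, switch_47)
print(report)
```

An empty report for both keys means the device matches the standard; anything else identifies exactly which lines drifted.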

4. Which Python library is best suited for connecting to Cisco IOS devices via SSH and sending CLI commands, especially on devices that do not have REST API support?

Correct answer is A. Netmiko was specifically created to solve the challenge of connecting to network devices via SSH and automating CLI interactions. Unlike raw Paramiko (which requires handling prompt detection and timing manually), Netmiko knows how to connect to Cisco IOS, handle the "Router#" prompt, enter enable mode, enter configuration mode, send commands, and capture output — all with device-type-specific handling. It supports over 80 device types. For devices without REST APIs (which is still the majority of production infrastructure), Netmiko is the go-to library. The other options are: requests (HTTP REST APIs), ncclient (NETCONF/XML), TextFSM (parsing output text — not connecting to devices).
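A minimal Netmiko sketch of that workflow is shown below. The address, credentials, and helper names (build_device, run_show_command) are placeholders chosen for illustration, and the call that contacts the device is commented out because it needs a reachable device and the third-party netmiko package.

```python
def build_device(host: str, username: str, password: str, secret: str = "") -> dict:
    """Collect the connection parameters Netmiko expects for a Cisco IOS device."""
    return {
        "device_type": "cisco_ios",   # selects Netmiko's IOS prompt handling
        "host": host,
        "username": username,
        "password": password,
        "secret": secret,             # enable password, if the device needs one
    }


def run_show_command(device: dict, command: str) -> str:
    """Open an SSH session, run one show command, and return its text output."""
    from netmiko import ConnectHandler   # third-party: pip install netmiko

    with ConnectHandler(**device) as conn:   # disconnects automatically on exit
        if device.get("secret"):
            conn.enable()                    # enter privileged EXEC mode
        return conn.send_command(command)


# Placeholder values: 192.0.2.10 is a documentation address, not a real device.
device = build_device("192.0.2.10", "admin", "example-password")
# output = run_show_command(device, "show ip interface brief")
```

In practice the credentials would come from a vault or environment variables rather than being written into the script.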

5. What HTTP method is used to retrieve information from a REST API, and what HTTP status code indicates a successful response with returned data?

Correct answer is C. In REST APIs, each HTTP method maps to a specific CRUD operation. GET = Read (retrieve data without modifying it). POST = Create (create a new resource). PUT = Update/Replace (replace an existing resource entirely). PATCH = Update (partially modify an existing resource). DELETE = Delete (remove a resource). See REST API Methods for the full breakdown. HTTP status codes indicate the result: 200 OK = success with response body; 201 Created = new resource successfully created (used with POST); 204 No Content = success but no response body (used with PUT and DELETE); 400 Bad Request = malformed request; 401 Unauthorized = authentication required; 404 Not Found = resource does not exist; 500 Internal Server Error = server-side problem. For RESTCONF GET operations on Cisco IOS XE, a successful response returns 200 OK with JSON data.
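The method-to-CRUD mapping and the status code classes listed above can be captured in a short lookup, using the reason phrases from Python's standard http module. The dictionary name CRUD_BY_METHOD and the helper describe are illustrative names, not part of any API.

```python
from http import HTTPStatus

# HTTP method -> CRUD operation, as described in the answer above.
CRUD_BY_METHOD = {
    "GET": "read",
    "POST": "create",
    "PUT": "replace",
    "PATCH": "partial update",
    "DELETE": "delete",
}


def describe(status_code: int) -> str:
    """Return the code, its standard reason phrase, and its broad class."""
    phrase = HTTPStatus(status_code).phrase
    if 200 <= status_code < 300:
        outcome = "success"
    elif 400 <= status_code < 500:
        outcome = "client error"
    elif status_code >= 500:
        outcome = "server error"
    else:
        outcome = "other"
    return f"{status_code} {phrase} ({outcome})"


for code in (200, 201, 204, 400, 401, 404, 500):
    print(describe(code))
```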

6. What is the difference between NETCONF and RESTCONF, and what data modelling language do both use?

Correct answer is B. NETCONF (RFC 6241) runs over SSH on port 830 and uses XML for both configuration and state data. Its key advantage is full transactional support: operations can be staged in a candidate configuration, validated, and then committed atomically — or rolled back if something fails. This is similar to a database transaction. RESTCONF (RFC 8040) is a newer, simpler REST-based alternative that uses standard HTTP methods (GET/POST/PUT/PATCH/DELETE) over HTTPS. It supports both JSON and XML. RESTCONF is easier for developers familiar with REST APIs but lacks NETCONF's transactional capabilities. Critically, both protocols use YANG as their data modelling language — YANG defines what objects exist and their data types, making the structured data exchanged by both protocols predictable and machine-readable.
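To make the RESTCONF side concrete, the sketch below builds (but does not send) a GET request for one interface in the IETF ietf-interfaces model. Several details are assumptions: the host address is a placeholder, the /restconf root is the commonly used default rather than something guaranteed on every device, and the helper name restconf_get_interface is hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def restconf_get_interface(host: str, interface: str) -> Request:
    """Build a RESTCONF GET request for one interface (nothing is sent)."""
    # The list key follows "="; a "/" inside the name must be percent-encoded.
    key = quote(interface, safe="")
    url = f"https://{host}/restconf/data/ietf-interfaces:interfaces/interface={key}"
    # Ask for JSON; "application/yang-data+xml" would request XML instead.
    return Request(url, headers={"Accept": "application/yang-data+json"}, method="GET")


req = restconf_get_interface("192.0.2.10", "GigabitEthernet0/1")
print(req.get_method(), req.full_url)
```

The URL path mirrors the YANG model's structure (module, container, list entry), which is what makes RESTCONF resources predictable.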

7. In a controller-based network architecture, what is the role of a northbound API and a southbound API?

Correct answer is D. The terms "northbound" and "southbound" describe the direction of the API relative to the controller, which sits in the middle of the management hierarchy. Northbound = "above" = toward management and orchestration applications. The northbound API is what Ansible, Terraform, or a custom application uses to talk to the controller — typically a REST API. Southbound = "below" = toward the physical/virtual network devices that carry traffic. The southbound API is how the controller pushes policy and configuration down to routers and switches — NETCONF, RESTCONF, OpenFlow, OVSDB, or even SSH CLI for legacy devices. The controller translates high-level intent (from northbound) into device-specific configuration (pushed southbound). See: Northbound & Southbound APIs
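The two directions can be pictured with a toy controller. Everything here is hypothetical (the Controller class, its method names, the device names, and the intent format); a real controller would expose the northbound side as a REST API and use NETCONF, RESTCONF, OpenFlow, or SSH on the southbound side instead of appending to a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    devices: list[str]
    pushed: dict[str, list[str]] = field(default_factory=dict)

    def northbound_request(self, intent: dict) -> None:
        """Entry point an application would call with high-level intent."""
        # Translate the intent into device-specific configuration lines.
        commands = [f"vlan {intent['vlan_id']}", f" name {intent['name']}"]
        for device in self.devices:
            self._southbound_push(device, commands)

    def _southbound_push(self, device: str, commands: list[str]) -> None:
        """Stand-in for the southbound protocol toward one device."""
        self.pushed.setdefault(device, []).extend(commands)


controller = Controller(devices=["sw01", "sw02"])
controller.northbound_request({"vlan_id": 100, "name": "USERS"})
print(controller.pushed)
```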

8. What is YANG, and what role does it play in NETCONF and RESTCONF?

Correct answer is A. YANG (RFC 6020, "Yet Another Next Generation") is the data modelling language that provides the schema for structured network management. Just as a database schema defines what tables and columns exist and what data types they hold, YANG defines what configuration objects exist on a network device, what their names are, what data types they hold, and what constraints apply. Both NETCONF (XML) and RESTCONF (JSON/XML) use YANG as their underlying data model — the same YANG model is expressed in XML for NETCONF and in JSON for RESTCONF. YANG models are standardised by the IETF (e.g., ietf-interfaces) and also defined by vendors (Cisco IOS XE, Juniper, etc.) for device-specific features. See: JSON, XML & YANG
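The point that one YANG model yields both encodings can be shown with the standard library alone. The sketch below encodes the same interface data as RESTCONF-style JSON and NETCONF-style XML using leaves from the ietf-interfaces model; the interface values are made up, and the helper to_xml is a hypothetical name.

```python
import json
import xml.etree.ElementTree as ET

# XML namespace of the IETF ietf-interfaces YANG module.
NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"
ET.register_namespace("", NS)   # serialise with a default namespace, no prefix

# Illustrative values for three leaves defined by the model.
interface = {"name": "GigabitEthernet1", "description": "Uplink", "enabled": True}

# JSON encoding: the top-level member is qualified with the module name.
as_json = json.dumps({"ietf-interfaces:interfaces": {"interface": [interface]}}, indent=2)


def to_xml(iface: dict) -> str:
    """Encode the same data as XML in the module's namespace."""
    root = ET.Element(f"{{{NS}}}interfaces")
    entry = ET.SubElement(root, f"{{{NS}}}interface")
    for leaf, value in iface.items():
        # YANG booleans are written as lowercase "true"/"false" in XML.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        ET.SubElement(entry, f"{{{NS}}}{leaf}").text = text
    return ET.tostring(root, encoding="unicode")


as_xml = to_xml(interface)
print(as_json)
print(as_xml)
```

The leaf names are identical in both outputs because they come from the model, not from the protocol.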

9. An engineer manually applies a security hardening standard to 200 switches over two weeks. Three months later, a security audit finds that 23 switches are non-compliant — some commands were missed, and some were subsequently changed by other engineers. What automation approach would have prevented this and would detect future non-compliance automatically?

Correct answer is C. This scenario illustrates two separate problems: initial deployment inconsistency and ongoing configuration drift. An idempotent Ansible playbook solves both. For initial deployment: the playbook runs in parallel against all 200 switches, and every device receives exactly the same configuration from the same template, with no missed commands. For ongoing compliance: running the same playbook daily as a scheduled job is safe because it is idempotent (no changes are made to a device that is already compliant). Any switch where changes are reported in the output ("changed=1") has drifted from the standard, and an alert can be triggered. This approach turns compliance from a periodic expensive audit into a continuous automated process. SNMP traps (option A) show that changes occurred but not whether they were compliant or non-compliant.

10. An Ansible playbook is run with changed=0 across all devices. What does this tell the engineer, and what is the significance compared to a manual CLI change verification?

Correct answer is B. This question tests understanding of idempotency in practice. When Ansible reports changed=0 for all tasks across all devices, it means the playbook checked the current state of every device, compared it to the desired state defined in the playbook, and found that every device already matches, so no configuration changes were required. This is a powerful compliance verification: in a single Ansible run against 200 switches, you get confirmation that all 200 are correctly configured. Compare this to manual verification: logging into 200 switches and checking each running configuration by hand would take an entire working day. The automated approach takes minutes and provides a machine-generated report. If any device shows changed=1, it means that device had drifted and the playbook brought it back into compliance automatically.
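Turning the changed counts into an alert list can be sketched as follows. The sample text is illustrative, modelled on the layout of Ansible's PLAY RECAP summary rather than captured from a real run, and the function name drifted_hosts is hypothetical. The sketch looks only at the changed counter and ignores failed or unreachable hosts.

```python
import re

# Matches "<host> : ok=<n> changed=<n> ..." with any amount of padding.
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)")


def drifted_hosts(recap: str) -> list[str]:
    """Return the hosts whose recap line reports at least one change."""
    drifted = []
    for line in recap.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match and int(match["changed"]) > 0:
            drifted.append(match["host"])
    return drifted


# Illustrative recap text for three hypothetical switches.
sample_recap = """\
sw01 : ok=3 changed=0 unreachable=0 failed=0
sw02 : ok=3 changed=1 unreachable=0 failed=0
sw03 : ok=3 changed=0 unreachable=0 failed=0
"""

print(drifted_hosts(sample_recap))
```

An empty list corresponds to the changed=0 case in the question: every device already matched the desired state.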
