
YANG: the schema behind every modern network API

A practical guide to YANG data models — the modeling language that defines what NETCONF, RESTCONF, and gNMI actually transport, and why it matters for network automation and AI agents.

Every modern network management protocol — NETCONF, RESTCONF, gNMI — transports structured data between a client and a network device. But none of them define the shape of that data. They’re the transport. Something else has to describe what fields exist on an interface, what parameters a BGP neighbor takes, what values are legal for an admin-status. That something is YANG.

YANG is the schema language the networking industry standardized on but few engineers actually learn. Most people interact with it indirectly: they use a Python library that was generated from YANG models, or they copy XML filters from a blog post without understanding the model that defines them. That works until it doesn’t — until you need to figure out why your RESTCONF call returns a different JSON structure than the one in the documentation, or why the same interface data looks completely different across three YANG model families on the same device.

If you’re building automation, you need to understand the data model layer. If you’re building AI agents that interact with network infrastructure — which is what MCP enables — you need it even more, because the agent has to know the schema to construct valid requests and parse responses. YANG is that schema.

What YANG is

YANG (RFC 7950) is a data modeling language for network management protocols. It was introduced in RFC 6020 in 2010 and updated to version 1.1 in RFC 7950 in 2016. It is not a protocol. It doesn’t move data. It defines the structure, types, and constraints of data that protocols like NETCONF and RESTCONF transport.

The closest analogies: JSON Schema defines the shape of JSON documents. Protobuf .proto files define the shape of serialized messages. YANG defines the shape of network configuration and state data. The data itself gets encoded as XML (for NETCONF) or JSON (RFC 7951 defines the JSON encoding) — YANG doesn’t care about the wire format.

Core constructs

Here are the building blocks. Every YANG model is composed of these:

  • module — the file. Defines a namespace, a prefix, and contains everything else. One module = one .yang file.
  • container — a JSON object. Groups related nodes together but has no value itself. interface is a container that holds leaves like name and enabled.
  • list — a table with keyed entries. A routing table is a list keyed by prefix. An interface list is keyed by name. Each entry is a container.
  • leaf — the atomic unit. A single typed value: an IP address, a counter, an admin-status enum. This is where actual data lives.
  • leaf-list — an array of scalar values. A list of DNS servers, a set of tags.
  • typedef — custom types built on primitives. Define ipv4-address as a string with a regex pattern, then use it everywhere.
  • grouping / uses — reusable templates. Define a block of nodes once, use it in multiple places. Like a struct you can embed.
  • augment — extend someone else’s model. Cisco augments the IETF interface model with platform-specific leaves. This is how vendor extensions work.
  • choice / case — mutually exclusive branches. An interface is either ethernet or loopback, not both. Choice enforces that.
  • identity / identityref — named constants with inheritance. iana-if-type:ethernetCsmacd is an identity. Interface types, address families, and protocol identifiers use this pattern.
  • deviation — “my device doesn’t fully implement this model.” A vendor declares which leaves they don’t support or where their implementation differs from the standard.
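To make the first few constructs concrete, here is a minimal sketch of how they surface in JSON-encoded instance data (RFC 7951), written as a Python dict. The node names come from ietf-interfaces; the interface values and the find_entry helper are made up for illustration.

```python
import json

# Illustrative instance data for ietf-interfaces, following RFC 7951 naming.
# The values are invented; they are not taken from a real device.
doc = {
    "ietf-interfaces:interfaces": {                  # container -> JSON object
        "interface": [                               # list -> array of objects
            {
                "name": "GigabitEthernet1",          # leaf, also the list key
                "type": "iana-if-type:ethernetCsmacd",       # identityref leaf
                "enabled": True,                     # boolean leaf
                "higher-layer-if": ["GigabitEthernet1.100"],  # leaf-list
            },
            {
                "name": "Loopback0",
                "type": "iana-if-type:softwareLoopback",
                "enabled": True,
            },
        ]
    }
}


def find_entry(entries, key, value):
    """Return the list entry whose key leaf equals value, or None."""
    return next((e for e in entries if e.get(key) == value), None)


interfaces = doc["ietf-interfaces:interfaces"]["interface"]
gi1 = find_entry(interfaces, "name", "GigabitEthernet1")
print(json.dumps(gi1, indent=2))
```

Note that the list key is just another leaf inside each entry. JSON has no native notion of a keyed table, so the client does the lookup.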

Reading the tree

The standard way to visualize a YANG model is pyang -f tree. Here’s the output for ietf-interfaces — the model that defines the base interface data structure every vendor implements:

module: ietf-interfaces
  +--rw interfaces
     +--rw interface* [name]
        +--rw name                        string
        +--rw description?                string
        +--rw type                        identityref
        +--rw enabled?                    boolean
        +--rw link-up-down-trap-enable?   enumeration
        +--ro admin-status                enumeration
        +--ro oper-status                 enumeration
        +--ro last-change?                yang:date-and-time
        +--ro if-index                    int32
        +--ro phys-address?               yang:phys-address
        +--ro higher-layer-if*            interface-ref
        +--ro lower-layer-if*             interface-ref
        +--ro speed?                      yang:gauge64
        +--ro statistics
           +--ro discontinuity-time    yang:date-and-time
           +--ro in-octets?            yang:counter64
           +--ro in-unicast-pkts?      yang:counter64
           +--ro in-broadcast-pkts?    yang:counter64
           +--ro in-multicast-pkts?    yang:counter64
           +--ro in-discards?          yang:counter32
           +--ro in-errors?            yang:counter32
           +--ro out-octets?           yang:counter64
           +--ro out-unicast-pkts?     yang:counter64
           +--ro out-broadcast-pkts?   yang:counter64
           +--ro out-multicast-pkts?   yang:counter64
           +--ro out-discards?         yang:counter32
           +--ro out-errors?           yang:counter32

The notation:

  • rw — read-write. This is configuration data. You can set it via NETCONF edit-config or RESTCONF PUT/PATCH.
  • ro — read-only. This is operational state. Counters, oper-status, speed — things the device reports but you don’t configure.
  • +-- — child node indicator.
  • ? — optional leaf. description? means the device may or may not return a description.
  • * — list. interface* means zero or more entries. higher-layer-if* is a leaf-list (multiple values).
  • [name] — the list key. Each interface entry is uniquely identified by its name leaf.

The rw vs ro distinction is critical for automation. When you call get_config() in ncclient, you get only the rw nodes: what was configured. When you call get(), you get both rw and ro: configuration plus operational state. Mixing these up is one of the most common sources of confusion when debugging NETCONF.
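The notation is regular enough to read mechanically. The sketch below is a rough illustration, not part of pyang: parse_tree_line is a hypothetical helper that classifies one line of tree output using the markers described above. It ignores choice/case nodes and feature annotations.

```python
import re

# Matches the node part of one `pyang -f tree` line. Choice and case nodes,
# written as (name) and :(name), are deliberately not handled in this sketch.
TREE_LINE = re.compile(
    r"\+--(?P<flag>rw|ro)\s+"
    r"(?P<name>[A-Za-z_][\w.-]*)"
    r"(?P<mark>[?*]?)"
    r"(?:\s+\[(?P<keys>[^\]]+)\])?"
    r"(?:\s+(?P<type>[^\s{]\S*))?"
)


def parse_tree_line(line):
    """Classify one tree line; return None if it is not a node line."""
    m = TREE_LINE.search(line)
    if m is None:
        return None
    has_type = m["type"] is not None
    if m["mark"] == "*":
        # A starred node with a type is a leaf-list; without one it is a list.
        kind = "leaf-list" if has_type else "list"
    else:
        kind = "leaf" if has_type else "container"
    return {
        "name": m["name"],
        "config": m["flag"] == "rw",      # rw = configuration, ro = state
        "optional": m["mark"] == "?",
        "kind": kind,
        "keys": m["keys"].split() if m["keys"] else [],
        "type": m["type"],
    }


for sample in (
    "     +--rw interface* [name]",
    "        +--rw description?                string",
    "        +--ro oper-status                 enumeration",
    "        +--ro statistics",
):
    print(parse_tree_line(sample))
```

Filtering the parsed nodes on config gives a quick preview of which nodes get_config() can return and which ones only show up in get().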

The Python ecosystem

Five tools worth knowing.

pyang is the Swiss Army knife. It validates YANG models, generates tree visualizations (pyang -f tree), and serves as the foundation for code generation plugins. If you’re doing anything with YANG, pyang is the first tool you install. It’s also used to generate UML diagrams, DSDL schemas, and sample XML — but tree output is what you’ll use daily.

ncclient is the NETCONF workhorse. It opens an SSH session to port 830, exchanges capabilities, and lets you send XML-encoded RPCs. get_config() pulls from the configuration datastore. get() pulls configuration plus operational state. edit_config() pushes changes. It’s low-level — you’re constructing XML filters and parsing XML responses — but it maps directly to the protocol and there’s no magic hiding behavior from you.

pyangbind generates Python classes from YANG models. You run pyang with the pyangbind plugin, and it produces a Python module where YANG containers become objects and leaves become type-checked attributes. Set oc_iface.config.enabled = True, then serialize to JSON. It brings type safety to what would otherwise be raw dict manipulation. Built by Rob Shakir, who also co-authored the OpenConfig models.

yangson is a JSON-focused YANG library from CZ.NIC Labs. It validates JSON instance data against YANG schemas — useful if you’re working with RESTCONF responses and want to verify they conform to the model. Niche but fills a gap that the other tools don’t cover.

ydk-py (archived) — Cisco’s YANG Development Kit was the most ambitious attempt at this problem. It generated full Python SDKs from YANG models — complete class hierarchies with methods for CRUD operations. The project is now archived and no longer maintained. Worth noting because it demonstrates both the value and the difficulty of auto-generating SDKs from YANG: the generated code was correct but the maintenance burden across hundreds of models and firmware versions was unsustainable.

Three models, one interface

Here’s where YANG gets complicated in practice. The same physical interface (GigabitEthernet1 on a Cisco IOS XE device) looks different depending on which YANG model family you query. Three families coexist on the same device: OpenConfig, IETF, and Cisco Native.

Same device. Same physical interface. Three different JSON structures.
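The divergence starts before you even see a response body: each family puts the interface at a different RESTCONF resource path. The sketch below builds those paths for one interface name. The module names are the ones these families publish, but treat the exact paths as an assumption to verify against the modules your device advertises. restconf_paths and split_native_name are hypothetical helpers.

```python
import re
from urllib.parse import quote


def split_native_name(name):
    """Split 'GigabitEthernet1/0/1' into ('GigabitEthernet', '1/0/1')."""
    m = re.fullmatch(r"([A-Za-z-]+)(\d.*)", name)
    if m is None:
        raise ValueError(f"cannot split interface name: {name!r}")
    return m.group(1), m.group(2)


def restconf_paths(name):
    """Return the RESTCONF data path for one interface in each model family."""
    key = quote(name, safe="")            # list keys must be percent-encoded
    if_type, if_id = split_native_name(name)
    native_key = quote(if_id, safe="")
    return {
        "openconfig": f"/restconf/data/openconfig-interfaces:interfaces/interface={key}",
        "ietf": f"/restconf/data/ietf-interfaces:interfaces/interface={key}",
        # The native model keys each interface type by its number only.
        "native": f"/restconf/data/Cisco-IOS-XE-native:native/interface/{if_type}={native_key}",
    }


for family, path in restconf_paths("GigabitEthernet1").items():
    print(f"{family:10} {path}")
```

The native path is the odd one out. The interface type becomes a node name and only the number is the key, which is why the name leaf comes back as "1".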

OpenConfig is vendor-neutral, driven by operators (Google, Microsoft, AT&T). It puts config and state in the same container — config holds the intended values, state holds the applied values and counters side by side. Written in YANG 1.0. The most portable option for multi-vendor automation.

IETF models are the simplest and cleanest. ietf-interfaces (RFC 8343) defines the base, with ietf-ip augmenting it for IP addressing. Limited in scope — you won’t find QoS, VRF, or platform-specific features here — but they’re the most stable and universally supported.

Cisco Native models mirror the CLI structure. GigabitEthernet is the container, name is just "1" (not the full interface name), and the hierarchy follows the IOS command tree. Most comprehensive, least portable. If you know the CLI, you can guess the JSON path. If you need to work across vendors, these are useless.

The vendor deviation problem

In practice, no vendor implements a standard model completely. Cisco publishes YANG deviations: separate .yang files that declare which leaves from the IETF or OpenConfig models are “not-supported” or “replace.” These deviations vary by platform and firmware version. An IOS XE 17.6 device may support a leaf that 17.3 marks as deviated. OpenConfig models also get augmented with vendor-specific extensions that break the portability promise.

The single hardest rule to learn: never mix model families for the same feature. If you configure an interface via openconfig-interfaces, read it back via openconfig-interfaces. Querying the same data through ietf-interfaces or the native model may return stale or mismatched state.

The recommendation: OpenConfig first for multi-vendor consistency. IETF second where OpenConfig lacks coverage. Vendor native as fallback for platform-specific features that no standard model covers.
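That preference order is easy to encode. The following is a minimal sketch, assuming you already have the set of module names the device advertises (for example from its NETCONF capabilities). family_rank and pick_module are hypothetical names, and classifying families by module-name prefix is a simplification.

```python
def family_rank(module):
    """Lower rank means more preferred: OpenConfig, then IETF, then native."""
    if module.startswith("openconfig-"):
        return 0
    if module.startswith("ietf-"):
        return 1
    return 2


def pick_module(candidates, supported):
    """Pick the preferred module for one feature, or None if none is supported.

    candidates: modules that can express the feature, in any order.
    supported:  set of module names the device advertises.
    """
    usable = [m for m in candidates if m in supported]
    return min(usable, key=family_rank) if usable else None


INTERFACE_MODULES = ["Cisco-IOS-XE-native", "ietf-interfaces", "openconfig-interfaces"]

# A device that advertises all three families.
print(pick_module(INTERFACE_MODULES, set(INTERFACE_MODULES)))

# A device without OpenConfig support falls back to the IETF model.
print(pick_module(INTERFACE_MODULES, {"ietf-interfaces", "Cisco-IOS-XE-native"}))
```

Pick once per feature and reuse the result for both writes and reads, which keeps you inside the rule about never mixing families.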

End-to-end: NETCONF vs RESTCONF

Two protocols, same YANG models underneath. The running example here is fetching GigabitEthernet1 data from a Cisco IOS XE device.

The critical distinction: get_config() returns only configuration data (read-write nodes). get() returns configuration plus operational state (read-write and read-only). If you need counters, oper-status, or speed, you must use get(). If you only need what was configured, get_config() is lighter. RESTCONF doesn’t have this split — it returns both config and state by default. Use the content=nonconfig query parameter to get only operational state, or content=config for configuration only.
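Both sides of that distinction can be sketched in a few lines. The example below uses the ietf-interfaces model and keeps the request building (filter and URL) separate from the network calls so you can inspect it offline. Host names, credentials, and the helper names are placeholders. hostkey_verify=False is a lab-only shortcut, and a device with a self-signed certificate will need extra TLS handling on the RESTCONF side.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"


def interface_filter(name):
    """Build a NETCONF subtree filter selecting one interface list entry."""
    root = ET.Element("interfaces", xmlns=IF_NS)
    entry = ET.SubElement(root, "interface")
    ET.SubElement(entry, "name").text = name
    return ET.tostring(root, encoding="unicode")


def restconf_url(host, name, content=None):
    """Build the RESTCONF URL; content is None, 'config', or 'nonconfig'."""
    path = "/restconf/data/ietf-interfaces:interfaces/interface=" + quote(name, safe="")
    query = "?" + urlencode({"content": content}) if content else ""
    return f"https://{host}{path}{query}"


def netconf_fetch(host, username, password, name, state=True):
    """state=True uses get() (config + state); False uses get_config()."""
    from ncclient import manager  # third-party: pip install ncclient

    flt = ("subtree", interface_filter(name))
    with manager.connect(host=host, port=830, username=username,
                         password=password, hostkey_verify=False) as m:
        reply = m.get(filter=flt) if state else m.get_config(source="running", filter=flt)
        return reply.data_xml


def restconf_fetch(host, username, password, name, content=None):
    """GET the same interface over RESTCONF and return the decoded JSON."""
    import requests  # third-party: pip install requests

    resp = requests.get(
        restconf_url(host, name, content),
        auth=(username, password),
        headers={"Accept": "application/yang-data+json"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()


# Inspect what would be sent, without touching a device.
print(interface_filter("GigabitEthernet1"))
print(restconf_url("router.example.net", "GigabitEthernet1", content="nonconfig"))
```

The same interface name ends up in two very different request shapes: an XML subtree filter for NETCONF, a keyed URL path for RESTCONF. The YANG model is what tells you how to build both.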

| Aspect | NETCONF | RESTCONF |
| --- | --- | --- |
| Transport | SSH (port 830) | HTTPS |
| Encoding | XML | XML or JSON |
| Statefulness | Session-based | Stateless |
| Transactions | Candidate config, commit/rollback | Per-request atomic |
| Python library | ncclient | requests |
| Best for | Bulk config, transactions, legacy devices | Quick reads, CI/CD integration, modern tooling |

NETCONF is the heavier tool — session-based, XML-only, but with proper transactional semantics (candidate datastore, commit, rollback). RESTCONF is lighter — stateless HTTPS, JSON support, works with any HTTP client — but lacks native multi-operation transactions. For read-only automation and troubleshooting, RESTCONF is simpler. For configuration changes that need atomicity and rollback, NETCONF is safer.

YANG and MCP

YANG’s type system maps naturally to JSON Schema. Containers become objects with properties. Lists become arrays of objects. Leaves become primitives. Leaf-lists become arrays of primitives. Enumerations become JSON Schema enum. This means a YANG model can, in principle, be compiled into the inputSchema of an MCP tool definition — giving an LLM agent a machine-readable description of what parameters a network operation accepts and what it returns.
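Here is a toy version of that mapping, to show the easy part. It assumes a hand-written dict description of the YANG nodes (the "kind", "children", and "mandatory" keys are invented for this sketch, not a pyang data structure) and covers only containers, lists, leaves, leaf-lists, and enumerations.

```python
import json

# RFC 7951 encodes 64-bit integers as JSON strings, so uint64 maps to "string".
PRIMITIVES = {
    "string": {"type": "string"},
    "boolean": {"type": "boolean"},
    "int32": {"type": "integer"},
    "uint32": {"type": "integer", "minimum": 0},
    "uint64": {"type": "string"},
}


def leaf_schema(node):
    """Map a leaf (or one leaf-list item) to a JSON Schema fragment."""
    if "enum" in node:
        return {"type": "string", "enum": list(node["enum"])}
    return dict(PRIMITIVES[node["type"]])


def to_json_schema(node):
    """Recursively convert the toy YANG node description to JSON Schema."""
    kind = node["kind"]
    if kind == "leaf":
        return leaf_schema(node)
    if kind == "leaf-list":
        return {"type": "array", "items": leaf_schema(node)}
    children = node.get("children", [])
    obj = {
        "type": "object",
        "properties": {c["name"]: to_json_schema(c) for c in children},
    }
    required = [c["name"] for c in children if c.get("mandatory")]
    if required:
        obj["required"] = required
    # A list is an array of objects; a container is the object itself.
    return {"type": "array", "items": obj} if kind == "list" else obj


# Trimmed-down description of ietf-interfaces (enum values are a subset).
MODEL = {
    "kind": "container", "name": "interfaces", "children": [{
        "kind": "list", "name": "interface", "children": [
            {"kind": "leaf", "name": "name", "type": "string", "mandatory": True},
            {"kind": "leaf", "name": "enabled", "type": "boolean"},
            {"kind": "leaf", "name": "oper-status", "enum": ["up", "down", "testing"]},
            {"kind": "leaf-list", "name": "higher-layer-if", "type": "string"},
        ],
    }],
}

schema = to_json_schema(MODEL)
print(json.dumps(schema, indent=2))
```

Everything this sketch leaves out (list keys and uniqueness, cross-references, conditional nodes, per-device variation) is where a real translation layer has to do actual work.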

The IETF draft on MCP applicability demonstrates this explicitly, claiming that “private YANG is compiled to JSON-Schema in the controller (milliseconds); tool registers via /tools/register and is immediately callable.” The concept is sound. The reality is harder. YANG has constructs that don’t map cleanly to JSON Schema: leafref (a pointer to another node in the data tree — no JSON Schema equivalent), augment (requires flattening across modules), deviation (per-device schema variations), and when/must constraints (XPath expressions that JSON Schema’s conditional support can’t fully represent).

At scale — hundreds of YANG models across multiple vendors, each with their own deviations and augmentations — this translation layer is where the real complexity lives. It’s solvable, but it’s engineering work, not a simple compiler pass.

Building an MCP server for networks

The open-source vendor-agnostic MCP server project, covering architecture and roadmap.

Inside the IETF's MCP network management drafts

A deep analysis of the IETF draft constellation — architecture, companion specs, and gap analysis.

The schema underneath

YANG isn’t exciting. It’s a schema language. But it’s the schema language — the single source of truth for what data network devices expose and accept. Whether you’re writing NETCONF filters by hand, building RESTCONF clients, generating Python classes with pyangbind, or compiling MCP tool schemas for AI agents, YANG is the model underneath. The protocols change, the encodings change, the clients change. The data model doesn’t. Learn it once, use it everywhere.
