
Inside the IETF's MCP network management drafts

A deep technical analysis of draft-yang-nmrg-mcp-nm and its constellation of companion specs — what the IETF is proposing for MCP in network management, what holds up, and what's missing.


Three days ago I wrote about the vendor-agnostic MCP server I’m building for network troubleshooting. That post mentioned an IETF Internet-Draft in passing. This post takes that draft apart.

Related: The open-source vendor-agnostic MCP server I’m building, covering the architecture and roadmap.

What you find when you look closely isn’t one draft. It’s a constellation of at least seven, mostly from Huawei engineers, with co-authors from Telefonica, Deutsche Telekom, and Orange. That coalition — Huawei engineers plus three European tier-1 telcos — signals this is more than an academic exercise.

The draft constellation

Here’s what exists as of March 2026:

| Draft | Focus |
|---|---|
| draft-yang-nmrg-mcp-nm-02 | Primary architecture and deployment scenarios |
| draft-zw-nmrg-mcp-network-mgmt-00 | Technical spec: tools, resources, prompts, error codes |
| draft-zeng-opsawg-applicability-mcp-a2a-00 | Gap analysis: MCP + A2A vs NETCONF |
| draft-zeng-nmrg-mcp-usecases-requirements-00 | Problem statement and use cases |
| draft-zm-rtgwg-mcp-troubleshooting-01 | Intent-based troubleshooting |
| draft-zm-rtgwg-mcp-network-measurement-01 | Network measurement |
| draft-zhao-nmop-network-management-agent | AI agent framework for NM |

All are individual submissions spread across four IETF/IRTF groups — NMRG, NMOP, OPSAWG, and RTGWG. None carry formal IETF endorsement — the drafts say so explicitly. But the breadth of the effort matters. These aren’t scattered experiments; they cover architecture, tooling specifications, gap analysis, and vertical use cases. Someone is building a standards narrative.

The primary draft, draft-yang-nmrg-mcp-nm-02, has gone through three versions since July 2025. The evolution tells a story: v00 framed MCP as a way to “develop AI applications for network management.” By v02, the framing shifted to “refactor network management operations and network capabilities as tools.” That’s a meaningful change — from AI as the goal to AI as the integration method.

The architecture: three pillars and four deployment models

The primary draft proposes MCP as an AI integration layer on top of existing network management infrastructure. Three pillars hold it up.

Encapsulation: vendor-specific device operations get wrapped into MCP Tools with uniform JSON schemas. A centralized Tool Registry stores the metadata. You register a tool once; every MCP client discovers the same interface regardless of the underlying vendor CLI or YANG model.
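The register-once, discover-everywhere idea can be sketched in a few lines. This is a minimal illustration, not code from the draft; the function names and schema fields are assumptions, though the `network.cli.exec` tool name and the JSON-Schema-shaped `inputSchema` contract come from the companion spec.

```python
# Minimal sketch of a centralized Tool Registry: a tool is registered once
# with a uniform JSON schema, and every client sees the same interface
# regardless of the vendor CLI or YANG model behind it.
TOOL_REGISTRY = {}

def register_tool(name, description, input_schema):
    """Store tool metadata under a uniform JSON-Schema contract."""
    TOOL_REGISTRY[name] = {
        "name": name,
        "description": description,
        "inputSchema": input_schema,
    }

register_tool(
    "network.cli.exec",
    "Execute an operational show command",
    {
        "type": "object",
        "properties": {
            "device": {"type": "string"},
            "command": {"type": "string"},
        },
        "required": ["device", "command"],
    },
)

def list_tools():
    """Roughly what an MCP client would receive from tools/list."""
    return list(TOOL_REGISTRY.values())
```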

Intent translation: an LLM maps natural language to structured JSON-RPC tool calls. The draft is careful here — the LLM generates the tool call, but the MCP client parses and executes the request. The LLM is the translator, not the executor.
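The division of labor is easiest to see in the wire format. The `tools/call` method name is from the MCP specification; everything else here (helper name, example arguments) is illustrative. The LLM's structured output becomes the `params` object, and the client owns serialization and transport:

```python
import json

def build_tool_call(tool_name, arguments, request_id=1):
    """Wrap an LLM-produced tool name and arguments into a JSON-RPC 2.0
    tools/call request. The MCP client, not the LLM, sends and executes it."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

req = build_tool_call(
    "network.cli.exec",
    {"device": "edge-rtr-1", "command": "show ip interface brief"},
)
```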

Closed-loop automation: user input flows through intent parsing, tool discovery, execution, result aggregation, and LLM summarization back to the user. The loop supports retry and rollback, with full traceability of every LLM decision.
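The loop's shape, with the retry and traceability requirements, can be sketched as follows. All of the callables here (`parse`, `discover`, `execute`, `summarize`) are hypothetical placeholders for stages the draft describes:

```python
def closed_loop(intent, parse, discover, execute, summarize, max_retries=2):
    """Sketch of the draft's loop: parse intent, discover the tool, execute
    with retry, and record every step so each decision stays traceable.
    Falls through to a rollback marker when retries are exhausted."""
    trace = []
    call = parse(intent)
    trace.append(("parse", call))
    tool = discover(call["tool"])
    trace.append(("discover", tool))
    for _ in range(max_retries + 1):
        try:
            result = execute(tool, call["args"])
            trace.append(("execute", result))
            return summarize(result), trace
        except RuntimeError as exc:
            trace.append(("retry", str(exc)))
    trace.append(("rollback", call["tool"]))
    return None, trace
```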

Where it gets interesting is the four deployment scenarios introduced in v02:

Scenario 1 is the most surprising: device-to-device MCP communication, with a Small Language Model running on the network element’s CPU for local intent parsing. Think natural language troubleshooting commands directly on the CLI, parsed locally without cloud dependency. The draft acknowledges the resource constraints — limited CPU, limited memory — but the concept of SLMs embedded in network equipment is the most forward-looking idea in any of these drafts. Today it’s aspirational. Give it two hardware generations.

Scenario 3 is the most practical: a standalone MCP server that acts as a protocol adaptor, translating MCP tool calls into NETCONF, RESTCONF, or gNMI operations against actual network elements. This is essentially what my open-source project does — and it validates the architectural approach.
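The adaptor's core job reduces to a mapping from tool names to southbound operations. A minimal sketch, assuming the companion spec's tool names; the XML templates are simplified illustrations, and a real adaptor would drive a NETCONF library against the device's actual YANG models rather than emit raw strings:

```python
def to_netconf_rpc(tool, args):
    """Translate an MCP tool call into the NETCONF RPC payload it would
    carry. Only two of the seven companion-spec tools are shown."""
    if tool == "network.yang.get":
        return (
            "<get-config><source><running/></source>"
            '<filter type="subtree">' + args["subtree"] + "</filter>"
            "</get-config>"
        )
    if tool == "network.commit":
        return "<commit/>"
    raise ValueError("no NETCONF mapping for " + tool)
```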

The other two scenarios — controller consuming external APIs (Scenario 2) and gateway-to-device direct MCP (Scenario 4) — fill out the deployment matrix but are less novel.

The companion spec: actual tools and error codes

The architectural draft paints the vision. The real technical substance lives in draft-zw-nmrg-mcp-network-mgmt-00, which defines concrete MCP extensions for network equipment.

Seven tools:

| Tool | Description |
|---|---|
| network.cli.exec | Execute operational show commands |
| network.cli.configure | Enter config mode, send commands |
| network.yang.get | Retrieve a YANG data node |
| network.yang.edit | Edit candidate datastore |
| network.commit | Commit candidate to running |
| network.rollback | Rollback to previous commit |
| network.file.pull / push | Backup and restore config files |

A resource URI scheme: network:/// with templates like network:///interface/{name} for YANG data, network:///routing/ipv4/route-table for the RIB, and network:///system/cpu-utilization for operational metrics.
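The templates are plain URI patterns, so expansion is trivial string substitution. A sketch using the draft's three example URIs; the dictionary keys are made up:

```python
# The draft's resource URI templates; static resources take no parameters.
RESOURCE_TEMPLATES = {
    "interface": "network:///interface/{name}",
    "ipv4-rib": "network:///routing/ipv4/route-table",
    "cpu": "network:///system/cpu-utilization",
}

def resource_uri(kind, **params):
    """Expand a template into a concrete resource URI."""
    return RESOURCE_TEMPLATES[kind].format(**params)
```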

Three prompt templates: network.troubleshoot.ping-fail for step-by-step diagnosis, network.config.add-vlan for interactive config wizards, and network.security.audit for compliance checks.

And ten network-specific error codes (-32081 through -32090): Network.Timeout, Network.Unreachable, Network.AccessDenied, Network.ConfigIncompatible, Network.RollbackFailed, Network.YangSyntaxError, Network.HardwareFailure, among others.
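On the wire these are ordinary JSON-RPC error objects constrained to the draft's reserved range. A sketch; the draft assigns names to the range -32081 through -32090, but which integer maps to which name is illustrative here:

```python
def network_error(code, message, request_id):
    """Build a JSON-RPC error response using the draft's reserved range of
    network-specific codes (-32081 through -32090)."""
    if not (-32090 <= code <= -32081):
        raise ValueError("code outside the network-specific range")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": code, "message": message},
    }

timeout = network_error(-32081, "Network.Timeout: no reply from device", 3)
```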

The capability advertisement system matters. Servers declare yangModules, cliDialect, configDatastore, notificationStream, maxBulkEdit, and supportsRollback. This lets an MCP client know what a device can do before it tries to do it — solving a real pain point in multi-vendor environments where capability discovery is often trial and error.
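Client-side, the advertisement enables a cheap pre-flight check. The field names (`supportsRollback`, `yangModules`, `cliDialect`) are from the companion spec; the gating logic and sample values are assumptions:

```python
def supports(capabilities, tool):
    """Decide, before calling, whether the server can honor the tool."""
    if tool == "network.rollback":
        return bool(capabilities.get("supportsRollback"))
    if tool.startswith("network.yang."):
        return bool(capabilities.get("yangModules"))
    return True

# A CLI-only device: no YANG models, no rollback support.
caps = {"cliDialect": "cisco-ios", "supportsRollback": False, "yangModules": []}
```

With this in place, a client skips `network.rollback` against the device above instead of discovering the limitation through a failed call.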

The tool naming convention (network.cli.exec, network.yang.get) establishes a namespace that could become a standard vocabulary. If the industry converges on these names, interoperability between independent MCP server implementations becomes possible without coordination. That’s a big “if,” but the effort to define it matters.

The gap analysis: architectural invariants of RFC 6241

The third draft worth reading closely is draft-zeng-opsawg-applicability-mcp-a2a-00, titled “When NETCONF Is Not Enough.” It contains the strongest conceptual contribution across all the drafts.

The framing: the limitations of NETCONF aren’t implementation defects. They’re architectural invariants of RFC 6241. NETCONF was designed for XML-centric configuration transactions. What it wasn’t designed for:

| Gap | Root cause in NETCONF | MCP approach |
|---|---|---|
| AI semantic layer | XML-centric, no function registry | tools/list + JSON-Schema |
| DevOps iteration speed | YANG revision cycles of 6–9 months, firmware lock-in | MCP tool hot-registration |
| Large artifact delivery | 64 kB chunk limit | MCP/A2A artifact delivery via cloud URLs |
This is the right way to think about the relationship. NETCONF isn’t broken. It does what it was built for — reliable, transactional configuration management. But the world moved. AI agents need function registries, not XML document stores. DevOps teams need to ship tool updates without waiting for a YANG revision cycle and a firmware release. Observability pipelines produce artifacts that don’t fit in a 64 kB chunk.

The draft is careful to bound its claim: “Outside these scenarios, NETCONF continues to provide the most robust configuration transactions and should remain the south-bound protocol of choice.” That intellectual honesty strengthens the argument. MCP fills gaps; it doesn’t replace the stack.

What the drafts get right

The interworking position is sound. Section 9 of the primary draft states explicitly that MCP does not replace network management protocols or YANG data models. It integrates with them. Every architectural decision flows from this constraint, and it’s the correct one. Anyone proposing to replace NETCONF with JSON-RPC hasn’t operated a network at scale.

The security section is unusually honest for an early draft. It catalogs prompt injection, tool poisoning, rug pulls (post-installation modification of tool behavior), and tool shadowing (name collisions across MCP servers). It acknowledges that MCP lacks inherent auth/authz. It raises the identity ambiguity problem: when a request arrives at a network device, is it from the end user, the AI agent, or a shared system account? Most vendor whitepapers don’t touch these issues.

The MCP Repository concept addresses a real operational problem. In a network with hundreds of devices, you need a way to discover which MCP servers exist, what tools they expose, and how to authenticate. The draft proposes a centralized registry with registration, discovery, and OAuth-based authorization. The flow — server registers capabilities, client queries registry, registry returns server address, client authenticates via OAuth, client invokes tools — is straightforward and operational.

The SLM-on-device concept is the right long-term bet. Running small language models locally on network equipment for troubleshooting, without cloud round-trips, solves latency and air-gap constraints that matter in production networks. The draft doesn’t oversell it — it acknowledges resource limitations — but planting the flag is valuable.

What the drafts miss

No engagement with MCP’s own auth story. The MCP specification defines OAuth 2.1 authorization with scope-based consent. The drafts propose OAuth for server discovery but don’t address how MCP’s built-in auth mechanisms map to network management access control. TACACS+ and RADIUS are still the dominant AAA frameworks in production networks. The gap between MCP OAuth 2.1 and existing network AAA is real and unaddressed.

The code examples use hardcoded credentials. The appendix examples in the primary draft embed usernames and passwords directly in Python code. For a draft that dedicates a section to security concerns, this is a conspicuous oversight. It signals that the implementation guidance hasn’t caught up with the threat model.
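The fix is not exotic. A minimal sketch of runtime credential resolution; the environment variable names are illustrative, and a production deployment would pull from a secrets manager or vault rather than the process environment:

```python
import os

def device_credentials(device):
    """Resolve credentials at runtime instead of embedding them in source,
    as the draft's appendix examples do. Fails loudly when unprovisioned."""
    user = os.environ.get("NETOPS_USERNAME")
    password = os.environ.get("NETOPS_PASSWORD")
    if not user or not password:
        raise RuntimeError("credentials for " + device + " are not provisioned")
    return user, password
```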

No discussion of MCP sampling. The MCP specification includes a sampling capability that lets the server request LLM completions from the client. In a network management context, this could enable the server to ask the LLM to interpret ambiguous device output or classify an error condition. The drafts don’t mention it.
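What such a server-initiated request would look like: the `sampling/createMessage` method name and message shape follow the MCP specification, while the classification prompt and parameters are my own illustration of the network-management use:

```python
import json

def sampling_request(device_output, request_id=7):
    """Sketch of a sampling/createMessage request the server could send,
    asking the client's LLM to classify ambiguous device output."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sampling/createMessage",
        "params": {
            "messages": [{
                "role": "user",
                "content": {
                    "type": "text",
                    "text": "Classify this device output as OK/WARN/FAIL:\n" + device_output,
                },
            }],
            "maxTokens": 50,
        },
    })
```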

The discovery/registry concept reinvents service mesh patterns. The MCP Repository looks a lot like Consul or Eureka with an OAuth sidecar. That’s not necessarily wrong — network operations teams may not want to depend on cloud-native service mesh infrastructure — but the draft doesn’t acknowledge the prior art or explain why a new discovery mechanism is needed.

The authorship concentration question

Six of the seven drafts have Huawei employees as primary authors. The European telco co-authors provide geographic diversity, but the technical direction is concentrated. History suggests that IETF drafts with narrow authorship bases face adoption challenges — not because the technical content is wrong, but because implementation diversity drives interoperability testing, and interoperability testing drives adoption. The drafts would benefit from co-authors at Cisco, Juniper, or Arista.

No treatment of YANG-to-JSON-Schema translation at scale. The architecture assumes MCP tools derive their schemas from YANG models, but YANG-to-JSON-Schema conversion is non-trivial. YANG has constructs — leafref, augment, deviation, when/must constraints — that don’t map cleanly to JSON Schema. At scale, with hundreds of models across vendors, this translation layer is where the complexity lives. The drafts wave at it but don’t dig in.
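A concrete instance of the gap, under my own framing rather than anything in the drafts: a YANG `leafref` such as `leaf outgoing-interface { type leafref { path "/if:interfaces/if:interface/if:name"; } }` degrades to a bare `{"type": "string"}` in JSON Schema, silently dropping the referential constraint. A translation layer has to re-add that check out of band:

```python
def validate_leafref(value, referenced_values):
    """Enforce what the translated JSON Schema cannot express: the value
    must name a node that exists at the leafref's target path."""
    if value not in referenced_values:
        raise ValueError(repr(value) + " does not reference an existing node")
    return value
```

Multiply this by `augment`, `deviation`, and `when`/`must` constraints across hundreds of vendor models, and the translation layer becomes a substantial piece of software in its own right.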

What this means for you

If you run network infrastructure, you don’t need to act on these drafts today. They’re informational, not standards-track. No vendor is shipping implementations against them.

But the trajectory matters. A multi-vendor coalition is building a standards narrative around MCP as the AI integration layer for network management. The architectural position — MCP wraps existing protocols, doesn’t replace them — is sound. The companion spec’s tool definitions and error codes could become a shared vocabulary. And the gap analysis framing — NETCONF’s limitations as architectural invariants, not bugs — gives you language for evaluating where AI-assisted operations actually add value in your stack.

Watch the NMRG proceedings. Watch for implementation diversity beyond Huawei. And if you’re building MCP integrations for network operations now, align your tool naming with network.cli.* and network.yang.* from the companion spec. If the industry converges, you’ll already be there.
