Using LLMs to Parse Cisco Show Commands
Using large language models to extract structured data from show command output when TextFSM templates don't exist or are too rigid.
The Parsing Problem
Every network engineer has been here: you need structured data from a Cisco show command, but there is no TextFSM template for your specific output format. Maybe it is a newer IOS-XE version with slightly different spacing, or a less common show command that nobody has written a parser for.
Here is a real example. You run show ip bgp summary and get this:
BGP router identifier 10.0.0.1, local AS number 65001
BGP table version is 142, main routing table version 142
12 network entries using 2976 bytes of memory
15 path entries using 2040 bytes of memory
4/3 BGP path/bestpath attribute entries using 1152 bytes of memory
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.2        4        65002    8421    8390      142    0    0 3d14h           8
10.0.0.3        4        65002    8419    8391      142    0    0 3d14h           4
10.0.0.4        4        65003       0       0        1    0    0 never    Active
192.168.1.1     4        65100   14223   14201      142    0    0 5w2d           12
You could write a regex. You could use TextFSM. Or you could ask an LLM.
The LLM Approach
The idea is simple: send the raw output to an LLM with a prompt describing the JSON schema you want back. The LLM handles the parsing, including edge cases like “never” in the Up/Down column and “Active” as a state instead of a prefix count.
import json

from openai import OpenAI

client = OpenAI()

def parse_bgp_summary(raw_output: str) -> dict:
    """Parse show ip bgp summary output using an LLM."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,  # deterministic output for parsing
        response_format={"type": "json_object"},  # force syntactically valid JSON
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a network data parser. Extract structured data "
                    "from Cisco IOS show command output. Return valid JSON only."
                ),
            },
            {
                "role": "user",
                "content": f"""Parse this BGP summary into JSON with this schema:
{{
  "router_id": "string",
  "local_as": number,
  "table_version": number,
  "neighbors": [
    {{
      "address": "string",
      "version": number,
      "remote_as": number,
      "messages_received": number,
      "messages_sent": number,
      "up_down": "string",
      "state_pfx_received": "string | number",
      "is_established": boolean
    }}
  ]
}}

Raw output:
{raw_output}""",
            },
        ],
    )
    return json.loads(response.choices[0].message.content)
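In practice, the raw text comes straight off the device. A minimal sketch of feeding it in via Netmiko; the host and credentials here are placeholders, not values from this setup:

from netmiko import ConnectHandler

# Placeholder connection details: substitute your own device and credentials
device = {
    "device_type": "cisco_ios",
    "host": "10.0.0.1",
    "username": "admin",
    "password": "secret",
}

with ConnectHandler(**device) as conn:
    raw_output = conn.send_command("show ip bgp summary")

summary = parse_bgp_summary(raw_output)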
Validating the Output
Never trust LLM output blindly. We validate with Pydantic:
from pydantic import BaseModel

class BGPNeighbor(BaseModel):
    address: str
    version: int
    remote_as: int
    messages_received: int
    messages_sent: int
    up_down: str
    state_pfx_received: str | int
    is_established: bool

class BGPSummary(BaseModel):
    router_id: str
    local_as: int
    table_version: int
    neighbors: list[BGPNeighbor]

# Validate the LLM output
parsed = parse_bgp_summary(raw_output)
validated = BGPSummary(**parsed)

for neighbor in validated.neighbors:
    status = "UP" if neighbor.is_established else "DOWN"
    print(f" {neighbor.address:16s} AS{neighbor.remote_as:6d} {status}")
Output:
 10.0.0.2         AS 65002 UP
 10.0.0.3         AS 65002 UP
 10.0.0.4         AS 65003 DOWN
 192.168.1.1      AS 65100 UP
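When validation does fail, a small retry wrapper is often enough. This helper is my own sketch, not part of Pydantic or the parser above:

from pydantic import ValidationError

def parse_validated(raw_output: str, max_attempts: int = 2) -> BGPSummary:
    """Retry the LLM call when its JSON fails schema validation."""
    for attempt in range(1, max_attempts + 1):
        try:
            return BGPSummary(**parse_bgp_summary(raw_output))
        except ValidationError as exc:
            if attempt == max_attempts:
                raise
            # Occasionally the model drops a field or puts a string where
            # a number belongs; one retry usually clears it.
            print(f"Attempt {attempt} failed validation: {exc}")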
When to Use This
This approach works well for:
- One-off parsing where writing a TextFSM template is not worth the effort
- Unusual output formats from vendor-specific or version-specific commands
- Rapid prototyping when you need structured data quickly during troubleshooting
- Multi-vendor environments where output formats vary wildly
It is not a replacement for TextFSM in production pipelines. LLM calls add latency, cost money, and can occasionally hallucinate field values. For critical automation, write proper parsers. For exploration and ad-hoc analysis, LLMs are remarkably effective.
Cost and Performance
Parsing a single show command with gpt-4o-mini costs roughly $0.001 and takes about 800ms. That is fine for interactive use but adds up at scale. If you are parsing the same command across 500 routers, write a TextFSM template.
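For comparison, the bulk path through the community ntc-templates library looks like this, assuming your platform and command pair is covered by an existing template:

from ntc_templates.parse import parse_output

# Looks up and runs the community TextFSM template for this platform/command
neighbors = parse_output(
    platform="cisco_ios",
    command="show ip bgp summary",
    data=raw_output,
)
# Returns a list of dicts, one per neighbor row: no API call, no per-parse cost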
The sweet spot is using LLMs to generate the TextFSM template itself, then using the template for bulk parsing. That is a story for another post.
Practical Tips
- Always set temperature=0 for deterministic parsing
- Use response_format={"type": "json_object"} to guarantee valid JSON output
- Validate with Pydantic or a similar schema library
- Cache results when parsing the same output format repeatedly (see the sketch after this list)
- Include example output in your prompt for better accuracy with unusual formats
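One way to implement the caching tip is functools.lru_cache keyed on the exact raw text. The wrapper below is a sketch; it serializes through JSON so callers cannot mutate the shared cached object:

import json
from functools import lru_cache

@lru_cache(maxsize=128)
def _parse_cached(raw_output: str) -> str:
    # Cache the serialized result; the str argument is a hashable cache key
    return json.dumps(parse_bgp_summary(raw_output))

def parse_bgp_summary_cached(raw_output: str) -> dict:
    """Identical raw output is parsed once, then served from cache."""
    return json.loads(_parse_cached(raw_output))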