Monthly Archives: September 2018

Enabling application endpoints to communicate over a network requires a whole stack of protocols in the networking layer. Each protocol provides a different piece of functionality, like a brick in a wall, and together they achieve the end goal of endpoint-to-endpoint communication.

Physically, once the transceivers have delivered ordered bits into a memory location in a network device, the device has to digest them. What it digests could be any of a number of control plane or data plane datagrams.

It could be a Layer 2 MAC / Ethernet layer frame aimed at information transfer within the local area. It could be an ARP control plane frame. It could be IP address reachability information like an OSPF or IS-IS control plane packet. It could be a TCP handshake packet or a TCP payload packet. It could be a UDP packet. It could be a BGP Update message providing next layer (IP) reachability information. It could be an MPLS labeled packet being switched across an IP core network.

It depends.

Regarding Reachability Wikipedia states:

“In graph theory, reachability refers to the ability to get from one vertex to another within a graph. A vertex s can reach a vertex t (and t is reachable from s) if there exists a sequence of adjacent vertices (i.e. a path) which starts with s and ends with t.

In an undirected graph, reachability between all pairs of vertices can be determined by identifying the connected components of the graph. Any pair of vertices in such a graph can reach each other if and only if they belong to the same connected component. The connected components of an undirected graph can be identified in linear time. The remainder of this article focuses on the more difficult problem of determining pairwise reachability in a directed graph. ”

It’s interesting that, mathematically, a network is a graph and a networking device is a vertex, but we’re blogging on networks and not on math.
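To tie the quoted definition back to networking, here is a minimal sketch of a reachability check using breadth-first search. The topology and the router names (r1 to r4) are made up for illustration; a real network would of course derive this from protocol state rather than a hand-written dictionary.

```python
from collections import deque

def reachable(graph, source, target):
    """Return True if target can be reached from source by following edges."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False

# Hypothetical topology: r1, r2 and r3 are linked in a chain, r4 is isolated.
topology = {
    "r1": ["r2"],
    "r2": ["r1", "r3"],
    "r3": ["r2"],
    "r4": [],
}

print(reachable(topology, "r1", "r3"))  # True
print(reachable(topology, "r1", "r4"))  # False
```

Each vertex and edge is visited at most once, which is the linear-time behaviour the quote mentions.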


BGP neighbors are manually configured and use a TCP connection on port 179 to exchange IP routing information. The most common use is on the wider Internet, where transit providers use BGP to exchange the IP routes of connected networks. A large service provider that sells Internet transit uses BGP to peer with other, similar service provider networks and with server hosting providers. BGP can also be leveraged to advertise information other than IP, e.g. MAC routes in EVPN.

Practically speaking, any two routers with an established BGP session send Update messages to each other to add and withdraw IP prefixes (routes) together with the attributes of those routes (AS Path, Community etc.).

BGP has a full finite state machine in which a session transitions from the Idle state to the Established state. Starting in Idle, it moves to Connect while the TCP connection is being set up (falling back to Active if the connection attempt fails), then to OpenSent once the Open message has been sent, then to OpenConfirm once the peer’s Open message has been received and accepted, and finally to Established once a Keepalive message confirming the Open has been received. Keepalive messages are exchanged periodically thereafter; a Notification message is sent only to report an error and tear the session down. By the OpenConfirm state both BGP ends have sent Open messages to each other and have checked the received information to decide whether a BGP session with this peer should be established. The primary information in the Open message includes the BGP version number, the AS number, the hold time, the BGP router ID and the optional parameters. The optional parameters contain TLVs that negotiate capabilities, such as the MP-BGP extensions, to be used between the peers.
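The session states can be sketched as a small transition table. This is a heavily simplified illustration, not the complete state machine: the state names are the standard ones, but the event names are invented for this example, and any event the table does not list (errors, Notification messages, timer expiries) simply drops the session back to Idle.

```python
# Simplified BGP session state machine. Event names are illustrative only.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",       # Open is sent here
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("Active", "connect_retry_expired"): "Connect",
    ("OpenSent", "open_received_ok"): "OpenConfirm",  # reply with Keepalive
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}

def next_state(state, event):
    """Follow the table; anything unexpected resets the session to Idle."""
    return TRANSITIONS.get((state, event), "Idle")

def run(events, state="Idle"):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = next_state(state, event)
    return state

happy_path = ["start", "tcp_established", "open_received_ok", "keepalive_received"]
print(run(happy_path))  # Established
```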

Once the session is established, Update messages are sent carrying the routing information and route attributes. Updates that change the BGP route table also cause the route table version number to increment. An Update message contains withdrawn (unfeasible) routes, path attributes and NLRI, which are the IP prefixes being advertised. Path attributes such as AS_Path, LocalPref and MED are carried in the Update message.
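As a rough sketch of how the three parts of an Update message affect a route table, the example below models the table for a single peer as a dictionary keyed by prefix. The `Update` class and `apply_update` function are hypothetical names for this illustration, and the prefixes and AS number come from the ranges reserved for documentation.

```python
from dataclasses import dataclass, field

@dataclass
class Update:
    withdrawn: list = field(default_factory=list)   # prefixes to remove
    attributes: dict = field(default_factory=dict)  # shared by all NLRI below
    nlri: list = field(default_factory=list)        # prefixes being advertised

def apply_update(rib, update):
    """Apply one Update to a per-peer table: withdraw first, then add."""
    for prefix in update.withdrawn:
        rib.pop(prefix, None)
    for prefix in update.nlri:
        rib[prefix] = dict(update.attributes)
    return rib

rib = {}
apply_update(rib, Update(
    attributes={"as_path": [64500], "local_pref": 100},
    nlri=["192.0.2.0/24", "198.51.100.0/24"],
))
apply_update(rib, Update(withdrawn=["192.0.2.0/24"]))
print(sorted(rib))  # ['198.51.100.0/24']
```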

iBGP, as opposed to eBGP, is used to communicate routes within an Autonomous System. The AS_Path is treated differently in the case of iBGP: a router adds its own AS number to the path only when it is speaking to an eBGP peer, and does not add it when speaking to an iBGP peer. Otherwise the receiving BGP process would see its own AS in the path and drop the route, assuming a loop. Because a route learned from one iBGP peer is not passed on to other iBGP peers, either a full mesh of iBGP sessions is required so that every router learns every destination, or Route Reflectors can be used to peer with the iBGP speakers and reflect routes between them. Routes received from a client in an RR setup are reflected to other clients and to non-client neighbors.
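The AS_Path rule can be sketched in a few lines. The function names `advertise` and `accept` are made up for this example, and the AS numbers are from the documentation range.

```python
def advertise(as_path, local_as, peer_as):
    """AS_Path as sent to a peer: prepend the local AS only on eBGP sessions."""
    if peer_as != local_as:        # eBGP: peer is in a different AS
        return [local_as] + list(as_path)
    return list(as_path)           # iBGP: path is passed on unchanged

def accept(as_path, local_as):
    """Loop prevention: reject a route that already contains the local AS."""
    return local_as not in as_path

path = [64501, 64502]
print(advertise(path, local_as=64500, peer_as=64510))  # [64500, 64501, 64502]
print(advertise(path, local_as=64500, peer_as=64500))  # [64501, 64502]
print(accept([64510, 64500, 64501], local_as=64500))   # False
```

If the local AS were prepended on iBGP sessions as well, every iBGP neighbor would fail the `accept` check and discard the route.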

One of the mechanisms in BGP is best path selection. If an IP prefix is reachable via multiple paths, BGP works through an ordered list of if-else steps to select one best path and advertises that.

The best path selection criteria are given below.
1) Weight (Cisco-specific, locally assigned; higher weight preferred)
2) Local Preference (prefer the path with the higher local pref)
3) Locally originated routes (Cisco: network or redistributed routes preferred over aggregate routes)
4) Shortest AS_PATH (prefer the path with the shorter AS path)
5) Lowest origin type (IGP < EGP < Incomplete)
6) Lowest multi-exit discriminator (MED)
7) eBGP over iBGP
8) Lowest IGP metric to the next hop
9) …
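The first eight steps above can be sketched as a sort key, where an earlier field always wins over a later one. This is a simplified illustration and not any vendor's implementation: the `Path` class and its field names are assumptions for this example, MED is compared across all paths (real implementations normally compare it only between paths from the same neighboring AS), and the remaining tie-breakers are left out.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}

@dataclass
class Path:
    next_hop: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0

def preference_key(p):
    """Smaller tuple means more preferred; fields follow steps 1 to 8."""
    return (
        -p.weight,                 # 1) higher weight
        -p.local_pref,             # 2) higher local preference
        not p.locally_originated,  # 3) locally originated first
        len(p.as_path),            # 4) shorter AS_PATH
        ORIGIN_RANK[p.origin],     # 5) lower origin type
        p.med,                     # 6) lower MED
        not p.is_ebgp,             # 7) eBGP over iBGP
        p.igp_metric,              # 8) lower IGP metric
    )

def best_path(paths):
    return min(paths, key=preference_key)

a = Path("192.0.2.1", local_pref=100, as_path=(64500,))
b = Path("192.0.2.2", local_pref=200, as_path=(64501, 64502, 64503))
print(best_path([a, b]).next_hop)  # 192.0.2.2
```

Path b wins despite its longer AS path, because local preference is evaluated before AS path length.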

Another aspect of BGP is route filtering and route manipulation via Community attributes. A community attribute is sent in a numbered format, e.g. 6939:400, to influence path selection on the far-end neighbor. For example, if one neighbor sends the 6939:400 community to another, the receiving side sets the Local Pref of the route to 400, based on a previously agreed understanding. This is achieved by if-then-else route policies at the receiving end. Commonly used communities include Local_Pref-setting communities and blackhole communities.
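A receiver-side policy of this kind might look roughly like the sketch below. The mapping of 6939:400 to Local Pref 400 follows the example in the text; the function name, the route dictionary layout and the "discard" action are assumptions for illustration. 65535:666 is the well-known BLACKHOLE community.

```python
BLACKHOLE = "65535:666"                      # well-known BLACKHOLE community
LOCAL_PREF_BY_COMMUNITY = {"6939:400": 400}  # agreed between the two neighbors
DEFAULT_LOCAL_PREF = 100

def apply_import_policy(route):
    """If-then-else policy applied to a route received from a neighbor."""
    route = dict(route)  # do not modify the caller's copy
    communities = route.get("communities", [])
    if BLACKHOLE in communities:
        route["action"] = "discard"          # drop traffic to this prefix
        return route
    route["action"] = "accept"
    route["local_pref"] = DEFAULT_LOCAL_PREF
    for community in communities:
        if community in LOCAL_PREF_BY_COMMUNITY:
            route["local_pref"] = LOCAL_PREF_BY_COMMUNITY[community]
            break
    return route

received = {"prefix": "198.51.100.0/24", "communities": ["6939:400"]}
print(apply_import_policy(received)["local_pref"])  # 400
```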

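Another aspect of BGP is multihoming and traffic load balancing. If one autonomous system is multihomed to another autonomous system, it will use Local Pref, Communities and AS Path prepending to influence traffic. Local Pref steers outbound traffic, while Communities and AS Path prepending influence how inbound traffic arrives.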

BGP has also been used as an IGP alternative in massive-scale data center deployments using Clos fabrics.

BGP is flexible, scalable, stable and reliable, but it is slow to converge, has limitations in terms of load balancing, and requires large CPU/TCAM resources when the routing table is large.







Event Driven Network Automation is a term used to describe what large scale NetOps teams are doing to scale, deploy and manage networking infrastructure.

YAML for data formatting and Jinja2 for templating, with Python gluing them together and executing.

Ansible/YAML and Netconf/API for configuration and execution operations.

Event Generation using SNMP/Telemetry/BGPMon.

BGPMon looks like it could be used to check up on changes in a Routed Core with BGP based Leaf-Spine Clos Fabric.

Zero Touch Provisioning (ZTP) is best suited for quickly bringing up new devices.

An Orchestration-style GUI layer, custom made for every domain in the network, would definitely be required as well for various aspects of NetOps.

There can be human driven network automation, but there can also be event driven network automation, which can be termed ‘closed loop’, with rule-based actions defined by humans.

The event driven, closed loop, rule-based-actions execution layer would then be managed by humans. This layer would keep evolving, and managing it would require data structuring and scripting skills, in addition to being mindful of the impact on the network layer (both DC and WAN).
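A toy version of such a rule-based layer is sketched below. Everything in it is hypothetical: the event types, the field names and the actions are invented for illustration, and a real deployment would receive events from SNMP, telemetry or a BGP monitor and hand the resulting actions to an automation tool.

```python
# Human-defined rules: each maps an event pattern to an action name.
RULES = [
    {"match": {"type": "bgp_session_down"}, "action": "open_ticket"},
    {"match": {"type": "interface_errors", "severity": "high"}, "action": "drain_link"},
]

def matches(event, pattern):
    """An event matches when every key in the pattern has the same value."""
    return all(event.get(key) == value for key, value in pattern.items())

def handle(event, rules=RULES):
    """Return the actions triggered by one event, in rule order."""
    return [rule["action"] for rule in rules if matches(event, rule["match"])]

event = {"type": "interface_errors", "severity": "high", "device": "leaf1"}
print(handle(event))  # ['drain_link']
```

Keeping the rules as plain data is what lets humans review and evolve them without touching the execution code.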


Network Automation: Template Configurations with Jinja2 and YAML

I attended the Amazon Network Development Engineer tech talk held in Sydney yesterday. While fishing for future Network Development Engineers, Amazon gave a short presentation on their network from a DC and DCI/WAN perspective.

It was a good talk and the interaction with the Network Development Engineers afterwards was insightful. A lot of their work revolves around automation and scripting. This is also obvious from the job title and the job descriptions in the role advertisements.