To gain an understanding of components that make up networks we’ll start by stating that a network is a combination of tools working together to provide connectivity to endpoints.

Let’s list the tools.

Network Device (Switch and Router and others) – This is a device which terminates multiple cables into itself with the other end of the cable being other devices. The network device interconnects multiple endpoints via its ports on which cables terminate.

Protocols – These are tools which provide for a coordination mechanism. This coordination mechanism is an exchange of information which makes possible the exchange of traffic.

Protocol Messages – These are messages exchanged between Protocols while they coordinate the laying of the network foundations for exchange of traffic.

Addresses – These come in many flavours and are intended to identify the source and destination of a data payload which is traversing the network.  They can be layered/structured for aggregation and division pools.

Lookup – This is done on the various addresses to find the next hop. Lookup is done to find the next point to which to send the data payload to so that it reaches its ultimate destination after traversing the network.

Appended Information – This is a general term which encompasses information traversing the network which is other than payload and addresses. These are information and tools which are put into packets for protocol operations. This is information inside headers other than the addresses.

Identity Tags – This is a specific class of Appended Information which provides for identity functionality during a lookup and for identification and separation of protocol functions.

Filters & Actions – These are deployed on the network devices to provide intelligent selection and resulting actions over the traversing data payload. They utilize the addresses and appended information inside the data payloads and also the headers.

Network Over Network – This is a general term for a network on top of a network for provision of separate connectivity. A combination another layer of protocols and addresses result in a network over a network.

Network + Network – This is a term identifying the interconnection of 2 or more separate networks resulting in a larger network. Also called internetwork it signifies one domain interconnected to another domain.

Control and Data Plane – Control Plane is the network protocols laying the network foundations and data plane is the traffic traversing the network. Control Plane enables Data Plane.

Network Inside Network Device – This is a term signifying the division of a network device to facilitate a software separation in networks. It creates separate networks inside a network device via operating system software constructs.

We can put brands on these:

OSPF/ISIS/BGP are Protocols to lay the Control Plane for IP addresses

LDP is the Protocol to lay the control plane for MPLS addresses (labels)

MAC Address / IP Address / MPLS Labels are addresses and Lookups are done on them during Data Plane operation

MPLS L2 VPN / MPLS L3 VPN are a Network Over Network function based on labels.

MP-BGP is a protocol to lay Control Plane for Network over Network (MPLS L2VPN & EVPN)

AS to AS BGP connectivity is a Network + Network function

Route Maps / Prefix Lists / AS Path Lists are part of Filters and Actions

OSPF Areas and ISIS Levels are a Network domain + Network domain layering type function

QoS Diffserv and CoS are appended information for actions and functionalities

EVPN, OTV & VXLAN are Network over a Network options. These provide a network over a network Control Plane and network over a network Data Plane.

VXLAN VNID / VLAN TAG / Route Target / Route Distinguishers / BGP Communities are Identity tags for protocol operations where they aid the control plane or data plane.

VDC / VRF / EVPN EVI are Network inside Network Device features primarily being operating system software constructs.

This is a rough approach with much simplification but is intended to view the various network components as tools providing functionality working in unison for connectivity provision. This view aids looking at the components from a Design perspective.

Whether it is a Service Provider, Enterprise or Data Center / Cloud IaaS network the components interact and provide functionality.

This posts focuses on the trend of Microservices and the various related terminologies and trends. In the end it lists the brands in their categories.

An application is software. It is composed of different components. These are the application components. Together they make up the application. The difference between one application software component and another application software component is one of separation of concerns. This is simply dividing a computer program (the application) into different sections. If the different components are somewhat independent of each other they are termed loosely coupled.

The different components of an application communicate with each other. When they need to interact with each other they do it via interfaces. A client component does not need to know the inner workings of the other application software component and uses only the interface.

This is where the word service comes into play where what one application software component provides to another software component is called a service.

Now this application may be placed on a distributed system where its different components are located on networked computers. Thereafter in terms of an application running on a distributed system, SOA or Service Oriented Architecture is where services are provided to other software components over a communications protocol over a network.  This is due to the underlying hardware being networked and distributed in nature and the application software on them being distributed on it.

In terminology of Distributed Systems when when one of its components communicates with another component they do this via messages. We can say that in a distributed system, an application’s software component sends a message to another software component to utilise its service via an interface and that interface is also utilising a network protocol.

We now know about an Application which is a software program, its components and that services are provided by its components. We now know about Distributed Systems, its components networked together and messages being passed between them over a network. We know about applications running on distributed systems where application software components are running on components of the distributed system. We know the application software components communicate with each other via a network.

In Microservices a distributed systems component is running an applications software component and is providing a service. It’s a process now in execution mode. So one software component is placed and is running on one distributed system component and is providing a service from there to other similar independent components.

A normal process is a running software program in execution mode. Inter Process communications are IPCs in terms of processes. In Microservices IPCs will be network messages.

What we discussed above earlier is the application software architecture and its transition into the distributed systems environment. When you say that each independent software component is now running, is a process, it is running on a distributed systems components and the Inter Process Communications are over a network you have Microservices. These Microservices form an Application.

Furthermore, in Microservices there is a bare minimum of centralized management of different services and they may be written in different programming languages and use different data storage technologies. So we can have one software component written in Go, and another in NodeJS and they will provide each other services. These services will also be over a network. So a Go software component can be running on one distributed system component and a NodeJS software component can be running on another distributed system component and they will interact via the network composing the distributed system. Multiple such distributed software components providing services to each other make up a Microservices Application.

A container provides an environment to run a microservice component. A container is a distributed system object which can be termed loosely as a distributed system hardware+software components service.

In terms of branding:

Amazon AWS is a Distributed Systems Provider.

EC2 is Amazon AWS’s product to provide a distributed system compute component online.

S3 is Simple Storage Service, a product for simple storage of files by Amazon AWS online.

DynamoDB is Amazon AWS’s NoSQL Database product which available as a product online.

Golang and NodeJS are programming languages in which backend server side software components are written.

React is a programming language in which frontend user side application software components are written.

Docker is a software which provides for individual container management. One container provide the environment where a software component can be executed on a distributed system.

Kubernetes and Docker Swarm manages multiple (lots of) containers deployed on distributed systems for running a distributed application. They are for containers management.

RabbitMQ and Kafka work as message brokers for passing messages between microservices

RESTFul HTTP APIs are also a means for intermicroservice communication.

Protocol Buffers and GRPC are means of faster intermicroservice communication messaging.

MongoDB and Couchbase are NoSQL databases which can be run in containers and be utilised by application software components for Database purposes.

Git is an application software component version control system

Promethues is an application (software) to be run (can be in containers) built specifically for the purpose of monitoring microservices software component health (metrics)

Grafana is an application (software) to be run (can be in containers) for the purpose visualizing metrics/health of microservices.

ELK stack which is ElasticSearch, Logstash and Kibana are softwares which provide for logging of events and their search and visualization.

https://en.wikipedia.org/wiki/Component-based_software_engineering

https://en.wikipedia.org/wiki/Event-driven_architecture

https://en.wikipedia.org/wiki/Service-oriented_architecture

http://www.d-net.research-infrastructures.eu/node/34

https://martinfowler.com/articles/microservices.html

https://en.wikipedia.org/wiki/Process_(computing)

https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45406.pdf

 

Similar trends in multiple industries are apparent.

  • Telecommunications Provider e.g. AT&T
  • Networking/Internet IP,MPLS Service Providers
  • Cloud Native Iaas, PaaS, SaaS industry

Let’s see what the trend is that they have in common:

  • Telecom – AT&T ONAP’s DCAE – Data Collection, Analytics and Events
  • Networking/Internet Service Provider – Cisco/Juniper Telemetry
  • Cloud Native – Kafka and streaming events data from Microservices architectures

What they have in common is events & data production and thereafter streaming of the data and thereafter analytics on these events/data resulting in near real time decision making.

The naming is different, the products are different and the industries are different but the production of data, its streaming and analytics is common.

Telecom industry is moving from PNF (Physical Network Functions) to VNF (Virtual Network Functions). Which is a move from tightly coupled hardware/software devices to a more software driven architecture

The ISP industry is still shifting around IP Packets but they are now looking for more streaming style analytics of their devices and the traffic flows which they are calling Telemetry.

The Cloud Native industry is in the pack with its Microservices based software centric application architectures.

They all have event generation in common and want to process the data and then use it in real time. Real Time Data Streaming and Processing.

Lets now see dig a little deeper and start correlating the terminologies. From the products category we will take Telecom’s ONAP, Networking’s Cisco DNA, Cloud Native’s Prometheus and Kafka and Information Security’s Splunk from the industries to analyse.

ONAPs VNF Event Stream or VES is the stream event producer. ONAPs logging section utilises the same ELK, Elastricsearch Logstash and Kibana dashboarding available in AWS cloud.

Juniper’s Telemetry streaming utilises Google’s Protocol Buffers (gpb) structured messages are relayed to a performance management application. Cisco’s Model Driven Telemetry utilises the same Google Protocol Buffers for streaming data from its devices.

Cloud Native applications are Microservices based which has Event Sourcing and CQRS and are requiring Rabbit MQ/Kafka style message brokers in addition to stream processors and analytics such as the same ELK stack mentioned earlier.

Large organisations such as Linkedin faced the problem of data deluge earlier than the rest of the world in terms of its handling, processing and analytics in real time. This has resulted in products such as Kafka.

 

 

Link:

https://wiki.opnfv.org/display/PROJ/VNF+Event+Stream

https://wiki.onap.org/display/DW/Logging+User+Guide

 

 

What does a Site Reliability Engineer do?

 

Site Reliability Engineer is a term for the operations and administration of complex computer systems involving:

 

  • Networking,
  • Virtualized operating system environments including VM/Containers,
  • The orchestrations tools for Networking/Virtualized infrastructure,
  • Applications,
  • The interactions of the above in a multi-site/multi-pop environment,
  • And utilising the above to deliver a service/product and ensuring it is working well.

 

It basically appears to be an operations role but within a complex environment where multiple technology silos interact heavily to deliver the product to the end user. Google hires and assigns the role of Site Reliability Engineer and they are operating such a complex environment delivering Google.com/Gmail.com/Youtube.com etc. Facebook does the same.

 

Looking at a particular set of Site Reliability Engineer job advertisements they appear to have one thing in common for diverse roles within the SRE domain:

 

  • ‘You have an ‘infrastructure as code’ approach to managing infrastructure’

 

So from the Site Reliability Engineer title we reach to the term Infrastructure as a Code approach.

 

Infrastructure as a Code ‘tools’ sort of sit on top of Configurations Management tools like Puppet, Chef, Ansible and provide increased functionality. Terraform and AWS Cloudformation are two Infrastructure as a Code tools but what the Job Ads are asking for is it approach.

 

Coding is common:

  • One thing that is apparent is that when you take a look at an Ansible Playbook YAML .yml file or a Teraform Configuration .tf file or a Chef Recipe .rb file or a Puppet Manifest .pp file or a AWS Cloudformation Template they all look code-like. In fact, they are all code but at a plane where the code is not intended to utilize processor, memory and hard disk of a single machine in a setup.exe resulting file-form to deploy on a single computer. They are coded or code-like data expressions which translate into the deployment, configuration and orchestration of more complicated computing systems. They are code-like expression which for example deploy AWS products which are themselves ‘infrastructure as a service’ public cloud systems.

 

From this it appears that Infrastructure as a Code is a term signifying another layer of abstraction.

There are levels of abstraction. Where the levels can be :

  • solid state physics, silicon/CPU, memory, hard disk hardware.
  • then 0’s and 1’s & bits on top of these at the next level
  • then integers & strings utilizing the above layer
  • then arithmetic operations and string manipulations on top of above
  • then programs and software applications running on computers/devices,
  • then interconnected computers
  • then distributed systems composed of interconnected computers/devices
  • then Public Cloud Infrastructure as a Service
  • then an Application running on this silicon,cpu,memory,hardisk,0’s,1’s, bits, integers, strings, arithmetic operations, string manipulations, individual programs/software, interconnected infrastructure/computers/servers/routers/switches, public cloud.

By inference Infrastructure as a Code approach presents/preserves information that is relevant to our end application plane and environment and abstracts information that is not relevant in our Application’s environment.

The internet is a good examples of multiple systems on top of other systems.

It looks like even Google and Facebook still need a human operate their systems. They will not be flying through them like an aeroplane in clouds. They will know the layers and the systems in place. They will navigate from symptoms to root cause and then codify/rectify & adjust for continual optimal service.

Moving on, three job ads for Site Reliability Engineer are given below.

Infrastructure as Code is common. Lets see the rest.

Job Ad:

  1. Site Reliability Engineer | Data Stores | Redis & Kafka

 

Our Tech Stack across Site Reliability as a whole:

• Data Analytics software including Kafka and Redis
• Open Source technologies (We constantly look to innovate and adopt)
• Amazon Web Services – AWS, and a load of services
• Coding with React, NodeJS and Python
• Couchbase, Kubernetes, ElasticSearch & Microservices Infrastructure
• Linux Operating systems, we look for passion
• Infrastructure as Code & Automate everything are a couple of our mottos

Job Ad:

  1. Site Reliability Engineers | Multiple Roles | Golang | AWS | ReactThe TechStack you will be getting your hands dirty with:

    • Open Source technologies (We constantly look to innovate and adopt)
    • Amazon Web Services – AWS, and a load of services
    • Coding with React, NodeJS and Python
    • Couchbase, Kubernetes, ElasticSearch & Microservices Infrastructure
    • Linux Operating systems, we look for passion
    • Infrastructure as Code & Automate everything are a couple of our mottos

 

Job Ad:

  1. Site Reliability Engineer | Edge Computing | AWS | Networking

 

Our Tech Stack across Site Reliability as a whole:

• Networking – Load balancers, Proxies, Routing, DC, AWS
• Open Source technologies (We constantly look to innovate and adopt)
• Amazon Web Services – AWS, and a load of services
• Coding with React, NodeJS and Python
• Couchbase, Kubernetes, ElasticSearch & Microservices Infrastructure
• Linux Operating systems, we look for passion
• Infrastructure as Code & Automate everything are a couple of our mottos

 

The three SRE roles are diverse and they are geared towards multiple parts of the stack which run the end application. SRE tilting towards Networking, SRE tilting towards Data/Stream Processing and SRE tilting towards Development (Front-End/Back-End).

 

The below are common to all three:

 

  • React, NodeJS and Python
  • Couchbase, Kubernetes, ElasticSearch & Microservices Infrastructure
  • Linux/AWS

 

The below varies amongst them:

 

  • SRE Networking tilted role – Edge Computing, Load balancers, Proxies, Routing, DC, AWS
  • SRE Data/Stream Processing tilted role – Kafka, Redis
  • SRE Dev tilted role – Golang

 

And so what is this system achieving and what is it composed of? How do they interact and what do the multiple SREs do?

 

In terms of programming languages, we have Golang, React, NodeJS and Python.

In terms hardware we have AWS and Edge Computing PoPs/nodes/devices

In terms of data store and streaming we have Kafka and Redis

In terms of containers management there is Kubernetes

In terms of Data retrievals / search and possibly analytics there is ElasticSearch

In terms database there is Couchbase

 

An SRE is not an end-application software developer. So the above listed tools are part of the system to be run. This will be done with infrastructure as a code approach to programify for optimal operations.

 

So lets now try to put the clues in the Job Description together.

 

  • React & NodeJS are Javascript frameworks with React being the User Interface/FrontEnd (used by Facebook UI) and NodeJS being the Server/BackendEnd for Scalable Data I/O. Python can be used as for programming services at various locations. Golang is also used in the the Backend Serverside providing for its concurrency feature for applications/services.
  • Redis can be used to store application state information. In-memory fast, scalable and distributed. It is a key value store provider for application state cache-like.
  • Kafka is a distributed data streaming platform and can be stated to be in the middle. Producers producing data and consumers using data and stream processors processing it are connected to Kafka clusters. It can be used for event streaming/aggregation.
  • With no other stream processing engine present in the Job description Kafka with Kafka Streams can be stated to provide for stream processing as well.
  • ElasticSearch can be used for indexing and search. Data can be copied in via Kafka connector APIs and then indexed. Kibana is not listed but it might have skipped mentioning and can be used for the visualization and dashboarding.
  • Couchbase can be used as a NoSQL JSON-style distributed database as an external store for storage of logs/events (documents). It can take in data and deliver it via its Kafka connector.
  • Kubernetes manages the containers furbishing the application environment.

 

It looks to be a full Cloud Native environment which needs to be kept up and running optimally with continued service.

 

Part of this environment is the networking aspect.  This includes the listed edge computing component which means this high performance cloud native application also has near-user-location edge devices within its architecture.

 

Geolocation Routing and CDNs are the tools used to decrease application latency times. AWS Availability Zones can be considered as multi-site replicated PoPs. Edge networking nodes will also branch off as required and can be mini PoPs. Depending on the size of the user base being serviced by the Edge PoP node it might scale into being a small DC.

 

Branching within the networking domain is the use of Proxies. Forward + Reverse + Side-car if required.

 

A scenario of an increase in application demand resulting in container scaling which can result in requirement of on demand load balancing and proxying. One such tool is the F5 Application Services Proxy which from a networking perspective is a proxy but it integrates with the Kubernetes and can be used for an infrastructure as a coded deployment. F5’s Application Services Proxy is itself a Node.js application but is middleware here.

 

The list of VNFs at the OPNFV website contains 9 Open Source and 5 proprietary VNFs available to them. Of the 9 the most impressive from a Telco perspective is Clearwater vIMS which provides IMS Functionality in a VNF package. This is the VNF that is used during MWC 2016 to demo the OSM project by ETSI as well. They make a SIP call using it.