Docker + Kubernetes

For some time, I have wanted to take a look at Kubernetes. There is a lot of talk about microservices in the cloud, and after attending some meetups I still wasn't sure what all of this was about, so I signed up for kodekloud to learn about it.

So far, I have completed the beginner courses for Docker and Kubernetes. To be honest, I think the product is very good value for money.

I have been using docker a bit over the last couple of months, but I still wanted to dig a bit deeper to improve my knowledge.

I was surprised to read that kubernetes pods rely on docker images.

Docker Notes

Docker commands

docker run -it xxx (interactive+pseudoterminal)
docker run -d xxx (detach)
docker attach ID (attach)
docker run --name TEST xxx (provide name to container)
docker run -p 80:5000 xxx (maps host port 80 to container port 5000)

docker run -v /opt/datadir:/var/lib/mysql mysql (map a host folder to container folder for data persistence)

docker run -e APP_COLOR=blue xxx (pass env var to the container)
docker inspect "container"  -> check IP, env vars, etc
docker logs "container"

docker build . -t account_name/app_name
docker login
docker push account_name/app_name

docker -H=remote-docker-engine:2375 xxx

cgroups: restrict resources in container
  docker run --cpus=.5  xxx (no more than 50% CPU)
  docker run --memory=100m xxx (no more than 100M memory)

Docker File

----
FROM Ubuntu
ENTRYPOINT ["sleep"]
CMD ["5"]        --> if you dont pass any value in "docker run .." it uses by default 5.
----
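
As a quick check of that ENTRYPOINT/CMD behaviour (the image name "sleeper" is just an example):

docker build -t sleeper .
docker run sleeper        --> runs "sleep 5" (default CMD)
docker run sleeper 10     --> runs "sleep 10" (the argument replaces CMD)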

Docker Compose

$ cat docker-compose.yml
version: "3"
services:
 db:
  image: postgres
  environment:
    - POSTGRES_PASSWORD=mysecretpassword
 wordpress:
  image: wordpress
  links:
    - db
  ports:
    - 8085:80


verify file: $ docker-compose config
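
And to bring the stack up / tear it down (standard docker-compose commands):

$ docker-compose up -d      (start all services in the background)
$ docker-compose ps         (check the containers)
$ docker-compose down       (stop and remove containers and networks)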

Docker Volumes

docker volume create NAME  --> create /var/lib/docker/volumes/NAME

docker run -v NAME:/var/lib/mysql mysql  (docker volume)
or
docker run -v PATH:/var/lib/mysql mysql  (local folder)
or
docker run --mount type=bind,source=/data/mysql,target=/var/lib/mysql mysql
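
To check which volumes exist and where they live:

docker volume ls
docker volume inspect NAME   --> shows the mountpoint under /var/lib/docker/volumes/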

Docker Networking

networks: --network=xxx
 bridge (default)
 none   (no networking, full isolation)
 host   (container shares the host network stack, no port mapping needed)

docker network create --driver bridge --subnet x.x.x.x/x NAME
docker network ls
               inspect
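
A quick example putting it together (subnet and names are just examples):

docker network create --driver bridge --subnet 172.20.0.0/16 my_net
docker run -d --name web --network my_net nginx
docker network inspect my_net   --> shows attached containers and their IPs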

Docker Swarm

I didn't use this, just covered the theory. It is for clustering docker hosts: one manager, several workers.

   manager: docker swarm init
   workers: docker swarm join --token xxx
   manager: docker service create --replicas=3 my-web-server
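
I didn't test these, but the usual commands to verify the cluster and the service afterwards would be:

   manager: docker node ls                     (list managers/workers and their status)
   manager: docker service ls                  (services and replica counts)
   manager: docker service ps my-web-server    (where each replica is running)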

Kubernetes Notes

container + orchestration
 (docker)    (kubernetes)

node: virtual or physical machine where kubernetes is installed
cluster: set of nodes
master: node that manages the cluster

kube components:
  api,
  etcd (key-value store),
  scheduler (distribute load),
  kubelet (agent),
  controller (brain: check status),
  container runtime (sw to run containers: docker)

master: api, etcd, controller, scheduler
   $ kubectl cluster-info
   $ kubectl get nodes -o wide (extra info)
node: kubelet, container runtime

Setup Kubernetes with minikube

Setting up kubernetes doesn't look like an easy task, so there are tools to help, like microk8s, kubeadm (my laptop needs more RAM, it can't handle 1 master + 2 nodes) and minikube.

minikube needs: virtualbox (I couldn't make it work with kvm2…) and kubectl.

Install kubectl

I assume virtualbox is already installed

$ curl -LO "https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl"

$ chmod +x ./kubectl
$ sudo mv ./kubectl /usr/local/bin/kubectl
$ kubectl version --client

Install minikube

$ grep -E --color 'vmx|svm' /proc/cpuinfo   // verify your CPU supports virtualization
$ curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 \
>   && chmod +x minikube
$ sudo install minikube /usr/local/bin/

Start/Status minikube

$ minikube start --driver=virtualbox  --> it takes time!!! It needs 2 CPUs + 2GB RAM!
😄  minikube v1.12.3 on Debian bullseye/sid
✨  Using the virtualbox driver based on user configuration
💿  Downloading VM boot image ...
    > minikube-v1.12.2.iso.sha256: 65 B / 65 B [-------------] 100.00% ? p/s 0s
    > minikube-v1.12.2.iso: 173.73 MiB / 173.73 MiB [] 100.00% 6.97 MiB p/s 25s
👍  Starting control plane node minikube in cluster minikube
💾  Downloading Kubernetes v1.18.3 preload ...
    > preloaded-images-k8s-v5-v1.18.3-docker-overlay2-amd64.tar.lz4: 510.91 MiB
🔥  Creating virtualbox VM (CPUs=2, Memory=2200MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.18.3 on Docker 19.03.12 ...
🔎  Verifying Kubernetes components...
🌟  Enabled addons: default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube"

$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured

$ kubectl get nodes
NAME       STATUS   ROLES    AGE     VERSION
minikube   Ready    master   5m52s   v1.18.3

$ minikube stop  // stop the virtualbox VM to free up resources once you finish

Basic Test

$ kubectl create deployment hello-minikube --image=k8s.gcr.io/echoserver:1.10
deployment.apps/hello-minikube created

$ kubectl get deployments
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
hello-minikube   1/1     1            1           22s

$ kubectl expose deployment hello-minikube --type=NodePort --port=8080
service/hello-minikube exposed

$ minikube service hello-minikube --url
http://192.168.99.100:30020

$ kubectl delete services hello-minikube
$ kubectl delete deployment hello-minikube
$ kubectl get pods

Pods

Based on documentation:

Pods are the smallest deployable units of computing that you can create and manage in Kubernetes.
A Pod is a group of one or more containers, with shared storage/network resources, and a specification for how to run the containers. A Pod's contents are always co-located and co-scheduled, and run in a shared context. A Pod models an application-specific "logical host": it contains one or more application containers which are relatively tightly coupled. In non-cloud contexts, applications executed on the same physical or virtual machine are analogous to cloud applications executed on the same logical host.

$ kubectl run nginx --image=nginx
$ kubectl describe pod nginx
$ kubectl get pods -o wide
$ kubectl delete pod nginx

Pods – Yaml

Pod yaml structure:

pod-definition.yml:
---
apiVersion: v1
kind: (type of object: Pod, Service, ReplicaSet, Deployment)
metadata: (only the keys kubernetes expects, e.g. name, labels)
 name: myapp-pod
 labels: (any kind of key-value pairs)
   app: myapp
   type: front-end
spec:
  containers:
   - name: nginx-container
     image: nginx

Example:

$ cat pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
    type: frontend
spec:
  containers:
  - name: nginx
    image: nginx

$ kubectl apply -f pod.yaml 
$ kubectl get pods

Replica-Set

Based on documentation:

A ReplicaSet's purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.

> cat replicaset-definition.yml
---
 apiVersion: apps/v1
 kind: ReplicaSet
 metadata:
   name: myapp-replicaset
   labels:
     app: myapp
     type: front-end
 spec:
   template:
     metadata:      -------
       name: nginx         |
       labels:             |
         app: nginx        |
         type: frontend    |-> POD definition
     spec:                 |
       containers:         |
       - name: nginx       |
         image: nginx  -----
   replicas: 3
   selector:       <-- main difference from replication-controller
     matchLabels:
       type: front-end
       
> kubectl create -f replicaset-definition.yml
> kubectl get replicaset
> kubectl get pods

> kubectl delete replicaset myapp-replicaset

How to scale via replica-set

> kubectl replace -f replicaset-definition.yml  (first update file to replicas: 6)

> kubectl scale --replicas=6 -f replicaset-definition.yml  // no need to modify file

> kubectl scale --replicas=6 replicaset myapp-replicaset   // no need to modify file

> kubectl edit replicaset myapp-replicaset (NAME of the replicaset!!!)

> kubectl describe replicaset myapp-replicaset

> kubectl get rs new-replica-set -o yaml > new-replica-set.yaml ==> returns the rs definition in yaml!

Deployments

Based on documentation:

A Deployment provides declarative updates for Pods and ReplicaSets.
You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.

Example:

cat deployment-definition.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
  labels:
    app: myapp
    type: front-end
spec:
  template:
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-end
    spec:
      containers:
      - name: nginx-controller
        image: nginx
  replicas: 3
  selector:
    matchLabels:
      type: front-end


> kubectl create -f deployment-definition.yml
> kubectl get deployments
> kubectl get replicaset
> kubectl get pods

> kubectl get all

Update/Rollback

From documentation.

By default, it follows a "rolling update" strategy: destroy one pod, create a new one, and repeat. So this doesn't cause an outage.

$ kubectl create -f deployment.yml --record
$ kubectl rollout status deployment/myapp-deployment
$ kubectl rollout history deployment/myapp-deployment
$ kubectl rollout undo deployment/myapp-deployment ==> rollback!!!
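
A typical way to trigger a new rollout is updating the image of the deployment (the tag 1.19 is just an example; the container name matches the deployment above):

$ kubectl set image deployment/myapp-deployment nginx-controller=nginx:1.19 --record
$ kubectl rollout status deployment/myapp-deployment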

Networking

Pod networking is not handled natively by kubernetes; you need another tool like calico, weave, etc. More info here. This has not been covered in detail yet. It looks complex (and that's a network engineer talking…).

Services

Based on documentation:

An abstract way to expose an application running on a set of Pods as a network service.
With Kubernetes you don't need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.

types:
   NodePort: like docker port-mapping, exposes the service on a port of every node
   ClusterIP: internal virtual IP, only reachable from inside the cluster
   LoadBalancer: provisions an external load balancer (cloud providers)

Examples:

nodeport
--------
service: like a virtual server
  targetPort: 80   (port on the pod)
  port: 80         (port of the service itself)
  nodePort: 30080  (port on the node)
  
service-definition.yml
apiVersion: v1
kind: Service
metadata:
  name: mypapp-service
spec:
  type: NodePort
  ports:
  - targetPort: 80
    port: 80
    nodePort: 30080  (range: 30000-32767)
  selector:
    app: myapp
    type: front-end

> kubectl create -f service-definition.yml
> kubectl get services
> minikube service mypapp-service


clusterip: 
---------
service-definition.yml
apiVersion: v1
kind: Service
metadata:
  name: back-end
spec:
  type: ClusterIP
  ports:
  - targetPort: 80
    port: 80
  selector:
    app: myapp
    type: back-end


loadbalancer: only on cloud providers (gcp, aws, azure) !!!!
------------
service-definition.yml
apiVersion: v1
kind: Service
metadata:
  name: back-end
spec:
  type: LoadBalancer
  ports:
  - targetPort: 80
    port: 80
    nodePort: 30080
  selector:
    app: myapp


> kubectl create -f service-definition.yml
> kubectl get services

Microservices architecture example

Diagram
=======

voting-app     result-app
 (python)       (nodejs)
   |(1)           ^ (4)
   v              |
in-memoryDB       db
 (redis)       (postgresql)
    ^ (2)         ^ (3)
    |             |
    ------- -------
          | |
         worker
          (.net)

1- deploy containers -> deploy PODs (deployment)
2- enable connectivity -> create service clusterIP for redis
                          create service clusterIP for postgres
3- external access -> create service NodePort for voting
                      create service NodePort for result

Code here. Steps:

$ kubectl create -f voting-app-deployment.yml
$ kubectl create -f voting-app-service.yml

$ kubectl create -f redis-deployment.yml
$ kubectl create -f redis-service.yml

$ kubectl create -f postgres-deployment.yml
$ kubectl create -f postgres-service.yml

$ kubectl create -f worker-deployment.yml

$ kubectl create -f result-app-deployment.yml
$ kubectl create -f result-app-service.yml
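
For reference, this is a minimal sketch of what one deployment + service pair could look like (the image name is an assumption, the real files are in the linked code; the nodePort matches the verification below):

voting-app-deployment.yml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: voting-app-deployment
  labels:
    app: voting-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: voting-app
  template:
    metadata:
      labels:
        app: voting-app
    spec:
      containers:
      - name: voting-app
        image: kodekloud/examplevotingapp_vote:v1   # assumption, check the linked code
        ports:
        - containerPort: 80

voting-app-service.yml:
apiVersion: v1
kind: Service
metadata:
  name: voting-service
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30004
  selector:
    app: voting-app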

Verify:

$ minikube service voting-service --url
http://192.168.99.100:30004
$ minikube service result-service --url
http://192.168.99.100:30005

Optical in Networking: 101

This is a very good presentation about optical stuff from NANOG 70 (2017). And I noticed there is an updated version from NANOG 77 (2019). I watched the 2017 one (2h) and there is something that always bites me: dB vs dBm.

A bit more info about dB vs dBm: here and here
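
The short version: dBm is an absolute power level referenced to 1 mW (0 dBm = 1 mW, and every 3 dB roughly doubles or halves the power), while dB is a relative ratio used for gain or loss. So if a transceiver launches +1 dBm and the fiber plus connectors lose 4 dB, the receiver sees +1 dBm - 4 dB = -3 dBm, which is about 0.5 mW.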

Close to the end, there are some common questions about optics that he answers. I liked the one about "looking at the lasers can make you blind" and the point that it is worth cleaning your fibers. A bit about cleaning here.

cEOS Netconf – Ncclient

I am still trying to play with / understand OpenConfig/YANG/Netconf modelling. Initially I tried to use ansible to configure EOS via netconf but I didn't get very far 🙁

I found an Arista blog that deals with netconf using the python library ncclient.

This is my adapted code. Keep in mind that I think there is a typo/bug in the Arista blog in "def irbrpc(..)": it should return "snetrpc" instead of "irbrpc". This is the error I had:

Traceback (most recent call last):
File "eos-ncc.py", line 171, in
main()
File "eos-ncc.py", line 168, in main
execrpc(hostip, uname, passw, rpc)
File "eos-ncc.py", line 7, in execrpc
rpcreply = conn.dispatch(to_ele(rpc))
File "xxx/lib/python3.7/site-packages/ncclient/xml_.py", line 126, in to_ele
return x if etree.iselement(x) else etree.fromstring(x.encode('UTF-8'), parser=_get_parser(huge_tree))
AttributeError: 'function' object has no attribute 'encode'

After a couple of prints in "ncclient/xml_.py" I could see "x" was a function but I couldn't understand why. Just by chance, I noticed the typo in the return.
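
For context, this is roughly the shape of the execrpc() helper that appears in the tracebacks (a minimal sketch; the netconf port and connection options are assumptions):

from ncclient import manager
from ncclient.xml_ import to_ele

def execrpc(hostip, uname, passw, rpc):
    # open a netconf session and send the raw RPC as-is
    conn = manager.connect(host=hostip, port=830, username=uname,
                           password=passw, hostkey_verify=False, timeout=30)
    rpcreply = conn.dispatch(to_ele(rpc))
    print(rpcreply)
    conn.close_session()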

Also, I couldn't configure the vxlan interface using XML as per the blog, so I had to remove that part and add it via an RPC call with CLI commands ("intfrpcvxlan_cli"). This is the error I had:

Traceback (most recent call last):
File "eos-ncc.py", line 171, in
main()
File "eos-ncc.py", line 168, in main
execrpc(hostip, uname, passw, rpc)
File "eos-ncc.py", line 7, in execrpc
rpcreply = conn.dispatch(to_ele(rpc))
File "xxx/lib/python3.7/site-packages/ncclient/manager.py", line 236, in execute
huge_tree=self._huge_tree).request(*args, **kwds)
File "xxx/lib/python3.7/site-packages/ncclient/operations/retrieve.py", line 239, in request
return self._request(node)
File "xxx/lib/python3.7/site-packages/ncclient/operations/rpc.py", line 348, in _request
raise self._reply.error
ncclient.operations.rpc.RPCError: Request could not be completed because leafref at path "/interfaces/interface[name='Vxlan1']/name" had error "leaf value (Vxlan1) not present in reference path (../config/name)"

So in my script, I make two calls and print the reply:

$ python eos-ncc.py
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:7b1be88e-36b7-4289-a0d2-396a0f21cf5e"><ok></ok></rpc-reply>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:750a50be-3534-442b-bad4-2f8c916afd77"><ok></ok></rpc-reply>

And this is what the logs show:

## first rpc call
2020-08-13T12:52:41.937079+00:00 r01 ConfigAgent: %SYS-5-CONFIG_SESSION_ENTERED: User tomas entered configuration session session630614618267084 on NETCONF (172.27.0.1)
2020-08-13T12:52:42.302111+00:00 r01 ConfigAgent: %SYS-5-CONFIG_SESSION_COMMIT_SUCCESS: User tomas committed configuration session session630614618267084 successfully on NETCONF (172.27.0.1)
2020-08-13T12:52:42.302928+00:00 r01 ConfigAgent: %SYS-5-CONFIG_SESSION_EXITED: User tomas exited configuration session session630614618267084 on NETCONF (172.27.0.1)
2020-08-13T12:52:42.325878+00:00 r01 Launcher: %LAUNCHER-6-PROCESS_START: Configuring process 'HostInject' to start in role 'ActiveSupervisor'
2020-08-13T12:52:42.334151+00:00 r01 Launcher: %LAUNCHER-6-PROCESS_START: Configuring process 'ArpSuppression' to start in role 'ActiveSupervisor'
2020-08-13T12:52:42.369660+00:00 r01 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan100 (VLAN_100), changed state to up
2020-08-13T12:52:42.527568+00:00 r01 ProcMgr-worker: %PROCMGR-6-WORKER_WARMSTART: ProcMgr worker warm start. (PID=553)
2020-08-13T12:52:42.557663+00:00 r01 ProcMgr-worker: %PROCMGR-7-NEW_PROCESSES: New processes configured to run under ProcMgr control: ['ArpSuppression', 'HostInject']
2020-08-13T12:52:42.570208+00:00 r01 ProcMgr-worker: %PROCMGR-7-PROCESSES_ADOPTED: ProcMgr (PID=553) adopted running processes: (SharedSecretProfile, PID=1024) (Lldp, PID=832) (SlabMonitor, PID=555) (Pim, PID=1156) (MplsUtilLsp, PID=902) (Mpls, PID=903) (Isis, PID=1087) (PimBidir, PID=1164) (Igmp, PID=1172) (Acl, PID=920) (StaticRoute, PID=1060) (IgmpSnooping, PID=1030) (IpRib, PID=1064) (Stp, PID=939) (KernelNetworkInfo, PID=940) (Etba, PID=1073) (KernelMfib, PID=1139) (ConnectedRoute, PID=1076) (RouteInput, PID=1080) (EvpnrtrEncap, PID=1082) (McastCommon6, PID=956) (ConfigAgent, PID=702) (Fru, PID=703) (Launcher, PID=704) (Bgp, PID=1089) (McastCommon, PID=834) (SuperServer, PID=836) (OpenConfig, PID=839) (LacpTxAgent, PID=970) (AgentMonitor, PID=845) (Snmp, PID=848) (PortSec, PID=850) (Ira, PID=852) (IgmpHostProxy, PID=1146) (EventMgr, PID=862) (Sysdb, PID=607) (CapiApp, PID=866) (Arp, PID=995) (StpTxRx, PID=871) (KernelFib, PID=1000) (StageMgr, PID=700) (Lag, PID=876) (Qos, PID=1005) (L2Rib, PID=1008) (PimBidirDf, PID=1137) (Tunnel, PID=883) (PimBsr, PID=1150) (Msdp, PID=1142) (BgpCliHelper, PID=1067) (TopoAgent, PID=1017) (Aaa, PID=890) (StpTopology, PID=891) (Ebra, PID=1022) (ReloadCauseAgent, PID=1023)
2020-08-13T12:52:42.586632+00:00 r01 ProcMgr-worker: %PROCMGR-6-PROCESS_STARTED: 'HostInject' starting with PID=23450 (PPID=553) -- execing '/usr/bin/HostInject'
2020-08-13T12:52:42.604711+00:00 r01 ProcMgr-worker: %PROCMGR-7-WORKER_WARMSTART_DONE: ProcMgr worker warm start done. (PID=553)
2020-08-13T12:52:42.604786+00:00 r01 ProcMgr-worker: %PROCMGR-6-PROCESS_STARTED: 'ArpSuppression' starting with PID=23452 (PPID=553) -- execing '/usr/bin/ArpSuppression'
2020-08-13T12:52:42.749880+00:00 r01 HostInject: %AGENT-6-INITIALIZED: Agent 'HostInject' initialized; pid=23453
2020-08-13T12:52:43.102567+00:00 r01 ArpSuppression: %AGENT-6-INITIALIZED: Agent 'ArpSuppression' initialized; pid=23452

## second rpc call
2020-08-13T12:52:43.250995+00:00 r01 ConfigAgent: %SYS-5-CONFIG_SESSION_ENTERED: User tomas entered configuration session session630615932519210 on NETCONF (172.27.0.1)
2020-08-13T12:52:43.465035+00:00 r01 ConfigAgent: %SYS-5-CONFIG_SESSION_COMMIT_SUCCESS: User tomas committed configuration session session630615932519210 successfully on NETCONF (172.27.0.1)
2020-08-13T12:52:43.466480+00:00 r01 ConfigAgent: %SYS-5-CONFIG_SESSION_EXITED: User tomas exited configuration session session630615932519210 on NETCONF (172.27.0.1)
2020-08-13T12:52:43.472728+00:00 r01 Launcher: %LAUNCHER-6-PROCESS_START: Configuring process 'VxlanSwFwd' to start in role 'ActiveSupervisor'
2020-08-13T12:52:43.475470+00:00 r01 Launcher: %LAUNCHER-6-PROCESS_START: Configuring process 'Vxlan' to start in role 'ActiveSupervisor'
2020-08-13T12:52:43.674498+00:00 r01 ProcMgr-worker: %PROCMGR-6-WORKER_WARMSTART: ProcMgr worker warm start. (PID=553)
2020-08-13T12:52:43.701854+00:00 r01 ProcMgr-worker: %PROCMGR-7-NEW_PROCESSES: New processes configured to run under ProcMgr control: ['Vxlan', 'VxlanSwFwd']
2020-08-13T12:52:43.714484+00:00 r01 ProcMgr-worker: %PROCMGR-7-PROCESSES_ADOPTED: ProcMgr (PID=553) adopted running processes: (SharedSecretProfile, PID=1024) (Lldp, PID=832) (SlabMonitor, PID=555) (Pim, PID=1156) (MplsUtilLsp, PID=902) (Mpls, PID=903) (Isis, PID=1087) (PimBidir, PID=1164) (Igmp, PID=1172) (Acl, PID=920) (HostInject, PID=23450) (ArpSuppression, PID=23452) (StaticRoute, PID=1060) (IgmpSnooping, PID=1030) (IpRib, PID=1064) (Stp, PID=939) (KernelNetworkInfo, PID=940) (Etba, PID=1073) (KernelMfib, PID=1139) (ConnectedRoute, PID=1076) (RouteInput, PID=1080) (EvpnrtrEncap, PID=1082) (McastCommon6, PID=956) (ConfigAgent, PID=702) (Fru, PID=703) (Launcher, PID=704) (Bgp, PID=1089) (McastCommon, PID=834) (SuperServer, PID=836) (OpenConfig, PID=839) (LacpTxAgent, PID=970) (AgentMonitor, PID=845) (Snmp, PID=848) (PortSec, PID=850) (Ira, PID=852) (IgmpHostProxy, PID=1146) (EventMgr, PID=862) (Sysdb, PID=607) (CapiApp, PID=866) (Arp, PID=995) (StpTxRx, PID=871) (KernelFib, PID=1000) (StageMgr, PID=700) (Lag, PID=876) (Qos, PID=1005) (L2Rib, PID=1008) (PimBidirDf, PID=1137) (Tunnel, PID=883) (PimBsr, PID=1150) (Msdp, PID=1142) (BgpCliHelper, PID=1067) (TopoAgent, PID=1017) (Aaa, PID=890) (StpTopology, PID=891) (Ebra, PID=1022) (ReloadCauseAgent, PID=1023)
2020-08-13T12:52:43.731810+00:00 r01 ProcMgr-worker: %PROCMGR-6-PROCESS_STARTED: 'Vxlan' starting with PID=23482 (PPID=553) -- execing '/usr/bin/Vxlan'
2020-08-13T12:52:43.746053+00:00 r01 ProcMgr-worker: %PROCMGR-7-WORKER_WARMSTART_DONE: ProcMgr worker warm start done. (PID=553)
2020-08-13T12:52:43.746199+00:00 r01 ProcMgr-worker: %PROCMGR-6-PROCESS_STARTED: 'VxlanSwFwd' starting with PID=23484 (PPID=553) -- execing '/usr/bin/VxlanSwFwd'
2020-08-13T12:52:43.942447+00:00 r01 VxlanSwFwd: %AGENT-6-INITIALIZED: Agent 'VxlanSwFwd' initialized; pid=23487
2020-08-13T12:52:43.974473+00:00 r01 Vxlan: %AGENT-6-INITIALIZED: Agent 'Vxlan' initialized; pid=23485
2020-08-13T12:52:44.310150+00:00 r01 Launcher: %LAUNCHER-6-PROCESS_START: Configuring process 'Fhrp' to start in role 'AllSupervisors'
2020-08-13T12:52:44.512110+00:00 r01 ProcMgr-worker: %PROCMGR-6-WORKER_WARMSTART: ProcMgr worker warm start. (PID=553)
2020-08-13T12:52:44.538052+00:00 r01 ProcMgr-worker: %PROCMGR-7-NEW_PROCESSES: New processes configured to run under ProcMgr control: ['Fhrp']
2020-08-13T12:52:44.550918+00:00 r01 ProcMgr-worker: %PROCMGR-7-PROCESSES_ADOPTED: ProcMgr (PID=553) adopted running processes: (SharedSecretProfile, PID=1024) (Lldp, PID=832) (SlabMonitor, PID=555) (Pim, PID=1156) (MplsUtilLsp, PID=902) (Mpls, PID=903) (Isis, PID=1087) (PimBidir, PID=1164) (Igmp, PID=1172) (Acl, PID=920) (HostInject, PID=23450) (ArpSuppression, PID=23452) (StaticRoute, PID=1060) (IgmpSnooping, PID=1030) (IpRib, PID=1064) (Stp, PID=939) (KernelNetworkInfo, PID=940) (Vxlan, PID=23482) (Etba, PID=1073) (KernelMfib, PID=1139) (ConnectedRoute, PID=1076) (RouteInput, PID=1080) (EvpnrtrEncap, PID=1082) (McastCommon6, PID=956) (ConfigAgent, PID=702) (Fru, PID=703) (Launcher, PID=704) (Bgp, PID=1089) (McastCommon, PID=834) (SuperServer, PID=836) (OpenConfig, PID=839) (LacpTxAgent, PID=970) (AgentMonitor, PID=845) (Snmp, PID=848) (PortSec, PID=850) (Ira, PID=852) (IgmpHostProxy, PID=1146) (EventMgr, PID=862) (Sysdb, PID=607) (CapiApp, PID=866) (Arp, PID=995) (StpTxRx, PID=871) (KernelFib, PID=1000) (StageMgr, PID=700) (VxlanSwFwd, PID=23484) (Lag, PID=876) (Qos, PID=1005) (L2Rib, PID=1008) (PimBidirDf, PID=1137) (Tunnel, PID=883) (PimBsr, PID=1150) (Msdp, PID=1142) (BgpCliHelper, PID=1067) (TopoAgent, PID=1017) (Aaa, PID=890) (StpTopology, PID=891) (Ebra, PID=1022) (ReloadCauseAgent, PID=1023)
2020-08-13T12:52:44.565230+00:00 r01 ProcMgr-worker: %PROCMGR-7-WORKER_WARMSTART_DONE: ProcMgr worker warm start done. (PID=553)
2020-08-13T12:52:44.565339+00:00 r01 ProcMgr-worker: %PROCMGR-6-PROCESS_STARTED: 'Fhrp' starting with PID=23491 (PPID=553) -- execing '/usr/bin/Fhrp'
2020-08-13T12:52:44.720343+00:00 r01 Fhrp: %AGENT-6-INITIALIZED: Agent 'Fhrp' initialized; pid=23493

I would still like to be able to get the full config via netconf, just copy/pasting the rpc in the ssh shell (like juniper) or maybe using ydk like this. And, keep dreaming, to be able to fully configure the switch via netconf/ansible.

Streaming vEOS Telemetry

One thing I wanted to get my hands dirty with is telemetry. I found this blog post from Arista and I tried to follow it for vEOS.

As per the blog, I had to install go and docker in my VM (debian) on GCP.

Installed docker via aptitude:

# aptitude install docker.io
# aptitude install docker-compose
# aptitude install telnet

Installed golang by updating .bashrc:

########################
# Go configuration
########################
#
# git clone -b v0.0.4 https://github.com/wfarr/goenv.git $HOME/.goenv
if [ ! -d "$HOME/.goenv" ]; then
    git clone https://github.com/syndbg/goenv.git $HOME/.goenv
fi
#export GOPATH="$HOME/storage/golang/go"
#export GOBIN="$HOME/storage/golang/go/bin"
#export PATH="$GOPATH/bin:$PATH"
if [ -d "$HOME/.goenv"   ]
then
    export GOENV_ROOT="$HOME/.goenv"
    export PATH="$GOENV_ROOT/bin:$PATH"
    if  type "goenv" &> /dev/null; then
        eval "$(goenv init -)"
        # Add the version to my prompt
        __goversion (){
            if  type "goenv" &> /dev/null; then
                goenv_go_version=$(goenv version | sed -e 's/ .*//')
                printf $goenv_go_version
            fi
        }
        export PS1="go:\$(__goversion)|$PS1"
        export PATH="$GOROOT/bin:$PATH"
        export PATH="$PATH:$GOPATH/bin"
    fi
fi

################## End GoLang #####################

Then I started a new bash session to trigger the installation of goenv, and installed a go version:

$ goenv install 1.14.6
$ goenv global 1.14.6
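
Quick check that the right version is active:

$ goenv versions
$ go version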

Now we can start the docker containers for influx and grafana:

mkdir telemetry
cd telemetry
mkdir influxdb_data
mkdir grafana_data

cat docker-compose.yaml 
version: "3"

services:
 influxdb:
   container_name: influxdb-tele
   environment:
     INFLUXDB_DB: grpc
     INFLUXDB_ADMIN_USER: "admin"
     INFLUXDB_ADMIN_PASSWORD: "arista"
     INFLUXDB_USER: tac
     INFLUXDB_USER_PASSWORD: arista
     INFLUXDB_RETENTION_ENABLED: "false"
     INFLUXDB_OPENTSDB_0_ENABLED: "true"
     INFLUXDB_OPENTSDB_BIND_ADDRESS: ":4242"
     INFLUXDB_OPENTSDB_DATABASE: "grpc"
   ports:
     - '8086:8086'
     - '4242:4242'
     - '8083:8083'
   networks:
     - monitoring
   volumes:
     - influxdb_data:/var/lib/influxdb
   command:
     - '-config'
     - '/etc/influxdb/influxdb.conf'
   image: influxdb:latest
   restart: always

 grafana:
   container_name: grafana-tele
   environment:
     GF_SECURITY_ADMIN_USER: admin
     GF_SECURITY_ADMIN_PASSWORD: arista
   ports:
     - '3000:3000'
   networks:
     - monitoring
   volumes:
     - grafana_data:/var/lib/grafana
   image: grafana/grafana
   restart: always

networks:
 monitoring:

volumes:
 influxdb_data: {}
 grafana_data: {}

Now we should be able to start the containers:

sudo docker-compose up -d    // start containers
sudo docker-compose down -v  // for stopping containers

sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cad339ead2ee influxdb:latest "/entrypoint.sh -con…" 5 hours ago Up 5 hours 0.0.0.0:4242->4242/tcp, 0.0.0.0:8083->8083/tcp, 0.0.0.0:8086->8086/tcp influxdb-tele
ef88acc47ee3 grafana/grafana "/run.sh" 5 hours ago Up 5 hours 0.0.0.0:3000->3000/tcp grafana-tele

sudo docker network ls
NETWORK ID NAME DRIVER SCOPE
fe19e7876636 bridge bridge local
0a8770578f3f host host local
6e128a7682f1 none null local
3d27d0ed3ab3 telemetry_monitoring bridge local

sudo docker network inspect telemetry_monitoring
[
    {
        "Name": "telemetry_monitoring",
        "Id": "3d27d0ed3ab3b0530441206a128d849434a540f8e5a2c109ee368b01052ed418",
        "Created": "2020-08-12T11:22:03.05783331Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "cad339ead2eeb0b479bd6aa024cb2150fb1643a0a4a59e7729bb5ddf088eba19": {
                "Name": "influxdb-tele",
                "EndpointID": "e3c7f853766ed8afe6617c8fac358b3302de41f8aeab53d429ffd1a28b6df668",
                "MacAddress": "02:42:ac:12:00:03",
                "IPv4Address": "172.18.0.3/16",
                "IPv6Address": ""
            },
            "ef88acc47ee30667768c5af9bbd70b95903d3690c4d80b83ba774b298665d15d": {
                "Name": "grafana-tele",
                "EndpointID": "3fe2b424cbb66a93e9e06f4bcc2e7353a0b40b2d56777c8fee8726c96c97229a",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {
            "com.docker.compose.network": "monitoring",
            "com.docker.compose.project": "telemetry"
        }
    }
]

Now we have to generate the octsdb binary and copy it to the switches as per instructions

$ go get github.com/aristanetworks/goarista/cmd/octsdb
$ cd $GOPATH/src/github.com/aristanetworks/goarista/cmd/octsdb

$ GOOS=linux GOARCH=386 go build   // I used this one
$ GOOS=linux GOARCH=amd64 go build // if you use EOS 64 Bits

$ scp octsdb user@SWITCH_IP:/mnt/flash/

An important thing is the configuration file for octsdb. I struggled to get a config file that gave me CPU and interface counters. All the examples are based on hardware platforms, but I am using containers/VMs. Using this other blog post, I worked out the path for the interfaces in vEOS.

This is what I see in vEOS:

bash-4.2# curl localhost:6060/rest/Smash/counters/ethIntf/EtbaDut/current
{
    "counter": {
        "Ethernet1": {
            "counterRefreshTime": 0,
            "ethStatistics": {
...

bash-4.2# curl localhost:6060/rest/Kernel/proc/cpu/utilization/cpu/0
{
    "idle": 293338,
    "name": "0",
    "nice": 2965,
    "system": 1157399,
    "user": 353004,
    "util": 100
}

And this is what I see in cEOS. It seems this is not functional:

bash-4.2# curl localhost:6060/rest/Smash/counters/ethIntf
curl: (7) Failed to connect to localhost port 6060: Connection refused
bash-4.2# 
bash-4.2# 
bash-4.2# curl localhost:6060/rest/Kernel/proc/cpu/utilization/cpu/0
curl: (7) Failed to connect to localhost port 6060: Connection refused
bash-4.2# 

This is the file I have used and pushed to the switches:

$ cat veos4.23.json 
{
	"comment": "This is a sample configuration for vEOS 4.23",
	"subscriptions": [
		"/Smash/counters/ethIntf",
		"/Kernel/proc/cpu/utilization"
	],
	"metricPrefix": "eos",
	"metrics": {
		"intfCounter": {
			"path": "/Smash/counters/ethIntf/EtbaDut/current/(counter)/(?P<intf>.+)/statistics/(?P<direction>(?:in|out))(Octets|Errors|Discards)"
		},
		"intfPktCounter": {
			"path": "/Smash/counters/ethIntf/EtbaDut/current/(counter)/(?P<intf>.+)/statistics/(?P<direction>(?:in|out))(?P<type>(?:Ucast|Multicast|Broadcast))(Pkt)"
		},
                "totalCpu": {
                        "path": "/Kernel/proc/(cpu)/(utilization)/(total)/(?P<type>.+)"
                },
                "coreCpu": {
                        "path": "/Kernel/proc/(cpu)/(utilization)/(.+)/(?P<type>.+)"
                }
	}
}
$ scp veos4.23.json r1:/mnt/flash

Now you have to configure the switch to generate and send the data. In my case, I am using the MGMT vrf.

!
daemon TerminAttr
   exec /usr/bin/TerminAttr -disableaaa -grpcaddr MGMT/0.0.0.0:6042
   no shutdown
!
daemon octsdb
   exec /sbin/ip netns exec ns-MGMT /mnt/flash/octsdb -addr 192.168.249.4:6042 -config /mnt/flash/veos4.23.json -tsdb 10.128.0.4:4242
   no shutdown
!

TerminAttr is listening on 0.0.0.0:6042. "octsdb" is using the mgmt IP 192.168.249.4 (which is in the MGMT vrf) to connect to the influxdb container that is running on 10.128.0.4:4242.

Verify that things are running:

# From the switch
show agent octsdb log

# From InfluxDB container
$ sudo docker exec -it influxdb-tele bash
root@cad339ead2ee:/# influx -precision 'rfc3339'
Connected to http://localhost:8086 version 1.8.1
InfluxDB shell version: 1.8.1
> show databases
name: databases
name
----
grpc
_internal
> use grpc
Using database grpc
> show measurements
name: measurements
name
----
eos.corecpu.cpu.utilization._counts
eos.corecpu.cpu.utilization.cpu.0
eos.corecpu.cpu.utilization.total
eos.intfcounter.counter.discards
eos.intfcounter.counter.errors
eos.intfcounter.counter.octets
eos.intfpktcounter.counter.pkt
eos.totalcpu.cpu.utilization.total
> exit
root@cad339ead2ee:/# exit

Now we need to configure grafana. First we create a datasource connection to influxdb. Again, I struggled with the URL. Influx and grafana are two containers running on the same host. Initially I was using localhost and it was failing. In the end I had to find out the IP assigned to the influxdb container and use that.
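
One way to get that IP without digging through the whole "docker network inspect" output (standard docker inspect template):

$ sudo docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' influxdb-tele
172.18.0.3

So in this setup the grafana datasource URL ends up being http://172.18.0.3:8086.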

Now you can create a dashboard with panels for the octet counters, the packet types and the CPU usage.

Keep in mind that I am not 100% sure my grafana panels are correct for CPU and Counters (PktCounter makes sense)

At some point, I want to try telemetry for LANZ via this telerista plugin.

Multicast + 5G

I have been cleaning up my mailbox and found some interesting stuff. This is from APNIC regarding a new approach to deploying multicast. Slides are on the nanog page (check Tuesday). At my former employer, we suffered traffic congestion several times after some famous games got new updates. So it is interesting that Akamai is trying to deploy inter-domain multicast on the internet. They have a massive network and I guess they suffered with those updates, and this is an attempt to "digest" those spikes better. At minute 16 you can see the network changes required. It doesn't look like a quick/easy change but it would be a great thing to happen.

And reading a nanog conversation about 5G, I realised that this technology promises high bandwidth (which could remove the need for multicast). But shouldn't we still have a smarter way to deliver the same content to eyeball networks?

From the nanog thread, there are several links to videos about 5G, like this one from verizon that gives the vision from a big provider and its providers (not technical). This one is more technical, with 5G terms (I lost touch with telco terminology around early 4G). As well, I see kubernetes mentioned in 5G deployments quite often. I guess that is something new to learn.

Vim + Golang

I am trying to learn a bit of golang (although sometimes I think I should try to master python and bash first..) with this udemy course.

The same way I have pyenv/virtualenv to create python environments, I want to do the same for golang. For that there is goenv.

Based on the goenv install instructions and an ex-colleague's snippet, this is my goenv snippet in .bashrc:

########################
# Go configuration
########################
#
# git clone -b v0.0.4 https://github.com/wfarr/goenv.git $HOME/.goenv
if [ ! -d "$HOME/.goenv" ]; then
    git clone https://github.com/syndbg/goenv.git $HOME/.goenv
fi

if [ -d "$HOME/.goenv"   ]
then
    export GOENV_ROOT="$HOME/.goenv"
    export PATH="$GOENV_ROOT/bin:$PATH"
    if  type "goenv" &> /dev/null; then
        eval "$(goenv init -)"
        # Add the version to my prompt
        __goversion (){
            if  type "goenv" &> /dev/null; then
                goenv_go_version=$(goenv version | sed -e 's/ .*//')
                printf $goenv_go_version
            fi
        }
        #PS1_GO="go:\$(__goversion) "
        export PS1="go:\$(__goversion)|$PS1"
        export PATH="$GOROOT/bin:$PATH"
        export PATH="$PATH:$GOPATH/bin"
    fi
fi

################## End GoLang #####################

From time to time, remember to go to ~/.goenv and do a "git pull" to get the latest versions of golang.

Ok, now that we can install any golang version, I was thinking about the equivalent of python virtualenv, but it seems it is not really needed in golang. At the moment I am a complete beginner, so no rush about this.

And finally, as I try to use VIM for everything so I keep learning it, I want plugins for golang similar to the python ones. So I searched and this one looks quite good: vim-go.

So I updated vundle config in .vimrc:

Plugin 'fatih/vim-go', { 'do': ':GoUpdateBinaries' }

Then install the new plugin from within VIM:

:PluginInstall

or

:PluginUpdate

There is a good tutorial you can follow to learn the new commands.

I am happy enough with “GoRun”, “GoFmt”, “GoImports”, “GoTest”

Keep practising, Keep learning.

OOB

I was reading this blog and realised that OOB is something that is not talked about very often. This is based on what I have seen in my career:

Design

You need to sell the idea that this is a must. Then you need to secure some budget. You don't need much:

1x switch

1x firewall

1x Internet access (even if you have your own ASN and IP range, don't use them here)

Keep it simple..

Most network kit (firewalls, routers, switches, pdus, console servers, etc) has 1x mgmt port and 1x console port, so all those console ports need to go to the console server. I guess most server vendors offer some OOB access (I only know Dell and HP); all of those go to the oob switch.

If you have a massive network with hundreds of devices/servers, then you will need more oob switches and console servers. You still need just one firewall and one internet connection. The blog comments on a spine-leaf oob network; I guess this is the way to go for a massive network/DC.

Access to OOB

You need to be able to access it via your corporate network and from anywhere in the internet.

You need to be sure linux/windows/mac clients can VPN in.

Use very strong passwords and keys.

You need to be sure the oob firewall is quite tight on access. At the end of the day, you only want to allow ssh to the console server and https to the iLO/iDRACs. Nothing initiated internally should be able to go to the internet.

Dependencies

Think about the worst-case scenario: your DNS server is down, your authentication is down.

You need to be sure you have local auth enabled on all devices for emergencies.

You need to work out some DNS service. Write the key IPs in the documentation?

Your IP transit has to be reliable. You don't need a massive pipe, but you need to be sure it is up.

Monitoring

You don't want to be in the middle of an outage and realise that your OOB is not functional. You need to be sure the ISP for the OOB is up and the devices (oob switch and oob firewall) are functional at all times.

How to check the serial connections? conserver.com

Documentation

Another point frequently forgotten. You need to be sure people can find info about the OOB: how it is built and how to access it.

Training

At the end of the day, if you have a super OOB network but nobody knows how to connect to it and use it, then it is useless. Schedule routine checks with the team to be sure everybody can use the OOB. This pays off when you get a call at 3am.

Diagram

Update

Funnily enough, I was watching NLNOG live today and there was a presentation about OOB with two different approaches: in-band out-of-band and pure out-of-band.

From the NTT side, I liked the comment about conserver.com to manage your serial connections. I will try to use it once I have access to a new network.

Forward TCPDump to Wireshark

Reading this blog entry, I realised that I have very likely never tried forwarding tcpdump output to wireshark. How many times have I taken a pcap on a switch and then had to download it to see the details in wireshark…

I guess firewalls could block this in some places (at least for the 2-step option).

So I tried the single command with a switch in my ceoslab and it works!

Why does it work?

ssh <username>@<switch>  "bash tcpdump -s 0 -U -n -w - -i <interface>" | wireshark -k -i -

The ssh command is actually executing the "bash tcpdump…" remotely. The key is the "-U" and "-w -" flags. "-U" in conjunction with "-w" sends each packet as soon as it is captured, without waiting for the buffer to fill. "-w -" writes the output to stdout instead of a file. If you run the command without -U, it still works, but it updates a bit more slowly as it needs to fill the buffers first.

From tcpdump manual:

       -U
       --packet-buffered
              If the -w option is not specified, make the printed packet output ``packet-buffered''; i.e., as the description of the contents of each packet is printed, it will be written to the standard  output,  rather  than, when not writing to a terminal, being written only when the output buffer fills.

              If  the  -w  option  is  specified, make the saved raw packet output ``packet-buffered''; i.e., as each packet is saved, it will be written to the output file, rather than being written only when the output buffer fills.

              The -U flag will not be supported if tcpdump was built with an older version of libpcap that lacks the pcap_dump_flush() function.

......
   -w file
          Write the raw packets to file rather than parsing and printing them out.  They can later be printed with the -r option.  Standard output is used if file is ``-''.

          This  output will be buffered if written to a file or pipe, so a program reading from the file or pipe may not see packets for an arbitrary amount of time after they are received.  Use the -U flag to cause packets to be written as soon as they are received.

The stdout of that remote process becomes the stdout of the ssh command, so we redirect that output via a pipe "|" and it is sent as input to wireshark thanks to "-i -", which makes wireshark read from stdin (which is the stdout of the tcpdump running on the switch!).

The wireshark manual:

       -i|--interface  <capture interface>|-
           Set the name of the network interface or pipe to use for live packet capture.

           Network interface names should match one of the names listed in "wireshark -D" (described above); a number, as reported by "wireshark -D", can also be used.  If you're using UNIX, "netstat -i", "ifconfig -a" or "ip link" might also work to list interface names, although not all versions of UNIX support the -a flag to ifconfig.

           If no interface is specified, Wireshark searches the list of interfaces, choosing the first non-loopback interface if there are any non-loopback interfaces, and choosing the first loopback interface if there are no non-loopback interfaces.  If there are no interfaces at all, Wireshark reports an error and doesn't start the capture.

           Pipe names should be either the name of a FIFO (named pipe) or "-" to read data from the standard input.  On Windows systems, pipe names must be of the form "\\pipe\.\pipename".  Data read from pipes must be in standard pcapng or pcap format. Pcapng data must have the same endianness as the capturing host.

           This option can occur multiple times. When capturing from multiple interfaces, the capture file will be saved in pcapng format.

....

       -k  Start the capture session immediately.  If the -i flag was specified, the capture uses the specified interface.  Otherwise, Wireshark searches the list of interfaces, choosing the first non-loopback interface if
           there are any non-loopback interfaces, and choosing the first loopback interface if there are no non-loopback interfaces; if there are no interfaces, Wireshark reports an error and doesn't start the capture.

The two-step option relies on "nc" to send/receive the data, but it is the same idea regarding the tcpdump/wireshark flags using "-".

On switch: tcpdump -s 0 -U -n -w - -i <interface> | nc <computer-ip> <port>
On PC: netcat -l -p <port> | wireshark -k -S -i -

Linux Networking – Bonding/Bridging/VxLAN

Bonding

$ sudo modprobe bonding
$ ip link help bond
$ sudo ip link add bond0 type bond mode 802.3ad
$ sudo ip link set eth0 master bond0
$ sudo ip link set eth1 master bond0
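
To bring the bond up and verify it (standard iproute2 commands plus the kernel's bonding proc file):

$ sudo ip link set up dev bond0
$ cat /proc/net/bonding/bond0    // shows mode, LACP info and slave status
$ ip -d link show bond0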

Bridging: vlans + trunks

ip neigh show // l2 table
ip route show // l3 table

ip route add default via 192.168.1.1 dev eth1

sudo modprobe 8021q

// create bridge and add links to bridge (switch)
sudo ip link add br0 type bridge vlan_filtering 1 // native vlan = 1
sudo ip link set eth1 master br0
sudo ip link set eth2 master br0
sudo ip link set eth3 master br0

// make eth1 access port for v11
sudo bridge vlan add dev eth1 vid 11 pvid untagged

// make eth3 access port for v12
sudo bridge vlan add dev eth3 vid 12 pvid untagged

// make eth2 trunk port for v11 and v12
sudo bridge vlan add dev eth2 vid 11
sudo bridge vlan add dev eth2 vid 12

// enable bridge and links
sudo ip link set up dev br0
sudo ip link set up dev eth1
sudo ip link set up dev eth2
sudo ip link set up dev eth3

bridge link show
bridge vlan show
bridge fdb show

VxLAN

I haven't tried this yet:

Linux System 1
  sudo ip link add br0 type bridge vlan_filtering 1
  sudo ip link add vlan10 type vlan id 10 link bridge protocol none
  sudo ip addr add 10.0.0.1/24 dev vlan10
  sudo ip link add vtep10 type vxlan id 1010 local 10.1.0.1 remote 10.3.0.1 learning
  sudo ip link set eth1 master br0
  sudo bridge vlan add dev eth1 vid 10 pvid untagged

Linux System 2
  sudo ip link add br0 type bridge vlan_filtering 1
  sudo ip link add vlan10 type vlan id 10 link bridge protocol none
  sudo ip addr add 10.0.0.2/24 dev vlan10
  sudo ip link add vtep10 type vxlan id 1010 local 10.3.0.1 remote 10.1.0.1 learning
  sudo ip link set eth1 master br0
  sudo bridge vlan add dev eth1 vid 10 pvid untagged
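
I haven't verified this part either, but I think the VTEP still needs to be attached to the bridge, mapped to the vlan and brought up; something like this on both systems:

  sudo ip link set vtep10 master br0
  sudo bridge vlan add dev vtep10 vid 10 pvid untagged
  sudo ip link set up dev br0
  sudo ip link set up dev eth1
  sudo ip link set up dev vtep10
  bridge fdb show dev vtep10   // check MACs learned over the tunnel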