Essentialism

This is a book about “simplifying” your life: removing what is not necessary and focusing on what is important. These are nice words, but actually very difficult to accomplish nowadays. I have read several books that are closely related to this subject, like “Indistractable”, “Drive”, “Deep Work”, “Flow”, “Atomic Habits”, etc. The focus is on “less is better”. If you don't set a limit, somebody will do it for you.

For me, the struggle is on the “Tech” side. I want to learn so many things that at the end of the day/week/month/year I notice I haven’t got anywhere. I have hundreds of tabs open in my browser with things I want to read “soon”. Something similar happens with recipes: I have so many magazine clippings, videos and pics on my phone that I feel overwhelmed. At least I am focused on my climbing (getting fitter), baking (bread!)/cooking and reading.

The author explains the process to become an Essentialist as three phases:

Explore:

  • Create space. It is good to be bored. Read old books. Meaning.
  • Play: Sir Ken Robinson
  • Sleep: Protect the most important asset, you!
  • Select: Hell yeah or no. Trade-offs. Good to Great (book).

Eliminate:

  • Clarify: clean up the wardrobe
  • Dare: say “no”
  • Uncommit: sunk cost, endowment effect, fear of missing out
  • Edit: options, condense
  • Limit: you may pay a price for setting boundaries, BUT boundaries are freedom!

Execute:

  • Buffer: add 50% to your planning. Extreme preparation.
  • Subtract
  • Progress: repeat, repeat
  • Flow: create routines

This is the typical book that I put in my stack of good reads, to read again at some point so I can refresh the concepts, because I forget things. Even though I finished it only a couple of days ago, I have the feeling that I have forgotten most of it. And in this case, I gave the book away, so I am struggling even more to get my notes/thoughts down here 🙂

Pandas

This is something I had heard about in the past but never used. So this week, as I finally decided to write a script to help me find the peers of flapping ports, I learned about pandas for the first time. I used another script as “inspiration” and, after seeing it was using pandas, I decided to read a bit about it and give it a go.

The main type is the DataFrame. In my head, pandas is just a library to deal with CSV files, spreadsheets, etc., like when you use a program such as LibreOffice. And this page gave me the hints for creating the query I wanted to make.

So in the end I have a very basic script, but it saves me the time of logging in to each device to find the peer port.

Of course, there are different ways to tackle this problem, but in my environment, the source of truth for links is in a file. You could have that info in the port description too, or in a database, etc.

$ python3 flapping-peer.py -f flapping-list.txt

Result:
SW1 Ethernet1/1 SW2 Ethernet1/1
SW1 Ethernet1/4 SW2 Ethernet1/4

$  
$ cat flapping-list.txt 
SW1,Ethernet1/1
SW2,Ethernet1/4
$
$ cat patching-file.csv 
Site,Source Device,Source Interface,Destination Device,Destination Interface,Media
A,SW1,Ethernet1/1,SW2,Ethernet1/1,SMF
A,SW1,Ethernet1/2,SW2,Ethernet1/2,SMF
A,SW1,Ethernet1/3,SW2,Ethernet1/3,SMF
A,SW1,Ethernet1/4,SW2,Ethernet1/4,SMF
A,SW1,Ethernet1/5,SW2,Ethernet1/5,SMF
$ 
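The script itself is not shown here, but a minimal sketch of the lookup could look like the following. It assumes the column names from patching-file.csv above; the function name find_peers and the inline sample strings are made up for illustration, so take it as one possible approach rather than the actual flapping-peer.py.

```python
import io

import pandas as pd

# Sample data copied from the two files shown above, inlined so the sketch is self-contained.
PATCHING_CSV = """Site,Source Device,Source Interface,Destination Device,Destination Interface,Media
A,SW1,Ethernet1/1,SW2,Ethernet1/1,SMF
A,SW1,Ethernet1/2,SW2,Ethernet1/2,SMF
A,SW1,Ethernet1/3,SW2,Ethernet1/3,SMF
A,SW1,Ethernet1/4,SW2,Ethernet1/4,SMF
A,SW1,Ethernet1/5,SW2,Ethernet1/5,SMF
"""
FLAPPING_LIST = """SW1,Ethernet1/1
SW2,Ethernet1/4
"""


def find_peers(patching: pd.DataFrame, flapping: pd.DataFrame) -> list[str]:
    """Return one 'src_dev src_if dst_dev dst_if' line per matching link (hypothetical helper)."""
    results = []
    for device, interface in flapping.itertuples(index=False):
        # A flapping port can be on either end of a link, so check both sides.
        is_source = (patching["Source Device"] == device) & (
            patching["Source Interface"] == interface
        )
        is_dest = (patching["Destination Device"] == device) & (
            patching["Destination Interface"] == interface
        )
        for row in patching[is_source | is_dest].itertuples(index=False):
            # Columns 1..4 are source device/interface and destination device/interface.
            results.append(" ".join(row[1:5]))
    return results


patching = pd.read_csv(io.StringIO(PATCHING_CSV))
flapping = pd.read_csv(
    io.StringIO(FLAPPING_LIST), header=None, names=["Device", "Interface"]
)

lines = find_peers(patching, flapping)
print("\n".join(lines))
```

With the sample data this prints the same two lines as the result above. Matching on both the source and the destination columns is what lets SW2,Ethernet1/4 find its link even though SW2 is listed as the destination in the patching file.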

TCP Asymmetric

An issue was escalated to me recently that had caused several outages and needed an urgent fix.

For different reasons, we had asymmetric routing in SITE-A. The normal flow is the green arrow. During the asymmetric routing, the flow is the red line. Routing-wise, things should work. BUT we have firewalls in the path. The firewalls were configured to allow asymmetric connections (or so I was told). As far as I could see in the config and logs, nothing was dropped by the firewalls during the issue.

So first of all, I fixed the asymmetric routing so it didn't happen again. It took me a while to come up with the solution (and it was quite simple), as I had to properly understand the routing before and during the issue. The diagram is quite simplified at the end of the day.

So during the maintenance window when I applied the fix for the asymmetric routing, I managed to take some traces on the firewalls, as I was trying to understand where the traffic was dropped or lost during the asymmetric scenario. Also, I was not very familiar with several parts of the network and the monitoring, so I didn't know which links were already tapped. Once I was happy with the routing fix, I took a look at the traces. At a high level, I could see the return traffic leaving FW1 and leaving DC1-SW1. Based on that, I started to think that the firewalls were fine…

In another maintenance, I tried to take more logs in different parts of the network, and I could clearly see the traffic reaching A-SW1. As I ran out of time and had failed to tap some links, I couldn't carry on.

So based on the second maintenance, the issue had to be inside SITE-A. Somehow it didn't make sense. I checked that I didn't have uRPF enabled. The rest was pure L2, so it couldn't see the L3…

So in the third maintenance, I got all my debugging tools ready to verify whether any network kit was dropping the traffic in SITE-A… and it was useless. Then I realized that I could run a tcpdump on the client IP1 I was using for testing, and I could see some return traffic!!!!

So, I was just shocked. I didn't get it. It didn't make sense.

So I went back and reviewed the TCP captures I had taken on each interface of both firewalls. I was trying to get back to basics.

I was assuming the TCP handshake had completed properly. After paying a bit more attention to the client logs… I could see the TCP handshake completed. And I could see the HTTP GET reaching and leaving DC2-FW… so why was the server IP2 not answering!!!!???

So back to the TCP handshake and the firewall captures, comparing them step by step. Somehow, I had missed that the TCP ACK from client IP1 was reaching DC2-FW… but it was not leaving DC2-FW!!!! Even worse, the HTTP GET was actually crossing DC2-FW!!!

SLAP IN THE FACE!!!

This is the TCP handshake. This is networking 101…..

The TCP state-machine in client and server during the asymmetric scenario

So I was assuming that, because the client was sending the HTTP GET, the TCP handshake had completed at both ends!!!!

It hadn't made sense why I was seeing TCP SYN-ACK retransmissions from the server IP2… BECAUSE the TCP ACK from client IP1 never reached it.

For that reason server IP2 never answered the HTTP GET: from its end, the TCP handshake was not completed.
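To make the mismatch between both ends concrete, here is a small toy model of the three-way handshake. It is only a sketch: the transition table covers just the handshake states from the standard TCP state diagram, and the drop_final_ack parameter is a made-up stand-in for a device in the path discarding the client's ACK.

```python
# (current state, event) -> next state, covering only the three-way handshake.
TRANSITIONS = {
    ("CLOSED", "send SYN"): "SYN-SENT",
    ("LISTEN", "recv SYN"): "SYN-RECEIVED",       # server replies with SYN-ACK
    ("SYN-SENT", "recv SYN-ACK"): "ESTABLISHED",  # client replies with ACK
    ("SYN-RECEIVED", "recv ACK"): "ESTABLISHED",
}


def handshake(drop_final_ack: bool) -> dict:
    """Walk through the handshake and return the final state of each end."""
    state = {"client": "CLOSED", "server": "LISTEN"}
    steps = [
        ("client", "send SYN", False),
        ("server", "recv SYN", False),
        ("client", "recv SYN-ACK", False),
        ("server", "recv ACK", drop_final_ack),
    ]
    for host, event, dropped in steps:
        if dropped:
            continue  # the packet never arrives, so that end does not change state
        state[host] = TRANSITIONS[(state[host], event)]
    return state


normal = handshake(drop_final_ack=False)
asymmetric = handshake(drop_final_ack=True)
print("normal:    ", normal)
print("asymmetric:", asymmetric)
```

In the second case the client already considers the connection open, while the server is still waiting for the ACK. That fits the SYN-ACK retransmissions seen in the captures.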

I banged my head on the table several times. I “saw” this during the first maintenance window when I took the tcpdump on the firewalls, BUT I didn't pay attention to the basic details.

I relied too much on looking at a Wireshark trace, because it is more visual and shows more info, but the clues were there all the time in the tcpdump output from the firewalls, which I didn't bother to read with full attention.

At least, I found out where and why the connections failed during the asymmetric routing scenario. A firewall upgrade did the job.

So all fixed.

Lessons learned:

  • without a proper foundation, you can’t build knowledge (TCP handshake state in client and server)
  • when things don't make sense, get back to basics (TCP handshake)
  • get the most out of the tools at hand (tcpdump: PSH packets were the HTTP GET !!!!)
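As a reminder for the last point, tcpdump prints the TCP flags as single characters inside square brackets, where "." means ACK. A tiny helper like the one below (decode_flags is a made-up name, just a sketch) spells them out, which makes it easier to tell the handshake packets from the ones carrying data.

```python
# Flag characters as documented in the tcpdump man page.
FLAG_NAMES = {
    "S": "SYN",
    "F": "FIN",
    "P": "PSH",
    "R": "RST",
    "U": "URG",
    "W": "CWR",
    "E": "ECE",
    ".": "ACK",
}


def decode_flags(flags: str) -> list[str]:
    """Translate the text between the brackets of 'Flags [..]' into flag names."""
    if flags == "none":
        return []
    return [FLAG_NAMES[char] for char in flags if char in FLAG_NAMES]


for field in ["S", "S.", ".", "P."]:
    print(f"[{field}] ->", "+".join(decode_flags(field)))
```

So a handshake reads as [S], [S.], [.], and a packet marked [P.] is pushing data, which in this case was the HTTP GET.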