I was quite surprised by the background and the links between both sides. We know that there are still some big companies today that collaborated with the regime at that time, but I didn't know of any related to today's IT.
Today I had to troubleshoot a websocket issue. I had never dealt with this before. I was told that the HAproxy config was fine, so it had to be our NGFW doing something nasty at L7.
The connection directly to the server doing websocket was fine from my PC, but due to a requirement we need to put that server behind a HAproxy. The connection from my PC to the haproxy that is proxying the websocket service failed…
Funny enough, HAproxy and the websocket service were running on the same host.
As usual I took a look at the firewall logs. Nothing wrong there at first sight. I took a tcpdump from my pc when connecting to the websocket service and to the haproxy.
The service is very verbose and it is difficult to follow in the capture files as it spawns several connections. I went to the easy part, the capture to the haproxy was showing a lot of TCP retransmissions… The other trace to the websocket service was pretty clean.
Taking into account that the path from my PC to the haproxy server is always the same (and I was going through a VPN), I thought it was either an NGFW issue or something between HAproxy and the websocket service (that is a localhost connection).
As well, I was seeing weird things latency-wise. Some TCP resets were taking more than 200ms to arrive at the server when the average RTT was 3ms.
I tried to take a tcpdump between the haproxy service and the websocket service just in case the packet loss was caused locally. The capture was chaos to follow. I had to understand the sessions in HAproxy better.
I changed direction and I went to the NGFW and created a rule that disabled any fancy security check for me to the haproxy server. I wanted to be sure the firewall was innocent.
It was. Same issue. I tried different browsers and always the same.
So I was nearly sure the problem was in HAproxy but I had to prove it. I kind of failed checking the backend connection (haproxy to websocket proxy) so I took another look at the trace from my PC to haproxy. I was quite frustrated because there were so many connections opened, and then retransmissions started happening, that I couldn't really see any problem.
By luck, I noticed that in the good trace (the one going directly to the websocket service) I could see an HTTP GET request for "socket" from my PC. Keep in mind that I have no idea how websocket works. I tried to find a similar request in the haproxy trace, and I saw the problem…
Rejected HTTP GET socket request
and this is a good connection:
Successful HTTP GET socket request
So in the end, HAproxy was at fault (we don't know how to fix it yet, though) and my firewall (for once) is innocent.
In summary, I got overwhelmed by the TCP retransmissions. I was lucky that I saw the GET socket request and I assumed that had to be the way to get the websocket connection established. So I should have started by investigating how a websocket connection is established. As well, I didn't manage to find the HAproxy logs; I am pretty sure I would have found the same answer there. So I need to learn to check that.
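For reference, a websocket connection starts as a plain HTTP GET that asks the server to upgrade the connection, which matches the GET for "socket" seen in the capture. Below is a minimal sketch of that handshake as described in RFC 6455. The host and path are hypothetical placeholders, not the ones from my traces, and this is not a claim about what exactly HAproxy rejected.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host: str, path: str, key: str) -> str:
    """Build the HTTP GET a client sends to open a websocket.

    host and path are placeholders chosen by the caller; key is a
    base64-encoded 16-byte nonce generated by the client.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",  # blank line that ends the header block
        "",
    ]
    return "\r\n".join(lines)


def expected_accept(key: str) -> str:
    """Value the server must return in Sec-WebSocket-Accept.

    A successful handshake is answered with "101 Switching Protocols"
    plus this header; anything else means the upgrade failed.
    """
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    sample_key = "dGhlIHNhbXBsZSBub25jZQ=="  # sample nonce from RFC 6455
    print(build_upgrade_request("example.invalid", "/socket", sample_key))
    print(expected_accept(sample_key))
```

When reading a capture, this is the request/response pair to look for first: the GET with the Upgrade headers, and a 101 reply carrying the matching accept value.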
I learned something new. As usual, it didn't come easy or quick.
But in the end, it is not all about the Bloom filters. It is about understanding how things work under the hood and seeing whether they are actually delivering; if not, you should change your approach. So the debugging section "A secret weapon – a profiler" is very good. Profiling is not one of my strengths, so the tools used are the ones I need to understand and use more often:
strace -cf
perf stat -d
perf record
perf report | head -n 20
perf annotate process_line --source
google-perftools with kcachegrind
As well, the reference to the performance numbers that are good to keep in mind:
Notice the magnitude differences in the performance of different options.
Datacenters are far away so it takes a long time to send anything between them.
Memory is fast and disks are slow.
By using a cheap compression algorithm a lot (by a factor of 2) of network bandwidth can be saved.
Writes are 40 times more expensive than reads.
Global shared data is expensive. This is a fundamental limitation of distributed systems. The lock contention in shared heavily written objects kills performance as transactions become serialized and slow.
Architect for scaling writes.
Optimize for low write contention.
Optimize wide. Make writes as parallel as you can.
As well, “The lessons learned” is a great summary of his trip.
Sequential memory access great / Random memory access costly -> cache prefetching
Advanced data structures to fit L3: optimize for a reduced number of loads rather than the amount of memory used.
There are many things that I didn't know but these two caught my attention:
The root DNS zone of the internet is composed of thirteen DNS server clusters. There are only 13 server clusters, because that's all we can fit in a single UDP packet. Historically, DNS has operated through UDP packets, meaning the response to a request can never be more than 512 bytes.
I knew there were 13 root DNS clusters but I didn't think the reason was the UDP packet size!
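To convince myself, here is a back-of-the-envelope sketch of how big the answer listing the 13 root servers is on the wire. It assumes the standard DNS record layout with name compression, the a.root-servers.net to m.root-servers.net naming, and IPv4 glue only (no AAAA records, no EDNS). Those assumptions are mine, not from the text quoted above.

```python
HEADER = 12                # fixed DNS header
QUESTION = 1 + 2 + 2       # root name (1 byte) + QTYPE + QCLASS
RR_FIXED = 2 + 2 + 4 + 2   # TYPE + CLASS + TTL + RDLENGTH


def encoded_name_len(name: str) -> int:
    """Length of an uncompressed DNS name: length byte per label + final 0."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def priming_response_size(n_servers: int) -> int:
    """Approximate size of a '. NS' response with one A record per server."""
    root_owner = 1                                   # owner name "." is 1 byte
    first_rdata = encoded_name_len("a.root-servers.net")
    # Later names reuse "root-servers.net" through a 2-byte compression
    # pointer, so they only cost a one-letter label plus the pointer.
    other_rdata = (1 + 1) + 2
    ns_section = (root_owner + RR_FIXED + first_rdata) + (n_servers - 1) * (
        root_owner + RR_FIXED + other_rdata
    )
    # Each A record: 2-byte pointer as owner name + fixed fields + 4-byte IPv4.
    a_section = n_servers * (2 + RR_FIXED + 4)
    return HEADER + QUESTION + ns_section + a_section


if __name__ == "__main__":
    size = priming_response_size(13)
    print(size, "bytes, limit is 512")
```

With these assumptions the 13 servers land comfortably under the 512-byte limit, but there is not much room left for a lot more entries.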
And from Punycode, interesting you can create emoji urls!
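A minimal sketch of the Punycode part, using the "punycode" codec that ships with Python. It only converts a single label and skips the normalization and validation steps that real IDNA processing applies, so treat the helper names and the behaviour as a simplified illustration.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def to_ace_label(label: str) -> str:
    """Encode one DNS label; pure-ASCII labels are returned unchanged."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace_label(label: str) -> str:
    """Decode one DNS label back to Unicode if it carries the ACE prefix."""
    if label.startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


if __name__ == "__main__":
    for text in ("b\u00fccher", "\U0001F600", "example"):
        encoded = to_ace_label(text)
        print(repr(text), "->", encoded, "->", repr(from_ace_label(encoded)))
```

The emoji goes through exactly the same path as any other non-ASCII character, which is why emoji domain labels are possible at the encoding level.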
I use gkrellm as my linux monitoring app. I have used it since I started but something I miss is I would like to know what app and destination IPs are causing a traffic spike in my laptop.
Searching a bit, I came up with this page with several tools:
The article is a couple of years old but I think it is still relevant. Most people I know have their infrastructure in the cloud. In my current job we are still based on bare metal due to the nature of our business, but some years ago we were at that point of deciding what to do with our CI/CD environment. I wasn't involved in that decision (only in the deployment/implementation). Our capex was higher but long term (3y), it was cheaper to build on premises than in the cloud. I agree with the article that when you don't know how things are going to grow, scale requirements, etc, cloud is the best choice. Once you are past the start-up phase, you should reconsider the position.
A couple of weeks ago, at work, the sysadmin guys were working on some ZFS issues. They were talking about ZIL and ARC, and I had no idea what those were.
I always wanted to run ZFS, so I think in early 2019 I configured my laptop to use ZFS, not in the root partition but in a different partition. I had to configure my Debian Testing to support ZFS (I don't remember if it was very difficult) and then back up some data to make room for my new ZFS partition.
For ZFS basics, you can follow the link below but there are many good tutorials if you search in your favourite engine:
# zpool status
pool: storage
state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(5) for details.
scan: scrub repaired 0B in 0 days 00:10:39 with 0 errors on Sun Jan 12 00:34:40 2020
config:
NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
laptop--vg-storage ONLINE 0 0 0
errors: No known data errors
#
This is too basic; in most cases you will want to have some kind of RAID. But again, this is a simple laptop. As well, you can configure snapshots (useful if you want to be able to roll back a server upgrade that involves a huge amount of data) and other performance parameters (as per document below):
Make it permanent, edit /etc/sysctl.conf like this
# Based on https://www.simula.no/file/lj-219-jul-2012pdf/download
# enabling tcp thin-stream modifications for reducing latency in interactive apps
net.ipv4.tcp_thin_linear_timeouts = 1
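Before testing, it is worth confirming the kernel actually picked up the value. On Linux every sysctl key is exposed as a file under /proc/sys, so a small sketch like the one below can read it back. The helper names are mine, and the dot-to-slash mapping is the simple case (it would break for keys whose components contain dots, such as VLAN interface names).

```python
from pathlib import Path

PROC_SYS = Path("/proc/sys")


def sysctl_path(key: str) -> Path:
    """Map a sysctl key like 'net.ipv4.foo' to its /proc/sys file."""
    return PROC_SYS / key.replace(".", "/")


def read_sysctl(key: str):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return sysctl_path(key).read_text().strip()
    except OSError:
        # Not on Linux, key not supported by this kernel, or no permission.
        return None


if __name__ == "__main__":
    key = "net.ipv4.tcp_thin_linear_timeouts"
    print(key, "=", read_sysctl(key))
```

If the value printed is still 0 after editing /etc/sysctl.conf, the file has probably not been reloaded yet (for example with sysctl -p or a reboot).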
Now it is time to test and see if you see any improvement or degradation!
This week I attended a webinar from Alex Blewitt about CPU microarchitecture to increase application performance. The link was sent by a work colleague but you can get the pdf and see the presentation from the source below:
As you can see, my humble laptop has just one NUMA node, with two cores/processors and two hyperthreads per core.
But in a server, very likely you will have more NUMA nodes, more cores and more processors, so you want to be sure they are used properly.
I am not an expert in CPU performance at all, but there are many important points like memory allocation, huge pages, pinning memory/threads (isolcpus, taskset, etc), compiler strategies and tools to test the performance. Some of them ring a bell and it is nice to know that they exist. You never know when you will have to dive into this type of water.
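As a small taste of the pinning part, this is a sketch of what taskset does, done from inside a Python process with the standard library. It is Linux-only (the calls do not exist on other platforms), the function name is mine, and picking the lowest-numbered CPU is an arbitrary choice for the example rather than a recommendation from the webinar.

```python
import os


def pin_to_one_cpu():
    """Pin the current process to a single CPU it is already allowed to use.

    Returns the new affinity set, or None where the OS lacks the API.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)      # 0 means "this process"
    target = min(allowed)                  # arbitrary pick for the sketch
    os.sched_setaffinity(0, {target})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    before = os.sched_getaffinity(0) if hasattr(os, "sched_getaffinity") else None
    print("before:", before)
    print("after: ", pin_to_one_cpu())
    if before is not None:
        os.sched_setaffinity(0, before)    # restore the original mask
```

On a multi-NUMA server the interesting part is which CPU you choose, so that the thread stays close to the memory it uses; the sketch only shows the mechanism.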