{"id":85,"date":"2020-04-20T23:37:01","date_gmt":"2020-04-20T22:37:01","guid":{"rendered":"https:\/\/blog.thomarite.uk\/?p=85"},"modified":"2020-04-20T23:37:28","modified_gmt":"2020-04-20T22:37:28","slug":"outages-part-1","status":"publish","type":"post","link":"https:\/\/blog.thomarite.uk\/index.php\/2020\/04\/20\/outages-part-1\/","title":{"rendered":"Outages part 1"},"content":{"rendered":"\n<p>Cloudflare had an outage last week. This time I could really identify with the situation, as it could happen to me:<\/p>\n\n\n\n<p><a href=\"https:\/\/blog.cloudflare.com\/cloudflare-dashboard-and-api-outage-on-april-15-2020\/\">https:\/\/blog.cloudflare.com\/cloudflare-dashboard-and-api-outage-on-april-15-2020\/<\/a><\/p>\n\n\n\n<p>Conclusions<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Design<\/strong>: When you aim for HA, even a single patch panel is a SPOF, no matter how much redundancy you have in your transit providers, routers, switches, firewalls, etc. So, look for <strong>SPOFs<\/strong>!<\/li><li><strong>Documentation<\/strong>: For DC stuff, at my current employer we use <a href=\"https:\/\/patchmanager.com\/\">patchmanager<\/a>. It is super handy for remote locations and it is our source of truth. Keep in mind that the tool is only as good as you keep it updated&#8230; For example, for the PoPs we visit more often and where we make more changes, we find more failures than we would like&#8230; For remote PoPs, as we know we are not going to come back for a couple of years, we are much more thorough. For network kit, we have <strong>RANCID+Git<\/strong>, so we always know the latest config and when changes were introduced (in 30-minute intervals at least). <\/li><li><strong>Process<\/strong>: We follow a risk assessment for any change we plan to introduce. Then on Thursday we have a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Change-advisory_board\">CAB meeting<\/a> to schedule what changes are going to happen during the weekend. 
The aim is for several people from different teams to understand and have a say in what is going to happen. This has proved very useful. Four pairs of eyes are better than half \ud83d\ude42 Still, you need to be rigorous in this process.<\/li><\/ul>\n\n\n\n<p>Even taking all this into account, you will have an outage. Have a retrospective, learn from it (no finger pointing) and apply it. Truly agile \ud83d\ude1b<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Cloudflare had an outage last week. This time I could really identify with the situation, as it could happen to me: https:\/\/blog.cloudflare.com\/cloudflare-dashboard-and-api-outage-on-april-15-2020\/ Conclusions Design: When you aim for HA, even a single patch panel is a SPOF, no matter how much redundancy you have in your transit providers, routers, switches, firewalls, etc. So, &hellip; <a href=\"https:\/\/blog.thomarite.uk\/index.php\/2020\/04\/20\/outages-part-1\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Outages part 
1&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-85","post","type-post","status-publish","format-standard","hentry","category-networks"],"_links":{"self":[{"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/posts\/85","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/comments?post=85"}],"version-history":[{"count":1,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/posts\/85\/revisions"}],"predecessor-version":[{"id":86,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/posts\/85\/revisions\/86"}],"wp:attachment":[{"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/media?parent=85"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/categories?post=85"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/tags?post=85"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}