Ansible – Troubleshooting

A couple of years a go I wrote a playbook with ansible to use napalm for configuring some switches.

I wanted to test again ansible as I am quite rusty and there is always demand for that.

I started with just something basic with my ceos lab.

All my code is here:

https://github.com/thomarite/ceos-testing/tree/master/ansible

Initially I was following the official documentation for Arista EOS Ansible modules:

https://ansible-arista-howto.readthedocs.io/en/latest/COLLECTING_STATUS.html

https://github.com/titom73/ansible-arista-module-howto

Installing ansible was fine with pip in my venv.

But I hit the wall with just the first example using “eos_facts”. Initially I wasnt adding debugging flags so was even worse. Fortunately I remembered “-vvv”. I was seeing this:

The full traceback is:
Traceback (most recent call last):
File "/home/tomas/.ansible/tmp/ansible-tmp-1594296522.1539829-295453-189146847007138/AnsiballZ_eos_facts.py", line 102, in
_ansiballz_main()
File "/home/tomas/.ansible/tmp/ansible-tmp-1594296522.1539829-295453-189146847007138/AnsiballZ_eos_facts.py", line 94, in _ansiballz_main
invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
File "/home/tomas/.ansible/tmp/ansible-tmp-1594296522.1539829-295453-189146847007138/AnsiballZ_eos_facts.py", line 40, in invoke_module
runpy.run_module(mod_name='ansible.modules.network.eos.eos_facts', init_globals=None, run_name='main', alter_sys=True)
File "/home/tomas/.pyenv/versions/3.7.3/lib/python3.7/runpy.py", line 205, in run_module
return _run_module_code(code, init_globals, run_name, mod_spec)
File "/home/tomas/.pyenv/versions/3.7.3/lib/python3.7/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/home/tomas/.pyenv/versions/3.7.3/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/modules/network/eos/eos_facts.py", line 206, in
File "/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/modules/network/eos/eos_facts.py", line 197, in main
File "/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/module_utils/network/common/facts/facts.py", line 23, in init
File "/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/module_utils/network/common/network.py", line 213, in get_resource_connection
File "/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/module_utils/network/common/network.py", line 229, in get_capabilities
File "/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/module_utils/connection.py", line 121, in init
AssertionError: socket_path must be a value
fatal: [r3]: FAILED! => {
"changed": false,
"module_stderr": "Traceback (most recent call last):\n File \"/home/tomas/.ansible/tmp/ansible-tmp-1594296522.1539829-295453-189146847007138/AnsiballZ_eos_facts.py\", line 102, in \n _ansiballz_main()\n File \"/home/tomas/.ansible/tmp/ansible-tmp-1594296522.1539829-295453-189146847007138/AnsiballZ_eos_facts.py\", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/home/tomas/.ansible/tmp/ansible-tmp-1594296522.1539829-295453-189146847007138/AnsiballZ_eos_facts.py\", line 40, in invoke_module\n runpy.run_module(mod_name='ansible.modules.network.eos.eos_facts', init_globals=None, run_name='main', alter_sys=True)\n File \"/home/tomas/.pyenv/versions/3.7.3/lib/python3.7/runpy.py\", line 205, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File \"/home/tomas/.pyenv/versions/3.7.3/lib/python3.7/runpy.py\", line 96, in _run_module_code\n mod_name, mod_spec, pkg_name, script_name)\n File \"/home/tomas/.pyenv/versions/3.7.3/lib/python3.7/runpy.py\", line 85, in _run_code\n exec(code, run_globals)\n File \"/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/modules/network/eos/eos_facts.py\", line 206, in \n File \"/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/modules/network/eos/eos_facts.py\", line 197, in main\n File \"/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/module_utils/network/common/facts/facts.py\", line 23, in init\n File \"/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/module_utils/network/common/network.py\", line 213, in get_resource_connection\n File \"/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/module_utils/network/common/network.py\", line 229, in get_capabilities\n File \"/tmp/ansible_eos_facts_payload_r5gz8rov/ansible_eos_facts_payload.zip/ansible/module_utils/connection.py\", line 121, in init\nAssertionError: socket_path must be a value\n",
"module_stdout": "",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 1
}

So, “socket_path” was defined. I checked all the python files mentioned in the stack but couldnt find anything.

It was clear that I wasn’t providing enough info to ansible to establish the socket for connection to the devices (ip:port)

And the example from the documentation didnt work neither:

https://docs.ansible.com/ansible/latest/network/user_guide/platform_eos.html#using-eapi-in-ansible

I knew my old ansible script was working before I left my job. But I knew as well that I was using the latest version of ansible so very likely things have changed since then.

$ ansible --version
ansible 2.9.10

So I had to read about the “eos_fact” and “eos_config” module searching here:

https://docs.ansible.com/ansible/latest/modules/list_of_network_modules.html

https://docs.ansible.com/ansible/latest/modules/eos_facts_module.html#eos-facts-module

https://docs.ansible.com/ansible/latest/modules/eos_config_module.html

After some time, I managed to fix the playbook and my environment and I could run the playbook using the ssh connector (but I was ignoring a warning about “provider” not needed…)

(testdir2)/ansible master$ cat ansible-hosts
[all:vars]
ansible_python_interpreter=/home/tomas/storage/technology/arista/testdir2/bin/python
ansible_user='tomas'
ansible_password='tomas123'
[ceoslab]
r1 ansible_host=127.0.0.1 ansible_port=2000
r2 ansible_host=127.0.0.1 ansible_port=2001
r3 ansible_host=127.0.0.1 ansible_port=2002

(testdir2)/ansible master$ cat group_vars/ceoslab.yaml
ansible_network_os: eos

The output:

/ansible master$ ansible-playbook playbooks/collect-facts-cli.yaml
PLAY [Run commands on ceos lab]
TASK [Collect all facts from device] ***
[WARNING]: provider is unnecessary when using network_cli and will be ignored
[WARNING]: default value for gather_subset will be changed to min from !config v2.11 onwards
ok: [r1]
ok: [r3]
ok: [r2]
TASK [Display result] ****
ok: [r2] => {
"msg": "Model is cEOSLab and it is running 4.23.3M"
}
ok: [r1] => {
"msg": "Model is cEOSLab and it is running 4.23.3M"
}
ok: [r3] => {
"msg": "Model is cEOSLab and it is running 4.23.3M"
}
PLAY RECAP *
r1 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
r2 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
r3 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

Ok, so getting the playbook using the API shouldnt be that difficult? It was.

The full traceback is:
File "/tmp/ansible_eos_facts_payload_vz7c7ipu/ansible_eos_facts_payload.zip/ansible/module_utils/network/common/network.py", line 229, in get_capabilities
capabilities = Connection(module._socket_path).get_capabilities()
File "/tmp/ansible_eos_facts_payload_vz7c7ipu/ansible_eos_facts_payload.zip/ansible/module_utils/connection.py", line 185, in rpc
raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
fatal: [r1]: FAILED! => {
"changed": false,
"invocation": {
"module_args": {
"auth_pass": null,
"authorize": null,
"gather_network_resources": null,
"gather_subset": [
"all"
],
"host": null,
"password": null,
"port": null,
"provider": null,
"ssh_keyfile": null,
"timeout": null,
"transport": null,
"use_ssl": null,
"username": null,
"validate_certs": null
}
},
"msg": "Could not connect to http://127.0.0.1:80/command-api: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1056)"
}

I was surprised that it was using port 80. I was pretty sure I was providing the correct port (900x) so somehow my data wasnt being processed.

I wasn’t clearly paying attention to the documentation:

https://docs.ansible.com/ansible/latest/modules/eos_facts_module.html#eos-facts-module

It says clearly that “provider” is deprecated since 2.5! I am using 2.9

As well, I have a very poor knowledge of ansible and I didnt understand the concept of “connection”. The SSH was using “network_cli” and API was using “httpapi”.

I was very close to give up the API connection when somehow I searched for “ansible network_cli” and I found documentation for that plugging. Then I searched for “httpapi” and it was gold!

https://docs.ansible.com/ansible/latest/plugins/connection/network_cli.html

https://docs.ansible.com/ansible/latest/plugins/connection/httpapi.html

I realised that I need to pass specific vars for getting the HTTPS connection. So at the end, managed to get the config right for both SSH and HTTPS:

/ansible master$ cat ansible-hosts
[all:vars]
ansible_python_interpreter=/home/tomas/storage/technology/arista/testdir2/bin/python
ansible_user='tomas'
ansible_password='tomas123'
[ceoslab]
r1 ansible_host=127.0.0.1 ansible_port=2000 ansible_httpapi_port=9000
r2 ansible_host=127.0.0.1 ansible_port=2001 ansible_httpapi_port=9001
r3 ansible_host=127.0.0.1 ansible_port=2002 ansible_httpapi_port=9002

/ansible master$ cat group_vars/ceoslab.yaml
ansible_network_os: eos
#start - eapi config
ansible_httpapi_use_ssl: 'yes'
ansible_httpapi_validate_certs: 'no'
ansible_httpapi_password: "{{ ansible_password }}"
#end - eapi config

The output:

ansible master$ ansible-playbook playbooks/collect-facts-eapi.yaml
PLAY [Run commands on remote ceos lab] *
TASK [Collect all facts from device] ***
[WARNING]: default value for gather_subset will be changed to min from !config v2.11 onwards
ok: [r3]
ok: [r1]
ok: [r2]
TASK [Display result] ****
ok: [r2] => {
"msg": "Model is cEOSLab and it is running 4.23.3M"
}
ok: [r1] => {
"msg": "Model is cEOSLab and it is running 4.23.3M"
}
ok: [r3] => {
"msg": "Model is cEOSLab and it is running 4.23.3M"
}
PLAY RECAP *
r1 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
r2 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
r3 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

At the end of the day, the scripts are identical apart from the “connection” var:

/ansible/playbooks master$ diff collect-facts-cli.yaml collect-facts-eapi.yaml
4c4
< connection: network_cli
---
> connection: httpapi

I think you can pass a var to the playbook via the CLI so I will try later.

My recommendation is always to use “-vvv”.

BTW, A good ansible summary I found:

https://gist.github.com/andreicristianpetcu/b892338de279af9dac067891579cad7d

In summary, I found ansible more difficult to troubleshoot that nornir. In nornir, is pure python, I can run ipdb wherever a I want.

But anyway, I learned things. I will add try to write a bit more complex playbooks.

Netbox – API Troubleshooting

Yesterday managed to get netbox and my lab connected. So today followed up with the original article, and found a new issue that took me several hours.

Initially I was seeing an error that I couldn’t undestand “

netbox.exceptions.CreateException: This field is required

From

(venv) /netbox-example/nornir-napalm-netbox-demo master$ python scripts/create_interfaces.py
nb_url = http://0.0.0.0:8080
Creating Netbox Interface for device r1, interface Loopback1
Traceback (most recent call last):
File "scripts/create_interfaces.py", line 42, in
task=create_netbox_interface,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 146, in run
result = self._run_serial(task, run_on, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 72, in _run_serial
result[host.name] = task.copy().start(host, self)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/task.py", line 85, in start
r = self.task(self, **self.params)
File "scripts/create_interfaces.py", line 34, in create_netbox_interface
device_id=device_id,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/dcim.py", line 431, in create_interface
return self.netbox_con.post('/dcim/interfaces/', required_fields, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/connection.py", line 124, in post
raise exceptions.CreateException(resp_data)
netbox.exceptions.CreateException: This field is required.

So I started to follow the trace, adding “print” and using “ipdb” to see what was going on:

....
/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/connection.py(71)__request()
70 finally:
---> 71 self.close()
72
ipdb> dir(response)
['attrs', 'bool', 'class', 'delattr', 'dict', 'dir', 'doc', 'enter', 'eq', 'exit', 'format', 'ge', 'getattribute', 'getstate', 'gt', 'hash', 'init', 'init_subclass', 'iter', 'le', 'lt', 'module', 'ne', 'new', 'nonzero', 'reduce', 'reduce_ex', 'repr', 'setattr', 'setstate', 'sizeof', 'str', 'subclasshook', 'weakref', '_content', '_content_consumed', '_next', 'apparent_encoding', 'close', 'connection', 'content', 'cookies', 'elapsed', 'encoding', 'headers', 'history', 'is_permanent_redirect', 'is_redirect', 'iter_content', 'iter_lines', 'json', 'links', 'next', 'ok', 'raise_for_status', 'raw', 'reason', 'request', 'status_code', 'text', 'url']
ipdb> response.url
'http://0.0.0.0:8080/api/dcim/interfaces/'
ipdb> response.text
'{"type":["This field is required."]}'
ipdb> response.status_code
400
ipdb> response.content
b'{"type":["This field is required."]}'
ipdb> response.reason
'Bad Request'
ipdb> response.request

ipdb> prepared_request

ipdb> prepared_request.url
'http://0.0.0.0:8080/api/dcim/interfaces/'
ipdb> dir(prepared_request)
['class', 'delattr', 'dict', 'dir', 'doc', 'eq', 'format', 'ge', 'getattribute', 'gt', 'hash', 'init', 'init_subclass', 'le', 'lt', 'module', 'ne', 'new', 'reduce', 'reduce_ex', 'repr', 'setattr', 'sizeof', 'str', 'subclasshook', 'weakref', '_body_position', '_cookies', '_encode_files', '_encode_params', '_get_idna_encoded_host', 'body', 'copy', 'deregister_hook', 'headers', 'hooks', 'method', 'path_url', 'prepare', 'prepare_auth', 'prepare_body', 'prepare_content_length', 'prepare_cookies', 'prepare_headers', 'prepare_hooks', 'prepare_method', 'prepare_url', 'register_hook', 'url']
ipdb> prepared_request.path_url
'/api/dcim/interfaces/'
ipdb> response.__content
*** AttributeError: 'Response' object has no attribute '__content'
ipdb> response._content
b'{"type":["This field is required."]}'
ipdb> response.content
b'{"type":["This field is required."]}'
ipdb> response.headers
{'Server': 'nginx', 'Date': 'Wed, 08 Jul 2020 12:36:35 GMT', 'Content-Type': 'application/json', 'Content-Length': '36', 'Connection': 'keep-alive', 'Vary': 'Accept, Cookie, Origin', 'Allow': 'GET, POST, HEAD, OPTIONS, TRACE', 'API-Version': '2.8', 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'SAMEORIGIN'}
ipdb> response.reason
'Bad Request'
ipdb> response.request

ipdb> response.test
*** AttributeError: 'Response' object has no attribute 'test'
ipdb> response.text
'{"type":["This field is required."]}'
ipdb> response.url
'http://0.0.0.0:8080/api/dcim/interfaces/'
ipdb> quit
Create Netbox Interfaces
r1 ** changed : False
vvvv Create Netbox Interfaces ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv ERROR
---- napalm_get ** changed : False --------------------------------------------- INFO
(venv) go:1.12.5|py:3.7.3|tomas@athens:~/storage/technology/netbox-example/nornir-napalm-netbox-demo master$ python scripts/create_interfaces.py
nb_url = http://0.0.0.0:8080
url3=http://0.0.0.0:8080/api/dcim/interfaces?limit=0
Creating Netbox Interface for device r1, interface Loopback1
url3=http://0.0.0.0:8080/api/dcim/devices/?name=r1&limit=0
device_id = 1
url3=http://0.0.0.0:8080/api/dcim/interfaces/
resp_ok=False resp_status=400
body_data= {'name': 'Loopback1', 'form_factor': 1200, 'device': 1}
params= /dcim/interfaces/
resp_data= {'type': ['This field is required.']}
Traceback (most recent call last):
File "scripts/create_interfaces.py", line 43, in
task=create_netbox_interface,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 146, in run
result = self._run_serial(task, run_on, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 72, in _run_serial
result[host.name] = task.copy().start(host, self)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/task.py", line 85, in start
r = self.task(self, **self.params)
File "scripts/create_interfaces.py", line 35, in create_netbox_interface
device_id=device_id,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/dcim.py", line 431, in create_interface
return self.netbox_con.post('/dcim/interfaces/', required_fields, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/connection.py", line 130, in post
raise exceptions.CreateException(resp_data)
netbox.exceptions.CreateException: This field is required.

So it seems that at the end I realised that I was missing the parameter “type” !!!

I was checking the documentation from netbox in github but I couldnt see clearly what kind of config I had to provide…

I checked the “type” value for the only interfaces I already had in netbox: “http://0.0.0.0:8080/api/dcim/interfaces/

So I tried to pass exactly that but it was still failing…

(venv) go:1.12.5|py:3.7.3|tomas@athens:~/storage/technology/netbox-example/nornir-napalm-netbox-demo master$ python scripts/create_interfaces.py
nb_url = http://0.0.0.0:8080
url3=http://0.0.0.0:8080/api/dcim/interfaces?limit=0
Creating Netbox Interface for device r1, interface Loopback1
url3=http://0.0.0.0:8080/api/dcim/devices/?name=r1&limit=0
device_id = 1
url3=http://0.0.0.0:8080/api/dcim/interfaces/
resp_ok=False resp_status=400
body_data= {'name': 'Loopback1', 'form_factor': 1200, 'device': 1, 'type': {'value': '1000base-t', 'label': '1000BASE-T (1GE)', 'id': 1000}}
params= /dcim/interfaces/
resp_data= {'type': ['Value must be passed directly (e.g. "foo": 123); do not use a dictionary or list.']}
Traceback (most recent call last):
File "scripts/create_interfaces.py", line 50, in
task=create_netbox_interface,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 146, in run
result = self._run_serial(task, run_on, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/init.py", line 72, in _run_serial
result[host.name] = task.copy().start(host, self)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/nornir/core/task.py", line 85, in start
r = self.task(self, **self.params)
File "scripts/create_interfaces.py", line 42, in create_netbox_interface
**interface_type,
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/dcim.py", line 431, in create_interface
return self.netbox_con.post('/dcim/interfaces/', required_fields, **kwargs)
File "/home/tomas/storage/technology/netbox-example/venv/lib/python3.7/site-packages/netbox/connection.py", line 130, in post
raise exceptions.CreateException(resp_data)
netbox.exceptions.CreateException: Value must be passed directly (e.g. "foo": 123); do not use a dictionary or list.
(venv) go:1.12.5|py:3.7.3|tomas@athens:~/storage/technology/netbox-example/nornir-napalm-netbox-demo master$

Somehow the API had to be documented… by chance, looking at the bottom of the netbox page, there was an”API” link….

So, now I needed to look up the correct API call. Based on the script and logs, it was a “POST” for “/dcim/interfaces/”. Here we go!

So finally, I had the info. I confirmed what fields were mandatory and the value they needed!

interface_type = {}
interface_type["type"] = "1000base-t"
for interface_name in interfaces.keys():
    if not is_interface_present(nb_interfaces, f"{task.host}", interface_name):
        print(
            f"* Creating Netbox Interface for device {task.host}, interface {interface_name}"
        )
        device_id = get_device_id(f"{task.host}", netbox)
        print("device_id = %s" % device_id)
        netbox.dcim.create_interface(
           name=f"{interface_name}",
           form_factor=1200,  # default
           device_id=device_id,
           **interface_type,
        )

So the script ran fine for all my devices:

netbox-example/nornir-napalm-netbox-demo master$ python scripts/create_interfaces.py
nb_url = http://0.0.0.0:8080
url3=http://0.0.0.0:8080/api/dcim/interfaces?limit=0
Create Netbox Interfaces
r1 ** changed : False
vvvv Create Netbox Interfaces ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
^^^^ END Create Netbox Interfaces ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
r2 ** changed : False
vvvv Create Netbox Interfaces ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
^^^^ END Create Netbox Interfaces ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
r3 ** changed : False
vvvv Create Netbox Interfaces ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
^^^^ END Create Netbox Interfaces ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

And it is updated in GUI:

Backup Blog

I have decided to backup this blog. First time. So following instructions from wordpress.

  • 1- Backup Database:
https://your_domain_or_IP/phpmyadmin
https://wordpress.org/support/article/backing-up-your-database/
  • 2- Backup webserver files
root@vps:/var/www# ls
html wordpress
root@vps:/var/www# cd ..
root@vps:/var# tar zcvf web-backup.tgz www
  • 3- Transfer the tgz and the sql file to a folder in my laptop that is backed-up to external hard drive.
/blog-backup$ tree
.
├── apache-files
│   └── web-backup-20200707.tgz
└── mysql
    └── blog-backup-20200707.sql

Netbox

Another thing I wanted to play with is Netbox and found a good article to follow. So credits to the authors.

Using my current ceos lab from https://github.com/thomarite/ceos-testing

I followed Rick’s article to install netbox-docker and his own repo with the nornir examples using netbox. In this case nornir is going to use netbox as inventory. Normally I use local files. I created a python venv for 3.7.3

mkdir netbox-example; cd netbox-example
pyenv local 3.7.3
python -m virtualenv venv
source venv/bin/activate
git clone https://github.com/netbox-community/netbox-docker.git
cd netbox-docker
vim docker-compose.yml  --> so it always expose 8080
      nginx:
      ...
        ports:
          - 8080:8080
docker-compose pull
docker-compose up

When installing the requirements for “nornir-napalm-netbox-demo” I had to modify the version of some packages. So I removed the required version and I left pip to install the latest. I didnt use the makefile.

git clone https://github.com/rickdonato/nornir-napalm-netbox-demo
cd nornir-napalm-netbox-demo
python -m pip install -r requirements.txt

I struggled quite a bit with the management IP in netbox and the meaning of “platform”

  • Create Manufacturers under Device Types: I created “Arista”
  • Create Device Types under Device Types: I created “ceos”
  • Create Platforms under Devices: This is VERY important as it has to be a supported NAPALM platform!!! So for Arista, I need “eos”.
  • Create Device Roles under Devices. I created “pe”
  • Create Devices under Devices.
  • Within each device: add a management interface. Here, I got confused as I was adding the interface in the inventory section. The inventory section is info to/from the device using NAPALM. So you need to go to the bottom of the page, add the interface
  • and then add an IP to that interface and mark it as primary.

Keep in mind that initially, I was using “0.0.0.0” for each device as that’s the IP I have be using for all my scripts lately.

Keep in mind (II) that we are using docker twice (from different commands…) one to get netbox and the other via docker(-topo) to get the Arista ceos containers…. and we have iptables rules under the hood created by both…

But, let’s go step by step. Now we need to confirm that our nornir scrip can connect to netbox. So follow “Nornir-to-Netbox Configuration” section. This is my file. I updated the nb_url and nb_token. Notice the usage of “transform_function“.

---
core:
num_workers: 20
inventory:
plugin: nornir.plugins.inventory.netbox.NBInventory
options:
nb_url: 'http://0.0.0.0:8080'
nb_token: '<NETBOX_API_TOKEN>'
ssl_verify: False
transform_function: "helpers.adapt_user_password"

You need to update “scripts/secrets.py” with the devices you have in your inventory and the user/pass:

creds = {
"r1": {"username": "user", "password": "pas123"},
"r2": {"username": "user", "password": "pas123"},
"r3": {"username": "user", "password": "pas123"},
}

So now we can test if nornir can connect to netbox:

/netbox-example/nornir-napalm-netbox-demo master$ python scripts/helpers.py --inventory
{'defaults': {'connection_options': {},
'data': {},
'hostname': None,
'password': None,
'platform': None,
'port': None,
'username': None},
'groups': {},
'hosts': {'r1': {'connection_options': {},
'data': {'asset_tag': 'r1',
'model': 'ceos',
'role': 'pe',
'serial': 'r1',
'site': 'lab1',
'vendor': 'Arista'},
'groups': [],
'hostname': '192.168.16.2',
'password': 'pas123',
'platform': 'eos',
'port': None,
'username': 'user'},
'r2': {'connection_options': {},
'data': {'asset_tag': 'r2',
'model': 'ceos',
'role': 'pe',
'serial': 'r2',
'site': 'lab1',
'vendor': 'Arista'},
'groups': [],
'hostname': '192.168.16.3',
'password': 'pas123',
'platform': 'eos',
'port': None,
'username': 'user'},
'r3': {'connection_options': {},
'data': {'asset_tag': 'r3',
'model': 'ceos',
'role': 'pe',
'serial': 'r3',
'site': 'lab1',
'vendor': 'Arista'},
'groups': [],
'hostname': '192.168.16.4',
'password': 'pas123',
'platform': 'eos',
'port': None,
'username': 'user'}}}

All good. Let’s see if we can get backups from the devices.

netbox-example/nornir-napalm-netbox-demo master$ python scripts/backup_configs.py
Backup Device configurations**
r1 ** changed : True *
vvvv Backup Device configurations ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
---- write_file ** changed : True ---------------------------------------------- INFO
^^^^ END Backup Device configurations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
r2 ** changed : True *
vvvv Backup Device configurations ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
---- write_file ** changed : True ---------------------------------------------- INFO
^^^^ END Backup Device configurations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
r3 ** changed : True *
vvvv Backup Device configurations ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False --------------------------------------------- INFO
---- write_file ** changed : True ---------------------------------------------- INFO
^^^^ END Backup Device configurations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(venv) /netbox-example/nornir-napalm-netbox-demo master$

Great all good.

Now, let’s see netbox using NAPALM. If you click on “Status” for any device, netbox will use NAPALM to get the facts from the device. If netbox is not configured properly with NAPALM, it will fail. This is a working scenario:

The tabs “LLDP neighbors” and “Configuration” relay too in NAPALM.

So for configuring netbox with napalm you need to tell netbox the user/pass that NAPALM needs:

netbox-example$ vim netbox-docker/env/netbox.env
...
NAPALM_USERNAME=user
NAPALM_PASSWORD=pas123
NAPALM_TIMEOUT=10
...

Very likely you will have to restart netbox:

/netbox-docker release$ docker-compose down
/netbox-docker release$ docker-compose up

As mentioned before, I had an issue when I was using “0.0.0.0” as IP. By default (as It seems I can’t think) I was using the exposed IP/port from docker-topo to reach the ceos switches. I haven’t had an issue until using netbox.

I am using docker for netbox and docker(-topo) for my arista cEOS switches. So the connectivity between netbox and ceos is via the IPs/interfaces/bridges created by docker. And remember… you have iptables under the hood. My first mistake was telling netbox to use 0.0.0.0 as it is the one I am using to testing from my scripts when connecting to ceos. Netbox needs to point to the IP assigned by docker :facepalm: 192.16.16.x in my case. Second one, the port, same thing docker exposes the port 443 as 900x for external connections and I use 900x in my scripts. From netbox point of view, it is still 443 :facepalm: And finally, I am calling docker twice for building my lab, one for netbox-docker and the other for ceos. You need to keep an eye on iptables changes when restarting netbox via docker-compose because you can be in the situation that netbox traffic is dropped in DOCKER-ISOLATION-STAGE-1 :facepalm: (need to try to write a docker-compose to build everything in one go)

So when I was having errors from netbox that it was being rejected when connecting to ceos devices via NAPALM, I couldnt understand it. My scripts were fine using those details (0.0.0.0:900x)

I ran tcpdump on one ceos on the “ethernet0” interface and NOTHING was hitting the interface from netbox on port 900x but my scripts could…..

Somehow netbox wasnt able to reach ceos r1??? In my head, netbox and ceos devices were all in 0.0.0.0….. so no routing, no firewalls, they are connected in the same network 0.0.0.0…..

At the end I waked up and realised that the docker devices are using the IPs provided by docker so it is following normal routing… and firewalling by iptables. The same for ceos devices, they have IPs (different from 0.0.0.0)

So I updated netbox with the correct management IPs for r1, r2 and r3 ceos.

When I filtered by the real netbox IP in r1 tcpdump ethernet0, I was seeing traffic on 900x!!! Good. Then I realised that it has to be 443. So I removed my hack to update the port to 900x.

For a different reason I had to restart docker-topo (for ceos) and then docker netbox. And now, I coudnt see any traffic from netbox in r1….. I “didn’t” change anything. So the routing didnt change, there was something else “cutting” the connection: iptables

docker uses iptables very heavily. I realised that after restart docker-netbox, iptables changed…

before restart:

# iptables -t filter -S DOCKER-ISOLATION-STAGE-1
Warning: iptables-legacy tables present, use iptables-legacy to see them
-N DOCKER-ISOLATION-STAGE-1
-A DOCKER-ISOLATION-STAGE-1 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-94a8183a4fb1 ! -o br-94a8183a4fb1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-0d4ec9aba9bd ! -o br-0d4ec9aba9bd -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-609619313dc8 ! -o br-609619313dc8 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-61d32350cb58 ! -o br-61d32350cb58 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-384488acbc99 ! -o br-384488acbc99 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN

after restart:

# iptables -t filter -S DOCKER-ISOLATION-STAGE-1
Warning: iptables-legacy tables present, use iptables-legacy to see them
-N DOCKER-ISOLATION-STAGE-1
-A DOCKER-ISOLATION-STAGE-1 -i br-381cdff63d2f ! -o br-381cdff63d2f -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-94a8183a4fb1 ! -o br-94a8183a4fb1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-0d4ec9aba9bd ! -o br-0d4ec9aba9bd -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-609619313dc8 ! -o br-609619313dc8 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-61d32350cb58 ! -o br-61d32350cb58 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN

With the restart, docker created a new bridge interface for netbox (old: br-384488acbc99, new: br-381cdff63d2f) and it wasnt hitting anymore the “DOCKER-ISOLATION-STAGE-1 -j ACCEPT”

So I had to make an iptables change:

# iptables -t filter -D DOCKER-ISOLATION-STAGE-1 -j ACCEPT
# iptables -t filter -I DOCKER-ISOLATION-STAGE-1 -j ACCEPT
# iptables -t filter -S DOCKER-ISOLATION-STAGE-1
Warning: iptables-legacy tables present, use iptables-legacy to see them
-N DOCKER-ISOLATION-STAGE-1
-A DOCKER-ISOLATION-STAGE-1 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-381cdff63d2f ! -o br-381cdff63d2f -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-94a8183a4fb1 ! -o br-94a8183a4fb1 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-0d4ec9aba9bd ! -o br-0d4ec9aba9bd -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-609619313dc8 ! -o br-609619313dc8 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-61d32350cb58 ! -o br-61d32350cb58 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
#

And finally, netbox could use napalm to contact the ceos devices…. Calling docker twice is not a great idea….

BTW, this is my docker ps with netbox and ceos devices:

(venv) /netbox-example$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c23be76ffd54 nginx:1.17-alpine "nginx -c /etc/netbo…" 4 hours ago Up 4 hours 80/tcp, 0.0.0.0:8080->8080/tcp netbox-docker_nginx_1
5a0b89f18578 netboxcommunity/netbox:latest "/opt/netbox/docker-…" 4 hours ago Up 4 hours netbox-docker_netbox_1
528948de329b netboxcommunity/netbox:latest "python3 /opt/netbox…" 4 hours ago Up 4 hours netbox-docker_netbox-worker_1
29529302ba1c redis:5-alpine "docker-entrypoint.s…" 4 hours ago Up 4 hours 6379/tcp netbox-docker_redis_1
5e975ec2aa70 redis:5-alpine "docker-entrypoint.s…" 4 hours ago Up 4 hours 6379/tcp netbox-docker_redis-cache_1
6158672a4ae6 postgres:11-alpine "docker-entrypoint.s…" 4 hours ago Up 4 hours 5432/tcp netbox-docker_postgres_1
34841aa098d4 ceos-lab:4.23.3M "/sbin/init systemd.…" 5 hours ago Up 5 hours 0.0.0.0:2002->22/tcp, 0.0.0.0:9002->443/tcp 3node_r03
4ca92c6a3b09 ceos-lab:4.23.3M "/sbin/init systemd.…" 5 hours ago Up 5 hours 0.0.0.0:2001->22/tcp, 0.0.0.0:9001->443/tcp 3node_r02
67e8b7ab84e0 ceos-lab:4.23.3M "/sbin/init systemd.…" 5 hours ago Up 5 hours 0.0.0.0:2000->22/tcp, 0.0.0.0:9000->443/tcp 3node_r01

I was painful but I learned a couple of things about netbox, nornir and docker/iptables!!!

Dynamic DNS

I am using GCP for EVE-NG but I dont have permanent public IPs for the VMs as it has a cost and they are not running all the time. I am not really bother about that but talking with a friend a couple of weeks ago he wanted to have a public IP to his home lab using a commercial broadband that obviously provides dynamic IPs. So I searched a bit and found different solutions and found this:

https://www.duckdns.org

For my needs it is enough. It is free up to 5 domains. And you dont have to install any software in your systems. Just a cron job calling a basic script with one line.

The thing I don’t like. If you don’t pay with money… you pay with your data. You have to use an account from Twitter, Google, Reddit or Github. Fortunately I have an account from one of those services that I dont use so it has minimum data.

Choco Cookies

It is something I have never tried to bake. I consider it a very American/British thing. I have tried good ones in UK in Ben’s Cookies and it seems there is a great version in USA in Levain Bakery.

So searching for recipes, I chose this one:

Ingredients (adapted to what I have):

Plain flour 200g
Self-Raising flour 100g
100% cocoa powder 50g
1 teaspoon of corn flour
1 teaspoon of baking soda
1 teaspoon of baking powder
Half a teaspoon of salt

Cold butter 200g
Brown sugar 130g
White sugar 70g

2 free range eggs - beaten room temperature 

300g 80% dark chocolate in pieces
20g of mixed nuts crashed

Process:

  • Sieve flour, cocoa powder, corn, baking soda/powder, and salt.
  • Cut the cold butter into a small cubes.
  • Chop chocolate tablets into small pieces.
  • In a big enough bowl, cream the butter and sugar. I do it by hand.
  • Add eggs in 3 separate steps, and keep mixing.
  • Add the dry ingredients (flour mix), and mix lightly until you still see unmixed flour.
  • Add chocolate pieces and nuts. Mix lightly.
  • Cover and move the dough to the fridge for 2-3 hours.
  • Weight 170gr of dough per cookie. Make a ball and place it on a tray.
  • Move the try to the fridge for 30 minutes in the fridge.
  • Pre heat the oven to 200℃ and bake it at 180℃ for 10 minutes.
  • Let them rest for 15 minutes or more until the surface is a bit hard. If not they will break down in your hand.

Veredict

Obviously, they dont look like the ones in the video or the other sites but they were good.

Difficult to believe, but they dont taste super sweet. I used 100% cocoa powder and 85% dark chocolate. Still one cookie has the amount of chocolate and sugar that I take in one week 😛

Reminder

You can put the rest of cookies in the freezer! And enjoy fresh baked cookies any day!!! (I had three left over)

Mistakes

  • They where in the oven for 13 minutes or more… so they flat out more than I wished.
  • They are very big cookies so make sure they have plenty of room. I only put four and one moved when I put the tray in the oven.

Another time, I will try the walnut one.

The Great Suspender

No, it is not me when I was a kid. It is a GC extension. I have a very bad habit of opening many tabs in my browser with the excuse, I will take a look later. That takes a big toll in CPU/Memory. With this extension, my laptop is running very smoothly even when I have three cEOS docker boxes running in the background. The fan runs less often. I have been using it for over a week and I am very happy with it. Need to find something for Firefox.

Github + ssh-key

There are many links for this in the Internet so I am not going to discover the fire but I struggled a bit so….

The official links from github were ok and other people did a very good job too documenting the process.

https://docs.github.com/en/github/authenticating-to-github/testing-your-ssh-connection

https://docs.github.com/en/github/authenticating-to-github/error-permission-denied-publickey

https://jdblischak.github.io/2014-09-18-chicago/novice/git/05-sshkeys.html

I had already a key that I wanted to use. So adding it to the repo was ok.

Testing it was my challenge. I was missing two things. My key wasn’t following the standard file name so it wasn’t used by my ssh-agent and then, i wasn’t using the “git” user when testing…. I was using my github username.

So add the key and check it is there.

$ ssh-add ~/.ssh/id_ed25519-gh
$ ssh-add -l -E md5
256 MD5:xx:xx:xx:xx:xx:67:xx:6a:73:xx:8a:xx:7f:78:xx:xx user@gh (ED25519)

Check you can ssh to github.

$ ssh -T git@github.com
Hi xxxx! You've successfully authenticated, but GitHub does not provide shell access.
$

Ok, all good now. But this is not a new repo, how I move from the “old” user/pass to the “new” ssh-key process?

You can clone the repo again using ssh:

Or you can change the git config locally in the “url” bit.

/ceos-testing/.git master$ cat config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[remote "origin"]
#url = https://github.com/thomarite/ceos-testing.git
url = git@github.com:thomarite/ceos-testing.git
fetch = +refs/heads/:refs/remotes/origin/
[branch "master"]
remote = origin
merge = refs/heads/master
$

After that you can “git push” using your ssh-key.

2023-01

Looks like I dont learn the lesson….

1- Create Key

$ ssh-keygen -t ed25519 -C "your@email.com"
Generating public/private ed25519 key pair.
Enter file in which to save the key (/home/USERNAME/.ssh/id_ed25519): /home/USERNAME/.ssh/id_ed25519.github

2- Upload key to Github

3- Start agent and add key

$ ssh-agent -s
SSH_AUTH_SOCK=/tmp/ssh-XXXXXXjMtZn7/agent.250293; export SSH_AUTH_SOCK;
SSH_AGENT_PID=250294; export SSH_AGENT_PID;
echo Agent pid 250294;
$ ssh-add ~/.ssh/id_ed25519.github
Identity added: /home/USERNAME/.ssh/id_ed25519.github (your@email.com)
$ 

4- Authenticate to git

$ ssh -T git@github.com
Hi USERNAME! You've successfully authenticated, but GitHub does not provide shell access.
$ 

5- Push to git. Be sure your repo is not using https! Change it as showed here.

$ git remote get-url origin
https://github.com/SOMEBODY/scripts.git
$ git remote set-url origin git@github.com:SOMEBODY/scripts.git
$ 
$ git remote get-url origin
git@github.com:SOMEBODY/scripts.git
$ 
$ git push
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 4 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 2.07 KiB | 2.07 MiB/s, done.
Total 4 (delta 1), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To github.com:SOMEBODY/scripts.git
   6a4cb1a..07a4a83  main -> main
$ 

Which SSH keyfile was used to authenticate a login?

I have realised that I had two keys in my VPS and I wasn’t sure which one it was used when I was ssh-ing so I had to search a bit to find out.

These two links cover the process:

https://unix.stackexchange.com/questions/15575/can-i-find-out-which-ssh-key-was-used-to-access-an-account

https://unix.stackexchange.com/questions/147295/how-can-i-determine-which-ssh-keyfile-was-used-to-authenticate-a-login

1- You need to increase the logging of your sshd (destination – server)

server# vim /etc/ssh/sshd_config
LogLevel VERBOSE
server# service sshd restart
server# tail -f /var/log/auth.log

2- From client, just ssh as usual to the server and check auth.log as per above

Jul 3 14:17:55 server sshd[8600]: Connection from IPV6 port 57628 on IPV6::453 port 64022
Jul 3 14:17:55 server sshd[8600]: Postponed publickey for client from IPv6 port 57628 ssh2 [preauth]
Jul 3 14:17:55 server sshd[8600]: Accepted publickey for client from IPv6 port 57628 ssh2: ED25519 SHA256:BtOAX9eVpFJJgJ5HzjKU8E973m+MX+3gDxsm7eT/iEQ
Jul 3 14:17:55 server sshd[8600]: pam_unix(sshd:session): session opened for user client by (uid=0)
Jul 3 14:17:55 server sshd[8600]: User child is on pid 8606
Jul 3 14:17:55 server sshd[8606]: Starting session: shell on pts/7 for client from IPv6 port 57628 id 0

3- So we have the fingertip of the key used by client. Now we need to get the fingertips of our clients keys to find the match:

client $ ssh-keygen -l -f ~/.ssh/id_ed25519.pub
256 SHA256:BtOAX9eVpFJJgJ5HzjKU8E973m+MX+3gDxsm7eT/iEQ client@local (ED25519)

4- So the we can see that I am using my id_ed25519.pub key to connect to the server