{"id":669,"date":"2021-04-26T17:21:41","date_gmt":"2021-04-26T16:21:41","guid":{"rendered":"https:\/\/blog.thomarite.uk\/?p=669"},"modified":"2021-04-26T17:21:41","modified_gmt":"2021-04-26T16:21:41","slug":"grep-multiline","status":"publish","type":"post","link":"https:\/\/blog.thomarite.uk\/index.php\/2021\/04\/26\/grep-multiline\/","title":{"rendered":"grep multiline"},"content":{"rendered":"\n<p>I want to count the number of interfaces that have some specific configuration on my router. I want to use the most basic tools found in Linux (so I don&#8217;t have to assume anything else is installed) and I want to use as few commands as possible.<\/p>\n\n\n\n<p>So this is my config:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>frr version 7.5\nfrr defaults traditional\nhostname R2\nlog syslog informational\nno ipv6 forwarding\nservice integrated-vtysh-config\n!\ninterface ens6\n ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point\n!\ninterface lo1\n ip router isis ISIS \n isis passive\n!\ninterface ens7\n ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point\n!\ninterface lo2\n ip router isis ISIS \n isis passive\n!\nmpls ldp\n router-id 172.20.15.2\n !\n address-family ipv4\n  discovery transport-address 172.20.15.2\n  !\n  interface ens6\n  !\n  interface ens7\n  !\n exit-address-family\n !\n!\nrouter isis ISIS \n net 49.0001.1720.2001.5002.00\n!\nline vty\n!\n<\/code><\/pre>\n\n\n\n<p>I want to count the interfaces that have &#8220;isis network point-to-point&#8221;, regardless of any other config.<\/p>\n\n\n\n<p>In this example, there are just two such interfaces:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>interface ens6\n ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point\n\ninterface ens7\n ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point\n<\/code><\/pre>\n\n\n\n<p>The pseudo-pattern should be something like:<\/p>\n\n\n\n<pre 
class=\"wp-block-code\"><code>^interface ens.*point-to-point$<\/code><\/pre>\n\n\n\n<p>That is, something that starts with &#8220;interface ens&#8221;, can have anything after that, and ends with &#8220;point-to-point&#8221;.<\/p>\n\n\n\n<p>Ideally I want to use just &#8220;grep&#8221;, as it is a standard and common tool.<\/p>\n\n\n\n<p>But grep works on one line at a time, and my pattern spans multiple lines.<\/p>\n\n\n\n<p>So I searched for some help and found <a href=\"https:\/\/stackoverflow.com\/questions\/2686147\/how-to-find-patterns-across-multiple-lines-using-grep\">this<\/a>, which uses &#8220;Perl-compatible regular expressions&#8221; (PCRE). I know nothing about Perl, but let&#8217;s give it a go:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><strong><em>$ grep -Pz '(?s)interface ens.*point-to-point\\n' r5.txt<\/em><\/strong>\nfrr version 7.5\nfrr defaults traditional\nhostname R2\nlog syslog informational\nno ipv6 forwarding\nservice integrated-vtysh-config\n!\n<strong>interface ens6\n ip router isis ISIS <\/strong>\n<strong> isis circuit-type level-2-only\n isis network point-to-point<\/strong>\n<strong>!\ninterface lo1\n ip router isis ISIS \n isis passive\n!<\/strong>\n<strong>interface ens7\n ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point<\/strong>\n!\ninterface lo2\n ip router isis ISIS \n isis passive\n!\nmpls ldp\n router-id 172.20.15.2\n !\n address-family ipv4\n  discovery transport-address 172.20.15.2\n  !\n  interface ens6\n  !\n  interface ens7\n  !\n exit-address-family\n !\n!\nrouter isis ISIS \n net 49.0001.1720.2001.5002.00\n!\nline vty\n!\n<\/code><\/pre>\n\n\n\n<p>Let&#8217;s explain the parameters provided to grep so far:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>-P: Use Perl-compatible regular expressions (PCRE).<\/li><li>-z: Treat the input as a set of lines, each terminated by a zero byte instead of a newline. i.e. 
grep treats the input as one big line.<\/li><li>(?s): activate PCRE_DOTALL, which means that &#8216;.&#8217; matches any character, including a newline.<\/li><\/ul>\n\n\n\n<p>But if I count, we don&#8217;t get the expected answer of 2:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -Pzc '(?s)interface ens.*point-to-point\\n' r5.txt\n1\n<\/code><\/pre>\n\n\n\n<p>With &#8220;-z&#8221;, grep treats the file as a single line, and &#8220;-c&#8221; counts matching lines rather than individual matches, so the count is 1. The initial command also shows just one block in bold.<\/p>\n\n\n\n<p>We can see that the match also covers &#8220;interface lo1&#8221;, which is not what we want; it should be ignored.<\/p>\n\n\n\n<p>So our pattern should match the shortest possible string, i.e. we want a non-greedy regex. Searching again, I found <a href=\"https:\/\/stackoverflow.com\/questions\/3027518\/how-to-do-a-non-greedy-match-in-grep\/3027524\">this<\/a>. For Perl regexes, we need to add <strong>?<\/strong> after <strong>*<\/strong>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -Pz '(?s)interface ens<strong>.*?<\/strong>point-to-point\\n' r5.txt\nfrr version 7.5\nfrr defaults traditional\nhostname R2\nlog syslog informational\nno ipv6 forwarding\nservice integrated-vtysh-config\n!\n<strong>interface ens6<\/strong>\n<strong> ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point<\/strong>\n!\ninterface lo1\n ip router isis ISIS \n isis passive\n!\n<strong>interface ens7<\/strong>\n<strong> ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point<\/strong>\n!\ninterface lo2\n ip router isis ISIS \n isis passive\n!\nmpls ldp\n router-id 172.20.15.2\n !\n address-family ipv4\n  discovery transport-address 172.20.15.2\n  !\n  interface ens6\n  !\n  interface ens7\n  !\n exit-address-family\n !\n!\nrouter isis ISIS \n net 49.0001.1720.2001.5002.00\n!\nline vty\n!\n\n<\/code><\/pre>\n\n\n\n<p>Now we can see two blocks highlighted. 
Let&#8217;s print only the matched strings using <strong>-o<\/strong>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -Pzo '(?s)interface ens.*?point-to-point\\n' r5.txt\ninterface ens6\n ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point\ninterface ens7\n ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point\n<\/code><\/pre>\n\n\n\n<p>This looks correct, but counting with -c still doesn&#8217;t work, because -z treats the input as one big line.<\/p>\n\n\n\n<p>I haven&#8217;t been able to find a solution with just one command, so in the end I have to pipe into another grep. The first grep extracts the matching blocks, so the second one just needs to count the lines containing a specific string like &#8220;point&#8221;. It should be that simple:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -Pzo '(?s)interface ens.*?point-to-point\\n' r5.txt | grep point\ngrep: (standard input): binary file matches\n<\/code><\/pre>\n\n\n\n<p>Weird, I thought this was pure text, but it seems the output of the first grep contains some binary data:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -Pzo '(?s)interface ens.*?point-to-point\\n' r5.txt > r55.txt\n$ vim r55.txt\ninterface ens6\n ip router isis ISIS\n isis circuit-type level-2-only\n isis network point-to-point\n<strong>^@<\/strong>interface ens7\n ip router isis ISIS \n isis circuit-type level-2-only\n isis network point-to-point\n<strong>^@<\/strong>\n<\/code><\/pre>\n\n\n\n<p>The <strong>^@<\/strong> characters are zero bytes: with -z, grep terminates each output record with a zero byte instead of a newline. We can tell grep to treat binary data as text using -a, as per this <a href=\"https:\/\/unix.stackexchange.com\/questions\/335716\/grep-returns-binary-file-standard-input-matches-when-trying-to-find-a-string\">question<\/a>, and then count:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -Pzo '(?s)interface ens.*?point-to-point\\n' r5.txt | grep -a point\n isis network point-to-point\n isis network point-to-point\n$ grep -Pzo '(?s)interface ens.*?point-to-point\\n' r5.txt | grep -ac 
point\n2\n<\/code><\/pre>\n\n\n\n<p>Funnily enough, if I just want to count, I don&#8217;t need -a:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -Pzo '(?s)interface ens.*?point-to-point\\n' r5.txt | grep -c point\n2\n<\/code><\/pre>\n\n\n\n<p>I&#8217;m not sure this is the best solution, and it took me a while to find, but it seems to work:<\/p>\n\n\n\n<p><strong>grep -Pzo '(?s)interface ens.*?point-to-point\\n' r5.txt | grep -ac point<\/strong><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I want to count the number of interfaces that have some specific configuration on my router. I want to use the most basic tools found in Linux (so I don&#8217;t have to assume anything else is installed) and I want to use as few commands as possible. So this is my config: And I want to &hellip; <a href=\"https:\/\/blog.thomarite.uk\/index.php\/2021\/04\/26\/grep-multiline\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;grep multiline&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-669","post","type-post","status-publish","format-standard","hentry","category-unix"],"_links":{"self":[{"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/posts\/669","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/comments?post=669"}],"version-history":[{"count":1,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/posts\/669\/revisions"}],"predecessor-version":[{"id":670
,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/posts\/669\/revisions\/670"}],"wp:attachment":[{"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/media?parent=669"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/categories?post=669"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.thomarite.uk\/index.php\/wp-json\/wp\/v2\/tags?post=669"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}