<!DOCTYPE HTML>
<html lang="en" class="light sidebar-visible" dir="ltr">
<head>
<!-- Book generated using mdBook -->
<meta charset="UTF-8">
<title>awk - Notes</title>
<!-- Custom HTML head -->
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#ffffff">
<link rel="icon" href="../favicon.svg">
<link rel="shortcut icon" href="../favicon.png">
<link rel="stylesheet" href="../css/variables.css">
<link rel="stylesheet" href="../css/general.css">
<link rel="stylesheet" href="../css/chrome.css">
<link rel="stylesheet" href="../css/print.css" media="print">
<!-- Fonts -->
<link rel="stylesheet" href="../FontAwesome/css/font-awesome.css">
<link rel="stylesheet" href="../fonts/fonts.css">
<!-- Highlight.js Stylesheets -->
<link rel="stylesheet" href="../highlight.css">
<link rel="stylesheet" href="../tomorrow-night.css">
<link rel="stylesheet" href="../ayu-highlight.css">
<!-- Custom theme stylesheets -->
<!-- Provide site root to javascript -->
<script>
var path_to_root = "../";
var default_theme = window.matchMedia("(prefers-color-scheme: dark)").matches ? "navy" : "light";
</script>
<!-- Start loading toc.js asap -->
<script src="../toc.js"></script>
</head>
<body>
<div id="body-container">
<!-- Work around some values being stored in localStorage wrapped in quotes -->
<script>
try {
var theme = localStorage.getItem('mdbook-theme');
var sidebar = localStorage.getItem('mdbook-sidebar');
if (theme.startsWith('"') && theme.endsWith('"')) {
localStorage.setItem('mdbook-theme', theme.slice(1, theme.length - 1));
}
if (sidebar.startsWith('"') && sidebar.endsWith('"')) {
localStorage.setItem('mdbook-sidebar', sidebar.slice(1, sidebar.length - 1));
}
} catch (e) { }
</script>
<!-- Set the theme before any content is loaded, prevents flash -->
<script>
var theme;
try { theme = localStorage.getItem('mdbook-theme'); } catch(e) { }
if (theme === null || theme === undefined) { theme = default_theme; }
const html = document.documentElement;
html.classList.remove('light')
html.classList.add(theme);
html.classList.add("js");
</script>
<input type="checkbox" id="sidebar-toggle-anchor" class="hidden">
<!-- Hide / unhide sidebar before it is displayed -->
<script>
var sidebar = null;
var sidebar_toggle = document.getElementById("sidebar-toggle-anchor");
if (document.body.clientWidth >= 1080) {
try { sidebar = localStorage.getItem('mdbook-sidebar'); } catch(e) { }
sidebar = sidebar || 'visible';
} else {
sidebar = 'hidden';
}
sidebar_toggle.checked = sidebar === 'visible';
html.classList.remove('sidebar-visible');
html.classList.add("sidebar-" + sidebar);
</script>
<nav id="sidebar" class="sidebar" aria-label="Table of contents">
<!-- populated by js -->
<mdbook-sidebar-scrollbox class="sidebar-scrollbox"></mdbook-sidebar-scrollbox>
<noscript>
<iframe class="sidebar-iframe-outer" src="../toc.html"></iframe>
</noscript>
<div id="sidebar-resize-handle" class="sidebar-resize-handle">
<div class="sidebar-resize-indicator"></div>
</div>
</nav>
<div id="page-wrapper" class="page-wrapper">
<div class="page">
<div id="menu-bar-hover-placeholder"></div>
<div id="menu-bar" class="menu-bar sticky">
<div class="left-buttons">
<label id="sidebar-toggle" class="icon-button" for="sidebar-toggle-anchor" title="Toggle Table of Contents" aria-label="Toggle Table of Contents" aria-controls="sidebar">
<i class="fa fa-bars"></i>
</label>
<button id="theme-toggle" class="icon-button" type="button" title="Change theme" aria-label="Change theme" aria-haspopup="true" aria-expanded="false" aria-controls="theme-list">
<i class="fa fa-paint-brush"></i>
</button>
<ul id="theme-list" class="theme-popup" aria-label="Themes" role="menu">
<li role="none"><button role="menuitem" class="theme" id="light">Light</button></li>
<li role="none"><button role="menuitem" class="theme" id="rust">Rust</button></li>
<li role="none"><button role="menuitem" class="theme" id="coal">Coal</button></li>
<li role="none"><button role="menuitem" class="theme" id="navy">Navy</button></li>
<li role="none"><button role="menuitem" class="theme" id="ayu">Ayu</button></li>
</ul>
<button id="search-toggle" class="icon-button" type="button" title="Search. (Shortkey: s)" aria-label="Toggle Searchbar" aria-expanded="false" aria-keyshortcuts="S" aria-controls="searchbar">
<i class="fa fa-search"></i>
</button>
</div>
<h1 class="menu-title">Notes</h1>
<div class="right-buttons">
<a href="../print.html" title="Print this book" aria-label="Print this book">
<i id="print-button" class="fa fa-print"></i>
</a>
<a href="https://github.com/johannst/notes" title="Git repository" aria-label="Git repository">
<i id="git-repository-button" class="fa fa-github"></i>
</a>
</div>
</div>
<div id="search-wrapper" class="hidden">
<form id="searchbar-outer" class="searchbar-outer">
<input type="search" id="searchbar" name="searchbar" placeholder="Search this book ..." aria-controls="searchresults-outer" aria-describedby="searchresults-header">
</form>
<div id="searchresults-outer" class="searchresults-outer hidden">
<div id="searchresults-header" class="searchresults-header"></div>
<ul id="searchresults">
</ul>
</div>
</div>
<!-- Apply ARIA attributes after the sidebar and the sidebar toggle button are added to the DOM -->
<script>
document.getElementById('sidebar-toggle').setAttribute('aria-expanded', sidebar === 'visible');
document.getElementById('sidebar').setAttribute('aria-hidden', sidebar !== 'visible');
Array.from(document.querySelectorAll('#sidebar a')).forEach(function(link) {
link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1);
});
</script>
<div id="content" class="content">
<main>
<h1 id="awk1"><a class="header" href="#awk1">awk(1)</a></h1>
<pre><code class="language-markdown">awk [opt] program [input]
-F <sepstr> field separator string (can be regex)
program awk program
input file or stdin if not file given
</code></pre>
<h2 id="input-processing"><a class="header" href="#input-processing">Input processing</a></h2>
<p>Input is processed in two stages:</p>
<ol>
<li>Splitting input into a sequence of <code>records</code>.
By default split at <code>newline</code> character, but can be changed via the
builtin <code>RS</code> variable.</li>
<li>Splitting a <code>record</code> into <code>fields</code>. By default strings without <code>whitespace</code>,
but can be changed via the builtin variable <code>FS</code> or command line option
<code>-F</code>.</li>
</ol>
<p>Fields are accessed as follows:</p>
<ul>
<li><code>$0</code> whole <code>record</code></li>
<li><code>$1</code> field one</li>
<li><code>$2</code> field two</li>
<li>...</li>
</ul>
<h2 id="program"><a class="header" href="#program">Program</a></h2>
<p>An <code>awk</code> program is composed of pairs of the form:</p>
<pre><code class="language-markdown">pattern { action }
</code></pre>
<p>The program is run against each <code>record</code> in the input stream. If a <code>pattern</code>
matches a <code>record</code> the corresponding <code>action</code> is executed and can access the
<code>fields</code>.</p>
<pre><code class="language-markdown">INPUT
|
v
record ----> ∀ pattern matched
| |
v v
fields ----> run associated action
</code></pre>
<p>Any valid awk <code>expr</code> can be a <code>pattern</code>.</p>
<p>An example is the regex pattern <code>/abc/ { print $1 }</code> which prints the first
field if the record matches the regex <code>/abc/</code>. This form is actually a short
version for <code>$0 ~ /abc/ { print $1 }</code>, see the regex comparison operator
below.</p>
<h3 id="special-pattern"><a class="header" href="#special-pattern">Special pattern</a></h3>
<p>awk provides two special patterns, <code>BEGIN</code> and <code>END</code>, which can be used
multiple times. Actions with those patterns are <strong>executed exactly once</strong>.</p>
<ul>
<li><code>BEGIN</code> actions are run before processing the first record</li>
<li><code>END</code> actions are run after processing the last record</li>
</ul>
<h3 id="special-variables"><a class="header" href="#special-variables">Special variables</a></h3>
<ul>
<li><code>RS</code> <em>record separator</em>: first char is the record separator, by default
<newline></li>
<li><code>FS</code> <em>field separator</em>: regex to split records into fields, by default
<space></li>
<li><code>NR</code> <em>number record</em>: number of current record</li>
<li><code>NF</code> <em>number fields</em>: number of fields in the current record</li>
</ul>
<h3 id="special-statements--functions"><a class="header" href="#special-statements--functions">Special statements & functions</a></h3>
<ul>
<li>
<p><code>printf "fmt", args...</code></p>
<p>Print format string, args are comma separated.</p>
<ul>
<li><code>%s</code> string</li>
<li><code>%d</code> decimal</li>
<li><code>%x</code> hex</li>
<li><code>%f</code> float</li>
</ul>
<p>Width can be specified as <code>%Ns</code>, this reserves <code>N</code> chars for a string.
For floats one can use <code>%N.Mf</code>, <code>N</code> is the total number including <code>.</code> and
<code>M</code>.</p>
</li>
<li>
<p><code>sprintf("fmt", expr, ...)</code></p>
<p>Format the expressions according to the format string. Similar as <code>printf</code>,
but this is a function and return value can be assigned to a variable.</p>
</li>
<li>
<p><code>strftime("fmt")</code></p>
<p>Print time stamp formatted by <code>fmt</code>.</p>
<ul>
<li><code>%Y</code> full year (eg 2020)</li>
<li><code>%m</code> month (01-12)</li>
<li><code>%d</code> day (01-31)</li>
<li><code>%F</code> alias for <code>%Y-%m-%d</code></li>
<li><code>%H</code> hour (00-23)</li>
<li><code>%M</code> minute (00-59)</li>
<li><code>%S</code> second (00-59)</li>
<li><code>%T</code> alias for <code>%H:%M:%S</code></li>
</ul>
</li>
<li>
<p><code>S ~ R</code>, <code>S !~ R</code></p>
<p>The regex comparison operator, where the former returns true if the string
<code>S</code> matches the regex <code>R</code>, and the latter is the negated form.
The regex can be either a
<a href="https://www.gnu.org/software/gawk/manual/html_node/Regexp-Usage.html">constant</a>
or <a href="https://www.gnu.org/software/gawk/manual/html_node/Computed-Regexps.html">dynamic</a>
regex.</p>
</li>
</ul>
<h2 id="examples"><a class="header" href="#examples">Examples</a></h2>
<h3 id="filter-records"><a class="header" href="#filter-records">Filter records</a></h3>
<pre><code class="language-bash">awk 'NR%2 == 0 { print $0 }' <file>
</code></pre>
<p>The pattern <code>NR%2 == 0</code> matches every second record and the action <code>{ print $0 }</code>
prints the whole record.</p>
<h3 id="negative-patterns"><a class="header" href="#negative-patterns">Negative patterns</a></h3>
<pre><code class="language-bash">awk '!/^#/ { print $1 }' <file>
</code></pre>
<p>Matches records not starting with <code>#</code>.</p>
<h3 id="range-patterns"><a class="header" href="#range-patterns">Range patterns</a></h3>
<pre><code class="language-bash">echo -e "a\nFOO\nb\nc\nBAR\nd" | \
awk '/FOO/,/BAR/ { print }'
</code></pre>
<p><code>/FOO/,/BAR/</code> define a range pattern of <code>begin_pattern, end_pattern</code>. When
<code>begin_pattern</code> is matched the range is <strong>turned on</strong> and when the
<code>end_pattern</code> is matched the range is <strong>turned off</strong>. This matches every record
in the range <em>inclusive</em>.</p>
<p>An <em>exclusive</em> range must be handled explicitly, for example as follows.</p>
<pre><code class="language-bash">echo -e "a\nFOO\nb\nc\nBAR\nd" | \
awk '/FOO/,/BAR/ { if (!($1 ~ "FOO") && !($1 ~ "BAR")) { print } }'
</code></pre>
<h3 id="access-last-fields-in-records"><a class="header" href="#access-last-fields-in-records">Access last fields in records</a></h3>
<pre><code class="language-bash">echo 'a b c d e f' | awk '{ print $NF $(NF-1) }'
</code></pre>
<p>Access last fields with arithmetic on the <code>NF</code> number of fields variable.</p>
<h3 id="split-on-multiple-tokens"><a class="header" href="#split-on-multiple-tokens">Split on multiple tokens</a></h3>
<pre><code class="language-bash">echo 'a,b;c:d' | awk -F'[,;:]' '{ printf "1=%s | 4=%s\n", $1, $4 }'
</code></pre>
<p>Use regex as field separator.</p>
<h3 id="capture-in-variables"><a class="header" href="#capture-in-variables">Capture in variables</a></h3>
<pre><code class="language-bash"># /proc/<pid>/status
# Name: cat
# ...
# VmRSS: 516 kB
# ...
for f in /proc/*/status; do
cat $f | awk '
/^VmRSS/ { rss = $2/1024 }
/^Name/ { name = $2 }
END { printf "%16s %6d MB\n", name, rss }';
done | sort -k2 -n
</code></pre>
<p>We capture values from <code>VmRSS</code> and <code>Name</code> into variables and print them at the
<code>END</code> once processing all records is done.</p>
<h3 id="capture-in-array"><a class="header" href="#capture-in-array">Capture in array</a></h3>
<pre><code class="language-bash">echo 'a 10
b 2
b 4
a 1' | awk '{
vals[$1] += $2
cnts[$1] += 1
}
END {
for (v in vals)
printf "%s %d\n", v, vals[v] / cnts [v]
}'
</code></pre>
<p>Capture keys and values from different columns and some up the values.
At the <code>END</code> we compute the average of each key.</p>
<h3 id="run-shell-command-and-capture-output"><a class="header" href="#run-shell-command-and-capture-output">Run shell command and capture output</a></h3>
<pre><code class="language-bash">cat /proc/1/status | awk '
/^Pid/ {
"ps --no-header -o user " $2 | getline user;
print user
}'
</code></pre>
<p>We build a <code>ps</code> command line and capture the first line of the processes output
in the <code>user</code> variable and then print it.</p>
</main>
<nav class="nav-wrapper" aria-label="Page navigation">
<!-- Mobile navigation buttons -->
<a rel="prev" href="../cli/index.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
<i class="fa fa-angle-left"></i>
</a>
<a rel="next prefetch" href="../cli/cut.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
<i class="fa fa-angle-right"></i>
</a>
<div style="clear: both"></div>
</nav>
</div>
</div>
<nav class="nav-wide-wrapper" aria-label="Page navigation">
<a rel="prev" href="../cli/index.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
<i class="fa fa-angle-left"></i>
</a>
<a rel="next prefetch" href="../cli/cut.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
<i class="fa fa-angle-right"></i>
</a>
</nav>
</div>
<script>
window.playground_copyable = true;
</script>
<script src="../elasticlunr.min.js"></script>
<script src="../mark.min.js"></script>
<script src="../searcher.js"></script>
<script src="../clipboard.min.js"></script>
<script src="../highlight.js"></script>
<script src="../book.js"></script>
<!-- Custom JS scripts -->
</div>
</body>
</html>