cli: add new group for cli foo tools

author: Johannes Stoelp <johannes.stoelp@gmail.com> 2024-05-01 14:57:52 +0200
committer: Johannes Stoelp <johannes.stoelp@gmail.com> 2024-05-01 14:57:52 +0200
commit: b737cc8ca5bb8ca5e07cd0151d678a7b4b10d5cb (patch)
tree: 86814d8fb3557ea2cbf73892dd0ec4e590e854de /src/cli
parent: 50e07a8bca68d2f568df44166fa94383141c2696 (diff)
download: notes-b737cc8ca5bb8ca5e07cd0151d678a7b4b10d5cb.tar.gz
notes-b737cc8ca5bb8ca5e07cd0151d678a7b4b10d5cb.zip
5 files changed, 350 insertions, 0 deletions
diff --git a/src/cli/README.md b/src/cli/README.md
new file mode 100644
index 0000000..53c64b7
--- /dev/null
+++ b/src/cli/README.md
@@ -0,0 +1,6 @@
+# CLI foo
+
+- [awk](./awk.md)
+- [sed](./sed.md)
+- [column](./column.md)
+- [sort](./sort.md)
diff --git a/src/cli/awk.md b/src/cli/awk.md
new file mode 100644
index 0000000..d6f6c9c
--- /dev/null
+++ b/src/cli/awk.md
@@ -0,0 +1,197 @@
+# awk(1)
+
+```markdown
+awk [opt] program [input]
+    -F <sepstr>        field separator string (can be regex)
+    program            awk program
+    input              file, or stdin if no file is given
+```
+
+## Input processing
+
+Input is processed in two stages:
+1. Splitting input into a sequence of `records`.
+   By default split at `newline` character, but can be changed via the
+   builtin `RS` variable.
+2. Splitting a `record` into `fields`. By default split at runs of
+   `whitespace`, but can be changed via the builtin variable `FS` or command
+   line option `-F`.
+
+Fields are accessed as follows:
+- `$0` whole `record`
+- `$1` field one
+- `$2` field two
+- ...
+
+## Program
+
+An `awk` program is composed of pairs of the form:
+```markdown
+pattern { action }
+```
+The program is run against each `record` in the input stream. If a `pattern`
+matches a `record` the corresponding `action` is executed and can access the
+`fields`.
+
+```markdown
+INPUT
+  |
+  v
+record ----> ∀ pattern matched
+  |                   |
+  v                   v
+fields ----> run associated action
+```
+
+Any valid awk `expr` can be a `pattern`.
+
+An example is the regex pattern `/abc/ { print $1 }`, which prints the first
+field if the record matches the regex `/abc/`. This form is shorthand for
+`$0 ~ /abc/ { print $1 }`, see the regex comparison operator
+below.
+
+### Special pattern
+
+awk provides two special patterns, `BEGIN` and `END`, which can be used
+multiple times. Actions with those patterns are **executed exactly once**.
+- `BEGIN` actions are run before processing the first record
+- `END` actions are run after processing the last record
+
+### Special variables
+
+- `RS` _record separator_: first char is the record separator, by default
+  <newline>
+- `FS` _field separator_: regex to split records into fields, by default
+  <space>
+- `NR` _number of records_: number of the current record
+- `NF` _number fields_: number of fields in the current record
+
+### Special statements & functions
+
+- `printf "fmt", args...`
+
+  Print format string, args are comma separated.
+  - `%s` string
+  - `%d` decimal
+  - `%x` hex
+  - `%f` float
+
+  Width can be specified as `%Ns`, this reserves `N` chars for a string.
+  For floats one can use `%N.Mf`, where `N` is the total width (including the
+  `.`) and `M` is the number of digits after the decimal point.
+
+- `sprintf("fmt", expr, ...)`
+
+  Format the expressions according to the format string. Similar to `printf`,
+  but this is a function and its return value can be assigned to a variable.
+
+- `strftime("fmt")`
+
+  Return the current time formatted by `fmt` (gawk extension, not POSIX).
+  - `%Y` full year (eg 2020)
+  - `%m` month (01-12)
+  - `%d` day (01-31)
+  - `%F` alias for `%Y-%m-%d`
+  - `%H` hour (00-23)
+  - `%M` minute (00-59)
+  - `%S` second (00-59)
+  - `%T` alias for `%H:%M:%S`
+
+- `S ~ R`, `S !~ R`
+
+  The regex comparison operator, where the former returns true if the string
+  `S` matches the regex `R`, and the latter is the negated form.
+  The regex can be either a
+  [constant](https://www.gnu.org/software/gawk/manual/html_node/Regexp-Usage.html)
+  or [dynamic](
+  https://www.gnu.org/software/gawk/manual/html_node/Computed-Regexps.html)
+  regex.
+
+## Examples
+
+### Filter records
+```bash
+awk 'NR%2 == 0 { print $0 }' <file>
+```
+The pattern `NR%2 == 0` matches every second record and the action `{ print $0 }`
+prints the whole record.
+
+### Negative patterns
+```bash
+awk '!/^#/ { print $1 }' <file>
+```
+Matches records not starting with `#`.
+
+### Range patterns
+```bash
+echo -e "a\nFOO\nb\nc\nBAR\nd" | \
+    awk '/FOO/,/BAR/ { print }'
+```
+`/FOO/,/BAR/` defines a range pattern of `begin_pattern, end_pattern`. When
+`begin_pattern` is matched the range is **turned on** and when the
+`end_pattern` is matched the range is **turned off**. This matches every record
+in the range _inclusive_.
+
+An _exclusive_ range must be handled explicitly, for example as follows.
+```bash
+echo -e "a\nFOO\nb\nc\nBAR\nd" | \
+    awk '/FOO/,/BAR/ { if (!($1 ~ "FOO") && !($1 ~ "BAR")) { print } }'
+```
+
+### Access last fields in records
+```bash
+echo 'a b c d e f' | awk '{ print $NF $(NF-1) }'
+```
+Access the last fields via arithmetic on `NF`, the number of fields variable.
+
+### Split on multiple tokens
+```bash
+echo 'a,b;c:d' | awk -F'[,;:]' '{ printf "1=%s | 4=%s\n", $1, $4 }'
+```
+Use regex as field separator.
+
+### Capture in variables
+```bash
+# /proc/<pid>/status
+#   Name:    cat
+#   ...
+#   VmRSS:   516 kB
+#   ...
+
+for f in /proc/*/status; do
+    awk '
+        /^VmRSS/ { rss = $2/1024 }
+        /^Name/ { name = $2 }
+        END { printf "%16s %6d MB\n", name, rss }' "$f"
+done | sort -k2 -n
+```
+We capture values from `VmRSS` and `Name` into variables and print them at the
+`END` once processing all records is done.
+
+### Capture in array
+```bash
+echo 'a 10
+b 2
+b 4
+a 1' | awk '{
+    vals[$1] += $2
+    cnts[$1] += 1
+}
+END {
+    for (v in vals)
+        printf "%s %d\n", v, vals[v] / cnts[v]
+}'
+```
+Capture keys and values from different columns and sum up the values per key.
+At the `END` we compute the average for each key.
+
+### Run shell command and capture output
+```bash
+cat /proc/1/status | awk '
+                     /^Pid/ {
+                        "ps --no-header -o user " $2 | getline user;
+                         print user
+                     }'
+```
+We build a `ps` command line, capture the first line of the command's output
+in the `user` variable and then print it.
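Note on `src/cli/awk.md`: the two-stage input processing and the `sprintf` width handling described in the notes have no example of their own. The following is a minimal sketch, not part of the commit, assuming a made-up input where records are separated by `;` and fields by `,`. It sets `RS` and `FS` in a `BEGIN` action, formats each record with `sprintf`, and prints the record count in `END`.

```sh
# Hypothetical input: records separated by ';' (RS), fields by ',' (FS).
out=$(printf '1,2;3,4,5' | awk '
    BEGIN { RS = ";"; FS = "," }
    # NR = record number, NF = fields in record, %5.1f = width 5, 1 decimal.
    { line = sprintf("%d|%d|%5.1f", NR, NF, $1 / 2); print line }
    END { print "records=" NR }')
echo "$out"
```

The separators are set in `BEGIN` so that they are already in effect when the first record is read.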
diff --git a/src/cli/column.md b/src/cli/column.md
new file mode 100644
index 0000000..4a3b2c4
--- /dev/null
+++ b/src/cli/column.md
@@ -0,0 +1,10 @@
+# column(1)
+
+## Examples
+```sh
+# Show as table (aligned columns), with comma as delimiter from stdin.
+echo -e 'a,b,c\n111,22222,33' | column -t -s ','
+
+# Show file as table.
+column -t -s ',' test.csv
+```
diff --git a/src/cli/sed.md b/src/cli/sed.md
new file mode 100644
index 0000000..5b5f741
--- /dev/null
+++ b/src/cli/sed.md
@@ -0,0 +1,102 @@
+# sed(1)
+
+```
+sed [opts] [script] [file]
+  opts:
+    -i          edit file in place
+    -i.bk       edit file in place and create backup file
+                (with .bk suffix, can be specified differently)
+    --follow-symlinks
+                follow symlinks when editing in place
+    -e SCRIPT   add SCRIPT to commands to be executed
+                (can be specified multiple times)
+    -f FILE     add content of FILE to the commands to be executed
+
+    --debug     annotate program execution
+```
+
+## Examples
+### Delete lines
+```sh
+# Delete two lines.
+echo -e 'aa\nbb\ncc\ndd' | sed '1d;3d'
+# bb
+# dd
+
+# Delete last ($) line.
+echo -e 'aa\nbb\ncc\ndd' | sed '$d'
+# aa
+# bb
+# cc
+
+# Delete range of lines.
+echo -e 'aa\nbb\ncc\ndd' | sed '1,3d'
+# dd
+
+# Delete lines matching pattern.
+echo -e 'aa\nbb\ncc\ndd' | sed '/bb/d'
+# aa
+# cc
+# dd
+
+# Delete lines NOT matching pattern.
+echo -e 'aa\nbb\ncc\ndd' | sed '/bb/!d'
+# bb
+```
+
+### Insert lines
+```sh
+# Insert before line.
+echo -e 'aa\nbb' | sed '2iABC'
+# aa
+# ABC
+# bb
+
+# Insert after line.
+echo -e 'aa\nbb' | sed '2aABC'
+# aa
+# bb
+# ABC
+
+# Replace line.
+echo -e 'aa\nbb' | sed '2cABC'
+# aa
+# ABC
+
+# Insert before pattern match.
+echo -e 'aa\nbb' | sed '/bb/i 123'
+# aa
+# 123
+# bb
+```
+
+### Substitute lines
+```sh
+# Substitute by regex.
+echo -e 'aafooaa\ncc' | sed 's/foo/MOOSE/'
+# aaMOOSEaa
+# cc
+```
+
+### Multiple scripts
+```sh
+echo -e 'foo\nbar' | sed -e 's/foo/FOO/' -e 's/FOO/BAR/'
+# BAR
+# bar
+```
+
+### Edit in place through symlink
+```sh
+touch file
+ln -s file link
+ls -l link
+# lrwxrwxrwx 1 johannst johannst 4 Feb  7 23:02 link -> file
+
+sed -i --follow-symlinks '1iabc' link
+ls -l link
+# lrwxrwxrwx 1 johannst johannst 4 Feb  7 23:02 link -> file
+
+sed -i '1iabc' link
+ls -l link
+# -rw-r--r-- 1 johannst johannst 0 Feb  7 23:02 link
+```
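Note on `src/cli/sed.md`: the `-i.bk` option is listed in the synopsis but not shown in the examples. A minimal sketch, not part of the commit, assuming a scratch directory created with `mktemp -d` and a hypothetical file named `file`:

```sh
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)
printf 'foo\nbar\n' > "$dir/file"

# Edit in place; the original content is kept in file.bk.
sed -i.bk 's/foo/FOO/' "$dir/file"

new=$(cat "$dir/file")     # edited content
old=$(cat "$dir/file.bk")  # backup of the original content
rm -r "$dir"

printf 'edited:\n%s\nbackup:\n%s\n' "$new" "$old"
```

The backup file is what allows the edit to be checked or reverted afterwards.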
diff --git a/src/cli/sort.md b/src/cli/sort.md
new file mode 100644
index 0000000..e74b490
--- /dev/null
+++ b/src/cli/sort.md
@@ -0,0 +1,35 @@
+# sort(1)
+
+```
+sort [opts] [file]
+  opts:
+    -r      reverse output
+    -b      ignore leading blanks
+
+    -n      sort numerically
+    -h      sort by human-readable numbers (e.g. 2K, 1G)
+    -V      sort by version
+
+    -k<N>   sort by key starting at field N (to end of line)
+    -t<S>   field separator
+```
+
+## Examples
+```sh
+# Sort by directory sizes.
+du -sh * | sort -h
+```
+
+```sh
+# Sort numeric by second key.
+# The default key separator is non-blank to blank transition.
+echo 'a 4
+d 10
+c 21' | sort -k2 -n
+
+# Sort numeric by second key, split at comma.
+echo 'a,4
+d,10
+c,21' | sort -k2 -n -t,
+```
+> Use `--debug` to annotate part of the line used to sort and hence debug the key usage.
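Note on `src/cli/sort.md`: a key given as `-k2` runs from field 2 to the end of the line, while `-k2,2` limits it to field 2 only. The sketch below is not part of the commit. It reuses the comma-separated sample data from the notes to contrast the default lexical comparison with a numeric one on the same key.

```sh
# Lexical comparison of field 2: "10" and "21" sort before "4".
lex=$(printf 'a,4\nd,10\nc,21\n' | sort -t, -k2,2)

# Numeric comparison of field 2 (the n modifier applies to this key only).
num=$(printf 'a,4\nd,10\nc,21\n' | sort -t, -k2,2n)

printf 'lexical:\n%s\nnumeric:\n%s\n' "$lex" "$num"
```

Attaching the modifier to the key, as in `-k2,2n`, is handy when several keys need different comparison modes.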