Notes

A personal collection of notes and cheatsheets.

Source code is located at johannst/notes.

Tools

zsh(1)

Keybindings

Change input mode:

bindkey -v              change to vi keymap
bindkey -e              change to emacs keymap

Define key-mappings:

bindkey                 list mappings in current keymap
bindkey in-str cmd      create mapping for `in-str` to `cmd`
bindkey -r in-str       remove binding for `in-str`

# C-v <key>             dump <key> code, which can be used in `in-str`
# zle -l                list all functions for keybindings
# man zshzle(1)         STANDARD WIDGETS: get description of functions

Access edit buffer in zle widget:

$BUFFER       # Entire edit buffer content
$LBUFFER      # Edit buffer content left of cursor
$RBUFFER      # Edit buffer content right of cursor

# create zle widget which adds text right of the cursor
function add-text() {
    RBUFFER="some text $RBUFFER"
}
zle -N add-text

bindkey "^p" add-text

Parameter

Default value:

# default value
echo ${foo:-defval}   # defval
foo=bar
echo ${foo:-defval}   # bar

Alternative value:

echo ${foo:+altval}   # ''
foo=bar
echo ${foo:+altval}   # altval

Check variable set, error if not set:

echo ${foo:?msg}      # print `msg` and abort with exit status `1`
foo=bar
echo ${foo:?msg}      # bar

Sub-string ${var:offset:length}:

foo=abcdef
echo ${foo:1:3}       # bcd

Trim prefix ${var#prefix}:

foo=bar.baz
echo ${foo#bar}       # .baz

Trim suffix ${var%suffix}:

foo=bar.baz
echo ${foo%.baz}      # bar

Substitute pattern ${var/pattern/replace}:

foo=aabbccbbdd
echo ${foo/bb/XX}    # aaXXccbbdd
echo ${foo//bb/XX}   # aaXXccXXdd
# replace prefix
echo ${foo/#bb/XX}   # aabbccbbdd
echo ${foo/#aa/XX}   # XXbbccbbdd
# replace suffix
echo ${foo/%bb/XX}   # aabbccbbdd
echo ${foo/%dd/XX}   # aabbccbbXX

Note: prefix/suffix/pattern are interpreted as glob patterns (as in pathname expansion).

Variables

# Variable with local scope
local var=val

# Read-only variable
readonly var=val

Indexed arrays:

arr=(aa bb cc dd)
echo $arr[1]           # aa
echo $arr[-1]          # dd

arr+=(ee)
echo $arr[-1]          # ee

echo $arr[1,3]         # aa bb cc

Associative arrays:

typeset -A arr
arr[x]='aa'
arr[y]='bb'
echo $arr[x]           # aa

Tied arrays:

typeset -T VEC vec=(1 2 3) '|'

echo $vec              # 1 2 3
echo $VEC              # 1|2|3

Unique arrays (set):

typeset -U vec=(1 2 3)

echo $vec             # 1 2 3
vec+=(1 2 4)
echo $vec             # 1 2 3 4

Expansion Flags

Join array to string j:sep::

foo=(1 2 3 4)
echo ${(j:-:)foo}     # 1-2-3-4
echo ${(j:\n:)foo}    # join with new lines

Split string to array s:sep:

foo='1-2-3-4'
bar=(${(s:-:)foo})    # capture as array
echo $bar             # 1 2 3 4
echo $bar[2]          # 2

Upper/Lower case string:

foo=aaBB
echo ${(L)foo}        # aabb
echo ${(U)foo}        # AABB

Key/values in associative arrays:

typeset -A vec; vec[a]='aa'; vec[b]='bb'

echo ${(k)vec}        # a b
echo ${(v)vec}        # aa bb
echo ${(kv)vec}       # a aa b bb

# Iterate over key value pairs.
for k v in ${(kv)vec}; do ...; done

Argument parsing with zparseopts

zparseopts [-D] [-E] [-A assoc] specs

Arguments are copied into the associative array assoc according to specs. Each spec is described by an entry as opt[:][=array].

  • opt is the option without the - char. Passing -f is matched against f opt, --long is matched against -long.
  • Using : means the option will take an argument.
  • The optional =array specifies an alternate storage container where this option should be stored.

Documentation can be found in man zshmodules.

Example

#!/bin/zsh
function test() {
    zparseopts -D -E -A opts f=flag o: -long:
    echo "flag $flag"
    echo "o    $opts[-o]"
    echo "long $opts[--long]"
    echo "pos  $1"
}

test -f -o OPTION --long LONG_OPT POSITIONAL

# Outputs:
#   flag -f
#   o    OPTION
#   long LONG_OPT
#   pos  POSITIONAL

Regular Expressions

Zsh supports regular expression matching with the binary operator =~. The match results can be accessed via the $MATCH variable and $match indexed array:

  • $MATCH contains the full match
  • $match[1] contains match of the first capture group

INPUT='title foo : 1234'
REGEX='^title (.+) : ([0-9]+)$'
if [[ $INPUT =~ $REGEX ]]; then
    echo "$MATCH"       # title foo : 1234
    echo "$match[1]"    # foo
    echo "$match[2]"    # 1234
fi

Completion

Installation

Completion functions are provided via files and need to be placed in a location covered by $fpath. By convention the completion files are named _<CMD>.

A completion skeleton for the command foo, stored in _foo:

#compdef foo

function _foo() {
    ...
}

Alternatively one can install a completion function explicitly by calling compdef <FUNC> <CMD>.

Completion Variables

The following variables are available in completion functions:

$words              # array with command line in words
$#words             # number of words
$CURRENT            # index into $words for cursor position
$words[CURRENT-1]   # previous word (relative to cursor position)

Completion Functions

  • _describe simple completion, just words + description
  • _arguments sophisticated completion, allows specifying actions

Completion with _describe

_describe MSG COMP

  • MSG simple string with header message
  • COMP array of completions where each entry is "opt:description"

function _foo() {
    local -a opts
    opts=('bla:desc for bla' 'blu:desc for blu')
    _describe 'foo-msg' opts
}
compdef _foo foo

foo <TAB><TAB>
 -- foo-msg --
bla  -- desc for bla
blu  -- desc for blu

Completion with _arguments

_arguments SPEC [SPEC...]

where SPEC can have one of the following forms:

  • OPT[DESC]:MSG:ACTION
  • N:MSG:ACTION

Available actions

(op1 op2)   list possible matches
->VAL       set $state=VAL and continue, `$state` can be checked later in switch case
FUNC        call func to generate matches
{STR}       evaluate `STR` to generate matches

Example

Skeleton to copy/paste for writing simple completions.

Assume a program foo with the following interface:

foo -c green|red|blue -s low|high -f <file> -d <dir> -h

The completion handler could be implemented as follows in a file called _foo:

#compdef foo

function _foo_color() {
    local colors=()
    colors+=('green:green color')
    colors+=('red:red color')
    colors+=('blue:blue color')
    _describe "color" colors
}

function _foo() {
    _arguments                              \
        "-c[define color]:color:->s_color"  \
        "-s[select sound]:sound:(low high)" \
        "-f[select file]:file:_files"       \
        "-d[select dir]:dir:_files -/"      \
        "-h[help]"

    case $state in
        s_color) _foo_color;;
    esac
}

_files is a zsh builtin utility function to complete files/dirs.

bash(1)

Expansion

Generator

# generate sequence from n to m
{n..m}
# generate sequence from n to m step by s
{n..m..s}

# expand cartesian product
{a,b}{c,d}
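
As a quick illustration of the generators above, the following sketch prints a few concrete expansions. It assumes bash 4 or newer for the step form; the values and the file name pattern are arbitrary.

```shell
#!/bin/bash
# sequence from 1 to 5
echo {1..5}          # 1 2 3 4 5
# sequence from 0 to 10 in steps of 5
echo {0..10..5}      # 0 5 10
# cartesian product of two lists
echo {a,b}{c,d}      # ac ad bc bd
# generators combine with surrounding text
echo file{1..3}.txt  # file1.txt file2.txt file3.txt
```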

Parameter

# default value
bar=${foo:-some_val}  # if $foo set, then bar=$foo else bar=some_val

# alternate value
bar=${foo:+bla $foo}  # if $foo set, then bar="bla $foo" else bar=""

# check param set
bar=${foo:?msg}  # if $foo set, then bar=$foo else exit and print msg

# indirect
FOO=foo
BAR=FOO
bar=${!BAR}  # deref value of BAR -> bar=$FOO

# prefix
${foo#prefix}  # remove prefix when expanding $foo
# suffix
${foo%suffix}  # remove suffix when expanding $foo

# substitute
${foo/pattern/string}  # replace pattern with string when expanding foo
# pattern starts with
# '/'   replace all occurrences of pattern
# '#'   pattern match at beginning
# '%'   pattern match at end

Note: prefix/suffix/pattern are interpreted as glob patterns (as in pathname expansion).
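
A minimal sketch of the expansions above applied to concrete values. The variable names file and name and their contents are made up for illustration. Note how * in the prefix/suffix is matched as a glob.

```shell
#!/bin/bash
file="report.final.txt"

echo "${file#*.}"               # final.txt         (shortest prefix matching '*.' removed)
echo "${file%.*}"               # report.final      (shortest suffix matching '.*' removed)
echo "${file/final/draft}"      # report.draft.txt  (first match replaced)
echo "${file//./_}"             # report_final_txt  (all matches replaced)
echo "${file/#report/summary}"  # summary.final.txt (match anchored at beginning)
echo "${file/%txt/md}"          # report.final.md   (match anchored at end)

unset name
echo "${name:-anonymous}"       # anonymous         (name is unset, default is used)
name=bob
echo "${name:+hello $name}"     # hello bob         (name is set, alternate is used)
```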

Pathname

*           match any string
?           match any single char
\\          match backslash
[abc]       match any char of 'a' 'b' 'c'
[a-z]       match any char between 'a' - 'z'
[^ab]       negate, match all not 'a' 'b'
[[:class:]] match any char in class, available:
              alnum,alpha,ascii,blank,cntrl,digit,graph,lower,
              print,punct,space,upper,word,xdigit

With the extglob shell option enabled it is possible to use more powerful patterns. In the following, pattern-list is one or more patterns separated by the | char.

?(pattern-list)   matches zero or one occurrence of the given patterns
*(pattern-list)   matches zero or more occurrences of the given patterns
+(pattern-list)   matches one or more occurrences of the given patterns
@(pattern-list)   matches one of the given patterns
!(pattern-list)   matches anything except one of the given patterns

Note: shopt -s extglob/shopt -u extglob to enable/disable extglob option.
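
The following sketch shows some of the extglob patterns in a case statement. The function name classify and the categories are hypothetical. extglob has to be enabled before the function is parsed.

```shell
#!/bin/bash
shopt -s extglob

# classify a name using extended patterns
function classify() {
    case $1 in
        @(*.c|*.h))  echo "c-source";;      # exactly one of the two patterns
        +([0-9]))    echo "number";;        # one or more digits
        !(*.*))      echo "no-extension";;  # anything that does not contain a dot
        *)           echo "other";;
    esac
}

classify main.c     # c-source
classify 1234       # number
classify Makefile   # no-extension
classify notes.txt  # other
```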

I/O redirection

Note: The trick with bash I/O redirection is to interpret from left-to-right.

# stdout & stderr to file
command >file 2>&1
# equivalent
command &>file

# stderr to stdout & stdout to file
command 2>&1 >file

The article Bash One-Liners Explained, Part III: All about redirections contains some nice visualizations to explain bash redirections.

Explanation

j>&i

Duplicate fd i to fd j, making j a copy of i. See dup2(2).

Example:

command 2>&1 >file
  1. duplicate fd 1 to fd 2, effectively redirecting stderr to stdout
  2. redirect stdout to file
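
To make the left-to-right rule tangible, here is a small sketch that runs both orderings against a temporary file. The helper emit and the variable names are made up for this example.

```shell
#!/bin/bash
# helper writing one line to stdout and one line to stderr
function emit() {
    echo "to-stdout"
    echo "to-stderr" >&2
}

log=$(mktemp)

# 1. stdout -> file, 2. stderr -> copy of stdout (the file)
#    => both lines end up in the file
emit >"$log" 2>&1
both=$(cat "$log")

# 1. stderr -> copy of stdout (here the pipe of the command substitution)
# 2. stdout -> file
#    => only the stdout line ends up in the file, stderr is captured
captured=$(emit 2>&1 >"$log")
only_stdout=$(cat "$log")

rm -f "$log"

echo "file (>file 2>&1): $both"
echo "file (2>&1 >file): $only_stdout"
echo "captured stderr  : $captured"
```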

Argument parsing with getopts

The getopts builtin uses the following global variables:

  • OPTARG, value of last option argument
  • OPTIND, index of the next argument to process (user must reset)
  • OPTERR, display errors if set to 1

getopts <optstring> <param> [<args>]
  • <optstring> specifies the names of supported options, eg f:c
    • f: means -f option with an argument
    • c means -c option without an argument
  • <param> specifies a variable name which getopts fills with the last parsed option character (the option's argument, if any, is stored in OPTARG)
  • <args> optionally specify argument string to parse, by default getopts parses $@

Example

#!/bin/bash
function parse_args() {
    while getopts "f:c" PARAM; do
        case $PARAM in
            f) echo "GOT -f $OPTARG";;
            c) echo "GOT -c";;
            *) echo "ERR: print usage"; exit 1;;
        esac
    done
    # user's responsibility to reset OPTIND
    OPTIND=1
}

parse_args -f xxx -c
parse_args -f yyy
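
The example above only handles options. A common follow-up, sketched here, is to drop the parsed options with shift so that the remaining positional arguments are left in $@. Declaring OPTIND as a local variable is used as an alternative to resetting it by hand. The function name parse and the options -f/-v are hypothetical.

```shell
#!/bin/bash
function parse() {
    # local OPTIND makes each invocation start parsing at the first argument
    local OPTIND=1 opt file="" verbose=0
    while getopts "f:v" opt; do
        case $opt in
            f) file=$OPTARG;;
            v) verbose=1;;
            *) return 1;;
        esac
    done
    # OPTIND is the index of the first non-option argument, drop everything before
    shift $((OPTIND - 1))
    echo "file=$file verbose=$verbose rest=$*"
}

parse -v -f out.txt a b   # file=out.txt verbose=1 rest=a b
parse c                   # file= verbose=0 rest=c
```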

Regular Expressions

Bash supports regular expression matching with the binary operator =~. The match results can be accessed via the $BASH_REMATCH variable:

  • ${BASH_REMATCH[0]} contains the full match
  • ${BASH_REMATCH[1]} contains match of the first capture group

INPUT='title foo : 1234'
REGEX='^title (.+) : ([0-9]+)$'
if [[ $INPUT =~ $REGEX ]]; then
    echo "${BASH_REMATCH[0]}"    # title foo : 1234
    echo "${BASH_REMATCH[1]}"    # foo
    echo "${BASH_REMATCH[2]}"    # 1234
fi

Caution: When specifying a regex in the [[ ]] block directly, any quoted part of the pattern is matched as a literal string and loses its regex meaning. [[ $INPUT =~ "f.o" ]] matches the literal string f.o, not the regex f.o!
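
A small sketch of how quoting affects =~, assuming bash 3.2 or newer with default compatibility settings. The variable names quoted, unquoted and from_var are only used for this illustration.

```shell
#!/bin/bash
INPUT='title foo : 1234'

# quoted pattern: matched as the literal string 'f.o', which is not in INPUT
if [[ $INPUT =~ "f.o" ]]; then quoted=match; else quoted=nomatch; fi

# unquoted pattern: '.' is a regex metachar and matches the 'o' in 'foo'
if [[ $INPUT =~ f.o ]]; then unquoted=match; else unquoted=nomatch; fi

# pattern in an unquoted variable: treated as regex as well
REGEX='f.o'
if [[ $INPUT =~ $REGEX ]]; then from_var=match; else from_var=nomatch; fi

echo "quoted=$quoted unquoted=$unquoted from_var=$from_var"
# quoted=nomatch unquoted=match from_var=match
```

Storing the regex in a variable, as done in the example further up, avoids most quoting surprises.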

Completion

The complete builtin is used to interact with the completion system.

complete                    # print currently installed completion handlers
complete -F <func> <cmd>    # install <func> as completion handler for <cmd>
complete -r <cmd>           # uninstall completion handler for <cmd>

Variables available in completion functions:

# in
$1              # <cmd>
$2              # current word
$3              # previous word

COMP_WORDS      # array with current command line words
COMP_CWORD      # index into COMP_WORDS with current cursor position

# out
COMPREPLY       # array with possible completions

The compgen builtin is used to generate possible matches by comparing word against words generated by option.

compgen <option> <word>

# useful options:
# -W <list>    specify list of possible completions
# -d           generate list with dirs
# -f           generate list with files
# -u           generate list with users
# -e           generate list with exported variables

# compare "f" against words "foo" "foobar" "bar" and generate matches
compgen -W "foo foobar bar" "f"

# compare "hom" against file/dir names and generate matches
compgen -d -f "hom"

Example

Skeleton to copy/paste for writing simple completions.

Assume a program foo with the following interface:

foo -c green|red|blue -s low|high -f <file> -h

The completion handler could be implemented as follows:

function _foo() {
    local curr=$2
    local prev=$3

    local opts="-c -s -f -h"
    case $prev in
        -c) COMPREPLY=( $(compgen -W "green red blue" -- $curr) );;
        -s) COMPREPLY=( $(compgen -W "low high" -- $curr) );;
        -f) COMPREPLY=( $(compgen -f -- $curr) );;
        *)  COMPREPLY=( $(compgen -W "$opts" -- $curr) );;
    esac
}

complete -F _foo foo
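
Since a completion handler is a plain shell function, it can be exercised by hand without pressing TAB. The following sketch uses a reduced copy of the handler above and passes the arguments that bash would pass ($1 command, $2 current word, $3 previous word). This is just one possible way to test a handler.

```shell
#!/bin/bash
# reduced copy of the handler from above (only -c takes completed values)
function _foo() {
    local curr=$2
    local prev=$3

    local opts="-c -s -f -h"
    case $prev in
        -c) COMPREPLY=( $(compgen -W "green red blue" -- "$curr") );;
        *)  COMPREPLY=( $(compgen -W "$opts" -- "$curr") );;
    esac
}

# simulate: foo -c gr<TAB>
_foo foo "gr" "-c"
echo "${COMPREPLY[@]}"    # green

# simulate: foo -<TAB>
_foo foo "-" "foo"
echo "${COMPREPLY[@]}"    # -c -s -f -h
```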

fish(1)

Quick Info

Fish initialization file ~/.config/fish/config.fish

Switch between different key bindings:

  • fish_default_key_bindings to use default key bindings
  • fish_vi_key_bindings to use vi key bindings

Variables

Available scopes

  • local variable local to a block
  • global variable global to shell instance
  • universal variable universal to all shell instances + preserved across shell restart

Set/Unset Variables

set <name> [<values>]
    -l  local scope
    -g  global scope
    -U  universal scope
    -e  erase variable
    -S  show verbose info
    -x  export to ENV
    -u  unexport from ENV

Lists

In fish all variables are lists (starting at index 1), but lists can't contain lists.

set foo a b c d

echo $foo[1]      # a
echo $foo[-1]     # d
echo $foo[2..3]   # b c
echo $foo[1 3]    # a c

$ can be seen as a dereference operator.

set foo a; set a 1337
echo $$foo  # outputs 1337

Cartesian product.

echo file.{h,cc}
# file.h file.cc

echo {a,b}{1,2}
# a1 b1 a2 b2

Special Variables (Lists)

$status      # exit code of last command
$pipestatus  # list of exit codes of pipe chain

$CMD_DURATION   # runtime of last command in ms

*PATH

Lists ending with PATH are automatically split at : when used and joined with : when exported to the environment.

set -x BLA_PATH a:b:c:d
echo $BLA_PATH              # a b c d
env | grep BLA_PATH         # BLA_PATH=a:b:c:d

Command Handling

# sub-commands are not run in quotes
echo "ls output: "(ls)

I/O redirection

# 'noclobber', fail if 'log' already exists
echo foo >? log

Control Flow

if / else

if grep foo bar
    # do sth
else if grep foobar bar
    # do sth else
else
    # do sth else
end

switch

switch (echo foo)
case 'foo*'
    # do start with foo
case bar dudel
    # do bar and dudel
case '*'
    # do else
end

while Loop

while true
    echo foo
end

for Loop

for f in (ls)
    echo $f
end

Functions

Function arguments are passed via $argv list.

function fn_foo
    echo $argv
end

Autoloading

When running a command fish attempts to autoload a function. The shell looks for <cmd>.fish in the locations defined by $fish_function_path and loads the function lazily if found.

This is the preferred way over monolithically defining all functions in a startup script.

Helper

functions         # list all functions
functions foo     # describe function 'foo'
functions -e foo  # erase function 'foo'

funced foo        # edit function 'foo'
                  # '-e vim' to edit in vim

Prompt

The prompt is defined by the output of the fish_prompt function.

function fish_prompt
    set -l cmd_ret $status
    echo "> "(pwd) $cmd_ret" "
end

Use set_color to manipulate terminal colors.

Useful Builtins

# history
history search <str>   # search history for <str>
history merge          # merge histories from fish sessions

# list
count $var            # count elements in list

# string
string split SEP STRING

Keymaps

  Shift-Tab ........... tab-completion with search
  Alt-Up / Alt-Down ... search history with token under the cursor
  Alt-l ............... list content of dir under cursor
  Alt-p ............... append '2>&1 | less;' to current cmdline

Debug

  status print-stack-trace .. prints function stacktrace (can be used in scripts)
  breakpoint ................ halts script execution and gives a shell (C-d |
                              exit to continue)

tmux(1)

Terminology:

  • session is a collection of pseudo terminals which can have multiple windows
  • window uses the entire screen and can be split into rectangular panes
  • pane is a single pseudo terminal instance

Tmux cli

# Session
tmux                        creates new session
tmux ls                     list running sessions
tmux kill-session -t <s>    kill running session <s>
tmux attach -t <s> [-d]     attach to session <s>, detach other clients [-d]
tmux detach -s <s>          detach all clients from session <s>

# Environment
tmux showenv -g             show global tmux environment variables
tmux setenv -g <var> <val>  set variable in global tmux env

# Misc
tmux source-file <file>     source config <file>
tmux lscm                   list available tmux commands
tmux show -g                show global tmux options
tmux display <msg>          display message in tmux status line

Scripting

# Session
tmux list-sessions -F '#S'           list running sessions, only IDs

# Window
tmux list-windows -F '#I' -t <s>     list window IDs for session <s>
tmux selectw -t <s>:<w>              select window <w> in session <s>

# Pane
tmux list-panes -F '#P' -t <s>:<w>   list pane IDs for window <w> in session <s>
tmux selectp -t <s>:<w>.<p>          select pane <p> in window <w> in session <s>

# Run commands
tmux send -t <s>:<w>.<p> "ls" C-m    send cmds/keys to pane
tmux run -t <p> <sh-cmd>             run shell command <sh-cmd> in background and report output on pane -t <p>

For example cycle through all panes in all windows in all sessions:

# bash
for s in $(tmux list-sessions -F '#S'); do
    for w in $(tmux list-windows -F '#I' -t $s); do
        for p in $(tmux list-panes -F '#P' -t $s:$w); do
            echo $s:$w.$p
        done
    done
done

Bindings

prefix d    detach from current session
prefix c    create new window
prefix w    open window list
prefix $    rename session
prefix ,    rename window
prefix .    move current window

Following bindings are specific to my tmux.conf:

C-s         prefix

# Panes
prefix s    horizontal split
prefix v    vertical split
prefix f    toggle maximize/minimize current pane

# Movement
prefix Tab  toggle between windows

prefix h    move to pane left
prefix j    move to pane down
prefix k    move to pane up
prefix l    move to pane right

# Resize
prefix C-h  resize pane left
prefix C-j  resize pane down
prefix C-k  resize pane up
prefix C-l  resize pane right

# Copy/Paste
prefix C-v    enter copy mode
prefix C-p    paste yanked text
prefix C-b    open copy-buffer list

# In Copy Mode
v     enable visual mode
y     yank selected text

Command mode

To enter command mode press prefix :.

Some useful commands are:

setw synchronize-panes on/off       enables/disables synchronized input to all panes
list-keys -t vi-copy                list keymaps for vi-copy mode

git(1)

Working areas

+-------------------+ --- stash -----> +-------+
| working directory |                  | stash |  // Shelving area.
+-------------------+ <-- stash pop -- +-------+
      |       ^
     add      |
      |     reset
      v       |
+-------------------+
|   staging area    |
+-------------------+
      |
    commit
      |
      v
+-------------------+
| local repository  |
+-------------------+
      |       ^
     push     |
      |     fetch /
      |      pull
      v       |
+-------------------+
| remote repository |
+-------------------+

Staging

  git add -p [<file>] ............ partial staging (interactive)

Remote

  git remote -v .................. list remotes verbose (with URLs)
  git remote show [-n] <remote> .. list info for <remote> (like remote HEAD,
                                   remote branches, tracking mapping)

Branching

  git branch [-a] ................ list available branches; -a to include
                                   remote branches
  git branch -vv ................. list branch & annotate with head sha1 &
                                   remote tracking branch
  git branch <bname> ............. create local branch with name <bname>
  git branch -d <bname> .......... delete local branch with name <bname>
  git checkout <bname> ........... switch to branch with name <bname>
  git checkout --track <branch> .. start to locally track a remote branch

  # Remote

  git push -u origin <rbname> ........ push local branch to origin (or other
                                       remote), and setup <rbname> as tracking
                                       branch
  git push origin --delete <rbname> .. delete branch <rbname> from origin (or
                                       other remote)

Tags

  git tag -a <tname> -m "descr" ......... creates an annotated tag (full object
                                          containing tagger, date, ...)
  git tag -l ............................ list available tags
  git checkout tags/<tname> ............. checkout specific tag
  git checkout tags/<tname> -b <bname> .. checkout specific tag in a new branch

  # Remote

  git push origin --tags .... push local tags to origin (or other remote)

Log & Commit History

  git log --oneline ......... shows log in single line per commit -> alias for
                              '--pretty=oneline --abbrev-commit'
  git log --graph ........... text based graph of commit history
  git log --decorate ........ decorate log with REFs

  git log -p <file> ......... show commit history + diffs for <file>
  git log --oneline <file> .. show commit history for <file> in compact format

Diff & Commit Info

  git diff <commit>..<commit> [<file>] .... show changes between two arbitrary
                                            commits. If one <commit> is omitted
                                            it is as if HEAD were specified.
  git diff -U$(wc -l <file>) <file> ....... shows complete file with diffs
                                            instead of usual diff snippets
  git diff --staged ....................... show diffs of staged files

  git show --stat <commit> ................ show files changed by <commit>
  git show <commit> [<file>] .............. show diffs for <commit>

  git show <commit>:<file> ................ show <file> at <commit>

Patching

  git format-patch <opt> <since>/<revision range>
    opt:
      -N ................... use [PATCH] instead of [PATCH n/m] in subject when
                             generating patch description (for patches spanning
                             multiple commits)
      --start-number <n> ... start output file generation with <n> as start
                             number instead of '1'
    since specifier:
      -3 .................. e.g: create a patch from last three commits
      <commit hash> ....... create patch with commits starting after <commit hash>

  git am <patch> ......... apply patch and create a commit for it

  git apply --stat <PATCH> ... see which files the patch would change
  git apply --check <PATCH> .. see if the patch can be applied cleanly
  git apply <PATCH> .......... apply the patch locally without creating a commit

  # eg: generate patches for each commit from initial commit on
  git format-patch -N $(git rev-list --max-parents=0 HEAD)

  # generate single patch file from a certain commit/ref
  git format-patch <COMMIT/REF> --stdout > my-patch.patch

Resetting

  git reset [opt] <ref|commit>
    opt:
      --mixed .................... resets index, but not working tree
      --hard ..................... matches the working tree and index to that
                                   of the tree being switched to; any changes
                                   to tracked files in the working tree since
                                   <commit> are lost
  git reset HEAD <file> .......... remove file from staging
  git reset --soft HEAD~1 ........ delete most recent commit but keep work
  git reset --hard HEAD~1 ........ delete most recent commit and delete work

Submodules

  git submodule add <url> [<path>] .......... add new submodule to current project
  git clone --recursive <url> ............... clone project and recursively all
                                              submodules (same as using
                                              'git submodule update --init
                                              --recursive' after clone)
  git submodule update --init --recursive ... checkout submodules recursively
                                              using the commit listed in the
                                              super-project (in detached HEAD)
  git submodule update --remote <submod> .... fetch & merge remote changes for
                                              <submod>, this will pull
                                              origin/HEAD or a branch specified
                                              for the submodule
  git diff --submodule ...................... show commits that are part of the
                                              submodule diff

Inspection

  git ls-tree [-r] <ref> .... show git tree for <ref>, -r to recursively ls sub-trees
  git show <obj> ............ show <obj>
  git cat-file -p <obj> ..... print content of <obj>

Revision Specifier

  HEAD ........ last commit
  HEAD~1 ...... last commit-1
  HEAD~N ...... last commit-N (linear backwards when in tree structure, check
                difference between HEAD^ and HEAD~)
  git rev-list --max-parents=0 HEAD ........... first commit

awk(1)

awk [opt] program [input]
    -F <sepstr>        field separator string (can be regex)
    program            awk program
    input              file or stdin if no file given

Input processing

Input is processed in two stages:

  1. Splitting input into a sequence of records. By default split at newline character, but can be changed via the builtin RS variable.
  2. Splitting a record into fields. By default split at whitespace, but can be changed via the builtin variable FS or command line option -F.

Fields are accessed as follows:

  • $0 whole record
  • $1 field one
  • $2 field two
  • ...
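
The two splitting stages can be sketched as follows. The input data and the function names name_and_uid and second_field are made up for illustration.

```shell
#!/bin/bash
# stage 2: split records into fields at ':' via -F
function name_and_uid() {
    awk -F: '{ print $1, $3 }'
}

# stage 1: split input into records at ';' instead of newline via RS
function second_field() {
    awk 'BEGIN { RS = ";" } { print $2 }'
}

printf 'root:x:0:0\nalice:x:1000:1000\n' | name_and_uid
# root 0
# alice 1000

printf 'a b;c d;' | second_field
# b
# d
```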

Program

An awk program is composed of pairs of the form:

pattern { action }

The program is run against each record in the input stream. If a pattern matches a record the corresponding action is executed and can access the fields.

INPUT
  |
  v
record ----> ∀ pattern matched
  |                   |
  v                   v
fields ----> run associated action

Any valid awk expr can be a pattern.

Special pattern

awk provides two special patterns, BEGIN and END, which can be used multiple times. Actions with those patterns are executed exactly once.

  • BEGIN actions are run before processing the first record
  • END actions are run after processing the last record
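
A minimal sketch combining BEGIN, a per-record action without pattern, and END. The function name sum_first_field and the input are hypothetical.

```shell
#!/bin/bash
# sum up the first field of all records
function sum_first_field() {
    awk '
        BEGIN { print "start" }       # runs once, before the first record
              { sum += $1 }           # no pattern: runs for every record
        END   { print "sum", sum }    # runs once, after the last record
    '
}

printf '3\n4\n5\n' | sum_first_field
# start
# sum 12
```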

Special variables

  • RS record separator: first char is the record separator, by default a newline
  • FS field separator: regex to split records into fields, by default a single space (which splits at runs of whitespace)
  • NR number record: number of current record
  • NF number fields: number of fields in the current record
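
As an illustration of NR and NF, the following sketch prints the record number, the field count and the last field of each record. The function name describe_records and the input are made up.

```shell
#!/bin/bash
function describe_records() {
    # NR: number of the current record, NF: number of fields, $NF: last field
    awk '{ print NR ": " NF " fields, last=" $NF }'
}

printf 'a b c\nd e\n' | describe_records
# 1: 3 fields, last=c
# 2: 2 fields, last=e
```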

Special statements & functions

  • printf "fmt", args...

    Print format string, args are comma separated.

    • %s string
    • %d decimal
    • %x hex
    • %f float

    Width can be specified as %Ns, this reserves N chars for a string. For floats one can use %N.Mf, where N is the total width (including the . and the fractional digits) and M is the number of fractional digits.

  • sprintf("fmt", expr, ...)

    Format the expressions according to the format string. Similar to printf, but this is a function and the return value can be assigned to a variable.

  • strftime("fmt")

    Print time stamp formatted by fmt.

    • %Y full year (eg 2020)
    • %m month (01-12)
    • %d day (01-31)
    • %F alias for %Y-%m-%d
    • %H hour (00-23)
    • %M minute (00-59)
    • %S second (00-59)
    • %T alias for %H:%M:%S
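
A short sketch of the width and precision specifiers with printf and sprintf. The function names fmt_row and hex_demo are hypothetical; strftime is left out here since it is not available in every awk implementation.

```shell
#!/bin/bash
# left-align the string in 6 chars, print the float in 8 chars with 3 decimals
function fmt_row() {
    awk '{ s = sprintf("%-6s|%8.3f|", $1, $2); print s }'
}

# print a number as decimal and hex
function hex_demo() {
    awk 'BEGIN { printf "%d is 0x%x\n", 255, 255 }'
}

echo 'pi 3.14159' | fmt_row   # pi    |   3.142|
hex_demo                      # 255 is 0xff
```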

Examples

Filter records

awk 'NR%2 == 0 { print $0 }' <file>

The pattern NR%2 == 0 matches every second record and the action { print $0 } prints the whole record.

Access last fields in records

echo 'a b c d e f' | awk '{ print $NF $(NF-1) }'

Access last fields with arithmetic on the NF number of fields variable.

Capture in variables

# /proc/<pid>/status
#   Name:    cat
#   ...
#   VmRSS:   516 kB
#   ...

for f in /proc/*/status; do
    cat $f | awk '
             /^VmRSS/ { rss = $2/1024 }
             /^Name/ { name = $2 }
             END { printf "%16s %6d MB\n", name, rss }';
done | sort -k2 -n

We capture values from VmRSS and Name into variables and print them at the END once processing all records is done.

Run shell command and capture output

cat /proc/1/status | awk '
                     /^Pid/ {
                        "ps --no-header -o user " $2 | getline user;
                         print user
                     }'

We build a ps command line and capture the first line of the process's output in the user variable and then print it.

emacs(1)

help

  C-h ?         list available help modes
  C-h e         show message output (`*Messages*` buffer)
  C-h f         describe function
  C-h v         describe variable
  C-h w         describe which key invokes function (where-is)
  C-h c <KEY>   print command bound to <KEY>
  C-h k <KEY>   describe command bound to <KEY>
  C-h b         list buffer local key-bindings
  <kseq> C-h    list possible key-bindings with <kseq>
                eg C-x C-h -> list key-bindings beginning with C-x

package manager

  key    fn                          description
------------------------------------------------
         package-refresh-contents    refresh package list
         package-list-packages       list available/installed packages
                                     `U x` to mark packages for Upgrade & eXecute

window

  key      fn                      description
----------------------------------------------
  C-x 0    delete-window           kill focused window
  C-x 1    delete-other-windows    kill all other windows
  C-x 2    split-window-below      split horizontal
  C-x 3    split-window-right      split vertical
  C-x o    other-window            other window (cycle)

buffer

  key        fn                   description
---------------------------------------------
  C-x C-q    read-only-mode       toggle read-only mode for buffer
  C-x k      kill-buffer          kill buffer
  C-x s      save-some-buffers    save buffer
  C-x C-w    write-file           write buffer (save as)
  C-x b      switch-to-buffer     switch buffer
  C-x C-b    list-buffers         buffer list

ibuffer

Builtin advanced buffer selection mode

  key        fn            description
--------------------------------------
             ibuffer       enter buffer selection

  h                        ibuffer help

  o                        open buffer in other window
  C-o                      open buffer in other window keep focus in ibuffer

  s a                      sort by buffer name
  s f                      sort by file name
  s v                      sort by last viewed
  s m                      sort by major mode
  ,                        cycle sorting mode

  =                        compare buffer against file on disk (if file is dirty `*`)

  /m                       filter by major mode
  /n                       filter by buffer name
  /f                       filter by file name
  //                       remove all filters

  /g                       create filter group
  /\                       remove all filter groups

isearch

  key    fn                           description
-------------------------------------------------
  C-s    isearch-forward              search forward from current position (C-s to go to next match)
  C-r    isearch-backward             search backwards from current position (C-r to go to next match)
  C-w    isearch-yank-word-or-char    feed next word to current search (extend)
  M-p    isearch-ring-retreat         previous search input
  M-n    isearch-ring-advance         next search input

occur

  key      fn           description
-----------------------------------
  M-s o    occur        get matches for regexp in buffer
                        use during `isearch` to use current search term

  C-n                   goto next line
  C-p                   goto previous line
  o                     open match in other window
  C-o                   open match in other window keep focus in occur buffer

  key      fn                                 description
---------------------------------------------------------
           multi-occur-in-matching-buffers    run occur in buffers matching regexp

grep

  key    fn           description
-----------------------------------
         rgrep        recursive grep
         find-grep    run find-grep result in *grep* buffer

  n/p                 navigate next/previous match in *grep* buffer
  q                   quit *grep* buffer

yank/paste

  key         fn                  description
---------------------------------------------
  C-<SPACE>   set-mark-command    set start mark to select text
  M-w         kill-ring-save      copy selected text
  C-w         kill-region         kill selected text
  C-y         yank                paste last killed/copied text
  M-y         yank-pop            cycle through kill-ring (only after paste)

register

  key             fn                 description
------------------------------------------------
  C-x r s <reg>   copy-to-register   save region in register <reg>
  C-x r i <reg>   insert-register    insert content of register <reg>

block/rect

  key          fn                    description
------------------------------------------------
  C-x <SPC>    rectangle-mark-mode   activate rectangle-mark-mode
               string-rectangle      insert text in marked rect

mass edit

  key       fn                       description
------------------------------------------------
  C-x h     mark-whole-buffer        mark whole buffer
            delete-matching-lines    delete lines matching regex
  M-%       query-replace            search & replace
  C-M-%     query-replace-regexp     search & replace regex

narrow

  key       fn                    description
---------------------------------------------
  C-x n n   narrow-to-region      show only focused region (narrow)
  C-x n w   widen                 show whole buffer (wide)

org

  key              fn   description
------------------------------------
  M-up/M-down           re-arrange items in same hierarchy
  M-left/M-right        change item hierarchy
  C-RET                 create new item below current
  C-S-RET               create new TODO item below current
  S-left/S-right        cycle TODO states

org source

  key       fn     description
------------------------------
  <s TAB           generate a source block
  C-c '            edit source block (in lang specific buffer)
  C-c C-c          eval source block

company

  key         fn   description
-------------------------------
  C-s              search through completion candidates
  C-o              filter completion candidates based on search term
  <f1>             get doc for completion candidate
  M-<digit>        select completion candidate

tags

To generate etags using ctags

  ctags -R -e .         generate emacs tag file (important `-e`)

Navigate using tags

  key      fn                       description
-----------------------------------------------
           xref-find-definitions    find definition of tag
           xref-find-apropos        find symbols matching regexp
           xref-find-references     find references of tag

lisp

  key   fn        description
------------------------------
        ielm      open interactive elisp shell

In lisp-interaction-mode (*scratch* buffer by default)

  key              fn                        description
--------------------------------------------------------
  C-j              eval-print-last-sexp      evaluate & print preceding lisp expr

  C-x C-e          eval-last-sexp            evaluate lisp expr
  C-u C-x C-e      eval-last-sexp            evaluate & print

ido

Builtin fuzzy completion mode (eg buffer select, dired, ...).

  key              fn          description
------------------------------------------
                  ido-mode     toggle ido mode
  <Left>/<Right>               cycle through available completions
  <RET>                        select completion

evil

  key    fn    description
--------------------------
  C-z          toggle emacs/evil mode
  C-^          toggle between previous and current buffer
  C-p          after paste cycle kill-ring back
  C-n          after paste cycle kill-ring forward

dired

  key    fn    description
--------------------------
  i            open sub-dir in same buffer
  +            create new directory
  C            copy file/dir

  q            quit

gpg(1)

gpg
  -o|--output                 Specify output file
  -a|--armor                  Create ascii output
  -u|--local-user <name>      Specify key for signing
  -r|--recipient              Encrypt for user

Generate new keypair

gpg --full-generate-key

List keys

gpg -k / --list-keys              # public keys
gpg -K / --list-secret-keys       # secret keys

Edit keys

gpg --edit-key <KEY ID>

Gives prompt to modify KEY ID, common commands:

help         show help
save         save & quit

list         list keys and user IDs
key <N>      select subkey <N>
uid <N>      select user ID <N>

expire       change expiration of selected key

adduid       add user ID
deluid       delete selected user ID

addkey       add subkey
delkey       delete selected subkey

Export & Import Keys

gpg --export --armor --output <KEY.PUB> <KEY ID>
gpg --export-secret-keys --armor --output <KEY.PRIV> <KEY ID>
gpg --import <FILE>

Search & Send keys

gpg --keyserver <SERVER> --send-keys <KEY ID>
gpg --keyserver <SERVER> --search-keys <KEY ID>

Encrypt (passphrase)

Encrypt file using passphrase and write encrypted data to <file>.gpg.

gpg --symmetric <file>

# Decrypt using passphrase
gpg -o <file> --decrypt <file>.gpg

Encrypt (public key)

Encrypt file with public key of specified recipient and write encrypted data to <file>.gpg.

gpg --encrypt -r foo@bar.de <file>

# Decrypt at foo's side (private key required)
gpg -o <file> --decrypt <file>.gpg

Signing

Generate a signed file and write to <file>.gpg.

# Sign with private key of foo@bar.de
gpg --sign -u foo@bar.de <file>

# Verify with public key of foo@bar.de
gpg --verify <file>

# Extract content from signed file
gpg -o <file> --decrypt <file>.gpg

Without -u, gpg signs with the first private key listed by gpg -K.

Files can also be signed and encrypted at once, gpg will first sign the file and then encrypt it.

gpg --sign --encrypt -r <recipient> <file>

Signing (detached)

Generate a detached signature and write to <file>.asc. Send <file>.asc along with <file> when distributing.

gpg --detach-sign --armor -u foo@bar.de <file>

# Verify
gpg --verify <file>.asc <file>

Without -u, gpg signs with the first private key listed by gpg -K.

Abbreviations

  • sec secret key
  • ssb secret subkey
  • pub public key
  • sub public subkey

Keyservers

  • http://pgp.mit.edu
  • http://keyserver.ubuntu.com
  • hkps://pgp.mailbox.org

gdb(1)

CLI

  gdb [opts] [prg [-c coredump | -p pid]]
  gdb [opts] --args prg <prg-args>
    opts:
      -p <pid>        attach to pid
      -c <coredump>   use <coredump>
      -x <file>       execute script <file> before prompt
      -ex <cmd>       execute command <cmd> before prompt
      --tty <tty>     set I/O tty for debuggee

Interactive usage

Misc

  tty <tty>
          Set <tty> as tty for debuggee.
          Make sure nobody reads from target tty, easiest is to spawn a shell
          and run following in target tty:
          > while true; do sleep 1024; done

  sharedlibrary [<regex>]
          Load symbols of shared libs loaded by debuggee. Optionally use <regex>
          to filter libs for symbol loading.

  display [/FMT] <expr>
          Print <expr> every time debuggee stops. Eg print next instr, see
          examples below.

  undisplay [<num>]
          Delete display expressions either all or one referenced by <num>.

  info display
          List display expressions.

Breakpoints

  break [-qualified] <sym> thread <tnum>
          Set a breakpoint only for a specific thread.
          -qualified: Treat <sym> as fully qualified symbol (quite handy to set
          breakpoints on C symbols in C++ contexts)

  break <sym> if <cond>
          Set conditional breakpoint (see examples below).

  delete [<num>]
          Delete breakpoint either all or one referenced by <num>.

  info break
          List breakpoints.

  cond <bp> <cond>
          Make existing breakpoint <bp> conditional with <cond>.

  tbreak
          Set temporary breakpoint, will be deleted when hit.
          Same syntax as `break`.

  rbreak <regex>
          Set breakpoints matching <regex>, where matching internally is done
          on: .*<regex>.*

  command [<bp_list>]
          Define commands to run after breakpoint hit. If <bp_list> is not
          specified attach command to last created breakpoint. Command block
          terminated with 'end' token.

          <bp_list>: Space separated list, eg 'command 2 5-8' to run command
          for breakpoints: 2,5,6,7,8.

Watchpoints

  watch [-location|-l] <expr> [thread <tnum>]
          Create a watchpoint for <expr>, will break if <expr> is written to.
          Watchpoints respect scope of variables, -l can be used to watch the
          memory location instead.
  rwatch ...
          Sets a read watchpoint, will break if <expr> is read from.
  awatch ...
          Sets an access watchpoint, will break if <expr> is written to or read
          from.

Inspection

  info functions [<regex>]
          List functions matching <regex>. List all functions if no <regex>
          provided.

  info variables [<regex>]
          List variables matching <regex>. List all variables if no <regex>
          provided.

Signal handling

  info handle [<signal>]
          Print how to handle <signal>. If no <signal> specified print for all
          signals.

  handle <signal> <action>
          Configure how gdb handles <signal> sent to debuggee.
          <action>:
            stop/nostop       Catch signal in gdb and break.
            print/noprint     Print message when gdb catches signal.
            pass/nopass       Pass signal down to debuggee.

  catch signal <signal>
          Create a catchpoint for <signal>.
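
As a hedged illustration of the `handle` actions above, the following sketch writes a one-line gdb script that lets SIGUSR1 pass through to the debuggee without stopping or printing a message. The file name `sigusr1.gdb` and the choice of SIGUSR1 are assumptions made for this example only.

```shell
# Write a gdb script (hypothetical file name) configuring SIGUSR1 handling:
# do not stop, do not print a message, but still deliver it to the debuggee.
cat > sigusr1.gdb <<'EOF'
handle SIGUSR1 nostop noprint pass
EOF

# Show what was written.
cat sigusr1.gdb
```

The script could then be loaded with `gdb -x sigusr1.gdb ...`, and `info handle SIGUSR1` should show the resulting settings.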

Source file locations

  dir <path>
          Add <path> to the beginning of the search path for source files.

  show dir
          Show current search path.

  set substitute-path <from> <to>
          Add substitution rule checked during source file lookup.

  show substitute-path
          Show current substitution rules.

Configuration

  set follow-fork-mode <child | parent>
          Specify which process to follow when debuggee makes a fork(2)
          syscall.

  set pagination <on | off>
          Turn on/off gdb's pagination.

  set breakpoint pending <on | off | auto>
          on: always set pending breakpoints.
          off: error when trying to set pending breakpoints.
          auto: interactively query user to set breakpoint.

  set print pretty <on | off>
          Turn on/off pretty printing of structures.

  set logging <on | off>
          Enable output logging to file (default gdb.txt).

  set logging file <fname>
          Change output log file to <fname>

  set logging redirect <on | off>
          on: only log to file.
          off: log to file and tty.

User commands (macros)

Gdb allows to create & document user commands as follows:

  define <cmd>
    # cmds
  end

  document <cmd>
    # docu
  end

To get all user commands or documentations one can use:

  help user-defined
  help <cmd>
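
A minimal sketch of the `define`/`document` pattern above, written as a shell snippet that generates a gdb script. The command name `show-pc` and the file name `usercmd.gdb` are hypothetical, picked for illustration.

```shell
# Generate a gdb script that defines and documents a user command.
# `show-pc` is a made-up name; `$pc` is gdb's generic program counter register.
cat > usercmd.gdb <<'EOF'
define show-pc
  x/i $pc
end

document show-pc
Print the instruction at the current program counter.
end
EOF

# Show what was written.
cat usercmd.gdb
```

With gdb installed, something like `gdb --batch -x usercmd.gdb -ex 'help show-pc'` should print the documentation text.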

Hooks

Gdb allows to create two types of command hooks:

  • hook- will be run before <cmd>
  • hookpost- will be run after <cmd>

  define hook-<cmd>
    # cmds
  end

  define hookpost-<cmd>
    # cmds
  end
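
The hook templates above can be tried with a harmless command such as `echo`. The following sketch generates a gdb script that wraps every `echo` in markers; the file name `hooks.gdb` and the marker strings are assumptions for this example.

```shell
# Generate a gdb script with a pre- and a post-hook for the `echo` command.
cat > hooks.gdb <<'EOF'
# Runs before every `echo` command.
define hook-echo
  echo <<<---
end

# Runs after every `echo` command.
define hookpost-echo
  echo --->>>\n
end
EOF

# Show what was written.
cat hooks.gdb
```

After loading it with `gdb -x hooks.gdb`, each `echo` typed at the prompt should be surrounded by the two markers.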

Examples

Automatically print next instr

Whenever the debuggee stops, automatically print the memory at the current instruction pointer ($rip on x86) formatted as an instruction (/i).

  # rip - x86
  display /i $rip

  # step instruction, after the step the next instruction is automatically printed
  si

Conditional breakpoints

Create conditional breakpoints for a function void foo(int i) in the debuggee.

  # Create conditional breakpoint
  b foo if i == 42

  b foo     # would create bp 2
  # Make existing breakpoint conditional
  cond 2 i == 7

Catch SIGSEGV and execute commands

This creates a catchpoint for the SIGSEGV signal and attaches the commands to it.

  catch signal SIGSEGV
  command
    bt
    c
  end

Run backtrace on thread 1 (batch mode)

  gdb --batch -ex 'thread 1' -ex 'bt' -p <pid>

Script gdb for automating debugging sessions

To script gdb add commands into a file and pass it to gdb via -x. For example create run.gdb:

  set pagination off

  break mmap
  command
    info reg rdi rsi rdx
    bt
    c
  end

  #initial drop
  c

This script can be used as:

  gdb --batch -x ./run.gdb -p <pid>

Known Bugs

Workaround command + finish bug

When using finish inside a command block, commands after finish are not executed. To work around that bug, one can create a user-defined command which calls finish and use it inside the command block.

  define handler
    bt
    finish
    info reg rax
  end

  command
    handler
  end

gdbserver(1)

CLI

  gdbserver [opts] comm prog [args]
    opts:
      --disable-randomization
      --no-disable-randomization

    comm:
      host:port
      tty

Example

# Start gdbserver.
gdbserver localhost:1234 /bin/ls

# Attach gdb.
gdb -ex 'target remote localhost:1234'

radare2(1)

print


  pd <n> [@ <addr>]     # print disassembly for <n> instructions
                        # with optional temporary seek to <addr>

flags

  fs            # list flag-spaces
  fs <fs>       # select flag-space <fs>
  f             # print flags of selected flag-space

help

  ?*~<kw>       # '?*' list all commands and '~' grep for <kw>
  ?*~...        # '..' less mode /'...' interactive search

relocation

  > r2 -B <baddr> <exe>         # open <exe> mapped to addr <baddr>
  oob <baddr>                   # reopen current file at <baddr>

Examples

Patch file (alter bytes)

  > r2 [-w] <file>
  oo+           # re-open for write if -w was not passed
  s <addr>      # seek to position
  wv <data>     # write 4 byte (dword)

Assemble / Disassemble (rasm2)

  rasm2 -L      # list supported archs

  > rasm2 -a x86 'mov eax, 0xdeadbeef'
  b8efbeadde

  > rasm2 -a x86 -d "b8efbeadde"
  mov eax, 0xdeadbeef

qemu(1)

All the examples & notes use qemu-system-x86_64 but in most cases this can be swapped with the system emulator for other architectures.

Keybindings

Graphic mode:

Ctrl+Alt+g         release mouse capture from VM

Ctrl+Alt+1         switch to display of VM
Ctrl+Alt+2         switch to qemu monitor

No graphic mode:

Ctrl+a h           print help
Ctrl+a x           exit emulator
Ctrl+a c           switch between monitor and console

VM config snippet

Following command-line gives a good starting point to assemble a VM:

qemu-system-x86_64                 \
    -cpu host -enable-kvm -smp 4   \
    -m 8G                          \
    -vga virtio -display sdl,gl=on \
    -boot menu=on                  \
    -cdrom <iso>                   \
    -hda <disk>                    \
    -device qemu-xhci,id=xhci      \
    -device usb-host,bus=xhci.0,vendorid=0x05e1,productid=0x0408,id=capture-card

CPU & RAM

# Emulate host CPU in guest VM, enabling all supported host features (requires KVM).
# List available CPUs `qemu-system-x86_64 -cpu help`.
-cpu host

# Enable KVM instead of software emulation.
-enable-kvm

# Configure number of guest CPUs.
-smp <N>

# Configure size of guest RAM.
-m 8G

Graphic & Display

# Use sdl window as display and enable openGL context.
-display sdl,gl=on

# Use vnc server as display (eg on display `:42` here).
-display vnc=localhost:42

# Configure virtio as 3D video graphic accelerator (requires virgl in guest).
-vga virtio

Boot Menu

# Enables boot menu to select boot device (enter with `ESC`).
-boot menu=on

Block devices

# Attach cdrom drive with iso to a VM.
-cdrom <iso>

# Attach disk drive to a VM.
-hda <disk>

# Generic way to configure & attach a drive to a VM.
-drive file=<file>,format=qcow2

Create a disk with qemu-img

To create a qcow2 disk (qemu copy-on-write) of size 10G:

qemu-img create -f qcow2 disk.qcow2 10G

The disk does not contain any partitions or a partition table. We can partition and format the disk from within the guest, for example:

# Create `gpt` partition table.
sudo parted /dev/sda mktable gpt

# Create two equally sized primary partitions.
sudo parted /dev/sda mkpart primary 0% 50%
sudo parted /dev/sda mkpart primary 50% 100%

# Create filesystem on each partition.
sudo mkfs.ext3 /dev/sda1
sudo mkfs.ext4 /dev/sda2

lsblk -f /dev/sda
  NAME   FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
  sda
  ├─sda1 ext3         ....
  └─sda2 ext4         ....

USB

Host Controller

# Add XHCI USB controller to the VM (supports USB 3.0, 2.0, 1.1).
# `id=xhci` creates a usb bus named `xhci`.
-device qemu-xhci,id=xhci

USB Device

# Pass-through USB device from host identified by vendorid & productid and
# attach to usb bus `xhci.0` (defined with controller `id`).
-device usb-host,bus=xhci.0,vendorid=0x05e1,productid=0x0408

Debugging

# Open gdbstub on tcp `<port>` (`-s` shorthand for `-gdb tcp::1234`).
-gdb tcp::<port>

# Freeze guest CPU at startup and wait for debugger connection.
-S

IO redirection

# Create raw tcp server for `serial IO` and wait until a client connects
# before executing the guest.
-serial tcp:localhost:12345,server,wait

# Create telnet server for `serial IO` and wait until a client connects
# before executing the guest.
-serial telnet:localhost:12345,server,wait

# Configure redirection for the QEMU `monitor`, arguments similar to `-serial`
# above.
-monitor ...

In server mode use nowait to execute guest without waiting for a client connection.

Network

# Redirect host tcp port `1234` to guest port `4321`.
-nic user,hostfwd=tcp:localhost:1234-:4321

Shared drives

# Attach a `virtio-9p-pci` device to the VM.
# The guest requires 9p support and can mount the shared drive as:
#   mount -t 9p -o trans=virtio someName /mnt
-virtfs local,id=someName,path=<someHostPath>,mount_tag=someName,security_model=none

Debug logging

# List debug items.
-d help

# Write debug log to file instead of stderr.
-D <file>

# Examples
-d in_asm       Log executed guest instructions.

Tracing

# List name of all trace points.
-trace help

# Enable trace points matching pattern and optionally write trace to file.
-trace <pattern>[,file=<file>]

# Enable trace points for all events listed in the <events> file.
# File must contain one event/pattern per line.
-trace events=<events>

VM snapshots

VM snapshots require that there is at least one qcow2 disk attached to the VM (VM Snapshots).

Commands for qemu Monitor or QMP:

# List available snapshots.
info snapshots

# Create/Load/Delete snapshot with name <tag>.
savevm <tag>
loadvm <tag>
delvm <tag>

The snapshot can also be directly specified when invoking qemu as:

qemu-system-x86_64 \
    -loadvm <tag>  \
    ...

VM Migration

Online migration example:

# Start machine 1 on host ABC.
qemu-system-x86_64 -monitor stdio -cdrom <iso>

# Prepare machine 2 on host DEF as migration target.
# Listen for any connection on port 12345.
qemu-system-x86_64 -monitor stdio -incoming tcp:0.0.0.0:12345

# Start migration from the machine 1 monitor console.
(qemu) migrate tcp:DEF:12345

Save to external file example:

# Start machine 1.
qemu-system-x86_64 -monitor stdio -cdrom <iso>

# Save VM state to file.
(qemu) migrate "exec:gzip -c > vm.gz"

# Load VM from file.
qemu-system-x86_64 -monitor stdio -incoming "exec: gzip -d -c vm.gz"

The migration source machine and the migration target machine should be launched with the same parameters.

Appendix: Direct Kernel boot

Example command line to directly boot a Kernel with an initrd ramdisk.

qemu-system-x86_64                                                     \
    -cpu host                                                          \
    -enable-kvm                                                        \
    -kernel <dir>/arch/x86/boot/bzImage                                \
    -append "earlyprintk=ttyS0 console=ttyS0 nokaslr init=/init debug" \
    -initrd <dir>/initramfs.cpio.gz                                    \
    ...

Instructions to build a minimal Kernel and initrd.

pacman(1)

Remote package repositories

pacman -Sy              refresh package database
pacman -S <pkg>         install pkg
pacman -Ss <regex>      search remote package database
pacman -Si <pkg>        get info for pkg
pacman -Su              upgrade installed packages
pacman -Sc              clean local package cache

Remove packages

pacman -Rsn <pkg>               uninstall package and unneeded deps + config files

Local package database

Local package database of installed packages.

pacman -Q               list all installed packages
pacman -Qs <regex>      search local package database
pacman -Ql <pkg>        list files installed by pkg
pacman -Qo <file>       query package that owns file
pacman -Qe              only list explicitly installed packages

Local file database

Local file database, which allows searching for packages owning certain files. It also covers packages that are not installed, but the database must be synced first.

pacman -Fy              refresh file database
pacman -Fl <pkg>        list files in pkg (pkg need not be installed)
pacman -Fx <regex>      search file database for files matching regex

Hacks

Uninstall all orphaned packages (including config files) that were installed as dependencies.

pacman -Rsn $(pacman -Qdtq)

List explicitly installed packages that are not required as dependency by any package and sort by size.

pacman -Qetq | xargs pacman -Qi |
    awk '/Name/ { name=$3 }
         /Installed Size/ { printf "%8.2f%s %s\n", $4, $5, name }' |
    sort -h
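
To see what the awk and sort stages do without touching a real system, the pipeline can be fed canned text shaped like `pacman -Qi` output. This is only a sketch: the package names `foo`/`bar` and their sizes are made up, and `sort -h` is assumed to be available (as in GNU coreutils).

```shell
# Canned text in the shape of `pacman -Qi` output (made-up packages/sizes).
sample_info() {
    cat <<'EOF'
Name            : foo
Installed Size  : 12.50 MiB
Name            : bar
Installed Size  : 340.00 KiB
EOF
}

# Same awk + sort stages as above: remember the package name, print
# "<size><unit> <name>" on each size line, then sort human-readable sizes.
sample_info |
    awk '/Name/ { name=$3 }
         /Installed Size/ { printf "%8.2f%s %s\n", $4, $5, name }' |
    sort -h
```

The awk program relies on the `Name` line preceding the `Installed Size` line of the same package, so each size is printed next to the most recently seen name; the smaller package should end up first after sorting.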

dot(1)

Online playground

Example dot file to copy & paste from.

Can be rendered to svg with the following command.

dot -T svg -o g.svg g.dot

Example dot file.

// file: g.dot
digraph {
    // Render ranks from left to right.
    rankdir=LR
    // Make background transparent.
    bgcolor=transparent

    // Global node attributes.
    node [shape=box]
    // Global edge attributes.
    edge [style=dotted,color=red]

    // Add nodes & edge.
    stage1 -> stage2
    // Add multiple edges at once.
    stage2 -> { stage3_1, stage3_2 }
    // Add edge with custom attributes.
    stage3_2 -> stage4 [label="some text"]

    // Set custom attributes for specific node.
    stage4 [color=green,fillcolor=lightgray,style="filled,dashed",label="s4"]

    // Create a subgraph. This can be used to group nodes/edges or as scope for
    // global node/edge attributes.
    // If the name starts with 'cluster' a border is drawn.
    subgraph cluster_1 {
        stage5_1
        stage5_2
    }

    // Add some edges to subgraph nodes.
    stage3_1 -> { stage5_1, stage5_2 }
}

Rendered svg file. g.svg

References

Resource analysis & monitor

lsof(8)

lsof
  -r <s> ..... repeatedly execute command every <s> seconds
  -a ......... AND selection filters instead of ORing (OR: default)
  -p <pid> ... filter by <pid>
  +fg ........ show file flags for file descriptors
  -n ......... don't convert network addr to hostnames
  -P ......... don't convert network port to service names
  -i <@h[:p]>. show connections to h (hostname|ip addr) with optional port p
  -s <p:s> ... in conjunction with '-i' filter for protocol <p> in state <s>
  -U ......... show unix domain sockets ('@' indicates abstract sock name, see unix(7))
file flags:
  R/W/RW ..... read/write/read-write
  CR ......... create
  AP ......... append
  TR ......... truncate
-s protocols
  TCP, UDP

-s states (TCP)
  CLOSED, IDLE, BOUND, LISTEN, ESTABLISHED, SYN_SENT, SYN_RCDV,
  CLOSE_WAIT, FIN_WAIT1, CLOSING, LAST_ACK, FIN_WAIT_2, TIME_WAIT

-s states (UDP)
  Unbound, Idle

Examples

File flags

Show open files with file flags for process:

lsof +fg -p <pid>

Open TCP connections

Show open tcp connections for $USER:

lsof -a -u $USER -i TCP

Note: -a ands the results. If -a is not given all open files matching $USER and all tcp connections are listed (ored).

Open connection to specific host

Show open connections to localhost for $USER:

lsof -a -u $USER -i @localhost

Open connection to specific port

Show open connections to port :1234 for $USER:

lsof -a -u $USER -i :1234

IPv4 TCP connections in ESTABLISHED state

lsof -i 4TCP -s TCP:ESTABLISHED

ss(8)

ss [option] [filter]
[option]
  -p ..... Show process using socket
  -l ..... Show sockets in listening state
  -4/-6 .. Show IPv4/6 sockets
  -x ..... Show unix sockets
  -n ..... Show numeric ports (no resolve)
  -O ..... Oneline output per socket
[filter]
  dport/sport PORT .... Filter for destination/source port
  dst/src ADDR ........ Filter for destination/source address

  and/or .............. Logic operator
  ==/!= ............... Comparison operator

  (EXPR) .............. Group exprs

Examples

Show all tcp IPv4 sockets connecting to port 443:

ss -4 'dport 443'

Show all tcp IPv4 sockets that don't connect to port 443 or connect to address 1.2.3.4.

ss -4 'dport != 443 or dst 1.2.3.4'

pidstat(1)

pidstat [opt] [interval] [count]
  -U [user]     show username instead of UID, optionally only show for user
  -r            memory statistics
  -d            I/O statistics
  -h            single line per process and no lines with average

Page fault and memory utilization

pidstat -r -p <pid> [interval] [count]
minor_pagefault: Happens when the page needed is already in memory but not
                 allocated to the faulting process, in that case the kernel
                 only has to create a new page-table entry pointing to the
                 shared physical page (not required to load a memory page from
                 disk).

major_pagefault: Happens when the page needed is NOT in memory, the kernel
                 has to create a new page-table entry and populate the
                 physical page (required to load a memory page from disk).
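
On Linux the cumulative minor/major fault counters of a process are also exposed in /proc/<pid>/stat (fields 10 and 12, see proc(5)). The helper below is a rough sketch for peeking at them directly; the function name `faults` is hypothetical, and it assumes the command name in field 2 contains no spaces.

```shell
# Print minor and major page fault counters of a process from /proc/<pid>/stat.
# Fields 10 (minflt) and 12 (majflt) as described in proc(5); assumes the
# command name in field 2 contains no spaces, otherwise the fields shift.
faults() {
    awk '{ printf "minflt=%s majflt=%s\n", $10, $12 }' "/proc/$1/stat"
}

# Example: counters of the current shell (only where /proc is available).
if [ -r "/proc/$$/stat" ]; then
    faults $$
fi
```

These are totals since process start, whereas `pidstat -r` reports rates per interval.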

I/O statistics

pidstat -d -p <pid> [interval] [count]

pgrep(1)

pgrep [opts] <pattern>
  -n         only list newest matching process
  -u <usr>   only show matching for user <usr>
  -l         additionally list command
  -a         additionally list command + arguments

Debug newest process

For example attach gdb to newest zsh process from $USER.

gdb -p $(pgrep -n -u $USER zsh)

pmap(1)

pmap <pid>
    Dump virtual memory map of process.
    Compared to /proc/<pid>/maps it shows the size of the mappings.

pstack(1)

pstack <pid>
    Dump stack for all threads of process.

Trace and Profile

strace(1)

strace [opts] [prg]
  -f .......... follow child processes on fork(2)
  -p <pid> .... attach to running process
  -s <size> ... max string size, truncate if longer (default: 32)
  -e <expr> ... expression for trace filtering
  -o <file> ... log output into <file>
  -c .......... dump syscall statistics at the end
  -k .......... dump stack trace for each syscall
  -P <path> ... only trace syscalls accessing path
  -y .......... print paths for FDs
  -tt ......... print absolute timestamp (with us precision)
  -r .......... print relative timestamp
<expr>:
  trace=syscall[,syscall] .... trace only syscall listed
  trace=file ................. trace all syscall that take a filename as arg
  trace=process .............. trace process management related syscalls
  trace=signal ............... trace signal related syscalls
  signal ..................... trace signals delivered to the process

Examples

Trace open(2) & socket(2) syscalls for a running process + child processes:

strace -f -e trace=open,socket -p <pid>

Trace signals delivered to a running process:

strace -e signal -e 'trace=!all' -p <pid>

ltrace(1)

ltrace [opts] [prg]
  -f .......... follow child processes on fork(2)
  -p <pid> .... attach to running process
  -o <file> ... log output into <file>
  -l <filter> . show who calls into lib matched by <filter>
  -C .......... demangle

Example

List which program/libs call into libstdc++:

ltrace -l '*libstdc++*' -C -o ltrace.log ./main

perf(1)

perf list      show supported hw/sw events

perf stat
  -p <pid> .. show stats for running process
  -I <ms> ... show stats periodically over interval <ms>
  -e <ev> ... filter for events

perf top
  -p <pid> .. show stats for running process
  -F <hz> ... sampling frequency
  -K ........ hide kernel threads

perf record
  -p <pid> ............... record stats for running process
  -F <hz> ................ sampling frequency
  --call-graph <method> .. [fp, dwarf, lbr] method how to capture backtrace
                           fp   : use frame-pointer, need to compile with
                                  -fno-omit-frame-pointer
                           dwarf: use .cfi debug information
                           lbr  : use hardware last branch record facility
  -g ..................... short-hand for --call-graph fp
  -e <ev> ................ filter for events

perf report
  -n .................... annotate symbols with nr of samples
  --stdio ............... report to stdio, if not present tui mode
  -g graph,0.5,caller ... show caller based call chains with value >0.5

Useful <ev>:
  page-faults
  minor-faults
  major-faults
  cpu-cycles
  task-clock

Flamegraph

Flamegraph with single event trace

perf record -g -e cpu-cycles -p <pid>
perf script | FlameGraph/stackcollapse-perf.pl | FlameGraph/flamegraph.pl > cycles-flamegraph.svg

Flamegraph with multiple event traces

perf record -g -e cpu-cycles,page-faults -p <pid>
perf script --per-event-dump
# fold & generate as above

OProfile

operf -g -p <pid>
  -g ...... capture call-graph information

opreport [opt] FILE
            show time spent per binary image
  -l ...... show time spent per symbol
  -c ...... show callgraph information (see below)
  -a ...... add column with time spent accumulated over child nodes

ophelp      show supported hw/sw events

/usr/bin/time(1)

# statistics of process run
/usr/bin/time -v <cmd>

Binary

od(1)

  od [opts] <file>
    -An         don't print addr info
    -tx4        print hex in 4 byte chunks
    -ta         print as named character
    -tc         printable chars or backslash escape
    -w4         print 4 bytes per line
    -j <n>      skip <n> bytes from <file> (hex if starts with 0x)
    -N <n>      dump <n> bytes (hex if starts with 0x)

ASCII to hex string

  echo -n AAAABBBB | od -An -w4 -tx4
    >> 41414141
    >> 42424242

  echo -n '\x7fELF\n' | od -tx1 -ta -tc
    >> 0000000  7f  45  4c  46  0a      # tx1
    >>         del   E   L   F  nl      # ta
    >>         177   E   L   F  \n      # tc

Extract parts of file

For example .rodata section from an elf file. We can use readelf to get the offset into the file where the .rodata section starts.

  readelf -W -S foo
    >> Section Headers:
    >> [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
    >> ...
    >> [15] .rodata           PROGBITS        00000000004009c0 0009c0 000030 00   A  0   0 16

With the offset of -j 0x0009c0 we can dump -N 0x30 bytes from the beginning of the .rodata section as follows:

  od -j 0x0009c0 -N 0x30 -tx4 -w4 foo
    >> 0004700 00020001
    >> 0004704 00000000
    >> *
    >> 0004740 00000001
    >> 0004744 00000002
    >> 0004750 00000003
    >> 0004754 00000004

Note: Numbers starting with 0x will be interpreted as hex by od.
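
The same -j/-N extraction can be tried on a tiny generated file, which makes it easy to check that the right bytes are selected. This is a sketch only; the file name `sample.bin` and its content are assumptions for the example.

```shell
# Build a 16-byte sample file (hypothetical name) with four 4-byte groups.
printf 'AAAABBBBCCCCDDDD' > sample.bin

# Skip 8 bytes (-j) and dump the next 4 bytes (-N) as single hex bytes,
# without the address column (-An). This selects the "CCCC" group.
od -An -tx1 -j 8 -N 4 sample.bin
```

Each dumped byte should be 43, the ASCII code of `C`.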

xxd(1)

  xxd [opts]
    -p          dump continuous hexdump
    -r          convert hexdump into binary ('reverse')
    -e          dump in little-endian mode
    -i          output as C array

ASCII to hex stream

  echo -n 'aabb' | xxd -p
    >> 61616262

Hex to binary stream

  echo -n '61616262' | xxd -p -r
    >> aabb

ASCII to binary

  echo -n '\x7fELF' | xxd -p | xxd -p -r | file -p -
    >> ELF

ASCII to C array (hex encoded)

  xxd -i <(echo -n '\x7fELF')
    >> unsigned char _proc_self_fd_11[] = {
    >>   0x7f, 0x45, 0x4c, 0x46
    >> };
    >> unsigned int _proc_self_fd_11_len = 4;
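
The three xxd conversions above have close Python equivalents. A minimal sketch; the helper names are made up, and unlike xxd -i the C array is not wrapped after 12 bytes per line.

```python
def to_plain_hex(data: bytes) -> str:
    """Roughly `xxd -p`: continuous hex dump."""
    return data.hex()


def from_plain_hex(text: str) -> bytes:
    """Roughly `xxd -p -r`: convert a plain hex dump back into bytes."""
    return bytes.fromhex(text)


def to_c_array(name: str, data: bytes) -> str:
    """Roughly `xxd -i`: emit a C array plus length variable (single line body)."""
    body = ", ".join(f"0x{b:02x}" for b in data)
    return (f"unsigned char {name}[] = {{\n  {body}\n}};\n"
            f"unsigned int {name}_len = {len(data)};")


print(to_plain_hex(b"aabb"))
print(from_plain_hex("61616262"))
print(to_c_array("elf_magic", b"\x7fELF"))
```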

readelf(1)

  readelf [opts] <elf>
    -W|--wide     wide output, don't break output at 80 chars
    -h            print ELF header
    -S            print section headers
    -l            print program headers + segment mapping
    -d            print .dynamic section (dynamic link information)
    --syms        print symbol tables (.symtab .dynsym)
    --dyn-syms    print dynamic symbol table (exported symbols for dynamic linker)
    -r            print relocation sections (.rel.*, .rela.*)

objdump(1)

  objdump [opts] <elf>
    -M intel                use intel syntax
    -d                      disassemble text section
    -D                      disassemble all sections
    -S                      mix disassembly with source code
    -C                      demangle
    -j <section>            display info for section
    --[no-]show-raw-insn    [don't] show object code next to disassembly

Disassemble section

For example .plt section:

  objdump -j .plt -d <elf>

nm(1)

  nm [opts] <elf>
    -C          demangle
    -u          undefined only

Development

c++filt(1)

Demangle symbol

  c++filt <symbol_str>

Demangle stream

For example dynamic symbol table:

  readelf -W --dyn-syms <elf> | c++filt

c++

Type deduction

Force compile error to see what auto is deduced to.

auto foo = bar();

// force compile error
typename decltype(foo)::_;

Variadic templates (parameter pack)

{{#include c++/meta.cc:3:}}

SFINAE example (enable_if)

{{#include c++/meta2.cc:3:}}

glibc

malloc tracer mtrace(3)

Trace memory allocation and de-allocation to detect memory leaks. Need to call mtrace(3) to install the tracing hooks.

If we can't modify the binary to call mtrace we can create a small shared library and pre-load it.

// libmtrace.c
#include <mcheck.h>
__attribute__((constructor))  static void init_mtrace() { mtrace(); }

Compile as:

gcc -shared -fPIC -o libmtrace.so libmtrace.c

To generate the trace file run:

export MALLOC_TRACE=<file>
LD_PRELOAD=./libmtrace.so <binary>

Note: If MALLOC_TRACE is not set mtrace won't install tracing hooks.

To get the results of the trace file:

mtrace <binary> $MALLOC_TRACE

malloc check mallopt(3)

Configure action when glibc detects memory error.

export MALLOC_CHECK_=<N>

Useful values:

1   print detailed error & continue
3   print detailed error + stack trace + memory mappings & abort
7   print simple error message + stack trace + memory mappings & abort

gcc(1)

CLI

Preprocessing

While debugging it can be helpful to only pre-process files.

gcc -E [-dM] ...
  • -E run only preprocessor
  • -dM list only #define statements
  • -### dry-run, outputting exact compiler/linker invocations
  • -print-multi-lib print available multilib configurations

Target options

# List all target options with their description.
gcc --help=target

# Configure for current cpu arch and query (-Q) value of options.
gcc -march=native -Q --help=target

Builtins

__builtin_expect(expr, cond)

Give the compiler a hint which branch is hot, so it can lay out the code accordingly to reduce number of jump instructions. See on compiler explorer.

echo "
extern void foo();
extern void bar();
void run0(int x) {
  if (__builtin_expect(x,0)) { foo(); }
  else { bar(); }
}
void run1(int x) {
  if (__builtin_expect(x,1)) { foo(); }
  else { bar(); }
}
" | gcc -O2 -S -masm=intel -o /dev/stdout -xc -

Will generate something similar to the following.

  • run0: bar is on the path without branch
  • run1: foo is on the path without branch
run0:
        test    edi, edi
        jne     .L4
        xor     eax, eax
        jmp     bar
.L4:
        xor     eax, eax
        jmp     foo
run1:
        test    edi, edi
        je      .L6
        xor     eax, eax
        jmp     foo
.L6:
        xor     eax, eax
        jmp     bar

ABI (Linux)

make(1)

Anatomy of make rules

target .. : prerequisite ..
	recipe
	..
  • target: an output generated by the rule
  • prerequisite: an input that is used to generate the target
  • recipe: list of actions to generate the output from the input

Use make -p to print all rules and variables (implicitly + explicitly defined).

Pattern rules & Automatic variables

Pattern rules

A pattern rule contains the % char (exactly one of them) and looks like this example:

%.o : %.c
	$(CC) -c $(CFLAGS) $(CPPFLAGS) $< -o $@

The target matches files of the pattern %.o, where % matches any non-empty substring and all other characters match only themselves.

The substring matched by % is called the stem.

% in the prerequisite stands for the matched stem in the target.
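
The stem matching described above can be modelled in Python. This is a simplified sketch that ignores make's special handling of directory parts; match_stem and expand are hypothetical helper names.

```python
from typing import Optional


def match_stem(pattern: str, name: str) -> Optional[str]:
    """Return the stem if `name` matches `pattern` (exactly one '%'), else None."""
    prefix, _, suffix = pattern.partition("%")
    if len(name) <= len(prefix) + len(suffix):
        return None  # the stem must be non-empty
    if name.startswith(prefix) and name.endswith(suffix):
        return name[len(prefix):len(name) - len(suffix)]
    return None


def expand(pattern: str, stem: str) -> str:
    """Substitute the stem for '%' in a prerequisite pattern."""
    return pattern.replace("%", stem, 1)


stem = match_stem("%.o", "foo.o")
print(stem, expand("%.c", stem))
```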

Automatic variables

As targets and prerequisites in pattern rules can't be spelled explicitly in the recipe, make provides a set of automatic variables to work with:

  • $@: Name of the target that triggered the rule.
  • $<: Name of the first prerequisite.
  • $^: Names of all prerequisites (without duplicates).
  • $+: Names of all prerequisites (with duplicates).
  • $*: Stem of the pattern rule.
# file: Makefile

all: foobar blabla

foo% bla%: aaa bbb bbb
	@echo "@ = $@"
	@echo "< = $<"
	@echo "^ = $^"
	@echo "+ = $+"
	@echo "* = $*"
	@echo "----"

aaa:
bbb:

Running above Makefile gives:

@ = foobar
< = aaa
^ = aaa bbb
+ = aaa bbb bbb
* = bar
----
@ = blabla
< = aaa
^ = aaa bbb
+ = aaa bbb bbb
* = bla
----
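
How the automatic variables relate to target, stem and prerequisite list can be sketched as follows. This is only a model of the values printed above, not of how make computes them internally; automatic_vars is a made-up name.

```python
from typing import Dict, List


def automatic_vars(target: str, stem: str, prereqs: List[str]) -> Dict[str, str]:
    """Model the automatic variables for one rule invocation."""
    return {
        "@": target,
        "<": prereqs[0] if prereqs else "",
        # dict.fromkeys drops duplicates while keeping the original order.
        "^": " ".join(dict.fromkeys(prereqs)),
        "+": " ".join(prereqs),
        "*": stem,
    }


for name, value in automatic_vars("foobar", "bar", ["aaa", "bbb", "bbb"]).items():
    print(f"{name} = {value}")
```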

Variables related to filesystem paths:

  • $(CURDIR): Path of current working dir after using make -C path

Useful functions

Substitution references

Substitute strings matching pattern in a list.

in  := a.o l.a c.o
out := $(in:.o=.c)
# => out = a.c l.a c.c

filter

Keep strings matching a pattern in a list.

in  := a.a b.b c.c d.d
out := $(filter %.b %.c, $(in))
# => out = b.b c.c

filter-out

Remove strings matching a pattern from a list.

in  := a.a b.b c.c d.d
out := $(filter-out %.b %.c, $(in))
# => out = a.a d.d

abspath

Resolve each file name as absolute path (don't resolve symlinks).

$(abspath fname1 fname2 ..)

realpath

Resolve each file name as canonical path.

$(realpath fname1 fname2 ..)

ld.so(8)

Environment Variables

  LD_PRELOAD=<l_so>       colon separated list of libso's to be pre loaded
  LD_DEBUG=<opts>         comma separated list of debug options
          =help           list available options
          =libs           show library search path
          =files          processing of input files
          =symbols        show search path for symbol lookup
          =bindings       show against which definition a symbol is bound

LD_PRELOAD: Initialization Order and Link Map

Libraries specified in LD_PRELOAD are loaded from left-to-right but initialized from right-to-left.

  > ldd ./main
    >> libc.so.6 => /usr/lib/libc.so.6

  > LD_PRELOAD=liba.so:libb.so ./main
             -->
      preloaded in this order
             <--
      initialized in this order

The preload order determines:

  • the order libraries are inserted into the link map
  • the initialization order for libraries

For the example listed above the resulting link map will look like the following:

  +------+    +------+    +------+    +------+
  | main | -> | liba | -> | libb | -> | libc |
  +------+    +------+    +------+    +------+

This can be seen when running with LD_DEBUG=files:

  > LD_DEBUG=files LD_PRELOAD=liba.so:libb.so ./main
    # load order (-> determines link map)
    >> file=liba.so [0];  generating link map
    >> file=libb.so [0];  generating link map
    >> file=libc.so.6 [0];  generating link map

    # init order
    >> calling init: /usr/lib/libc.so.6
    >> calling init: <path>/libb.so
    >> calling init: <path>/liba.so
    >> initialize program: ./main

To verify the link map order we let ld.so resolve the memcpy(3) libc symbol (used in main) dynamically, while enabling LD_DEBUG=symbols,bindings to see the resolving in action.

  > LD_DEBUG=symbols,bindings LD_PRELOAD=liba.so:libb.so ./main
    >> symbol=memcpy;  lookup in file=./main [0]
    >> symbol=memcpy;  lookup in file=<path>/liba.so [0]
    >> symbol=memcpy;  lookup in file=<path>/libb.so [0]
    >> symbol=memcpy;  lookup in file=/usr/lib/libc.so.6 [0]
    >> binding file ./main [0] to /usr/lib/libc.so.6 [0]: normal symbol `memcpy' [GLIBC_2.14]
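
The lookup walk seen in the LD_DEBUG output can be pictured as a first-match search over the link map. A toy model, not ld.so's actual algorithm (no symbol versions, no weak symbols, no scopes); the export sets are made up.

```python
from typing import Dict, List, Optional, Set


def resolve(symbol: str, link_map: List[str],
            exports: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first object in link map order that defines `symbol`."""
    for obj in link_map:
        if symbol in exports.get(obj, set()):
            return obj
    return None


link_map = ["main", "liba.so", "libb.so", "libc.so.6"]

# Only libc defines memcpy: lookup falls through the preloaded libraries.
print(resolve("memcpy", link_map, {"libc.so.6": {"memcpy"}}))

# Hypothetical: liba.so also defines memcpy and therefore wins the lookup.
print(resolve("memcpy", link_map, {"liba.so": {"memcpy"}, "libc.so.6": {"memcpy"}}))
```

This is why a symbol defined in a preloaded library takes precedence over the definition in libc.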

Dynamic Linking (x86_64)

Dynamic linking basically works via one indirect jump. It uses a combination of function trampolines (.plt section) and a function pointer table (.got.plt section). On the first call the trampoline sets up some metadata and then jumps to the ld.so runtime resolve function, which in turn patches the table with the correct function pointer.

  .plt ....... procedure linkage table, contains function trampolines, usually
               located in code segment (rx permission)
  .got.plt ... global offset table for .plt, holds the function pointer table

Using radare2 we can analyze this in more detail:

  [0x00401040]> pd 4 @ section..got.plt
              ;-- section..got.plt:
              ;-- .got.plt:    ; [22] -rw- section size 32 named .got.plt
              ;-- _GLOBAL_OFFSET_TABLE_:
         [0]  0x00404000      .qword 0x0000000000403e10 ; section..dynamic
         [1]  0x00404008      .qword 0x0000000000000000
              ; CODE XREF from section..plt @ +0x6
         [2]  0x00404010      .qword 0x0000000000000000
              ;-- reloc.puts:
              ; CODE XREF from sym.imp.puts @ 0x401030
         [3]  0x00404018      .qword 0x0000000000401036 ; RELOC 64 puts

  [0x00401040]> pd 6 @ section..plt
              ;-- section..plt:
              ;-- .plt:       ; [12] -r-x section size 32 named .plt
          ┌─> 0x00401020      ff35e22f0000   push qword [0x00404008]
          ╎   0x00401026      ff25e42f0000   jmp qword [0x00404010]
          ╎   0x0040102c      0f1f4000       nop dword [rax]
  ┌ 6: int sym.imp.puts (const char *s);
  └       ╎   0x00401030      ff25e22f0000   jmp qword [reloc.puts]
          ╎   0x00401036      6800000000     push 0
          └─< 0x0040103b      e9e0ffffff     jmp sym..plt
  • At address 0x00401030 in the .plt section we see the indirect jump for puts using the function pointer in _GLOBAL_OFFSET_TABLE_[3] (GOT).
  • GOT[3] initially points to instruction after the puts trampoline 0x00401036.
  • This pushes the relocation index 0 and then jumps to the first trampoline 0x00401020.
  • The first trampoline jumps to GOT[2] which will be filled at program startup by the ld.so with its resolve function.
  • The ld.so resolve function fixes the relocation referenced by the relocation index pushed by the puts trampoline.
  • The relocation entry at index 0 tells the resolve function which symbol to search for and where to put the function pointer:
      > readelf -r <main>
        >> Relocation section '.rela.plt' at offset 0x4b8 contains 1 entry:
        >>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
        >> 000000404018  000200000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
    
    As we can see the offset from relocation at index 0 points to GOT[3].
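
The steps above can be sketched as a toy model in Python, where the GOT is a list of callables and the first call through a slot patches it. This only illustrates the control flow; LazyBinder and its members are hypothetical and have no counterpart in ld.so.

```python
class LazyBinder:
    """Toy model of lazy binding via PLT trampolines and a GOT."""

    def __init__(self, symbols):
        self.symbols = symbols  # name -> callable, stands in for loaded libraries
        self.relocs = []        # relocation index -> (symbol name, GOT slot)
        self.got = []           # function pointer table
        self.resolve_calls = 0

    def add_import(self, name):
        slot = len(self.got)
        reloc_index = len(self.relocs)
        self.relocs.append((name, slot))
        # Initially the slot points back to the stub, which passes the
        # relocation index to the resolver.
        self.got.append(lambda *args: self._resolve(reloc_index, *args))
        return slot

    def _resolve(self, reloc_index, *args):
        self.resolve_calls += 1
        name, slot = self.relocs[reloc_index]
        self.got[slot] = self.symbols[name]  # patch the table
        return self.got[slot](*args)         # continue into the real function

    def call(self, slot, *args):
        # Trampoline: one indirect jump through the table.
        return self.got[slot](*args)


binder = LazyBinder({"puts": lambda s: len(s)})
puts_slot = binder.add_import("puts")
binder.call(puts_slot, "hi")   # first call goes through the resolver
binder.call(puts_slot, "hi")   # second call jumps directly
print(binder.resolve_calls)
```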

ELF Symbol Versioning

The ELF symbol versioning mechanism allows attaching version information to symbols. This can be used to express symbol version requirements or to provide certain symbols multiple times in the same ELF file with different versions (e.g. for backwards compatibility).

The libpthread.so library is an example which provides the pthread_cond_wait symbol multiple times but in different versions. With readelf the version of the symbol can be seen after the @.

> readelf -W --dyn-syms /lib/libpthread.so

Symbol table '.dynsym' contains 342 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
   ...
   141: 0000f080   696 FUNC    GLOBAL DEFAULT   16 pthread_cond_wait@@GLIBC_2.3.2
   142: 00010000   111 FUNC    GLOBAL DEFAULT   16 pthread_cond_wait@GLIBC_2.2.5

The @@ denotes the default symbol version, which the link editor picks when linking against the library at build time. The following dump shows that the tmp program linked with -lpthread depends on the symbol version GLIBC_2.3.2, which is the default version.

> echo "#include <pthread.h>
        int main() {
          return pthread_cond_wait(0,0);
        }" | gcc -o tmp -xc - -lpthread;
  readelf -W --dyn-syms tmp | grep pthread_cond_wait;

Symbol table '.dynsym' contains 7 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
   ...
     2: 00000000     0 FUNC    GLOBAL DEFAULT  UND pthread_cond_wait@GLIBC_2.3.2 (2)

Only one symbol can be annotated as the @@ default version.

Using the --version-info flag with readelf, more details on the symbol version info compiled into the tmp ELF file can be obtained.

  • The .gnu.version section contains the version definition for each symbol in the .dynsym section. pthread_cond_wait is at index 2 in the .dynsym section, the corresponding symbol version is at index 2 in the .gnu.version section.
  • The .gnu.version_r section contains symbol version requirements per shared library dependency (DT_NEEDED dynamic entry).
> readelf -W --version-info --dyn-syms tmp

Symbol table '.dynsym' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTable
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND pthread_cond_wait@GLIBC_2.3.2 (2)
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (3)
     4: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     5: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
     6: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@GLIBC_2.2.5 (3)

Version symbols section '.gnu.version' contains 7 entries:
 Addr: 0x0000000000000534  Offset: 0x000534  Link: 6 (.dynsym)
  000:   0 (*local*)       0 (*local*)       2 (GLIBC_2.3.2)   3 (GLIBC_2.2.5)
  004:   0 (*local*)       0 (*local*)       3 (GLIBC_2.2.5)

Version needs section '.gnu.version_r' contains 2 entries:
 Addr: 0x0000000000000548  Offset: 0x000548  Link: 7 (.dynstr)
  000000: Version: 1  File: libc.so.6  Cnt: 1
  0x0010:   Name: GLIBC_2.2.5  Flags: none  Version: 3
  0x0020: Version: 1  File: libpthread.so.0  Cnt: 1
  0x0030:   Name: GLIBC_2.3.2  Flags: none  Version: 2
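
The relation between the three tables in the dump above can be written down as a small lookup. The data is copied from the readelf output; required_version is a made-up helper and the sketch treats every version index without an entry in .gnu.version_r as unversioned.

```python
from typing import Optional, Tuple

# Symbol names from .dynsym, in index order.
dynsym = ["", "_ITM_deregisterTMCloneTable", "pthread_cond_wait",
          "__libc_start_main", "__gmon_start__",
          "_ITM_registerTMCloneTable", "__cxa_finalize"]

# .gnu.version: one version index per .dynsym entry.
versym = [0, 0, 2, 3, 0, 0, 3]

# .gnu.version_r: version index -> (file, version name).
version_needs = {
    3: ("libc.so.6", "GLIBC_2.2.5"),
    2: ("libpthread.so.0", "GLIBC_2.3.2"),
}


def required_version(name: str) -> Optional[Tuple[str, str]]:
    """Look up which library and version a dynamic symbol requires."""
    index = dynsym.index(name)
    return version_needs.get(versym[index])


print(required_version("pthread_cond_wait"))
```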

The gnu dynamic linker allows inspecting the version processing at runtime by setting the LD_DEBUG environment variable accordingly.

# version: Display version dependencies.
> LD_DEBUG=versions ./tmp
    717904: checking for version `GLIBC_2.2.5' in file /usr/lib/libc.so.6 [0] required by file ./tmp [0]
    717904: checking for version `GLIBC_2.3.2' in file /usr/lib/libpthread.so.0 [0] required by file ./tmp [0]
    ...

#  symbols : Display symbol table processing.
#  bindings: Display information about symbol binding.
> LD_DEBUG=symbols,bindings ./tmp
    ...
    718123: symbol=pthread_cond_wait;  lookup in file=./tmp [0]
    718123: symbol=pthread_cond_wait;  lookup in file=/usr/lib/libpthread.so.0 [0]
    718123: binding file ./tmp [0] to /usr/lib/libpthread.so.0 [0]: normal symbol `pthread_cond_wait' [GLIBC_2.3.2]

Example: version script

The following shows an example C++ library libfoo which provides the same symbol multiple times but in different versions.

// file: libfoo.cc
#include<stdio.h>

// Bind function symbols to version nodes.
//
// ..@       -> Is the unversioned symbol.
// ..@@..    -> Is the default symbol.

__asm__(".symver func_v0,func@");
__asm__(".symver func_v1,func@LIB_V1");
__asm__(".symver func_v2,func@@LIB_V2");

extern "C" {
    void func_v0() { puts("func_v0"); }
    void func_v1() { puts("func_v1"); }
    void func_v2() { puts("func_v2"); }
}

__asm__(".symver _Z11func_cpp_v1i,_Z8func_cppi@LIB_V1");
__asm__(".symver _Z11func_cpp_v2i,_Z8func_cppi@@LIB_V2");

void func_cpp_v1(int) { puts("func_cpp_v1"); }
void func_cpp_v2(int) { puts("func_cpp_v2"); }

void func_cpp(int) { puts("func_cpp_v2"); }

Version script for libfoo which defines which symbols for which versions are exported from the ELF file.

# file: libfoo.ver
LIB_V1 {
    global:
        func;
        extern "C++" {
            "func_cpp(int)";
        };
    local:
        *;
};

LIB_V2 {
    global:
        func;
        extern "C++" {
            "func_cpp(int)";
        };
} LIB_V1;

The local: section in LIB_V1 is a catch all, that matches any symbol not explicitly specified, and defines that the symbol is local and therefore not exported from the ELF file.

The library libfoo can be linked with the version definitions in libfoo.ver by passing the version script to the linker with the --version-script flag.

> g++ -shared -fPIC -o libfoo.so libfoo.cc -Wl,--version-script=libfoo.ver
> readelf -W --dyn-syms libfoo.so | c++filt

Symbol table '.dynsym' contains 14 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
   ...
     6: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LIB_V1
     7: 000000000000114b    29 FUNC    GLOBAL DEFAULT   13 func_cpp(int)@LIB_V1
     8: 0000000000001168    29 FUNC    GLOBAL DEFAULT   13 func_cpp(int)@@LIB_V2
     9: 0000000000001185    29 FUNC    GLOBAL DEFAULT   13 func_cpp(int)@@LIB_V1
    10: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LIB_V2
    11: 0000000000001109    22 FUNC    GLOBAL DEFAULT   13 func
    12: 000000000000111f    22 FUNC    GLOBAL DEFAULT   13 func@LIB_V1
    13: 0000000000001135    22 FUNC    GLOBAL DEFAULT   13 func@@LIB_V2

The following program demonstrates how to make use of the different versions:

// file: main.cc
#include <dlfcn.h>
#include <assert.h>

// Links against default symbol in the lib.so.
extern "C" void func();

int main() {
    // Call the default version.
    func();

#ifdef _GNU_SOURCE
    typedef void (*fnptr)();

    // Unversioned lookup.
    fnptr fn_v0 = (fnptr)dlsym(RTLD_DEFAULT, "func");
    // Version lookup.
    fnptr fn_v1 = (fnptr)dlvsym(RTLD_DEFAULT, "func", "LIB_V1");
    fnptr fn_v2 = (fnptr)dlvsym(RTLD_DEFAULT, "func", "LIB_V2");

    assert(fn_v0 != 0);
    assert(fn_v1 != 0);
    assert(fn_v2 != 0);

    fn_v0();
    fn_v1();
    fn_v2();
#endif

    return 0;
}

Compiling and running results in:

> g++ -o main main.cc -ldl ./libfoo.so && ./main
func_v2
func_v0
func_v1
func_v2

References

python

Decorator [run]

Some decorator examples with type annotation.

from typing import Callable

def log(f: Callable[[int], None]) -> Callable[[int], None]:
    def inner(x: int):
        print(f"log::inner f={f.__name__} x={x}")
        f(x)
    return inner

@log
def some_fn(x: int):
    print(f"some_fn x={x}")


def log_tag(tag: str) -> Callable[[Callable[[int], None]], Callable[[int], None]]:
    def decorator(f: Callable[[int], None]) -> Callable[[int], None]:
        def inner(x: int):
            print(f"log_tag::inner f={f.__name__} tag={tag} x={x}")
            f(x)
        return inner
    return decorator

@log_tag("some_tag")
def some_fn2(x: int):
    print(f"some_fn2 x={x}")

Walrus operator [run]

The walrus operator := is available since Python 3.8.

from typing import Optional

# Example 1: if let statements

def foo(ret: Optional[int]) -> Optional[int]:
    return ret

if r := foo(None):
    print(f"foo(None) -> {r}")

if r := foo(1337):
    print(f"foo(1337) -> {r}")

# Example 2: while let statements

toks = iter(['a', 'b', 'c'])
while tok := next(toks, None):
    print(f"{tok}")

# Example 3: list comprehension

print([tok for t in ["  a", "  ", " b "] if (tok := t.strip())])

Unittest [run]

Run unittests directly from the command line as
python3 -m unittest -v test

Optionally pass -k <pattern> to only run a subset of tests.

# file: test.py

import unittest

class MyTest(unittest.TestCase):
    def setUp(self):
        pass
    def tearDown(self):
        pass
    # Tests need to start with the prefix 'test'.
    def test_foo(self):
        self.assertEqual(1 + 2, 3)
    def test_bar(self):
        with self.assertRaises(IndexError):
            list()[0]

Doctest [run]

Run doctests directly from the command line as
python -m doctest -v test.py

# file: test.py

def sum(a: int, b: int) -> int:
    """Sum a and b.

    >>> sum(1, 2)
    3

    >>> sum(10, 20)
    30
    """
    return a + b

timeit

Micro benchmarking.

python -m timeit '[x.strip() for x in ["a ", " b"]]'

Linux

systemd

systemctl

Inspect units:

systemctl [opts] [cmd]
[opts]
    --user

[cmd]
    list-units <pattern>    List units in memory

    status <unit>           Show runtime status of unit

    start <unit>            Start a unit
    stop <unit>             Stop a unit
    restart <unit>          Restart a unit
    reload <unit>           Reload a unit

    enable <unit>           Enable a unit (persistent)
    disable <unit>          Disable a unit

    cat <unit>      Print unit file
    show <unit>     Show properties of unit

Example: Trivial user unit

# Generate unit
mkdir -p ~/.config/systemd/user
echo '[Unit]
Description=Test logger

[Service]
Type=oneshot
ExecStart=logger "Hello from test unit"' > ~/.config/systemd/user/test.service

# Run unit
systemctl --user start test

# See log message
journalctl --user -u test -n 5

journalctl

Inspect journal logs:

journalctl [opts] [matches]
    --user          Current user journal (system by default)
    -u <unit>       Show logs for specified <unit>
    -n <lines>      Show only last <lines>
    -f              Follow journal
    -g <pattern>    Grep for <pattern>

Cleanup:

journalctl [opts]
    --disk-usage            Show current disk usage
    --vacuum-size=<size>    Reduce journal log to <size> (K/M/G)

References

core(5)

Multiple requirements must be satisfied for coredumps to be generated; a full list can be found in core(5).

An important one is to configure the soft resource limit RLIMIT_CORE (typically as unlimited during debugging). In a typical bash/zsh this can be done as

ulimit -Sc unlimited

Naming of coredump files

There are two important kernel configs to control the naming:

/proc/sys/kernel/core_pattern
    <pattern>    => Specifies a name pattern for the coredump file. This can
                    include certain FORMAT specifiers.
    |<cmdline>   => Coredump is piped via stdin to the user space process
                    specified by the cmdline, this can also contain FORMAT specifiers.

  FORMAT specifier (full list, see core(5)):
    %E      Pathname of the executable ('/' replaced by '!').
    %p      PID of the dumping process in its pid namespace.
    %P      PID of the dumping process in the initial pid namespace.
    %u      Real UID of dumping process.
    %s      Signal number causing the dump.


/proc/sys/kernel/core_uses_pid
    1  => Append ".<pid>" suffix to the coredump file name
          (pid of the dumping process).
    0  => Do not append the suffix.
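
How a core_pattern expands into a file name can be sketched in Python. This is a minimal model covering only the specifiers listed above; expand_core_pattern and the example pattern are assumptions, and unknown specifiers are simply left untouched here.

```python
import re


def expand_core_pattern(pattern: str, exe: str, pid: int, uid: int, sig: int) -> str:
    """Expand a subset of the core(5) FORMAT specifiers in `pattern`."""
    values = {
        "E": exe.replace("/", "!"),  # pathname with '/' replaced by '!'
        "p": str(pid),
        "u": str(uid),
        "s": str(sig),
    }
    # Replace each %<char>; specifiers outside this subset stay as-is.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), m.group(0)), pattern)


print(expand_core_pattern("core.%E.%p.%s", "/usr/bin/sleep", 6363, 1000, 11))
```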

Control which segments are dumped

Each process has a coredump filter defined in /proc/<pid>/coredump_filter which specifies which memory segments are being dumped. Filters are preserved across fork/exec calls and hence child processes inherit the parent's filter.

The filter is a bitmask where 1 indicates to dump the given type.

From core(5):
  bit 0  Dump anonymous private mappings.
  bit 1  Dump anonymous shared mappings.
  bit 2  Dump file-backed private mappings.
  bit 3  Dump file-backed shared mappings.
  bit 4  Dump ELF headers.
  bit 5  Dump private huge pages.
  bit 6  Dump shared huge pages.
  bit 7  Dump private DAX pages.
  bit 8  Dump shared DAX pages.

Default filter 0x33.
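
To read a filter value, the bitmask can be decoded against the table above. A small sketch; decode_filter is a made-up helper name.

```python
from typing import List

# Bit position -> mapping type, in the order listed in core(5).
FILTER_BITS = [
    "anonymous private mappings",
    "anonymous shared mappings",
    "file-backed private mappings",
    "file-backed shared mappings",
    "ELF headers",
    "private huge pages",
    "shared huge pages",
    "private DAX pages",
    "shared DAX pages",
]


def decode_filter(mask: int) -> List[str]:
    """List the mapping types selected by a coredump filter bitmask."""
    return [name for bit, name in enumerate(FILTER_BITS) if mask & (1 << bit)]


print(decode_filter(0x33))
```

The default 0x33 therefore selects bits 0, 1, 4 and 5. On a live system the value could be read with int(open("/proc/self/coredump_filter").read(), 16), assuming the file holds a hex number as described in core(5).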

Some examples out there

coredumpctl (systemd)

# List available coredumps.
coredumpctl list
    TIME                             PID  UID  GID SIG     COREFILE EXE               SIZE
    ...
    Fri 2022-03-11 12:10:48 CET     6363 1000 1000 SIGSEGV present  /usr/bin/sleep   18.1K

# Get detailed info on specific coredump.
coredumpctl info 6363

# Debug specific coredump.
coredumpctl debug 6363

# Dump specific coredump to file.
coredumpctl dump 6363 -o <file>

apport (ubuntu)

Known crash report locations:

  • /var/crash

To get to the raw coredump, crash reports can be unpacked as:

apport-unpack <crash_report> <dest_dir>

The coredump resides under <dest_dir>/CoreDump.

ptrace_scope

In case the kernel was compiled with the yama security module (CONFIG_SECURITY_YAMA), tracing processes with ptrace(2) can be restricted.

/proc/sys/kernel/yama/ptrace_scope
    0 => No restrictions.
    1 => Restricted attach, only the following can attach
            - A process in the parent hierarchy.
            - A process with CAP_SYS_PTRACE.
            - A process with the PID that the tracee allowed via
              PR_SET_PTRACER.
    2 => Only processes with CAP_SYS_PTRACE in the user namespace of the tracee
         can attach.
    3 => No tracing allowed.

Further details in ptrace(2).

Network

tcpdump(1)

CLI

tcpdump [opts] -i <if> [<filter>]
    -n              Don't convert host/port names.
    -w <file|->     Write pcap trace to file or stdout (-).
    -r <file>       Read & parse pcap file.

Some useful filters.

src <ip>                Filter for source IP.
dst <ip>                Filter for destination IP.
host <ip>               Filter for IP (src + dst).
net <ip>/<range>        Filter traffic on subnet.
[src/dst] port <port>   Filter for port (optionally src/dst).
tcp/udp/icmp            Filter for protocol.

Use and/or/not and () to build filter expressions.

Examples

Capture packets from remote host

# -k: Start capturing immediately.
ssh <host> tcpdump -i <IF> -w - | sudo wireshark -k -i -

Arch

x86_64

keywords: x86_64, x86, abi

  • 64bit synonyms: x86_64, x64, amd64, intel 64
  • 32bit synonyms: x86, ia32, i386
  • ISA type: CISC
  • Endianness: little

Registers

General purpose register

bytes
[7:0]      [3:0]   [1:0]   [1]   [0]     desc
----------------------------------------------------------
rax        eax     ax      ah    al      accumulator
rbx        ebx     bx      bh    bl      base register
rcx        ecx     cx      ch    cl      counter
rdx        edx     dx      dh    dl      data register
rsi        esi     si      -     sil     source index
rdi        edi     di      -     dil     destination index
rbp        ebp     bp      -     bpl     base pointer
rsp        esp     sp      -     spl     stack pointer
r8-15      rNd     rNw     -     rNb

Special register

bytes
[7:0]      [3:0]     [1:0]      desc
---------------------------------------------------
rflags     eflags    flags      flags register
rip        eip       ip         instruction pointer

FLAGS register

rflags
bits    desc                            instr        comment
--------------------------------------------------------------------------------------------------------------
   [21]   ID   identification                        ability to set/clear -> indicates support for CPUID instr
   [18]   AC   alignment check                       alignment exception for PL 3 (user), requires CR0.AM
[13:12] IOPL   io privilege level
   [11]   OF   overflow flag
   [10]   DF   direction flag           cld/std      increment (0) or decrement (1) registers in string operations
    [9]   IF   interrupt enable         cli/sti
    [7]   SF   sign flag
    [6]   ZF   zero flag
    [4]   AF   auxiliary carry flag
    [2]   PF   parity flag
    [0]   CF   carry flag

Change flag bits with pushf / popf instructions:

pushfd                          // push flags (4 bytes) onto stack
or dword ptr [esp], (1 << 18)   // enable AC flag
popfd                           // pop flags (4 bytes) from stack

There is also pushfq / popfq to push and pop all 8 bytes of rflags.
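
When inspecting a raw rflags value (e.g. from a debugger), decoding it against the table above helps. A minimal sketch with made-up helper names; only the flags listed in the table are covered.

```python
from typing import List, Tuple

# Single-bit flags from the table above: bit position -> name.
FLAG_BITS = {0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF",
             9: "IF", 10: "DF", 11: "OF", 18: "AC", 21: "ID"}


def decode_flags(value: int) -> Tuple[List[str], int]:
    """Return the names of set flags (low to high bit) and the IOPL field."""
    names = [name for bit, name in sorted(FLAG_BITS.items()) if value & (1 << bit)]
    iopl = (value >> 12) & 0x3  # bits [13:12]
    return names, iopl


def set_ac(value: int) -> int:
    """Same bit operation as the `or` in the pushfd / popfd sequence."""
    return value | (1 << 18)


print(decode_flags(0x246))
print(decode_flags(set_ac(0x246)))
```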

Model Specific Register (MSR)

rdmsr     // Read MSR register, effectively does EDX:EAX <- MSR[ECX]
wrmsr     // Write MSR register, effectively does MSR[ECX] <- EDX:EAX

Size directives

Explicitly specify size of the operation.

mov  byte ptr [rax], 0xff    // save 1 byte(s) at [rax]
mov  word ptr [rax], 0xff    // save 2 byte(s) at [rax]
mov dword ptr [rax], 0xff    // save 4 byte(s) at [rax]
mov qword ptr [rax], 0xff    // save 8 byte(s) at [rax]

Addressing

mov qword ptr [rax], rbx         // save val in rbx at [rax]
mov qword ptr [imm], rbx         // save val in rbx at [imm]
mov rax, qword ptr [rbx+4*rcx]   // load val at [rbx+4*rcx] into rax

rip relative addressing:

lea rax, [rip+.my_str]       // load addr of .my_str into rax
...
.my_str:
.asciz "Foo"

String instructions

The operand size of a string instruction is defined by the instruction suffix b | w | d | q.

Source and destination registers are modified according to the direction flag (DF) in the flags register:

  • DF=0 increment src/dest registers
  • DF=1 decrement src/dest registers

Following explanation assumes byte operands with DF=0:

movsb   // move data from string to string
        // ES:[DI] <- DS:[SI]
        // DI <- DI + 1
        // SI <- SI + 1

lodsb   // load string
        // AL <- DS:[SI]
        // SI <- SI + 1

stosb   // store string
        // ES:[DI] <- AL
        // DI <- DI + 1

cmpsb   // compare string operands
        // DS:[SI] - ES:[DI]    ; set status flag (eg ZF)
        // SI <- SI + 1
        // DI <- DI + 1

scasb   // scan string
        // AL - ES:[DI]         ; set status flag (eg ZF)
        // DI <- DI + 1

String operations can be repeated:

rep     // repeat until rcx = 0
repz    // repeat until rcx = 0 or ZF = 0 (repeat while equal)
repnz   // repeat until rcx = 0 or ZF = 1 (repeat while not equal)

Example: Simple memset

// memset (dest, 0xaa /* char */, 0x10 /* len */)

lea rdi, [dest]     // rdi <- destination address
mov al, 0xaa        // al  <- byte to store
mov rcx, 0x10       // rcx <- repeat count
rep stosb
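
The register updates of the memset example can be traced with a small simulation. This is a sketch of the semantics described above, with memory modelled as a bytearray and segment registers ignored; rep_stosb is a made-up function name.

```python
from typing import Tuple


def rep_stosb(mem: bytearray, rdi: int, al: int, rcx: int, df: int = 0) -> Tuple[int, int]:
    """Simulate `rep stosb`; returns the final (rdi, rcx)."""
    step = -1 if df else 1    # DF=0 increments, DF=1 decrements
    while rcx != 0:           # rep: repeat until rcx = 0
        mem[rdi] = al & 0xff  # [rdi] <- al
        rdi += step
        rcx -= 1
    return rdi, rcx


mem = bytearray(0x20)
print(rep_stosb(mem, 4, 0xaa, 0x10))
print(mem.hex())
```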

SysV x86_64 ABI

Passing arguments to functions

  • Integer/Pointer arguments
    reg     arg
    -----------
    rdi       1
    rsi       2
    rdx       3
    rcx       4
    r8        5
    r9        6
    
  • Floating point arguments
    reg     arg
    -----------
    xmm0      1
      ..     ..
    xmm7      8
    
  • Additional arguments are passed on the stack. Arguments are pushed right-to-left (RTL), meaning next arguments are closer to current rsp.
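
The register assignment above can be sketched as a simple allocator. This is a simplified model that only handles scalar integer/pointer and floating point arguments (no structs, no varargs); assign_args and the "stack[N]" labels are made up, with stack[0] standing for the first stack-passed argument.

```python
from typing import List

INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
FP_REGS = [f"xmm{i}" for i in range(8)]


def assign_args(kinds: List[str]) -> List[str]:
    """Map argument kinds ('int' or 'fp') to their location, left to right."""
    int_idx = fp_idx = stack_idx = 0
    locations = []
    for kind in kinds:
        if kind == "int" and int_idx < len(INT_REGS):
            locations.append(INT_REGS[int_idx])
            int_idx += 1
        elif kind == "fp" and fp_idx < len(FP_REGS):
            locations.append(FP_REGS[fp_idx])
            fp_idx += 1
        else:
            # Registers of this class are exhausted: pass on the stack.
            locations.append(f"stack[{stack_idx}]")
            stack_idx += 1
    return locations


print(assign_args(["int", "fp", "int"]))
print(assign_args(["int"] * 7))
```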

Return values from functions

  • Integer/Pointer return values
    reg          size
    -----------------
    rax        64 bit
    rax+rdx   128 bit
    
  • Floating point return values:
    reg            size
    -------------------
    xmm0         64 bit
    xmm0+xmm1   128 bit
    

Caller saved registers

Caller must save these registers if they should be preserved across function calls.

  • rax
  • rcx
  • rdx
  • rsi
  • rdi
  • r8 - r11

Callee saved registers

Caller can expect these registers to be preserved across function calls. Callee must save these registers in case they are used.

  • rbx
  • rbp
  • rsp
  • r12 - r15

Stack

  • grows downwards
  • frames aligned on 16 byte boundary
    Hi ADDR
     |                +------------+
     |                | prev frame |
     |                +------------+ <--- 16 byte aligned (X & ~0xf)
     |       [rbp+8]  | saved RIP  |
     |       [rbp]    | saved RBP  |
     |       [rbp-8]  | func stack |
     |                | ...        |
     v                +------------+
    Lo ADDR
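
The alignment expression X & ~0xf from the diagram rounds an address down to the next 16 byte boundary. A quick check in Python; the address is an arbitrary example value.

```python
def align_down(addr: int, alignment: int = 16) -> int:
    """Round `addr` down to a multiple of `alignment` (must be a power of two)."""
    return addr & ~(alignment - 1)


print(hex(align_down(0x7ffd1238)))
```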
    

Function prologue & epilogue

  • prologue
    push rbp        // save caller base pointer
    mov rbp, rsp    // save caller stack pointer
    
  • epilogue
    mov rsp, rbp    // restore caller stack pointer
    pop rbp         // restore caller base pointer
    

    Equivalent to leave instruction.

ASM skeleton

Small assembler skeleton, ready to use with following properties:

  • use raw Linux syscalls (man 2 syscall for ABI)
  • no C runtime (crt)
  • gnu assembler gas
  • intel syntax
# file: greet.s

    .intel_syntax noprefix

    .section .text, "ax", @progbits
    .global _start
_start:
    mov rdi, 1                      # fd
    lea rsi, [rip + greeting]       # buf
    mov edx, [rip + greeting_len]   # count (32 bit load, upper half of rdx zeroed)
    mov rax, 1                      # write(2) syscall nr
    syscall

    mov rdi, 0                      # exit code
    mov rax, 60                     # exit(2) syscall nr
    syscall

    .section .rodata, "a", @progbits
greeting:
    .asciz "Hi ASM-World!\n"
greeting_len:
    .int .-greeting

Syscall numbers are defined in /usr/include/asm/unistd.h.

To compile and run:

> gcc -o greet greet.s -nostartfiles -nostdlib && ./greet
Hi ASM-World!
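
For comparison, the write(2) part of the skeleton can be expressed in C through the syscall(2) wrapper from libc. This is only a sketch to relate the registers to the syscall arguments (rax = syscall nr, rdi = fd, rsi = buf, rdx = count). The function name greet is made up, it assumes Linux with glibc, and unlike the skeleton it uses the C runtime.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>  // SYS_write
#include <unistd.h>       // syscall(2)

// Mirrors the first syscall of greet.s: write(fd=1, buf=greeting, count).
long greet(void) {
    static const char greeting[] = "Hi ASM-World!\n";
    // sizeof includes the terminating NUL byte, which is not written here.
    const long count = (long)(sizeof(greeting) - 1);
    long written = syscall(SYS_write, 1, greeting, count);
    assert(written == count);  // short writes are not handled in this sketch
    return written;
}
```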

arm64

keywords: arm64, aarch64, abi

  • 64bit synonyms: arm64, aarch64
  • ISA type: RISC
  • Endianness: little, big

Registers

General purpose registers

bytes
[7:0]     [3:0]     desc
---------------------------------------------
x0-x28    w0-w28    general purpose registers
x29       w29       frame pointer (FP)
x30       w30       link register (LR)
sp        wsp       stack pointer (SP)
pc                  program counter (PC)
xzr       wzr       zero register

Writing to a wN register clears the upper 32 bits of the corresponding xN register.

Special registers per EL

bytes
[7:0]       desc
---------------------------------------------
sp_el0      stack pointer EL0

sp_el1      stack pointer EL1
elr_el1     exception link register EL1
spsr_el1    saved program status register EL1

sp_el2      stack pointer EL2
elr_el2     exception link register EL2
spsr_el2    saved program status register EL2

sp_el3      stack pointer EL3
elr_el3     exception link register EL3
spsr_el3    saved program status register EL3

Instructions cheatsheet

Accessing system registers

Reading from system registers:

mrs x0, vbar_el1      // move vbar_el1 into x0

Writing to system registers:

msr vbar_el1, x0      // move x0 into vbar_el1

Control Flow

b <offset>    // relative forward/back branch
br <Xn>       // absolute branch to address in register Xn

// branch & link, store return address in X30 (LR)
bl <offset>   // relative forward/back branch
blr <Xn>      // absolute branch to address in register Xn

ret {Xn}      // return to address in X30, or Xn if supplied

Addressing

Offset

ldr x0, [x1]                // x0 = [x1]
ldr x0, [x1, 8]             // x0 = [x1 + 8]
ldr x0, [x1, x2, lsl #3]    // x0 = [x1 + (x2<<3)]
ldr x0, [x1, w2, sxtw]      // x0 = [x1 + sign_ext(w2)]
ldr x0, [x1, w2, sxtw #3]   // x0 = [x1 + (sign_ext(w2)<<3)]

The shift amount can be either 0 or log2(access_size_bytes). E.g. for an 8 byte access it can be either 0 or 3.
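
The effective address computation of the register-offset forms can be modelled in C. A minimal sketch for 8 byte accesses; the function names are made up for this example and plain integers stand in for register values.

```c
#include <assert.h>
#include <stdint.h>

// ldr x0, [x1, x2, lsl #shift]: address = x1 + (x2 << shift)
uint64_t ea_lsl(uint64_t base, uint64_t index, unsigned shift) {
    assert(shift == 0 || shift == 3);  // valid shifts for an 8 byte access
    return base + (index << shift);
}

// ldr x0, [x1, w2, sxtw #shift]: address = x1 + (sign_ext(w2) << shift)
uint64_t ea_sxtw(uint64_t base, uint32_t index, unsigned shift) {
    assert(shift == 0 || shift == 3);
    // Sign-extend the 32 bit index to 64 bit, then shift as unsigned to
    // avoid shifting a negative value.
    uint64_t ext = (uint64_t)(int64_t)(int32_t)index;
    return base + (ext << shift);
}
```

With w2 = 0xffffffff (-1 as a signed 32 bit value) and a shift of 3, the sign-extended form addresses the element 8 bytes below the base.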

Index

ldr x0, [x1, 8]!    // pre-inc : x1+=8; x0 = [x1]
ldr x0, [x1], 8     // post-inc: x0 = [x1]; x1+=8
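
For 8 byte elements the two index forms behave like the C expressions *++p and *p++. A small sketch with made-up helper names, where a pointer plays the role of x1:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// ldr x0, [x1, 8]!  pre-index: update the base first, then load (*++p).
uint64_t load_pre_index(const uint64_t **p) {
    assert(p != NULL && *p != NULL);
    *p += 1;  // advances by 8 bytes
    return **p;
}

// ldr x0, [x1], 8   post-index: load first, then update the base (*p++).
uint64_t load_post_index(const uint64_t **p) {
    assert(p != NULL && *p != NULL);
    uint64_t val = **p;
    *p += 1;  // advances by 8 bytes
    return val;
}
```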

Pair access

ldp x1, x2, [x0]    // x1 = [x0]; x2 = [x0 + 8]
stp x1, x2, [x0]    // [x0] = x1; [x0 + 8] = x2

Procedure Call Standard ARM64 (aapcs64)

Passing arguments to functions

  • Integer/Pointer arguments
    reg     arg
    -----------
    x0        1
    ..       ..
    x7        8
    
  • Additional arguments are passed on the stack. Arguments are pushed right-to-left (RTL), so the first stack argument ends up closest to the current sp.
    void take(..., int a9, int a10);
                       |       |   | ... |       Hi
                       |       +-->| a10 |       |
                       +---------->| a9  | <-SP  |
                                   +-----+       v
                                   | ... |       Lo
    

Return values from functions

  • Integer/Pointer return values
    reg          size
    -----------------
    x0         64 bit
    

Callee saved registers

  • x19 - x28
  • SP

Stack

  • full descending
    • full: sp points to the last used location (valid item)
    • descending: stack grows downwards
  • sp must be 16byte aligned when used to access memory for r/w
  • sp must be 16byte aligned at public interfaces

Frame chain

  • linked list of stack-frames
  • each frame links to the frame of its caller by a frame record
    • a frame record is described as a (FP,LR) pair
  • x29 (FP) must point to the frame record of the current stack-frame
          +------+                   Hi
          |   0  |     frame0        |
       +->|   0  |                   |
       |  |  ... |                   |
       |  +------+                   |
       |  |  LR  |     frame1        |
       +--|  FP  |<-+                |
          | ...  |  |                |
          +------+  |                |
          |  LR  |  |  current       |
    x29 ->|  FP  |--+  frame         v
          | ...  |                   Lo
    
  • the end of the frame chain is indicated by the frame record (0,-)
  • location of the frame record in the stack frame is not specified
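
Walking the frame chain amounts to following a singly linked list. The sketch below operates on simulated frame records instead of a real stack; the struct and function names are made up for this example. On a real target the walk would start at the record x29 points to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Frame record as described above: a (FP, LR) pair.
struct frame_record {
    const struct frame_record *fp;  // frame record of the caller
    uintptr_t lr;                   // return address into the caller
};

// Collect the return addresses along the chain, innermost frame first.
// Returns the number of entries written to `out` (at most `max`).
size_t walk_frame_chain(const struct frame_record *rec, uintptr_t *out,
                        size_t max) {
    assert(out != NULL || max == 0);
    size_t n = 0;
    // The record with fp == 0 terminates the chain, its lr is not meaningful.
    while (rec != NULL && rec->fp != NULL && n < max) {
        out[n++] = rec->lr;
        rec = rec->fp;
    }
    return n;
}
```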

Function prologue & epilogue

  • prologue
    sub sp, sp, 16
    stp x29, x30, [sp]      // [sp] = x29; [sp + 8] = x30
    mov x29, sp             // FP points to frame record
    
  • epilogue
    ldp x29, x30, [sp]      // x29 = [sp]; x30 = [sp + 8]
    add sp, sp, 16
    ret
    

ASM skeleton

Small assembler skeleton, ready to use, with the following properties:

  • use raw Linux syscalls (man 2 syscall for ABI)
  • no C runtime (crt)
  • gnu assembler gas
// file: greet.S

#include <asm/unistd.h>      // syscall NRs

    .arch armv8-a

    .section .text, "ax", @progbits
    .balign 4                // align code on 4byte boundary
    .global _start
_start:
    mov x0, 2                // fd
    ldr x1, =greeting        // buf
    ldr x2, =greeting_len    // &len
    ldr w2, [x2]             // len (32 bit load, upper half of x2 zeroed)
    mov w8, __NR_write       // write(2) syscall
    svc 0

    mov x0, 0                // exit code
    mov w8, __NR_exit        // exit(2) syscall
    svc 0

    .section .rodata, "a", @progbits
    .balign 8                // align data on 8byte boundary
greeting:
    .asciz "Hi ASM-World!\n"
greeting_len:
    .int .-greeting

man gcc: file.S assembler code that must be preprocessed.

To cross-compile and run:

> aarch64-linux-gnu-gcc -o greet greet.S -nostartfiles -nostdlib          \
    -Wl,--dynamic-linker=/usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1 \
  && qemu-aarch64 ./greet
Hi ASM-World!

Cross-compiling on Ubuntu 20.04 (x86_64), paths might differ on other distributions. Explicitly specifying the dynamic linker should not be required when compiling natively on arm64.

armv7a

keywords: arm, armv7, abi

  • ISA type: RISC
  • Endianness: little, big

Registers

General purpose registers

bytes
[3:0]     alt     desc
---------------------------------------------
r0-r12            general purpose registers
r11       fp      frame pointer
r13       sp      stack pointer
r14       lr      link register
r15       pc      program counter

Special registers

bytes
[3:0]             desc
---------------------------------------------
cpsr              current program status register

CPSR register

cpsr
bits  desc
-----------------------------
 [31]  N negative flag
 [30]  Z zero flag
 [29]  C carry flag
 [28]  V overflow flag
 [27]  Q cumulative saturation (sticky)
  [9]  E load/store endianness
  [8]  A disable asynchronous aborts
  [7]  I disable IRQ
  [6]  F disable FIQ
  [5]  T indicate Thumb state
[4:0]  M processor mode (USR, FIQ, IRQ, SVC, ABT, UND, SYS)
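
The bit layout above can be decoded with a few shifts and masks. A minimal C sketch; the struct and function names are made up for this example, and mode_name only covers the modes listed in the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cpsr_fields {
    bool n, z, c, v, q;  // condition and saturation flags
    bool e, a, i, f, t;  // endianness, abort/IRQ/FIQ masks, Thumb state
    uint8_t mode;        // M[4:0]
};

static bool bit(uint32_t val, unsigned pos) { return ((val >> pos) & 1u) != 0; }

struct cpsr_fields decode_cpsr(uint32_t cpsr) {
    struct cpsr_fields f = {
        .n = bit(cpsr, 31), .z = bit(cpsr, 30), .c = bit(cpsr, 29),
        .v = bit(cpsr, 28), .q = bit(cpsr, 27), .e = bit(cpsr, 9),
        .a = bit(cpsr, 8),  .i = bit(cpsr, 7),  .f = bit(cpsr, 6),
        .t = bit(cpsr, 5),  .mode = (uint8_t)(cpsr & 0x1f),
    };
    assert(f.mode <= 0x1f);  // mode is a 5 bit field
    return f;
}

const char *mode_name(uint8_t mode) {
    switch (mode) {
        case 0x10: return "USR";
        case 0x11: return "FIQ";
        case 0x12: return "IRQ";
        case 0x13: return "SVC";
        case 0x17: return "ABT";
        case 0x1b: return "UND";
        case 0x1f: return "SYS";
        default:   return "?";
    }
}
```

For example 0x600001d3 decodes to Z and C set, A/I/F masked, ARM state and mode SVC.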

Instructions cheatsheet

Accessing system registers

Reading from system registers:

mrs r0, cpsr      // move cpsr into r0

Writing to system registers:

msr cpsr, r0      // move r0 into cpsr

Control Flow

b <label>     // relative forward/back branch
bl <label>    // relative forward/back branch & link return addr in r14 (LR)

// branch & exchange (can change between ARM & Thumb instruction set)
//   bit Rm[0] == 0 -> ARM
//   bit Rm[0] == 1 -> Thumb
bx <Rm>       // absolute branch to address in register Rm
blx <Rm>      // absolute branch to address in register Rm &
              // link return addr in r14 (LR)

Load/Store

Different addressing modes.

ldr r1, [r0]                // r1 = [r0]
ldr r1, [r0, #4]            // r1 = [r0+4]

ldr r1, [r0, #4]!           // pre-inc : r0+=4; r1 = [r0]
ldr r1, [r0], #4            // post-inc: r1 = [r0]; r0+=4

ldr r0, [r1, r2, lsl #3]    // r0 = [r1 + (r2<<3)]

Load/store multiple registers full-descending.

stmfd r0!, {r1-r2, r5}    // r0-=4; [r0]=r5
                          // r0-=4; [r0]=r2
                          // r0-=4; [r0]=r1
ldmfd r0!, {r1-r2, r5}    // r1=[r0]; r0+=4
                          // r2=[r0]; r0+=4
                          // r5=[r0]; r0+=4

! is optional; when present, the base register (r0 here) is updated with the final address (write-back).
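
The full-descending store/load order can be modelled in C with a pointer acting as base register. A minimal sketch with made-up function names; regs holds the register values in ascending register order and the returned pointer is the written-back base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// stmfd base!, {regs}: decrement before each store, highest register first,
// so the lowest-numbered register ends up at the lowest address.
uint32_t *stmfd(uint32_t *base, const uint32_t *regs, size_t n) {
    assert(base != NULL && regs != NULL);
    for (size_t i = n; i > 0; --i) {
        *--base = regs[i - 1];
    }
    return base;
}

// ldmfd base!, {regs}: load lowest register first, increment after each load.
const uint32_t *ldmfd(const uint32_t *base, uint32_t *regs, size_t n) {
    assert(base != NULL && regs != NULL);
    for (size_t i = 0; i < n; ++i) {
        regs[i] = *base++;
    }
    return base;
}
```

Storing {r1, r2, r5} and loading them back restores both the register values and the original base.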

Push/Pop

push {r0-r2}    // effectively stmfd sp!, {r0-r2}
pop {r0-r2}     // effectively ldmfd sp!, {r0-r2}

Procedure Call Standard ARM (aapcs32)

Passing arguments to functions

  • integer/pointer arguments
    reg     arg
    -----------
    r0        1
    ..       ..
    r3        4
    
  • a double word (64bit) is passed in two consecutive registers, starting at an even register (eg r0+r1 or r2+r3)
  • additional arguments are passed on the stack. Arguments are pushed right-to-left (RTL), so the first stack argument ends up closest to the current sp.
    void take(..., int a5, int a6);
                       |       |   | ... |       Hi
                       |       +-->| a6  |       |
                       +---------->| a5  | <-SP  |
                                   +-----+       v
                                   | ... |       Lo
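
The core register assignment for 32bit and 64bit scalar arguments can be sketched as a small C model. This is a simplification of the procedure call standard: floating point registers, structs and variadic functions are ignored, and the function name is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

// `sizes` holds the size in bytes (4 or 8) of each argument and must have
// more than `idx` entries. Returns the index of the first core register
// (0 = r0 .. 3 = r3) used for argument `idx`, or -1 if it is passed on the
// stack.
int aapcs32_arg_reg(const unsigned *sizes, size_t idx) {
    unsigned ncrn = 0;  // next core register number
    for (size_t i = 0; i <= idx; ++i) {
        assert(sizes[i] == 4 || sizes[i] == 8);
        unsigned words = sizes[i] / 4;
        if (words == 2) {
            ncrn = (ncrn + 1) & ~1u;  // 64bit values start at an even register
        }
        int reg = -1;
        if (ncrn + words <= 4) {
            reg = (int)ncrn;
            ncrn += words;
        } else {
            ncrn = 4;  // once an argument goes to the stack, all following do
        }
        if (i == idx) return reg;
    }
    return -1;  // not reached
}
```

For void f(int a, long long b, int c) this gives a in r0, b in r2+r3 (r1 stays unused) and c on the stack.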
    

Return values from functions

  • integer/pointer return values
    reg          size
    -----------------
    r0         32 bit
    r0+r1      64 bit
    

Callee saved registers

  • r4 - r11
  • sp

Stack

  • full descending
    • full: sp points to the last used location (valid item)
    • descending: stack grows downwards
  • sp must be 4byte aligned (word boundary) at all times
  • sp must be 8byte aligned at public interfaces

Frame chain

  • not strictly required on every platform
  • linked list of stack-frames
  • each frame links to the frame of its caller by a frame record
    • a frame record is described as a (FP,LR) pair (2x32bit)
  • r11 (FP) must point to the frame record of the current stack-frame
          +------+                   Hi
          |   0  |     frame0        |
       +->|   0  |                   |
       |  |  ... |                   |
       |  +------+                   |
       |  |  LR  |     frame1        |
       +--|  FP  |<-+                |
          | ...  |  |                |
          +------+  |                |
          |  LR  |  |  current       |
    r11 ->|  FP  |--+  frame         v
          | ...  |                   Lo
    
  • the end of the frame chain is indicated by the frame record (0,-)
  • location of the frame record in the stack frame is not specified
  • r11 is not updated before the new frame record is fully constructed

Function prologue & epilogue

  • prologue
    push {fp, lr}
    mov fp, sp              // FP points to frame record
    
  • epilogue
    pop {fp, pc}            // pop LR directly into PC
    

ASM skeleton

Small assembler skeleton, ready to use, with the following properties:

  • use raw Linux syscalls (man 2 syscall for ABI)
  • no C runtime (crt)
  • gnu assembler gas
// file: greet.S

#include <asm/unistd.h>      // syscall NRs

    .arch armv7-a

    .section .text, "ax"
    .balign 4

    // Emit `arm` instructions, same as `.arm` directive.
    .code 32
    .global _start
_start:
    // Branch with link and exchange instruction set.
    blx _do_greet

    mov r0, #0               // exit code
    mov r7, #__NR_exit       // exit(2) syscall
    swi 0x0

    // Emit `thumb` instructions, same as `.thumb` directive.
    .code 16
    .thumb_func
_do_greet:
    mov r0, #2               // fd
    ldr r1, =greeting        // buf
    ldr r2, =greeting_len    // &len
    ldr r2, [r2]             // len
    mov r7, #__NR_write      // write(2) syscall
    swi 0x0

    // Branch and exchange instruction set.
    bx lr

    .section .rodata, "a"
    .balign 8                // align data on 8byte boundary
greeting:
    .asciz "Hi ASM-World!\n"
greeting_len:
    .int .-greeting

man gcc: file.S assembler code that must be preprocessed.

To cross-compile and run:

> arm-linux-gnueabi-gcc -o greet greet.S -nostartfiles -nostdlib  \
    -Wl,--dynamic-linker=/usr/arm-linux-gnueabi/lib/ld-linux.so.3 \
  && qemu-arm ./greet
Hi ASM-World!

Cross-compiling on Ubuntu 20.04 (x86_64), paths might differ on other distributions. Explicitly specifying the dynamic linker should not be required when compiling natively on arm.

riscv

keywords: rv32, rv64

  • ISA type: RISC
  • Endianness: little, big

Registers

  • riscv32 => XLEN=32
  • riscv64 => XLEN=64

General purpose registers

[XLEN-1:0]     abi name     desc
---------------------------------------------
x0             zero         zero register
x1             ra           return addr
x2             sp           stack ptr
x3             gp           global ptr
x4             tp           thread ptr
x5-x7          t0-t2        temp regs
x8-x9          s0-s1        saved regs
x10-x17        a0-a7        arg regs
x18-x27        s2-s11       saved regs
x28-x31        t3-t6        temp regs

ASM skeleton

Small assembler skeleton, ready to use, with the following properties:

  • use raw Linux syscalls (man 2 syscall for ABI)
  • no C runtime (crt)
  • gnu assembler gas
// file: greet.S

#include <asm/unistd.h>     // syscall NRs

    .section .text, "ax", @progbits
    .balign 4               // align code on 4byte boundary
    .global _start
_start:
    li a0, 2                // fd
    la a1, greeting         // buf
    lw a2, greeting_len     // len (32 bit load)
    li a7, __NR_write       // write(2) syscall
    ecall

    li a0, 42               // exit code
    li a7, __NR_exit        // exit(2) syscall
    ecall

    .section .rodata, "a", @progbits
    .balign 8               // align data on 8byte boundary
greeting:
    .asciz "Hi ASM-World!\n"
greeting_len:
    .int .-greeting

man gcc: file.S assembler code that must be preprocessed.

To cross-compile and run:

> riscv64-linux-gnu-gcc -o greet greet.S -nostartfiles -nostdlib                \
    -Wl,--dynamic-linker=/usr/riscv64-linux-gnu/lib/ld-linux-riscv64-lp64d.so.1 \
  && qemu-riscv64 ./greet
Hi ASM-World!

Cross-compiling on Ubuntu 20.04 (x86_64), paths might differ on other distributions. Explicitly specifying the dynamic linker should not be required when compiling natively on riscv.

Select dynamic linker according to abi used during compile & link.

References