Notes
A personal collection of notes and cheatsheets.
Source code is located at johannst/notes.
Tools
zsh(1)
Keybindings
Change input mode:
bindkey -v change to vi keymap
bindkey -e change to emacs keymap
Define key-mappings:
bindkey list mappings in current keymap
bindkey in-str cmd create mapping for `in-str` to `cmd`
bindkey -r in-str remove binding for `in-str`
# C-v <key> dump <key> code, which can be used in `in-str`
# zle -l list all functions for keybindings
# man zshzle(1) STANDARD WIDGETS: get description of functions
Access edit buffer in zle widget:
$BUFFER # Entire edit buffer content
$LBUFFER # Edit buffer content left to cursor
$RBUFFER # Edit buffer content right to cursor
# create zle widget which adds text right of the cursor
function add-text() {
RBUFFER="some text $RBUFFER"
}
zle -N add-text
bindkey "^p" add-text
Parameter
Default value:
# default value
echo ${foo:-defval} # defval
foo=bar
echo ${foo:-defval} # bar
Alternative value:
echo ${foo:+altval} # ''
foo=bar
echo ${foo:+altval} # altval
Check variable set, error if not set:
echo ${foo:?msg} # print `msg` and return errno `1`
foo=bar
echo ${foo:?msg} # bar
Sub-string ${var:offset:length}:
foo=abcdef
echo ${foo:1:3} # bcd
Trim prefix ${var#prefix}:
foo=bar.baz
echo ${foo#bar} # .baz
Trim suffix ${var%suffix}:
foo=bar.baz
echo ${foo%.baz} # bar
Substitute pattern ${var/pattern/replace}:
foo=aabbccbbdd
echo ${foo/bb/XX} # aaXXccbbdd
echo ${foo//bb/XX} # aaXXccXXdd
# replace prefix
echo ${foo/#bb/XX} # aabbccbbdd
echo ${foo/#aa/XX} # XXbbccbbdd
# replace suffix
echo ${foo/%bb/XX} # aabbccbbdd
echo ${foo/%dd/XX} # aabbccbbXX
Note: prefix/suffix/pattern are expanded as pathnames.
Variables
# Variable with local scope
local var=val
# Read-only variable
readonly var=val
Indexed arrays:
arr=(aa bb cc dd)
echo $arr[1] # aa
echo $arr[-1] # dd
arr+=(ee)
echo $arr[-1] # ee
echo $arr[1,3] # aa bb cc
Associative arrays:
typeset -A arr
arr[x]='aa'
arr[y]='bb'
echo $arr[x] # aa
Tied arrays:
typeset -T VEC vec=(1 2 3) '|'
echo $vec # 1 2 3
echo $VEC # 1|2|3
Unique arrays (set):
typeset -U vec=(1 2 3)
echo $vec # 1 2 3
vec+=(1 2 4)
echo $vec # 1 2 3 4
Expansion Flags
Join array to string (j:sep:):
foo=(1 2 3 4)
echo ${(j:-:)foo} # 1-2-3-4
echo ${(j:\n:)foo} # join with new lines
Split string to array (s:sep:):
foo='1-2-3-4'
bar=(${(s:-:)foo}) # capture as array
echo $bar # 1 2 3 4
echo $bar[2] # 2
Upper/Lower case string:
foo=aaBB
echo ${(L)foo} # aabb
echo ${(U)foo} # AABB
Argument parsing with zparseopts
zparseopts [-D] [-E] [-A assoc] specs
Arguments are copied into the associative array assoc according to specs.
Each spec is described by an entry as opt[:][=array].
- opt is the option without the - char. Passing -f is matched against f opt, --long is matched against -long.
- Using : means the option will take an argument.
- The optional =array specifies an alternate storage container where this option should be stored.
Documentation can be found in man zshmodules.
Example
#!/bin/zsh
function test() {
zparseopts -D -E -A opts f=flag o: -long:
echo "flag $flag"
echo "o $opts[-o]"
echo "long $opts[--long]"
echo "pos $1"
}
test -f -o OPTION --long LONG_OPT POSITIONAL
# Outputs:
# flag -f
# o OPTION
# long LONG_OPT
# pos POSITIONAL
Regular Expressions
Zsh supports regular expression matching with the binary operator =~.
The match results can be accessed via the $MATCH variable and $match indexed array:
- $MATCH contains the full match
- $match[1] contains match of the first capture group
INPUT='title foo : 1234'
REGEX='^title (.+) : ([0-9]+)$'
if [[ $INPUT =~ $REGEX ]]; then
echo "$MATCH" # title foo : 1234
echo "$match[1]" # foo
echo "$match[2]" # 1234
fi
Completion
Installation
Completion functions are provided via files and need to be placed in a location
covered by $fpath. By convention the completion files are named _<CMD>.
A completion skeleton for the command foo, stored in _foo:
#compdef _foo foo
function _foo() {
...
}
Alternatively one can install a completion function explicitly by calling compdef <FUNC> <CMD>.
Completion Variables
The following variables are available in completion functions:
$words # array with command line in words
$#words # number of words
$CURRENT # index into $words for cursor position
$words[CURRENT-1] # previous word (relative to cursor position)
Completion Functions
- _describe simple completion, just words + description
- _arguments sophisticated completion, allows specifying actions
Completion with _describe
_describe MSG COMP
- MSG simple string with header message
- COMP array of completions where each entry is "opt:description"
function _foo() {
local -a opts
opts=('bla:desc for bla' 'blu:desc for blu')
_describe 'foo-msg' opts
}
compdef _foo foo
foo <TAB><TAB>
-- foo-msg --
bla -- desc for bla
blu -- desc for blu
Completion with _arguments
_arguments SPEC [SPEC...]
where SPEC can have one of the following forms:
OPT[DESC]:MSG:ACTION
N:MSG:ACTION
Available actions
(op1 op2) list possible matches
->VAL set $state=VAL and continue, `$state` can be checked later in switch case
FUNC call func to generate matches
{STR} evaluate `STR` to generate matches
Example
Skeleton to copy/paste for writing simple completions.
Assume a program foo with the following interface:
foo -c green|red|blue -s low|high -f <file> -d <dir> -h
The completion handler could be implemented as follows in a file called _foo:
#compdef _foo foo
function _foo_color() {
local colors=()
colors+=('green:green color')
colors+=('red:red color')
colors+=('blue:blue color')
_describe "color" colors
}
function _foo() {
_arguments \
"-c[define color]:color:->s_color" \
"-s[select sound]:sound:(low high)" \
"-f[select file]:file:_files" \
"-d[select dir]:dir:_files -/" \
"-h[help]"
case $state in
s_color) _foo_color;;
esac
}
_files is a zsh builtin utility function to complete files and directories.
bash(1)
Expansion
Generator
# generate sequence from n to m
{n..m}
# generate sequence from n to m step by s
{n..m..s}
# expand cartesian product
{a,b}{c,d}
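A quick way to see what the generators produce is to echo them. The following is a small sketch, assuming bash 4 or newer for the step form; the variable names are only for illustration.

```bash
# Sequence from 1 to 5.
seq_out=$(echo {1..5})
echo "$seq_out"        # 1 2 3 4 5

# Sequence from 0 to 10 in steps of 5 (step form assumes bash >= 4).
step_out=$(echo {0..10..5})
echo "$step_out"       # 0 5 10

# Cartesian product of two lists.
prod_out=$(echo {a,b}{c,d})
echo "$prod_out"       # ac ad bc bd
```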
Parameter
# default value
bar=${foo:-some_val} # if $foo set, then bar=$foo else bar=some_val
# alternate value
bar=${foo:+bla $foo} # if $foo set, then bar="bla $foo" else bar=""
# check param set
bar=${foo:?msg} # if $foo set, then bar=$foo else exit and print msg
# indirect
FOO=foo
BAR=FOO
bar=${!BAR} # deref value of BAR -> bar=$FOO
# prefix
${foo#prefix} # remove prefix when expanding $foo
# suffix
${foo%suffix} # remove suffix when expanding $foo
# substitute
${foo/pattern/string} # replace pattern with string when expanding foo
# pattern starts with
# '/' replace all occurrences of pattern
# '#' pattern match at beginning
# '%' pattern match at end
Note: prefix/suffix/pattern are expanded as pathnames.
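The expansions above can be tried out in one go. This is only an illustrative sketch; the variable names and values are made up for the example.

```bash
unset foo
echo "${foo:-some_val}"     # some_val (foo is unset)
foo=bar.baz
echo "${foo:-some_val}"     # bar.baz
echo "${foo:+bla $foo}"     # bla bar.baz

# Indirect expansion: BAR holds the name of another variable.
FOO=foo_value
BAR=FOO
echo "${!BAR}"              # foo_value

# Prefix and suffix are patterns, so globs can be used.
echo "${foo#*.}"            # baz
echo "${foo%.*}"            # bar

# Substitution: first match, all matches, anchored at start, anchored at end.
s=aabbccbbdd
echo "${s/bb/XX}"           # aaXXccbbdd
echo "${s//bb/XX}"          # aaXXccXXdd
echo "${s/#aa/XX}"          # XXbbccbbdd
echo "${s/%dd/XX}"          # aabbccbbXX
```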
Pathname
* match any string
? match any single char
\\ match backslash
[abc] match any char of 'a' 'b' 'c'
[a-z] match any char between 'a' - 'z'
[^ab] negate, match all not 'a' 'b'
[:class:] match any char in class, available:
alnum,alpha,ascii,blank,cntrl,digit,graph,lower,
print,punct,space,upper,word,xdigit
With the extglob shell option enabled it is possible to have more powerful
patterns. In the following, pattern-list is one or more patterns separated
by the | char.
?(pattern-list) matches zero or one occurrence of the given patterns
*(pattern-list) matches zero or more occurrences of the given patterns
+(pattern-list) matches one or more occurrences of the given patterns
@(pattern-list) matches one of the given patterns
!(pattern-list) matches anything except one of the given patterns
Note: shopt -s extglob / shopt -u extglob to enable/disable the extglob option.
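Extended patterns are not limited to pathname expansion, they can also be used in case statements. The sketch below uses a hypothetical helper classify to show a few of them; note that extglob has to be enabled before the patterns are parsed.

```bash
shopt -s extglob

# Hypothetical helper: classify a name using extended patterns.
classify() {
    case $1 in
        @(*.c|*.h))   echo "c-source" ;;
        +([0-9]))     echo "number" ;;
        !(*.*))       echo "no-extension" ;;
        *)            echo "other" ;;
    esac
}

classify main.c     # c-source
classify 12345      # number
classify Makefile   # no-extension
classify notes.md   # other
```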
I/O redirection
Note: The trick with bash I/O redirection is to interpret from left-to-right.
# stdout & stderr to file
command >file 2>&1
# equivalent
command &>file
# stderr to stdout & stdout to file
command 2>&1 >file
Explanation
j>&i
Duplicate fd i to fd j, making j a copy of i. See dup2(2).
Example:
command 2>&1 >file
- duplicate fd 1 to fd 2, effectively redirecting stderr to stdout
- redirect stdout to file
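The effect of the redirection order can be observed with a small experiment. This is a sketch only; emit is a hypothetical stand-in for a command writing to both stdout and stderr.

```bash
# Hypothetical command which writes one line to stdout and one to stderr.
emit() { echo "to-stdout"; echo "to-stderr" >&2; }

tmp=$(mktemp)

# stdout -> file first, then stderr -> copy of stdout (the file).
emit >"$tmp" 2>&1
both=$(cat "$tmp")          # contains both lines

# stderr -> copy of stdout (here the capturing pipe) first, then stdout -> file.
captured=$(emit 2>&1 >"$tmp")
only_out=$(cat "$tmp")      # contains only "to-stdout"

echo "$captured"            # to-stderr
rm -f "$tmp"
```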
Argument parsing with getopts
The getopts builtin uses the following global variables:
- OPTARG, value of last option argument
- OPTIND, index of the next argument to process (user must reset)
- OPTERR, display errors if set to 1
getopts <optstring> <param> [<args>]
- <optstring> specifies the names of supported options, eg f:c
  - f: means -f option with an argument
  - c means -c option without an argument
- <param> specifies a variable name which getopts fills with the last parsed option
- <args> optionally specify argument string to parse, by default getopts parses $@
Example
#!/bin/bash
function parse_args() {
while getopts "f:c" PARAM; do
case $PARAM in
f) echo "GOT -f $OPTARG";;
c) echo "GOT -c";;
*) echo "ERR: print usage"; exit 1;;
esac
done
# user's responsibility to reset OPTIND
OPTIND=1
}
parse_args -f xxx -c
parse_args -f yyy
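A possible variant of the example above, as a sketch: a leading : in the optstring selects silent error reporting, so the script can print its own messages, and declaring OPTIND local to the function avoids the manual reset. The function name parse_args_quiet is made up for this example.

```bash
# Hypothetical variant of parse_args with silent error reporting.
function parse_args_quiet() {
    local OPTIND=1 PARAM
    while getopts ":f:c" PARAM; do
        case $PARAM in
            f)  echo "GOT -f $OPTARG";;
            c)  echo "GOT -c";;
            :)  echo "ERR: -$OPTARG needs an argument"; return 1;;
            \?) echo "ERR: unknown option -$OPTARG"; return 1;;
        esac
    done
    # Drop the parsed options, positional arguments remain.
    shift $((OPTIND - 1))
    echo "REST $*"
}

parse_args_quiet -f xxx -c pos              # GOT -f xxx, GOT -c, REST pos
parse_args_quiet -f || echo "status $?"     # ERR: -f needs an argument, status 1
parse_args_quiet -x || echo "status $?"     # ERR: unknown option -x, status 1
```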
Regular Expressions
Bash supports regular expression matching with the binary operator =~.
The match results can be accessed via the $BASH_REMATCH variable:
- ${BASH_REMATCH[0]} contains the full match
- ${BASH_REMATCH[1]} contains match of the first capture group
INPUT='title foo : 1234'
REGEX='^title (.+) : ([0-9]+)$'
if [[ $INPUT =~ $REGEX ]]; then
echo "${BASH_REMATCH[0]}" # title foo : 1234
echo "${BASH_REMATCH[1]}" # foo
echo "${BASH_REMATCH[2]}" # 1234
fi
Caution: When specifying a regex in the [[ ]] block directly, quoted parts of the pattern are matched as literal strings and not as a regex (bash 3.2 and newer). [[ $INPUT =~ "fo+" ]] will match against the literal text fo+, not against the regex fo+!
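The difference between a quoted and an unquoted pattern can be checked with a few one-liners. A minimal sketch, assuming bash 3.2 or newer without the compat31 option:

```bash
INPUT='a.c'

# Unquoted: '.' is a regex metacharacter and matches any character.
[[ 'abc' =~ a.c ]] && echo "unquoted matches abc"

# Quoted: the pattern is taken literally, '.' only matches a dot.
[[ 'abc' =~ "a.c" ]] || echo "quoted does not match abc"
[[ $INPUT =~ "a.c" ]] && echo "quoted matches a.c"

# Keeping the regex in a variable and expanding it unquoted avoids surprises.
REGEX='^a.c$'
[[ 'abc' =~ $REGEX ]] && echo "variable regex matches abc"
```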
Completion
The complete builtin is used to interact with the completion system.
complete # print currently installed completion handler
complete -F <func> <cmd> # install <func> as completion handler for <cmd>
complete -r <cmd> # uninstall completion handler for <cmd>
Variables available in completion functions:
# in
$1 # <cmd>
$2 # current word
$3 # previous word
COMP_WORDS # array with current command line words
COMP_CWORD # index into COMP_WORDS with current cursor position
# out
COMPREPLY # array with possible completions
The compgen builtin is used to generate possible matches by comparing word against words generated by option.
compgen <option> <word>
# useful options:
# -W <list> specify list of possible completions
# -d generate list with dirs
# -f generate list with files
# -u generate list with users
# -e generate list with exported variables
# compare "f" against words "foo" "foobar" "bar" and generate matches
compgen -W "foo foobar bar" "f"
# compare "hom" against file/dir names and generate matches
compgen -d -f "hom"
Example
Skeleton to copy/paste for writing simple completions.
Assume a program foo with the following interface:
foo -c green|red|blue -s low|high -f <file> -h
The completion handler could be implemented as follows:
function _foo() {
local curr=$2
local prev=$3
local opts="-c -s -f -h"
case $prev in
-c) COMPREPLY=( $(compgen -W "green red blue" -- $curr) );;
-s) COMPREPLY=( $(compgen -W "low high" -- $curr) );;
-f) COMPREPLY=( $(compgen -f -- $curr) );;
*) COMPREPLY=( $(compgen -W "$opts" -- $curr) );;
esac
}
complete -F _foo foo
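A completion handler is an ordinary function, so it can be exercised by hand without pressing TAB: pass the three arguments bash would pass and inspect COMPREPLY. The following sketch uses a reduced, hypothetical handler _foo_mini to stay self-contained.

```bash
# Hypothetical, reduced handler for a command 'foo' with only -c and -h.
function _foo_mini() {
    local curr=$2
    local prev=$3
    case $prev in
        -c) COMPREPLY=( $(compgen -W "green red blue" -- "$curr") );;
        *)  COMPREPLY=( $(compgen -W "-c -h" -- "$curr") );;
    esac
}

# Simulate 'foo -c gr<TAB>': $1=<cmd>, $2=current word, $3=previous word.
_foo_mini foo "gr" "-c"
printf '%s\n' "${COMPREPLY[@]}"     # green

# Simulate 'foo -<TAB>'.
_foo_mini foo "-" "foo"
printf '%s\n' "${COMPREPLY[@]}"     # -c and -h, one per line
```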
fish(1)
Quick Info
Fish initialization file ~/.config/fish/config.fish
Switch between different key bindings:
- fish_default_key_bindings to use default key bindings
- fish_vi_key_bindings to use vi key bindings
Variables
Available scopes
- local variable local to a block
- global variable global to shell instance
- universal variable universal to all shell instances + preserved across shell restart
Set/Unset Variables
set <name> [<values>]
-l local scope
-g global scope
-U universal scope
-e erase variable
-S show verbose info
-x export to ENV
-u unexport from ENV
Lists
In fish all variables are lists (start with index 1), but lists can't contain lists.
set foo a b c d
echo $foo[1] # a
echo $foo[-1] # d
echo $foo[2..3] # b c
echo $foo[1 3] # a c
$ can be seen as dereference operator.
set foo a; set a 1337
echo $$foo # outputs 1337
Cartesian product.
echo file.{h,cc}
# file.h file.cc
echo {a,b}{1,2}
# a1 b1 a2 b2
Special Variables (Lists)
$status # exit code of last command
$pipestatus # list of exit codes of pipe chain
$CMD_DURATION # runtime of last command in ms
*PATH
Lists ending with PATH are automatically split at : when used and joined with : when exported to the environment.
set -x BLA_PATH a:b:c:d
echo $BLA_PATH # a b c d
env | grep BLA_PATH # BLA_PATH=a:b:c:d
Command Handling
# sub-commands are not run in quotes
echo "ls output: "(ls)
I/O redirection
# 'noclobber', fail if 'log' already exists
echo foo >? log
Control Flow
if / else
if grep foo bar
# do sth
else if grep foobar bar
# do sth else
else
# do sth else
end
switch
switch (echo foo)
case 'foo*'
# do start with foo
case bar dudel
# do bar and dudel
case '*'
# do else
end
while Loop
while true
echo foo
end
for Loop
for f in (ls)
echo $f
end
Functions
Function arguments are passed via $argv
list.
function fn_foo
echo $argv
end
Autoloading
When running a command fish attempts to autoload a function. The shell looks
for <cmd>.fish in the locations defined by $fish_function_path and loads
the function lazily if found.
This is the preferred way over monolithically defining all functions in a startup script.
Helper
functions # list all functions
functions foo # describe function 'foo'
functions -e foo # erase function 'foo'
funced foo # edit function 'foo'
# '-e vim' to edit in vim
Prompt
The prompt is defined by the output of the fish_prompt function.
function fish_prompt
set -l cmd_ret
echo "> "(pwd) $cmd_ret" "
end
Use set_color to manipulate terminal colors.
Useful Builtins
# history
history search <str> # search history for <str>
history merge # merge histories from fish sessions
# list
count $var # count elements in list
# string
string split SEP STRING
Keymaps
Shift-Tab ........... tab-completion with search
Alt-Up / Alt-Down ... search history with token under the cursor
Alt-l ............... list content of dir under cursor
Alt-p ............... append '2>&1 | less;' to current cmdline
Debug
status print-stack-trace .. prints function stacktrace (can be used in scripts)
breakpoint ................ halt script execution and give a shell (C-d | exit to continue)
tmux(1)
Terminology:
- session is a collection of pseudo terminals which can have multiple windows
- window uses the entire screen and can be split into rectangular panes
- pane is a single pseudo terminal instance
Tmux cli
# Session
tmux creates new session
tmux ls list running sessions
tmux kill-session -t <s> kill running session <s>
tmux attach -t <s> [-d] attach to session <s>, detach other clients [-d]
tmux detach -s <s> detach all clients from session <s>
# Environment
tmux showenv -g show global tmux environment variables
tmux setenv -g <var> <val> set variable in global tmux env
# Misc
tmux source-file <file> source config <file>
tmux lscm list available tmux commands
tmux show -g show global tmux options
tmux display <msg> display message in tmux status line
Scripting
# Session
tmux list-sessions -F '#S' list running sessions, only IDs
# Window
tmux list-windows -F '#I' -t <s> list window IDs for session <s>
tmux selectw -t <s>:<w> select window <w> in session <s>
# Pane
tmux list-panes -F '#P' -t <s>:<w> list pane IDs for window <w> in session <s>
tmux selectp -t <s>:<w>.<p> select pane <p> in window <w> in session <s>
# Run commands
tmux send -t <s>:<w>.<p> "ls" C-m send cmds/keys to pane
tmux run -t <p> <sh-cmd> run shell command <sh-cmd> in background and report output on pane -t <p>
For example cycle through all panes in all windows in all sessions:
# bash
for s in $(tmux list-sessions -F '#S'); do
for w in $(tmux list-windows -F '#I' -t $s); do
for p in $(tmux list-panes -F '#P' -t $s:$w); do
echo $s:$w.$p
done
done
done
Bindings
prefix d detach from current session
prefix c create new window
prefix w open window list
prefix $ rename session
prefix , rename window
prefix . move current window
Following bindings are specific to my tmux.conf:
C-s prefix
# Panes
prefix s horizontal split
prefix v vertical split
prefix f toggle maximize/minimize current pane
# Movement
prefix Tab toggle between window
prefix h move to pane left
prefix j move to pane down
prefix k move to pane up
prefix l move to pane right
# Resize
prefix C-h resize pane left
prefix C-j resize pane down
prefix C-k resize pane up
prefix C-l resize pane right
# Copy/Paste
prefix C-v enter copy mode
prefix C-p paste yanked text
prefix C-b open copy-buffer list
# In Copy Mode
v enable visual mode
y yank selected text
Command mode
To enter command mode press prefix :.
Some useful commands are:
setw synchronize-panes on/off enables/disables synchronized input to all panes
list-keys -t vi-copy list keymaps for vi-copy mode
git(1)
staging
git add -p [<file>] ............ partial staging (interactive)
Remote
git remote -v .................. list remotes verbose (with URLs)
git remote show [-n] <remote> .. list info for <remote> (like remote HEAD,
remote branches, tracking mapping)
Branching
git branch [-a] ................ list available branches; -a to include
remote branches
git branch -vv ................. list branch & annotate with head sha1 &
remote tracking branch
git branch <bname> ............. create local branch with name <bname>
git branch -d <bname> .......... delete local branch with name <bname>
git checkout <bname> ........... switch to branch with name <bname>
git checkout --track <branch> .. start to locally track a remote branch
# Remote
git push -u origin <rbname> ........ push local branch to origin (or other
remote), and setup <rbname> as tracking
branch
git push origin --delete <rbname> .. delete branch <rbname> from origin (or
other remote)
Tags
git tag -a <tname> -m "descr" ........ creates an annotated tag (full object
containing tagger, date, ...)
git tag -l ........................... list available tags
git checkout tags/<tname> ............ checkout specific tag
git checkout tags/<tname> -b <bname> . checkout specific tag in a new branch
# Remote
git push origin --tags .... push local tags to origin (or other remote)
Log & Commit History
git log --oneline ......... shows log in single line per commit -> alias for
'--pretty=oneline --abbrev-commit'
git log --graph ........... text based graph of commit history
git log --decorate ........ decorate log with REFs
git log -p <file> ......... show commit history + diffs for <file>
git log --oneline <file> .. show commit history for <file> in compact format
Diff & Commit Info
git diff <commit>..<commit> [<file>] .... show changes between two arbitrary
commits. If one <commit> is omitted
it is as if HEAD were specified.
git diff -U$(wc -l <file>) <file> ....... shows complete file with diffs
instead of usual diff snippets
git diff --staged ....................... show diffs of staged files
git show --stat <commit> ................ show files changed by <commit>
git show <commit> [<file>] .............. show diffs for <commit>
Patching
git format-patch <opt> <since>/<revision range>
opt:
-N ................... use [PATCH] instead [PATCH n/m] in subject when
generating patch description (for patches spanning
multiple commits)
--start-number <n> ... start output file generation with <n> as start
number instead '1'
since specifier:
-3 .................. e.g: create a patch from last three commits
<commit hash> ....... create patch with commits starting after <commit hash>
git am <patch> ......... apply patch and create a commit for it
git apply --stat <PATCH> ... see which files the patch would change
git apply --check <PATCH> .. see if the patch can be applied cleanly
git apply <PATCH> .......... apply the patch locally without creating a commit
# eg: generate patches for each commit from initial commit on
git format-patch -N $(git rev-list --max-parents=0 HEAD)
# generate single patch file from a certain commit/ref
git format-patch <COMMIT/REF> --stdout > my-patch.patch
Resetting
git reset [opt] <ref|commit>
opt:
--mixed .................... resets index, but not working tree
--hard ..................... matches the working tree and index to that
of the tree being switched to; any changes to
tracked files in the working tree since
<commit> are lost
git reset HEAD <file> .......... remove file from staging
git reset --soft HEAD~1 ........ delete most recent commit but keep work
git reset --hard HEAD~1 ........ delete most recent commit and delete work
Submodules
git submodule add <url> [<path>] .......... add new submodule to current project
git clone --recursive <url> ............... clone project and recursively all
submodules (same as using
'git submodule update --init
--recursive' after clone)
git submodule update --init --recursive ... checkout submodules recursively
using the commit listed in the
super-project (in detached HEAD)
git submodule update --remote <submod> .... fetch & merge remote changes for
<submod>, this will pull
origin/HEAD or a branch specified
for the submodule
git diff --submodule ...................... show commits that are part of the
submodule diff
Inspection
git ls-tree [-r] <ref> .... show git tree for <ref>, -r to recursively ls sub-trees
git show <obj> ............ show <obj>
git cat-file -p <obj> ..... print content of <obj>
Revision Specifier
HEAD ........ last commit
HEAD~1 ...... last commit-1
HEAD~N ...... last commit-N (linear backwards when in tree structure, check
difference between HEAD^ and HEAD~)
git rev-list --max-parents=0 HEAD ........... first commit
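The difference between HEAD^ and HEAD~ only shows up at merge commits: ~N follows the first parent N times, while ^N selects the N-th parent of a single commit. The following sketch builds a throwaway repository to illustrate this; it assumes git is installed, and the branch names, commit messages and identity are made up for the example.

```bash
# Build a throwaway repository with one merge commit.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "you@example.com"
git config user.name "Example"
git config commit.gpgsign false
git commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)    # default branch name may differ
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q "$main"
git commit -q --allow-empty -m "main work"
git merge -q --no-ff -m "merge topic" topic

git log -1 --format=%s HEAD~1     # main work   (first parent)
git log -1 --format=%s HEAD^1     # main work   (same as HEAD~1)
git log -1 --format=%s HEAD^2     # topic work  (second parent of the merge)
git log -1 --format=%s HEAD~2     # base        (first parent of first parent)
```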
awk(1)
awk [opt] program [input]
-F <sepstr> field separator string (can be regex)
program awk program
input file or stdin if no file given
Input processing
Input is processed in two stages:
- Splitting input into a sequence of records. By default split at newline character, but can be changed via the builtin RS variable.
- Splitting a record into fields. By default strings without whitespace, but can be changed via the builtin variable FS or command line option -F.
Fields are accessed as follows:
- $0 whole record
- $1 field one
- $2 field two
- ...
Program
An awk program is composed of pairs of the form:
pattern { action }
The program is run against each record in the input stream. If a pattern
matches a record the corresponding action is executed and can access the fields.
INPUT
|
v
record ----> ∀ pattern matched
| |
v v
fields ----> run associated action
Any valid awk expr can be a pattern.
Special pattern
awk provides two special patterns, BEGIN and END, which can be used
multiple times. Actions with those patterns are executed exactly once.
- BEGIN actions are run before processing the first record
- END actions are run after processing the last record
Special variables
- RS record separator: first char is the record separator, by default newline
- FS field separator: regex to split records into fields, by default a single space, which splits at runs of whitespace
- NR number record: number of current record
- NF number fields: number of fields in the current record
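The special variables can be observed on small inputs. A short sketch, with the input strings made up for illustration:

```bash
# Two records with ':' separated fields; print record number, field count
# and the last field of each record.
out=$(printf 'a:b:c\nd:e\n' | awk -F: '{ print NR, NF, $NF }')
echo "$out"
# 1 3 c
# 2 2 e

# Change the record separator: records are now split at ';'.
recs=$(printf 'x y;z' | awk 'BEGIN { RS = ";" } { print NR ": " $1 } END { print "total " NR }')
echo "$recs"
# 1: x
# 2: z
# total 2
```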
Special statements & functions
- printf "fmt", args...
  Print format string, args are comma separated.
  - %s string
  - %d decimal
  - %x hex
  - %f float
  Width can be specified as %Ns, this reserves N chars for a string. For floats one can use %N.Mf, where N is the total number of chars including the . and the M decimal places.
- sprintf("fmt", expr, ...)
  Format the expressions according to the format string. Similar to printf, but this is a function and the return value can be assigned to a variable.
- strftime("fmt")
  Print time stamp formatted by fmt.
  - %Y full year (eg 2020)
  - %m month (01-12)
  - %d day (01-31)
  - %F alias for %Y-%m-%d
  - %H hour (00-23)
  - %M minute (00-59)
  - %S second (00-59)
  - %T alias for %H:%M:%S
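The width specifiers and the difference between printf and sprintf can be sketched as follows; the values are arbitrary and the brackets only make the padding visible.

```bash
out=$(awk 'BEGIN {
    # %5s reserves 5 chars, %6.2f reserves 6 chars in total with 2 decimals.
    printf "[%5s] [%6.2f]\n", "ab", 3.14159
    # sprintf returns the formatted string instead of printing it.
    s = sprintf("%03d-%x", 7, 255)
    print s
}')
echo "$out"
# [   ab] [  3.14]
# 007-ff
```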
Examples
Filter records
awk 'NR%2 == 0 { print $0 }' <file>
The pattern NR%2 == 0 matches every second record and the action { print $0 } prints the whole record.
Access last fields in records
echo 'a b c d e f' | awk '{ print $NF $(NF-1) }'
Access last fields with arithmetic on the NF number of fields variable.
Capture in variables
# /proc/<pid>/status
# Name: cat
# ...
# VmRSS: 516 kB
# ...
for f in /proc/*/status; do
cat $f | awk '
/^VmRSS/ { rss = $2/1024 }
/^Name/ { name = $2 }
END { printf "%16s %6d MB\n", name, rss }';
done | sort -k2 -n
We capture values from VmRSS and Name into variables and print them at the
END once processing all records is done.
Run shell command and capture output
cat /proc/1/status | awk '
/^Pid/ {
"ps --no-header -o user " $2 | getline user;
print user
}'
We build a ps command line and capture the first line of the process's output
in the user variable and then print it.
emacs(1)
help
C-h ? list available help modes
C-h e show message output (`*Messages*` buffer)
C-h f describe function
C-h v describe variable
C-h w describe which key invoke function (where-is)
C-h c <KEY> print command bound to <KEY>
C-h k <KEY> describe command bound to <KEY>
C-h b list buffer local key-bindings
<kseq> C-h list possible key-bindings with <kseq>
eg C-x C-h -> list key-bindings beginning with C-x
package manager
key fn description
------------------------------------------------
package-refresh-contents refresh package list
package-list-packages list available/installed packages
`U x` to mark packages for Upgrade & eXecute
window
key fn description
----------------------------------------------
C-x 0 delete-window kill focused window
C-x 1 delete-other-windows kill all other windows
C-x 2 split-window-below split horizontal
C-x 3 split-window-right split vertical
C-x o other-window other window (cycle)
buffer
key fn description
---------------------------------------------
C-x C-q read-only-mode toggle read-only mode for buffer
C-x k kill-buffer kill buffer
C-x s save-some-buffers save buffer
C-x C-w write-file write buffer (save as)
C-x b switch-to-buffer switch buffer
C-x C-b list-buffers buffer list
ibuffer
Builtin advanced buffer selection mode
key fn description
--------------------------------------
ibuffer enter buffer selection
h ibuffer help
o open buffer in other window
C-o open buffer in other window keep focus in ibuffer
s a sort by buffer name
s f sort by file name
s v sort by last viewed
s m sort by major mode
, cycle sorting mode
= compare buffer against file on disk (if file is dirty `*`)
/m filter by major mode
/n filter by buffer name
/f filter by file name
// remove all filter
/g create filter group
/\ remove all filter groups
isearch
key fn description
-------------------------------------------------
C-s isearch-forward search forward from current position (C-s to go to next match)
C-r isearch-backward search backwards from current position (C-r to go to next match)
C-w isearch-yank-word-or-char feed next word to current search (extend)
M-p isearch-ring-retreat previous search input
M-n isearch-ring-advance next search input
occur
key fn description
-----------------------------------
M-s o occur get matches for regexp in buffer
use during `isearch` to use current search term
C-n goto next line
C-p goto previous line
o open match in other window
C-o open match in other window keep focus in occur buffer
key fn description
---------------------------------------------------------
multi-occur-in-matching-buffers run occur in buffers matching regexp
grep
key fn description
-----------------------------------
rgrep recursive grep
find-grep run find-grep result in *grep* buffer
n/p navigate next/previous match in *grep* buffer
q quit *grep* buffer
yank/paste
key fn description
---------------------------------------------
C-<SPACE> set-mark-command set start mark to select text
M-w kill-ring-save copy selected text
C-w kill-region kill selected text
C-y yank paste selected text
M-y yank-pop cycle through kill-ring (only after paste)
register
key fn description
------------------------------------------------
C-x r s <reg> copy-to-register save region in register <reg>
C-x r i <reg> insert-register insert content of register <reg>
block/rect
key fn description
------------------------------------------------
C-x <SPC> rectangle-mark-mode activate rectangle-mark-mode
string-rectangle insert text in marked rect
mass edit
key fn description
------------------------------------------------
C-x h mark-whole-buffer mark whole buffer
delete-matching-lines delete lines matching regex
M-% query-replace search & replace
C-M-% query-replace-regexp search & replace regex
narrow
key fn description
---------------------------------------------
C-x n n narrow-to-region show only focused region (narrow)
C-x n w widen show whole buffer (wide)
org
key fn description
------------------------------------
M-up/M-down re-arrange items in same hierarchy
M-left/M-right change item hierarchy
C-RET create new item below current
C-S-RET create new TODO item below current
S-left/S-right cycle TODO states
org source
key fn description
------------------------------
<s TAB generate a source block
C-c ' edit source block (in lang specific buffer)
C-c C-c eval source block
company
key fn description
-------------------------------
C-s search through completion candidates
C-o filter completion candidates based on search term
<f1> get doc for completion candidate
M-<digit> select completion candidate
tags
To generate etags
using ctags
ctags -R -e . generate emacs tag file (important `-e`)
Navigate using tags
key fn description
-----------------------------------------------
xref-find-definitions find definition of tag
xref-find-apropos find symbols matching regexp
xref-find-references find references of tag
lisp
key fn description
------------------------------
ielm open interactive elisp shell
In lisp-interaction-mode (*scratch* buffer by default)
key fn description
--------------------------------------------------------
C-j eval-print-last-sexp evaluate & print preceding lisp expr
C-x C-e eval-last-sexp evaluate lisp expr
C-u C-x C-e eval-last-sexp evaluate & print
ido
Builtin fuzzy completion mode (eg buffer select, dired, ...).
key fn description
------------------------------------------
ido-mode toggle ido mode
<Left>/<Right> cycle through available completions
<RET> select completion
evil
key fn description
--------------------------
C-z toggle emacs/evil mode
C-^ toggle between previous and current buffer
C-p after paste cycle kill-ring back
C-n after paste cycle kill-ring forward
dired
key fn description
--------------------------
i open sub-dir in same buffer
+ create new directory
C copy file/dir
q quit
gpg(1)
gpg
-o|--output Specify output file
-a|--armor Create ascii output
-u|--local-user <name> Specify key for signing
-r|--recipient Encrypt for user
Generate new keypair
gpg --full-generate-key
List keys
gpg -k / --list-keys # public keys
gpg -K / --list-secret-keys # secret keys
Edit keys
gpg --edit-key <KEY ID>
Gives prompt to modify KEY ID, common commands:
help show help
save save & quit
list list keys and user IDs
key <N> select subkey <N>
uid <N> select user ID <N>
expire change expiration of selected key
adduid add user ID
deluid delete selected user ID
addkey add subkey
delkey delete selected subkey
Export & Import Keys
gpg --export --armor --output <KEY.PUB> <KEY ID>
gpg --import <FILE>
Search & Send keys
gpg --keyserver <SERVER> --send-keys <KEY ID>
gpg --keyserver <SERVER> --search-keys <KEY ID>
Encrypt (passphrase)
Encrypt file using passphrase and write encrypted data to <file>.gpg.
gpg --symmetric <file>
# Decrypt using passphrase
gpg -o <file> --decrypt <file>.gpg
Encrypt (public key)
Encrypt file with public key of specified recipient and write encrypted
data to <file>.gpg.
gpg --encrypt -r foo@bar.de <file>
# Decrypt at foo's side (private key required)
gpg -o <file> --decrypt <file>.gpg
Signing
Generate a signed file and write to <file>.gpg.
# Sign with private key of foo@bar.de
gpg --sign -u foo@bar.de <file>
# Verify with public key of foo@bar.de
gpg --verify <file>.gpg
# Extract content from signed file
gpg -o <file> --decrypt <file>.gpg
Without -u use first private key in list gpg -K for signing.
Files can also be signed and encrypted at once, gpg will first sign the
file and then encrypt it.
gpg --sign --encrypt -r <recipient> <file>
Signing (detached)
Generate a detached signature and write to <file>.asc.
Send <file>.asc along with <file> when distributing.
gpg --detach-sign --armor -u foo@bar.de <file>
# Verify
gpg --verify <file>.asc <file>
Without -u the first private key listed by gpg -K is used for signing.
Abbreviations
sec secret key
ssb secret subkey
pub public key
sub public subkey
Keyservers
- http://pgp.mit.edu
- http://keyserver.ubuntu.com
- hkps://pgp.mailbox.org
gdb(1)
CLI
gdb [opts] [prg [-c coredump | -p pid]]
gdb [opts] --args prg <prg-args>
opts:
-p <pid> attach to pid
-c <coredump> use <coredump>
-x <file> execute script <file> before prompt
-ex <cmd> execute command <cmd> before prompt
--tty <tty> set I/O tty for debuggee
Interactive usage
Misc
tty <tty>
Set <tty> as tty for debuggee.
Make sure nobody reads from target tty, easiest is to spawn a shell
and run following in target tty:
> while true; do sleep 1024; done
sharedlibrary [<regex>]
Load symbols of shared libs loaded by debuggee. Optionally use <regex>
to filter libs for symbol loading.
display [/FMT] <expr>
Print <expr> every time debuggee stops. Eg print next instr, see
examples below.
undisplay [<num>]
Delete display expressions either all or one referenced by <num>.
info display
List display expressions.
Breakpoints
break [-qualified] <sym> thread <tnum>
Set a breakpoint only for a specific thread.
-qualified: Treat <sym> as fully qualified symbol (quite handy to set
breakpoints on C symbols in C++ contexts)
break <sym> if <cond>
Set conditional breakpoint (see examples below).
delete [<num>]
Delete breakpoint either all or one referenced by <num>.
info break
List breakpoints.
cond <bp> <cond>
Make existing breakpoint <bp> conditional with <cond>.
tbreak
Set temporary breakpoint, will be deleted when hit.
Same syntax as `break`.
rbreak <regex>
Set breakpoints matching <regex>, where matching internally is done
on: .*<regex>.*
command [<bp_list>]
Define commands to run after breakpoint hit. If <bp_list> is not
specified attach command to last created breakpoint. Command block
terminated with 'end' token.
<bp_list>: Space separated list, eg 'command 2 5-8' to run command
for breakpoints: 2,5,6,7,8.
Inspection
info functions [<regex>]
List functions matching <regex>. List all functions if no <regex>
provided.
info variables [<regex>]
List variables matching <regex>. List all variables if no <regex>
provided.
Signal handling
info handle [<signal>]
Print how to handle <signal>. If no <signal> specified print for all
signals.
handle <signal> <action>
Configure how gdb handles <signal> sent to debuggee.
<action>:
stop/nostop Catch signal in gdb and break.
print/noprint Print message when gdb catches signal.
pass/nopass Pass signal down to debuggee.
catch signal <signal>
Create a catchpoint for <signal>.
Source file locations
dir <path>
Add <path> to the beginning of the search path for source files.
show dir
Show current search path.
set substitute-path <from> <to>
Add substitution rule checked during source file lookup.
show substitute-path
Show current substitution rules.
Configuration
set follow-fork-mode <child | parent>
Specify which process to follow when debuggee makes a fork(2)
syscall.
set pagination <on | off>
Turn on/off gdb's pagination.
set breakpoint pending <on | off | auto>
on: always set pending breakpoints.
off: error when trying to set pending breakpoints.
auto: interactively query user to set breakpoint.
set print pretty <on | off>
Turn on/off pretty printing of structures.
set logging <on | off>
Enable output logging to file (default gdb.txt).
set logging file <fname>
Change output log file to <fname>
set logging redirect <on | off>
on: only log to file.
off: log to file and tty.
User commands (macros)
gdb allows creating and documenting user commands as follows:
define <cmd>
# cmds
end
document <cmd>
# docu
end
To list all user commands or show the documentation of a single command one can use:
help user-defined
help <cmd>
Hooks
gdb allows creating two types of command hooks:
- hook- will be run before <cmd>
- hookpost- will be run after <cmd>
define hook-<cmd>
# cmds
end
define hookpost-<cmd>
# cmds
end
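As a concrete sketch of the two hook types, the following shell snippet writes a small gdb script that hooks the next command. The file name hooks.gdb, its location and the echoed messages are made up for illustration.

```shell
# Write a gdb script with a pre and a post hook for the `next` command.
# Path and messages are illustrative assumptions, not part of gdb.
hooks_file="${TMPDIR:-/tmp}/hooks.gdb"
cat > "$hooks_file" <<'EOF'
define hook-next
  echo [hook] before next\n
end
define hookpost-next
  echo [hook] after next\n
end
EOF
# Usage: gdb -x "$hooks_file" <prg>
```

Once loaded via -x, every next should print the first message before stepping and the second one afterwards.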
Examples
Automatically print next instr
Whenever the debuggee stops, automatically print the memory at the current instruction pointer ($rip on x86) and format it as an instruction (/i).
# rip - x86
display /i $rip
# step instruction, after the step the next instruction is automatically printed
si
Conditional breakpoints
Create conditional breakpoints for a function void foo(int i) in the debuggee.
# Create conditional breakpoint
b foo if i == 42
b foo # would create bp 2
# Make existing breakpoint conditional
cond 2 i == 7
Catch SIGSEGV and execute commands
This creates a catchpoint for the SIGSEGV signal and attaches the commands to it.
catch signal SIGSEGV
command
bt
c
end
Run backtrace on thread 1 (batch mode)
gdb --batch -ex 'thread 1' -ex 'bt' -p <pid>
Script gdb for automating debugging sessions
To script gdb add commands into a file and pass it to gdb via -x.
For example create run.gdb:
set pagination off
break mmap
command
info reg rdi rsi rdx
bt
c
end
#initial drop
c
This script can be used as:
gdb --batch -x ./run.gdb -p <pid>
Known Bugs
Workaround command + finish bug
When using finish inside a command block, commands after finish are not executed. To work around that bug one can create a wrapper function which calls finish.
define handler
bt
finish
info reg rax
end
command
handler
end
radare2(1)
pd <n> [@ <addr>] # print disassembly for <n> instructions
# with optional temporary seek to <addr>
flags
fs # list flag-spaces
fs <fs> # select flag-space <fs>
f # print flags of selected flag-space
help
?*~<kw> # '?*' list all commands and '~' grep for <kw>
?*~... # '..' less mode /'...' interactive search
relocation
> r2 -B <baddr> <exe> # open <exe> mapped to addr <baddr>
oob <baddr> # reopen current file at <baddr>
qemu(1)
All the examples & notes use qemu-system-x86_64 but in most cases this can be swapped with the system emulator for other architectures.
Keybindings
Graphic mode:
Ctrl+Alt+g release mouse capture from VM
Ctrl+Alt+1 switch to display of VM
Ctrl+Alt+2 switch to qemu monitor
No graphic mode:
Ctrl+a h print help
Ctrl+a x exit emulator
Ctrl+a c switch between monitor and console
VM config snippet
Following command-line gives a good starting point to assemble a VM:
qemu-system-x86_64 \
-cpu host -enable-kvm -smp 4 \
-m 8G \
-vga virtio -display sdl,gl=on \
-boot menu=on \
-cdrom <iso> \
-hda <disk> \
-device qemu-xhci,id=xhci \
-device usb-host,bus=xhci.0,vendorid=0x05e1,productid=0x0408,id=capture-card
CPU & RAM
# Emulate host CPU in guest VM, enabling all supported host features (requires KVM).
# List available CPUs `qemu-system-x86_64 -cpu help`.
-cpu host
# Enable KVM instead of software emulation.
-enable-kvm
# Configure number of guest CPUs.
-smp <N>
# Configure size of guest RAM.
-m 8G
Graphic & Display
# Use sdl window as display and enable openGL context.
-display sdl,gl=on
# Use vnc server as display (eg on display `:42` here).
-display vnc=localhost:42
# Configure virtio as 3D video graphic accelerator (requires virgl in guest).
-vga virtio
Boot Menu
# Enables boot menu to select boot device (enter with `ESC`).
-boot menu=on
Block devices
# Attach cdrom drive with iso to a VM.
-cdrom <iso>
# Attach disk drive to a VM.
-hda <disk>
# Generic way to configure & attach a drive to a VM.
-drive file=<file>,format=qcow2
Create a disk with qemu-img
To create a qcow2 disk (qemu copy-on-write) of size 10G:
qemu-img create -f qcow2 disk.qcow2 10G
The disk does not contain any partitions or a partition table.
We can format the disk from within the guest as in the following example:
# Create `gpt` partition table.
sudo parted /dev/sda mktable gpt
# Create two equally sized primary partitions.
sudo parted /dev/sda mkpart primary 0% 50%
sudo parted /dev/sda mkpart primary 50% 100%
# Create filesystem on each partition.
sudo mkfs.ext3 /dev/sda1
sudo mkfs.ext4 /dev/sda2
lsblk -f /dev/sda
NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
sda
├─sda1 ext3 ....
└─sda2 ext4 ....
USB
Host Controller
# Add XHCI USB controller to the VM (supports USB 3.0, 2.0, 1.1).
# `id=xhci` creates a usb bus named `xhci`.
-device qemu-xhci,id=xhci
USB Device
# Pass-through USB device from host identified by vendorid & productid and
# attach to usb bus `xhci.0` (defined with controller `id`).
-device usb-host,bus=xhci.0,vendorid=0x05e1,productid=0x0408
Debugging
# Open gdbstub on tcp `<port>` (`-s` shorthand for `-gdb tcp::1234`).
-gdb tcp::<port>
# Freeze guest CPU at startup and wait for debugger connection.
-S
IO redirection
# Create raw tcp server for `serial IO` and wait until a client connects
# before executing the guest.
-serial tcp:localhost:12345,server,wait
# Create telnet server for `serial IO` and wait until a client connects
# before executing the guest.
-serial telnet:localhost:12345,server,wait
# Configure redirection for the QEMU `monitor`, arguments similar to `-serial`
# above.
-monitor ...
In server mode use nowait to execute the guest without waiting for a client connection.
Network
# Redirect host tcp port `1234` to guest port `4321`.
-nic user,hostfwd=tcp:localhost:1234-:4321
Shared drives
# Attach a `virtio-9p-pci` device to the VM.
# The guest requires 9p support and can mount the shared drive as:
# mount -t 9p -o trans=virtio someName /mnt
-virtfs local,id=someName,path=<someHostPath>,mount_tag=someName,security_model=none
Tracing
# List name of all trace points.
-trace help
# Enable trace points matching pattern and optionally write trace to file.
-trace <pattern>[,file=<file>]
# Enable trace points for all events listed in the <events> file.
# File must contain one event/pattern per line.
-trace events=<events>
VM snapshots
VM snapshots require that there is at least one qcow2 disk attached to the VM (VM Snapshots).
Commands for qemu Monitor or QMP:
# List available snapshots.
info snapshots
# Create/Load/Delete snapshot with name <tag>.
savevm <tag>
loadvm <tag>
delvm <tag>
The snapshot can also be directly specified when invoking qemu as:
qemu-system-x86_64 \
-loadvm <tag> \
...
VM Migration
Online migration example:
# Start machine 1 on host ABC.
qemu-system-x86_64 -monitor stdio -cdrom <iso>
# Prepare machine 2 on host DEF as migration target.
# Listen for any connection on port 12345.
qemu-system-x86_64 -monitor stdio -incoming tcp:0.0.0.0:12345
# Start migration from the machine 1 monitor console.
(qemu) migrate tcp:DEF:12345
Save to external file example:
# Start machine 1.
qemu-system-x86_64 -monitor stdio -cdrom <iso>
# Save VM state to file.
(qemu) migrate "exec:gzip -c > vm.gz"
# Load VM from file.
qemu-system-x86_64 -monitor stdio -incoming "exec: gzip -d -c vm.gz"
The migration source machine and the migration target machine should be launched with the same parameters.
Appendix: Direct Kernel boot
Example command line to directly boot a Kernel with an initrd ramdisk.
qemu-system-x86_64 \
-cpu host \
-enable-kvm \
-kernel <dir>/arch/x86/boot/bzImage \
-append "earlyprintk=ttyS0 console=ttyS0 nokaslr init=/init debug" \
-initrd <dir>/initramfs.cpio.gz \
...
Instructions to build a minimal Kernel and initrd.
References
- QEMU USB
- QEMU IMG
- QEMU Tools
- QEMU System
- QEMU Invocation (command line args)
- QEMU Monitor
- QEMU machine protocol (QMP)
- QEMU VM Snapshots
pacman(1)
Remote package repositories
pacman -Sy refresh package database
pacman -S <pkg> install pkg
pacman -Ss <regex> search remote package database
pacman -Si <pkg> get info for pkg
pacman -Su upgrade installed packages
pacman -Sc clean local package cache
Remove packages
pacman -Rsn <pkg> uninstall package and unneeded deps + config files
Local package database
Local package database of installed packages.
pacman -Q list all installed packages
pacman -Qs <regex> search local package database
pacman -Ql <pkg> list files installed by pkg
pacman -Qo <file> query package that owns file
pacman -Qe only list explicitly installed packages
Local file database
Local file database which allows to search packages owning certain files. Also searches non installed packages, but database must be synced.
pacman -Fy refresh file database
pacman -Fl <pkg> list files in pkg (pkg need not be installed)
pacman -Fx <regex> search packages owning files matching <regex>
Hacks
Uninstall all orphaned packages (including config files) that were installed as dependencies.
pacman -Rsn $(pacman -Qdtq)
List explicitly installed packages that are not required as dependency by any package and sort by size.
pacman -Qetq | xargs pacman -Qi |
awk '/Name/ { name=$3 }
/Installed Size/ { printf "%8.2f%s %s\n", $4, $5, name }' |
sort -h
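To see what the awk program in this pipeline emits before sort -h orders it, here is a small sketch that feeds it a made-up excerpt in the pacman -Qi field layout. The package names and sizes are invented for illustration.

```shell
# Made-up excerpt in the `pacman -Qi` field layout (two packages).
sample='Name            : foo
Installed Size  : 12.50 MiB
Name            : bar
Installed Size  : 800.00 KiB'
# Same awk program as above: remember the package name, then print
# "<size><unit> <name>" once the size line is seen.
sizes=$(printf '%s\n' "$sample" |
  awk '/Name/ { name=$3 }
       /Installed Size/ { printf "%8.2f%s %s\n", $4, $5, name }')
echo "$sizes"
```

Each package ends up on a single line with the size first, which is the form sort -h needs to order the list by size.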
Resource analysis & monitor
lsof(8)
lsof
-r <s> ..... repeatedly execute command every <s> seconds
-a ......... AND selection filters instead of ORing (OR: default)
-p <pid> ... filter by <pid>
+fg ........ show file flags for file descriptors
-n ......... don't convert network addr to hostnames
-P ......... don't convert network port to service names
-i <@h[:p]>. show connections to h (hostname|ip addr) with optional port p
-s <p:s> ... in conjunction with '-i' filter for protocol <p> in state <s>
-U ......... show unix domain sockets ('@' indicates abstract sock name, see unix(7))
file flags:
R/W/RW ..... read/write/read-write
CR ......... create
AP ......... append
TR ......... truncate
-s protocols
TCP, UDP
-s states (TCP)
CLOSED, IDLE, BOUND, LISTEN, ESTABLISHED, SYN_SENT, SYN_RCDV, ESTABLISHED,
CLOSE_WAIT, FIN_WAIT1, CLOSING, LAST_ACK, FIN_WAIT_2, TIME_WAIT
-s states (UDP)
Unbound, Idle
Examples
File flags
Show open files with file flags for process:
lsof +fg -p <pid>
Open TCP connections
Show open tcp connections for $USER:
lsof -a -u $USER -i TCP
Note: -a ANDs the results. If -a is not given, all open files matching $USER and all tcp connections are listed (ORed).
Open connection to specific host
Show open connections to localhost for $USER:
lsof -a -u $USER -i @localhost
Open connection to specific port
Show open connections to port :1234 for $USER:
lsof -a -u $USER -i :1234
IPv4 TCP connections in ESTABLISHED state
lsof -i 4TCP -s TCP:ESTABLISHED
ss(8)
ss [option] [filter]
[option]
-p ..... Show process using socket
-l ..... Show sockets in listening state
-4/-6 .. Show IPv4/6 sockets
-x ..... Show unix sockets
-n ..... Show numeric ports (no resolve)
-O ..... Oneline output per socket
[filter]
dport/sport PORT .... Filter for destination/source port
dst/src ADDR ........ Filter for destination/source address
and/or .............. Logic operator
==/!= ............... Comparison operator
(EXPR) .............. Group exprs
Examples
Show all tcp IPv4 sockets connecting to port 443:
ss -4 'dport 443'
Show all tcp IPv4 sockets that don't connect to port 443 or connect to address 1.2.3.4.
ss -4 'dport != 443 or dst 1.2.3.4'
pidstat(1)
pidstat [opt] [interval] [count]
-U [user] show username instead of UID, optionally only show for user
-r memory statistics
-d I/O statistics
-h single line per process and no lines with average
Page fault and memory utilization
pidstat -r -p <pid> [interval] [count]
minor_pagefault: Happens when the page needed is already in memory but not
allocated to the faulting process, in that case the kernel
only has to create a new page-table entry pointing to the
shared physical page (not required to load a memory page from
disk).
major_pagefault: Happens when the page needed is NOT in memory, the kernel
has to create a new page-table entry and populate the
physical page (required to load a memory page from disk).
I/O statistics
pidstat -d -p <pid> [interval] [count]
pgrep(1)
pgrep [opts] <pattern>
-n only list newest matching process
-u <usr> only show matching for user <usr>
-l additionally list command
-a additionally list command + arguments
Debug newest process
For example attach gdb to newest zsh process from $USER.
gdb -p $(pgrep -n -u $USER zsh)
pmap(1)
pmap <pid>
Dump virtual memory map of process.
Compared to /proc/<pid>/maps it shows the size of the mappings.
pstack(1)
pstack <pid>
Dump stack for all threads of process.
Trace and Profile
strace(1)
strace [opts] [prg]
-f .......... follow child processes on fork(2)
-p <pid> .... attach to running process
-s <size> ... max string size, truncate if longer (default: 32)
-e <expr> ... expression for trace filtering
-o <file> ... log output into <file>
-c .......... dump syscall statistics at the end
-k .......... dump stack trace for each syscall
-P <path> ... only trace syscalls accessing path
-y .......... print paths for FDs
-tt ......... print absolute timestamp (with us precision)
-r .......... print relative timestamp
<expr>:
trace=syscall[,syscall] .... trace only syscall listed
trace=file ................. trace all syscall that take a filename as arg
trace=process .............. trace process management related syscalls
trace=signal ............... trace signal related syscalls
signal ..................... trace signals delivered to the process
Examples
Trace open(2) & socket(2) syscalls for a running process + child processes:
strace -f -e trace=open,socket -p <pid>
Trace signals delivered to a running process:
strace -e signal -e 'trace=!all' -p <pid>
ltrace(1)
ltrace [opts] [prg]
-f .......... follow child processes on fork(2)
-p <pid> .... attach to running process
-o <file> ... log output into <file>
-l <filter> . show who calls into lib matched by <filter>
-C .......... demangle
Example
List which program/libs call into libstdc++:
ltrace -l '*libstdc++*' -C -o ltrace.log ./main
perf(1)
perf list show supported hw/sw events
perf stat
-p <pid> .. show stats for running process
-I <ms> ... show stats periodically over interval <ms>
-e <ev> ... filter for events
perf top
-p <pid> .. show stats for running process
-F <hz> ... sampling frequency
-K ........ hide kernel threads
perf record
-p <pid> ............... record stats for running process
-F <hz> ................ sampling frequency
--call-graph <method> .. [fp, dwarf, lbr] method how to capture backtrace
fp : use frame-pointer, need to compile with
-fno-omit-frame-pointer
dwarf: use .cfi debug information
lbr : use hardware last branch record facility
-g ..................... short-hand for --call-graph fp
-e <ev> ................ filter for events
perf report
-n .................... annotate symbols with nr of samples
--stdio ............... report to stdio, if not present use tui mode
-g graph,0.5,caller ... show caller based call chains with value >0.5
Useful <ev>:
page-faults
minor-faults
major-faults
cpu-cycles
task-clock
Flamegraph
Flamegraph with single event trace
perf record -g -e cpu-cycles -p <pid>
perf script | FlameGraph/stackcollapse-perf.pl | FlameGraph/flamegraph.pl > cycles-flamegraph.svg
Flamegraph with multiple event traces
perf record -g -e cpu-cycles,page-faults -p <pid>
perf script --per-event-dump
# fold & generate as above
OProfile
operf -g -p <pid>
-g ...... capture call-graph information
opreport [opt] FILE
show time spent per binary image
-l ...... show time spent per symbol
-c ...... show callgraph information (see below)
-a ...... add column with time spent accumulated over child nodes
ophelp show supported hw/sw events
/usr/bin/time(1)
# statistics of process run
/usr/bin/time -v <cmd>
Binary
od(1)
od [opts] <file>
-An don't print addr info
-tx4 print hex in 4 byte chunks
-ta print as named character
-tc printable chars or backslash escape
-w4 print 4 bytes per line
-j <n> skip <n> bytes from <file> (hex if start with 0x)
-N <n> dump <n> bytes (hex if start with 0x)
ASCII to hex string
echo -n AAAABBBB | od -An -w4 -tx4
>> 41414141
>> 42424242
echo -n '\x7fELF\n' | od -tx1 -ta -tc
>> 0000000 7f 45 4c 46 0a # tx1
>> del E L F nl # ta
>> 177 E L F \n # tc
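The echo examples above rely on the shell's echo builtin interpreting backslash escapes (zsh does this by default, bash's echo does not without -e). A minimal sketch that should behave the same in any POSIX shell uses printf instead; the variable name is only for illustration.

```shell
# printf interprets the octal escape \177 (0x7f) in any POSIX shell.
# -An drops the address column, -tx1 prints single hex bytes.
elf_magic_hex=$(printf '\177ELF' | od -An -tx1)
echo "$elf_magic_hex"
```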
Extract parts of file
For example the .rodata section from an elf file. We can use readelf to get the offset into the file where the .rodata section starts.
readelf -W -S foo
>> Section Headers:
>> [Nr] Name Type Address Off Size ES Flg Lk Inf Al
>> ...
>> [15] .rodata PROGBITS 00000000004009c0 0009c0 000030 00 A 0 0 16
With the offset of -j 0x0009c0 we can dump -N 0x30 bytes from the beginning of the .rodata section as follows:
od -j 0x0009c0 -N 0x30 -tx4 -w4 foo
>> 0004700 00020001
>> 0004704 00000000
>> *
>> 0004740 00000001
>> 0004744 00000002
>> 0004750 00000003
>> 0004754 00000004
Note: Numbers starting with 0x will be interpreted as hex by od.
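The first column in the dump above looks different from the 0x9c0 passed to -j because od prints offsets in octal by default. The sketch below checks that conversion and repeats the -j/-N extraction on a generated sample file, which is only a stand-in for a real ELF file.

```shell
# od prints offsets in octal by default: 0x9c0 -> 0004700.
offset_octal=$(printf '%07o' "$((0x9c0))")
echo "$offset_octal"

# Stand-in for an ELF file: 16 known bytes in a temporary file.
sample=$(mktemp)
printf '0123456789abcdef' > "$sample"
# Skip 0x4 bytes, then dump the next 0x8 bytes as characters.
slice=$(od -An -c -j 0x4 -N 0x8 "$sample" | tr -d ' \n')
echo "$slice"
rm -f "$sample"
```

Passing -Ax makes od print hexadecimal offsets instead, which are easier to compare against readelf output.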
xxd(1)
xxd [opts]
-p dump continuous hexdump
-r convert hexdump into binary ('revert')
-e dump as little endian mode
-i output as C array
ASCII to hex stream
echo -n 'aabb' | xxd -p
>> 61616262
Hex to binary stream
echo -n '61616262' | xxd -p -r
>> aabb
ASCII to binary
echo -n '\x7fELF' | xxd -p | xxd -p -r | file -p -
>> ELF
ASCII to C array (hex encoded)
xxd -i <(echo -n '\x7fELF')
>> unsigned char _proc_self_fd_11[] = {
>> 0x7f, 0x45, 0x4c, 0x46
>> };
>> unsigned int _proc_self_fd_11_len = 4;
readelf(1)
readelf [opts] <elf>
-W|--wide wide output, don't break output at 80 chars
-h print ELF header
-S print section headers
-l print program headers + segment mapping
-d print .dynamic section (dynamic link information)
--syms print symbol tables (.symtab .dynsym)
--dyn-syms print dynamic symbol table (exported symbols for dynamic linker)
-r print relocation sections (.rel.*, .rela.*)
objdump(1)
objdump [opts] <elf>
-M intel use intel syntax
-d disassemble text section
-D disassemble all sections
-S mix disassembly with source code
-C demangle
-j <section> display info for section
--[no-]show-raw-insn [don't] show object code next to disassembly
Disassemble section
For example the .plt section:
objdump -j .plt -d <elf>
nm(1)
nm [opts] <elf>
-C demangle
-u undefined only
Development
c++filt(1)
Demangle symbol
c++filt <symbol_str>
Demangle stream
For example dynamic symbol table:
readelf -W --dyn-syms <elf> | c++filt
c++
Type deduction
Force a compile error to see what auto is deduced to.
auto foo = bar();
// force compile error
typename decltype(foo)::_;
glibc
malloc tracer mtrace(3)
Trace memory allocation and de-allocation to detect memory leaks.
Need to call mtrace(3) to install the tracing hooks.
If we can't modify the binary to call mtrace we can create a small shared library and pre-load it.
// libmtrace.c
#include <mcheck.h>
__attribute__((constructor)) static void init_mtrace() { mtrace(); }
Compile as:
gcc -shared -fPIC -o libmtrace.so libmtrace.c
To generate the trace file run:
export MALLOC_TRACE=<file>
LD_PRELOAD=./libmtrace.so <binary>
Note: If MALLOC_TRACE is not set mtrace won't install tracing hooks.
To get the results of the trace file:
mtrace <binary> $MALLOC_TRACE
malloc check mallopt(3)
Configure action when glibc detects memory error.
export MALLOC_CHECK_=<N>
Useful values:
1 print detailed error & continue
3 print detailed error + stack trace + memory mappings & abort
7 print simple error message + stack trace + memory mappings & abort
gcc(1)
CLI
Preprocessing
While debugging it can be helpful to only pre-process files.
gcc -E [-dM] ...
- -E run only preprocessor
- -dM list only #define statements
Target options
# List all target options with their description.
gcc --help=target
# Configure for current cpu arch and query (-Q) value of options.
gcc -march=native -Q --help=target
Builtins
__builtin_expect(expr, cond)
Give the compiler a hint which branch is hot, so it can lay out the code accordingly to reduce number of jump instructions. See on compiler explorer.
echo "
extern void foo();
extern void bar();
void run0(int x) {
if (__builtin_expect(x,0)) { foo(); }
else { bar(); }
}
void run1(int x) {
if (__builtin_expect(x,1)) { foo(); }
else { bar(); }
}
" | gcc -O2 -S -masm=intel -o /dev/stdout -xc -
Will generate something similar to the following.
- run0: bar is on the path without branch
- run1: foo is on the path without branch
run0:
test edi, edi
jne .L4
xor eax, eax
jmp bar
.L4:
xor eax, eax
jmp foo
run1:
test edi, edi
je .L6
xor eax, eax
jmp foo
.L6:
xor eax, eax
jmp bar
ABI (Linux)
- C ABI - SystemV ABI
- C++ ABI - C++ Itanium ABI
make(1)
Anatomy of make rules
target .. : prerequisite ..
recipe
..
- target: an output generated by the rule
- prerequisite: an input that is used to generate the target
- recipe: list of actions to generate the output from the input
Use make -p to print all rules and variables (implicitly + explicitly defined).
Pattern rules & Automatic variables
Pattern rules
A pattern rule contains the % char (exactly one of them) and looks like this example:
%.o : %.c
$(CC) -c $(CFLAGS) $(CPPFLAGS) $< -o $@
The target matches files of the pattern %.o, where % matches any non-empty substring and other characters match just themselves.
The substring matched by % is called the stem.
% in the prerequisite stands for the matched stem in the target.
Automatic variables
As targets and prerequisites in pattern rules can't be spelled explicitly in the recipe, make provides a set of automatic variables to work with:
- $@: Name of the target that triggered the rule.
- $<: Name of the first prerequisite.
- $^: Names of all prerequisites (without duplicates).
- $+: Names of all prerequisites (with duplicates).
- $*: Stem of the pattern rule.
# file: Makefile
all: foobar blabla
foo% bla%: aaa bbb bbb
@echo "@ = $@"
@echo "< = $<"
@echo "^ = $^"
@echo "+ = $+"
@echo "* = $*"
@echo "----"
aaa:
bbb:
Running the above Makefile gives:
@ = foobar
< = aaa
^ = aaa bbb
+ = aaa bbb bbb
* = bar
----
@ = blabla
< = aaa
^ = aaa bbb
+ = aaa bbb bbb
* = bla
----
Variables related to filesystem paths:
- $(CURDIR): Path of current working dir after using make -C path
Useful functions
Substitution references
Substitute strings matching pattern in a list.
in := a.o l.a c.o
out := $(in:.o=.c)
# => out = a.c l.a c.c
filter
Keep strings matching a pattern in a list.
in := a.a b.b c.c d.d
out := $(filter %.b %.c, $(in))
# => out = b.b c.c
filter-out
Remove strings matching a pattern from a list.
in := a.a b.b c.c d.d
out := $(filter-out %.b %.c, $(in))
# => out = a.a d.d
abspath
Resolve each file name as absolute path (don't resolve symlinks).
$(abspath fname1 fname2 ..)
realpath
Resolve each file name as canonical path.
$(realpath fname1 fname2 ..)
ld.so(8)
Environment Variables
LD_PRELOAD=<l_so> colon separated list of libso's to be pre loaded
LD_DEBUG=<opts> comma separated list of debug options
=help list available options
=libs show library search path
=files processing of input files
=symbols show search path for symbol lookup
=bindings show against which definition a symbol is bound
LD_PRELOAD: Initialization Order and Link Map
Libraries specified in LD_PRELOAD are loaded from left-to-right but initialized from right-to-left.
> ldd ./main
>> libc.so.6 => /usr/lib/libc.so.6
> LD_PRELOAD=liba.so:libb.so ./main
-->
preloaded in this order
<--
initialized in this order
The preload order determines:
- the order libraries are inserted into the link map
- the initialization order for libraries
For the example listed above the resulting link map will look like the following:
+------+ +------+ +------+ +------+
| main | -> | liba | -> | libb | -> | libc |
+------+ +------+ +------+ +------+
This can be seen when running with LD_DEBUG=files:
> LD_DEBUG=files LD_PRELOAD=liba.so:libb.so ./main
# load order (-> determines link map)
>> file=liba.so [0]; generating link map
>> file=libb.so [0]; generating link map
>> file=libc.so.6 [0]; generating link map
# init order
>> calling init: /usr/lib/libc.so.6
>> calling init: <path>/libb.so
>> calling init: <path>/liba.so
>> initialize program: ./main
To verify the link map order we let ld.so resolve the memcpy(3) libc symbol (used in main) dynamically, while enabling LD_DEBUG=symbols,bindings to see the resolving in action.
> LD_DEBUG=symbols,bindings LD_PRELOAD=liba.so:libb.so ./main
>> symbol=memcpy; lookup in file=./main [0]
>> symbol=memcpy; lookup in file=<path>/liba.so [0]
>> symbol=memcpy; lookup in file=<path>/libb.so [0]
>> symbol=memcpy; lookup in file=/usr/lib/libc.so.6 [0]
>> binding file ./main [0] to /usr/lib/libc.so.6 [0]: normal symbol `memcpy' [GLIBC_2.14]
Dynamic Linking (x86_64)
Dynamic linking basically works via one indirect jump. It uses a combination of function trampolines (.plt section) and a function pointer table (.got.plt section).
On the first call the trampoline sets up some metadata and then jumps to the ld.so runtime resolve function, which in turn patches the table with the correct function pointer.
.plt ....... procedure linkage table, contains function trampolines, usually
located in code segment (rx permission)
.got.plt ... global offset table for .plt, holds the function pointer table
Using radare2 we can analyze this in more detail:
[0x00401040]> pd 4 @ section..got.plt
;-- section..got.plt:
;-- .got.plt: ; [22] -rw- section size 32 named .got.plt
;-- _GLOBAL_OFFSET_TABLE_:
[0] 0x00404000 .qword 0x0000000000403e10 ; section..dynamic
[1] 0x00404008 .qword 0x0000000000000000
; CODE XREF from section..plt @ +0x6
[2] 0x00404010 .qword 0x0000000000000000
;-- reloc.puts:
; CODE XREF from sym.imp.puts @ 0x401030
[3] 0x00404018 .qword 0x0000000000401036 ; RELOC 64 puts
[0x00401040]> pd 6 @ section..plt
;-- section..plt:
;-- .plt: ; [12] -r-x section size 32 named .plt
┌─> 0x00401020 ff35e22f0000 push qword [0x00404008]
╎ 0x00401026 ff25e42f0000 jmp qword [0x00404010]
╎ 0x0040102c 0f1f4000 nop dword [rax]
┌ 6: int sym.imp.puts (const char *s);
└ ╎ 0x00401030 ff25e22f0000 jmp qword [reloc.puts]
╎ 0x00401036 6800000000 push 0
└─< 0x0040103b e9e0ffffff jmp sym..plt
- At address 0x00401030 in the .plt section we see the indirect jump for puts using the function pointer in _GLOBAL_OFFSET_TABLE_[3] (GOT).
- GOT[3] initially points to the instruction after the puts trampoline, 0x00401036.
- This pushes the relocation index 0 and then jumps to the first trampoline 0x00401020.
- The first trampoline jumps to GOT[2] which will be filled at program startup by ld.so with its resolve function.
- The ld.so resolve function fixes the relocation referenced by the relocation index pushed by the puts trampoline.
- The relocation entry at index 0 tells the resolve function which symbol to search for and where to put the function pointer:
  > readelf -r <main>
  >> Relocation section '.rela.plt' at offset 0x4b8 contains 1 entry:
  >> Offset Info Type Sym. Value Sym. Name + Addend
  >> 000000404018 000200000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
  As we can see the offset from the relocation at index 0 points to GOT[3].
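The bracketed GOT addresses in the disassembly are RIP-relative: the opcode bytes ff25 e22f0000 encode a jump through a 32-bit little-endian displacement that is added to the address of the following instruction. As a rough check, the slot used by the puts trampoline can be recomputed with shell arithmetic from the values in the dump (the variable names are only for illustration):

```shell
# `jmp qword [rip + disp32]` at 0x401030 (bytes ff 25 e2 2f 00 00) is 6 bytes long.
insn_addr=0x401030
insn_len=6
disp=0x2fe2   # displacement bytes e2 2f 00 00 read as little endian
# Target slot = address of the next instruction + displacement.
got_slot=$(printf '0x%x' "$(( $insn_addr + $insn_len + $disp ))")
echo "$got_slot"
```

The result matches the Offset column of the R_X86_64_JUMP_SLO relocation, i.e. GOT[3].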
ELF Symbol Versioning
The ELF symbol versioning mechanism allows to attach version information to symbols. This can be used to express symbol version requirements or to provide certain symbols multiple times in the same ELF file with different versions (eg for backwards compatibility).
The libpthread.so library is an example which provides the pthread_cond_wait symbol multiple times but in different versions.
With readelf the version of the symbol can be seen after the @.
> readelf -W --dyn-syms /lib/libpthread.so
Symbol table '.dynsym' contains 342 entries:
Num: Value Size Type Bind Vis Ndx Name
...
141: 0000f080 696 FUNC GLOBAL DEFAULT 16 pthread_cond_wait@@GLIBC_2.3.2
142: 00010000 111 FUNC GLOBAL DEFAULT 16 pthread_cond_wait@GLIBC_2.2.5
The @@ denotes the default symbol version which will be used during static linking against the library.
The following dump shows that the tmp program linked against -lpthread will depend on the symbol version GLIBC_2.3.2, which is the default version.
> echo "#include <pthread.h>
int main() {
return pthread_cond_wait(0,0);
}" | gcc -o tmp -xc - -lpthread;
readelf -W --dyn-syms tmp | grep pthread_cond_wait;
Symbol table '.dynsym' contains 7 entries:
Num: Value Size Type Bind Vis Ndx Name
...
2: 00000000 0 FUNC GLOBAL DEFAULT UND pthread_cond_wait@GLIBC_2.3.2 (2)
Only one symbol can be annotated as the @@ default version.
Using the --version-info flag with readelf, more details on the symbol version info compiled into the tmp ELF file can be obtained.
- The .gnu.version section contains the version definition for each symbol in the .dynsym section. pthread_cond_wait is at index 2 in the .dynsym section, the corresponding symbol version is at index 2 in the .gnu.version section.
- The .gnu.version_r section contains symbol version requirements per shared library dependency (DT_NEEDED dynamic entry).
> readelf -W --version-info --dyn-syms tmp
Symbol table '.dynsym' contains 7 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTable
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND pthread_cond_wait@GLIBC_2.3.2 (2)
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (3)
4: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
5: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable
6: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@GLIBC_2.2.5 (3)
Version symbols section '.gnu.version' contains 7 entries:
Addr: 0x0000000000000534 Offset: 0x000534 Link: 6 (.dynsym)
000: 0 (*local*) 0 (*local*) 2 (GLIBC_2.3.2) 3 (GLIBC_2.2.5)
004: 0 (*local*) 0 (*local*) 3 (GLIBC_2.2.5)
Version needs section '.gnu.version_r' contains 2 entries:
Addr: 0x0000000000000548 Offset: 0x000548 Link: 7 (.dynstr)
000000: Version: 1 File: libc.so.6 Cnt: 1
0x0010: Name: GLIBC_2.2.5 Flags: none Version: 3
0x0020: Version: 1 File: libpthread.so.0 Cnt: 1
0x0030: Name: GLIBC_2.3.2 Flags: none Version: 2
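How the three tables in the dump relate can be modelled in a few lines of Python. This is only a sketch with the values copied by hand from the dump above; the names `dynsym`, `versym`, `verneed` and `required_version` are made up for illustration and do not parse a real ELF file.

```python
# .dynsym symbol names, by index (from the dump above).
dynsym = [
    "",
    "_ITM_deregisterTMCloneTable",
    "pthread_cond_wait",
    "__libc_start_main",
    "__gmon_start__",
    "_ITM_registerTMCloneTable",
    "__cxa_finalize",
]

# .gnu.version: one version index per .dynsym entry (same index).
versym = [0, 0, 2, 3, 0, 0, 3]

# .gnu.version_r: version index -> (library, required version).
verneed = {
    2: ("libpthread.so.0", "GLIBC_2.3.2"),
    3: ("libc.so.6", "GLIBC_2.2.5"),
}


def required_version(symbol):
    """Return (library, version) required for symbol, or None if unversioned."""
    idx = dynsym.index(symbol)
    return verneed.get(versym[idx])


print(required_version("pthread_cond_wait"))
print(required_version("__gmon_start__"))
```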
The GNU dynamic linker allows inspecting the version processing at runtime by setting the LD_DEBUG environment variable accordingly.
# versions: Display version dependencies.
> LD_DEBUG=versions ./tmp
717904: checking for version `GLIBC_2.2.5' in file /usr/lib/libc.so.6 [0] required by file ./tmp [0]
717904: checking for version `GLIBC_2.3.2' in file /usr/lib/libpthread.so.0 [0] required by file ./tmp [0]
...
# symbols : Display symbol table processing.
# bindings: Display information about symbol binding.
> LD_DEBUG=symbols,bindings ./tmp
...
718123: symbol=pthread_cond_wait; lookup in file=./tmp [0]
718123: symbol=pthread_cond_wait; lookup in file=/usr/lib/libpthread.so.0 [0]
718123: binding file ./tmp [0] to /usr/lib/libpthread.so.0 [0]: normal symbol `pthread_cond_wait' [GLIBC_2.3.2]
Example: version script
The following shows an example C++ library libfoo which provides the same symbol multiple times but in different versions.
// file: libfoo.cc
#include <stdio.h>
// Bind function symbols to version nodes.
//
// ..@ -> Is the unversioned symbol.
// ..@@.. -> Is the default symbol.
__asm__(".symver func_v0,func@");
__asm__(".symver func_v1,func@LIB_V1");
__asm__(".symver func_v2,func@@LIB_V2");
extern "C" {
void func_v0() { puts("func_v0"); }
void func_v1() { puts("func_v1"); }
void func_v2() { puts("func_v2"); }
}
__asm__(".symver _Z11func_cpp_v1i,_Z8func_cppi@LIB_V1");
__asm__(".symver _Z11func_cpp_v2i,_Z8func_cppi@@LIB_V2");
void func_cpp_v1(int) { puts("func_cpp_v1"); }
void func_cpp_v2(int) { puts("func_cpp_v2"); }
void func_cpp(int) { puts("func_cpp_v2"); }
Version script for libfoo, which defines which symbols are exported from the ELF file and under which versions.
# file: libfoo.ver
LIB_V1 {
global:
func;
extern "C++" {
"func_cpp(int)";
};
local:
*;
};
LIB_V2 {
global:
func;
extern "C++" {
"func_cpp(int)";
};
} LIB_V1;
The local: section in LIB_V1 is a catch-all that matches any symbol not explicitly specified and marks it as local, so it is not exported from the ELF file.
The library libfoo can be linked with the version definitions in libfoo.ver by passing the version script to the linker with the --version-script flag.
> g++ -shared -fPIC -o libfoo.so libfoo.cc -Wl,--version-script=libfoo.ver
> readelf -W --dyn-syms libfoo.so | c++filt
Symbol table '.dynsym' contains 14 entries:
Num: Value Size Type Bind Vis Ndx Name
...
6: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LIB_V1
7: 000000000000114b 29 FUNC GLOBAL DEFAULT 13 func_cpp(int)@LIB_V1
8: 0000000000001168 29 FUNC GLOBAL DEFAULT 13 func_cpp(int)@@LIB_V2
9: 0000000000001185 29 FUNC GLOBAL DEFAULT 13 func_cpp(int)@@LIB_V1
10: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LIB_V2
11: 0000000000001109 22 FUNC GLOBAL DEFAULT 13 func
12: 000000000000111f 22 FUNC GLOBAL DEFAULT 13 func@LIB_V1
13: 0000000000001135 22 FUNC GLOBAL DEFAULT 13 func@@LIB_V2
The following program demonstrates how to make use of the different versions:
// file: main.cc
#include <dlfcn.h>
#include <assert.h>
// Links against the default symbol version in libfoo.so.
extern "C" void func();
int main() {
// Call the default version.
func();
#ifdef _GNU_SOURCE
typedef void (*fnptr)();
// Unversioned lookup.
fnptr fn_v0 = (fnptr)dlsym(RTLD_DEFAULT, "func");
// Version lookup.
fnptr fn_v1 = (fnptr)dlvsym(RTLD_DEFAULT, "func", "LIB_V1");
fnptr fn_v2 = (fnptr)dlvsym(RTLD_DEFAULT, "func", "LIB_V2");
assert(fn_v0 != 0);
assert(fn_v1 != 0);
assert(fn_v2 != 0);
fn_v0();
fn_v1();
fn_v2();
#endif
return 0;
}
Compiling and running results in:
> g++ -o main main.cc -ldl ./libfoo.so && ./main
func_v2
func_v0
func_v1
func_v2
References
- ELF Symbol Versioning
- Binutils ld: Symbol Versioning
- LSB: Symbol Versioning
- How To Write Shared Libraries
python
Decorator
Some decorator examples with type annotation.
from typing import Callable

def log(f: Callable[[int], None]) -> Callable[[int], None]:
    def inner(x: int):
        print(f"log::inner f={f.__name__} x={x}")
        f(x)
    return inner

@log
def some_fn(x: int):
    print(f"some_fn x={x}")

def log_tag(tag: str) -> Callable[[Callable[[int], None]], Callable[[int], None]]:
    def decorator(f: Callable[[int], None]) -> Callable[[int], None]:
        def inner(x: int):
            print(f"log_tag::inner f={f.__name__} tag={tag} x={x}")
            f(x)
        return inner
    return decorator

@log_tag("some_tag")
def some_fn2(x: int):
    print(f"some_fn2 x={x}")
Walrus operator
The walrus operator := is available since Python 3.8.
from typing import Optional

# Example 1: if let statements
def foo(ret: Optional[int]) -> Optional[int]:
    return ret

if r := foo(None):
    print(f"foo(None) -> {r}")
if r := foo(1337):
    print(f"foo(1337) -> {r}")

# Example 2: while let statements
toks = iter(['a', 'b', 'c'])
while tok := next(toks, None):
    print(f"{tok}")

# Example 3: list comprehension
print([tok for t in [" a", " ", " b "] if (tok := t.strip())])
Unittest
Run unittests directly from the command line as
python3 -m unittest -v test
Optionally pass -k <pattern> to only run a subset of tests.
# file: test.py
import unittest

class MyTest(unittest.TestCase):
    def setUp(self):
        pass

    def tearDown(self):
        pass

    # Tests need to start with the prefix 'test'.
    def test_foo(self):
        self.assertEqual(1 + 2, 3)

    def test_bar(self):
        with self.assertRaises(IndexError):
            list()[0]
Doctest
Run doctests directly from the command line as
python -m doctest -v test.py
# file: test.py
def sum(a: int, b: int) -> int:
    """Sum a and b.

    >>> sum(1, 2)
    3
    >>> sum(10, 20)
    30
    """
    return a + b
timeit
Micro benchmarking.
python -m timeit '[x.strip() for x in ["a ", " b"]]'
Arch
x86_64
keywords: x86_64, x86, abi
- 64bit synonyms: x86_64, x64, amd64, intel 64
- 32bit synonyms: x86, ia32, i386
- ISA type: CISC
- Endianness: little
Registers
General purpose register
bytes
[7:0] [3:0] [1:0] [1] [0] desc
----------------------------------------------------------
rax eax ax ah al accumulator
rbx ebx bx bh bl base register
rcx ecx cx ch cl counter
rdx edx dx dh dl data register
rsi esi si - sil source index
rdi edi di - dil destination index
rbp ebp bp - bpl base pointer
rsp esp sp - spl stack pointer
r8-15 rNd rNw - rNb
Special register
bytes
[7:0] [3:0] [1:0] desc
---------------------------------------------------
rflags eflags flags flags register
rip eip ip instruction pointer
FLAGS register
rflags
bits desc instr comment
--------------------------------------------------------------------------------------------------------------
[21] ID identification ability to set/clear -> indicates support for CPUID instr
[18] AC alignment check alignment exception for PL 3 (user), requires CR0.AM
[13:12] IOPL io privilege level
[11] OF overflow flag
[10] DF direction flag cld/std
[9] IF interrupt enable cli/sti
[7] SF sign flag
[6] ZF zero flag
[4] AF auxiliary carry flag
[2] PF parity flag
[0] CF carry flag
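To read a raw rflags value against the table above, the bit positions can be decoded programmatically. A small Python sketch follows; `FLAG_BITS` and `decode_rflags` are hypothetical names, and only the bits listed in the table are covered.

```python
from typing import List

# Single-bit flags from the table above: bit position -> name.
FLAG_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF",
    9: "IF", 10: "DF", 11: "OF", 18: "AC", 21: "ID",
}


def decode_rflags(value: int) -> List[str]:
    """Return the names of all set flags plus the 2-bit IOPL field."""
    names = [name for bit, name in sorted(FLAG_BITS.items()) if (value >> bit) & 1]
    iopl = (value >> 12) & 0b11  # bits [13:12]
    return names + [f"IOPL={iopl}"]


print(decode_rflags(0x246))  # ['PF', 'ZF', 'IF', 'IOPL=0']
```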
Change flag bits with pushf
/ popf
instructions:
pushfd // push flags (4 bytes) onto stack
or dword ptr [esp], (1 << 18) // enable AC flag
popfd // pop flags (4 bytes) from stack
There is also pushfq / popfq to push and pop all 8 bytes of rflags.
Model Specific Register (MSR)
rdmsr // Read MSR register, effectively does EDX:EAX <- MSR[ECX]
wrmsr // Write MSR register, effectively does MSR[ECX] <- EDX:EAX
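The EDX:EAX notation means that a 64bit MSR value is split across two 32bit registers, with EDX holding the upper half. A short Python sketch of that split and join (the helper names are made up for illustration):

```python
MASK32 = 0xFFFF_FFFF


def split_edx_eax(value: int):
    """Split a 64bit value into (edx, eax) as rdmsr would return it."""
    return (value >> 32) & MASK32, value & MASK32


def join_edx_eax(edx: int, eax: int) -> int:
    """Combine edx (upper 32 bits) and eax (lower 32 bits) as wrmsr consumes them."""
    return ((edx & MASK32) << 32) | (eax & MASK32)


edx, eax = split_edx_eax(0x1122334455667788)
print(hex(edx), hex(eax))
```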
Size directives
Explicitly specify size of the operation.
mov byte ptr [rax], 0xff // save 1 byte(s) at [rax]
mov word ptr [rax], 0xff // save 2 byte(s) at [rax]
mov dword ptr [rax], 0xff // save 4 byte(s) at [rax]
mov qword ptr [rax], 0xff // save 8 byte(s) at [rax]
Addressing
mov qword ptr [rax], rbx // save val in rbx at [rax]
mov qword ptr [imm], rbx // save val in rbx at [imm]
mov rax, qword ptr [rbx+4*rcx] // load val at [rbx+4*rcx] into rax
rip
relative addressing:
lea rax, [rip+.my_str] // load addr of .my_str into rax
...
.my_str:
.asciz "Foo"
String instructions
The operand size of a string instruction is defined by the instruction suffix b | w | d | q.
Source and destination registers are modified according to the direction flag (DF) in the flags register:
- DF=0 increment src/dest registers
- DF=1 decrement src/dest registers
Following explanation assumes byte operands with DF=0:
movsb // move data from string to string
// ES:[DI] <- DS:[SI]
// DI <- DI + 1
// SI <- SI + 1
lodsb // load string
// AL <- DS:[SI]
// SI <- SI + 1
stosb // store string
// ES:[DI] <- AL
// DI <- DI + 1
cmpsb // compare string operands
// DS:[SI] - ES:[DI] ; set status flag (eg ZF)
// SI <- SI + 1
// DI <- DI + 1
scasb // scan string
// AL - ES:[DI] ; set status flag (eg ZF)
// DI <- DI + 1
String operations can be repeated:
rep // repeat until rcx = 0
repz // repeat while ZF = 1, ie until rcx = 0 or ZF = 0
repnz // repeat while ZF = 0, ie until rcx = 0 or ZF = 1
Example: Simple memset
// memset (dest, 0xaa /* char */, 0x10 /* len */)
lea rdi, [dest] // in 64bit mode stos uses rdi as destination
mov al, 0xaa
mov rcx, 0x10 // in 64bit mode rep uses rcx as counter
rep stosb
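The effect of the repeat prefixes can be modelled in Python. This is only a sketch of the semantics for byte operands, with memory as a `bytearray` and the registers as plain integers; `rep_stosb` and `repne_scasb` are illustrative names, not real APIs.

```python
def rep_stosb(mem: bytearray, rdi: int, al: int, rcx: int, df: int = 0):
    """Model `rep stosb`: store al at [rdi] rcx times, stepping rdi by DF."""
    step = -1 if df else 1
    while rcx != 0:
        mem[rdi] = al & 0xFF
        rdi += step
        rcx -= 1
    return rdi, rcx


def repne_scasb(mem: bytes, rdi: int, al: int, rcx: int):
    """Model `repne scasb` with DF=0: scan until rcx = 0 or a byte equals al."""
    zf = 0
    while rcx != 0:
        zf = int(mem[rdi] == (al & 0xFF))  # compare sets ZF
        rdi += 1
        rcx -= 1
        if zf:  # repne stops as soon as ZF = 1
            break
    return rdi, rcx, zf


# memset(dest=0x8, 0xaa, 0x10) on a 32 byte buffer.
mem = bytearray(0x20)
rdi, rcx = rep_stosb(mem, rdi=0x8, al=0xAA, rcx=0x10)
print(hex(rdi), rcx, mem.hex())
```

With `repne_scasb(b"abc\0", 0, 0, 10)` the scan stops right after the terminating zero byte, which is the building block of a simple strlen.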
SysV x86_64 ABI
Passing arguments to functions
- Integer/Pointer arguments
  reg     arg
  -----------
  rdi       1
  rsi       2
  rdx       3
  rcx       4
  r8        5
  r9        6
- Floating point arguments
  reg     arg
  -----------
  xmm0      1
    ..     ..
  xmm7      8
- Additional arguments are passed on the stack. Arguments are pushed right-to-left (RTL), meaning next arguments are closer to current rsp.
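The register assignment above can be written down as a small Python sketch. It is a simplification that only handles scalar integer/pointer and floating point arguments, each taking one register or one 8 byte stack slot; aggregates and other ABI classification rules are ignored. `assign_args` is a hypothetical helper name.

```python
INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
FP_REGS = [f"xmm{i}" for i in range(8)]


def assign_args(kinds):
    """Map each argument kind ('int' or 'fp') to a register or stack slot.

    Stack offsets are relative to rsp right before the call instruction.
    """
    ints, fps = iter(INT_REGS), iter(FP_REGS)
    stack_off = 0
    locs = []
    for kind in kinds:
        loc = next(ints if kind == "int" else fps, None)
        if loc is None:  # registers of this class exhausted
            loc = f"stack+{stack_off}"
            stack_off += 8
        locs.append(loc)
    return locs


print(assign_args(["int", "fp", "int"]))
print(assign_args(["int"] * 8))
```

The 7th and 8th integer arguments end up at `stack+0` and `stack+8`, matching the right-to-left push order where the next argument sits closest to rsp.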
Return values from functions
- Integer/Pointer return values
  reg          size
  -----------------
  rax        64 bit
  rax+rdx   128 bit
- Floating point return values:
  reg            size
  -------------------
  xmm0         64 bit
  xmm0+xmm1   128 bit
Caller saved registers
Caller must save these registers if they should be preserved across function calls.
rax
rcx
rdx
rsi
rdi
r8-r11
Callee saved registers
Caller can expect these registers to be preserved across function calls. Callee must save these registers in case they are used.
rbx
rbp
rsp
r12-r15
Stack
- grows downwards
- frames aligned on 16 byte boundary
    Hi ADDR
     |                +------------+
     |                | prev frame |
     |                +------------+ <--- 16 byte aligned (X & ~0xf)
     |       [rbp+8]  | saved RIP  |
     |       [rbp]    | saved RBP  |
     |       [rbp-8]  | func stack |
     |                | ...        |
     v                +------------+
    Lo ADDR
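The `X & ~0xf` expression in the diagram rounds an address down to the next 16 byte boundary. A quick Python check of that arithmetic (the helper name `align_down` is made up here):

```python
def align_down(addr: int, align: int = 16) -> int:
    """Round addr down to a multiple of align (align must be a power of two)."""
    return addr & ~(align - 1)


print(hex(align_down(0x7ffc1238)))  # 0x7ffc1230
```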
Function prologue & epilogue
- prologue
  push rbp // save caller base pointer
  mov rbp, rsp // save caller stack pointer
- epilogue
  mov rsp, rbp // restore caller stack pointer
  pop rbp // restore caller base pointer
  Equivalent to the leave instruction.
ASM skeleton
Small assembler skeleton, ready to use with the following properties:
- use raw Linux syscalls (man 2 syscall for ABI)
- no C runtime (crt)
- gnu assembler gas
- intel syntax
# file: greet.s
.intel_syntax noprefix
.section .text, "ax", @progbits
.global _start
_start:
mov rdi, 1 # fd
lea rsi, [rip + greeting] # buf
mov edx, [rip + greeting_len] # count (4 byte load, matches .int; zero-extends into rdx)
mov rax, 1 # write(2) syscall nr
syscall
mov rdi, 0 # exit code
mov rax, 60 # exit(2) syscall nr
syscall
.section .rodata, "a", @progbits
greeting:
.asciz "Hi ASM-World!\n"
greeting_len:
.int .-greeting
Syscall numbers are defined in
/usr/include/asm/unistd.h
.
To compile and run:
> gcc -o greet greet.s -nostartfiles -nostdlib && ./greet
Hi ASM-World!
References
- SystemV AMD64 ABI
- AMD64 Vol1: Application Programming
- AMD64 Vol2: System Programming
- AMD64 Vol3: General-Purpose & System Instructions
- X86_64 Cheat-Sheet
- Intel 64 Vol1: Basic Architecture
- Intel 64 Vol2: Instruction Set Reference
- Intel 64 Vol3: System Programming Guide
- GNU Assembler
- GNU Assembler Directives
- GNU Assembler x86_64 dependent features
arm64
keywords: arm64, aarch64, abi
- 64bit synonyms: arm64, aarch64
- ISA type: RISC
- Endianness: little, big
Registers
General purpose registers
bytes
[7:0] [3:0] desc
---------------------------------------------
x0-x28 w0-w28 general purpose registers
x29 w29 frame pointer (FP)
x30 w30 link register (LR)
sp wsp stack pointer (SP)
pc program counter (PC)
xzr wzr zero register
Writing to a wN register clears the upper 32 bits of the corresponding xN register.
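As a tiny model of this rule in Python (a sketch only; `write_w` is an illustrative name):

```python
def write_w(xn_old: int, value: int) -> int:
    """Return the new xN after writing value to wN.

    The upper 32 bits are zeroed, so the old xN content does not matter.
    """
    return value & 0xFFFF_FFFF


print(hex(write_w(0xDEADBEEF_00000000, 0x1234)))  # 0x1234
```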
Special registers per EL
bytes
[7:0] desc
---------------------------------------------
sp_el0 stack pointer EL0
sp_el1 stack pointer EL1
elr_el1 exception link register EL1
spsr_el1 saved process status register EL1
sp_el2 stack pointer EL2
elr_el2 exception link register EL2
spsr_el2 saved process status register EL2
sp_el3 stack pointer EL3
elr_el3 exception link register EL3
spsr_el3 saved process status register EL3
Instructions cheatsheet
Accessing system registers
Reading from system registers:
mrs x0, vbar_el1 // move vbar_el1 into x0
Writing to system registers:
msr vbar_el1, x0 // move x0 into vbar_el1
Control Flow
b <offset> // relative forward/back branch
br <Xn> // absolute branch to address in register Xn
// branch & link, store return address in X30 (LR)
bl <offset> // relative forward/back branch
blr <Xn> // absolute branch to address in register Xn
ret {Xn} // return to address in X30, or Xn if supplied
Addressing
Offset
ldr x0, [x1] // x0 = [x1]
ldr x0, [x1, 8] // x0 = [x1 + 8]
ldr x0, [x1, x2, lsl #3] // x0 = [x1 + (x2<<3)]
ldr x0, [x1, w2, sxtw] // x0 = [x1 + sign_ext(w2)]
ldr x0, [x1, w2, sxtw #3] // x0 = [x1 + (sign_ext(w2)<<3)]
Shift amount can either be 0 or log2(access_size_bytes). Eg for 8byte access it can either be {0, 3}.
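The address computation for the sign-extended register offset form can be sketched in Python as follows. `sign_ext32` and `ea_sxtw` are hypothetical helpers; the check on the shift amount mirrors the rule stated above.

```python
MASK64 = (1 << 64) - 1


def sign_ext32(w: int) -> int:
    """Interpret the low 32 bits of w as a signed value."""
    w &= 0xFFFF_FFFF
    return w - (1 << 32) if w & 0x8000_0000 else w


def ea_sxtw(base: int, w_index: int, shift: int = 0, access_size: int = 8) -> int:
    """Effective address of `ldr xT, [xBase, wIndex, sxtw #shift]`."""
    allowed = {0, access_size.bit_length() - 1}  # 0 or log2(access size)
    if shift not in allowed:
        raise ValueError(f"shift must be one of {sorted(allowed)}")
    return (base + (sign_ext32(w_index) << shift)) & MASK64


print(hex(ea_sxtw(0x1000, 2, 3)))            # 0x1010
print(hex(ea_sxtw(0x1000, 0xFFFFFFFF, 3)))   # 0xff8, index is -1
```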
Index
ldr x0, [x1, 8]! // pre-inc : x1+=8; x0 = [x1]
ldr x0, [x1], 8 // post-inc: x0 = [x1]; x1+=8
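The difference between the two index forms is only the order of the base register update and the memory access. A Python sketch with memory as a dict from address to value and registers as a dict (all names are illustrative):

```python
def ldr_pre(mem, regs, rt, rn, imm):
    """Model `ldr rt, [rn, imm]!`: update base first, then load."""
    regs[rn] += imm
    regs[rt] = mem[regs[rn]]


def ldr_post(mem, regs, rt, rn, imm):
    """Model `ldr rt, [rn], imm`: load first, then update base."""
    regs[rt] = mem[regs[rn]]
    regs[rn] += imm


mem = {0x0: 10, 0x8: 20}

regs = {"x0": 0, "x1": 0}
ldr_pre(mem, regs, "x0", "x1", 8)
print(regs)  # {'x0': 20, 'x1': 8}

regs = {"x0": 0, "x1": 0}
ldr_post(mem, regs, "x0", "x1", 8)
print(regs)  # {'x0': 10, 'x1': 8}
```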
Pair access
ldp x1, x2, [x0] // x1 = [x0]; x2 = [x0 + 8]
stp x1, x2, [x0] // [x0] = x1; [x0 + 8] = x2
Procedure Call Standard ARM64 (aapcs64)
Passing arguments to functions
- Integer/Pointer arguments
  reg     arg
  -----------
  x0        1
  ..       ..
  x7        8
- Additional arguments are passed on the stack. Arguments are pushed right-to-left (RTL), meaning next arguments are closer to current sp.
  void take(..., int a9, int a10);
                     |       |       | ... |       Hi
                     |       +------>| a10 |       |
                     +-------------->| a9  | <-SP  |
                                     +-----+       v
                                     | ... |       Lo
Return values from functions
- Integer/Pointer return values
  reg          size
  -----------------
  x0         64 bit
Callee saved registers
x19-x28
SP
Stack
- full descending
  - full: sp points to the last used location (valid item)
  - descending: stack grows downwards
- sp must be 16byte aligned when used to access memory for r/w
- sp must be 16byte aligned on public interfaces
Frame chain
- linked list of stack-frames
- each frame links to the frame of its caller by a frame record
  - a frame record is described as a (FP,LR) pair
- x29 (FP) must point to the frame record of the current stack-frame
        +------+                   Hi
        |   0  |     frame0        |
     +->|   0  |                   |
     |  |  ... |                   |
     |  +------+                   |
     |  |  LR  |     frame1        |
     +--|  FP  |<-+                |
        |  ... |  |                |
        +------+  |                |
        |  LR  |  |  current       |
  x29 ->|  FP  |--+  frame         v
        |  ... |                   Lo
- end of the frame chain is indicated by following frame record (0,-)
- location of the frame record in the stack frame is not specified
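Walking the frame chain, as a debugger or unwinder would do for a backtrace, amounts to following the FP links until the terminating record is reached. The Python sketch below models memory as a dict from address to 64bit value; the addresses and the `walk_frames` helper are made up for illustration.

```python
def walk_frames(mem, fp):
    """Collect return addresses by following (FP, LR) frame records.

    A frame record at address fp holds the caller's FP at [fp] and LR at
    [fp + 8]. The chain ends at a record whose FP field is 0.
    """
    return_addrs = []
    while True:
        saved_fp, lr = mem[fp], mem[fp + 8]
        if saved_fp == 0:  # terminating record (0,-), LR is don't-care
            break
        return_addrs.append(lr)
        fp = saved_fp
    return return_addrs


mem = {
    0x3000: 0x0,    0x3008: 0x0,       # frame0: end of chain
    0x2000: 0x3000, 0x2008: 0x400100,  # frame1
    0x1000: 0x2000, 0x1008: 0x400200,  # current frame, x29 = 0x1000
}
print([hex(a) for a in walk_frames(mem, 0x1000)])
```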
Function prologue & epilogue
- prologue
  sub sp, sp, 16
  stp x29, x30, [sp] // [sp] = x29; [sp + 8] = x30
  mov x29, sp // FP points to frame record
- epilogue
  ldp x29, x30, [sp] // x29 = [sp]; x30 = [sp + 8]
  add sp, sp, 16
  ret
ASM skeleton
Small assembler skeleton, ready to use with the following properties:
- use raw Linux syscalls (man 2 syscall for ABI)
- no C runtime (crt)
- gnu assembler gas
// file: greet.S
#include <asm/unistd.h> // syscall NRs
.arch armv8-a
.section .text, "ax", @progbits
.balign 4 // align code on 4byte boundary
.global _start
_start:
mov x0, 2 // fd
ldr x1, =greeting // buf
ldr x2, =greeting_len // &len
ldr w2, [x2] // len (4 byte load, matches .int; zero-extends into x2)
mov w8, __NR_write // write(2) syscall
svc 0
mov x0, 0 // exit code
mov w8, __NR_exit // exit(2) syscall
svc 0
.section .rodata, "a", @progbits
.balign 8 // align data on 8byte boundary
greeting:
.asciz "Hi ASM-World!\n"
greeting_len:
.int .-greeting
man gcc:
file.S
assembler code that must be preprocessed.
To cross-compile and run:
> aarch64-linux-gnu-g++ -o greet greet.S -nostartfiles -nostdlib \
-Wl,--dynamic-linker=/usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1 \
&& qemu-aarch64 ./greet
Hi ASM-World!
Cross-compiling on Ubuntu 20.04 (x86_64), paths might differ on other distributions. Explicitly specifying the dynamic linker should not be required when compiling natively on arm64.
References
- Procedure Call Standard ARM64
- ARMv8-A Programmer's Guide
- ARMv8-A Architecture Reference Manual
- AppNote: ARMv8 Bare-metal boot code
- GNU Assembler
- GNU Assembler Directives
- GNU Assembler AArch64 dependent features
armv7a
keywords: arm, armv7, abi
- ISA type: RISC
- Endianness: little, big
Registers
General purpose registers
bytes
[3:0] alt desc
---------------------------------------------
r0-r12 general purpose registers
r11 fp
r13 sp stack pointer
r14 lr link register
r15 pc program counter
Special registers
bytes
[3:0] desc
---------------------------------------------
cpsr current program status register
CPSR register
cpsr
bits desc
-----------------------------
[31] N negative flag
[30] Z zero flag
[29] C carry flag
[28] V overflow flag
[27] Q cumulative saturation (sticky)
[9] E load/store endianness
[8] A disable asynchronous aborts
[7] I disable IRQ
[6] F disable FIQ
[5] T indicate Thumb state
[4:0] M processor mode (USR, FIQ, IRQ, SVC, ABT, UND, SYS)
Instructions cheatsheet
Accessing system registers
Reading from system registers:
mrs r0, cpsr // move cpsr into r0
Writing to system registers:
msr cpsr, r0 // move r0 into cpsr
Control Flow
b <label> // relative forward/back branch
bl <label> // relative forward/back branch & link return addr in r14 (LR)
// branch & exchange (can change between ARM & Thumb instruction set)
// bit Rm[0] == 0 -> ARM
// bit Rm[0] == 1 -> Thumb
bx <Rm> // absolute branch to address in register Rm
blx <Rm> // absolute branch to address in register Rm &
// link return addr in r14 (LR)
Load/Store
Different addressing modes.
ldr r1, [r0] // r1 = [r0]
ldr r1, [r0, #4] // r1 = [r0+4]
ldr r1, [r0, #4]! // pre-inc : r0+=4; r1 = [r0]
ldr r1, [r0], #4 // post-inc: r1 = [r0]; r0+=4
ldr r0, [r1, r2, lsl #3] // r0 = [r1 + (r2<<3)]
Load/store multiple registers full-descending.
stmfd r0!, {r1-r2, r5} // r0-=4; [r0]=r5
// r0-=4; [r0]=r2
// r0-=4; [r0]=r1
ldmfd r0!, {r1-r2, r5} // r1=[r0]; r0+=4
// r2=[r0]; r0+=4
// r5=[r0]; r0+=4
! is optional and has the effect of updating the base pointer register, r0 here.
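The full-descending multiple load/store can be modelled in Python to make the ordering explicit: the lowest numbered register always ends up at the lowest address. This is a sketch with memory as a dict from address to 32bit value; `stmfd` and `ldmfd` here are illustrative Python functions, not an emulator API.

```python
REG_ORDER = [f"r{i}" for i in range(16)]


def stmfd(mem, regs, base, reglist, writeback=True):
    """Model `stmfd base!, {reglist}`: decrement before each store."""
    addr = regs[base]
    # Highest register is stored first, at the highest address.
    for r in sorted(reglist, key=REG_ORDER.index, reverse=True):
        addr -= 4
        mem[addr] = regs[r]
    if writeback:  # the '!' suffix
        regs[base] = addr


def ldmfd(mem, regs, base, reglist, writeback=True):
    """Model `ldmfd base!, {reglist}`: increment after each load."""
    addr = regs[base]
    for r in sorted(reglist, key=REG_ORDER.index):
        regs[r] = mem[addr]
        addr += 4
    if writeback:  # the '!' suffix
        regs[base] = addr


mem = {}
regs = {"r0": 0x100, "r1": 1, "r2": 2, "r5": 5}
stmfd(mem, regs, "r0", ["r1", "r2", "r5"])
print({hex(a): v for a, v in sorted(mem.items())}, hex(regs["r0"]))
```

Passing `writeback=False` corresponds to omitting `!`, in which case the base register keeps its original value.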
Push/Pop
push {r0-r2} // effectively stmfd sp!, {r0-r2}
pop {r0-r2} // effectively ldmfd sp!, {r0-r2}
Procedure Call Standard ARM (aapcs32)
Passing arguments to functions
- integer/pointer arguments
  reg     arg
  -----------
  r0        1
  ..       ..
  r3        4
- a double word (64bit) is passed in two consecutive registers, starting at an even numbered register (eg r0+r1 or r2+r3)
- additional arguments are passed on the stack. Arguments are pushed right-to-left (RTL), meaning next arguments are closer to current sp.
  void take(..., int a5, int a6);
                     |       |       | ... |       Hi
                     |       +------>| a6  |       |
                     +-------------->| a5  | <-SP  |
                                     +-----+       v
                                     | ... |       Lo
Return values from functions
- integer/pointer return values
  reg          size
  -----------------
  r0         32 bit
  r0+r1      64 bit
Callee saved registers
r4-r11
sp
Stack
- full descending
  - full: sp points to the last used location (valid item)
  - descending: stack grows downwards
- sp must be 4byte aligned (word boundary) at all times
- sp must be 8byte aligned on public interfaces
Frame chain
- not strictly required by each platform
- linked list of stack-frames
- each frame links to the frame of its caller by a frame record
  - a frame record is described as a (FP,LR) pair (2x32bit)
- r11 (FP) must point to the frame record of the current stack-frame
        +------+                   Hi
        |   0  |     frame0        |
     +->|   0  |                   |
     |  |  ... |                   |
     |  +------+                   |
     |  |  LR  |     frame1        |
     +--|  FP  |<-+                |
        |  ... |  |                |
        +------+  |                |
        |  LR  |  |  current       |
  r11 ->|  FP  |--+  frame         v
        |  ... |                   Lo
- end of the frame chain is indicated by following frame record (0,-)
- location of the frame record in the stack frame is not specified
- r11 is not updated before the new frame record is fully constructed
Function prologue & epilogue
- prologue
  push {fp, lr}
  mov fp, sp // FP points to frame record
- epilogue
  pop {fp, pc} // pop LR directly into PC
ASM skeleton
Small assembler skeleton, ready to use with the following properties:
- use raw Linux syscalls (man 2 syscall for ABI)
- no C runtime (crt)
- gnu assembler gas
// file: greet.S
#include <asm/unistd.h> // syscall NRs
.arch armv7-a
.section .text, "ax"
.balign 4
// Emit `arm` instructions, same as `.arm` directive.
.code 32
.global _start
_start:
// Branch with link and exchange instruction set.
blx _do_greet
mov r0, #0 // exit code
mov r7, #__NR_exit // exit(2) syscall
swi 0x0
// Emit `thumb` instructions, same as `.thumb` directive.
.code 16
.thumb_func
_do_greet:
mov r0, #2 // fd
ldr r1, =greeting // buf
ldr r2, =greeting_len // &len
ldr r2, [r2] // len
mov r7, #__NR_write // write(2) syscall
swi 0x0
// Branch and exchange instruction set.
bx lr
.section .rodata, "a"
.balign 8 // align data on 8byte boundary
greeting:
.asciz "Hi ASM-World!\n"
greeting_len:
.int .-greeting
man gcc:
file.S
assembler code that must be preprocessed.
To cross-compile and run:
> arm-linux-gnueabi-gcc -o greet greet.S -nostartfiles -nostdlib \
-Wl,--dynamic-linker=/usr/arm-linux-gnueabi/lib/ld-linux.so.3 \
&& qemu-arm ./greet
Hi ASM-World!
Cross-compiling on Ubuntu 20.04 (x86_64), paths might differ on other distributions. Explicitly specifying the dynamic linker should not be required when compiling natively on arm.