+++
title = "pthread_cancel in c++ code"

[taxonomies]
tags = ["linux", "threading", "c++"]
+++

A few weeks ago I was debugging a random crash in a legacy code base at work.
Whenever the crash occurred, the following message was printed on `stderr` of
the process:
```
terminate called without an active exception
```

Looking at the reasons why [`std::terminate()`][std_terminate] gets called, and
given the message that it was called without an active exception, my initial
assumption was one of the following:
- `10) a joinable std::thread is destroyed or assigned to`.
- `std::terminate()` is invoked explicitly by the user.

After receiving a backtrace captured by a customer, it was not immediately
obvious to me why `std::terminate()` was called. The backtrace looked something
like the following:
```
#0  0x00007fb21df22ef5 in raise () from /usr/lib/libc.so.6
#1  0x00007fb21df0c862 in abort () from /usr/lib/libc.so.6
#2  0x00007fb21e2a886a in __gnu_cxx::__verbose_terminate_handler () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007fb21e2b4d3a in __cxxabiv1::__terminate (handler=<optimized out>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
#4  0x00007fb21e2b4da7 in std::terminate () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
#5  0x00007fb21e2b470d in __cxxabiv1::__gxx_personality_v0 (version=<optimized out>, actions=10, exception_class=0, ue_header=0x7fb21dee0cb0,  context=<optimized out>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_personality.cc:673
#6  0x00007fb21e0c3814 in _Unwind_ForcedUnwind_Phase2 (exc=0x7fb21dee0cb0, context=0x7fb21dedfc50, frames_p=0x7fb21dedfb58) at /build/gcc/src/gcc/libgcc/unwind.inc:182
#7  0x00007fb21e0c3f12 in _Unwind_ForcedUnwind (exc=0x7fb21dee0cb0, stop=<optimized out>, stop_argument=0x7fb21dedfe70) at /build/gcc/src/gcc/libgcc/unwind.inc:217
#8  0x00007fb21e401434 in __pthread_unwind () from /usr/lib/libpthread.so.0
#9  0x00007fb21e401582 in __pthread_enable_asynccancel () from /usr/lib/libpthread.so.0
#10 0x00007fb21e4017c7 in write () from /usr/lib/libpthread.so.0
#11 0x000055f6b8149320 in S::~S (this=0x7fb21dedfe37, __in_chrg=<optimized out>) at 20210515-pthread_cancel-noexcept/thread.cc:9
#12 0x000055f6b81491bb in threadFn () at 20210515-pthread_cancel-noexcept/thread.cc:18
#13 0x00007fb21e3f8299 in start_thread () from /usr/lib/libpthread.so.0
#14 0x00007fb21dfe5053 in clone () from /usr/lib/libc.so.6
```
Looking at frames `#6 - #9` we can see that the crashing thread is executing
`forced unwinding`, the stack unwinding that is performed when a thread is
cancelled by [`pthread_cancel(3)`][pthread_cancel].
Thread cancellation starts here from the call to `write()` at frame `#10`,
because pthreads in their default configuration only act on a cancellation
request when the thread passes a `cancellation point`, as described in
[pthreads(7)][pthreads].
> The pthread cancel type can either be `PTHREAD_CANCEL_DEFERRED` (default) or
> `PTHREAD_CANCEL_ASYNCHRONOUS` and can be set with
> [`pthread_setcanceltype(3)`][pthread_canceltype].
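
The deferred behaviour can be made visible with a small sketch. It is not
taken from the code base discussed here; all names (`DeferredDemo`,
`deferredWorker`, `deferredCancelDemo`) are made up for illustration, and it
assumes Linux with glibc/NPTL. The worker keeps running after
`pthread_cancel` was called and only stops at its first cancellation point.

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>

// Hypothetical state shared between main and the worker in this sketch.
struct DeferredDemo {
    std::atomic<bool> cancel_sent{false};
    std::atomic<bool> ran_after_cancel{false};
    std::atomic<bool> ran_after_testcancel{false};
};

static void* deferredWorker(void* arg) {
    auto* demo = static_cast<DeferredDemo*>(arg);

    // Spin until main has called pthread_cancel(). An atomic load is not a
    // cancellation point, so the pending request is not acted upon here.
    while (!demo->cancel_sent.load()) {
    }
    demo->ran_after_cancel.store(true);

    // First cancellation point: the pending request is honoured here and
    // the call does not return.
    pthread_testcancel();

    demo->ran_after_testcancel.store(true);  // Not reached.
    return nullptr;
}

// Returns true if the worker ran up to its first cancellation point and was
// cancelled exactly there.
bool deferredCancelDemo() {
    DeferredDemo demo;
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, deferredWorker, &demo);
    assert(rc == 0);

    rc = pthread_cancel(tid);
    assert(rc == 0);
    demo.cancel_sent.store(true);

    void* result = nullptr;
    rc = pthread_join(tid, &result);
    assert(rc == 0);
    (void)rc;

    return result == PTHREAD_CANCELED && demo.ran_after_cancel.load() &&
           !demo.ran_after_testcancel.load();
}
```

Calling `deferredCancelDemo()` from `main` should return `true` under these
assumptions.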

With these findings we can take another look at the reasons why
[`std::terminate()`][std_terminate] gets called. The interesting item on the
list this time is the following:
- `7) a noexcept specification is violated`

This item is of particular interest because:
- In C++ (since C++11), `destructors` are implicitly [`noexcept`][noexcept]
  unless declared otherwise.
- For NPTL, thread cancellation is implemented by throwing an exception of type
  `abi::__forced_unwind`.

Putting these findings together, the random crash in the application can be
explained as follows: the `pthread_cancel` call happened asynchronously to the
cancelled thread, so there was a chance that the thread hit a
`cancellation point` inside a `destructor`. The forced unwind then tried to
leave a `noexcept` function, which ends in `std::terminate()`.

## Conclusion
In general, `pthread_cancel` should not be used in C++ code at all. Instead, the
thread should offer a way to request a clean shutdown (for example, similar to
[`std::jthread`][jthread]).
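
A minimal sketch of such a clean shutdown request, using a plain
`std::atomic<bool>` stop flag so that it also builds before C++20. The class
`StoppableWorker` and the function `cooperativeShutdownDemo` are hypothetical
names for this example; with C++20, `std::jthread` and `std::stop_token`
provide the same idea out of the box.

```cpp
#include <atomic>
#include <thread>

// Hypothetical worker: it polls a stop flag instead of being cancelled from
// the outside, so the stack unwinds through normal returns only.
class StoppableWorker {
public:
    StoppableWorker() : thread_([this] { run(); }) {}
    ~StoppableWorker() { stopAndJoin(); }

    StoppableWorker(const StoppableWorker&) = delete;
    StoppableWorker& operator=(const StoppableWorker&) = delete;

    // Request the stop and wait until the thread has left run().
    void stopAndJoin() {
        stop_requested_.store(true);
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    bool exitedCleanly() const { return exited_cleanly_.load(); }

private:
    void run() {
        while (!stop_requested_.load()) {
            std::this_thread::yield();  // Placeholder for one unit of work.
        }
        exited_cleanly_.store(true);
    }

    std::atomic<bool> stop_requested_{false};
    std::atomic<bool> exited_cleanly_{false};
    std::thread thread_;  // Declared last: the flags exist before run() starts.
};

// Returns true if the worker left its loop through the normal return path.
bool cooperativeShutdownDemo() {
    StoppableWorker worker;
    worker.stopAndJoin();
    return worker.exitedCleanly();
}
```

The worker decides itself where it is safe to stop, so no destructor is ever
interrupted by a forced unwind.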

However, if thread cancellation is **required**, the code should be audited
very carefully and the cancellation points controlled explicitly. This can be
achieved by inserting cancellation points with
[`pthread_testcancel(3)`][pthread_testcancel] at **safe** sections:
```c
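#include <pthread.h>

int oldstate;  // pthread_setcancelstate() takes the old state as 2nd argument.
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &oldstate);
pthread_testcancel();  // Explicit cancellation point.
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);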
```
> On thread entry, the cancel state should be set to `PTHREAD_CANCEL_DISABLE`
> to disable thread cancellation.
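
Put together, a worker following this pattern could look like the sketch
below. It assumes Linux with glibc/NPTL; `SafePointDemo`, `Guard`,
`safeCancellationPoint`, `safePointWorker` and `safePointDemo` are
hypothetical names, not part of the original code. The destructor contains a
cancellation point, but since cancellation is disabled there, the pending
request is only honoured at the explicit safe point.

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>

struct SafePointDemo {
    std::atomic<bool> cancel_disabled{false};
    std::atomic<bool> cancel_sent{false};
    std::atomic<bool> destructor_done{false};
    std::atomic<bool> reached_end{false};
};

// Hypothetical type whose destructor contains a cancellation point.
struct Guard {
    explicit Guard(SafePointDemo& d) : demo(d) {}
    ~Guard() {
        pthread_testcancel();  // Returns normally: cancellation is disabled.
        demo.destructor_done.store(true);
    }
    SafePointDemo& demo;
};

// The only place where a pending cancellation request may take effect.
static void safeCancellationPoint() {
    int oldstate;
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &oldstate);
    pthread_testcancel();
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
}

static void* safePointWorker(void* arg) {
    auto* demo = static_cast<SafePointDemo*>(arg);

    // Disable cancellation on thread entry.
    int oldstate;
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
    demo->cancel_disabled.store(true);

    // Wait until main has requested the cancellation.
    while (!demo->cancel_sent.load()) {
    }

    {
        Guard guard(*demo);
    }  // ~Guard() runs with a request pending, but nothing is unwound.

    safeCancellationPoint();  // No C++ objects are alive in this frame.

    demo->reached_end.store(true);  // Not reached.
    return nullptr;
}

// Returns true if the destructor completed and the thread was cancelled at
// the explicit safe point afterwards.
bool safePointDemo() {
    SafePointDemo demo;
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, safePointWorker, &demo);
    assert(rc == 0);

    while (!demo.cancel_disabled.load()) {
    }
    rc = pthread_cancel(tid);
    assert(rc == 0);
    demo.cancel_sent.store(true);

    void* result = nullptr;
    rc = pthread_join(tid, &result);
    assert(rc == 0);
    (void)rc;

    return result == PTHREAD_CANCELED && demo.destructor_done.load() &&
           !demo.reached_end.load();
}
```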

## Appendix: `abi::__forced_unwind` exception
As mentioned above, thread cancellation for NPTL is implemented by throwing an
exception of type `abi::__forced_unwind`. This exception can actually be caught
if some extra clean-up steps need to be performed on thread cancellation.
However, it is **required** to `rethrow` the exception.
```cpp
#include <cxxabi.h>

try {
    // ...
} catch (abi::__forced_unwind&) {
    // Do some extra cleanup.
    throw;
}
```
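
For completeness, a self-contained sketch of the same pattern inside a thread
function. It assumes GCC with libstdc++ on glibc/NPTL, where
`abi::__forced_unwind` is available through `<cxxabi.h>`; the names
`UnwindDemo`, `unwindWorker` and `forcedUnwindCleanupDemo` are made up for
this example.

```cpp
#include <cxxabi.h>
#include <pthread.h>

#include <atomic>
#include <cassert>

struct UnwindDemo {
    std::atomic<bool> cleanup_ran{false};
};

static void* unwindWorker(void* arg) {
    auto* demo = static_cast<UnwindDemo*>(arg);
    try {
        // Poll for a cancellation request; this is the cancellation point.
        for (;;) {
            pthread_testcancel();
        }
    } catch (abi::__forced_unwind&) {
        // Extra clean-up on cancellation.
        demo->cleanup_ran.store(true);
        throw;  // Required: the forced unwind must continue.
    }
    return nullptr;  // Not reached.
}

// Returns true if the clean-up ran and the thread still ended as cancelled.
bool forcedUnwindCleanupDemo() {
    UnwindDemo demo;
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, unwindWorker, &demo);
    assert(rc == 0);

    rc = pthread_cancel(tid);
    assert(rc == 0);

    void* result = nullptr;
    rc = pthread_join(tid, &result);
    assert(rc == 0);
    (void)rc;

    return result == PTHREAD_CANCELED && demo.cleanup_ran.load();
}
```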

## Appendix: Minimal reproducer
```cpp
{{ include(path="content/2021-05-15-pthread_cancel-noexcept/thread.cc") }}
```

[std_terminate]: https://en.cppreference.com/w/cpp/error/terminate
[pthread_cancel]: https://man7.org/linux/man-pages/man3/pthread_cancel.3.html
[pthread_canceltype]: https://man7.org/linux/man-pages/man3/pthread_setcanceltype.3.html
[pthread_testcancel]: https://man7.org/linux/man-pages/man3/pthread_testcancel.3.html
[pthreads]: https://man7.org/linux/man-pages/man7/pthreads.7.html
[noexcept]: https://en.cppreference.com/w/cpp/language/noexcept_spec
[jthread]: https://en.cppreference.com/w/cpp/thread/jthread/request_stop