Diffstat (limited to 'content/2021-05-15-pthread_cancel-noexcept')
-rw-r--r-- | content/2021-05-15-pthread_cancel-noexcept/index.md | 110 |
1 files changed, 110 insertions, 0 deletions
diff --git a/content/2021-05-15-pthread_cancel-noexcept/index.md b/content/2021-05-15-pthread_cancel-noexcept/index.md
new file mode 100644
index 0000000..f09cadc
--- /dev/null
+++ b/content/2021-05-15-pthread_cancel-noexcept/index.md
@@ -0,0 +1,110 @@
++++
+title = "pthread_cancel in c++ code"
+
+[taxonomies]
+tags = ["linux", "threading", "c++"]
++++
+
+A few weeks ago I was debugging a random crash in a legacy code base at work.
+Whenever the crash occurred, the following message was printed on `stderr` of
+the process:
+```
+terminate called without an active exception
+```
+
+Looking at the reasons why [`std::terminate()`][std_terminate] gets called,
+and given the message that it was called without an active exception, my
+initial assumption was one of the following:
+- `10) a joinable std::thread is destroyed or assigned to`.
+- Invoked explicitly by the user.
+
+Even after receiving a backtrace captured by a customer, it was not
+immediately obvious to me why `std::terminate()` was called here.
+The backtrace looked something like the following:
+```
+#0 0x00007fb21df22ef5 in raise () from /usr/lib/libc.so.6
+#1 0x00007fb21df0c862 in abort () from /usr/lib/libc.so.6
+#2 0x00007fb21e2a886a in __gnu_cxx::__verbose_terminate_handler () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
+#3 0x00007fb21e2b4d3a in __cxxabiv1::__terminate (handler=<optimized out>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
+#4 0x00007fb21e2b4da7 in std::terminate () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
+#5 0x00007fb21e2b470d in __cxxabiv1::__gxx_personality_v0 (version=<optimized out>, actions=10, exception_class=0, ue_header=0x7fb21dee0cb0, context=<optimized out>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_personality.cc:673
+#6 0x00007fb21e0c3814 in _Unwind_ForcedUnwind_Phase2 (exc=0x7fb21dee0cb0, context=0x7fb21dedfc50, frames_p=0x7fb21dedfb58) at /build/gcc/src/gcc/libgcc/unwind.inc:182
+#7 0x00007fb21e0c3f12 in _Unwind_ForcedUnwind (exc=0x7fb21dee0cb0, stop=<optimized out>, stop_argument=0x7fb21dedfe70) at /build/gcc/src/gcc/libgcc/unwind.inc:217
+#8 0x00007fb21e401434 in __pthread_unwind () from /usr/lib/libpthread.so.0
+#9 0x00007fb21e401582 in __pthread_enable_asynccancel () from /usr/lib/libpthread.so.0
+#10 0x00007fb21e4017c7 in write () from /usr/lib/libpthread.so.0
+#11 0x000055f6b8149320 in S::~S (this=0x7fb21dedfe37, __in_chrg=<optimized out>) at 20210515-pthread_cancel-noexcept/thread.cc:9
+#12 0x000055f6b81491bb in threadFn () at 20210515-pthread_cancel-noexcept/thread.cc:18
+#13 0x00007fb21e3f8299 in start_thread () from /usr/lib/libpthread.so.0
+#14 0x00007fb21dfe5053 in clone () from /usr/lib/libc.so.6
+```
+Looking at frames `#6 - #9` we can see that the crashing thread is executing
+`forced unwinding`, i.e. it is unwinding its stack because the thread is being
+cancelled by [`pthread_cancel(3)`][pthread_cancel].
+Cancellation starts at the call to `write()` in frame `#10`, because pthreads
+in their default configuration only act on cancellation requests when the
+thread passes a `cancellation point`, as described in
+[pthreads(7)][pthreads].
+> The pthread cancel type can either be `PTHREAD_CANCEL_DEFERRED` (default) or
+> `PTHREAD_CANCEL_ASYNCHRONOUS` and can be set with
+> [`pthread_setcanceltype(3)`][pthread_canceltype].
+
+With these findings we can take another look at the reasons why
+[`std::terminate()`][std_terminate] gets called. The interesting item on
+the list this time is the following:
+- `7) a noexcept specification is violated`
+
+This item is of particular interest because:
+- In c++ (since c++11) `destructors` are implicitly [`noexcept`][noexcept] by
+  default.
+- For NPTL, thread cancellation is implemented by throwing an exception of type
+  `abi::__forced_unwind`.
+
+With all these findings, the random crash in the application can be explained
+as follows: the `pthread_cancel` call happened asynchronously to the cancelled
+thread, so there was a chance that the thread hit a `cancellation point`
+inside a `destructor`. The forced unwind then tried to leave the implicitly
+`noexcept` destructor, which ends in `std::terminate()`.
+
+## Conclusion
+In general `pthread_cancel` should not be used in c++ code at all. Instead, the
+thread should offer a way to request a clean shutdown (for example similar to
+[`std::jthread`][jthread]).
+
+However, if thread cancellation is **required**, then the code should be audited
+very carefully and the cancellation points controlled explicitly. This can be
+achieved by inserting cancellation points at **safe** places, for example:
+```c
+int oldstate;  // required out-parameter, value unused here
+pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &oldstate);
+pthread_testcancel();
+pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
+```
+> On thread entry, the cancel state should be set to `PTHREAD_CANCEL_DISABLE`
+> to disable thread cancellation.
+
+## Appendix: `abi::__forced_unwind` exception
+As mentioned above, thread cancellation for NPTL is implemented by throwing an
+exception of type `abi::__forced_unwind`.
+This exception can actually be caught in case some extra clean-up steps need
+to be performed on thread cancellation. However, it is **required** to
+`rethrow` the exception, since otherwise the cancellation cannot run to
+completion.
+```cpp
+#include <cxxabi.h>
+
+try {
+    // ...
+} catch (abi::__forced_unwind&) {
+    // Do some extra cleanup.
+    throw;
+}
+```
+
+## Appendix: Minimal reproducer
+```cpp
+{{ include(path="content/2021-05-15-pthread_cancel-noexcept/thread.cc") }}
+```
+
+[std_terminate]: https://en.cppreference.com/w/cpp/error/terminate
+[pthread_cancel]: https://man7.org/linux/man-pages/man3/pthread_cancel.3.html
+[pthread_canceltype]: https://man7.org/linux/man-pages/man3/pthread_setcanceltype.3.html
+[pthread_testcancel]: https://man7.org/linux/man-pages/man3/pthread_testcancel.3.html
+[pthreads]: https://man7.org/linux/man-pages/man7/pthreads.7.html
+[noexcept]: https://en.cppreference.com/w/cpp/language/noexcept_spec
+[jthread]: https://en.cppreference.com/w/cpp/thread/jthread/request_stop
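
The clean-shutdown approach recommended in the post's conclusion could look roughly like the sketch below. It is not taken from the post: `Guard` and `run_until_stopped` are hypothetical names, and a plain `std::atomic<bool>` stop flag stands in for the `std::stop_token` mechanism that `std::jthread` provides in c++20. The point it illustrates is that the worker only stops at a place the code chooses, so its destructors run during an ordinary return instead of during forced unwinding.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical RAII type; stands in for any class with a non-trivial destructor.
struct Guard {
    explicit Guard(bool& flag) : cleaned_up(flag) {}
    ~Guard() { cleaned_up = true; }  // runs on a normal return, no forced unwind
    bool& cleaned_up;
};

// Hypothetical helper: lets a worker run for `run_time`, then asks it to stop.
// Returns true if the worker's Guard destructor ran.
bool run_until_stopped(std::chrono::milliseconds run_time) {
    std::atomic<bool> stop_requested{false};
    bool cleaned_up = false;

    std::thread worker([&] {
        Guard guard(cleaned_up);
        // The stop flag is checked only here, at a point chosen by the code.
        while (!stop_requested.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });

    std::this_thread::sleep_for(run_time);
    stop_requested.store(true);  // request a clean shutdown
    worker.join();               // join() makes the worker's writes visible
    return cleaned_up;
}
```

Calling `run_until_stopped(std::chrono::milliseconds(20))` lets the worker spin briefly and then shuts it down without `pthread_cancel`. The trade-off is that the worker must poll the flag often enough, and a blocking call such as `read()` would need its own wake-up mechanism.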