path: root/thread_pthread.c
Commit message, Author, Age, Files, Lines
* Revert "Add missing GVL hooks for M:N threads and ractors"John Hawthorn2023-12-031-2/+0
| | | | This reverts commit ad54fbf281ca1935e79f4df1460b0106ba76761e.
* Add missing GVL hooks for M:N threads and ractorsJohn Hawthorn2023-12-021-0/+2
| [Bug #20019]
|
| This fixes GVL instrumentation in three locations where it was missing:
| - Suspending when blocking on a Ractor
| - Suspending when doing a coroutine transfer from an M:N thread
| - Resuming after an M:N thread starts
|
| Co-authored-by: Matthew Draper <matthew@trebex.net>
* Further fix the GVL instrumentation APIJean Boussier2023-11-281-3/+3
| | | | | | | | | | Followup: https://github.com/ruby/ruby/pull/9029 [Bug #20019] Some events still weren't triggered from the right place. The test suite was also improved a bit more.
* Refactor and fix the GVL instrumentation APIJean Boussier2023-11-271-12/+33
| This entirely changes how it is tested. Rather than using counters, we now record the timeline of events with associated threads, which makes it much easier to assert that certain events are only preceded by a specific event, and makes it much easier to debug unexpected timelines.
|
| Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
| Co-Authored-By: JP Camara <jp@jpcamara.com>
| Co-Authored-By: John Hawthorn <john@hawthorn.email>
* GVL Instrumentation: pass thread->self as part of event dataJean Boussier2023-11-131-12/+15
| Context: https://github.com/ivoanjo/gvl-tracing/pull/4
|
| Some hooks may want to collect data on a per-thread basis. Right now the only way to identify the concerned thread is to use `rb_nativethread_self()` or similar, but even then, because of the thread cache or MaNy, two distinct Ruby threads may report the same native thread id.
|
| By passing `thread->self`, hooks can use it as a key to store the metadata.
|
| NB: Most hooks are executed outside the GVL, so such data collection needs to use a thread-safe data structure, and shouldn't use the reference in other ways from inside the hook.
|
| They must also either pin that value or handle compaction.
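The "key into a thread-safe data structure" idea can be sketched in plain C. This is a minimal illustration, not code from Ruby or gvl-tracing: `thread_key_t` is a hypothetical stand-in for the `thread->self` value, and the fixed-size table, `stats_record()`, `stats_count()` and `stats_forget()` are all assumed names. The key is only compared, never dereferenced, which matches the advice not to use the reference in other ways from inside a hook.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the thread object passed as event data.
 * Only its bit pattern is used as a key; it is never dereferenced. */
typedef uintptr_t thread_key_t;

#define MAX_TRACKED 64 /* arbitrary capacity chosen for this sketch */

struct thread_stat {
    thread_key_t key;    /* 0 marks a free slot */
    unsigned long count; /* events recorded for this thread */
};

static struct thread_stat stats[MAX_TRACKED];
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find the slot for key, or claim a free one. Caller holds stats_lock. */
static struct thread_stat *slot_for(thread_key_t key)
{
    struct thread_stat *free_slot = NULL;
    assert(key != 0); /* 0 is reserved for free slots */
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (stats[i].key == key) return &stats[i];
        if (stats[i].key == 0 && free_slot == NULL) free_slot = &stats[i];
    }
    if (free_slot != NULL) {
        free_slot->key = key;
        free_slot->count = 0;
    }
    return free_slot; /* NULL when the table is full */
}

/* Record one event for the thread identified by key. */
void stats_record(thread_key_t key)
{
    if (key == 0) return;
    pthread_mutex_lock(&stats_lock);
    struct thread_stat *s = slot_for(key);
    if (s != NULL) s->count++;
    pthread_mutex_unlock(&stats_lock);
}

/* Return the number of events recorded for key (0 if unknown). */
unsigned long stats_count(thread_key_t key)
{
    unsigned long n = 0;
    pthread_mutex_lock(&stats_lock);
    for (size_t i = 0; key != 0 && i < MAX_TRACKED; i++) {
        if (stats[i].key == key) { n = stats[i].count; break; }
    }
    pthread_mutex_unlock(&stats_lock);
    return n;
}

/* Drop the entry for key, for example once the thread has gone away. */
void stats_forget(thread_key_t key)
{
    pthread_mutex_lock(&stats_lock);
    for (size_t i = 0; key != 0 && i < MAX_TRACKED; i++) {
        if (stats[i].key == key) { stats[i].key = 0; stats[i].count = 0; break; }
    }
    pthread_mutex_unlock(&stats_lock);
}
```

A hook would call `stats_record()` with the value it receives as event data. The mutex keeps the sketch short; a real hook may want something with less contention.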
* thread_pthread.c: unbreak 10.5 Intel by restoring accidentally deleted macroSergey Fedorov2023-11-011-1/+6
|
* "+MN" in descriptionKoichi Sasada2023-10-171-26/+9
| If `RUBY_MN_THREADS=1` is given, this patch shows `+MN` in `RUBY_DESCRIPTION` like:
|
| ```
| $ RUBY_MN_THREADS=1 ./miniruby --yjit -v
| ruby 3.3.0dev (2023-10-17T04:10:14Z master 908f8fffa2) +YJIT +MN [x86_64-linux]
| ```
|
| Before this patch, a warning was displayed if `$VERBOSE` was set. However, that can cause trouble with tests (run with `$VERBOSE`), so no warning is shown with an MN threads configuration.
* Fix typos [ci skip]Kazuhiro NISHIYAMA2023-10-161-2/+2
|
* release sched_lock before VM lockKoichi Sasada2023-10-141-2/+10
| to avoid deadlock
|
| ```ruby
| r = Ractor.new do
|   obj = Thread.new{}
|   Ractor.yield obj
| rescue => e
|   e.message
| end
| p r.take
| ```
|
| ```
| (lldb) bt
| * thread #1, name = 'miniruby', stop reason = signal SIGSTOP
|   * frame #0: 0x0000ffff44881410 libpthread.so.0`__lll_lock_wait + 88
|     frame #1: 0x0000ffff4487a078 libpthread.so.0`__pthread_mutex_lock + 232
|     frame #2: 0x0000aaab617c0980 miniruby`rb_native_mutex_lock(lock=<unavailable>) at thread_pthread.c:109:14
|     frame #3: 0x0000aaab617c1d58 miniruby`ubf_event_waiting [inlined] thread_sched_lock_(th=0x0000aaab9df82980, file=<unavailable>, line=46, sched=0x0000aaab9dec79b8) at thread_pthread.c:351:5
|     frame #4: 0x0000aaab617c1d50 miniruby`ubf_event_waiting(ptr=0x0000aaab9df82980) at thread_pthread_mn.c:46:5
|     frame #5: 0x0000aaab617c6020 miniruby`rb_threadptr_interrupt [inlined] rb_threadptr_interrupt_common(trap=0, th=0x0000aaab9df82980) at thread.c:352:25
|     frame #6: 0x0000aaab617c5fec miniruby`rb_threadptr_interrupt(th=0x0000aaab9df82980) at thread.c:365:5
|     frame #7: 0x0000aaab617379b0 miniruby`rb_ractor_terminate_all at ractor.c:2364:13
|     frame #8: 0x0000aaab6173797c miniruby`rb_ractor_terminate_all at ractor.c:2383:17
|     frame #9: 0x0000aaab61737958 miniruby`rb_ractor_terminate_all [inlined] ractor_terminal_interrupt_all(vm=0x0000aaab9dea3320) at ractor.c:2375:1
|     frame #10: 0x0000aaab61737950 miniruby`rb_ractor_terminate_all at ractor.c:2424:13
|     frame #11: 0x0000aaab6164f108 miniruby`rb_ec_cleanup(ec=0x0000aaab9dea5900, ex=RUBY_TAG_NONE) at eval.c:239:9
|     frame #12: 0x0000aaab6164fa3c miniruby`ruby_run_node(n=0x0000ffff417ed178) at eval.c:328:12
|     frame #13: 0x0000aaab615a5ab0 miniruby`main at main.c:39:12
|     frame #14: 0x0000aaab615a5a98 miniruby`main(argc=<unavailable>, argv=<unavailable>) at main.c:58:12
|     frame #15: 0x0000ffff44714b2c libc.so.6`__libc_start_main + 228
|     frame #16: 0x0000aaab615a5b0c miniruby`_start + 52
| (lldb) thread select 3
| * thread #3, name = 'bootstraptest.*', stop reason = signal SIGSTOP
|     frame #0: 0x0000ffff448813ec libpthread.so.0`__lll_lock_wait + 52
| libpthread.so.0`__lll_lock_wait:
| ->  0xffff448813ec <+52>: svc    #0
|     0xffff448813f0 <+56>: eor    w20, w20, #0x80
|     0xffff448813f4 <+60>: sxtw   x20, w20
|     0xffff448813f8 <+64>: b      0xffff44881414 ; <+92>
| (lldb) bt
| * thread #3, name = 'bootstraptest.*', stop reason = signal SIGSTOP
|   * frame #0: 0x0000ffff448813ec libpthread.so.0`__lll_lock_wait + 52
|     frame #1: 0x0000ffff4487a078 libpthread.so.0`__pthread_mutex_lock + 232
|     frame #2: 0x0000aaab617c0980 miniruby`rb_native_mutex_lock(lock=<unavailable>) at thread_pthread.c:109:14
|     frame #3: 0x0000aaab61823d68 miniruby`rb_vm_lock_enter_body [inlined] vm_lock_enter(no_barrier=false, lev=0x0000ffff215bfbe4, locked=false, vm=0x0000aaab9dea3320, cr=0x0000aaab9dec7890) at vm_sync.c:57:9
|     frame #4: 0x0000aaab61823d60 miniruby`rb_vm_lock_enter_body(lev=0x0000ffff215bfbe4) at vm_sync.c:119:9
|     frame #5: 0x0000aaab617c1b30 miniruby`thread_sched_setup_running_threads [inlined] rb_vm_lock_enter(file=<unavailable>, line=597, lev=0x0000ffff215bfbe4) at vm_sync.h:75:9
|     frame #6: 0x0000aaab617c1b14 miniruby`thread_sched_setup_running_threads(vm=0x0000aaab9dea3320, add_th=0x0000aaab9df82980, del_th=<unavailable>, add_timeslice_th=0x0000000000000000, cr=<unavailable>, sched=<unavailable>, sched=<unavailable>) at thread_pthread.c:597:9
|     frame #7: 0x0000aaab617c29b4 miniruby`thread_sched_wait_running_turn at thread_pthread.c:614:5
|     frame #8: 0x0000aaab617c298c miniruby`thread_sched_wait_running_turn(sched=0x0000aaab9dec79b8, th=0x0000aaab9df82980, can_direct_transfer=true) at thread_pthread.c:868:9
|     frame #9: 0x0000aaab617c6f0c miniruby`thread_sched_wait_events(sched=0x0000aaab9dec79b8, th=0x0000aaab9df82980, fd=<unavailable>, events=<unavailable>, rel=<unavailable>) at thread_pthread_mn.c:90:17
|     frame #10: 0x0000aaab617c7354 miniruby`rb_thread_terminate_all at thread_pthread.c:3248:13
|     frame #11: 0x0000aaab617c733c miniruby`rb_thread_terminate_all(th=0x0000aaab9df82980) at thread.c:466:13
|     frame #12: 0x0000aaab617c7a64 miniruby`thread_start_func_2(th=0x0000aaab9df82980, stack_start=<unavailable>) at thread.c:713:9
|     frame #13: 0x0000aaab617c7d1c miniruby`co_start [inlined] call_thread_start_func_2(th=0x0000aaab9df82980) at thread_pthread.c:2165:5
|     frame #14: 0x0000aaab617c7cd0 miniruby`co_start(from=<unavailable>, self=0x0000aaab9df0f760) at thread_pthread_mn.c:421:9
| ```
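The two backtraces read like a classic lock-order inversion: one thread waits for the scheduler lock (apparently while holding the VM lock), and the other waits for the VM lock while holding the scheduler lock. The general shape of "release one lock before taking the other" can be sketched as follows. This is an illustration under assumptions, not the actual patch: `sched_lock`, `vm_lock`, `update_running_threads()` and `schedule_thread()` are hypothetical names, and the two counters stand in for whatever state each lock guards.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the scheduler lock and the VM lock. */
static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER;

static int running_threads; /* VM-wide state, guarded by vm_lock */
static int ready_queue_len; /* scheduler state, guarded by sched_lock */

/* Called with sched_lock held; returns with sched_lock held again.
 * sched_lock is dropped before vm_lock is taken, so this thread never
 * holds both and cannot form a cycle with a thread that takes vm_lock
 * first and sched_lock second. */
static void update_running_threads(int delta)
{
    int rc = pthread_mutex_unlock(&sched_lock);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&vm_lock);
    running_threads += delta;
    pthread_mutex_unlock(&vm_lock);

    pthread_mutex_lock(&sched_lock);
}

/* Enqueue a thread, register it as running, and return the VM-wide total. */
int schedule_thread(void)
{
    int total;

    pthread_mutex_lock(&sched_lock);
    ready_queue_len++;
    update_running_threads(+1);
    /* sched_lock was released in between: anything it guards may have
     * changed and must be re-read here rather than assumed. */
    ready_queue_len--;
    pthread_mutex_unlock(&sched_lock);

    pthread_mutex_lock(&vm_lock);
    total = running_threads;
    pthread_mutex_unlock(&vm_lock);
    return total;
}
```

The cost of this shape is that scheduler state has to be revalidated after the lock is re-acquired, which is why the comment in `schedule_thread()` matters as much as the unlock itself.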
* Allow `NON_SCALAR_THREAD_ID` machinesKoichi Sasada2023-10-141-2/+1
| | | | s390x (Ubuntu) still fails tests with 62dfaeec2c.
* disable MN schedulers for some platformsKoichi Sasada2023-10-141-5/+12
| * `__EMSCRIPTEN__` provides epoll* declarations, but no implementations.
| * on `NON_SCALAR_THREAD_ID`, we currently cannot debug issues on s390x/Ubuntu, so skip it. s390x/RHEL works fine, so I think we can remove the second limitation, but I could not log in to it, so it seems hard to debug now.
* fix `native_thread_destroy()` timingKoichi Sasada2023-10-131-17/+16
| With the M:N thread scheduler, the native thread (NT) related resources should be freed when the NT is no longer needed. So calling `native_thread_destroy()` in `thread_cleanup_func()` (at the end of the Ruby thread) is not the correct timing. Call it when the corresponding Ruby thread is collected.
* disable MN scheduler on !`USE_MN_THREADS`Koichi Sasada2023-10-131-2/+5
|
* Fix unused-function warning for 'ruby_ppoll' [ci skip]Nobuyoshi Nakada2023-10-121-1/+1
|
* M:N thread scheduler for RactorsKoichi Sasada2023-10-121-986/+1862
| This patch introduces an M:N thread scheduler for the Ractor system.
|
| In general, an M:N thread scheduler employs N native threads (OS threads) to manage M user-level threads (Ruby threads in this case). On the Ruby interpreter, 1 native thread is provided for 1 Ractor and all Ruby threads are managed by the native thread.
|
| Since Ruby 1.9, the interpreter has used a 1:1 thread scheduler, which means 1 Ruby thread has 1 native thread. The M:N scheduler changes this strategy.
|
| Because of compatibility issues (and stability issues of the implementation), the main Ractor doesn't use the M:N scheduler by default. In other words, threads on the main Ractor will be managed with the 1:1 thread scheduler.
|
| There are additional settings by environment variables:
|
| `RUBY_MN_THREADS=1` enables the M:N thread scheduler on the main ractor. Note that non-main ractors use the M:N scheduler without this configuration. With this configuration, single-ractor applications run threads on an M:1 thread scheduler (green threads, user-level threads).
|
| `RUBY_MAX_CPU=n` specifies the maximum number of native threads for the M:N scheduler (default: 8).
|
| This patch will be reverted soon if hard-to-fix issues are found.
|
| [Bug #19842]
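How the two environment variables could be turned into scheduler settings can be sketched as below. Only the variable names and the default of 8 come from the commit message; `struct mn_config`, `parse_positive()`, `mn_config_from()`, the upper bound of 4096 and the choice to treat any positive `RUBY_MN_THREADS` value as "enabled" are assumptions made for this sketch, not Ruby's actual parsing rules.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEFAULT_MAX_CPU 8 /* default stated in the commit message */
#define MAX_CPU_LIMIT 4096 /* arbitrary sanity bound for this sketch */

struct mn_config {
    bool main_ractor_mn; /* M:N scheduling on the main Ractor as well */
    int max_cpu;         /* upper bound on native threads */
};

/* Parse a positive decimal integer; return fallback for NULL, empty,
 * malformed, non-positive or out-of-range input. */
static int parse_positive(const char *text, int fallback)
{
    char *end;
    long n;

    if (text == NULL || *text == '\0') return fallback;
    errno = 0;
    n = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || n <= 0 || n > MAX_CPU_LIMIT) return fallback;
    return (int)n;
}

/* Build the configuration from the raw variable values (either may be NULL). */
struct mn_config mn_config_from(const char *mn_threads, const char *max_cpu)
{
    struct mn_config c;

    c.main_ractor_mn = parse_positive(mn_threads, 0) > 0;
    c.max_cpu = parse_positive(max_cpu, DEFAULT_MAX_CPU);
    assert(c.max_cpu > 0);
    return c;
}

/* Convenience wrapper that reads the process environment. */
struct mn_config mn_config_from_env(void)
{
    return mn_config_from(getenv("RUBY_MN_THREADS"), getenv("RUBY_MAX_CPU"));
}
```

Splitting the parsing from the `getenv()` calls keeps the logic testable without touching the process environment.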
* Fix Thread#native_thread_id being cached across fork (#8418)KJ Tsanaktsidis2023-09-151-12/+15
| | | | | | The native thread ID can and does change on some operating systems (e.g. Linux) after forking, so it needs to be re-queried. [Bug #19873]
* Fix `USE_THREAD_CACHE=0`Nobuyoshi Nakada2023-07-191-3/+5
|
* Move `posix_signal` declaration internal with prefix `ruby_`Nobuyoshi Nakada2023-07-171-1/+1
|
* Compile disabled code for thread cache alwaysNobuyoshi Nakada2023-06-301-23/+16
|
* Fix a potential busy-loop in the thread scheduler (esp. on FreeBSD)KJ Tsanaktsidis2023-05-261-1/+20
| This patch fixes a potential busy-loop in the thread scheduler. If there are two threads, the main thread (where Ruby signal handlers must run) and a sleeping thread, it is possible for the following sequence of events to occur:
|
| * The sleeping thread is in native_sleep -> sigwait_sleep
| * A signal arrives, kicking this thread out of rb_sigwait_sleep
| * The sleeping thread calls THREAD_BLOCKING_END and eventually thread_sched_to_running_common
| * the sleeping thread writes into the sigwait_fd pipe by calling rb_thread_wakeup_timer_thread
| * the sleeping thread re-loops around in native_sleep() because the desired sleep time has not actually yet expired
| * that calls rb_sigwait_sleep again
| * the ppoll() in rb_sigwait_sleep immediately returns because of the byte written into the sigwait_fd by rb_thread_wakeup_timer_thread
| * that wakes the thread up again and kicks the whole cycle off again.
|
| Such a loop can only be broken by the main thread waking up and handling the signal, such that ubf_threads_empty() below becomes true again; however this loop can actually keep things so busy (and cause so much contention on the main thread's interrupt_lock) that the main thread doesn't deal with the signal for many seconds. This seems particularly likely on FreeBSD 13.
|
| (the cycle can also be broken by the sleeping thread finally elapsing its desired sleep time).
|
| The fix for _this_ loop is to only wake up the timer thread in thread_sched_to_running_common if the current thread is not itself the sigwait thread.
|
| An almost identical loop also happens in the same circumstances because the call to check_signals_nogvl (through sigwait_timeout) in rb_sigwait_sleep returns true if there is any pending signal for the main thread to handle. That then causes rb_sigwait_sleep to skip over sleeping entirely.
|
| This is unnecessary and counterproductive, I believe; if the main thread needs to be woken up that is done inline in check_signals_nogvl anyway.
|
| See https://bugs.ruby-lang.org/issues/19680
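The core of the loop described above is that a byte left in a wake-up pipe makes every later poll return at once. A small self-contained sketch shows that behaviour in isolation. It uses `poll(2)` instead of `ppoll()` for portability, and `sleep_on_fd()`, `wake_sleeper()` and `consume_wakeup()` are hypothetical names that do not appear in thread_pthread.c.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Sleep until fd becomes readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int sleep_on_fd(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0) return -1;
    return rc > 0 ? 1 : 0;
}

/* Wake a sleeper by writing one byte into the write end of the pipe. */
void wake_sleeper(int write_fd)
{
    ssize_t n = write(write_fd, "!", 1);
    assert(n == 1);
    (void)n;
}

/* Consume one pending wake-up byte so later sleeps can block again. */
void consume_wakeup(int read_fd)
{
    char buf[1];
    ssize_t n = read(read_fd, buf, sizeof buf);
    assert(n == 1);
    (void)n;
}
```

With a pipe from `pipe(2)`, `sleep_on_fd()` on the read end times out while the pipe is empty, but after one `wake_sleeper()` call it returns immediately on every call until `consume_wakeup()` drains the byte. A thread that writes the byte itself and then goes back to sleeping on the same descriptor therefore spins, which is why the commit avoids the write when the current thread is the sigwait thread.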
* `rb_bug` prints a newline after the messageNobuyoshi Nakada2023-05-201-2/+2
|
* pass `th` to `thread_sched_to_waiting()`Koichi Sasada2023-03-311-7/+7
| | | | for future extension
* reorder `thread_pthread.c` functionsKoichi Sasada2023-03-311-286/+288
|
* `nt->serial` for `RUBY_DEBUG_LOG`Koichi Sasada2023-03-311-1/+17
| | | | | | Show native thread's serial on `RUBY_DEBUG_LOG`. `nt->serial` is also stored into `ruby_nt_serial` if the compiler supports `RB_THREAD_LOCAL_SPECIFIER`.
* thread_pthread.c: Use a `fork_gen` to protect against fork instead of getpid()Jean Boussier2023-03-231-33/+32
| | | | | | | | | | | | | | | [Feature #19443] Until recently most libc would cache `getpid()` so this was a cheap check to make. However as of glibc version 2.25 the PID cache is removed and calls to getpid() always invoke the actual system call which significantly degrades the performance of existing applications. The reason glibc removed the cache is that some libraries were bypassing fork(2) by issuing system calls themselves, causing stale cache issues. That isn't a concern for Ruby as bypassing MRI's primitive for forking would render the VM unusable, so we can safely cache the PID.
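The idea of replacing a `getpid()` comparison with a generation counter can be sketched as follows. This is an illustration only: `fork_gen_t`, `fork_gen_bump()`, `fork_gen_install()` and `struct fork_stamp` are hypothetical names, and registering the bump through `pthread_atfork()` is an assumption made to keep the sketch self-contained. The commit message suggests Ruby relies on its own fork primitive instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef unsigned int fork_gen_t;

/* Incremented once in every forked child; never changes otherwise. */
static fork_gen_t current_fork_gen = 1;

/* Runs in the child after fork(). It can also be called directly to
 * simulate a fork, which is convenient for testing. */
void fork_gen_bump(void)
{
    current_fork_gen++;
}

/* Register the child-side handler. Call once at start-up. */
void fork_gen_install(void)
{
    int rc = pthread_atfork(NULL, NULL, fork_gen_bump);
    assert(rc == 0);
    (void)rc;
}

/* Attached to any resource that must not be reused across a fork. */
struct fork_stamp {
    fork_gen_t gen;
};

/* Remember the generation in which the resource was created. */
void fork_stamp_take(struct fork_stamp *s)
{
    s->gen = current_fork_gen;
}

/* True if the process has forked since the stamp was taken. This is a
 * plain integer comparison, with no system call involved. */
bool fork_stamp_is_stale(const struct fork_stamp *s)
{
    return s->gen != current_fork_gen;
}
```

A resource owner would call `fork_stamp_take()` when it creates, say, a pipe, and check `fork_stamp_is_stale()` before each use, recreating the resource when the check fires.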
* Rename RB_GC_SAVE_MACHINE_CONTEXT -> RB_VM_SAVE_MACHINE_CONTEXTMatt Valentine-House2023-03-151-1/+1
|
* Remove SIGCHLD `waidpid`. (#7527)Samuel Williams2023-03-151-2/+0
| | | | | | | * Remove `waitpid_lock` and related code. * Remove un-necessary test. * Remove `rb_thread_sleep_interruptible` dead code.
* Revert SIGCHLD changes to diagnose CI failures. (#7517)Samuel Williams2023-03-141-0/+2
| | | | | | | | | | | | | | | * Revert "Remove special handling of `SIGCHLD`. (#7482)" This reverts commit 44a0711eab7fbc71ac2c8ff489d8c53e97a8fe75. * Revert "Remove prototypes for functions that are no longer used. (#7497)" This reverts commit 4dce12bead3bfd91fd80b5e7195f7f540ffffacb. * Revert "Remove SIGCHLD `waidpid`. (#7476)" This reverts commit 1658e7d96696a656d9bd0a0c84c82cde86914ba2. * Fix change to rjit variable name.
* Remove SIGCHLD `waidpid`. (#7476)Samuel Williams2023-03-091-2/+0
| | | | | | | * Remove `waitpid_lock` and related code. * Remove un-necessary test. * Remove `rb_thread_sleep_interruptible` dead code.
* s/mjit/rjit/Takashi Kokubun2023-03-061-1/+1
|
* s/MJIT/RJIT/Takashi Kokubun2023-03-061-1/+1
|
* TestThreadInstrumentation: emit the EXIT event soonerJean Boussier2023-03-061-2/+7
| ```
| 1) Failure:
| TestThreadInstrumentation#test_thread_instrumentation [/tmp/ruby/src/trunk-repeat20-asserts/test/-ext-/thread/test_instrumentation_api.rb:33]:
| Call counters[4]: [3, 4, 4, 4, 0].
| Expected 0 to be > 0.
| ```
|
| We fire the EXIT hook after the call to `thread_sched_to_dead`, which means another thread might be running before the `EXIT` hook has been executed.
* Merge gc.h and internal/gc.hMatt Valentine-House2023-02-091-1/+1
| | | | [Feature #19425]
* Fix possible use of undefined macros on very old macOS [ci skip]Nobuyoshi Nakada2022-10-171-1/+2
|
* Adjust styles [ci skip]Nobuyoshi Nakada2022-07-271-1/+2
|
* Expand tabs [ci skip]Takashi Kokubun2022-07-211-177/+177
| | | | [Misc #18891]
* GVL Instrumentation: remove the EXITED count assertionJean Boussier2022-07-131-7/+2
| It's very flaky for some unknown reason. Sometimes we have an extra EXITED event. I suspect some other test is causing this.
* thread_pthread.c: call SUSPENDED event when entering native_sleepJean Boussier2022-07-071-0/+2
| [Bug #18900]
|
| Thread#join and a few other codepaths are using native sleep as a way to suspend the current thread. So we should call the relevant hook when this happens, otherwise some threads may transition directly from `RESUMED` to `READY`.
* thread_pthread.c: Remove useless call to pthread_rwlock_initJean Boussier2022-07-061-6/+1
|
* GVL Instrumentation API: add STARTED and EXITED eventsJean Boussier2022-06-171-11/+14
| | | | | | | | [Feature #18339] After experimenting with the initial version of the API I figured there is a need for an exit event to cleanup instrumentation data. e.g. if you record data in a {thread_id -> data} table, you need to free associated data when a thread goes away.
* Remove unused rb_thread_create_mjit_threadTakashi Kokubun2022-06-151-34/+0
| | | | follow up https://github.com/ruby/ruby/pull/6006
* thread_pthread.c: trigger THREAD_EVENT_READY when going throuhg the fast path.Jean Boussier2022-06-071-4/+4
|
* [Feature #18339] GVL Instrumentation APIJean Boussier2022-06-031-1/+107
| Ref: https://bugs.ruby-lang.org/issues/18339
|
| Design:
|
| - This tries to minimize the overhead when no hook is registered. It should only incur an extra unsynchronized boolean check.
| - The hook list is protected with a read-write lock so as to limit contention when some hooks are registered.
| - The hooks MUST be thread safe, and MUST NOT call into Ruby as they are executed outside the GVL.
| - It's simply a noop on Windows.
|
| API:
|
| ```
| rb_internal_thread_event_hook_t * rb_internal_thread_add_event_hook(rb_internal_thread_event_callback callback, rb_event_flag_t internal_event, void *user_data);
| bool rb_internal_thread_remove_event_hook(rb_internal_thread_event_hook_t * hook);
| ```
|
| You can subscribe to 3 events:
|
| - READY: called right before attempting to acquire the GVL
| - RESUMED: called right after successfully acquiring the GVL
| - SUSPENDED: called right after releasing the GVL.
|
| The hooks MUST be threadsafe, as they are executed outside of the GVL; they also MUST NOT call any Ruby API.
* Support old Mac OS X SDK and gccNobuyoshi Nakada2022-05-271-2/+17
| Follow up of https://github.com/ruby/ruby/pull/5927
|
| `pthread_threadid_np()` is not even declared in outdated SDKs.
|
| Also, the `__API_AVAILABLE` macro does not work on gcc, which does not support the [availability] attribute of clang, so an additional weak symbol declaration is required to check for weakly linked symbols.
|
| [availability]: https://clang.llvm.org/docs/AttributeReference.html#availability
* altstack is native thread's attrKoichi Sasada2022-05-241-2/+2
| | | | Move th->altstack to th->nt->altstack.
* remove `DEBUG_OUT()` macroKoichi Sasada2022-05-241-12/+0
| | | | This macro is no longer used ([GH-5933]).
* use `RUBY_DEBUG_LOG` instead of `thread_debug`Koichi Sasada2022-05-241-10/+12
| | | | | `thread_debug()` was introduced to print debug messages on `THREAD_DEBUG > 0` but `RUBY_DEBUG_LOG()` is more controllable.
* remove `NON_SCALAR_THREAD_ID` supportKoichi Sasada2022-05-241-6/+7
| `NON_SCALAR_THREAD_ID` indicates that `pthread_t` is non-scalar (non-pointer), and s390x is the only known platform. However, the supporting code is very complex and it is only used for debug print information. So this patch removes the support of `NON_SCALAR_THREAD_ID` and makes the code simple.
* Support old Mac OS XNobuyoshi Nakada2022-05-231-0/+8
| | | | | `pthread_threadid_np` is available since Mac OS X 10.6, use `pthread_mach_thread_np` on older systems.
* Revert broken thread_pthread.c in 539459abda3Nobuyoshi Nakada2022-05-221-17/+4
|