aboutsummaryrefslogtreecommitdiffstats
path: root/vm_insnhelper.c
Commit message (Collapse)AuthorAgeFilesLines
* prohibi method call by defined_method in other racotrsKoichi Sasada2020-09-251-1/+6
| | | | We can not call a non-isolated Proc in multiple ractors.
* Improve the performance of supereileencodes2020-09-231-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This PR improves the performance of `super` calls. While working on some Rails optimizations jhawthorn discovered that `super` calls were slower than expected. The changes here do the following: 1) Adds a check for whether the call frame is not equal to the method entry iseq. This avoids the `rb_obj_is_kind_of` check on the next line which is quite slow. If the current call frame is equal to the method entry we know we can't have an instance eval, etc. 2) Changes `FL_TEST` to `FL_TEST_RAW`. This is safe because we've already done the check for `T_ICLASS` above. 3) Adds a benchmark for `T_ICLASS` super calls. 4) Note: makes a chage for `method_entry_cref` to use `const`. On master the benchmarks showed that `super` is 1.76x slower. Our changes improved the performance so that it is now only 1.36x slower. Benchmark IPS: ``` Warming up -------------------------------------- super 244.918k i/100ms method call 383.007k i/100ms Calculating ------------------------------------- super 2.280M (± 6.7%) i/s - 11.511M in 5.071758s method call 3.834M (± 4.9%) i/s - 19.150M in 5.008444s Comparison: method call: 3833648.3 i/s super: 2279837.9 i/s - 1.68x (± 0.00) slower ``` With changes: ``` Warming up -------------------------------------- super 308.777k i/100ms method call 375.051k i/100ms Calculating ------------------------------------- super 2.951M (± 5.4%) i/s - 14.821M in 5.039592s method call 3.551M (± 4.9%) i/s - 18.002M in 5.081695s Comparison: method call: 3551372.7 i/s super: 2950557.9 i/s - 1.20x (± 0.00) slower ``` Ruby VM benchmarks also showed an improvement: Existing `vm_super` benchmark`. ``` $ make benchmark ITEM=vm_super | |compare-ruby|built-ruby| |:---------|-----------:|---------:| |vm_super | 21.555M| 37.819M| | | -| 1.75x| ``` New `vm_iclass_super` benchmark: ``` $ make benchmark ITEM=vm_iclass_super | |compare-ruby|built-ruby| |:----------------|-----------:|---------:| |vm_iclass_super | 1.669M| 3.683M| | | -| 2.21x| ``` This is the benchmark script used for the benchmark-ips benchmarks: ```ruby require "benchmark/ips" class Foo def zuper; end def top; end last_method = "top" ("A".."M").each do |module_name| eval <<-EOM module #{module_name} def zuper; super; end def #{module_name.downcase} #{last_method} end end prepend #{module_name} EOM last_method = module_name.downcase end end foo = Foo.new Benchmark.ips do |x| x.report "super" do foo.zuper end x.report "method call" do foo.m end x.compare! end ``` Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: John Hawthorn <john@hawthorn.email>
* Revert "Prevent SystemStackError when calling super in module with activated ↵Jeremy Evans2020-09-221-3/+0
| | | | | | | | | | | | | | refinement" This reverts commit eeef16e190cdabc2ba474622720f8e3df7bac43b. This also reverts the spec change. Preventing the SystemStackError would be nice, but there is valid code that the fix breaks, and it is probably more common than cases that cause the SystemStackError. Fixes [Bug #17182]
* Interpolated strings are no longer frozen with frozen-string-literal: trueBenoit Daloze2020-09-151-9/+0
| | | | | * Remove freezestring instruction since this was the only usage for it. * [Feature #17104]
* Introduce Ractor mechanism for parallel executionKoichi Sasada2020-09-031-22/+33
| | | | | | | | | | | | | | | | This commit introduces Ractor mechanism to run Ruby program in parallel. See doc/ractor.md for more details about Ractor. See ticket [Feature #17100] to see the implementation details and discussions. [Feature #17100] This commit does not complete the implementation. You can find many bugs on using Ractor. Also the specification will be changed so that this feature is experimental. You will see a warning when you make the first Ractor with `Ractor.new`. I hope this feature can help programmers from thread-safety issues.
* Remove the pc argument of vm_trace()Alan Wu2020-09-011-2/+3
| | | | This makes the binary 272 bytes smaller on -O3 GCC 10.2.0.
* Fix Method#super_method for aliased methodsJeremy Evans2020-08-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | Previously, Method#super_method looked at the called_id to determine the method id to use, but that isn't correct for aliased methods, because the super target depends on the original method id, not the called_id. Additionally, aliases can reference methods defined in other classes and modules, and super lookup needs to start in the super of the defined class in such cases. This adds tests for Method#super_method for both types of aliases, one that uses VM_METHOD_TYPE_ALIAS and another that does not. Both check that the results for calling super methods return the expected values. To find the defined class for alias methods, add an rb_ prefix to find_defined_class_by_owner in vm_insnhelper.c and make it non-static, so that it can be called from method_super_method in proc.c. This bug was original discovered while researching [Bug #11189]. Fixes [Bug #17130]
* Prevent SystemStackError when calling super in module with activated refinementJeremy Evans2020-07-271-0/+3
| | | | | | | | | Without this, if a refinement defines a method that calls super and includes a module with a module that calls super and has a activated refinement at the point super is called, the module method super call will end up calling back into the refinement method, creating a loop. Fixes [Bug #17007]
* Fixed another typoNobuyoshi Nakada2020-07-101-1/+1
|
* Fixed typosNobuyoshi Nakada2020-07-101-2/+2
|
* vm_push_frame_debug_counter_inc: use branches卜部昌平2020-07-101-6/+16
| | | | Ko1 doesn't like previous code.
* vm_push_frame: move assignments around卜部昌平2020-07-101-14/+14
| | | | | | Struct assignment using a compound literal is more readable than before, to me at least. It seems compilers reorder assignments anyways. Neither speedup nor slowdown is observed on my machine.
* vm_push_frame: move assertions out of the function卜部昌平2020-07-101-3/+4
| | | | These assertions are purely static. Ned not be checked on-the-fly.
* vm_push_frame: hoist out debug codes卜部昌平2020-07-101-28/+41
| | | | Made it a bit readable.
* nobody uses the return value of vm_push_frame卜部昌平2020-07-101-3/+1
| | | | Surprised to see such a waste of time in this super duper hot path.
* Use ID instead of GENTRY for gvars. (#3278)Koichi Sasada2020-07-031-1/+1
| | | | | | | | | | | | Use ID instead of GENTRY for gvars. Global variables are compiled into GENTRY (a pointer to struct rb_global_entry). This patch replace this GENTRY to ID and make the code simple. We need to search GENTRY from ID every time (st_lookup), so additional overhead will be introduced. However, the performance of accessing global variables is not important now a day and this simplicity helps Ractor development.
* Extracted METHOD_ENTRY_CACHEABLE macroNobuyoshi Nakada2020-06-301-4/+4
|
* vm_getivar: do not goto into a branch卜部昌平2020-06-291-17/+21
| | | | | I'm not necessarily against every goto in general, but jumping into a branch is definitely a bad idea. Better refactor.
* Decide JIT-ed insn based on cached cfuncTakashi Kokubun2020-06-251-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for opt_* insns. opt_eq handles rb_obj_equal inside opt_eq, and all other cfunc is handled by opt_send_without_block. Therefore we can't decide which insn should be generated by checking whether it's cfunc cc or not. ``` $ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_opt_cc_insns.yml --repeat-count=4 before --jit: ruby 2.8.0dev (2020-06-26T05:21:43Z master 9dbc2294a6) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-06-26T06:30:18Z master 75cece1b0b) +JIT [x86_64-linux] last_commit=Decide JIT-ed insn based on cached cfunc Calculating ------------------------------------- before --jit after --jit mjit_nil?(1) 73.878M 74.021M i/s - 40.000M times in 0.541432s 0.540391s mjit_not(1) 72.635M 74.601M i/s - 40.000M times in 0.550702s 0.536187s mjit_eq(1, nil) 7.331M 7.445M i/s - 8.000M times in 1.091211s 1.074596s mjit_eq(nil, 1) 49.450M 64.711M i/s - 8.000M times in 0.161781s 0.123627s Comparison: mjit_nil?(1) after --jit: 74020528.4 i/s before --jit: 73878185.9 i/s - 1.00x slower mjit_not(1) after --jit: 74600882.0 i/s before --jit: 72634507.6 i/s - 1.03x slower mjit_eq(1, nil) after --jit: 7444657.4 i/s before --jit: 7331304.3 i/s - 1.02x slower mjit_eq(nil, 1) after --jit: 64710790.6 i/s before --jit: 49449507.4 i/s - 1.31x slower ```
* Verify builtin inline annotation with VM_CHECK_MODE (#3244)Takashi Kokubun2020-06-211-4/+7
| | | | | * Verify builtin inline annotation with VM_CHECK_MODE * Remove static to fix the link issue on MJIT
* Fix -Wmaybe-uninitialized at vm_invoke_blockTakashi Kokubun2020-06-211-0/+1
|
* Replaced accessors of `Struct` with `invokebuiltin`Nobuyoshi Nakada2020-06-171-14/+0
|
* Revert "Replaced accessors of `Struct` with `invokebuiltin`"Nobuyoshi Nakada2020-06-161-0/+14
| | | | | This reverts commit 19cabe8b09d92d033c244f32ff622b8e513375f1, which didn't support tool/lib/iseq_loader_checker.rb.
* Replaced accessors of `Struct` with `invokebuiltin`Nobuyoshi Nakada2020-06-161-14/+0
|
* vm_call_method: avoid marking on-stack object卜部昌平2020-06-101-1/+2
| | | | | | This callcache is on stack, must not be GCed. However its contents are copied from other materials, which can be an ordinal object. Should set a flag to make sure it is properly skipped by the GC.
* rb_eql_opt,rb_equal_opt: purge stale cc卜部昌平2020-06-091-4/+2
| | | | | When on USE_EMBED_CI, cd is stored statically. Previous use could cache stale cd->cc, which could have already been GCed. Need flush them.
* vm_ccs_push: do not cache non-heap entries卜部昌平2020-06-091-0/+7
| | | | | Entires not GC-able must be considered to be volatile. Not eligible for later use.
* VM_CI_NEW_ID: USE_EMBED_CI could be false卜部昌平2020-06-091-6/+16
| | | | It was a wrong idea to assume CIs are always embedded.
* eliminate C99 compound literals卜部昌平2020-06-091-16/+27
| | | | | Ko1 prefers variables be assgined, instead of bare literals in function arguments.
* vm_call_method: use struct assignment卜部昌平2020-06-091-1/+2
| | | | | This further reduces the generated binary of vm_call_method from 566 bytes to 545 bytes on my machine, according to nm(1).
* rb_vm_call0: on-stack call info卜部昌平2020-06-091-22/+0
| | | | | This changeset reduces the generated binary of rb_vm_call0 from 281 bytes to 211 bytes on my machine. Should reduce GC pressure as well.
* vm_yield_setup_args: refactor use macro卜部昌平2020-06-091-4/+1
|
* vm_call_method: no call vm_cc_fill卜部昌平2020-06-091-5/+3
| | | | | This changeset reduces the generated binary of vm_call_method from 600 bytes to 566 bytes on my machine, accroding to nm(1).
* vm_call_refined: no call vm_cc_fill卜部昌平2020-06-091-5/+3
| | | | | This changeset reduces the generated binary of vm_call_method_each_type from 2,442 bytes to 2,378 bytes on my machine, accroding to nm(1).
* vm_call_zsuper: no call vm_cc_fill卜部昌平2020-06-091-6/+3
| | | | | This changeset reduces the generated binary of vm_call_method_each_type from 2,522 bytes to 2,442 bytes on my machine, accroding to nm(1).
* vm_call_method_missing_body: on-stack call info卜部昌平2020-06-091-9/+5
| | | | | | This changeset reduces the generated binary of vm_call_method_missing_body from 604 bytes to 532 bytes on my machine. Should reduce GC pressure as well.
* vm_call_symbol: on-stack call info卜部昌平2020-06-091-8/+16
| | | | | This changeset reduces the generated binary of vm_call_symbol from 808 bytes to 798 bytes on my machine. Should reduce GC pressure as well.
* vm_call_alias: no call vm_cc_fill卜部昌平2020-06-091-6/+15
| | | | | This changeset reduces the generated binary of vm_call_alias from 188 bytes to 149 bytes on my machine, accroding to nm(1).
* rb_eql_opt: fully static call data卜部昌平2020-06-091-4/+8
| | | | | This changeset reduces the generated binary of rb_eql_opt from 86 bytes to 20 bytes on my machine, according to nm(1).
* rb_vm_search_method_slowpath: skip vm_empty_cc卜部昌平2020-06-091-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that vm_empty_cc is statically allocated outside of the object space. It shall not be GCed. Here, because vm_search_cc can return that. Must not be blindly passed to RB_OBJ_WRITE, unless assertions fail on RGENGC_CHECK_MODE, like this: -- C level backtrace information ------------------------------------------- ruby(rb_print_backtrace+0x19) [0x5555557fd579] vm_dump.c:757 ruby(rb_vm_bugreport+0x151) [0x5555557fd6f1] vm_dump.c:955 ruby(rb_bug+0x1d6) [0x5555558d6396] error.c:660 ruby(check_rvalue_consistency_force+0x707) [0x5555555adb97] gc.c:1289 ruby(check_rvalue_consistency+0x1a) [0x555555598a0a] gc.c:1305 ruby(RVALUE_OLD_P+0x15) [0x5555555975d5] gc.c:1382 ruby(rb_gc_writebarrier+0x9f) [0x55555559753f] gc.c:6882 ruby(rb_obj_written+0x3a) [0x5555557a025a] include/ruby/internal/rgengc.h:180 ruby(rb_obj_write+0x41) [0x5555557a1a81] include/ruby/internal/rgengc.h:195 ruby(rb_vm_search_method_slowpath+0x5a) [0x5555557a125a] vm_insnhelper.c:1603 ruby(vm_search_method_fastpath+0x197) [0x5555557d8027] vm_insnhelper.c:1638 ruby(vm_search_method+0xea) [0x5555557d7d2a] vm_insnhelper.c:1650 ruby(vm_search_method_wrap+0x29) [0x5555557dbaf9] vm_insnhelper.c:4091 ruby(vm_sendish+0xa9) [0x5555557dba39] vm_insnhelper.c:4143 ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_collect+0xb0) [0x555555828320] array.c:3186 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x19f) [0x5555557d183f] vm.c:1951 ruby(rb_iseq_eval+0x30) [0x5555557d2530] vm.c:2190 ruby(load_iseq_eval+0xd6) [0x5555555fa7e6] load.c:592 ruby(require_internal+0x25e) [0x5555555f7f5e] load.c:1022 ruby(rb_require_string+0x27) [0x5555555f74e7] load.c:1094 ruby(rb_f_require_relative+0x5f) [0x5555555f758f] load.c:837 ruby(call_cfunc_1+0x30) [0x5555557f0f70] vm_insnhelper.c:2391 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_call_cfunc+0xad) [0x5555557e521d] vm_insnhelper.c:2574 ruby(vm_call_method_each_type+0xc7) [0x5555557e4af7] vm_insnhelper.c:3040 ruby(vm_call_method+0x19c) [0x5555557e45dc] vm_insnhelper.c:3144 ruby(vm_call_general+0x2d) [0x5555557c8c3d] vm_insnhelper.c:3176 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(rb_iseq_eval_main+0x30) [0x5555557d2670] vm.c:2201 ruby(rb_ec_exec_node+0x16b) [0x55555557e39b] eval.c:296 ruby(ruby_run_node+0x72) [0x55555557e1f2] eval.c:354 ruby(main+0x78) [0x55555557a5d8] main.c:50
* rb_equal_opt: fully static call data卜部昌平2020-06-091-10/+16
| | | | | This changeset reduces the generated binary of rb_equal_opt from 129 bytes to 17 bytes on my machine, according to nm(1).
* vm_search_method_fastpath: avoid rb_vm_empty_cc()卜部昌平2020-06-091-3/+12
| | | | | This is such a hot path that it's worth eliminating a function call. Use the static variable directly instead.
* check_cfunc: add assertions卜部昌平2020-06-091-4/+11
| | | | For debug. Must not change generated binary unless VM_ASSERT is on.
* Properly resolve refinements in defined? on private call [Bug #16932]Nobuyoshi Nakada2020-06-041-4/+1
|
* Properly resolve refinements in defined? on method call [Bug #16932]Nobuyoshi Nakada2020-06-041-1/+1
|
* vm_invoke_proc_block: reduce recursion卜部昌平2020-06-031-4/+8
| | | | | According to nobu recursion can be longer than my expectation. Limit them here.
* vm_call_symbol: check stack overflow卜部昌平2020-06-031-0/+1
| | | | | | | | | VM stack could overflow here. The condition is when a symbol is passed to a block-taking method via &variable, and that symbol has never been used for actual method names (thus yielding that results in calling method_missing), and the VM stack is full (no single word left). This is a once-in-a-blue-moon event. Yet there is a very tiny room of stack overflow. We need to check that.
* vm_invoke_block: remove auto qualifier卜部昌平2020-06-031-3/+3
| | | | Was (harmless but) redundant.
* vm_insnhelper.c: add space [ci skip]卜部昌平2020-06-031-0/+6
| | | | Just cosmetic change to improve readability.
* vm_invoke_symbol_block: reduce MEMCPY卜部昌平2020-06-031-59/+84
| | | | | | | | | | | | | | | | | | This commit changes the number of calls of MEMCPY from... | send | &:sym -------------------------|-------|------- Symbol already interned | once | twice Symbol not pinned yet | none | once to: | send | &:sym -------------------------|-------|------- Symbol already interned | once | none Symbol not pinned yet | twice | once So it sacrifices exceptional situation for normal path.