aboutsummaryrefslogtreecommitdiffstats
path: root/vm_insnhelper.c
Commit message (Collapse)AuthorAgeFilesLines
* vm_call_method: avoid marking on-stack object卜部昌平2020-06-101-1/+2
| | | | | | This callcache is on stack, must not be GCed. However its contents are copied from other materials, which can be an ordinal object. Should set a flag to make sure it is properly skipped by the GC.
* rb_eql_opt,rb_equal_opt: purge stale cc卜部昌平2020-06-091-4/+2
| | | | | When on USE_EMBED_CI, cd is stored statically. Previous use could cache stale cd->cc, which could have already been GCed. Need flush them.
* vm_ccs_push: do not cache non-heap entries卜部昌平2020-06-091-0/+7
| | | | | Entires not GC-able must be considered to be volatile. Not eligible for later use.
* VM_CI_NEW_ID: USE_EMBED_CI could be false卜部昌平2020-06-091-6/+16
| | | | It was a wrong idea to assume CIs are always embedded.
* eliminate C99 compound literals卜部昌平2020-06-091-16/+27
| | | | | Ko1 prefers variables be assgined, instead of bare literals in function arguments.
* vm_call_method: use struct assignment卜部昌平2020-06-091-1/+2
| | | | | This further reduces the generated binary of vm_call_method from 566 bytes to 545 bytes on my machine, according to nm(1).
* rb_vm_call0: on-stack call info卜部昌平2020-06-091-22/+0
| | | | | This changeset reduces the generated binary of rb_vm_call0 from 281 bytes to 211 bytes on my machine. Should reduce GC pressure as well.
* vm_yield_setup_args: refactor use macro卜部昌平2020-06-091-4/+1
|
* vm_call_method: no call vm_cc_fill卜部昌平2020-06-091-5/+3
| | | | | This changeset reduces the generated binary of vm_call_method from 600 bytes to 566 bytes on my machine, accroding to nm(1).
* vm_call_refined: no call vm_cc_fill卜部昌平2020-06-091-5/+3
| | | | | This changeset reduces the generated binary of vm_call_method_each_type from 2,442 bytes to 2,378 bytes on my machine, accroding to nm(1).
* vm_call_zsuper: no call vm_cc_fill卜部昌平2020-06-091-6/+3
| | | | | This changeset reduces the generated binary of vm_call_method_each_type from 2,522 bytes to 2,442 bytes on my machine, accroding to nm(1).
* vm_call_method_missing_body: on-stack call info卜部昌平2020-06-091-9/+5
| | | | | | This changeset reduces the generated binary of vm_call_method_missing_body from 604 bytes to 532 bytes on my machine. Should reduce GC pressure as well.
* vm_call_symbol: on-stack call info卜部昌平2020-06-091-8/+16
| | | | | This changeset reduces the generated binary of vm_call_symbol from 808 bytes to 798 bytes on my machine. Should reduce GC pressure as well.
* vm_call_alias: no call vm_cc_fill卜部昌平2020-06-091-6/+15
| | | | | This changeset reduces the generated binary of vm_call_alias from 188 bytes to 149 bytes on my machine, accroding to nm(1).
* rb_eql_opt: fully static call data卜部昌平2020-06-091-4/+8
| | | | | This changeset reduces the generated binary of rb_eql_opt from 86 bytes to 20 bytes on my machine, according to nm(1).
* rb_vm_search_method_slowpath: skip vm_empty_cc卜部昌平2020-06-091-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that vm_empty_cc is statically allocated outside of the object space. It shall not be GCed. Here, because vm_search_cc can return that. Must not be blindly passed to RB_OBJ_WRITE, unless assertions fail on RGENGC_CHECK_MODE, like this: -- C level backtrace information ------------------------------------------- ruby(rb_print_backtrace+0x19) [0x5555557fd579] vm_dump.c:757 ruby(rb_vm_bugreport+0x151) [0x5555557fd6f1] vm_dump.c:955 ruby(rb_bug+0x1d6) [0x5555558d6396] error.c:660 ruby(check_rvalue_consistency_force+0x707) [0x5555555adb97] gc.c:1289 ruby(check_rvalue_consistency+0x1a) [0x555555598a0a] gc.c:1305 ruby(RVALUE_OLD_P+0x15) [0x5555555975d5] gc.c:1382 ruby(rb_gc_writebarrier+0x9f) [0x55555559753f] gc.c:6882 ruby(rb_obj_written+0x3a) [0x5555557a025a] include/ruby/internal/rgengc.h:180 ruby(rb_obj_write+0x41) [0x5555557a1a81] include/ruby/internal/rgengc.h:195 ruby(rb_vm_search_method_slowpath+0x5a) [0x5555557a125a] vm_insnhelper.c:1603 ruby(vm_search_method_fastpath+0x197) [0x5555557d8027] vm_insnhelper.c:1638 ruby(vm_search_method+0xea) [0x5555557d7d2a] vm_insnhelper.c:1650 ruby(vm_search_method_wrap+0x29) [0x5555557dbaf9] vm_insnhelper.c:4091 ruby(vm_sendish+0xa9) [0x5555557dba39] vm_insnhelper.c:4143 ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_collect+0xb0) [0x555555828320] array.c:3186 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(invoke_block+0xea) [0x5555557f42fa] vm.c:1058 ruby(invoke_iseq_block_from_c+0x16e) [0x5555557f3eae] vm.c:1130 ruby(invoke_block_from_c_bh) vm.c:1148 ruby(vm_yield+0x71) [0x5555557f3c41] vm.c:1193 ruby(rb_yield_0+0x25) [0x5555557ca615] vm_eval.c:1141 ruby(rb_yield_1+0x27) [0x5555557ca5c7] vm_eval.c:1147 ruby(rb_yield+0x34) [0x5555557ca654] vm_eval.c:1157 ruby(rb_ary_each+0xa5) [0x55555581c795] array.c:2242 ruby(call_cfunc_0+0x29) [0x5555557f0f39] vm_insnhelper.c:2385 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe0f8) [0x5555557b04f8] insns.def:782 ruby(rb_vm_exec+0x19f) [0x5555557d183f] vm.c:1951 ruby(rb_iseq_eval+0x30) [0x5555557d2530] vm.c:2190 ruby(load_iseq_eval+0xd6) [0x5555555fa7e6] load.c:592 ruby(require_internal+0x25e) [0x5555555f7f5e] load.c:1022 ruby(rb_require_string+0x27) [0x5555555f74e7] load.c:1094 ruby(rb_f_require_relative+0x5f) [0x5555555f758f] load.c:837 ruby(call_cfunc_1+0x30) [0x5555557f0f70] vm_insnhelper.c:2391 ruby(vm_call_cfunc_with_frame+0x278) [0x5555557eca98] vm_insnhelper.c:2553 ruby(vm_call_cfunc+0xad) [0x5555557e521d] vm_insnhelper.c:2574 ruby(vm_call_method_each_type+0xc7) [0x5555557e4af7] vm_insnhelper.c:3040 ruby(vm_call_method+0x19c) [0x5555557e45dc] vm_insnhelper.c:3144 ruby(vm_call_general+0x2d) [0x5555557c8c3d] vm_insnhelper.c:3176 ruby(vm_sendish+0xd0) [0x5555557dba60] vm_insnhelper.c:4146 ruby(vm_exec_core+0xe357) [0x5555557b0757] insns.def:801 ruby(rb_vm_exec+0x12c) [0x5555557d17cc] vm.c:1942 ruby(rb_iseq_eval_main+0x30) [0x5555557d2670] vm.c:2201 ruby(rb_ec_exec_node+0x16b) [0x55555557e39b] eval.c:296 ruby(ruby_run_node+0x72) [0x55555557e1f2] eval.c:354 ruby(main+0x78) [0x55555557a5d8] main.c:50
* rb_equal_opt: fully static call data卜部昌平2020-06-091-10/+16
| | | | | This changeset reduces the generated binary of rb_equal_opt from 129 bytes to 17 bytes on my machine, according to nm(1).
* vm_search_method_fastpath: avoid rb_vm_empty_cc()卜部昌平2020-06-091-3/+12
| | | | | This is such a hot path that it's worth eliminating a function call. Use the static variable directly instead.
* check_cfunc: add assertions卜部昌平2020-06-091-4/+11
| | | | For debug. Must not change generated binary unless VM_ASSERT is on.
* Properly resolve refinements in defined? on private call [Bug #16932]Nobuyoshi Nakada2020-06-041-4/+1
|
* Properly resolve refinements in defined? on method call [Bug #16932]Nobuyoshi Nakada2020-06-041-1/+1
|
* vm_invoke_proc_block: reduce recursion卜部昌平2020-06-031-4/+8
| | | | | According to nobu recursion can be longer than my expectation. Limit them here.
* vm_call_symbol: check stack overflow卜部昌平2020-06-031-0/+1
| | | | | | | | | VM stack could overflow here. The condition is when a symbol is passed to a block-taking method via &variable, and that symbol has never been used for actual method names (thus yielding that results in calling method_missing), and the VM stack is full (no single word left). This is a once-in-a-blue-moon event. Yet there is a very tiny room of stack overflow. We need to check that.
* vm_invoke_block: remove auto qualifier卜部昌平2020-06-031-3/+3
| | | | Was (harmless but) redundant.
* vm_insnhelper.c: add space [ci skip]卜部昌平2020-06-031-0/+6
| | | | Just cosmetic change to improve readability.
* vm_invoke_symbol_block: reduce MEMCPY卜部昌平2020-06-031-59/+84
| | | | | | | | | | | | | | | | | | This commit changes the number of calls of MEMCPY from... | send | &:sym -------------------------|-------|------- Symbol already interned | once | twice Symbol not pinned yet | none | once to: | send | &:sym -------------------------|-------|------- Symbol already interned | once | none Symbol not pinned yet | twice | once So it sacrifices exceptional situation for normal path.
* vm_invoke_symbol_block: call vm_call_opt_send卜部昌平2020-06-031-8/+33
| | | | | | Symbol#to_proc and Object#send are closely related each other. Why not share their implementations. By doing so we can skip recursive call of vm_exec(), which could benefit for speed.
* vm_invoke_block: force indirect jump卜部昌平2020-06-031-11/+11
| | | | | | | | | | | | | | This changeset slightly speeds up on my machine. Calculating ------------------------------------- before after Optcarrot Lan_Master.nes 38.33488426546287 40.89825082589147 fps 40.91288557922081 41.48687465359386 40.96591995270991 41.98499064664184 41.20461943032173 43.67314690779162 42.38344888176518 44.02777536251875 43.43563728880915 44.88695892714136 43.88082889062643 45.11226186242523
* vm_invoke_block: insertion of unused args卜部昌平2020-06-031-7/+7
| | | | | | | This makes it possible for vm_invoke_block to pass its passed arguments verbatimly to calling functions. Because they are tail-called the function calls can be strength-recuced into indirect jumps, which is a huge win.
* vm_invoke_block: eliminate goto卜部昌平2020-06-031-10/+16
| | | | Use recursion for better readability.
* vm_invoke_block: move logics around卜部昌平2020-06-031-12/+9
| | | | | Moved block handler -> captured block conversion from vm_invokeblock to each vm_invoke_*_block functions.
* Fixed `defined?` against protected method callNobuyoshi Nakada2020-06-021-1/+1
| | | | | | Protected methods are restricted to be called according to the class/module in where it is defined, not the actual receiver's class. [Bug #16931]
* vm_insnhelper.c: merge opt_eq_func / opt_eql_func卜部昌平2020-06-021-77/+47
| | | | | | | | These two function were almost identical, except in case of T_STRING/T_FLOAT. Why not merge them into one, and let the difference be handled in normal method calls (slowpath). This does not improve runtime performance for me, but at least reduces for instance rb_eql_opt from 653 bytes to 86 bytes on my machine, according to nm(1).
* Mark vm_stackoverflow as NOINLINE COLDFUNC on JITTakashi Kokubun2020-05-261-0/+3
| | | | to reduce code size and improve locality of hot code.
* Fix origin iclass pointer for modulesJeremy Evans2020-05-221-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a module has an origin, and that module is included in another module or class, previously the iclass created for the module had an origin pointer to the module's origin instead of the iclass's origin. Setting the origin pointer correctly requires using a stack, since the origin iclass is not created until after the iclass itself. Use a hidden ruby array to implement that stack. Correctly assigning the origin pointers in the iclass caused a use-after-free in GC. If a module with an origin is included in a class, the iclass shares a method table with the module and the iclass origin shares a method table with module origin. Mark iclass origin with a flag that notes that even though the iclass is an origin, it shares a method table, so the method table should not be garbage collected. The shared method table will be garbage collected when the module origin is garbage collected. I've tested that this does not introduce a memory leak. This change caused a VM assertion failure, which was traced to callable method entries using the incorrect defined_class. Update rb_vm_check_redefinition_opt_method and find_defined_class_by_owner to treat iclass origins different than class origins to avoid this issue. This also includes a fix for Module#included_modules to skip iclasses with origins. Fixes [Bug #16736]
* fix memory leak of ccsKoichi Sasada2020-05-221-2/+8
| | | | | | | rb_callable_method_entry() creates ccs entry in cc_tbl, but this code overwrite by insert newly created ccs and overwrote ccs never freed. [Bug #16900]
* more on NULL versus functions卜部昌平2020-05-111-4/+4
| | | | | | | Function pointers are not void*. See also 115fec062ccf7c6d72c8d5f64b7a5d84c9fb2dd8 ce4ea956d24eab5089a143bba38126f2b11b55b6 8427fca49bd85205f5a8766292dd893f003c0e48
* sed -i 's|ruby/impl|ruby/internal|'卜部昌平2020-05-111-1/+1
| | | | To fix build failures.
* sed -i s|ruby/3|ruby/impl|g卜部昌平2020-05-111-1/+1
| | | | This shall fix compile errors.
* Disable -Wswitch warning when VM_CHECK_MODENobuyoshi Nakada2020-05-031-0/+2
|
* Invalidate fastpath when calling attr_reader by superTakashi Kokubun2020-04-141-2/+2
| | | | The same bug as 8355a99883 existed in attr_reader too.
* Invalidate fastpath when calling attr_writer by superTakashi Kokubun2020-04-141-3/+12
| | | | | | | | | | | | | | We started to use fastpath on invokesuper when a method is not refinements since 5c27681813, but we shouldn't have used fastpath for attr_writer either. `cc->aux_.attr_index` is for an actual receiver class, while we store its superclass in `cc->klass` and therefore there's no way to properly invalidate attr_writer's inline cache when it's called by super. [Bug #16785] I suspect the same bug also exists in attr_reader. I'll address that in another commit.
* Make vm_call_cfunc_with_frame a fastpath (#3027)Takashi Kokubun2020-04-131-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT (i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat). Micro benchmark: ``` $ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4 before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux] after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux] Calculating ------------------------------------- before after vm_send_cfunc 69.585M 88.724M i/s - 100.000M times in 1.437097s 1.127096s Comparison: vm_send_cfunc after: 88723605.2 i/s before: 69584737.1 i/s - 1.28x slower ``` Optcarrot: ``` $ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux] after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux] Calculating ------------------------------------- before after Optcarrot Lan_Master.nes 50.76119601545175 42.73858236484051 fps 50.76388649761503 51.04211379912850 50.80930672252514 51.39455790755538 50.90236000778749 51.75656936556145 51.01744746340430 51.86875277356489 51.06495279015112 51.88692482485558 51.07785337168974 51.93429603190578 51.20163525187862 51.95768145071314 51.34671771913112 52.45577266040274 51.35918340835583 52.53163888762858 51.46641337418146 52.62172484121034 51.50835463462257 52.85064021113239 ```
* Enable fastpath on invokesuper (#3021)Takashi Kokubun2020-04-111-2/+3
| | | | | | | | | | | | | | | | | | Fastpath has not been used for invokesuper since it has set vm_call_super_method on every invocation. Because it seems to be blocked only by refinements, try enabling fastpath on invokesuper when cme is not for refinements. While this patch itself should be helpful for VM performance, a part of this patch's motivation is to unblock inlining invokesuper on JIT. $ benchmark-driver -v --rbenv 'before;after' benchmark/vm2_super.yml --repeat-count=4 before: ruby 2.8.0dev (2020-04-11T15:19:58Z master a01bda5949) [x86_64-linux] after: ruby 2.8.0dev (2020-04-12T02:00:08Z invokesuper-fastpath c171984ee3) [x86_64-linux] Calculating ------------------------------------- before after vm2_super 20.031M 32.860M i/s - 6.000M times in 0.299534s 0.182593s Comparison: vm2_super after: 32859885.2 i/s before: 20031097.3 i/s - 1.64x slower
* Turn class variable warnings into exceptionsJeremy Evans2020-04-101-4/+4
| | | | | | | | | | | | | | | | | | This changes the following warnings: * warning: class variable access from toplevel * warning: class variable @foo of D is overtaken by C into RuntimeErrors. Handle defined?(@@foo) at toplevel by returning nil instead of raising an exception (the previous behavior warned before returning nil when defined? was used). Refactor the specs to avoid the warnings even in older versions. The specs were checking for the warnings, but the purpose of the related specs as evidenced from their description is to test for behavior, not for warnings. Fixes [Bug #14541]
* Merge pull request #2991 from shyouhei/ruby.h卜部昌平2020-04-081-3/+2
| | | Split ruby.h
* Reduce allocations for keyword argument hashesJeremy Evans2020-03-171-6/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, passing a keyword splat to a method always allocated a hash on the caller side, and accepting arbitrary keywords in a method allocated a separate hash on the callee side. Passing explicit keywords to a method that accepted a keyword splat did not allocate a hash on the caller side, but resulted in two hashes allocated on the callee side. This commit makes passing a single keyword splat to a method not allocate a hash on the caller side. Passing multiple keyword splats or a mix of explicit keywords and a keyword splat still generates a hash on the caller side. On the callee side, if arbitrary keywords are not accepted, it does not allocate a hash. If arbitrary keywords are accepted, it will allocate a hash, but this commit uses a callinfo flag to indicate whether the caller already allocated a hash, and if so, the callee can use the passed hash without duplicating it. So this commit should make it so that a maximum of a single hash is allocated during method calls. To set the callinfo flag appropriately, method call argument compilation checks if only a single keyword splat is given. If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT callinfo flag is not set, since in that case the keyword splat is passed directly and not mutable. If more than one splat is used, a new hash needs to be generated on the caller side, and in that case the callinfo flag is set, indicating the keyword splat is mutable by the callee. In compile_hash, used for both hash and keyword argument compilation, if compiling keyword arguments and only a single keyword splat is used, pass the argument directly. On the caller side, in vm_args.c, the callinfo flag needs to be recognized and handled. Because the keyword splat argument may not be a hash, it needs to be converted to a hash first if not. Then, unless the callinfo flag is set, the hash needs to be duplicated. The temporary copy of the callinfo flag, kw_flag, is updated if a hash was duplicated, to prevent the need to duplicate it again. If we are converting to a hash or duplicating a hash, we need to update the argument array, which can including duplicating the positional splat array if one was passed. CALLER_SETUP_ARG and a couple other places needs to be modified to handle similar issues for other types of calls. This includes fairly comprehensive tests for different ways keywords are handled internally, checking that you get equal results but that keyword splats on the caller side result in distinct objects for keyword rest parameters. Included are benchmarks for keyword argument calls. Brief results when compiled without optimization: def kw(a: 1) a end def kws(**kw) kw end h = {a: 1} kw(a: 1) # about same kw(**h) # 2.37x faster kws(a: 1) # 1.30x faster kws(**h) # 2.19x faster kw(a: 1, **h) # 1.03x slower kw(**h, **h) # about same kws(a: 1, **h) # 1.16x faster kws(**h, **h) # 1.14x faster
* %p is for void *卜部昌平2020-03-041-1/+1
| | | | | | | | See also 35eb12c06397e770392a41343cbffc4b204e15c9 6f5eb285077d9abf8f97056531996c58674b570c 687308cf0dab0af675e40da2b6ab8ccd5f77c072 b6a2d63eb3dbc31e110e8cb95e054dd71d49a611
* method_missing_reason should be set.Koichi Sasada2020-03-031-0/+1
| | | | | | | | | | | send() has special method launcher in VM and it has special method_missing caller. This path doesn't set ec->method_missing_reason which is used at exception creation, so setup this information. Without this setting, NoMethodError exception becomes NameError. This patch will fix: http://ci.rvm.jp/results/trunk-random1@phosphorus-docker/2761643
* check imemo_typeKoichi Sasada2020-02-271-3/+10
| | | | | check imemo_type to debug http://ci.rvm.jp/results/trunk-vm-asserts@silicon-docker/2744755