aboutsummaryrefslogtreecommitdiffstats
path: root/vm_callinfo.h
Commit message (Collapse)AuthorAgeFilesLines
* Support tracing of struct member accessor methodsJeremy Evans2023-12-071-1/+8
| | | | | | | | | This follows the same approach used for attr_reader/attr_writer in 2d98593bf54a37397c6e4886ccc7e3654c2eaf85, skipping the checking for tracing after the first call using the call cache, and clearing the call cache when tracing is turned on/off. Fixes [Bug #18886]
* Use reference counting to avoid memory leak in kwargsHParker2023-10-011-0/+4
| | | | | | | | Tracks other callinfo that references the same kwargs and frees them when all references are cleared. [bug #19906] Co-authored-by: Peter Zhu <peter@peterzhu.ca>
* use inline cache for refinementsKoichi Sasada2023-07-311-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From Ruby 3.0, refined method invocations are slow because resolved methods are not cached by inline cache because of conservertive strategy. However, `using` clears all caches so that it seems safe to cache resolved method entries. This patch caches resolved method entries in inline cache and clear all of inline method caches when `using` is called. fix [Bug #18572] ```ruby # without refinements class C def foo = :C end N = 1_000_000 obj = C.new require 'benchmark' Benchmark.bm{|x| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} } _END__ user system total real master 0.362859 0.002544 0.365403 ( 0.365424) modified 0.357251 0.000000 0.357251 ( 0.357258) ``` ```ruby # with refinment but without using class C def foo = :C end module R refine C do def foo = :R end end N = 1_000_000 obj = C.new require 'benchmark' Benchmark.bm{|x| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} } __END__ user system total real master 0.957182 0.000000 0.957182 ( 0.957212) modified 0.359228 0.000000 0.359228 ( 0.359238) ``` ```ruby # with using class C def foo = :C end module R refine C do def foo = :R end end N = 1_000_000 using R obj = C.new require 'benchmark' Benchmark.bm{|x| x.report{N.times{ obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; obj.foo; }} }
* mark `cc->cme_` if it is for `super`Koichi Sasada2023-07-311-4/+26
| | | | | `vm_search_super_method()` makes orphan CCs (they are not connected from ccs) and `cc->cme_` can be collected before without marking.
* fix typo (CACH_ -> CACHE_)Ruby2023-07-281-6/+6
|
* Compile code for non-embedded CI alwaysNobuyoshi Nakada2023-06-301-8/+9
|
* Remove unused VM_CALL_BLOCKISEQ flagTakashi Kokubun2023-04-011-2/+0
|
* Improve explanation of FCALL and VCALLTakashi Kokubun2023-04-011-13/+13
|
* `vm_call_single_noarg_inline_builtin`Koichi Sasada2023-03-231-0/+26
| | | | | | | | If the iseq only contains `opt_invokebuiltin_delegate_leave` insn and the builtin-function (bf) is inline-able, the caller doesn't need to build a method frame. `vm_call_single_noarg_inline_builtin` is fast path for such cases.
* s/MJIT/RJIT/Takashi Kokubun2023-03-061-1/+1
|
* Prevent wrong integer expansionYusuke Endoh2022-10-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `(attr_index + 1)` leads to wrong integer expansion on 32-bit machines (including Solaris 10 CI) because `attr_index_t` is uint16_t. http://rubyci.s3.amazonaws.com/solaris10-gcc/ruby-master/log/20221013T080004Z.fail.html.gz ``` 1) Failure: TestRDocClassModule#test_marshal_load_version_2 [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_class_module.rb:493]: <[doc: [doc (file.rb): [para: "this is a comment"]]]> expected but was <[doc: [doc (file.rb): [para: "this is a comment"]]]>. 2) Failure: TestRDocStats#test_report_method_line [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_stats.rb:460]: Expected /\#\ in\ file\ file\.rb:4/ to match "The following items are not documented:\n" + "\n" + " class C # is documented\n" + "\n" + " # in file file.rb\n" + " def m1; end\n" + "\n" + " end\n" + "\n" + "\n". 3) Failure: TestRDocStats#test_report_attr_line [/export/home/users/chkbuild/cb-gcc/tmp/build/20221013T080004Z/ruby/test/rdoc/test_rdoc_stats.rb:91]: Expected /\#\ in\ file\ file\.rb:3/ to match "The following items are not documented:\n" + "\n" + " class C # is documented\n" + "\n" + " attr_accessor :a # in file file.rb\n" + "\n" + " end\n" + "\n" + "\n". ```
* Initialize shape attr index also in non-markable CCNobuyoshi Nakada2022-10-121-5/+9
|
* Do not read cached_id from callcache on stackYusuke Endoh2022-10-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The inline cache is initialized by vm_cc_attr_index_set only when vm_cc_markable(cc). However, vm_getivar attempted to read the cache even if the cc is not vm_cc_markable. This caused a condition that depends on uninitialized value. Here is an output of valgrind: ``` ==10483== Conditional jump or move depends on uninitialised value(s) ==10483== at 0x4C1D60: vm_getivar (vm_insnhelper.c:1171) ==10483== by 0x4C1D60: vm_call_ivar (vm_insnhelper.c:3257) ==10483== by 0x4E8E48: vm_call_symbol (vm_insnhelper.c:3481) ==10483== by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035) ==10483== by 0x4C62B2: vm_exec_core (insns.def:820) ==10483== by 0x4DD519: rb_vm_exec (vm.c:0) ==10483== by 0x4F00B3: invoke_block (vm.c:1417) ==10483== by 0x4F00B3: invoke_iseq_block_from_c (vm.c:1473) ==10483== by 0x4F00B3: invoke_block_from_c_bh (vm.c:1491) ==10483== by 0x4D42B6: rb_yield (vm_eval.c:0) ==10483== by 0x259128: rb_ary_each (array.c:2733) ==10483== by 0x4E8730: vm_call_cfunc_with_frame (vm_insnhelper.c:3227) ==10483== by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035) ==10483== by 0x4C6254: vm_exec_core (insns.def:801) ==10483== by 0x4DD519: rb_vm_exec (vm.c:0) ==10483== ``` In fact, the CI on FreeBSD 12 started failing since ad63b668e22e21c352b852f3119ae98a7acf99f1. ``` gmake[1]: Entering directory '/usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby' /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:924:in `complete': undefined method `complete' for nil:NilClass (NoMethodError) from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1816:in `block in visit' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `reverse_each' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `visit' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1847:in `block in complete' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `catch' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `complete' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1640:in `block in parse_in_order' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `catch' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `parse_in_order' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1626:in `order!' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1732:in `permute!' from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1757:in `parse!' from ./ext/extmk.rb:359:in `parse_args' from ./ext/extmk.rb:396:in `<main>' ``` This change adds a guard to read the cache only when vm_cc_markable(cc). It might be better to initialize the cache as INVALID_SHAPE_ID when the cc is not vm_cc_markable.
* Make inline cache reads / writes atomic with object shapesJemma Issroff2022-10-111-59/+22
| | | | | | | | | | | | | | Prior to this commit, we were reading and writing ivar index and shape ID in inline caches in two separate instructions when getting and setting ivars. This meant there was a race condition with ractors and these caches where one ractor could change a value in the cache while another was still reading from it. This commit instead reads and writes shape ID and ivar index to inline caches atomically so there is no longer a race condition. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: John Hawthorn <john@hawthorn.email>
* Revert "Revert "This commit implements the Object Shapes technique in CRuby.""Jemma Issroff2022-10-111-24/+84
| | | | This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
* Revert "This commit implements the Object Shapes technique in CRuby."Aaron Patterson2022-09-301-84/+24
| | | | This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
* This commit implements the Object Shapes technique in CRuby.Jemma Issroff2022-09-281-24/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com> Co-Authored-By: John Hawthorn <john@hawthorn.email>
* Revert this until we can figure out WB issues or remove shapes from GCAaron Patterson2022-09-261-84/+24
| | | | | | | | | | Revert "* expand tabs. [ci skip]" This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275. Revert "This commit implements the Object Shapes technique in CRuby." This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
* This commit implements the Object Shapes technique in CRuby.Jemma Issroff2022-09-261-24/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org> Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com> Co-Authored-By: John Hawthorn <john@hawthorn.email>
* Extract vm_ic_entry API to mimic vm_cc behaviorJemma Issroff2022-07-181-0/+20
|
* Remove a typo hash [ci skip]Nobuyoshi Nakada2022-01-291-1/+1
|
* Streamline cached attr reader / writer indexesJemma Issroff2022-01-261-2/+17
| | | | | | | This commit removes the need to increment and decrement the indexes used by vm_cc_attr_index getters and setters. It also introduces a vm_cc_attr_index_p predicate function, and a vm_cc_attr_index_initalize function.
* `mandatory_only_cme` should not be in `def`Koichi Sasada2021-12-211-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | `def` (`rb_method_definition_t`) is shared by multiple callable method entries (cme, `rb_callable_method_entry_t`). There are two issues: * old -> young reference: `cme1->def->mandatory_only_cme = monly_cme` if `cme1` is young and `monly_cme` is young, there is no problem. Howevr, another old `cme2` can refer `def`, in this case, old `cme2` points young `monly_cme` and it violates gengc assumption. * cme can have different `defined_class` but `monly_cme` only has one `defined_class`. It does not make sense and `monly_cme` should be created for a cme (not `def`). To solve these issues, this patch allocates `monly_cme` per `cme`. `cme` does not have another room to store a pointer to the `monly_cme`, so this patch introduces `overloaded_cme_table`, which is weak key map `[cme] -> [monly_cme]`. `def::body::iseqptr::monly_cme` is deleted. The first issue is reported by Alan Wu.
* fix assertion on `gc_cc_cme()`Koichi Sasada2021-11-251-1/+3
| | | | | `cc->cme_` can be NULL when it is not initialized yet. It can be observed on `GC.stress == true` running.
* add `VM_CALLCACHE_ON_STACK`Koichi Sasada2021-11-171-1/+3
| | | | check if iseq refers to on stack CC (it shouldn't).
* assert `cc->cme_ != NULL`Koichi Sasada2021-11-171-7/+9
| | | | when `vm_cc_markable(cc)`.
* `vm_empty_cc_for_super`Koichi Sasada2021-11-171-0/+1
| | | | | | Same as `vm_empty_cc`, introduce a global variable which has `.call_ = vm_call_super_method`. Use it if the `cme == NULL` on `vm_search_super_method`.
* assert `cc->call_ != NULL`Koichi Sasada2021-11-171-0/+1
|
* `Primitive.mandatory_only?` for fast pathKoichi Sasada2021-11-151-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compare with the C methods, A built-in methods written in Ruby is slower if only mandatory parameters are given because it needs to check the argumens and fill default values for optional and keyword parameters (C methods can check the number of parameters with `argc`, so there are no overhead). Passing mandatory arguments are common (optional arguments are exceptional, in many cases) so it is important to provide the fast path for such common cases. `Primitive.mandatory_only?` is a special builtin function used with `if` expression like that: ```ruby def self.at(time, subsec = false, unit = :microsecond, in: nil) if Primitive.mandatory_only? Primitive.time_s_at1(time) else Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in)) end end ``` and it makes two ISeq, ``` def self.at(time, subsec = false, unit = :microsecond, in: nil) Primitive.time_s_at(time, subsec, unit, Primitive.arg!(:in)) end def self.at(time) Primitive.time_s_at1(time) end ``` and (2) is pointed by (1). Note that `Primitive.mandatory_only?` should be used only in a condition of an `if` statement and the `if` statement should be equal to the methdo body (you can not put any expression before and after the `if` statement). A method entry with `mandatory_only?` (`Time.at` on the above case) is marked as `iseq_overload`. When the method will be dispatch only with mandatory arguments (`Time.at(0)` for example), make another method entry with ISeq (2) as mandatory only method entry and it will be cached in an inline method cache. The idea is similar discussed in https://bugs.ruby-lang.org/issues/16254 but it only checks mandatory parameters or more, because many cases only mandatory parameters are given. If we find other cases (optional or keyword parameters are used frequently and it hurts performance), we can extend the feature.
* Partial revert of ceebc7fc98dAaron Patterson2021-10-201-1/+10
| | | | | | | I'm looking through the places where YJIT needs notifications. It looks like these changes to gc.c and vm_callinfo.h have become unnecessary since 84ab77ba592. This commit just makes the diff against upstream smaller, but otherwise shouldn't change any behavior.
* MicroJIT: generate less code for CFUNCsAlan Wu2021-10-201-10/+1
| | | | | | | | Added UJIT_CHECK_MODE. Set to 1 to double check method dispatch in generated code. It's surprising to me that we need to watch both cc and cme. There might be opportunities to simplify there.
* Remove printf family from the mjit headerNobuyoshi Nakada2021-09-111-3/+3
| | | | | Linking printf family functions makes mjit objects to link unnecessary code.
* internal/*.h: skip doxygen卜部昌平2021-09-101-1/+0
| | | | | These contents are purely implementation details, not worth appearing in CAPI documents. [ci skip]
* global call-cache cache table for rb_funcall*Koichi Sasada2021-01-291-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invokes Ruby's method with given receiver. Ruby 2.7 introduced inline method cache with static memory area. However, Ruby 3.0 reimplemented the method cache data structures and the inline cache was removed. Without inline cache, rb_funcall* searched methods everytime. Most of cases per-Class Method Cache (pCMC) will be helped but pCMC requires VM-wide locking and it hurts performance on multi-Ractor execution, especially all Ractors calls methods with rb_funcall*. This patch introduced Global Call-Cache Cache Table (gccct) for rb_funcall*. Call-Cache was introduced from Ruby 3.0 to manage method cache entry atomically and gccct enables method-caching without VM-wide locking. This table solves the performance issue on multi-ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it has inline-method-cache and the table size is limited. Basically rb_funcall* is not used frequently, so 1023 entries can be enough. We will revisit the table size if it is not enough.
* fix conditon of vm_cc_invalidated_p()Koichi Sasada2021-01-191-1/+1
| | | | | vm_cc_invalidated_p() returns false when the cme is *NOT* invalidated.
* remove invalidated ccKoichi Sasada2021-01-061-0/+11
| | | | if cc is invalidated, cc should be released from iseq.
* fix inline method cache sync bugKoichi Sasada2020-12-151-2/+1
| | | | | | | | | `cd` is passed to method call functions to method invocation functions, but `cd` can be manipulated by other ractors simultaneously so it contains thread-safety issue. To solve this issue, this patch stores `ci` and found `cc` to `calling` and stops to pass `cd`.
* mutete -> mutateAlan Wu2020-10-221-1/+1
|
* VM_CI_NEW_ID: USE_EMBED_CI could be false卜部昌平2020-06-091-6/+8
| | | | It was a wrong idea to assume CIs are always embedded.
* rb_vm_call0: on-stack call info卜部昌平2020-06-091-16/+21
| | | | | This changeset reduces the generated binary of rb_vm_call0 from 281 bytes to 211 bytes on my machine. Should reduce GC pressure as well.
* rb_equal_opt: fully static call data卜部昌平2020-06-091-4/+9
| | | | | This changeset reduces the generated binary of rb_equal_opt from 129 bytes to 17 bytes on my machine, according to nm(1).
* vm_ci_markable: added卜部昌平2020-06-091-0/+17
| | | | | | | CIs are created on-the-fly, which increases GC pressure. However they include no references to other objects, and those on-the-fly CIs tend to be short lived. Why not skip allocation of them. In doing so we need to add a flag denotes the CI object does not reside inside of objspace.
* Moved vm_empty_cc to local in vm.c [Bug #16934]Nobuyoshi Nakada2020-06-041-14/+1
| | | | Missed to commit a staged change.
* add #include guard hack卜部昌平2020-04-131-0/+13
| | | | | | | | | | | | | | | | | | | | | | According to MSVC manual (*1), cl.exe can skip including a header file when that: - contains #pragma once, or - starts with #ifndef, or - starts with #if ! defined. GCC has a similar trick (*2), but it acts more stricter (e. g. there must be _no tokens_ outside of #ifndef...#endif). Sun C lacked #pragma once for a looong time. Oracle Developer Studio 12.5 finally implemented it, but we cannot assume such recent version. This changeset modifies header files so that each of them include strictly one #ifndef...#endif. I believe this is the most portable way to trigger compiler optimizations. [Bug #16770] *1: https://docs.microsoft.com/en-us/cpp/preprocessor/once *2: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html
* Avoid UB with flexible array memberAlan Wu2020-04-121-2/+2
| | | | | | | Accessing past the end of an array is technically UB. Use C99 flexible array member instead to avoid the UB and simplify allocation size calculation. See also: DCL38-C in the SEI CERT C Coding Standard
* Merge pull request #2991 from shyouhei/ruby.h卜部昌平2020-04-081-1/+1
| | | Split ruby.h
* Reduce allocations for keyword argument hashesJeremy Evans2020-03-171-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, passing a keyword splat to a method always allocated a hash on the caller side, and accepting arbitrary keywords in a method allocated a separate hash on the callee side. Passing explicit keywords to a method that accepted a keyword splat did not allocate a hash on the caller side, but resulted in two hashes allocated on the callee side. This commit makes passing a single keyword splat to a method not allocate a hash on the caller side. Passing multiple keyword splats or a mix of explicit keywords and a keyword splat still generates a hash on the caller side. On the callee side, if arbitrary keywords are not accepted, it does not allocate a hash. If arbitrary keywords are accepted, it will allocate a hash, but this commit uses a callinfo flag to indicate whether the caller already allocated a hash, and if so, the callee can use the passed hash without duplicating it. So this commit should make it so that a maximum of a single hash is allocated during method calls. To set the callinfo flag appropriately, method call argument compilation checks if only a single keyword splat is given. If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT callinfo flag is not set, since in that case the keyword splat is passed directly and not mutable. If more than one splat is used, a new hash needs to be generated on the caller side, and in that case the callinfo flag is set, indicating the keyword splat is mutable by the callee. In compile_hash, used for both hash and keyword argument compilation, if compiling keyword arguments and only a single keyword splat is used, pass the argument directly. On the caller side, in vm_args.c, the callinfo flag needs to be recognized and handled. Because the keyword splat argument may not be a hash, it needs to be converted to a hash first if not. Then, unless the callinfo flag is set, the hash needs to be duplicated. The temporary copy of the callinfo flag, kw_flag, is updated if a hash was duplicated, to prevent the need to duplicate it again. If we are converting to a hash or duplicating a hash, we need to update the argument array, which can including duplicating the positional splat array if one was passed. CALLER_SETUP_ARG and a couple other places needs to be modified to handle similar issues for other types of calls. This includes fairly comprehensive tests for different ways keywords are handled internally, checking that you get equal results but that keyword splats on the caller side result in distinct objects for keyword rest parameters. Included are benchmarks for keyword argument calls. Brief results when compiled without optimization: def kw(a: 1) a end def kws(**kw) kw end h = {a: 1} kw(a: 1) # about same kw(**h) # 2.37x faster kws(a: 1) # 1.30x faster kws(**h) # 2.19x faster kw(a: 1, **h) # 1.03x slower kw(**h, **h) # about same kws(a: 1, **h) # 1.16x faster kws(**h, **h) # 1.14x faster
* Pin and inline cme in JIT-ed method callsTakashi Kokubun2020-03-111-2/+3
| | | | | | | | | | | | | | | | | | | | | | ``` $ benchmark-driver benchmark.yml -v --rbenv 'before --jit;after --jit' --repeat-count=12 --output=all before --jit: ruby 2.8.0dev (2020-03-11T07:43:12Z master e89ebdcb87) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-03-11T07:54:18Z master 143776a0da) +JIT [x86_64-linux] Calculating ------------------------------------- before --jit after --jit Optcarrot Lan_Master.nes 73.86976729561439 77.20184819316513 fps 74.46997176460742 78.43493030231805 77.59686308754307 78.55714131655935 78.53693921126656 79.08984255596820 80.10158944910573 79.17751731838183 80.12254974411167 79.60853122429181 80.28678655204945 79.74674066871896 80.38690681095379 79.90624544440300 80.79223498756919 80.57881084206193 80.82857188422419 80.70677614429169 81.06447745878245 81.03868541295149 81.21620802278490 82.16354660940607 ```
* vm_cc_fill() need to clear aux.Koichi Sasada2020-03-021-0/+2
| | | | | vm_cc_fill() fills CC information into stack allocated memory so it is not cleared. So we need to clear CC->aux.
* * remove trailing spaces. [ci skip]git2020-02-221-1/+1
|