path: root/vm_insnhelper.c
* more on NULL versus functions (卜部昌平, 2020-05-11, 1 file, -4/+4)
  Function pointers are not void*. See also
  115fec062ccf7c6d72c8d5f64b7a5d84c9fb2dd8,
  ce4ea956d24eab5089a143bba38126f2b11b55b6, and
  8427fca49bd85205f5a8766292dd893f003c0e48.
* sed -i 's|ruby/impl|ruby/internal|' (卜部昌平, 2020-05-11, 1 file, -1/+1)
  To fix build failures.
* sed -i s|ruby/3|ruby/impl|g (卜部昌平, 2020-05-11, 1 file, -1/+1)
  This shall fix compile errors.
* Disable -Wswitch warning when VM_CHECK_MODE (Nobuyoshi Nakada, 2020-05-03, 1 file, -0/+2)
* Invalidate fastpath when calling attr_reader by super (Takashi Kokubun, 2020-04-14, 1 file, -2/+2)
  The same bug as 8355a99883 existed in attr_reader too.
* Invalidate fastpath when calling attr_writer by super (Takashi Kokubun, 2020-04-14, 1 file, -3/+12)
  We started to use fastpath on invokesuper when a method is not a refinement
  since 5c27681813, but we shouldn't have used fastpath for attr_writer either.
  `cc->aux_.attr_index` is for the actual receiver class, while we store its
  superclass in `cc->klass`, and therefore there's no way to properly invalidate
  attr_writer's inline cache when it's called by super. [Bug #16785]

  I suspect the same bug also exists in attr_reader. I'll address that in
  another commit.
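  A minimal Ruby sketch of the call pattern this fix targets (class and
  attribute names are illustrative, not taken from the commit):

  ```ruby
  class Base
    attr_writer :value          # C-level attr_writer, cached via attr_index
  end

  class Child < Base
    def value=(v)
      super                     # invokesuper into Base's attr_writer; this is
    end                         # the inline cache that needed invalidation
  end

  Child.new.value = 1
  ```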
* Make vm_call_cfunc_with_frame a fastpath (#3027) (Takashi Kokubun, 2020-04-13, 1 file, -0/+9)
  when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT
  (i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat).

  Micro benchmark:
  ```
  $ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4
  before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
  after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
  Calculating -------------------------------------
                     before       after
  vm_send_cfunc     69.585M     88.724M i/s - 100.000M times in 1.437097s 1.127096s

  Comparison:
        vm_send_cfunc
     after:  88723605.2 i/s
    before:  69584737.1 i/s - 1.28x  slower
  ```

  Optcarrot:
  ```
  $ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all
  before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux]
  after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux]
  Calculating -------------------------------------
                                       before              after
  Optcarrot Lan_Master.nes  50.76119601545175  42.73858236484051 fps
                            50.76388649761503  51.04211379912850
                            50.80930672252514  51.39455790755538
                            50.90236000778749  51.75656936556145
                            51.01744746340430  51.86875277356489
                            51.06495279015112  51.88692482485558
                            51.07785337168974  51.93429603190578
                            51.20163525187862  51.95768145071314
                            51.34671771913112  52.45577266040274
                            51.35918340835583  52.53163888762858
                            51.46641337418146  52.62172484121034
                            51.50835463462257  52.85064021113239
  ```
* Enable fastpath on invokesuper (#3021) (Takashi Kokubun, 2020-04-11, 1 file, -2/+3)
  Fastpath has not been used for invokesuper since it has set vm_call_super_method
  on every invocation. Because it seems to be blocked only by refinements, try
  enabling fastpath on invokesuper when cme is not for refinements.

  While this patch itself should be helpful for VM performance, a part of this
  patch's motivation is to unblock inlining invokesuper on JIT.

  $ benchmark-driver -v --rbenv 'before;after' benchmark/vm2_super.yml --repeat-count=4
  before: ruby 2.8.0dev (2020-04-11T15:19:58Z master a01bda5949) [x86_64-linux]
  after: ruby 2.8.0dev (2020-04-12T02:00:08Z invokesuper-fastpath c171984ee3) [x86_64-linux]
  Calculating -------------------------------------
                 before       after
  vm2_super     20.031M     32.860M i/s - 6.000M times in 0.299534s 0.182593s

  Comparison:
        vm2_super
     after:  32859885.2 i/s
    before:  20031097.3 i/s - 1.64x  slower
* Turn class variable warnings into exceptions (Jeremy Evans, 2020-04-10, 1 file, -4/+4)
  This changes the following warnings:

    * warning: class variable access from toplevel
    * warning: class variable @foo of D is overtaken by C

  into RuntimeErrors. Handle defined?(@@foo) at toplevel by returning nil instead
  of raising an exception (the previous behavior warned before returning nil when
  defined? was used).

  Refactor the specs to avoid the warnings even in older versions. The specs were
  checking for the warnings, but the purpose of the related specs, as evidenced
  from their description, is to test for behavior, not for warnings.

  Fixes [Bug #14541]
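  A rough illustration of the new behavior described above (assuming a top-level
  script; each statement is shown independently, and names are illustrative):

  ```ruby
  defined?(@@state)   # at toplevel: returns nil (previously warned first)
  @@state = 1         # at toplevel: now raises RuntimeError (previously a warning)

  class C; end
  class D < C
    @@mode = :fast    # first defined in the subclass D
  end
  class C
    @@mode = :slow    # later defined in the superclass C
  end
  class D
    @@mode            # now RuntimeError: class variable @@mode of D is overtaken by C
  end
  ```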
* Merge pull request #2991 from shyouhei/ruby.h (卜部昌平, 2020-04-08, 1 file, -3/+2)
  Split ruby.h
* Reduce allocations for keyword argument hashes (Jeremy Evans, 2020-03-17, 1 file, -6/+21)
  Previously, passing a keyword splat to a method always allocated a hash on the
  caller side, and accepting arbitrary keywords in a method allocated a separate
  hash on the callee side. Passing explicit keywords to a method that accepted a
  keyword splat did not allocate a hash on the caller side, but resulted in two
  hashes allocated on the callee side.

  This commit makes passing a single keyword splat to a method not allocate a hash
  on the caller side. Passing multiple keyword splats or a mix of explicit keywords
  and a keyword splat still generates a hash on the caller side. On the callee
  side, if arbitrary keywords are not accepted, it does not allocate a hash. If
  arbitrary keywords are accepted, it will allocate a hash, but this commit uses a
  callinfo flag to indicate whether the caller already allocated a hash, and if so,
  the callee can use the passed hash without duplicating it. So this commit should
  make it so that a maximum of a single hash is allocated during method calls.

  To set the callinfo flag appropriately, method call argument compilation checks
  whether only a single keyword splat is given. If only one keyword splat is given,
  the VM_CALL_KW_SPLAT_MUT callinfo flag is not set, since in that case the keyword
  splat is passed directly and is not mutable. If more than one splat is used, a
  new hash needs to be generated on the caller side, and in that case the callinfo
  flag is set, indicating the keyword splat is mutable by the callee.

  In compile_hash, used for both hash and keyword argument compilation, if
  compiling keyword arguments and only a single keyword splat is used, pass the
  argument directly.

  On the caller side, in vm_args.c, the callinfo flag needs to be recognized and
  handled. Because the keyword splat argument may not be a hash, it needs to be
  converted to a hash first if it is not. Then, unless the callinfo flag is set,
  the hash needs to be duplicated. The temporary copy of the callinfo flag,
  kw_flag, is updated if a hash was duplicated, to prevent the need to duplicate
  it again. If we are converting to a hash or duplicating a hash, we need to
  update the argument array, which can include duplicating the positional splat
  array if one was passed. CALLER_SETUP_ARG and a couple other places need to be
  modified to handle similar issues for other types of calls.

  This includes fairly comprehensive tests for different ways keywords are handled
  internally, checking that you get equal results but that keyword splats on the
  caller side result in distinct objects for keyword rest parameters.

  Included are benchmarks for keyword argument calls. Brief results when compiled
  without optimization:

    def kw(a: 1) a end
    def kws(**kw) kw end
    h = {a: 1}

    kw(a: 1)       # about same
    kw(**h)        # 2.37x faster
    kws(a: 1)      # 1.30x faster
    kws(**h)       # 2.19x faster
    kw(a: 1, **h)  # 1.03x slower
    kw(**h, **h)   # about same
    kws(a: 1, **h) # 1.16x faster
    kws(**h, **h)  # 1.14x faster
* %p is for void * (卜部昌平, 2020-03-04, 1 file, -1/+1)
  See also
  35eb12c06397e770392a41343cbffc4b204e15c9,
  6f5eb285077d9abf8f97056531996c58674b570c,
  687308cf0dab0af675e40da2b6ab8ccd5f77c072, and
  b6a2d63eb3dbc31e110e8cb95e054dd71d49a611.
* method_missing_reason should be set. (Koichi Sasada, 2020-03-03, 1 file, -0/+1)
  send() has a special method launcher in the VM, which has its own special
  method_missing caller. This path doesn't set ec->method_missing_reason, which is
  used at exception creation, so set up this information. Without it, a
  NoMethodError exception becomes a NameError.

  This patch will fix: http://ci.rvm.jp/results/trunk-random1@phosphorus-docker/2761643
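  A small sketch of the symptom being fixed (hedged; the exact failing case lives
  in the linked CI log):

  ```ruby
  # With ec->method_missing_reason left unset on the send() path, the exception
  # built for a missing method could surface as NameError rather than the
  # expected NoMethodError.
  begin
    Object.new.send(:no_such_method)
  rescue NoMethodError => e
    p e.class   # expected: NoMethodError
  end
  ```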
* check imemo_type (Koichi Sasada, 2020-02-27, 1 file, -3/+10)
  Check imemo_type to debug
  http://ci.rvm.jp/results/trunk-vm-asserts@silicon-docker/2744755
* Introduce disposable call-cache. (Koichi Sasada, 2020-02-22, 1 file, -390/+424)
  This patch contains several ideas:

  (1) Disposable inline method cache (IMC) for race-free inline method caching
      * Make the call-cache (CC) an RVALUE (GC target object) and allocate a new
        CC on cache miss.
      * This technique allows race-free access from parallel processing elements,
        like RCU.

  (2) Introduce a per-class method cache (pCMC)
      * Instead of the fixed-size global method cache (GMC), pCMC allows a
        flexible cache size.
      * Caching CCs reduces CC allocation and allows sharing a CC's fast-path
        between call-sites with the same call-info (CI).

  (3) Invalidate an inline method cache by invalidating the corresponding method
      entries (MEs)
      * Instead of using class serials, we set an "invalidated" flag on the
        method entry itself to represent cache invalidation.
      * Compared with using class serials, the impact of method modification
        (add/overwrite/delete) is small.
      * Updating class serials invalidates all method caches of the class and its
        subclasses, whereas the proposed approach invalidates only the caches of
        the single affected ME.

  See [Feature #16614] for more details.
* VALUE size packed callinfo (ci). (Koichi Sasada, 2020-02-22, 1 file, -114/+110)
  Now, rb_call_info contains how to call the method as a tuple of (mid, orig_argc,
  flags, kwarg). In most cases kwarg == NULL and mid+argc+flags only requires 64
  bits, so this patch packs rb_call_info into a VALUE (1 word) in such cases. If
  it cannot be represented in a VALUE, then imemo_callinfo is used, which contains
  a conventional callinfo (rb_callinfo, renamed from rb_call_info).

  iseq->body->ci_kw_size is removed because every callinfo is now VALUE size (a
  packed ci or a pointer to an imemo_callinfo).

  To access ci information, we need to use these functions: vm_ci_mid(ci),
  _flag(ci), _argc(ci), _kwarg(ci).

  struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg.

  rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc() are temporarily
  removed because cd->ci should be marked.
* Adjusted indent [ci skip] (Nobuyoshi Nakada, 2020-02-22, 1 file, -1/+1)
* should be compared with called_id (Koichi Sasada, 2020-02-13, 1 file, -1/+1)
  me->called_id and me->def->original_id can sometimes differ, so we should
  compare with called_id, which is mtbl's key. (fix GH-PR #2869)
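  A hedged illustration of how the two ids can diverge (names are illustrative):

  ```ruby
  class C
    def foo; end
    alias bar foo   # the method entry reached via :bar has called_id :bar,
  end               # while its definition's original_id is still :foo
  ```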
* Use inline cache for super calls (John Hawthorn, 2020-02-13, 1 file, -1/+17)
* support MJIT with debug option. (Koichi Sasada, 2020-02-03, 1 file, -2/+7)
  VM_CHECK_MODE > 0 with optflags=-O0 cannot run JIT tests because of link
  problems. This patch fixes them.
* Fully separate positional arguments and keyword arguments (Jeremy Evans, 2020-01-02, 1 file, -38/+7)
  This removes the warnings added in 2.7, and changes the behavior so that a final
  positional hash is not treated as keywords or vice-versa.

  To handle the arg_setup_block splat case correctly with keyword arguments, we
  need to check if we are taking a keyword hash. That case didn't have a test, but
  it affects real-world code, so add a test for it.

  This removes rb_empty_keyword_given_p() and related code, as that is not needed
  in Ruby 3. The empty keyword case is the same as the no keyword case in Ruby 3.

  This changes rb_scan_args to implement keyword argument separation for C
  functions when the : character is used. For backwards compatibility, it returns
  a duped hash. This is a bad idea for performance, but not duping the hash breaks
  at least Enumerator::ArithmeticSequence#inspect.

  Instead of having RB_PASS_CALLED_KEYWORDS be a number, simplify the code by just
  making it be rb_keyword_given_p().
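  A short illustration of the separated Ruby 3 behavior described above (a sketch,
  not taken from the commit's tests):

  ```ruby
  def m(*args, **kw)
    [args, kw]
  end

  m({a: 1})   # => [[{:a=>1}], {}]   a final positional hash stays positional
  m(a: 1)     # => [[], {:a=>1}]     explicit keywords stay keywords
  ```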
* decouple internal.h headers (卜部昌平, 2019-12-26, 1 file, -6/+17)
  Saves committers' daily life by avoiding #include-ing everything from
  internal.h; each file now includes what it needs instead. This would
  significantly speed up incremental builds.

  We take the following inclusion order in this changeset:

  1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very first thing
     among everything).
  2. RUBY_EXTCONF_H if any.
  3. Standard C headers, sorted alphabetically.
  4. Other system headers, maybe guarded by #ifdef.
  5. Everything else, sorted alphabetically.

  Exceptions are the win32-related headers, which tend not to be self-contained
  (those headers have inclusion order dependencies).
* add several __has_something macro (卜部昌平, 2019-12-26, 1 file, -3/+1)
  With these macros implemented we can write code just as if we could assume the
  compiler is clang. MSC_VERSION_SINCE is defined to implement those macros, but
  turned out to be handy in other places too. The -fdeclspec compiler flag is
  necessary for clang to properly handle __has_declspec().
* Fixed misspellings (Nobuyoshi Nakada, 2019-12-20, 1 file, -4/+4)
  Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.
* per-method serial number (卜部昌平, 2019-12-18, 1 file, -2/+2)
  Methods and their definitions can be allocated/deallocated on the fly. One
  pathological situation is when a method is deallocated and then another one is
  allocated immediately after that. The addresses of those old/new method
  entries/definitions can then be the same, depending on the underlying
  malloc/free implementation. So pointer comparison is insufficient; we have to
  check the contents. To do so we introduce def->method_serial, an integer unique
  to that specific method definition.

  PS: Note that method_serial being uintptr_t rather than rb_serial_t is
  intentional. This is because rb_serial_t can be bigger than a pointer on a
  32-bit system (rb_serial_t is at least 64 bits). In order to preserve the old
  packing of struct rb_call_cache, rb_serial_t is inappropriate.
* add debug counter to count `call` reusing cases. (Koichi Sasada, 2019-12-17, 1 file, -0/+1)
* ensure cc->def == cc->me->def (卜部昌平, 2019-12-16, 1 file, -9/+9)
  The equation shall hold for every call cache. However, prior to this changeset
  cc->me could be updated without also updating cc->def. Make sure of it by
  introducing a new macro named CC_SET_ME, which sets cc->me and cc->def at once.
* Make super in instance_eval in method in module raise TypeError (Jeremy Evans, 2019-12-12, 1 file, -6/+7)
  This makes the behavior the same as super in instance_eval in a method in a
  class. The reason this wasn't implemented before is that there is a check to
  determine if the self in the current context is of the expected class, and a
  module itself can be included in multiple classes, so it doesn't have an
  expected class.

  Implementing this requires giving iclasses knowledge of which class created
  them, so that a super call in the module method knows the expected class for
  super calls. This reference is called includer, and should only be set for
  iclasses.

  Note that the approach Ruby uses in this check is not robust. If you
  instance_eval another object of the same class and call super, instead of a
  TypeError, you get super called with the instance_eval receiver instead of the
  method receiver. Truly fixing super would require keeping a reference to the
  super object (method receiver) in each frame where scope has changed, and using
  that instead of the current self when calling super.

  Fixes [Bug #11636]
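  A hedged sketch of the case this change covers (module and class names are
  illustrative):

  ```ruby
  module Helper
    def run
      # self inside the block is an unrelated Object instance, so super now
      # raises TypeError, matching the existing check for methods defined
      # directly in a class.
      Object.new.instance_eval { super() }
    end
  end

  class Parent
    def run; :parent; end
  end

  class Child < Parent
    include Helper
  end

  Child.new.run   # TypeError after this change
  ```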
* Introduce an "Inline IVAR cache" structAaron Patterson2019-12-051-11/+11
| | | | | | | | | This commit introduces an "inline ivar cache" struct. The reason we need this is so compaction can differentiate from an ivar cache and a regular inline cache. Regular inline caches contain references to `VALUE` and ivar caches just contain references to the ivar index. With this new struct we can easily update references for inline caches (but not inline var caches as they just contain an int)
* check interrupts at each frame pop timing. (Koichi Sasada, 2019-11-29, 1 file, -1/+1)
  Asynchronous events such as signal traps, finalization timing, thread switching
  and so on are managed by "interrupt_flag". Ruby's threads check this flag
  periodically, and if a thread never checks it, the above events don't happen.
  This check is the CHECK_INTS() macro (and relatives), and it is placed at
  certain points (the leave instruction and so on). However, at the end of C
  methods, C blocks (IMEMO_IFUNC), etc. there is no check, which can introduce an
  uninterruptible thread.

  To address this situation, we decided to place CHECK_INTS() at vm_pop_frame().
  It increases the number of interrupt checking points. [Bug #16366]

  This patch can introduce unexpected events...
* Reduce duplicated warnings for the change of Ruby 3 keyword arguments (Yusuke Endoh, 2019-11-29, 1 file, -2/+2)
  By this change, the following code prints only one warning.

  ```
  def foo(**opt); end
  100.times { foo({kw:1}) }
  ```

  A global variable `st_table *caller_to_callees` is a map from a caller to a set
  of callee methods. It remembers that a warning has already been printed for each
  pair of caller and callee.

  [Feature #16289]
* Revert "export for MJIT"Koichi Sasada2019-11-291-1/+2
| | | | This reverts commit 2e6f1cf8b264f4c8499c4e5f18bf662fdade04ff.
* Revert "* remove trailing spaces. [ci skip]"Koichi Sasada2019-11-291-1/+1
| | | | This reverts commit 27d0d7c0d39076d4bbacd3c3f3864322699db7b4.
* * remove trailing spaces. [ci skip] (git, 2019-11-29, 1 file, -1/+1)
* export for MJIT (Koichi Sasada, 2019-11-29, 1 file, -2/+1)
* fastpath for ivar read of FL_EXIVAR objects. (Koichi Sasada, 2019-11-29, 1 file, -30/+76)
  vm_getivar() provides a fastpath for T_OBJECT by caching the index of an ivar.
  This patch also provides a fastpath for FL_EXIVAR objects. FL_EXIVAR objects
  each have their own ivar array, and an index into it can be cached just as for
  T_OBJECT. To access this ivar array, generic_iv_tbl is exposed by
  rb_ivar_generic_ivtbl() (declared in variable.h, which is newly introduced).

  Benchmark script:

    Benchmark.driver(repeat_count: 3){|x|
      x.executable name: 'clean', command: %w'../clean/miniruby'
      x.executable name: 'trunk', command: %w'./miniruby'
      objs = [Object.new, 'str', {a: 1, b: 2}, [1, 2]]
      objs.each.with_index{|obj, i|
        rep = obj.inspect
        rep = 'Object.new' if /\#/ =~ rep
        x.prelude str = %Q{
          v#{i} = #{rep}
          def v#{i}.foo
            @iv # ivar access method (attr_reader)
          end
          v#{i}.instance_variable_set(:@iv, :iv)
        }
        puts str
        x.report %Q{
          v#{i}.foo
        }
      }
    }

  Result:

    v0.foo # T_OBJECT
      clean: 85387141.8 i/s
      trunk: 85249373.6 i/s - 1.00x slower

    v1.foo # T_STRING
      trunk: 57894407.5 i/s
      clean: 39957178.6 i/s - 1.45x slower

    v2.foo # T_HASH
      trunk: 56629413.2 i/s
      clean: 39227088.9 i/s - 1.44x slower

    v3.foo # T_ARRAY
      trunk: 55797530.2 i/s
      clean: 38263572.9 i/s - 1.46x slower
* Improve consistency of bool/true/false (Kazuhiro NISHIYAMA, 2019-11-25, 1 file, -1/+1)
* add fast path for argc==0. (Koichi Sasada, 2019-11-25, 1 file, -2/+7)
  When calling builtin functions with no arguments, we don't need to calculate the
  argv location.
* peep-hole optimize VM instructions (卜部昌平, 2019-11-19, 1 file, -18/+15)
  Some minor optimizations.

  Calculating -------------------------------------
                            ours       trunk
          vm2_regexp      8.479M      8.346M i/s - 6.000M times in 0.707612s 0.718916s
   vm2_regexp_invert      8.605M      8.350M i/s - 6.000M times in 0.697298s 0.718576s

  Comparison:
               vm2_regexp
          ours:  8479223.3 i/s
         trunk:  8345893.8 i/s - 1.02x  slower

        vm2_regexp_invert
          ours:  8604647.4 i/s
         trunk:  8349852.8 i/s - 1.03x  slower

  Calculating -------------------------------------
                              ours+jit   trunk+jit
  Optcarrot Lan_Master.nes      68.603      64.167 fps

  Comparison:
    Optcarrot Lan_Master.nes
      ours+jit:  68.6 fps
     trunk+jit:  64.2 fps - 1.07x  slower
* should not use __func__ (Koichi Sasada, 2019-11-18, 1 file, -1/+1)
* add casts. (Koichi Sasada, 2019-11-18, 1 file, -1/+1)
  Add casts to avoid a compile error.
  http://ci.rvm.jp/results/trunk_clang_39@silicon-docker/2402215
* vm_invoke_builtin_delegate with start index. (Koichi Sasada, 2019-11-18, 1 file, -3/+11)
  opt_invokebuiltin_delegate and opt_invokebuiltin_delegate_leave invoke builtin
  functions with the same parameters as the method. This technique eliminates
  stack push operations. However, the delegated parameters had to be exactly the
  same as the given parameters (e.g. `def foo(a, b, c) __builtin_foo(a, b, c)` is
  okay, but __builtin_foo(b, c) is not allowed).

  This patch relaxes this restriction. An ISeq has a local variable table which
  includes the parameters. For example, for a method defined as
  `def foo(a, b, c) x=y=nil`, the local variable table contains [a, b, c, x, y].
  If a builtin function is called with arguments which are a sub-array of the lvar
  table, the opt_invokebuiltin_delegate instruction is used with a start index.
  For example, `__builtin_foo(b, c)` and `__builtin_bar(c, x, y)` are okay, and
  so on.
* move rb_vm_lvar_exposed() correctly. (Koichi Sasada, 2019-11-14, 1 file, -0/+9)
  rb_vm_lvar_exposed() is prepared for __builtin_inline!() and is needed by both
  mini_builtin.c and builtin.c. However, it was only in builtin.c, so move it to
  make it a part of the VM.
* Avoid top-level search for nested constant reference from nil in defined? (Dylan Thacker-Smith, 2019-11-13, 1 file, -1/+4)
  Fixes [Bug #16332]

  Constant access was changed to no longer allow top-level constant access through
  `nil`, but `defined?` wasn't changed at the same time to stay consistent.

  Use a separate defined type to distinguish between a constant referenced from
  the current lexical scope and one referenced from another namespace.
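  A hedged illustration of the inconsistency being addressed:

  ```ruby
  expr = nil
  # Constant access through nil raises TypeError, yet defined? used to fall back
  # to a top-level lookup here and could report "constant"; with this change it
  # returns nil instead.
  defined?(expr::File)
  ```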
* rewrite comment. (Koichi Sasada, 2019-11-11, 1 file, -4/+3)
  Pointed out by nagachika-san.
  https://ruby-trunk-changes.hatenablog.com/entry/ruby_trunk_changes_20191109
* use STACK_ADDR_FROM_TOP() (Koichi Sasada, 2019-11-09, 1 file, -2/+1)
  vm_invoke_builtin() accesses the VM stack via cfp->sp. However, MJIT can use its
  own stack. To access it appropriately, we need to use STACK_ADDR_FROM_TOP().
* initialize kw special local var. (Koichi Sasada, 2019-11-09, 1 file, -1/+4)
  A method which has keyword parameters has an implicit local variable that
  records which keywords are (un)specified. vm_call_iseq_setup_kwparm_nokwarg() is
  a special function to invoke an ISeq method without any keyword arguments;
  however, it should also initialize that special local variable. Without this
  initialization, the implicit lvar can point to a freed (T_NONE) object.
* name the result of calccall (卜部昌平, 2019-11-08, 1 file, -2/+3)
  This is a pure refactoring for better understanding of what is happening here.
  Should change nothing but readability.
* describe vm_cache_check_for_class_serial [ci skip] (卜部昌平, 2019-11-08, 1 file, -0/+79)
  Added comments describing what it is. Requested by ko1.
* support builtin features with Ruby and C. (Koichi Sasada, 2019-11-08, 1 file, -0/+186)
  Support loading builtin features written in Ruby, which are implemented with C
  builtin functions. [Feature #16254]

  Several features:

  (1) Load .rb files at boot time from the native binary.
      Now, prelude.rb is loaded at boot time. However, this file is embedded in
      the interpreter as text and we need to compile it. This patch contains a
      feature to load it from a binary format.

  (2) __builtin_func() in Ruby calls func() written in C.
      In a Ruby file, we can write `__builtin_func()` like a method call. However,
      this is not a method call but special syntax to call a function `func()`
      written in C. The C functions should be defined in a file (same compile
      unit) which loads this .rb file. Functions (`func` in the above example)
      should be defined with:
        (a) 1st parameter: rb_execution_context_t *ec
        (b) rest parameters (0 to 15)
        (c) VALUE return type
      These are very similar to the requirements for functions used by
      rb_define_method(); however, `rb_execution_context_t *ec` is a new
      requirement.

  (3) Automatic C code generation from .rb files.
      tool/mk_builtin_loader.rb creates C code to load the .rb files needed by the
      miniruby and ruby commands. This script is run by BASERUBY, so *.rb should
      be written in BASERUBY-compatible syntax. The script loads a .rb file, finds
      all method calls with the __builtin_ prefix, and generates a part of the C
      code to export those functions.
      tool/mk_builtin_binary.rb creates C code which contains the binary-compiled
      Ruby files needed by the ruby command.
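  A hedged sketch of the shape of such a builtin .rb file; the file and function
  names are hypothetical, not taken from the commit:

  ```ruby
  # Hypothetical builtin file (e.g. hello.rb) compiled into the binary.
  # The same compile unit would define, in C, a function along the lines of
  #   static VALUE hello_impl(rb_execution_context_t *ec, VALUE name)
  # i.e. ec first, then up to 15 parameters, returning VALUE.
  def hello(name)
    __builtin_hello_impl(name)   # not a method call: invokes the C function above
  end
  ```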