aboutsummaryrefslogtreecommitdiffstats
path: root/vm_insnhelper.c
Commit message (Collapse)AuthorAgeFilesLines
* Fixed misspellingsNobuyoshi Nakada2019-12-201-4/+4
| | | | Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.
* per-method serial number卜部昌平2019-12-181-2/+2
| | | | | | | | | | | | | | | | | Methods and their definitions can be allocated/deallocated on-the-fly. One pathological situation is when a method is deallocated then another one is allocated immediately after that. Address of those old/new method entries/definitions can be the same then, depending on underlying malloc/free implementation. So pointer comparison is insufficient. We have to check the contents. To do so we introduce def->method_serial, which is an integer unique to that specific method definition. PS: Note that method_serial being uintptr_t rather than rb_serial_t is intentional. This is because rb_serial_t can be bigger than a pointer on a 32bit system (rb_serial_t is at least 64bit). In order to preserve old packing of struct rb_call_cache, rb_serial_t is inappropriate.
* add debug counter to count `call` reusing cases.Koichi Sasada2019-12-171-0/+1
|
* ensure cc->def == cc->me->def卜部昌平2019-12-161-9/+9
| | | | | | | The equation shall hold for every call cache. However prior to this changeset cc->me could be updated without also updating cc->def. Let's make it sure by introducing new macro named CC_SET_ME which sets cc->me and cc->def at once.
* Make super in instance_eval in method in module raise TypeErrorJeremy Evans2019-12-121-6/+7
| | | | | | | | | | | | | | | | | | | | | | | This makes behavior the same as super in instance_eval in method in class. The reason this wasn't implemented before is that there is a check to determine if the self in the current context is of the expected class, and a module itself can be included in multiple classes, so it doesn't have an expected class. Implementing this requires giving iclasses knowledge of which class created them, so that super call in the module method knows the expected class for super calls. This reference is called includer, and should only be set for iclasses. Note that the approach Ruby uses in this check is not robust. If you instance_eval another object of the same class and call super, instead of an TypeError, you get super called with the instance_eval receiver instead of the method receiver. Truly fixing super would require keeping a reference to the super object (method receiver) in each frame where scope has changed, and using that instead of current self when calling super. Fixes [Bug #11636]
* Introduce an "Inline IVAR cache" structAaron Patterson2019-12-051-11/+11
| | | | | | | | | This commit introduces an "inline ivar cache" struct. The reason we need this is so compaction can differentiate from an ivar cache and a regular inline cache. Regular inline caches contain references to `VALUE` and ivar caches just contain references to the ivar index. With this new struct we can easily update references for inline caches (but not inline var caches as they just contain an int)
* check interrupts at each frame pop timing.Koichi Sasada2019-11-291-1/+1
| | | | | | | | | | | | | | | | | | Asynchronous events such as signal trap, finalization timing, thread switching and so on are managed by "interrupt_flag". Ruby's threads check this flag periodically and if a thread does not check this flag, above events doesn't happen. This checking is CHECK_INTS() (related) macro and it is placed at some places (laeve instruction and so on). However, at the end of C methods, C blocks (IMEMO_IFUNC) etc there are no checking and it can introduce uninterruptible thread. To modify this situation, we decide to place CHECK_INTS() at vm_pop_frame(). It increases interrupt checking points. [Bug #16366] This patch can introduce unexpected events...
* Reduce duplicated warnings for the change of Ruby 3 keyword argumentsYusuke Endoh2019-11-291-2/+2
| | | | | | | | | | | | | | | By this change, the following code prints only one warning. ``` def foo(**opt); end 100.times { foo({kw:1}) } ``` A global variable `st_table *caller_to_callees` is a map from caller to a set of callee methods. It remembers that a warning is already printed for each pair of caller and callee. [Feature #16289]
* Revert "export for MJIT"Koichi Sasada2019-11-291-1/+2
| | | | This reverts commit 2e6f1cf8b264f4c8499c4e5f18bf662fdade04ff.
* Revert "* remove trailing spaces. [ci skip]"Koichi Sasada2019-11-291-1/+1
| | | | This reverts commit 27d0d7c0d39076d4bbacd3c3f3864322699db7b4.
* * remove trailing spaces. [ci skip]git2019-11-291-1/+1
|
* export for MJITKoichi Sasada2019-11-291-2/+1
|
* fastpath for ivar read of FL_EXIVAR objects.Koichi Sasada2019-11-291-30/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vm_getivar() provides fastpath for T_OBJECT by caching an index of ivar. This patch also provides fastpath for FL_EXIVAR objects. FL_EXIVAR objects have an each ivar array and index can be cached as T_OBJECT. To access this ivar array, generic_iv_tbl is exposed by rb_ivar_generic_ivtbl() (declared in variable.h which is newly introduced). Benchmark script: Benchmark.driver(repeat_count: 3){|x| x.executable name: 'clean', command: %w'../clean/miniruby' x.executable name: 'trunk', command: %w'./miniruby' objs = [Object.new, 'str', {a: 1, b: 2}, [1, 2]] objs.each.with_index{|obj, i| rep = obj.inspect rep = 'Object.new' if /\#/ =~ rep x.prelude str = %Q{ v#{i} = #{rep} def v#{i}.foo @iv # ivar access method (attr_reader) end v#{i}.instance_variable_set(:@iv, :iv) } puts str x.report %Q{ v#{i}.foo } } } Result: v0.foo # T_OBJECT clean: 85387141.8 i/s trunk: 85249373.6 i/s - 1.00x slower v1.foo # T_STRING trunk: 57894407.5 i/s clean: 39957178.6 i/s - 1.45x slower v2.foo # T_HASH trunk: 56629413.2 i/s clean: 39227088.9 i/s - 1.44x slower v3.foo # T_ARRAY trunk: 55797530.2 i/s clean: 38263572.9 i/s - 1.46x slower
* Improve consistency of bool/true/falseKazuhiro NISHIYAMA2019-11-251-1/+1
|
* add fast path for argc==0.Koichi Sasada2019-11-251-2/+7
| | | | | If calling builtin functions with no arguments, we don't need to calculate argv location.
* peep-hole optimize VM instructions卜部昌平2019-11-191-18/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | Some minor optimizations. Calculating ------------------------------------- ours trunk vm2_regexp 8.479M 8.346M i/s - 6.000M times in 0.707612s 0.718916s vm2_regexp_invert 8.605M 8.350M i/s - 6.000M times in 0.697298s 0.718576s Comparison: vm2_regexp ours: 8479223.3 i/s trunk: 8345893.8 i/s - 1.02x slower vm2_regexp_invert ours: 8604647.4 i/s trunk: 8349852.8 i/s - 1.03x slower Calculating ------------------------------------- ours+jit trunk+jit Optcarrot Lan_Master.nes 68.603 64.167 fps Comparison: Optcarrot Lan_Master.nes ours+jit: 68.6 fps trunk+jit: 64.2 fps - 1.07x slower
* should not use __func__Koichi Sasada2019-11-181-1/+1
|
* add casts.Koichi Sasada2019-11-181-1/+1
| | | | | add casts to avoid compile error. http://ci.rvm.jp/results/trunk_clang_39@silicon-docker/2402215
* vm_invoke_builtin_delegate with start index.Koichi Sasada2019-11-181-3/+11
| | | | | | | | | | | | | | | | | opt_invokebuiltin_delegate and opt_invokebuiltin_delegate_leave invokes builtin functions with same parameters of the method. This technique eliminate stack push operations. However, delegation parameters should be completely same as given parameters. (e.g. `def foo(a, b, c) __builtin_foo(a, b, c)` is okay, but __builtin_foo(b, c) is not allowed) This patch relaxes this restriction. ISeq has a local variables table which includes parameters. For example, the method defined as `def foo(a, b, c) x=y=nil`, then local variables table contains [a, b, c, x, y]. If calling builtin-function with arguments which are sub-array of the lvar table, use opt_invokebuiltin_delegate instruction with start index. For example, `__builtin_foo(b, c)`, `__builtin_bar(c, x, y)` is okay, and so on.
* move rb_vm_lvar_exposed() correctly.Koichi Sasada2019-11-141-0/+9
| | | | | | rb_vm_lvar_exposed() is prepared for __builtin_inline!(), needed for mini_builtin.c and builtin.c. However, it's only on builtin.c. So move it to make it as a part of VM.
* Avoid top-level search for nested constant reference from nil in defined?Dylan Thacker-Smith2019-11-131-1/+4
| | | | | | | | | | | | Fixes [Bug #16332] Constant access was changed to no longer allow top-level constant access through `nil`, but `defined?` wasn't changed at the same time to stay consistent. Use a separate defined type to distinguish between a constant referenced from the current lexical scope and one referenced from another namespace.
* rewrite comment.Koichi Sasada2019-11-111-4/+3
| | | | | Pointed by nagachika-san. https://ruby-trunk-changes.hatenablog.com/entry/ruby_trunk_changes_20191109
* use STACK_ADDR_FROM_TOP()Koichi Sasada2019-11-091-2/+1
| | | | | | vm_invoke_builtin() accesses VM stack via cfp->sp. However, MJIT can use their own stack. To access them appropriately, we need to use STACK_ADDR_FROM_TOP().
* initialize kw special local var.Koichi Sasada2019-11-091-1/+4
| | | | | | | | | | A method which has keyword parameters has an implicit local variable to specify which keywords are (un)specified. vm_call_iseq_setup_kwparm_nokwarg() is special function to invoke a ISeq method without any keyword arguments. However, it should also initialize the special local var. Without this initialization, the implicit lvar can points a freed (T_NONE) object.
* name the result of calccall卜部昌平2019-11-081-2/+3
| | | | | This is a pure refactoring for better understanding of what is happening here. Should change nothing but readability.
* describe vm_cache_check_for_class_serial [ci skip]卜部昌平2019-11-081-0/+79
| | | | Added comments describing what it is. Requested by ko1.
* support builtin features with Ruby and C.Koichi Sasada2019-11-081-0/+186
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support loading builtin features written in Ruby, which implement with C builtin functions. [Feature #16254] Several features: (1) Load .rb file at boottime with native binary. Now, prelude.rb is loaded at boottime. However, this file is contained into the interpreter as a text format and we need to compile it. This patch contains a feature to load from binary format. (2) __builtin_func() in Ruby call func() written in C. In Ruby file, we can write `__builtin_func()` like method call. However this is not a method call, but special syntax to call a function `func()` written in C. C functions should be defined in a file (same compile unit) which load this .rb file. Functions (`func` in above example) should be defined with (a) 1st parameter: rb_execution_context_t *ec (b) rest parameters (0 to 15). (c) VALUE return type. This is very similar requirements for functions used by rb_define_method(), however `rb_execution_context_t *ec` is new requirement. (3) automatic C code generation from .rb files. tool/mk_builtin_loader.rb creates a C code to load .rb files needed by miniruby and ruby command. This script is run by BASERUBY, so *.rb should be written in BASERUBY compatbile syntax. This script load a .rb file and find all of __builtin_ prefix method calls, and generate a part of C code to export functions. tool/mk_builtin_binary.rb creates a C code which contains binary compiled Ruby files needed by ruby command.
* extend rb_call_cache卜部昌平2019-11-071-14/+47
| | | | | | | | | | | | | | | | | | | | | | Prior to this changeset, majority of inline cache mishits resulted into the same method entry when rb_callable_method_entry() resolves a method search. Let's not call the function at the first place on such situations. In doing so we extend the struct rb_call_cache from 44 bytes (in case of 64 bit machine) to 64 bytes, and fill the gap with secondary class serial(s). Call cache's class serials now behavies as a LRU cache. Calculating ------------------------------------- ours 2.7 2.6 vm2_poly_same_method 2.339M 1.744M 1.369M i/s - 6.000M times in 2.565086s 3.441329s 4.381386s Comparison: vm2_poly_same_method ours: 2339103.0 i/s 2.7: 1743512.3 i/s - 1.34x slower 2.6: 1369429.8 i/s - 1.71x slower
* rb_method_basic_definition_p with CC卜部昌平2019-11-051-5/+11
| | | | | | | | | | | | | | | | | | | Noticed that rb_method_basic_definition_p is frequently called. Its callers include vm_caller_setup_args_block(), rb_hash_default_value(), rb_num_neative_int_p(), and a lot more. It seems worth caching the method resolution part. Majority of rb_method_basic_definion_p() usages take fixed class and fixed method id combinations. Calculating ------------------------------------- ours trunk so_matrix 2.379 2.115 i/s - 1.000 times in 0.420409s 0.472879s Comparison: so_matrix ours: 2.4 i/s trunk: 2.1 i/s - 1.12x slower
* fix bug in keyword + protected combination卜部昌平2019-10-281-2/+9
| | | | Test included for the situation formerly was not working.
* more on struct rb_call_data卜部昌平2019-10-251-114/+151
| | | | | Replacing adjacent struct rb_call_info and struct rb_call_cache into a struct rb_call_data.
* retry tailcall optimization (#2529)wanabe2019-10-251-3/+34
| | | Sorry, f62f90367fc3bce6714e7c34cbd040e14e43fe07 is push miss.
* Duplicate hash when converting keyword hash to keywordsJeremy Evans2019-10-241-2/+4
| | | | | | | This mirrors the behavior when manually splatting a hash. This mirrors the changes made in setup_parameters_complex in 6081ddd6e6f2297862b3c7e898d28a76b8f9240b, so that splatting to a non-iseq method works the same as splatting to an iseq method.
* Combine call info and cache to speed up method invocationAlan Wu2019-10-241-35/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | To perform a regular method call, the VM needs two structs, `rb_call_info` and `rb_call_cache`. At the moment, we allocate these two structures in separate buffers. In the worst case, the CPU needs to read 4 cache lines to complete a method call. Putting the two structures together reduces the maximum number of cache line reads to 2. Combining the structures also saves 8 bytes per call site as the current layout uses separate two pointers for the call info and the call cache. This saves about 2 MiB on Discourse. This change improves the Optcarrot benchmark at least 3%. For more details, see attached bugs.ruby-lang.org ticket. Complications: - A new instruction attribute `comptime_sp_inc` is introduced to calculate SP increase at compile time without using call caches. At compile time, a `TS_CALLDATA` operand points to a call info struct, but at runtime, the same operand points to a call data struct. Instruction that explicitly define `sp_inc` also need to define `comptime_sp_inc`. - MJIT code for copying call cache becomes slightly more complicated. - This changes the bytecode format, which might break existing tools. [Misc #16258]
* extracted declare_underNobuyoshi Nakada2019-10-101-8/+10
|
* Revert "tailcall optimization again (#2528)"Koichi Sasada2019-10-061-34/+3
| | | | This reverts commit f62f90367fc3bce6714e7c34cbd040e14e43fe07.
* tailcall optimization again (#2528)wanabe2019-10-061-3/+34
| | | This is follow up of r67315.
* add debug counters for vm_search_method_slowpath()卜部昌平2019-10-031-0/+5
| | | | | Implemented fine-grained inspection of cache misshits. Handy for counting the reasons why an inline method cache was evicted.
* Revert https://github.com/ruby/ruby/pull/2486卜部昌平2019-10-031-23/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commits: 10d6a3aca7 8ba48c1b85 fba8627dc1 dd883de5ba 6c6a25feca 167e6b48f1 7cb96d41a5 3207979278 595b3c4fdd 1521f7cf89 c11c5e69ac cf33608203 3632a812c0 f56506be0d 86427a3219 . The reason for the revert is that we observe ABA problem around inline method cache. When a cache misshits, we search for a method entry. And if the entry is identical to what was cached before, we reuse the cache. But the commits we are reverting here introduced situations where a method entry is freed, then the identical memory region is used for another method entry. An inline method cache cannot detect that ABA. Here is a code that reproduce such situation: ```ruby require 'prime' class << Integer alias org_sqrt sqrt def sqrt(n) raise end GC.stress = true Prime.each(7*37){} rescue nil # <- Here we populate CC class << Object.new; end # These adjacent remove-then-alias maneuver # frees a method entry, then immediately # reuses it for another. remove_method :sqrt alias sqrt org_sqrt end Prime.each(7*37).to_a # <- SEGV ```
* Treat return in block in class/module as LocalJumpError (#2511)Jeremy Evans2019-10-021-1/+5
| | | | | | return directly in class/module is an error, so return in proc in class/module should also be an error. I believe the previous behavior was an unintentional oversight during the addition of top-level return in 2.4.
* Fix assertionNobuyoshi Nakada2019-09-301-1/+1
| | | | callable_method_entry_p is for rb_callable_method_entry_t.
* delete unnecessary branch卜部昌平2019-09-301-4/+0
| | | | | | | | | | | | | | | | At last, not only myself but also your compiler are fully confident that the method entries pointed from call caches are immutable. We don't have to worry about silent updates. Just delete the branch that is now always false. Calculating ------------------------------------- ours trunk vm2_poly_same_method 2.142M 2.070M i/s - 6.000M times in 2.801148s 2.898994s Comparison: vm2_poly_same_method ours: 2141979.2 i/s trunk: 2069683.8 i/s - 1.03x slower
* refactor constify most of rb_method_entry_t卜部昌平2019-09-301-6/+6
| | | | | | | | | | | Now that we have eliminated most destructive operations over the rb_method_entry_t / rb_callable_method_entry_t, let's make them mostly immutabe and mark them const. One exception is rb_export_method(), which destructively modifies visibilities of method entries. I have left that operation as is because I suspect that destructiveness is the nature of that function.
* refactor add rb_method_entry_from_template卜部昌平2019-09-301-10/+2
| | | | | | Tired of rb_method_entry_create(..., rb_method_definition_create( ..., &(rb_method_foo_t) {...})) maneuver. Provide a function that does the thing to reduce copy&paste.
* refactor delete rb_method_entry_copy卜部昌平2019-09-301-1/+0
| | | | | | The deleted function was to destructively overwrite existing method entries, which is now considered to be a bad idea. Delete it, and assign a newly created method entry instead.
* refactor delete rb_method_definition_set卜部昌平2019-09-301-20/+27
| | | | | Instead of destructively write fields of method entries, create a new entry and let it overwrite its owner.
* refactor rb_method_definition_create take opts卜部昌平2019-09-301-4/+4
| | | | | | | Before this changeset rb_method_definition_create only allocated a memory region and we had to destructively initialize it later. That is not a good design so we change the API to return a complete struct instead.
* refactor constify most of rb_method_definition_t卜部昌平2019-09-301-3/+3
| | | | | | Most (if not all) of the fields of rb_method_definition_t are never meant to be modified once after they are stored. Marking them const makes it possible for compilers to warn on unintended modifications.
* Remove VM_NO_KEYWORDS, replace with RB_NO_KEYWORDSJeremy Evans2019-09-291-1/+1
| | | | | VM_NO_KEYWORDS was introduced first in vm_core.h, but it is best to only use a single definition for this.
* Correctly issue ArgumentError when calling method that accepts no keywordsJeremy Evans2019-09-271-0/+2
| | | | | | | If a method accepts no keywords and was called with a keyword, an ArgumentError was not always issued previously. Force methods that accept no keywords to go through setup_parameters_complex so that an ArgumentError is raised if keywords are provided.