path: root/yjit/src/cruby.rs
Commit message | Author | Age | Files | Lines
* YJIT: Fix `cargo doc --document-private-items` warnings [ci skip] | Alan Wu | 2024-06-28 | 1 | -5/+4
  Mostly putting angle brackets around links to follow markdown syntax.
* Optimized forwarding callers and callees | Aaron Patterson | 2024-06-18 | 1 | -0/+1
  This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls.

  Calls it optimizes look like this:

  ```ruby
  def bar(a) = a
  def foo(...) = bar(...) # optimized
  foo(123)
  ```

  ```ruby
  def bar(*a) = a
  def foo(...) = bar(1, 2, ...) # optimized
  foo(123)
  ```

  ```ruby
  def bar(*a) = a

  def foo(...)
    list = [1, 2]
    bar(*list, ...) # optimized
  end
  foo(123)
  ```

  All variants of the above but using `super` are also optimized, including a bare super like this:

  ```ruby
  def foo(...)
    super
  end
  ```

  This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this:

  ```ruby
  def m
    x = GC.stat(:total_allocated_objects)
    yield
    GC.stat(:total_allocated_objects) - x
  end

  def bar(a) = a
  def foo(...) = bar(...)

  def test
    m { foo(123) }
  end

  test
  p test # allocates 1 object on master, but 0 objects with this patch
  ```

  ```ruby
  def bar(a, b:) = a + b
  def foo(...) = bar(...)

  def test
    m { foo(1, b: 2) }
  end

  test
  p test # allocates 2 objects on master, but 0 objects with this patch
  ```

  How does it work?
  -----------------

  This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI.

  I think this description is kind of confusing, so let's walk through an example with code.

  ```ruby
  def delegatee(a, b) = a + b

  def delegator(...)
    delegatee(...)  # CI2 (FORWARDING)
  end

  def caller
    delegator(1, 2) # CI1 (argc: 2)
  end
  ```

  Before we call the delegator method, the stack looks like this:

  ```
  Executing Line | Code                                  | Stack
  ---------------+---------------------------------------+--------
                1| def delegatee(a, b) = a + b           | self
                2|                                       | 1
                3| def delegator(...)                    | 2
                4|   #                                   |
                5|   delegatee(...)  # CI2 (FORWARDING)  |
                6| end                                   |
                7|                                       |
                8| def caller                            |
            ->  9|   delegator(1, 2) # CI1 (argc: 2)     |
               10| end                                   |
  ```

  The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls into `delegator`, it writes `CI1` onto the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object.

  Here is the ISeq disasm for `delegator`:

  ```
  == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
  local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
  [ 1] "..."@0
  0000 putself                                                          (   1)[LiCa]
  0001 getlocal_WC_0                          "..."@0
  0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
  0006 leave                                  [Re]
  ```

  The local called `...` will contain the caller's CI: CI1.

  Here is the stack when we enter `delegator`:

  ```
  Executing Line | Code                                  | Stack
  ---------------+---------------------------------------+--------
                1| def delegatee(a, b) = a + b           | self
                2|                                       | 1
                3| def delegator(...)                    | 2
            ->  4|   #                                   | CI1 (argc: 2)
                5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
                6| end                                   | specval
                7|                                       | type
                8| def caller                            |
                9|   delegator(1, 2) # CI1 (argc: 2)     |
               10| end                                   |
  ```

  The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2).

  Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped):

  ```
  == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
  local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
  [ 1] "..."@0
  0000 putself                                                          (   1)[LiCa]
  0001 getlocal_WC_0                          "..."@0
  0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
  0006 leave                                  [Re]
  ```

  Instruction 0001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

  ```
  Executing Line | Code                                  | Stack
  ---------------+---------------------------------------+--------
                1| def delegatee(a, b) = a + b           | self
                2|                                       | 1
                3| def delegator(...)                    | 2
                4|   #                                   | CI1 (argc: 2)
            ->  5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
                6| end                                   | specval
                7|                                       | type
                8| def caller                            | self
                9|   delegator(1, 2) # CI1 (argc: 2)     | 1
               10| end                                   | 2
  ```

  The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forward splat args, kwargs, etc.

  Since we're able to copy the stack from `caller` into `delegator`'s stack, we can avoid allocating objects.

  I want to do this to eliminate object allocations for delegate methods. My long-term goal is to implement `Class#new` in Ruby and it uses `...`.

  I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`.

  For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs:

  ```ruby
  SomeObject.new(foo: 1)
  ```

  If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation.

  Co-Authored-By: John Hawthorn <john@hawthorn.email>
  Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>
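The forwarding behaviour described in this commit message can be checked from plain Ruby. The following is a minimal sketch, not code from the patch: the method names are illustrative, and it only demonstrates that positional arguments, keyword arguments, and a block all pass through `...` unchanged, including when a leading argument is added at the call site. It says nothing about allocation counts, which depend on the Ruby build.

```ruby
# Illustrative names only; none of these methods come from the patch itself.
def delegatee(a, b = 0, c: 0, &blk)
  sum = a + b + c
  blk ? blk.call(sum) : sum
end

# Forward every argument unchanged.
def delegator(...) = delegatee(...)

# Forward every argument after a leading literal, as in `bar(1, 2, ...)`.
def delegator_with_prefix(...) = delegatee(1, ...)

p delegator(1, 2)                    # positional arguments
p delegator(1, c: 3)                 # keyword argument
p delegator(1, 2) { |sum| sum * 10 } # the block is forwarded too
p delegator_with_prefix(2, c: 3)     # arrives as delegatee(1, 2, c: 3)
```

Because the callee's signature decides how the forwarded values are interpreted, the delegator itself never needs to name or inspect them.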
* Introduce a specialize instruction for Array#pack | Nobuyoshi Nakada | 2024-05-23 | 1 | -0/+1
  Instructions for this code:

  ```ruby
  # frozen_string_literal: true

  [a].pack("C")
  ```

  Before this commit:

  ```
  == disasm: #<ISeq:<main>@test.rb:1 (1,0)-(3,13)>
  0000 putself                                                          (   3)[Li]
  0001 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
  0003 newarray                               1
  0005 putobject                              "C"
  0007 opt_send_without_block                 <calldata!mid:pack, argc:1, ARGS_SIMPLE>
  0009 leave
  ```

  After this commit:

  ```
  == disasm: #<ISeq:<main>@test.rb:1 (1,0)-(3,13)>
  0000 putself                                                          (   3)[Li]
  0001 opt_send_without_block                 <calldata!mid:a, argc:0, FCALL|VCALL|ARGS_SIMPLE>
  0003 putobject                              "C"
  0005 opt_newarray_send                      2, :pack
  0008 leave
  ```

  Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
  Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
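For reference, the Ruby-level pattern this instruction targets is an array literal used only as the receiver of `pack`. The sketch below uses hypothetical helper names and shows only the observable result of that pattern; whether `opt_newarray_send` is actually emitted depends on the Ruby version in use.

```ruby
# frozen_string_literal: true

# Hypothetical helpers: each builds a temporary array literal and packs it at once.
def pack_byte(a)
  [a].pack("C")        # "C" packs one unsigned 8-bit integer
end

def pack_bytes(a, b)
  [a, b].pack("C*")    # "C*" packs every remaining element the same way
end

p pack_byte(65)
p pack_bytes(65, 66)
```

Since the temporary array is never visible to the program, the VM is free to skip materialising it, which is the point of the specialized instruction.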
* YJIT: Remove CString allocation when using `src_loc!()` | Alan Wu | 2024-04-29 | 1 | -12/+11
  Since we often take the VM lock as the first thing we do when entering YJIT, and that needs a `src_loc!()`, this removes an allocation from that. The main trick here is `concat!(file!(), '\0')` to get a C string statically baked into the binary.
* YJIT: Add specialized codegen function for `TrueClass#===` (#10640) | Randy Stauner | 2024-04-29 | 1 | -0/+1
  * YJIT: Add specialized codegen function for `TrueClass#===`

  TrueClass#=== is currently number 10 in the most frequent C calls list of the lobsters benchmark.

  ```
  require "benchmark/ips"

  def wrap
    true === true
    true === false
    true === :x
  end

  Benchmark.ips do |x|
    x.report(:wrap) do
      wrap
    end
  end
  ```

  ```
  before
  Warming up --------------------------------------
                  wrap     1.791M i/100ms
  Calculating -------------------------------------
                  wrap     17.806M (± 1.0%) i/s -     89.544M in   5.029363s

  after
  Warming up --------------------------------------
                  wrap     4.024M i/100ms
  Calculating -------------------------------------
                  wrap     40.149M (± 1.1%) i/s -    201.223M in   5.012527s
  ```

  Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
  Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com>
  Co-authored-by: Kevin Menard <kevin.menard@shopify.com>
  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

  * Fix the new test for RJIT

  ---------

  Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
  Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com>
  Co-authored-by: Kevin Menard <kevin.menard@shopify.com>
  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
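One likely reason `TrueClass#===` shows up so often in C call profiles is that `case`/`when true` dispatches through it. The sketch below is an illustration under that assumption, with a hypothetical `classify` helper; it shows the method's semantics, not YJIT's generated code.

```ruby
# Hypothetical helper: each `when` clause calls `===` on its pattern,
# so `when true` invokes TrueClass#=== with `value` as the argument.
def classify(value)
  case value
  when true  then :is_true
  when false then :is_false
  else            :other
  end
end

# The three calls exercised by the benchmark in the commit message.
def wrap_results
  [true === true, true === false, true === :x]
end

p classify(true), classify(false), classify(:x)
p wrap_results
```

`true === other` is only true when `other` is `true` itself, so a specialized code path can reduce it to a single comparison.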
* YJIT: Optimize local variables when EP == BP (take 2) (#10607) | Takashi Kokubun | 2024-04-25 | 1 | -0/+1
  * Revert "Revert "YJIT: Optimize local variables when EP == BP" (#10584)"

  This reverts commit c8783441952217c18e523749c821f82cd7e5d222.

  * YJIT: Take care of GC references in ISEQ invariants

  Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>

  ---------

  Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* Revert "YJIT: Optimize local variables when EP == BP" (#10584) | Alan Wu | 2024-04-19 | 1 | -1/+0
  This reverts commit 4cc58ea0b865f2fd20f1e881ddbd4c4fab0b072c.

  Since the change landed, call-threshold=1 CI runs have been timing out. There have also been `verify-ctx` violations. Revert for now while we debug.
* YJIT: Optimize local variables when EP == BP (#10487) | Takashi Kokubun | 2024-04-17 | 1 | -0/+1
* YJIT: Inline simple getlocal+leave iseqs | Alan Wu | 2024-03-25 | 1 | -0/+1
  This mainly targets things like `T.unsafe()` from Sorbet, which is just an identity function at runtime and only a hint for the static checker. Only deal with simple caller and callees (no keywords and splat etc.).

  Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com>
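To see what a "getlocal+leave" ISeq looks like, one can disassemble an identity method on CRuby. This is a sketch with a hypothetical `identity` method standing in for something like `T.unsafe()`; the exact instruction names and operands in the output may differ between Ruby versions.

```ruby
# Hypothetical identity method, standing in for helpers such as Sorbet's T.unsafe.
def identity(x) = x

# Returns the bytecode listing for `identity` (CRuby only, via RubyVM).
def identity_disasm
  RubyVM::InstructionSequence.of(method(:identity)).disasm
end

puts identity_disasm
```

The body is expected to consist of a local read followed by `leave`, which is the shape that makes the method cheap to inline at the call site.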
* YJIT: String#getbyte codegen (#10188) | Maxime Chevalier-Boisvert | 2024-03-06 | 1 | -0/+3
  * WIP getbyte implementation
  * WIP String#getbyte implementation
  * Fix whitespace in stats.rs
  * fix?
  * Fix whitespace, add comment

  ---------

  Co-authored-by: Aaron Patterson <aaron.patterson@shopify.com>
* YJIT: Lazily push a frame for specialized C funcs (#10080) | Takashi Kokubun | 2024-02-23 | 1 | -0/+2
  * YJIT: Lazily push a frame for specialized C funcs

  Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>

  * Fix a comment on pc_to_cfunc
  * Rename rb_yjit_check_pc to rb_yjit_lazy_push_frame
  * Rename it to jit_prepare_lazy_frame_call
  * Fix a typo
  * Optimize String#getbyte as well
  * Optimize String#byteslice as well

  ---------

  Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
* YJIT: Verify the assumption of leaf C calls (#10002) | Takashi Kokubun | 2024-02-20 | 1 | -0/+13
* YJIT: Add support for `**kwrest` parameters | Alan Wu | 2024-02-12 | 1 | -7/+0
  Now that `...` uses `**kwrest` instead of regular splat and ruby2keywords, we need to support these types of methods to support `...` well.
* YJIT: Prefer an overloaded cme if available (#9913) | Takashi Kokubun | 2024-02-12 | 1 | -0/+4
  YJIT: Prefer an overloaded cme if applicable
* YJIT: Skip pushing a frame for Hash#empty? (#9875) | Takashi Kokubun | 2024-02-08 | 1 | -1/+2
* YJIT: Specialize splatkw on T_HASH (#9764) | Takashi Kokubun | 2024-01-30 | 1 | -0/+5
  * YJIT: Specialize splatkw on T_HASH
  * Fix a typo

  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

  * Fix a few more comments

  ---------

  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* YJIT: Support concattoarray and pushtoarray (#9708) | Takashi Kokubun | 2024-01-25 | 1 | -0/+1
* YJIT: Fix ruby2_keywords splat+rest and drop bogus checks | Alan Wu | 2024-01-23 | 1 | -1/+0
  YJIT didn't guard for ruby2_keywords hash in case of splat calls that land in methods with a rest parameter, creating incorrect results.

  The compile-time checks didn't correspond to any actual effects of ruby2_keywords, so they were masking this bug and YJIT was needlessly refusing to compile some code. About 16% of fallback reasons in `lobsters` were due to the ISeq check.

  We already handle the tagging part with exit_if_supplying_kw_and_has_no_kw() and should now have a dynamic guard for all splat cases.

  Note for backporting: You also need 7f51959ff1.

  [Bug #20195]
* YJIT: expandarray for non-arrays (#9495) | ywenc | 2024-01-12 | 1 | -0/+1
  * YJIT: expandarray for non-arrays

  Co-authored-by: John Hawthorn <john@hawthorn.email>

  * Skip the new test on RJIT
  * Increment counter for to_ary exit

  ---------

  Co-authored-by: John Hawthorn <john@hawthorn.email>
  Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Fix unused warnings | Alan Wu | 2024-01-10 | 1 | -3/+1
  ```
  warning: unused import: `condition::Condition`
    --> src/asm/arm64/arg/mod.rs:13:9
     |
  13 | pub use condition::Condition;
     |         ^^^^^^^^^^^^^^^^^^^^
     |
     = note: `#[warn(unused_imports)]` on by default

  warning: unused import: `rb_yjit_fix_mul_fix as rb_fix_mul_fix`
     --> src/cruby.rs:188:9
      |
  188 | pub use rb_yjit_fix_mul_fix as rb_fix_mul_fix;
      |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  warning: unused import: `rb_insn_len as raw_insn_len`
     --> src/cruby.rs:142:9
      |
  142 | pub use rb_insn_len as raw_insn_len;
      |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      |
      = note: `#[warn(unused_imports)]` on by default
  ```

  Make asm public so it stops warning about unused public stuff in there.
* YJIT: Add some object validity assertions | Alan Wu | 2023-12-06 | 1 | -2/+7
  We've seen quite a few compaction bugs lately, and these assertions should give clearer symptoms. We only call class_of() on objects that the Ruby code can see.
* YJIT: Inline basic Ruby methods (#8855) | Takashi Kokubun | 2023-11-07 | 1 | -0/+6
  * YJIT: Inline basic Ruby methods
  * YJIT: Fix "InsnOut operand made it past register allocation"

  checktype should not generate a useless instruction.
* YJIT: Add --yjit-perf (#8697) | Takashi Kokubun | 2023-10-18 | 1 | -1/+0
  Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Lookup IDs on boot instead of binding to them | Alan Wu | 2023-10-17 | 1 | -0/+48
  Previously, the version-controlled `cruby_bindings.inc.rs` file contained the build-time artifact `id.h`, which nobu mentioned hinders the goal of having fewer magic numbers in the repository.

  Look up the IDs YJIT needs on boot. It costs cycles, but it's fine since YJIT only uses a handful of IDs at the moment. No perceptible degradation to boot time found in my testing.
* Remove function call for String#bytesize (#8389) | Aaron Patterson | 2023-09-07 | 1 | -1/+0
  * Remove function call for String#bytesize

  String size is stored in a consistent location, so we can eliminate the function call.

  * Update yjit/src/codegen.rs

  Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>

  ---------

  Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
  Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Silence Clippy for bindgen generated code | Alan Wu | 2023-09-05 | 1 | -1/+1
  New Clippy lint in 1.72.0 is breaking our build as GitHub has updated their image. No point hearing about lints from generated code we don't manually write.
* YJIT: Compile exception handlers (#8171) | Takashi Kokubun | 2023-08-08 | 1 | -0/+1
  Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Remove __bp__ and speed-up bmethod calls (#8060) | Alan Wu | 2023-07-17 | 1 | -3/+2
  Remove rb_control_frame_t::__bp__ and optimize bmethod calls

  This commit removes the __bp__ field from rb_control_frame_t. It was introduced to help MJIT, but since MJIT was replaced by RJIT, we can use vm_base_ptr() to compute it from the SP of the previous control frame instead. Removing the field avoids needing to set it up when pushing new frames.

  Simply removing __bp__ would cause crashes since RJIT and YJIT used a slightly different stack layout for bmethod calls than the interpreter. At the moment of the call, the two layouts looked as follows:

                     ┌────────────┐    ┌────────────┐
                     │ frame_base │    │ frame_base │
                     ├────────────┤    ├────────────┤
                     │    ...     │    │    ...     │
                     ├────────────┤    ├────────────┤
                     │    args    │    │    args    │
                     ├────────────┤    └────────────┘<─prev_frame_sp
                     │  receiver  │
      prev_frame_sp─>└────────────┘

                       RJIT & YJIT      interpreter

  Essentially, vm_base_ptr() needs to compute the address to frame_base given prev_frame_sp in the diagrams. The presence of the receiver created an off-by-one situation.

  Make the interpreter use the layout the JITs use for iseq-to-iseq bmethod calls. Doing so removes unnecessary argument shifting and vm_exec_core() re-entry from the interpreter, yielding a speed improvement visible through `benchmark/vm_defined_method.yml`:

      patched: 7578743.1 i/s
       master: 4796596.3 i/s - 1.58x slower

  C-to-iseq bmethod calls now store one more VALUE than before, but that should have negligible impact on overall performance.

  Note that re-entering vm_exec_core() used to be necessary for firing TracePoint events, but that's no longer the case since 9121e57a5f50bc91bae48b3b91edb283bf96cb6b.

  Closes ruby/ruby#6428
* Implement Struct on VWA | Peter Zhu | 2023-06-05 | 1 | -1/+1
  The benchmark results show that this feature has either a positive or no impact on performance. The memory usage is also mostly unchanged, except in hexapdf, where there is a decrease in RSS.

  --------------  -----------  ----------  ---------  -----------  ----------  ---------  --------------  -------------
  bench           master (ms)  stddev (%)  RSS (MiB)  branch (ms)  stddev (%)  RSS (MiB)  branch 1st itr  master/branch
  activerecord    70.8         2.2         56.0       71.7         2.2         56.0       0.99            0.99
  erubi_rails     20.5         13.6        94.7       20.5         14.3        94.2       0.93            1.00
  hexapdf         2541.0       0.7         212.8      2544.4       0.7         203.4      1.00            1.00
  liquid-c        65.6         0.3         38.9       65.3         0.3         38.9       1.01            1.01
  liquid-compile  63.7         0.3         34.6       61.1         0.2         34.6       1.04            1.04
  liquid-render   163.1        0.1         37.1       163.3        0.1         37.1       1.00            1.00
  mail            139.3        0.1         50.5       137.0        0.1         50.1       0.99            1.02
  psych-load      2065.7       0.1         36.9       2068.2       0.1         37.3       1.00            1.00
  railsbench      2034.6       0.5         103.9      2031.9       0.5         103.8      1.02            1.00
  ruby-lsp        65.3         3.1         89.8       66.2         3.0         89.7       1.01            0.99
  sequel          73.2         1.0         40.3       73.4         1.0         40.3       1.00            1.00
  --------------  -----------  ----------  ---------  -----------  ----------  ---------  --------------  -------------
* YJIT: Add codegen for Integer methods (#7665) | Takashi Kokubun | 2023-04-05 | 1 | -1/+3
  * YJIT: Add codegen for Integer methods
  * YJIT: Update dependencies
  * YJIT: Fix Integer#[] for argc=2
* Revert "YJIT: Suppress unnecessary `unsafe` block (GH-7634)" | Alan Wu | 2023-04-05 | 1 | -1/+1
  This reverts commit 9e678cdbd054f78576a8f21b3f97cccc395ade22.

  Without the `unsafe` annotations, the SAFETY comments make less sense. I want to keep the SAFETY comments.
* YJIT: Suppress unnecessary `unsafe` block (#7634) | Nobuyoshi Nakada | 2023-03-31 | 1 | -1/+1
* YJIT: Support entry for multiple PCs per ISEQ (GH-7535) | Takashi Kokubun | 2023-03-17 | 1 | -0/+7
* YJIT: log the names of methods we call to in disasm (#7231) | Maxime Chevalier-Boisvert | 2023-02-02 | 1 | -1/+14
  * YJIT: log the names of methods we call to in disasm
  * Assert that pointer is not null
  * Handle case where UTF8 conversion not possible
* Fix typos in YJIT [ci skip] | Alan Wu | 2023-02-02 | 1 | -1/+1
* YJIT: Factor out VALUE_BITS = (8 * SIZE_OF_VALUE as u8) | Alan Wu | 2023-01-13 | 1 | -0/+1
  Using a constant shows intention better and is less noisy. It always took me a second to parse the long expression.
* Enable `clippy` checks for yjit in CI (#7093) | Ian Ker-Seymer | 2023-01-12 | 1 | -1/+1
  * Add job to check clippy lints in CI
  * Address all remaining clippy lints
  * Check lints on arm64 as well
  * Apply latest clippy lints
  * Do not exit 0 on clippy warnings
* Transition complex objects to "too complex" shape | Jemma Issroff | 2022-12-15 | 1 | -0/+4
  When an object becomes "too complex" (in other words it has too many variations in the shape tree), we transition it to use a "too complex" shape and use a hash for storing instance variables.

  Without this patch, there were rare cases where shape tree growth could "explode" and cause performance degradation on what would otherwise have been cached fast paths.

  This patch puts a limit on shape tree growth, and gracefully degrades in the rare case where there could be a factorial growth in the shape tree.

  For example:

  ```ruby
  class NG; end

  HUGE_NUMBER.times do
    NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
  end
  ```

  We consider objects to be "too complex" when the object's class has more than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and the object introduces a new variation (a new leaf node) associated with that class.

  For example, new variations on instances of the following class would be considered "too complex" because those instances create more than 8 leaves in the shape tree:

  ```ruby
  class Foo; end
  9.times { Foo.new.instance_variable_set(:"@uniq_#{_1}", 1) }
  ```

  However, the following class is *not* too complex because it only has one leaf in the shape tree:

  ```ruby
  class Foo
    def initialize
      @a = @b = @c = @d = @e = @f = @g = @h = @i = nil
    end
  end
  9.times { Foo.new }
  ```

  This case is rare, so we don't expect this change to impact performance of most applications, but it needs to be handled.

  Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
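The shape tree itself is not exposed through a stable public API, so the following sketch only illustrates how such per-instance variations arise. The `Widget` class and helper names are hypothetical. It also shows that `instance_variable_set` expects a name that starts with `@`, whether given as a symbol or a string.

```ruby
class Widget; end # hypothetical class, not from the patch

# Builds `count` instances, each with a differently named instance variable,
# which is the kind of per-instance variation the commit message describes.
def distinct_ivar_instances(count)
  Array.new(count) do |i|
    obj = Widget.new
    obj.instance_variable_set(:"@uniq_#{i}", 1)
    obj
  end
end

# Returns the error raised for a name that does not start with "@", or nil.
def invalid_name_error
  Widget.new.instance_variable_set(":@uniq_0", 1)
  nil
rescue NameError => e
  e
end

p distinct_ivar_instances(9).map(&:instance_variables)
p invalid_name_error.class
```

Each instance here ends up with a different single instance variable, so no two of them share a layout beyond the root.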
* bail on compilation if the comptime receiver is frozen | Aaron Patterson | 2022-12-02 | 1 | -0/+4
* implement IV writes | Aaron Patterson | 2022-12-02 | 1 | -1/+19
* MJIT: Use a String buffer in builtin compilers | Takashi Kokubun | 2022-11-27 | 1 | -7/+0
  instead of FILE*.

  Using C.fprintf is slower than String manipulation on memory. I'm going to change the way MJIT writes files, and this is a prerequisite for it.
* YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets (#6733) | Takashi Kokubun | 2022-11-15 | 1 | -0/+5
  * YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets

  Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>

  * Introduce heap_object_p
  * Leave original mov intact
  * Remove unneeded branches
  * Add a test for movabs

  Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Support invokeblock (#6640) | Takashi Kokubun | 2022-11-02 | 1 | -2/+18
  * YJIT: Support invokeblock
  * Update yjit/src/backend/arm64/mod.rs
  * Update yjit/src/codegen.rs

  Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* YJIT: incorporate ruby_special_consts | Nobuyoshi Nakada | 2022-10-20 | 1 | -18/+13
* Move "special consts" so `Qundef` and `Qnil` differ just 1 bit | Nobuyoshi Nakada | 2022-10-20 | 1 | -2/+2
* YJIT: fold the "asm_comments" feature into "disasm" (#6591) | Alan Wu | 2022-10-19 | 1 | -1/+1
  Previously, enabling only "disasm" didn't actually build. Since these two features are closely related and we don't really use one without the other, let's simplify and merge the two features together.
* fixes more clippy warnings (#6543) | Jimmy Miller | 2022-10-13 | 1 | -0/+1
  * fixes more clippy warnings
  * Fix x86 c_callable to have doc_strings
* Implement optimize send in yjit (#6488) | Jimmy Miller | 2022-10-11 | 1 | -0/+5
  * Implement optimize send in yjit

  This successfully makes all our benchmarks exit way less for optimize send reasons. It makes some benchmarks faster, but not by as much as I'd like. I think this implementation works, but there are definitely more optimal arrangements. For example, what if we compiled send to a jump table? That seems like perhaps the most optimal we could do, but it is not obvious (to me) how to implement given our current setup.

  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

  * Attempt at fixing the issues raised by @XrXr
  * fix allowlist
  * returns 0 instead of nil when not found
  * remove comment about encoding exception
  * Fix up c changes
  * Update assert

  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

  * get rid of unneeded code and fix the flags
  * Apply suggestions from code review

  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>

  * rename and fix typo

  Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* Revert "Revert "This commit implements the Object Shapes technique in CRuby."" | Jemma Issroff | 2022-10-11 | 1 | -2/+10
  This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
* Revert "This commit implements the Object Shapes technique in CRuby." | Aaron Patterson | 2022-09-30 | 1 | -10/+2
  This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.