| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now the caller of `rb_shape_get_next` needs to
first check if there is capacity left, and if not call
`rb_shape_transition_shape_capa` before it can call `rb_shape_get_next`.
And on each of these calls it needs to check if we got a TOO_COMPLEX
back.
All this logic is duplicated in the interpreter, YJIT and RJIT.
Instead we can have `rb_shape_get_next` do the capacity transition
when needed. The caller can compare the old and new shapes' capacity
to know if resizing is needed. It can also check for TOO_COMPLEX
only once.
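A toy Ruby model of the calling convention described above, not the VM's C code. `ToyShape`, `MAX_IVS`, `toy_get_next`, `toy_add_ivar` and the doubling growth policy are all assumptions made for illustration; only the idea (the next-shape lookup performs the capacity transition itself, and the caller checks for TOO_COMPLEX once and compares capacities) comes from the message.

```ruby
# Toy model only: every name and the doubling policy are hypothetical.
ToyShape = Struct.new(:iv_count, :capacity)
TOO_COMPLEX = :too_complex
MAX_IVS = 8

# Returns the next shape, growing the capacity when the current shape is
# full, or TOO_COMPLEX when the instance variable limit is reached.
def toy_get_next(shape)
  return TOO_COMPLEX if shape.iv_count >= MAX_IVS

  capacity = shape.capacity
  capacity *= 2 if shape.iv_count == capacity
  ToyShape.new(shape.iv_count + 1, capacity)
end

# Caller side: a single TOO_COMPLEX check, then a capacity comparison.
def toy_add_ivar(shape)
  next_shape = toy_get_next(shape)
  return [:fallback, shape] if next_shape.equal?(TOO_COMPLEX)

  action = next_shape.capacity > shape.capacity ? :resize : :no_resize
  [action, next_shape]
end
```

In this sketch the caller never asks whether capacity is left; it only looks at what came back.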
|
|
|
|
|
|
|
|
|
|
|
| |
We still need to do `jit.record_boundary_patch_point = false`
when gen_outlined_exit() returns `None` and we return with `?`.
Previously, we tripped the assert at codegen.rs:1042.
Found with `--yjit-exec-mem-size=3` on the lobsters benchmark.
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've long had a size restriction on the code memory region such that a
u32 could refer to everything. This commit capitalizes on this
restriction by shrinking the size of `CodePtr` to be 4 bytes from 8.
To derive a full raw pointer from a `CodePtr`, one needs a base pointer.
Both `CodeBlock` and `VirtualMemory` can be used for this purpose. The
base pointer is readily available everywhere, except for in the case of
the `jit_return` "branch". Generalize lea_label() to lea_jump_target()
in the IR to delay deriving the `jit_return` address until `compile()`,
when the base pointer is available.
On railsbench, this yields roughly a 1% reduction to `yjit_alloc_size`
(58,397,765 to 57,742,248).
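A rough Ruby illustration of the offset-plus-base idea, not the Rust `CodePtr` itself. `ToyCodePtr`, `raw_addr`, `packed_sizes` and the base address used below are hypothetical; the sketch only shows that a 32-bit offset takes 4 bytes while a full 64-bit address takes 8, and that the full address cannot be derived without a base.

```ruby
# Toy model only: ToyCodePtr and its methods are hypothetical names.
ToyCodePtr = Struct.new(:offset) do
  # Derive the full address; the base must be supplied by the caller.
  def raw_addr(base)
    base + offset
  end
end

# Pack the offset as a 32-bit integer and the full address as a 64-bit one,
# returning the byte size of each encoding.
def packed_sizes(base, ptr)
  [[ptr.offset].pack("L").bytesize, [ptr.raw_addr(base)].pack("Q").bytesize]
end
```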
|
|
|
|
|
|
|
| |
* YJIT: Inline basic Ruby methods
* YJIT: Fix "InsnOut operand made it past register allocation"
checktype should not generate a useless instruction.
|
|
|
|
|
|
| |
If the VM ran out of shapes, `rb_shape_transition_shape_capa` might
return `OBJ_TOO_COMPLEX_SHAPE`.
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
|
| |
|
|
|
|
|
|
|
|
|
| |
So that we get a reminder to check CodeBlock::has_dropped_bytes().
Internally, asm.compile() already checks it, and this patch just
propagates it out to the caller with a `#[must_use]`.
Code GC logic moved out one level in entry_stub_hit(), so the body
can freely use `?`.
|
|
|
|
| |
This reverts commit e3afc212ec059525fe4e5387b2a3be920ffe0f0e.
|
|
|
| |
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Previously, the version-controlled `cruby_bindings.inc.rs` file
contained the build-time artifact `id.h`, which nobu mentioned hinders
the goal of having fewer magic numbers in the repository.
Look up the IDs YJIT needs on boot. It costs cycles, but it's fine since
YJIT only uses a handful of IDs at the moment. No perceptible
degradation to boot time found in my testing.
|
|
|
|
|
|
|
| |
* YJIT: Fallback opt_getconstant_path for const_missing
* Fix a comment [ci skip]
* Remove a wrapper function
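For context, a minimal Ruby sketch of a constant path that resolves through `const_missing`, which is the situation the first item refers to. `Registry`, `Plugin` and `lookup_plugin` are illustrative names, and this is not the test case from the commit.

```ruby
module Registry
  # Invoked when a constant lookup under Registry finds nothing.
  def self.const_missing(name)
    "autoloaded #{name}"
  end
end

def lookup_plugin
  Registry::Plugin # no such constant, so const_missing supplies the value
end

p lookup_plugin
```

Because the constant is never actually defined, every call goes through `const_missing` again, so compiled code cannot treat the path as a cached constant.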
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, for block argument callsites with some specific argument
count and callee local variable count combinations, YJIT ended up
writing over arguments that are supposed to be collected into a rest
parameter array unmodified.
Detect when clobbering would happen and avoid it. Also, place the block
handler after the stack overflow check, since it writes to new stack
space.
Reported-by: Takashi Kokubun <takashikkbn@gmail.com>
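The Ruby-level behavior that has to be preserved can be sketched as follows. This is not the original reproducer: the method names, the argument count and the number of callee locals are arbitrary, and which combinations triggered the bug is not stated in the message.

```ruby
# Illustrative only: a callee with a rest parameter, a block parameter
# and an extra local variable.
def collect(first, *rest, &blk)
  count = rest.size
  [first, rest, count, blk.call]
end

# A callsite that passes the block as a block argument.
def call_with_block_arg
  blk = proc { :from_block }
  collect(1, 2, 3, 4, &blk)
end
```

Whatever the compiled code does with the stack, `rest` must come out as `[2, 3, 4]`, untouched by the placement of the block handler.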
|
|
|
|
|
|
|
|
|
|
| |
Given `SHAPE_MAX_NUM_IVS 80`, we transition to TOO_COMPLEX
way before we could overflow an 8-bit counter.
This reduces the size of `rb_shape_t` from 32B to 24B.
If we decide to raise `SHAPE_MAX_NUM_IVS` we can always increase
that type again.
|
|
|
|
|
| |
This way the growth factor is encapsulated, which allows
rb_shape_transition_shape_capa to be smarter about ideal sizes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, at the end of `leave` we did
`*caller_cfp->sp = return_value`, like the interpreter.
With future changes that leave the SP field uninitialized for C frames,
this will become problematic. For cases like returning from
`rb_funcall()`, the return value was written above the stack and
never read anyway (callers use the copy in the return register).
Leave the return value in a register at the end of `leave` and have the
code at `cfp->jit_return` decide what to do with it. This avoids the
unnecessary memory write mentioned above. For JIT-to-JIT returns, it goes
through `asm.stack_push()` and benefits from register allocation for
stack temporaries.
Mostly flat on benchmarks, with maybe some marginal speed improvements.
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
|
| |
YJIT: Remove spill_temps from jit_prepare_routine_call
|
|
|
|
|
| |
* YJIT: Chain-guard opt_mult overflow
* YJIT: Support regenerating Jo after Mul
|
| |
|
| |
|
|
|
|
|
|
|
| |
Since the compile-time iseq used in the guard was not marked and updated
during compaction, a runtime value reusing the address could falsely pass
the guard.
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* YJIT: Skip Insn::Comment and format! if disasm is disabled
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Get rid of asm.comment
---------
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
|
|
|
|
|
| |
/yjit/src/backend/x86_64/mod.rs is also UTF-8 and it doesn't have the
marker. The standard recommends against it, so remove it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, Kernel#lambda returned a non-lambda proc when given a
non-literal block and issued a warning under the `:deprecated` category.
With this change, Kernel#lambda will always return a lambda proc, if it
returns without raising.
Due to interactions with block passing optimizations, we previously had
two separate code paths for detecting whether Kernel#lambda got a
literal block. This change allows us to remove one path, the hack done
with rb_control_frame_t::block_code introduced in 85a337f for supporting
situations where Kernel#lambda returned a non-lambda proc.
[Feature #19777]
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
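A small sketch of the two call forms discussed above. The method names are made up for illustration, and the result of the second form depends on the Ruby version: on versions that include this change it is expected to raise `ArgumentError`, while older versions return the original non-lambda proc, so the sketch rescues the error and returns it.

```ruby
# Literal block: Kernel#lambda returns a lambda proc.
def lambda_from_literal
  lambda { |x| x }
end

# Non-literal block: returns either the ArgumentError that was raised
# or, on older Rubies, the proc that was passed in.
def lambda_from_block_arg
  pr = proc { |x| x }
  lambda(&pr)
rescue ArgumentError => e
  e
end
```

In both eras the rule stated in the message holds: when `Kernel#lambda` returns normally after this change, the result is a lambda.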
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add getbyte JIT implementation
Adds an implementation for String#getbyte for YJIT, along with a
bootstrap test. This should be helpful for pure Ruby implementations
and to avoid unneeded allocations.
Co-authored-by: John Hawthorn <jhawthorn@github.com>
* Skip the getbyte test for RJIT for now
---------
Co-authored-by: John Hawthorn <jhawthorn@github.com>
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
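As a reminder of what `String#getbyte` does at the Ruby level, here is a short sketch of the kind of byte loop a pure Ruby implementation might use. `byte_sum` is a made-up helper, not code from the commit or its bootstrap test.

```ruby
# Sum the bytes of a string without allocating a substring per byte.
def byte_sum(str)
  sum = 0
  i = 0
  while i < str.bytesize
    sum += str.getbyte(i) # Integer in 0..255
    i += 1
  end
  sum
end

p byte_sum("abc")      # 97 + 98 + 99
p "abc".getbyte(-1)    # negative indexes count from the end
p "abc".getbyte(3)     # nil when out of range
```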
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Remove function call for String#bytesize
String size is stored in a consistent location, so we can eliminate the
function call.
* Update yjit/src/codegen.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
---------
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
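For reference, `String#bytesize` counts bytes rather than characters, which is the value the message describes as being stored in a consistent location. A minimal sketch, with `sizes` as a made-up helper:

```ruby
# Return the character count and the byte count of a string.
def sizes(str)
  [str.length, str.bytesize]
end

p sizes("hello")
p sizes("h\u00e9llo") # the accented letter takes two bytes in UTF-8
```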
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* YJIT: Make compiled_* stats available by default
* Update comment about default counters [ci skip]
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
---------
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
|
|
|
|
|
| |
getblockparamproxy for "ifunc" behaves identically to iseq, in just
pushing rb_block_param_proxy.
|
| |
TempMapping (#8321)
* YJIT: merge tempmapping and temp types into a single-byte encoding
YJIT: refactor to shrink Context by 8 bytes
* Add tests, fix bug in TempMapping::map_to_local()
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Fewer transmutes where `as` would suffice. Also repr(u8)
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Update yjit/src/core.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
|
|
|
|
|
|
|
|
| |
These types are essentially claims about what `RBASIC_CLASS(obj)`
returns. The field changes with singleton class creation, but we didn't
consider that previously and elided guards where we actually needed them.
Found running ruby/spec with --yjit-verify-ctx. The assertion interface
makes extensive use of singleton classes.
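The effect of singleton class creation is easy to see from Ruby. In this sketch (the names `singleton_demo` and `shout` are made up), the object stays a `String` as far as `#class` is concerned, since `#class` skips singleton classes, yet method dispatch on it changes once the singleton class exists.

```ruby
def singleton_demo
  str = String.new("hi")
  before = str.singleton_methods

  # Defining a singleton method creates the singleton class for this object.
  def str.shout
    upcase + "!"
  end

  [str.class, before, str.singleton_methods, str.shout]
end

p singleton_demo
```

A guard that was elided because "this is a plain String" would miss `shout` and any method the singleton class overrides.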
|
|
|
|
|
| |
We pass the block around as `Option<BlockHandler>`. Having SpecVal
match that simplifies the code matching for the `None` case.
|
|
|
|
|
| |
A refactor so that the variants correspond to
branches in vm_caller_setup_arg_block().
|
|
|
|
|
|
|
|
| |
Rack uses this. Speculate that the `obj` in `the_call(&obj)`
will be a proc when the compile-time sample is a proc.
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
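At the Ruby level, `&obj` hands a proc through unchanged and converts other objects with `to_proc`. The sketch below shows both cases; `pass_proc` and `pass_method_object` are illustrative helpers, and only `the_call(&obj)` appears in the message.

```ruby
def the_call(&blk)
  blk
end

# A Proc passed with & arrives in the callee as the same object.
def pass_proc
  handler = proc { |x| x * 2 }
  the_call(&handler).equal?(handler)
end

# Any other object is converted with to_proc first.
def pass_method_object
  the_call(&2.method(:+)).call(3)
end
```

The speculation described above targets the first case, where no conversion is needed.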
|
|
| |
* Fix guard-heap upgrades
`getinstancevariable` was generating more heap guards than I thought.
It turns out that the upgrade code has a bug in it.
Given the following Ruby code:
```ruby
class Foo
  def initialize
    @a = 1
    @b = 1
  end

  def foo
    [@a, @b]
  end
end

foo = Foo.new
10.times { foo.foo }
puts RubyVM::YJIT.disasm Foo.instance_method(:foo)
```
Before this commit, the machine code was like this:
```
== BLOCK 1/4, ISEQ RANGE [0,3), 36 bytes ======================
# Insn: 0000 getinstancevariable (stack_size: 0)
0x5562fb831023: mov rax, qword ptr [r13 + 0x18]
# guard object is heap
0x5562fb831027: test al, 7
0x5562fb83102a: jne 0x5562fb833080
0x5562fb831030: test rax, rax
0x5562fb831033: je 0x5562fb833080
# guard shape
0x5562fb831039: cmp dword ptr [rax + 4], 0x18
0x5562fb83103d: jne 0x5562fb833062
# reg_temps: 00000000 -> 00000001
0x5562fb831043: mov rsi, qword ptr [rax + 0x10]
== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 36 bytes ======================
# regenerate_branch
# Insn: 0003 getinstancevariable (stack_size: 1)
# regenerate_branch
0x5562fb831047: mov rax, qword ptr [r13 + 0x18]
# guard object is heap
0x5562fb83104b: test al, 7
0x5562fb83104e: jne 0x5562fb8330db
0x5562fb831054: test rax, rax
0x5562fb831057: je 0x5562fb8330db
# guard shape
0x5562fb83105d: cmp dword ptr [rax + 4], 0x18
0x5562fb831061: jne 0x5562fb8330ba
# reg_temps: 00000001 -> 00000011
0x5562fb831067: mov rdi, qword ptr [rax + 0x18]
```
After this commit, the machine code has fewer guards for `self`:
```
== BLOCK 1/4, ISEQ RANGE [0,3), 36 bytes ======================
# Insn: 0000 getinstancevariable (stack_size: 0)
0x55cb5db5f023: mov rax, qword ptr [r13 + 0x18]
# guard object is heap
0x55cb5db5f027: test al, 7
0x55cb5db5f02a: jne 0x55cb5db61080
0x55cb5db5f030: test rax, rax
0x55cb5db5f033: je 0x55cb5db61080
# guard shape
0x55cb5db5f039: cmp dword ptr [rax + 4], 0x18
0x55cb5db5f03d: jne 0x55cb5db61062
# reg_temps: 00000000 -> 00000001
0x55cb5db5f043: mov rsi, qword ptr [rax + 0x10]
== BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
== BLOCK 3/4, ISEQ RANGE [3,6), 18 bytes ======================
# regenerate_branch
# Insn: 0003 getinstancevariable (stack_size: 1)
# regenerate_branch
0x55cb5db5f047: mov rax, qword ptr [r13 + 0x18]
# guard shape
0x55cb5db5f04b: cmp dword ptr [rax + 4], 0x18
0x55cb5db5f04f: jne 0x55cb5db610ba
# reg_temps: 00000001 -> 00000011
0x55cb5db5f055: mov rdi, qword ptr [rax + 0x18]
```
Co-Authored-By: Takashi Kokubun <takashikkbn@gmail.com>
* Fix array/string guards as well
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
|
|
| |
Issue found by running ruby/spec with `--yjit-verify-ctx`. Thanks!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* YJIT: implement fast path for integer multiplication in opt_mult
* Update yjit/src/codegen.rs
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* Implement mul with overflow checking on arm64
* Fix missing semicolon
* Add arm splitting for lshift, rshift, urshift
---------
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
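The reason multiplication needs overflow checking is visible from Ruby: small integers are stored as fixnums (up to `2**62 - 1` on 64-bit CRuby builds), and a product that no longer fits has to become a bignum with the same mathematical value. A minimal sketch, with `mul` as a made-up helper:

```ruby
def mul(a, b)
  a * b
end

p mul(6, 7)             # small operands, small result
p mul(2**40, 2**40)     # small operands, result needs a bignum
p mul(2**62 - 1, 2)     # largest 64-bit fixnum times two
```

Any fast path has to fall back to the general implementation for the last two calls instead of wrapping around.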
|
|
|
|
|
|
|
| |
We previously falsely asserted that String#<< always returns a ::String
instance. Issue was discovered on CI with `--yjit-verify-ctx`.
https://github.com/ruby/ruby/actions/runs/5893760435/job/15986002531
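A short Ruby sketch of why the assertion was false: `String#<<` returns its receiver, so a subclass instance comes back as a subclass instance. `Label` and `append_suffix` are illustrative names, not taken from the failing CI job.

```ruby
class Label < String; end

def append_suffix(str)
  str << "!"
end

label = Label.new("hi")
result = append_suffix(label)
p result.class
p result.equal?(label)
```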
|
|
|
| |
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
|
|
|
|
|
| |
We should return false for this type of special method, but we weren't
previously. Was reproducible with:
make test-all TESTS=../test/-ext-/test_notimplement.rb RUN_OPTS='--yjit-call-threshold=1'
|
|
|
|
|
|
|
|
|
|
|
| |
* YJIT: implement side chain fallback for setlocal to avoid exiting
* Update yjit/src/codegen.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
---------
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* YJIT: Fix splatting empty array with rest param
* YJIT: Rework optional parameter handling to fix corner case
The old code had a few unintuitive parts. The starting PC of the callee
was set in different places; `num_param`, which one would assume to be
static for a particular callee, seemingly tallied to different amounts
depending on what the caller passed; `opts_filled_with_splat` was
greater than zero even when the opts were not filled by items in the
splat array. Functionally, the bits that let the callee know which
keyword parameters are unspecified were not passed properly when there
are optional parameters and a rest parameter, and the optional
parameters are all filled.
Make `num_param` non-mut and use parameter information in the callee
iseq as-is. Move local variable nil fill and placing of the rest array
out of `gen_push_frame()` as they are only ever relevant for iseq calls.
Always place the rest array at `lead_num + opt_num` to fix the
previously buggy situation.
* YJIT: Compile splat calls to iseqs with rest params
Test interactions with optional parameters.
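The calling conventions involved can be exercised from Ruby with a callee that has a required parameter, an optional parameter, a rest parameter and a keyword whose default is computed at call time. This is a sketch of the semantics only; `callee` and `call_variants` are made-up names and these are not the tests added by the commit.

```ruby
# k's default is a non-constant expression, so the callee must be told
# whether the caller supplied it.
def callee(a, b = :default_b, *rest, k: a.to_s)
  [a, b, rest, k]
end

def call_variants
  empty = []
  [
    callee(1, *empty),           # splatting an empty array
    callee(1, 2),                # optional filled, rest empty, k unspecified
    callee(1, *[2, 3, 4]),       # splat fills the optional and the rest
    callee(1, 2, 3, k: "given")  # keyword supplied explicitly
  ]
end
```

In every variant the rest array sits after the required and optional parameters, which matches the `lead_num + opt_num` placement the message describes.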
|
| |
|
| |
|