path: root/yjit/src/codegen.rs
Commit message (Author, Age, Files, Lines)
* YJIT: Take cargo --fix for unnecessary calls to into() (Alan Wu, 2023-11-10, 1 file, -3/+3)
|
* YJIT: Auto fix for clippy::unnecessary_cast (Alan Wu, 2023-11-10, 1 file, -1/+1)
|
* YJIT: Auto fix for clippy::clone_on_copy (Alan Wu, 2023-11-10, 1 file, -10/+10)
|
* Refactor rb_shape_transition_shape_capa out (Jean Boussier, 2023-11-08, 1 file, -34/+22)
| Right now the caller of `rb_shape_get_next` needs to first check whether there is capacity left and, if not, call `rb_shape_transition_shape_capa` before it can call `rb_shape_get_next`. After each of these calls it needs to check whether it got a TOO_COMPLEX shape back.
|
| All this logic is duplicated in the interpreter, YJIT and RJIT.
|
| Instead we can have `rb_shape_get_next` do the capacity transition when needed. The caller can compare the capacities of the old and new shapes to know if resizing is needed. It can also check for TOO_COMPLEX only once.
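The calling convention described in this message can be sketched roughly as follows. This is a toy model only: the `Shape` struct, the doubling growth factor, `MAX_IVARS`, and `plan_set_ivar` are all assumptions made for illustration, and the real shape tree is implemented in C.

```rust
// Hypothetical model of the calling convention; not the real shape.c API.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Shape {
    ivar_count: u32,
    capacity: u32,
    too_complex: bool,
}

// Assumed limit, for illustration only.
const MAX_IVARS: u32 = 80;

// Returns the shape reached by adding one instance variable. The capacity
// transition happens in here, so callers no longer perform it themselves.
fn shape_get_next(cur: Shape) -> Shape {
    if cur.too_complex || cur.ivar_count >= MAX_IVARS {
        return Shape { too_complex: true, ..cur };
    }
    let capacity = if cur.ivar_count == cur.capacity {
        (cur.capacity * 2).max(1) // assumed growth factor
    } else {
        cur.capacity
    };
    Shape { ivar_count: cur.ivar_count + 1, capacity, too_complex: false }
}

#[derive(Debug, PartialEq)]
enum SetIvarPlan {
    Fallback,             // the next shape is TOO_COMPLEX
    ResizeThenWrite(u32), // capacity changed, so the buffer must grow first
    WriteInPlace,         // capacity unchanged
}

// The caller checks for TOO_COMPLEX once, then compares old and new capacity.
fn plan_set_ivar(cur: Shape) -> SetIvarPlan {
    let next = shape_get_next(cur);
    if next.too_complex {
        return SetIvarPlan::Fallback;
    }
    if next.capacity != cur.capacity {
        SetIvarPlan::ResizeThenWrite(next.capacity)
    } else {
        SetIvarPlan::WriteInPlace
    }
}
```

With this arrangement the interpreter and both JITs would share one decision procedure rather than each repeating the capacity check.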
* YJIT: Fix assert in OOM scenario (Alan Wu, 2023-11-07, 1 file, -1/+1)
| We still need to do `jit.record_boundary_patch_point = false` when gen_outlined_exit() returns `None` and we return with `?`. Previously, we tripped the assert at codegen.rs:1042.
|
| Found with `--yjit-exec-mem-size=3` on the lobsters benchmark.
|
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
| Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
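The hazard here is a general one with `?`: state that must be reset can be skipped by the early return. A minimal sketch, assuming a simplified `JitState` and a stand-in `gen_outlined_exit` that fails when a made-up `mem_left` budget is zero (neither matches the real signatures):

```rust
// Simplified stand-in for the JIT state; only the flag from the message.
struct JitState {
    record_boundary_patch_point: bool,
}

// Stand-in for gen_outlined_exit(): `None` models running out of
// executable memory. The returned address is a dummy value.
fn gen_outlined_exit(mem_left: usize) -> Option<usize> {
    if mem_left == 0 {
        None
    } else {
        Some(0x1000)
    }
}

// Hypothetical caller: the flag is cleared before `?` can return early,
// so it is reset on both the success and the out-of-memory path.
fn gen_boundary_exit(jit: &mut JitState, mem_left: usize) -> Option<usize> {
    let exit = gen_outlined_exit(mem_left);
    jit.record_boundary_patch_point = false;
    let exit_addr = exit?;
    Some(exit_addr)
}
```

Writing `gen_outlined_exit(mem_left)?` first and clearing the flag afterwards would leave it set on the `None` path, which is the kind of stale state a later assert can trip over.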
* YJIT: Use u32 for CodePtr to save 4 bytes each (Alan Wu, 2023-11-07, 1 file, -3/+3)
| We've long had a size restriction on the code memory region such that a u32 could refer to everything. This commit capitalizes on this restriction by shrinking the size of `CodePtr` to be 4 bytes from 8.
|
| To derive a full raw pointer from a `CodePtr`, one needs a base pointer. Both `CodeBlock` and `VirtualMemory` can be used for this purpose. The base pointer is readily available everywhere, except for in the case of the `jit_return` "branch". Generalize lea_label() to lea_jump_target() in the IR to delay deriving the `jit_return` address until `compile()`, when the base pointer is available.
|
| On railsbench, this yields roughly a 1% reduction to `yjit_alloc_size` (58,397,765 to 57,742,248).
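The idea of storing a 32-bit offset and recovering the raw pointer from a base can be sketched as below. The `Region` type and its method names are invented for this example; in YJIT the base comes from `CodeBlock` or `VirtualMemory`, whose real interfaces are not reproduced here.

```rust
// A code pointer stored as a 32-bit offset from the start of the region.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CodePtr(u32);

// Hypothetical owner of the code memory; supplies the base pointer.
struct Region {
    base: *const u8,
    size: usize,
}

impl Region {
    // Compress a raw pointer into an offset, rejecting anything outside
    // the region or too large to fit in a u32.
    fn code_ptr(&self, raw: *const u8) -> Option<CodePtr> {
        let offset = (raw as usize).checked_sub(self.base as usize)?;
        if offset >= self.size || offset > u32::MAX as usize {
            return None;
        }
        Some(CodePtr(offset as u32))
    }

    // Recover the full raw pointer; this is the step that needs the base.
    fn raw_ptr(&self, ptr: CodePtr) -> *const u8 {
        self.base.wrapping_add(ptr.0 as usize)
    }
}
```

Because every conversion back to a raw pointer goes through the region, a place that only holds a `CodePtr` and has no base at hand (as the message describes for `jit_return`) has to defer the conversion until a base is available.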
* YJIT: Inline basic Ruby methods (#8855) (Takashi Kokubun, 2023-11-07, 1 file, -4/+50)
| * YJIT: Inline basic Ruby methods
|
| * YJIT: Fix "InsnOut operand made it past register allocation"
|
|   checktype should not generate a useless instruction.
* YJIT: handle out of shape situation in gen_setinstancevariable (#8857) (Jean byroot Boussier, 2023-11-07, 1 file, -1/+5)
| If the VM ran out of shapes, `rb_shape_transition_shape_capa` might return `OBJ_TOO_COMPLEX_SHAPE`.
|
| Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
* YJIT: Always define method codegen table at boot (#8807) (Takashi Kokubun, 2023-11-02, 1 file, -97/+84)
|
* YJIT: Return Option from asm.compile() for has_dropped_bytes() (Alan Wu, 2023-10-19, 1 file, -64/+54)
| So that we get a reminder to check CodeBlock::has_dropped_bytes(). Internally, asm.compile() already checks it, and this patch just propagates it out to the caller with a `#[must_use]`.
|
| Code GC logic moved out one level in entry_stub_hit(), so the body can freely use `?`.
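A rough illustration of the pattern: a compile step that reports dropped bytes through a `#[must_use]` `Option`, so that callers either handle `None` or propagate it with `?`. The `CodeBlock` below is a toy with a fixed capacity, and `compile` and `gen_two` are simplified stand-ins rather than the real YJIT functions.

```rust
// Toy code buffer: writes past `capacity` are dropped and remembered.
struct CodeBlock {
    capacity: usize,
    bytes: Vec<u8>,
    dropped_bytes: bool,
}

impl CodeBlock {
    fn new(capacity: usize) -> Self {
        CodeBlock { capacity, bytes: Vec::new(), dropped_bytes: false }
    }

    fn write_byte(&mut self, byte: u8) {
        if self.bytes.len() < self.capacity {
            self.bytes.push(byte);
        } else {
            self.dropped_bytes = true;
        }
    }

    fn has_dropped_bytes(&self) -> bool {
        self.dropped_bytes
    }
}

// Returns the start offset of the emitted code, or `None` if any byte was
// dropped. `#[must_use]` makes the compiler warn when a caller ignores it.
#[must_use]
fn compile(cb: &mut CodeBlock, code: &[u8]) -> Option<usize> {
    let start = cb.bytes.len();
    for &byte in code {
        cb.write_byte(byte);
    }
    if cb.has_dropped_bytes() {
        None
    } else {
        Some(start)
    }
}

// With an Option-returning body, failure propagates through `?`.
fn gen_two(cb: &mut CodeBlock) -> Option<(usize, usize)> {
    let first = compile(cb, &[0x90, 0x90])?;
    let second = compile(cb, &[0xc3])?;
    Some((first, second))
}
```

Recovery logic such as code GC can then sit one level above a function like `gen_two`, which matches the arrangement the message describes for entry_stub_hit().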
* Revert "shape.h: Make attr_index_t uint8_t" (Katherine Oelsner, 2023-10-18, 1 file, -3/+3)
| This reverts commit e3afc212ec059525fe4e5387b2a3be920ffe0f0e.
* YJIT: Add --yjit-perf (#8697) (Takashi Kokubun, 2023-10-18, 1 file, -2/+62)
| Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Remove call to compile() on empty Assembler (Alan Wu, 2023-10-17, 1 file, -4/+1)
|
* YJIT: Add a few missing counters for send fallback (#8681) (Takashi Kokubun, 2023-10-17, 1 file, -2/+3)
|
* YJIT: Lookup IDs on boot instead of binding to them (Alan Wu, 2023-10-17, 1 file, -5/+5)
| Previously, the version-controlled `cruby_bindings.inc.rs` file contained the build-time artifact `id.h`, which nobu mentioned hinders the goal of having fewer magic numbers in the repository.
|
| Look up the IDs YJIT needs on boot. It costs cycles, but it's fine since YJIT only uses a handful of IDs at the moment. No perceptible degradation to boot time found in my testing.
* YJIT: Fallback opt_getconstant_path for const_missing (#8623) (Takashi Kokubun, 2023-10-13, 1 file, -8/+22)
| * YJIT: Fallback opt_getconstant_path for const_missing
|
| * Fix a comment [ci skip]
|
| * Remove a wrapper function
* YJIT: Fix argument clobbering in some block_arg+rest_param calls (#8647) (Alan Wu, 2023-10-13, 1 file, -42/+58)
| Previously, for block argument callsites with some specific argument count and callee local variable count combinations, YJIT ended up writing over arguments that are supposed to be collected into a rest parameter array unmodified.
|
| Detect when clobbering would happen and avoid it. Also, place the block handler after the stack overflow check, since it writes to new stack space.
|
| Reported-by: Takashi Kokubun <takashikkbn@gmail.com>
* shape.h: Make attr_index_t uint8_t (Jean Boussier, 2023-10-11, 1 file, -3/+3)
| Given `SHAPE_MAX_NUM_IVS 80`, we transition to TOO_COMPLEX well before we could overflow an 8-bit counter.
|
| This reduces the size of `rb_shape_t` from 32B to 24B.
|
| If we decide to raise `SHAPE_MAX_NUM_IVS` we can always increase that type again.
* Refactor rb_shape_transition_shape_capa to not accept capacity (Jean Boussier, 2023-10-10, 1 file, -3/+2)
| This way the growth factor is encapsulated, which allows rb_shape_transition_shape_capa to be smarter about ideal sizes.
* YJIT: Avoid writing return value to memory in `leave` (Alan Wu, 2023-10-05, 1 file, -17/+37)
| Previously, at the end of `leave` we did `*caller_cfp->sp = return_value`, like the interpreter. With future changes that leave the SP field uninitialized for C frames, this will become problematic. For cases like returning from `rb_funcall()`, the return value was written above the stack and never read anyway (callers use the copy in the return register).
|
| Leave the return value in a register at the end of `leave` and have the code at `cfp->jit_return` decide what to do with it. This avoids the unnecessary memory write mentioned above. For JIT-to-JIT returns, it goes through `asm.stack_push()` and benefits from register allocation for stack temporaries.
|
| Mostly flat on benchmarks, with maybe some marginal speed improvements.
|
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Stop spilling temps on jit_prepare_routine_call (#8581) (Takashi Kokubun, 2023-10-03, 1 file, -34/+49)
| YJIT: Remove spill_temps from jit_prepare_routine_call
* YJIT: Chain-guard opt_mult overflow (#8554) (Takashi Kokubun, 2023-09-29, 1 file, -2/+5)
| * YJIT: Chain-guard opt_mult overflow
|
| * YJIT: Support regenerating Jo after Mul
* YJIT: Use registers for passing C method arguments (#8538) (Takashi Kokubun, 2023-09-29, 1 file, -19/+23)
|
* YJIT: Remove obsoleted jit_rb_int_mul (#8539) (Takashi Kokubun, 2023-09-29, 1 file, -29/+0)
|
* YJIT: Fix object movement bug in iseq guard for invokeblock (Alan Wu, 2023-09-15, 1 file, -1/+1)
| Since the compile-time iseq used in the guard was not marked and updated during compaction, a runtime value reusing the address could falsely pass the guard.
|
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Skip Insn::Comment and format! if disasm is disabled (#8441) (Takashi Kokubun, 2023-09-14, 1 file, -126/+126)
| * YJIT: Skip Insn::Comment and format! if disasm is disabled
|
|   Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
|
| * YJIT: Get rid of asm.comment
|
| ---------
|
| Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
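One way to avoid paying for `format!` when comments are switched off is to wrap the call in a macro that checks a flag before evaluating its arguments. The sketch below assumes a runtime `comments_enabled` field and a toy `Assembler`; how the real change detects whether disassembly is enabled is not shown in this log, and the macro name here is chosen for illustration.

```rust
// Toy assembler that only collects comment strings.
struct Assembler {
    comments_enabled: bool,
    comments: Vec<String>,
}

// The format arguments sit inside the `if`, so when comments are disabled
// neither `format!` nor any expression passed to it is evaluated.
macro_rules! asm_comment {
    ($asm:expr, $($fmt:tt)*) => {
        if $asm.comments_enabled {
            $asm.comments.push(format!($($fmt)*));
        }
    };
}
```

A plain method such as `asm.comment(&format!(...))` cannot offer this, because its argument is built before the method gets a chance to check anything. That is a plausible reason for moving from a method to a macro, though the log itself does not spell out the reasoning.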
* YJIT: Remove UTF-8 BOM [ci skip] (Alan Wu, 2023-09-14, 1 file, -1/+1)
| /yjit/src/backend/x86_64/mod.rs is also UTF-8 and it doesn't have the marker. The standard recommends against it, so remove it.
* Make Kernel#lambda raise when given non-literal block (Alan Wu, 2023-09-12, 1 file, -3/+0)
| Previously, Kernel#lambda returned a non-lambda proc when given a non-literal block and issued a warning under the `:deprecated` category. With this change, Kernel#lambda will always return a lambda proc, if it returns without raising.
|
| Due to interactions with block passing optimizations, we previously had two separate code paths for detecting whether Kernel#lambda got a literal block. This change allows us to remove one path, the hack done with rb_control_frame_t::block_code introduced in 85a337f for supporting situations where Kernel#lambda returned a non-lambda proc.
|
| [Feature #19777]
|
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* Add `String#getbyte` YJIT implementation (#8397) (Ian Candy, 2023-09-07, 1 file, -0/+29)
| * Add getbyte JIT implementation
|
|   Adds an implementation for String#getbyte for YJIT, along with a bootstrap test. This should be helpful for pure Ruby implementations and to avoid unneeded allocations.
|
|   Co-authored-by: John Hawthorn <jhawthorn@github.com>
|
| * Skip the getbyte test for RJIT for now
|
| ---------
|
| Co-authored-by: John Hawthorn <jhawthorn@github.com>
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Decrease IVAR_MAX_DEPTH to 8 (#8398) (Takashi Kokubun, 2023-09-07, 1 file, -4/+4)
|
* YJIT: Decrease SEND_MAX_DEPTH to 5 (#8390) (Takashi Kokubun, 2023-09-07, 1 file, -17/+14)
|
* Remove function call for String#bytesize (#8389) (Aaron Patterson, 2023-09-07, 1 file, -2/+13)
| * Remove function call for String#bytesize
|
|   String size is stored in a consistent location, so we can eliminate the function call.
|
| * Update yjit/src/codegen.rs
|
|   Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| ---------
|
| Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Different comment when only setting ec->cfp [ci skip] (Alan Wu, 2023-09-06, 1 file, -1/+2)
|
* YJIT: Make compiled_* stats available by default (#8379) (Takashi Kokubun, 2023-09-06, 1 file, -12/+3)
| * YJIT: Make compiled_* stats available by default
|
| * Update comment about default counters [ci skip]
|
|   Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
|
| ---------
|
| Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* YJIT: Handle getblockparamproxy with ifunc (John Hawthorn, 2023-08-31, 1 file, -10/+14)
| getblockparamproxy for "ifunc" behaves identically to iseq, in just pushing rb_block_param_proxy.
* YJIT: shrink Context from 29 to 21 bytes by reducing space used by TempMapping (#8321) (Maxime Chevalier-Boisvert, 2023-08-30, 1 file, -15/+17)
| * YJIT: merge tempmapping and temp types into a single-byte encoding
|
|   YJIT: refactor to shrink Context by 8 bytes
|
| * Add tests, fix bug in TempMapping::map_to_local()
|
| * Update yjit/src/core.rs
|
|   Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| * Update yjit/src/core.rs
|
|   Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| * Fewer transmutes where `as` would suffice. Also repr(u8)
|
| * Update yjit/src/core.rs
|
|   Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| * Update yjit/src/core.rs
|
|   Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| * Update yjit/src/core.rs
|
|   Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| ---------
|
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
| Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* YJIT: Remove Type::CArray and limit use of Type::CString (Alan Wu, 2023-08-28, 1 file, -7/+12)
| These types are essentially claims about what `RBASIC_CLASS(obj)` returns. The field changes with singleton class creation, but we didn't account for that previously and elided guards where we actually needed them.
|
| Found running ruby/spec with --yjit-verify-ctx. The assertion interface makes extensive use of singleton classes.
* YJIT: Refactor to use Option<BlockHandler> in SpecVal (Alan Wu, 2023-08-24, 1 file, -17/+9)
| We pass the block around as `Option<BlockHandler>`; having SpecVal match that simplifies the code that matches on the `None` case.
* YJIT: Move block handler SpecVal variants into BlockHandler (Alan Wu, 2023-08-24, 1 file, -25/+27)
| A refactor so that the variants correspond to branches in vm_caller_setup_arg_block().
* YJIT: Implement VM_CALL_ARGS_BLOCKARG with Proc for ISeq calls (Alan Wu, 2023-08-23, 1 file, -18/+60)
| Rack uses this. Speculate that the `obj` in `the_call(&obj)` will be a proc when the compile-time sample is a proc.
|
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
| Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
| Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
* Fix guard-heap upgrades (#8264) (Aaron Patterson, 2023-08-23, 1 file, -3/+3)
| * Fix guard-heap upgrades
|
|   `getinstancevariable` was generating more heap guards than I thought. It turns out that the upgrade code has a bug in it. Given the following Ruby code:
|
|   ```ruby
|   class Foo
|     def initialize
|       @a = 1
|       @b = 1
|     end
|
|     def foo
|       [@a, @b]
|     end
|   end
|
|   foo = Foo.new
|   10.times { foo.foo }
|   puts RubyVM::YJIT.disasm Foo.instance_method(:foo)
|   ```
|
|   Before this commit, the machine code was like this:
|
|   ```
|   == BLOCK 1/4, ISEQ RANGE [0,3), 36 bytes ======================
|     # Insn: 0000 getinstancevariable (stack_size: 0)
|     0x5562fb831023: mov rax, qword ptr [r13 + 0x18]
|     # guard object is heap
|     0x5562fb831027: test al, 7
|     0x5562fb83102a: jne 0x5562fb833080
|     0x5562fb831030: test rax, rax
|     0x5562fb831033: je 0x5562fb833080
|     # guard shape
|     0x5562fb831039: cmp dword ptr [rax + 4], 0x18
|     0x5562fb83103d: jne 0x5562fb833062
|     # reg_temps: 00000000 -> 00000001
|     0x5562fb831043: mov rsi, qword ptr [rax + 0x10]
|
|   == BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
|   == BLOCK 3/4, ISEQ RANGE [3,6), 36 bytes ======================
|     # regenerate_branch
|     # Insn: 0003 getinstancevariable (stack_size: 1)
|     # regenerate_branch
|     0x5562fb831047: mov rax, qword ptr [r13 + 0x18]
|     # guard object is heap
|     0x5562fb83104b: test al, 7
|     0x5562fb83104e: jne 0x5562fb8330db
|     0x5562fb831054: test rax, rax
|     0x5562fb831057: je 0x5562fb8330db
|     # guard shape
|     0x5562fb83105d: cmp dword ptr [rax + 4], 0x18
|     0x5562fb831061: jne 0x5562fb8330ba
|     # reg_temps: 00000001 -> 00000011
|     0x5562fb831067: mov rdi, qword ptr [rax + 0x18]
|   ```
|
|   After this commit, the machine code has fewer guards for `self`:
|
|   ```
|   == BLOCK 1/4, ISEQ RANGE [0,3), 36 bytes ======================
|     # Insn: 0000 getinstancevariable (stack_size: 0)
|     0x55cb5db5f023: mov rax, qword ptr [r13 + 0x18]
|     # guard object is heap
|     0x55cb5db5f027: test al, 7
|     0x55cb5db5f02a: jne 0x55cb5db61080
|     0x55cb5db5f030: test rax, rax
|     0x55cb5db5f033: je 0x55cb5db61080
|     # guard shape
|     0x55cb5db5f039: cmp dword ptr [rax + 4], 0x18
|     0x55cb5db5f03d: jne 0x55cb5db61062
|     # reg_temps: 00000000 -> 00000001
|     0x55cb5db5f043: mov rsi, qword ptr [rax + 0x10]
|
|   == BLOCK 2/4, ISEQ RANGE [3,6), 0 bytes =======================
|   == BLOCK 3/4, ISEQ RANGE [3,6), 18 bytes ======================
|     # regenerate_branch
|     # Insn: 0003 getinstancevariable (stack_size: 1)
|     # regenerate_branch
|     0x55cb5db5f047: mov rax, qword ptr [r13 + 0x18]
|     # guard shape
|     0x55cb5db5f04b: cmp dword ptr [rax + 4], 0x18
|     0x55cb5db5f04f: jne 0x55cb5db610ba
|     # reg_temps: 00000001 -> 00000011
|     0x55cb5db5f055: mov rdi, qword ptr [rax + 0x18]
|   ```
|
|   Co-Authored-By: Takashi Kokubun <takashikkbn@gmail.com>
|
| * Fix array/string guards as well
|
| ---------
|
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
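The two tests in the "guard object is heap" sequence, and the notion of upgrading the known type so the guard is not emitted twice, can be modelled as below. `KnownType`, `Ctx`, and `guard_self_is_heap` are invented names for this sketch; YJIT's actual context tracking is richer than a single enum.

```rust
// Mirrors the disassembly: `test al, 7; jne exit` rejects any word with a
// low tag bit set, and `test rax, rax; je exit` rejects the all-zero word.
fn passes_heap_guard(value: u64) -> bool {
    let has_tag_bits = value & 0x7 != 0;
    let is_zero = value == 0;
    !has_tag_bits && !is_zero
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum KnownType {
    Unknown,
    Heap,
}

// Hypothetical compile-time context: what is already known about `self`.
struct Ctx {
    self_type: KnownType,
}

// Emits the guard only when `self` is not yet known to be a heap object,
// then records ("upgrades") that knowledge for later instructions.
fn guard_self_is_heap(ctx: &mut Ctx, emitted: &mut Vec<&'static str>) {
    if ctx.self_type == KnownType::Heap {
        return;
    }
    emitted.push("test al, 7");
    emitted.push("jne side_exit");
    emitted.push("test rax, rax");
    emitted.push("je side_exit");
    ctx.self_type = KnownType::Heap;
}
```

If the upgrade step fails to stick, every later `getinstancevariable` on `self` emits the four guard instructions again. That would be consistent with BLOCK 3/4 shrinking from 36 to 18 bytes once the fix is applied.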
* YJIT: Fix return type of Integer#/ with T_FIXNUM inputs (Alan Wu, 2023-08-18, 1 file, -1/+4)
| Issue found by running ruby/spec with `--yjit-verify-ctx`. Thanks!
* YJIT: implement fast path for integer multiplication in opt_mult (#8204) (Maxime Chevalier-Boisvert, 2023-08-18, 1 file, -2/+36)
| * YJIT: implement fast path for integer multiplication in opt_mult
|
| * Update yjit/src/codegen.rs
|
|   Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
| * Implement mul with overflow checking on arm64
|
| * Fix missing semicolon
|
| * Add arm splitting for lshift, rshift, urshift
|
| ---------
|
| Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
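A fast path for fixnum multiplication with overflow checking might look like the following when written as ordinary Rust instead of emitted machine code. It assumes the usual 64-bit tagging where the integer `n` is stored as `(n << 1) | 1`; the function names are made up, and the exact instruction sequence YJIT emits is not shown in this log.

```rust
// Assumed fixnum tagging: value n is represented as (n << 1) | 1.
fn fixnum_tag(n: i64) -> i64 {
    (n << 1) | 1
}

fn is_fixnum(v: i64) -> bool {
    v & 1 == 1
}

// Multiplies two tagged fixnums. Returns `None` where generated code would
// leave the fast path: a non-fixnum operand, or an overflowing product.
fn fixnum_mul(a: i64, b: i64) -> Option<i64> {
    if !is_fixnum(a) || !is_fixnum(b) {
        return None;
    }
    // With a = 2x + 1 and b = 2y + 1: (a >> 1) is x and (b - 1) is 2y,
    // so the product is 2xy, which is tag(x * y) with the tag bit missing.
    let lhs = a >> 1;
    let rhs = b - 1;
    // 2xy overflows i64 exactly when x * y does not fit in a fixnum.
    let product = lhs.checked_mul(rhs)?;
    Some(product | 1)
}
```

Untagging only one operand keeps the result pre-shifted, so a single overflow check on the hardware multiply doubles as the fixnum range check. The `None` case corresponds to what the later "Chain-guard opt_mult overflow" entry handles in the real code generator.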
* YJIT: Fix String#<< return type (Alan Wu, 2023-08-17, 1 file, -2/+2)
| We previously falsely asserted that String#<< always returns a ::String instance. Issue was discovered on CI with `--yjit-verify-ctx`.
|
| https://github.com/ruby/ruby/actions/runs/5893760435/job/15986002531
* Add note about rb_f_notimplement [ci skip] (Alan Wu, 2023-08-17, 1 file, -1/+1)
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Fix Kernel#respond_to? handling of rb_f_notimplement (Alan Wu, 2023-08-17, 1 file, -11/+19)
| We should return false for this type of special method, but previously did not. Was reproducible with:
|
|   make test-all TESTS=../test/-ext-/test_notimplement.rb RUN_OPTS='--yjit-call-threshold=1'
* YJIT: implement side chain fallback for setlocal to avoid exiting (#8227) (Maxime Chevalier-Boisvert, 2023-08-17, 1 file, -9/+40)
| * YJIT: implement side chain fallback for setlocal to avoid exiting
|
| * Update yjit/src/codegen.rs
|
|   Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| ---------
|
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Optional parameter rework and bugfix (#8220) (Alan Wu, 2023-08-15, 1 file, -143/+143)
| * YJIT: Fix splatting empty array with rest param
|
| * YJIT: Rework optional parameter handling to fix corner case
|
|   The old code had a few unintuitive parts. The starting PC of the callee was set in different places; `num_param`, which one would assume to be static for a particular callee, seemingly tallied to different amounts depending on what the caller passed; `opts_filled_with_splat` was greater than zero even when the opts were not filled by items in the splat array.
|
|   Functionally, the bits that let the callee know which keyword parameters are unspecified were not passed properly when there are optional parameters and a rest parameter, and the optional parameters are all filled.
|
|   Make `num_param` non-mut and use parameter information in the callee iseq as-is. Move local variable nil fill and placing of the rest array out of `gen_push_frame()` as they are only ever relevant for iseq calls. Always place the rest array at `lead_num + opt_num` to fix the previously buggy situation.
|
| * YJIT: Compile splat calls to iseqs with rest params
|
|   Test interactions with optional parameters.
* YJIT: Chain guard classes on instance_of (#8209) (Takashi Kokubun, 2023-08-14, 1 file, -2/+9)
|
* YJIT: Implement GET_BLOCK_HANDLER() for invokesuper (#8206) (Takashi Kokubun, 2023-08-11, 1 file, -65/+68)
|