path: root/yjit/src/cruby.rs
Commit message | Author | Age | Files | Lines
* YJIT: Inline basic Ruby methods (#8855) | Takashi Kokubun | 2023-11-07 | 1 | -0/+6
| * YJIT: Inline basic Ruby methods
|
| * YJIT: Fix "InsnOut operand made it past register allocation"
|
|   checktype should not generate a useless instruction.
* YJIT: Add --yjit-perf (#8697) | Takashi Kokubun | 2023-10-18 | 1 | -1/+0
| Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Lookup IDs on boot instead of binding to them | Alan Wu | 2023-10-17 | 1 | -0/+48
| Previously, the version-controlled `cruby_bindings.inc.rs` file contained
| the build-time artifact `id.h`, which nobu mentioned hinders the goal of
| having fewer magic numbers in the repository. Look up the IDs YJIT needs
| on boot. It costs cycles, but it's fine since YJIT only uses a handful of
| IDs at the moment. No perceptible degradation to boot time found in my
| testing.
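The boot-time lookup described in this message can be sketched as below. This is a minimal illustration under stated assumptions, not YJIT's actual code: `Interner`, `JitIds`, `init_ids`, and `ids` are hypothetical names, and `Interner` is only a stand-in for the VM's symbol table.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical stand-in for the VM's symbol table: maps names to numeric IDs.
pub struct Interner {
    next: u64,
    ids: HashMap<String, u64>,
}

impl Interner {
    pub fn new() -> Self {
        Interner { next: 1, ids: HashMap::new() }
    }

    /// Return the ID for `name`, assigning a fresh one on first use.
    pub fn intern(&mut self, name: &str) -> u64 {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = self.next;
        self.next += 1;
        self.ids.insert(name.to_string(), id);
        id
    }
}

/// The handful of IDs the JIT needs (the selection here is illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct JitIds {
    pub to_s: u64,
    pub eq: u64,
}

static JIT_IDS: OnceLock<JitIds> = OnceLock::new();

/// Look the IDs up once at boot instead of hard-coding their numeric values.
pub fn init_ids(interner: &mut Interner) -> &'static JitIds {
    JIT_IDS.get_or_init(|| JitIds {
        to_s: interner.intern("to_s"),
        eq: interner.intern("=="),
    })
}

/// Later readers pay only for a load from the already-initialized table.
pub fn ids() -> &'static JitIds {
    JIT_IDS.get().expect("init_ids must run during boot")
}
```

The trade-off matches the one the message names: a few lookups at startup in exchange for keeping generated numeric constants out of the repository.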
* Remove function call for String#bytesize (#8389) | Aaron Patterson | 2023-09-07 | 1 | -1/+0
| * Remove function call for String#bytesize
|
|   String size is stored in a consistent location, so we can eliminate the
|   function call.
|
| * Update yjit/src/codegen.rs
|
|   Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| ---------
|
| Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
| Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: Silence Clippy for bindgen generated code | Alan Wu | 2023-09-05 | 1 | -1/+1
| New Clippy lint in 1.72.0 is breaking our build as GitHub has updated
| their image. No point hearing about lints from generated code we don't
| manually write.
* YJIT: Compile exception handlers (#8171) | Takashi Kokubun | 2023-08-08 | 1 | -0/+1
| Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Remove __bp__ and speed-up bmethod calls (#8060) | Alan Wu | 2023-07-17 | 1 | -3/+2
| Remove rb_control_frame_t::__bp__ and optimize bmethod calls
|
| This commit removes the __bp__ field from rb_control_frame_t. It was
| introduced to help MJIT, but since MJIT was replaced by RJIT, we can use
| vm_base_ptr() to compute it from the SP of the previous control frame
| instead. Removing the field avoids needing to set it up when pushing new
| frames.
|
| Simply removing __bp__ would cause crashes since RJIT and YJIT used a
| slightly different stack layout for bmethod calls than the interpreter.
| At the moment of the call, the two layouts looked as follows:
|
|                    ┌────────────┐    ┌────────────┐
|                    │ frame_base │    │ frame_base │
|                    ├────────────┤    ├────────────┤
|                    │    ...     │    │    ...     │
|                    ├────────────┤    ├────────────┤
|                    │    args    │    │    args    │
|                    ├────────────┤    └────────────┘<─prev_frame_sp
|                    │  receiver  │
|     prev_frame_sp─>└────────────┘
|                      RJIT & YJIT       interpreter
|
| Essentially, vm_base_ptr() needs to compute the address to frame_base
| given prev_frame_sp in the diagrams. The presence of the receiver
| created an off-by-one situation.
|
| Make the interpreter use the layout the JITs use for iseq-to-iseq
| bmethod calls. Doing so removes unnecessary argument shifting and
| vm_exec_core() re-entry from the interpreter, yielding a speed
| improvement visible through `benchmark/vm_defined_method.yml`:
|
|     patched:   7578743.1 i/s
|      master:   4796596.3 i/s - 1.58x  slower
|
| C-to-iseq bmethod calls now store one more VALUE than before, but that
| should have negligible impact on overall performance.
|
| Note that re-entering vm_exec_core() used to be necessary for firing
| TracePoint events, but that's no longer the case since
| 9121e57a5f50bc91bae48b3b91edb283bf96cb6b.
|
| Closes ruby/ruby#6428
* Implement Struct on VWA | Peter Zhu | 2023-06-05 | 1 | -1/+1
| The benchmark results show that this feature has either a positive or no
| impact on performance. The memory usage is also mostly unchanged, except
| in hexapdf, where there is a decrease in RSS.
|
| --------------  -----------  ----------  ---------  -----------  ----------  ---------  --------------  -------------
| bench           master (ms)  stddev (%)  RSS (MiB)  branch (ms)  stddev (%)  RSS (MiB)  branch 1st itr  master/branch
| activerecord    70.8         2.2         56.0       71.7         2.2         56.0       0.99            0.99
| erubi_rails     20.5         13.6        94.7       20.5         14.3        94.2       0.93            1.00
| hexapdf         2541.0       0.7         212.8      2544.4       0.7         203.4      1.00            1.00
| liquid-c        65.6         0.3         38.9       65.3         0.3         38.9       1.01            1.01
| liquid-compile  63.7         0.3         34.6       61.1         0.2         34.6       1.04            1.04
| liquid-render   163.1        0.1         37.1       163.3        0.1         37.1       1.00            1.00
| mail            139.3        0.1         50.5       137.0        0.1         50.1       0.99            1.02
| psych-load      2065.7       0.1         36.9       2068.2       0.1         37.3       1.00            1.00
| railsbench      2034.6       0.5         103.9      2031.9       0.5         103.8      1.02            1.00
| ruby-lsp        65.3         3.1         89.8       66.2         3.0         89.7       1.01            0.99
| sequel          73.2         1.0         40.3       73.4         1.0         40.3       1.00            1.00
| --------------  -----------  ----------  ---------  -----------  ----------  ---------  --------------  -------------
* YJIT: Add codegen for Integer methods (#7665) | Takashi Kokubun | 2023-04-05 | 1 | -1/+3
| * YJIT: Add codegen for Integer methods
|
| * YJIT: Update dependencies
|
| * YJIT: Fix Integer#[] for argc=2
* Revert "YJIT: Suppress unnecessary `unsafe` block (GH-7634)" | Alan Wu | 2023-04-05 | 1 | -1/+1
| This reverts commit 9e678cdbd054f78576a8f21b3f97cccc395ade22.
|
| Without the `unsafe` annotations, the SAFETY comments make less sense.
| I want to keep the SAFETY comments.
* YJIT: Suppress unnecessary `unsafe` block (#7634) | Nobuyoshi Nakada | 2023-03-31 | 1 | -1/+1
|
* YJIT: Support entry for multiple PCs per ISEQ (GH-7535) | Takashi Kokubun | 2023-03-17 | 1 | -0/+7
|
* YJIT: log the names of methods we call to in disasm (#7231) | Maxime Chevalier-Boisvert | 2023-02-02 | 1 | -1/+14
| * YJIT: log the names of methods we call to in disasm
|
| * Assert that pointer is not null
|
| * Handle case where UTF8 conversion not possible
* Fix typos in YJIT [ci skip] | Alan Wu | 2023-02-02 | 1 | -1/+1
|
* YJIT: Factor out VALUE_BITS = (8 * SIZE_OF_VALUE as u8) | Alan Wu | 2023-01-13 | 1 | -0/+1
| Using a constant shows intention better and is less noisy. It always
| took me a second to parse the long expression.
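As a small illustration of why the named constant reads better, here is a sketch assuming `VALUE` is pointer-sized. `Value` and `top_bit_mask` are hypothetical names introduced for the example; only the `VALUE_BITS` expression comes from the commit subject.

```rust
use std::mem::size_of;

/// Assumption for this sketch: a VALUE is as wide as a pointer.
pub type Value = usize;

pub const SIZE_OF_VALUE: usize = size_of::<Value>();

/// `as` binds tighter than `*`, so this is 8 * (SIZE_OF_VALUE as u8).
pub const VALUE_BITS: u8 = 8 * SIZE_OF_VALUE as u8;

/// Hypothetical caller: a mask selecting the most significant bit of a VALUE.
/// Without the constant this would spell out the whole expression inline.
pub fn top_bit_mask() -> Value {
    1 << (VALUE_BITS - 1)
}
```

On a 64-bit target `VALUE_BITS` evaluates to 64, and call sites no longer repeat the cast.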
* Enable `clippy` checks for yjit in CI (#7093) | Ian Ker-Seymer | 2023-01-12 | 1 | -1/+1
| * Add job to check clippy lints in CI
|
| * Address all remaining clippy lints
|
| * Check lints on arm64 as well
|
| * Apply latest clippy lints
|
| * Do not exit 0 on clippy warnings
* Transition complex objects to "too complex" shape | Jemma Issroff | 2022-12-15 | 1 | -0/+4
| When an object becomes "too complex" (in other words it has too many
| variations in the shape tree), we transition it to use a "too complex"
| shape and use a hash for storing instance variables.
|
| Without this patch, there were rare cases where shape tree growth could
| "explode" and cause performance degradation on what would otherwise have
| been cached fast paths.
|
| This patch puts a limit on shape tree growth, and gracefully degrades in
| the rare case where there could be a factorial growth in the shape tree.
|
| For example:
|
| ```ruby
| class NG; end
|
| HUGE_NUMBER.times do
|   NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
| end
| ```
|
| We consider objects to be "too complex" when the object's class has more
| than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and
| the object introduces a new variation (a new leaf node) associated with
| that class.
|
| For example, new variations on instances of the following class would be
| considered "too complex" because those instances create more than 8
| leaves in the shape tree:
|
| ```ruby
| class Foo; end
| 9.times { Foo.new.instance_variable_set(:"@uniq_#{_1}", 1) }
| ```
|
| However, the following class is *not* too complex because it only has
| one leaf in the shape tree:
|
| ```ruby
| class Foo
|   def initialize
|     @a = @b = @c = @d = @e = @f = @g = @h = @i = nil
|   end
| end
| 9.times { Foo.new }
| ```
|
| This case is rare, so we don't expect this change to impact performance
| of most applications, but it needs to be handled.
|
| Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
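The variation limit described above can be modelled roughly as follows. This is a simplified sketch, not CRuby's implementation: `ClassShapes`, `Storage`, and `observe` are hypothetical names, each distinct ordered ivar list is treated as one leaf, and the exact boundary (the ninth new variation falls back) is an assumption read from the `9.times` example.

```rust
use std::collections::HashSet;

/// Limit quoted in the commit message.
pub const SHAPE_MAX_VARIATIONS: usize = 8;

/// How an object's instance variables end up being stored in this model.
#[derive(Debug, PartialEq, Eq)]
pub enum Storage {
    /// Regular shape-based storage (cacheable fast path).
    Shaped,
    /// Fallback: instance variables kept in a hash table.
    TooComplex,
}

/// Per-class record of the distinct leaf variations seen so far.
#[derive(Default)]
pub struct ClassShapes {
    leaves: HashSet<Vec<String>>,
}

impl ClassShapes {
    /// Observe an object whose ivars were set in the given order.
    pub fn observe(&mut self, ivars: &[&str]) -> Storage {
        let key: Vec<String> = ivars.iter().map(|s| s.to_string()).collect();
        if self.leaves.contains(&key) {
            // Known variation: the existing leaf is reused.
            return Storage::Shaped;
        }
        if self.leaves.len() >= SHAPE_MAX_VARIATIONS {
            // A new variation beyond the limit degrades gracefully.
            return Storage::TooComplex;
        }
        self.leaves.insert(key);
        Storage::Shaped
    }
}
```

In this model, nine objects that each set a differently named ivar push the last one into `TooComplex`, while nine objects that all set the same ivars in the same order share one leaf and stay `Shaped`.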
* bail on compilation if the comptime receiver is frozen | Aaron Patterson | 2022-12-02 | 1 | -0/+4
|
* implement IV writes | Aaron Patterson | 2022-12-02 | 1 | -1/+19
|
* MJIT: Use a String buffer in builtin compilers | Takashi Kokubun | 2022-11-27 | 1 | -7/+0
| instead of FILE*.
|
| Using C.fprintf is slower than String manipulation on memory. I'm going
| to change the way MJIT writes files, and this is a prerequisite for it.
* YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets (#6733) | Takashi Kokubun | 2022-11-15 | 1 | -0/+5
| * YJIT: Always encode Opnd::Value in 64 bits on x86_64 for GC offsets
|
|   Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
|
| * Introduce heap_object_p
|
| * Leave original mov intact
|
| * Remove unneeded branches
|
| * Add a test for movabs
|
| Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* YJIT: Support invokeblock (#6640) | Takashi Kokubun | 2022-11-02 | 1 | -2/+18
| * YJIT: Support invokeblock
|
| * Update yjit/src/backend/arm64/mod.rs
|
| * Update yjit/src/codegen.rs
|
| Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* YJIT: incorporate ruby_special_consts | Nobuyoshi Nakada | 2022-10-20 | 1 | -18/+13
|
* Move "special consts" so `Qundef` and `Qnil` differ just 1 bit | Nobuyoshi Nakada | 2022-10-20 | 1 | -2/+2
|
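One benefit of two constants differing in a single bit is that both can be tested with one mask-and-compare. The sketch below assumes the post-change values are `Qnil = 0x04` and `Qundef = 0x24` (recalled from CRuby's `special_consts.h`, not stated in this log, so treat them as assumptions); `nil_or_undef_p` is a name chosen for the example.

```rust
/// Assumption: VALUE is pointer-sized.
pub type Value = usize;

// Assumed special-constant encodings; verify against special_consts.h.
pub const QFALSE: Value = 0x00;
pub const QNIL: Value = 0x04;
pub const QTRUE: Value = 0x14;
pub const QUNDEF: Value = 0x24;

/// True for exactly Qnil and Qundef, using a single mask and comparison.
pub fn nil_or_undef_p(v: Value) -> bool {
    // The one bit in which the two constants differ.
    const DIFF: Value = QNIL ^ QUNDEF;
    // Clear that bit, then compare against the bits the two have in common.
    (v & !DIFF) == (QNIL & QUNDEF)
}
```

With constants that differed in several bits, the same check would need two comparisons and a branch or an OR.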
* YJIT: fold the "asm_comments" feature into "disasm" (#6591) | Alan Wu | 2022-10-19 | 1 | -1/+1
| Previously, enabling only "disasm" didn't actually build. Since these
| two features are closely related and we don't really use one without the
| other, let's simplify and merge the two features together.
* fixes more clippy warnings (#6543) | Jimmy Miller | 2022-10-13 | 1 | -0/+1
| * fixes more clippy warnings
|
| * Fix x86 c_callable to have doc_strings
* Implement optimize send in yjit (#6488) | Jimmy Miller | 2022-10-11 | 1 | -0/+5
| * Implement optimize send in yjit
|
|   This successfully makes all our benchmarks exit way less for optimize
|   send reasons. It makes some benchmarks faster, but not by as much as
|   I'd like. I think this implementation works, but there are definitely
|   more optimal arrangements. For example, what if we compiled send to a
|   jump table? That seems like perhaps the most optimal we could do, but
|   not obvious (to me) how to implement given our current setup.
|
|   Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
| * Attempt at fixing the issues raised by @XrXr
|
| * fix allowlist
|
| * returns 0 instead of nil when not found
|
| * remove comment about encoding exception
|
| * Fix up c changes
|
| * Update assert
|
|   Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
| * get rid of unneeded code and fix the flags
|
| * Apply suggestions from code review
|
|   Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
| * rename and fix typo
|
| Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* Revert "Revert "This commit implements the Object Shapes technique in CRuby."" | Jemma Issroff | 2022-10-11 | 1 | -2/+10
| This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
* Revert "This commit implements the Object Shapes technique in CRuby." | Aaron Patterson | 2022-09-30 | 1 | -10/+2
| This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
* A bunch of clippy auto fixes for yjit (#6476) | Jimmy Miller | 2022-09-30 | 1 | -1/+1
|
* This commit implements the Object Shapes technique in CRuby. | Jemma Issroff | 2022-09-28 | 1 | -2/+10
| Object Shapes is used for accessing instance variables and representing
| the "frozenness" of objects. Object instances have a "shape" and the
| shape represents some attributes of the object (currently which instance
| variables are set and the "frozenness"). Shapes form a tree data
| structure, and when a new instance variable is set on an object, that
| object "transitions" to a new shape in the shape tree. Each shape has an
| ID that is used for caching. The shape structure is independent of
| class, so objects of different types can have the same shape.
|
| For example:
|
| ```ruby
| class Foo
|   def initialize
|     # Starts with shape id 0
|     @a = 1 # transitions to shape id 1
|     @b = 1 # transitions to shape id 2
|   end
| end
|
| class Bar
|   def initialize
|     # Starts with shape id 0
|     @a = 1 # transitions to shape id 1
|     @b = 1 # transitions to shape id 2
|   end
| end
|
| foo = Foo.new # `foo` has shape id 2
| bar = Bar.new # `bar` has shape id 2
| ```
|
| Both `foo` and `bar` instances have the same shape because they both set
| instance variables of the same name in the same order.
|
| This technique can help to improve inline cache hits as well as generate
| more efficient machine code in JIT compilers.
|
| This commit also adds some methods for debugging shapes on objects. See
| `RubyVM::Shape` for more details.
|
| For more context on Object Shapes, see [Feature: #18776]
|
| Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
| Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
| Co-Authored-By: John Hawthorn <john@hawthorn.email>
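The shape tree from this message can be sketched as a small data structure. This is an illustrative model only, not CRuby's implementation: `ShapeTree`, `Shape`, and `transition` are hypothetical names, and frozenness is left out.

```rust
use std::collections::HashMap;

pub type ShapeId = usize;

/// One node in the tree: outgoing edges are keyed by instance variable name.
struct Shape {
    edges: HashMap<String, ShapeId>,
}

pub struct ShapeTree {
    shapes: Vec<Shape>,
}

impl ShapeTree {
    /// Every object starts at the root shape, id 0.
    pub const ROOT: ShapeId = 0;

    pub fn new() -> Self {
        ShapeTree { shapes: vec![Shape { edges: HashMap::new() }] }
    }

    /// Setting `ivar` on an object with shape `from` moves it to a child
    /// shape, reusing the child if that transition has been seen before.
    pub fn transition(&mut self, from: ShapeId, ivar: &str) -> ShapeId {
        if let Some(&next) = self.shapes[from].edges.get(ivar) {
            return next;
        }
        let next = self.shapes.len();
        self.shapes.push(Shape { edges: HashMap::new() });
        self.shapes[from].edges.insert(ivar.to_string(), next);
        next
    }
}
```

Because edges are keyed only by ivar name, two unrelated classes that set `@a` then `@b` land on the same shape id, which is what lets a cache keyed on shape id serve both.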
* Revert this until we can figure out WB issues or remove shapes from GC | Aaron Patterson | 2022-09-26 | 1 | -10/+2
| Revert "* expand tabs. [ci skip]"
|
| This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275.
|
| Revert "This commit implements the Object Shapes technique in CRuby."
|
| This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
* This commit implements the Object Shapes technique in CRuby. | Jemma Issroff | 2022-09-26 | 1 | -2/+10
| Object Shapes is used for accessing instance variables and representing
| the "frozenness" of objects. Object instances have a "shape" and the
| shape represents some attributes of the object (currently which instance
| variables are set and the "frozenness"). Shapes form a tree data
| structure, and when a new instance variable is set on an object, that
| object "transitions" to a new shape in the shape tree. Each shape has an
| ID that is used for caching. The shape structure is independent of
| class, so objects of different types can have the same shape.
|
| For example:
|
| ```ruby
| class Foo
|   def initialize
|     # Starts with shape id 0
|     @a = 1 # transitions to shape id 1
|     @b = 1 # transitions to shape id 2
|   end
| end
|
| class Bar
|   def initialize
|     # Starts with shape id 0
|     @a = 1 # transitions to shape id 1
|     @b = 1 # transitions to shape id 2
|   end
| end
|
| foo = Foo.new # `foo` has shape id 2
| bar = Bar.new # `bar` has shape id 2
| ```
|
| Both `foo` and `bar` instances have the same shape because they both set
| instance variables of the same name in the same order.
|
| This technique can help to improve inline cache hits as well as generate
| more efficient machine code in JIT compilers.
|
| This commit also adds some methods for debugging shapes on objects. See
| `RubyVM::Shape` for more details.
|
| For more context on Object Shapes, see [Feature: #18776]
|
| Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
| Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
| Co-Authored-By: John Hawthorn <john@hawthorn.email>
* Initial support for VM_CALL_ARGS_SPLAT (#6341) | Jimmy Miller | 2022-09-14 | 1 | -0/+2
| * Initial support for VM_CALL_ARGS_SPLAT
|
|   This implements support for calls with splat (*) for some methods. In
|   benchmarks this made very little difference for most benchmarks, but a
|   large difference for binarytrees. Looking at side exits, many
|   benchmarks now don't exit for splat, but exit for some other reason.
|   Binarytrees however had a number of calls that used splat args that
|   are now much faster. In my non-scientific benchmarking this made splat
|   args performance on par with not using splat args at all.
|
| * Fix wording and whitespace
|
|   Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
|
| * Get rid of side_effect reassignment
|
| Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* YJIT: Implement concatarray in yjit (https://github.com/Shopify/ruby/pull/405) | Maple Ong | 2022-08-29 | 1 | -0/+1
| * Create code generation func
|
| * Make rb_vm_concat_array available to use in Rust
|
| * Map opcode to code gen func
|
| * Implement code gen for concatarray
|
| * Add test for concatarray
|
| * Use new asm backend
|
| * Add comment to C func wrapper
* Use bindgen for old manual extern declarations (https://github.com/Shopify/ruby/pull/404) | Alan Wu | 2022-08-29 | 1 | -172/+59
| We have a large extern block in cruby.rs left over from the port. We can
| use bindgen for it now and reserve the manual declaration for just a
| handful of vm_insnhelper.c functions.
|
| Fix up a few minor discrepancies bindgen found between the C declaration
| and the manual declaration. Mostly missing `const` on the C side.
* Fix compile errors on arm on the CI (https://github.com/Shopify/ruby/pull/313) | Maxime Chevalier-Boisvert | 2022-08-29 | 1 | -1/+1
| * Fix compile errors on arm on the CI
|
| * Fix typo
* Port topn, adjuststack, most of opt_plus | Maxime Chevalier-Boisvert | 2022-08-29 | 1 | -1/+8
|
* YJIT: Teach getblockparamproxy to handle the no-block case without exiting (#6191) | Matthew Draper | 2022-07-28 | 1 | -0/+3
| Teach getblockparamproxy to handle the no-block case without exiting
|
| Co-authored-by: John Hawthorn <john@hawthorn.email>
* Implement Objects on VWA | Peter Zhu | 2022-07-15 | 1 | -4/+0
| This commit implements Objects on Variable Width Allocation. This allows
| Objects with more ivars to be embedded (i.e. contents directly follow
| the object header) which improves performance through better cache
| locality.
* YJIT: Refactor gen_opt_mod (#6078) | Dave Schwantes | 2022-06-30 | 1 | -1/+3
| Refactor gen_opt_mod in YJIT
* YJIT: On-demand executable memory allocation; faster boot (#5944) | Alan Wu | 2022-06-14 | 1 | -3/+0
| This commit makes YJIT allocate memory for generated code gradually as
| needed. Previously, YJIT allocated all the memory it needs on boot in
| one go, leading to higher than necessary resident set size (RSS) and
| time spent on boot initializing the memory with a large memset(). Users
| should no longer need to search for a magic number to pass to
| `--yjit-exec-mem` since physical memory consumption should now more
| accurately reflect the requirement of the workload.
|
| YJIT now reserves a range of addresses on boot. This region starts out
| with no access permission at all so buggy attempts to jump to the region
| crash like before this change. To get this hardening at finer
| granularity than the page size, we fill each page with trapping
| instructions when we first allocate physical memory for the page.
|
| Most of the time applications don't need 256 MiB of executable code, so
| allocating on-demand ends up doing less total work than before. Case in
| point, a simple `ruby --yjit-call-threshold=1 -eitself` takes about half
| as long after this change. In terms of memory consumption, here is a
| table to give a rough summary of the impact:
|
| | Peak RSS in MiB | -eitself example | railsbench once |
| | :-------------: | ---------------: | --------------: |
| |     before      |              265 |             377 |
| |      after      |               11 |             143 |
| |     no YJIT     |               10 |             101 |
|
| A new module is introduced to handle allocation bookkeeping. `CodePtr`
| is moved into the module since it has a close relationship with the new
| `VirtualMemory` struct. This new interface has a slightly smaller
| surface than before in that marking a region as writable is no longer a
| public operation.
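The reserve-then-commit bookkeeping described above can be modelled without touching real page tables. The sketch below is a pure simulation under stated assumptions: `ReservedRegion`, `OutOfRange`, the 4 KiB page size, and the `0xCC` trap byte (the x86 `int3` opcode) are choices made for the example and are not YJIT's actual `VirtualMemory` interface.

```rust
/// Assumed page size for the simulation.
pub const PAGE_SIZE: usize = 4096;
/// Assumed trap filler: 0xCC is the x86 `int3` breakpoint opcode.
pub const TRAP_BYTE: u8 = 0xCC;

#[derive(Debug, PartialEq, Eq)]
pub struct OutOfRange;

/// A reserved address range whose pages get backing memory on first write.
pub struct ReservedRegion {
    /// `None` models a reserved page with no access and no physical memory.
    pages: Vec<Option<Vec<u8>>>,
}

impl ReservedRegion {
    /// Reserve address space only; nothing is committed yet.
    pub fn reserve(size_bytes: usize) -> Self {
        let page_count = (size_bytes + PAGE_SIZE - 1) / PAGE_SIZE;
        ReservedRegion { pages: vec![None; page_count] }
    }

    /// Write one byte, committing the containing page on first touch and
    /// pre-filling it with trap bytes so stray jumps into unused space fault.
    pub fn write_byte(&mut self, offset: usize, byte: u8) -> Result<(), OutOfRange> {
        let page = self.pages.get_mut(offset / PAGE_SIZE).ok_or(OutOfRange)?;
        let bytes = page.get_or_insert_with(|| vec![TRAP_BYTE; PAGE_SIZE]);
        bytes[offset % PAGE_SIZE] = byte;
        Ok(())
    }

    /// Read a byte; `None` means the page is uncommitted or out of range.
    pub fn read_byte(&self, offset: usize) -> Option<u8> {
        self.pages
            .get(offset / PAGE_SIZE)?
            .as_ref()
            .map(|page| page[offset % PAGE_SIZE])
    }

    /// Memory actually backed so far, the analogue of RSS in this model.
    pub fn committed_bytes(&self) -> usize {
        self.pages.iter().flatten().count() * PAGE_SIZE
    }
}
```

Reserving 256 MiB in this model commits nothing; writing a single byte commits exactly one page, which mirrors why the `-eitself` example's peak RSS drops so sharply in the table above.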
* Add tests for a variety of string-subclass operations (#5999) | Noah Gibbs | 2022-06-10 | 1 | -5/+0
| This way YJIT has to match CRuby for each of them.
|
| Remove unused string_p() Rust function
* Use bindgen to import Ruby constants wherever possible. (#5943) | Noah Gibbs | 2022-06-06 | 1 | -53/+12
| Constants that can't be imported via bindgen should have a comment
| saying why not.
* Use bindgen to import CRuby constants for YARV instruction bytecodes | Noah Gibbs (and/or Benchmark CI) | 2022-05-26 | 1 | -110/+0
|
* Special-case jit_guard_known_class for strings. This can remove (#5920) | Noah Gibbs | 2022-05-20 | 1 | -0/+5
| runtime guard-checks for String#to_s, making some blocks too short to
| invalidate later. Add NOPs in those cases to reserve space.
* YJIT: Enable default rustc lints (warnings) (#5864) | Alan Wu | 2022-04-29 | 1 | -236/+239
| `rustc` performs in-depth dead code analysis and issues warnings even
| for things like unused struct fields and unconstructed enum variants.
| This was annoying for us during the port but hopefully they are less of
| an issue now.
|
| This patch enables all the unused warnings we disabled and addresses all
| the warnings we previously ignored. Generally, the approach I've taken
| is to use `cfg!` instead of using the `cfg` attribute and to delete code
| where it makes sense. I've put `#[allow(unused)]` on things we
| intentionally keep around for printf style debugging and on items that
| are too annoying to keep warning-free in all build configs.
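The difference between the `cfg` attribute and the `cfg!` macro that this message relies on can be sketched as follows. The function names and the use of `debug_assertions` as the switch are assumptions for illustration; the commit does not say which conditions it changed.

```rust
/// Helper that only the stats-dumping code calls.
fn describe(block_count: usize) -> String {
    format!("{} blocks", block_count)
}

// Attribute form: only one of these two definitions exists per build. If the
// gated definition were the sole caller of `describe`, the helper would be
// flagged as dead code in every configuration that compiles the other one.
#[cfg(debug_assertions)]
pub fn dump_stats_attr(block_count: usize) -> Option<String> {
    Some(describe(block_count))
}

#[cfg(not(debug_assertions))]
pub fn dump_stats_attr(_block_count: usize) -> Option<String> {
    None
}

/// Macro form: `cfg!` expands to a plain `true` or `false`, so both branches
/// are compiled and type-checked in every configuration and `describe` always
/// counts as used. The untaken branch is removed by the optimizer.
pub fn dump_stats_macro(block_count: usize) -> Option<String> {
    if cfg!(debug_assertions) {
        Some(describe(block_count))
    } else {
        None
    }
}
```

Both forms behave identically at run time; the macro form simply keeps the code visible to the compiler's unused-item analysis in all build configurations.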
* Rust YJIT | Alan Wu | 2022-04-27 | 1 | -0/+919
| In December 2021, we opened an [issue] to solicit feedback regarding the
| porting of the YJIT codebase from C99 to Rust. There were some
| reservations, but this project was given the go ahead by Ruby core
| developers and Matz. Since then, we have successfully completed the port
| of YJIT to Rust.
|
| The new Rust version of YJIT has reached parity with the C version, in
| that it passes all the CRuby tests, is able to run all of the YJIT
| benchmarks, and performs similarly to the C version (because it works
| the same way and largely generates the same machine code). We've even
| incorporated some design improvements, such as a more fine-grained
| constant invalidation mechanism which we expect will make a big
| difference in Ruby on Rails applications.
|
| Because we want to be careful, YJIT is guarded behind a configure
| option:
|
| ```shell
| ./configure --enable-yjit      # Build YJIT in release mode
| ./configure --enable-yjit=dev  # Build YJIT in dev/debug mode
| ```
|
| By default, YJIT does not get compiled and cargo/rustc is not required.
| If YJIT is built in dev mode, then `cargo` is used to fetch development
| dependencies, but when building in release, `cargo` is not required,
| only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer.
|
| The YJIT command-line options remain mostly unchanged, and more details
| about the build process are documented in `doc/yjit/yjit.md`.
|
| The CI tests have been updated and do not take any more resources than
| before.
|
| The development history of the Rust port is available at the following
| commit for interested parties:
| https://github.com/Shopify/ruby/commit/1fd9573d8b4b65219f1c2407f30a0a60e537f8be
|
| Our hope is that Rust YJIT will be compiled and included as a part of
| system packages and compiled binaries of the Ruby 3.2 release. We do not
| anticipate any major problems as Rust is well supported on every
| platform which YJIT supports, but to make sure that this process works
| smoothly, we would like to reach out to those who take care of building
| systems packages before the 3.2 release is shipped and resolve any
| issues that may come up.
|
| [issue]: https://bugs.ruby-lang.org/issues/18481
|
| Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
| Co-authored-by: Noah Gibbs <the.codefolio.guy@gmail.com>
| Co-authored-by: Kevin Newton <kddnewton@gmail.com>