| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
* YJIT: Inline basic Ruby methods
* YJIT: Fix "InsnOut operand made it past register allocation"
checktype should not generate a useless instruction.
|
|
|
| |
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
|
|
|
|
|
|
|
|
|
|
| |
Previously, the version-controlled `cruby_bindings.inc.rs` file
contained the build-time artifact `id.h`, which nobu mentioned hinders
the goal of having fewer magic numbers in the repository.
Lookup the IDs YJIT needs on boot. It costs cycles, but it's fine since
YJIT only uses a handful of IDs at the moment. No perceptible
degradation to boot time found in my testing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Remove function call for String#bytesize
String size is stored in a consistent location, so we can eliminate the
function call.
* Update yjit/src/codegen.rs
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
---------
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
|
|
|
|
|
| |
New Clippy lint in 1.72.0 is breaking our build as GitHub has updated
their image. No point hearing about lints from generated code we don't
manually write.
|
|
|
| |
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove rb_control_frame_t::__bp__ and optimize bmethod calls
This commit removes the __bp__ field from rb_control_frame_t. It was
introduced to help MJIT, but since MJIT was replaced by RJIT, we can use
vm_base_ptr() to compute it from the SP of the previous control frame
instead. Removing the field avoids needing to set it up when pushing new
frames.
Simply removing __bp__ would cause crashes since RJIT and YJIT used a
slightly different stack layout for bmethod calls than the interpreter.
At the moment of the call, the two layouts looked as follows:
┌────────────┐ ┌────────────┐
│ frame_base │ │ frame_base │
├────────────┤ ├────────────┤
│ ... │ │ ... │
├────────────┤ ├────────────┤
│ args │ │ args │
├────────────┤ └────────────┘<─prev_frame_sp
│ receiver │
prev_frame_sp─>└────────────┘
RJIT & YJIT interpreter
Essentially, vm_base_ptr() needs to compute the address to frame_base
given prev_frame_sp in the diagrams. The presence of the receiver
created an off-by-one situation.
Make the interpreter use the layout the JITs use for iseq-to-iseq
bmethod calls. Doing so removes unnecessary argument shifting and
vm_exec_core() re-entry from the interpreter, yielding a speed
improvement visible through `benchmark/vm_defined_method.yml`:
patched: 7578743.1 i/s
master: 4796596.3 i/s - 1.58x slower
C-to-iseq bmethod calls now store one more VALUE than before, but that
should have negligible impact on overall performance.
Note that re-entering vm_exec_core() used to be necessary for firing
TracePoint events, but that's no longer the case since
9121e57a5f50bc91bae48b3b91edb283bf96cb6b.
Closes ruby/ruby#6428
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The benchmark results show that this feature has either a positive or
no impact on performance. The memory usage is also mostly unchanged,
except in hexapdf, where there is a decrease in RSS.
-------------- ----------- ---------- --------- ----------- ---------- --------- -------------- -------------
bench master (ms) stddev (%) RSS (MiB) branch (ms) stddev (%) RSS (MiB) branch 1st itr master/branch
activerecord 70.8 2.2 56.0 71.7 2.2 56.0 0.99 0.99
erubi_rails 20.5 13.6 94.7 20.5 14.3 94.2 0.93 1.00
hexapdf 2541.0 0.7 212.8 2544.4 0.7 203.4 1.00 1.00
liquid-c 65.6 0.3 38.9 65.3 0.3 38.9 1.01 1.01
liquid-compile 63.7 0.3 34.6 61.1 0.2 34.6 1.04 1.04
liquid-render 163.1 0.1 37.1 163.3 0.1 37.1 1.00 1.00
mail 139.3 0.1 50.5 137.0 0.1 50.1 0.99 1.02
psych-load 2065.7 0.1 36.9 2068.2 0.1 37.3 1.00 1.00
railsbench 2034.6 0.5 103.9 2031.9 0.5 103.8 1.02 1.00
ruby-lsp 65.3 3.1 89.8 66.2 3.0 89.7 1.01 0.99
sequel 73.2 1.0 40.3 73.4 1.0 40.3 1.00 1.00
-------------- ----------- ---------- --------- ----------- ---------- --------- -------------- -------------
|
|
|
|
|
|
|
| |
* YJIT: Add codegen for Integer methods
* YJIT: Update dependencies
* YJIT: Fix Integer#[] for argc=2
|
|
|
|
|
|
|
| |
This reverts commit 9e678cdbd054f78576a8f21b3f97cccc395ade22.
Without the `unsafe` annotations, the SAFETY comments make less sense.
I want to keep the SAFETY comments.
|
| |
|
| |
|
|
|
|
|
|
|
| |
* YJIT: log the names of methods we call to in disasm
* Assert that pointer is not null
* Handle case where UTF8 conversion not possible
|
| |
|
|
|
|
|
| |
Using a constant shows intention better and is less noisy. It always
took me a second to parse the long expression.
|
|
|
|
|
|
|
|
|
|
|
| |
* Add job to check clippy lints in CI
* Address all remaining clippy lints
* Check lints on arm64 as well
* Apply latest clippy lints
* Do not exit 0 on clippy warnings
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an object becomes "too complex" (in other words it has too many
variations in the shape tree), we transition it to use a "too complex"
shape and use a hash for storing instance variables.
Without this patch, there were rare cases where shape tree growth could
"explode" and cause performance degradation on what would otherwise have
been cached fast paths.
This patch puts a limit on shape tree growth, and gracefully degrades in
the rare case where there could be a factorial growth in the shape tree.
For example:
```ruby
class NG; end
HUGE_NUMBER.times do
NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
end
```
We consider objects to be "too complex" when the object's class has more
than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and
the object introduces a new variation (a new leaf node) associated with
that class.
For example, new variations on instances of the following class would be
considered "too complex" because those instances create more than 8
leaves in the shape tree:
```ruby
class Foo; end
9.times { Foo.new.instance_variable_set(":@uniq_#{_1}", 1) }
```
However, the following class is *not* too complex because it only has
one leaf in the shape tree:
```ruby
class Foo
def initialize
@a = @b = @c = @d = @e = @f = @g = @h = @i = nil
end
end
9.times { Foo.new }
``
This case is rare, so we don't expect this change to impact performance
of most applications, but it needs to be handled.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
|
| |
|
| |
|
|
|
|
|
|
|
| |
instead of FILE*.
Using C.fprintf is slower than String manipulation on memory. I'm going
to change the way MJIT writes files, and this is a prerequisite for it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* YJIT: Always encode Opnd::Value in 64 bits on x86_64
for GC offsets
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
* Introduce heap_object_p
* Leave original mov intact
* Remove unneeded branches
* Add a test for movabs
Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
|
|
|
|
|
|
|
|
|
| |
* YJIT: Support invokeblock
* Update yjit/src/backend/arm64/mod.rs
* Update yjit/src/codegen.rs
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
|
| |
|
| |
|
|
|
|
|
| |
Previously, enabling only "disasm" didn't actually build. Since these
two features are closely related and we don't really use one without the
other, let's simplify and merge the two features together.
|
|
|
|
|
| |
* fixes more clippy warnings
* Fix x86 c_callable to have doc_strings
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Implement optimize send in yjit
This successfully makes all our benchmarks exit way less for optimize send reasons.
It makes some benchmarks faster, but not by as much as I'd like. I think this implementation
works, but there are definitely more optimial arrangements. For example, what if we compiled
send to a jump table? That seems like perhaps the most optimal we could do, but not obvious (to me)
how to implement give our current setup.
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* Attempt at fixing the issues raised by @XrXr
* fix allowlist
* returns 0 instead of nil when not found
* remove comment about encoding exception
* Fix up c changes
* Update assert
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* get rid of unneeded code and fix the flags
* Apply suggestions from code review
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
* rename and fix typo
Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>
|
|
|
|
| |
This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
|
|
|
|
| |
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
|
|
|
|
|
|
|
|
|
|
| |
Revert "* expand tabs. [ci skip]"
This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275.
Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Initial support for VM_CALL_ARGS_SPLAT
This implements support for calls with splat (*) for some methods. In
benchmarks this made very little difference for most benchmarks, but a
large difference for binarytrees. Looking at side exits, many
benchmarks now don't exit for splat, but exit for some other
reason. Binarytrees however had a number of calls that used splat args
that are now much faster. In my non-scientific benchmarking this made
splat args performance on par with not using splat args at all.
* Fix wording and whitespace
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
* Get rid of side_effect reassignment
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Create code generation func
* Make rb_vm_concat_array available to use in Rust
* Map opcode to code gen func
* Implement code gen for concatarray
* Add test for concatarray
* Use new asm backend
* Add comment to C func wrapper
|
|
|
|
|
|
|
|
|
|
|
| |
(https://github.com/Shopify/ruby/pull/404)
We have a large extern block in cruby.rs leftover from the port. We can
use bindgen for it now and reserve the manual declaration for just a
handful of vm_insnhelper.c functions.
Fixup a few minor discrepencies bindgen found between the C declaration
and the manual declaration. Mostly missing `const` on the C side.
|
|
|
|
|
|
| |
* Fix compile errors on arm on the CI
* Fix typo
|
| |
|
|
|
|
|
|
|
|
|
| |
(#6191)
Teach getblockparamproxy to handle the no-block case without exiting
Co-authored-by: John Hawthorn <john@hawthorn.email>
Co-authored-by: John Hawthorn <john@hawthorn.email>
|
|
|
|
|
|
| |
This commit implements Objects on Variable Width Allocation. This allows
Objects with more ivars to be embedded (i.e. contents directly follow the
object header) which improves performance through better cache locality.
|
|
|
| |
Refactor gen_opt_mod in YJIT
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit makes YJIT allocate memory for generated code gradually as
needed. Previously, YJIT allocates all the memory it needs on boot in
one go, leading to higher than necessary resident set size (RSS) and
time spent on boot initializing the memory with a large memset().
Users should no longer need to search for a magic number to pass to
`--yjit-exec-mem` since physical memory consumption should now more
accurately reflect the requirement of the workload.
YJIT now reserves a range of addresses on boot. This region start out
with no access permission at all so buggy attempts to jump to the region
crashes like before this change. To get this hardening at finer
granularity than the page size, we fill each page with trapping
instructions when we first allocate physical memory for the page.
Most of the time applications don't need 256 MiB of executable code, so
allocating on-demand ends up doing less total work than before. Case in
point, a simple `ruby --yjit-call-threshold=1 -eitself` takes about
half as long after this change. In terms of memory consumption, here is
a table to give a rough summary of the impact:
| Peak RSS in MiB | -eitself example | railsbench once |
| :-------------: | ---------------: | --------------: |
| before | 265 | 377 |
| after | 11 | 143 |
| no YJIT | 10 | 101 |
A new module is introduced to handle allocation bookkeeping.
`CodePtr` is moved into the module since it has a close relationship
with the new `VirtualMemory` struct. This new interface has a slightly
smaller surface than before in that marking a region as writable is no
longer a public operation.
|
|
|
|
| |
This way YJIT has to match CRuby for each of them.
Remove unused string_p() Rust function
|
|
|
|
| |
Constants that can't be imported via bindgen should have
a comment saying why not.
|
| |
|
|
|
|
| |
runtime guard-checks for String#to_s, making some blocks too
short to invalidate later. Add NOPs in those cases to reserve space.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`rustc` performs in depth dead code analysis and issues warning
even for things like unused struct fields and unconstructed enum
variants. This was annoying for us during the port but hopefully
they are less of an issue now.
This patch enables all the unused warnings we disabled and address
all the warnings we previously ignored. Generally, the approach I've
taken is to use `cfg!` instead of using the `cfg` attribute and
to delete code where it makes sense. I've put `#[allow(unused)]`
on things we intentionally keep around for printf style debugging
and on items that are too annoying to keep warning-free in all
build configs.
|
|
In December 2021, we opened an [issue] to solicit feedback regarding the
porting of the YJIT codebase from C99 to Rust. There were some
reservations, but this project was given the go ahead by Ruby core
developers and Matz. Since then, we have successfully completed the port
of YJIT to Rust.
The new Rust version of YJIT has reached parity with the C version, in
that it passes all the CRuby tests, is able to run all of the YJIT
benchmarks, and performs similarly to the C version (because it works
the same way and largely generates the same machine code). We've even
incorporated some design improvements, such as a more fine-grained
constant invalidation mechanism which we expect will make a big
difference in Ruby on Rails applications.
Because we want to be careful, YJIT is guarded behind a configure
option:
```shell
./configure --enable-yjit # Build YJIT in release mode
./configure --enable-yjit=dev # Build YJIT in dev/debug mode
```
By default, YJIT does not get compiled and cargo/rustc is not required.
If YJIT is built in dev mode, then `cargo` is used to fetch development
dependencies, but when building in release, `cargo` is not required,
only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer.
The YJIT command-line options remain mostly unchanged, and more details
about the build process are documented in `doc/yjit/yjit.md`.
The CI tests have been updated and do not take any more resources than
before.
The development history of the Rust port is available at the following
commit for interested parties:
https://github.com/Shopify/ruby/commit/1fd9573d8b4b65219f1c2407f30a0a60e537f8be
Our hope is that Rust YJIT will be compiled and included as a part of
system packages and compiled binaries of the Ruby 3.2 release. We do not
anticipate any major problems as Rust is well supported on every
platform which YJIT supports, but to make sure that this process works
smoothly, we would like to reach out to those who take care of building
systems packages before the 3.2 release is shipped and resolve any
issues that may come up.
[issue]: https://bugs.ruby-lang.org/issues/18481
Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
Co-authored-by: Noah Gibbs <the.codefolio.guy@gmail.com>
Co-authored-by: Kevin Newton <kddnewton@gmail.com>
|