ruby.git - rhe's working repository

	Commit message (Collapse)	Author	Age	Files	Lines
*	YJIT: Introduce Target::SideExit (#7712)	Takashi Kokubun	2023-04-14	1	-5/+8
\| \| \| \| \| \| \|	* YJIT: Introduce Target::SideExit * YJIT: Obviate Insn::SideExitContext * YJIT: Avoid cloning a Context for each insn
*	YJIT: Fix large ISeq rejection (#7576)	Alan Wu	2023-03-21	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \|	We crashed in some edge cases due to the recent change to not compile encoded iseqs that are larger than `u16::MAX`. - Match the C signature of rb_yjit_constant_ic_update() and clamp down to `IseqIdx` size - Return failure instead of panicking with `unwrap()` in codegen when the iseq is too large Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Noah Gibbs <noah.gibbs@shopify.com>
*	YJIT: Delete --yjit-global-constant-state (#7559)	Alan Wu	2023-03-17	1	-27/+5
\| \| \| \| \|	It was useful for evaluating 6068da8937d7e4358943f95e7450dae7179a7763 but I think we should remove it now to make the logic around invalidation more straight forward.
*	YJIT: Use raw pointers and shared references over `Rc<RefCell<_>>`	Alan Wu	2023-03-17	1	-61/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	`Rc` and `RefCell` both incur runtime space costs. In addition, `RefCell` has given us some headaches with the non obvious borrow panics it likes to throw out. The latest one started with 7fd53eeb46db261bbc20025cdab70096245a5cbe and is yet to be resolved. Since we already rely on the GC to properly reclaim memory for `Block` and `Branch`, we might as well stop paying the overhead of `Rc` and `RefCell`. The `RefCell` panics go away with this change, too. On 25 iterations of `railsbench` with a stats build I got `yjit_alloc_size: 8,386,129 => 7,348,637`, with the new memory size 87.6% of the status quo. This makes the metadata and machine code size roughly line up one-to-one. The general idea here is to use `&` shared references with [interior mutability][1] with `Cell`, which doesn't take any extra space. The `noalias` requirement that `&mut` imposes is way too hard to meet and verify. Imagine replacing places where we would've gotten `BorrowError` from `RefCell` with Rust/LLVM miscompiling us due to aliasing violations. With shared references, we don't have to think about subtle cases like the GC _sometimes_ calling the mark callback while codegen has an aliasing reference in a stack frame below. We mostly only need to worry about liveness, with which the GC already helps. There is now a clean split between blocks and branches that are not yet fully constructed and ones that are "in-service", so to speak. Working with `PendingBranch` and `JITState` don't really involve `unsafe` stuff. This change allows `Branch` and `Block` to not have as many optional fields as many of them are only optional during compilation. Fields that change post-compilation are wrapped in `Cell` to facilitate mutation through shared references. I do some `unsafe` dances here. I've included just a couple tests to run with Miri (`cargo +nightly miri test miri`). We can add more Miri tests if desired. [1]: https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html
*	YJIT: use u16 for insn_idx instead of u32 (#7534)	Maxime Chevalier-Boisvert	2023-03-15	1	-2/+2
\|
*	YJIT: Delete stale `frozen_bytes` related code (#7423)	Alan Wu	2023-03-02	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code and comments in there have been disabled by comments for a long time. The issues that the counter used to solve are now solved more comprehensively by "runningness" [tracking][1] introduced by Code GC and [delayed deallocation][2]. Having a single counter doesn't fit our current model where code pages that could be touched or not are interleaved, anyway. Just delete the code. [1]: e7c71c6c9271b0c29f210769159090e17128e740 [2]: a0b0365e905e1ac51998ace7e6fc723406a2f157
*	YJIT: Use a boxed slice for outgoing branches and cme dependencies (#7409)	Takashi Kokubun	2023-03-01	1	-3/+1
\| \| \| \| \|	YJIT: Use a boxed slice for outgoing branches and cme dependencies
*	Fix typos in YJIT [ci skip]	Alan Wu	2023-02-02	1	-1/+1
\|
*	Enable `clippy` checks for yjit in CI (#7093)	Ian Ker-Seymer	2023-01-12	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	* Add job to check clippy lints in CI * Address all remaining clippy lints * Check lints on arm64 as well * Apply latest clippy lints * Do not exit 0 on clippy warnings
*	YJIT: Deallocate `struct Block` to plug memory leaks	Alan Wu	2022-11-30	1	-12/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we essentially never freed block even after invalidation. Their reference count never reached zero for a couple of reasons: 1. `Branch::block` formed a cycle with the block holding the branch 2. Strong count on a branch that has ever contained a stub never reached 0 because we increment the `.clone()` call for `BranchRef::into_raw()` didn't have a matching decrement. It's not safe to immediately deallocate blocks during invalidation since `branch_stub_hit()` can end up running with a branch pointer from an invalidated branch. To plug the leaks, we wait until code GC or global invalidation and deallocate the blocks for iseqs that are definitely not running.
*	YJIT: Deallocate when assumptions tables are empty	Alan Wu	2022-11-30	1	-0/+28
\| \| \| \| \| \|	When we run global invalidation for TracePoints or code GC, we clear out all blocks in our assumptions table but we don't deallocate the backing buffers. Let's reclaim some memory during these rare events.
*	YJIT: Skip padding jumps to side exits on Arm (#6790)	Takashi Kokubun	2022-11-22	1	-1/+1
\| \| \| \| \| \| \| \| \|	YJIT: Skip padding jumps to side exits Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
*	YJIT: Stop wrapping CmePtr with CmeDependency (#6747)	Takashi Kokubun	2022-11-16	1	-1/+1
\| \| \| \| \|	* YJIT: Stop wrapping CmePtr with CmeDependency * YJIT: Fix an outdated comment [ci skip]
*	YJIT: Invalidate redefined methods only through cme (#6734)	Takashi Kokubun	2022-11-15	1	-65/+2
\| \| \| \| \|	Co-authored-by: Alan Wu <alansi.xingwu@shopify.com> Co-authored-by: Alan Wu <alansi.xingwu@shopify.com>
*	YJIT: Reset dropped_bytes when patching code	Alan Wu	2022-11-08	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	We switch to a new page when we detect dropped_bytes flipping from false to true. Previously, when we patch code for invalidation during code gc, we start with the flag being set to true, so we failed to apply patches that straddle pages. We would write out jumps half way and then stop, which left the code corrupted. Reset the flag before patching so we patch across pages properly.
*	Code clean around unused code for some architectures or features (#6581)	Jimmy Miller	2022-10-18	1	-1/+1
\|
*	Allow passing a Rust closure to rb_iseq_callback (#6575)	Takashi Kokubun	2022-10-18	1	-4/+2
\|
*	YJIT: Avoid creating payloads for non-JITed ISEQs (#6549)	Takashi Kokubun	2022-10-14	1	-1/+1
\| \| \| \| \|	* YJIT: Count freed ISEQs * YJIT: Avoid creating payloads for non-JITed ISEQs
*	A bunch of clippy auto fixes for yjit (#6476)	Jimmy Miller	2022-09-30	1	-1/+1
\|
*	YJIT: Implement specialized respond_to? (#6363)	John Hawthorn	2022-09-14	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \|	* Add rb_callable_method_entry_or_negative * YJIT: Implement specialized respond_to? This implements a specialized respond_to? in YJIT. * Update yjit/src/codegen.rs Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com>
*	New constant caching insn: opt_getconstant_path	John Hawthorn	2022-09-01	1	-50/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously YARV bytecode implemented constant caching by having a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments). This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss. This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference. $ ./miniruby --dump=insns -e '::Foo::Bar::Baz' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE) 0000 opt_getconstant_path <ic:0 ::Foo::Bar::Baz> ( 1)[Li] 0002 leave The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata. This disassembly was done: * In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation. * In rb_iseq_free, to unregister the IC from the invalidation table. * In YJIT to find the position of a opt_getinlinecache instruction to invalidate it when the cache is populated * In YJIT to register the constant names being used for invalidation. With this change we no longe need disassemly for these (in fact rb_iseq_each is now unused), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future. This may also reduce the size of iseqs, as previously each segment required 32 bytes (on 64-bit platforms) for each constant segment. This implementation only stores one ID per-segment. There should be no significant performance change between this and the previous implementation. Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf (it may raise/autoload/call const_missing) but it does not jump. These seem to even out.
*	Use new assembler to support global invalidation on A64	Alan Wu	2022-08-29	1	-7/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we patched in an x64 JMP even on A64, which resulted in invalid machine code. Use the new assembler to generate a jump instead. Add an assert to make sure patches don't step on each other since it's less clear cut on A64, where the size of the jump varies depending on its placement relative to the target. Fixes a lot of tests that use `set_trace_func` in `test_insns.rb`. PR: https://github.com/Shopify/ruby/pull/379
*	Add ability to trace exit locations in yjit (#5970)	Eileen M. Uchitelle	2022-06-09	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When running with `--yjit-stats` turned on, yjit can inform the user what the most common exits are. While this is useful information it doesn't tell you the source location of the code that exited or what the code that exited looks like. This change intends to fix that. To use the feature, run yjit with the `--yjit-trace-exits` option, which will record the backtrace for every exit that occurs. This functionality requires the stats feature to be turned on. Calling `--yjit-trace-exits` will automatically set the `--yjit-stats` option. Users must call `RubyVM::YJIT.dump_exit_locations(filename)` which will Marshal dump the contents of `RubyVM::YJIT.exit_locations` into a file based on the passed filename. Example usage: Given the following script, we write to a file called `concat_array.dump` the results of `RubyVM::YJIT.exit_locations`. ```ruby def concat_array ["t", "r", x = "u", "e"].join end 1000.times do concat_array end RubyVM::YJIT.dump_exit_locations("concat_array.dump") ``` When we run the file with this branch and the appropriate flags the stacktrace will be recorded. Note Stackprof needs to be installed or you need to point to the library directly. ``` ./ruby --yjit --yjit-call-threshold=1 --yjit-trace-exits -I/Users/eileencodes/open_source/stackprof/lib test.rb ``` We can then read the dump file with Stackprof: ``` ./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump ``` Results will look similar to the following: ``` ================================== Mode: () Samples: 1817 (0.00% miss rate) GC: 0 (0.00%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 1001 (55.1%) 1001 (55.1%) concatarray 335 (18.4%) 335 (18.4%) invokeblock 178 (9.8%) 178 (9.8%) send 140 (7.7%) 140 (7.7%) opt_getinlinecache ...etc... ``` Simply inspecting the `concatarray` method will give `SOURCE UNAVAILABLE` because the source is insns.def. ``` ./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump --method concatarray ``` Result: ``` concatarray (nonexistent.def:1) samples: 1001 self (55.1%) / 1001 total (55.1%) callers: 1000 ( 99.9%) Object#concat_array 1 ( 0.1%) Gem.suffixes callees (0 total): code: SOURCE UNAVAILABLE ``` However if we go deeper to the callee we can see the exact source of the `concatarray` exit. ``` ./ruby -I/Users/eileencodes/open_source/stackprof/lib/ /Users/eileencodes/open_source/stackprof/bin/stackprof --text concat_array.dump --method Object#concat_array ``` ``` Object#concat_array (/Users/eileencodes/open_source/rust_ruby/test.rb:1) samples: 0 self (0.0%) / 1000 total (55.0%) callers: 1000 ( 100.0%) block in <main> callees (1000 total): 1000 ( 100.0%) concatarray code: \| 1 \| def concat_array 1000 (55.0%) \| 2 \| ["t", "r", x = "u", "e"].join \| 3 \| end ``` The `--walk` option is recommended for this feature as it make it easier to traverse the tree of exits. Goals of this feature: This feature is meant to give more information when working on YJIT. The idea is that if we know what code is exiting we can decide what areas to prioritize when fixing exits. In some cases this means adding prioritizing avoiding certain exits in yjit. In more complex cases it might mean changing the Ruby code to be more performant when run with yjit. Ultimately the more information we have about what code is exiting AND why, the better we can make yjit. Known limitations: * Due to tracing exits, running this on large codebases like Rails can be quite slow. * On complex methods it can still be difficult to pinpoint the exact cause of an exit. * Stackprof is a requirement to to view the backtrace information from the dump file. Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org> Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
*	Use bindgen to import Ruby constants wherever possible. (#5943)	Noah Gibbs	2022-06-06	1	-1/+1
\| \| \| \|	Constants that can't be imported via bindgen should have a comment saying why not.
*	Use bindgen to import CRuby constants for YARV instruction bytecodes	Noah Gibbs (and/or Benchmark CI)	2022-05-26	1	-3/+3
\|
*	YJIT: Adopt Clippy suggestions we like	Alan Wu	2022-04-29	1	-15/+15
\| \| \| \| \| \| \| \| \|	This adopts most suggestions that rust-clippy is confident enough to auto apply. The manual changes mostly fix manual if-lets and take opportunities to use the `Default` trait on standard collections. Co-authored-by: Kevin Newton <kddnewton@gmail.com> Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
*	Rust YJIT	Alan Wu	2022-04-27	1	-0/+585
	In December 2021, we opened an [issue] to solicit feedback regarding the porting of the YJIT codebase from C99 to Rust. There were some reservations, but this project was given the go ahead by Ruby core developers and Matz. Since then, we have successfully completed the port of YJIT to Rust. The new Rust version of YJIT has reached parity with the C version, in that it passes all the CRuby tests, is able to run all of the YJIT benchmarks, and performs similarly to the C version (because it works the same way and largely generates the same machine code). We've even incorporated some design improvements, such as a more fine-grained constant invalidation mechanism which we expect will make a big difference in Ruby on Rails applications. Because we want to be careful, YJIT is guarded behind a configure option: ```shell ./configure --enable-yjit # Build YJIT in release mode ./configure --enable-yjit=dev # Build YJIT in dev/debug mode ``` By default, YJIT does not get compiled and cargo/rustc is not required. If YJIT is built in dev mode, then `cargo` is used to fetch development dependencies, but when building in release, `cargo` is not required, only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer. The YJIT command-line options remain mostly unchanged, and more details about the build process are documented in `doc/yjit/yjit.md`. The CI tests have been updated and do not take any more resources than before. The development history of the Rust port is available at the following commit for interested parties: https://github.com/Shopify/ruby/commit/1fd9573d8b4b65219f1c2407f30a0a60e537f8be Our hope is that Rust YJIT will be compiled and included as a part of system packages and compiled binaries of the Ruby 3.2 release. We do not anticipate any major problems as Rust is well supported on every platform which YJIT supports, but to make sure that this process works smoothly, we would like to reach out to those who take care of building systems packages before the 3.2 release is shipped and resolve any issues that may come up. [issue]: https://bugs.ruby-lang.org/issues/18481 Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> Co-authored-by: Noah Gibbs <the.codefolio.guy@gmail.com> Co-authored-by: Kevin Newton <kddnewton@gmail.com>