aboutsummaryrefslogtreecommitdiffstats
path: root/ext/ripper
Commit message (Collapse)AuthorAgeFilesLines
* Change the semantics of rb_postponed_job_registerKJ Tsanaktsidis2023-12-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our current implementation of rb_postponed_job_register suffers from some safety issues that can lead to interpreter crashes (see bug #1991). Essentially, the issue is that jobs can be called with the wrong arguments. We made two attempts to fix this whilst keeping the promised semantics, but: * The first one involved masking/unmasking when flushing jobs, which was believed to be too expensive * The second one involved a lock-free, multi-producer, single-consumer ringbuffer, which was too complex The critical insight behind this third solution is that essentially the only user of these APIs are a) internal, or b) profiling gems. For a), none of the usages actually require variable data; they will work just fine with the preregistration interface. For b), generally profiling gems only call a single callback with a single piece of data (which is actually usually just zero) for the life of the program. The ringbuffer is complex because it needs to support multi-word inserts of job & data (which can't be atomic); but nobody actually even needs that functionality, really. So, this comit: * Introduces a pre-registration API for jobs, with a GVL-requiring rb_postponed_job_prereigster, which returns a handle which can be used with an async-signal-safe rb_postponed_job_trigger. * Deprecates rb_postponed_job_register (and re-implements it on top of the preregister function for compatability) * Moves all the internal usages of postponed job register pre-registration
* Stop creating ripper.h because it's not usedyui-knk2023-10-202-2/+2
|
* ripper: Support member references in the DSLNobuyoshi Nakada2023-10-101-1/+1
|
* Use rb_node_opt_arg_t and rb_node_kw_arg_t instead of NODEyui-knk2023-10-011-2/+2
|
* Extract `ripper_parser_params`Nobuyoshi Nakada2023-09-301-76/+42
|
* Change RNode structure from union to structyui-knk2023-09-281-3/+2
| | | | | | | | | | | | | | | | | | | | | | | All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members for holding different kind of data. This has two problems. 1. Low flexibility of data structure Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand, NODE_OP_ASGN2 needs more than three union members. However they use same structure definition, need to allocate three union members for NODE_TRUE and need to separate NODE_OP_ASGN2 into another node. This change removes the restriction so make it possible to change data structure by each node type. 2. No compile time check for union member access It’s developer’s responsibility for using correct member for each node type when it’s union. This change clarifies which node has which type of fields and enables compile time check. This commit also changes node_buffer_elem_struct buf management to handle different size data with alignment.
* ripper: Support named references in the DSLNobuyoshi Nakada2023-09-251-9/+12
|
* ripper: Preprocess ripper-dispatchable types onlyNobuyoshi Nakada2023-09-171-4/+3
| | | | Keep the other types, which not having setter macros for ripper.
* Set ripper_init.c.tmpl to C mode [ci skip]Nobuyoshi Nakada2023-09-101-0/+1
|
* include missing header卜部昌平2023-08-251-0/+1
|
* tool/update-deps --fix卜部昌平2023-08-251-0/+2
|
* Fix `#line` directive filename of ripper.cyui-knk2023-07-161-1/+1
| | | | | | | | | | | | | | | | Before: ```c /* First part of user prologue. */ #line 14 "parse.y" ``` After: ```c /* First part of user prologue. */ #line 14 "ripper.y" ```
* Fix null pointer access in Ripper#initializeNobuyoshi Nakada2023-07-161-3/+3
| | | | | | In `rb_ruby_ripper_parser_allocate`, `r->p` is NULL between creating `self` and `parser_params` assignment. As GC can happen there, the typed-data functions for it need to consider the case.
* Use functions defined by parser_st.c to reduce dependency on st.cyui-knk2023-07-151-0/+1
|
* Include ripper.h into `$distcleanfiles`yui-knk2023-07-091-1/+1
|
* More dependencies for ripperNobuyoshi Nakada2023-06-291-0/+1
|
* Fix memory leak in RipperPeter Zhu2023-06-281-0/+1
| | | | | | | | | | | | | | | | The following script leaks memory in Ripper: ```ruby require "ripper" 20.times do 100_000.times do Ripper.parse("") end puts `ps -o rss= -p #{$$}` end ```
* Add missing dependenciesNobuyoshi Nakada2023-06-121-1/+1
|
* [Feature #19719] Universal Parseryui-knk2023-06-128-19/+1267
| | | | | | | | | Introduce Universal Parser mode for the parser. This commit includes these changes: * Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions are passed via `struct rb_parser_config_struct` when this macro is enabled. * Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.
* Ripper does not depend on Bison [ci skip]yui-knk2023-06-031-1/+0
| | | | It also uses Lrama then no dependency on Bison.
* No need to define "BISON" on extconf.rbyui-knk2023-06-021-10/+1
| | | | "BISON" is defined in "ext/ripper/depend".
* Process parse.y without temporary filesNobuyoshi Nakada2023-05-151-1/+1
|
* Add user argument to some macros used by bisonNobuyoshi Nakada2023-05-141-4/+1
|
* Preprocess input parse.y from stdinNobuyoshi Nakada2023-05-142-19/+26
|
* Use Lrama LALR parser generator instead of Bisonv3_3_0_preview1Yuichiro Kaneko2023-05-122-5/+7
| | | | | https://bugs.ruby-lang.org/issues/19637 Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
* Update VPATH for socket, & dependenciesMatt Valentine-House2023-04-061-0/+12
| | | | | | | | | | | | | | | | | | The socket extensions rubysocket.h pulls in the "private" include/gc.h, which now depends on vm_core.h. vm_core.h pulls in id.h when tool/update-deps generates the dependencies for the makefiles, it generates the line for id.h to be based on VPATH, which is configured in the extconf.rb for each of the extensions. By default VPATH does not include the actual source directory of the current Ruby so the dependency fails to resolve and linking fails. We need to append the topdir and top_srcdir to VPATH to have the dependancy picked up correctly (and I believe we need both of these to cope with in-tree and out-of-tree builds). I copied this from the approach taken in https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3
* Update the depend filesMatt Valentine-House2023-02-281-1/+0
|
* Remove intern/gc.h from Make depsMatt Valentine-House2023-02-271-1/+0
|
* Extract include/ruby/internal/attr/packed_struct.hNobuyoshi Nakada2023-02-081-0/+1
| | | | | | | | | Split `PACKED_STRUCT` and `PACKED_STRUCT_UNALIGNED` macros into the macros bellow: * `RBIMPL_ATTR_PACKED_STRUCT_BEGIN` * `RBIMPL_ATTR_PACKED_STRUCT_END` * `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_BEGIN` * `RBIMPL_ATTR_PACKED_STRUCT_UNALIGNED_END`
* [Bug #19399] Parsing invalid heredoc inside block parameterNobuyoshi Nakada2023-02-021-1/+1
| | | | | Although this is of course invalid as Ruby code, allow to just parse and tokenize.
* Introduce encoding check macroS-H-GAMELINKS2022-12-021-0/+1
|
* Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methodsyui-knk2022-11-211-19/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implementation for Language Server Protocol (LSP) sometimes needs token information. For example both `m(1)` and `m(1, )` has same AST structure other than node locations then it's impossible to check the existence of `,` from AST. However in later case, it might be better to suggest variables list for the second argument. Token information is important for such case. This commit adds these methods. * Add `keep_tokens` option for `RubyVM::AbstractSyntaxTree.parse`, `.parse_file` and `.of` * Add `RubyVM::AbstractSyntaxTree::Node#tokens` which returns tokens for the node including tokens for descendants nodes. * Add `RubyVM::AbstractSyntaxTree::Node#all_tokens` which returns all tokens for the input script regardless the receiver node. [Feature #19070] Impacts on memory usage and performance are below: Memory usage: ``` $ cat test.rb root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true) $ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] 11408kb # keep_tokens :false $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 17508kb # keep_tokens :true $ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb 30960kb ``` Performance: ``` $ cat ../ast_keep_tokens.yml prelude: | src = <<~SRC module M class C def m1(a, b) 1 + a + b end end end SRC benchmark: without_keep_tokens: | RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false) with_keep_tokens: | RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true) $ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml /home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \ --executables="compare-ruby::./ruby -I.ext/common --disable-gem" \ --executables="built-ruby::./miniruby -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \ --output=markdown --output-compare -v ../ast_keep_tokens.yml compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux] warming up.. | |compare-ruby|built-ruby| |:--------------------|-----------:|---------:| |without_keep_tokens | 21.659k| 21.303k| | | 1.02x| -| |with_keep_tokens | 6.220k| 5.691k| | | 1.09x| -| ```
* Transition shape when object's capacity changesJemma Issroff2022-11-101-0/+1
| | | | | | | | | | | | | | | | This commit adds a `capacity` field to shapes, and adds shape transitions whenever an object's capacity changes. Objects which are allocated out of a bigger size pool will also make a transition from the root shape to the shape with the correct capacity for their size pool when they are allocated. This commit will allow us to remove numiv from objects completely, and will also mean we can guarantee that if two objects share shapes, their IVs are in the same positions (an embedded and extended object cannot share shapes). This will enable us to implement ivar sets in YJIT using object shapes. Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
* Preprocess for older bison is no longer neededNobuyoshi Nakada2022-11-101-2/+0
|
* Set default %printer for NODE ntermsyui-knk2022-11-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Before: ``` Reducing stack by rule 639 (line 5062): $1 = token "integer literal" (1.0-1.1: 1) -> $$ = nterm simple_numeric (1.0-1.1: ) ``` After: ``` Reducing stack by rule 641 (line 5078): $1 = token "integer literal" (1.0-1.1: 1) -> $$ = nterm simple_numeric (1.0-1.1: NODE_LIT) ``` `"<*>"` is supported by Bison 2.3b (2008-05-27) or later. https://git.savannah.gnu.org/cgit/bison.git/commit/?id=12e3584054c16ab255672c07af0ffc7bb220e8bc Therefore developers need to install Bison 2.3b+ to build ruby from source codes if their Bison is older. Minimum version requirement for Bison is changed to 3.0. See: https://bugs.ruby-lang.org/issues/19068 [Feature #19068]
* Process token IDs from id.def without id.hNobuyoshi Nakada2022-09-081-3/+2
| | | | | | | | | Fixes id.h error during updating ripper.c by `make after-update`. While it used to update id.h in the build directory, but was trying to update ripper.c in the source directory. In principle, files in the source directory can or should not depend on files in the build directory.
* [Feature #18249] Update dependenciesPeter Zhu2022-02-221-0/+2
|
* ext/ripper/lib/ripper/lexer.rb: Do not deprecate Ripper::Lexer::State#[]Yusuke Endoh2021-12-091-14/+11
| | | | | | | | | | | The old code of IRB still uses this method. The warning is noisy on rails console. In principle, Ruby 3.1 deprecates nothing, so let's avoid the deprecation for the while. I think It is not so hard to continue to maintain it as it is a trivial shim. https://github.com/ruby/ruby/pull/5093
* Define Ripper::Lexer::Elem#to_sNobuyoshi Nakada2021-12-021-0/+2
| | | | | | Alias `#inspect` as `#to_s` also in the new `Ripper::Lexer::Elem` class, so that `puts Ripper::Lexer.new(code).scan` shows the attributes.
* Deprecate `Lexer::Elem#[]` and `Lexer::State#[]`schneems2021-12-021-0/+31
| | | | | | | Discussed in https://github.com/ruby/ruby/pull/5093#issuecomment-964426481. > it would be enough to mimic only [] for almost all cases This adds back the `Lexer::Elem#[]` and `Lexer::State#[]` and adds deprecation warnings for them.
* Only iterate Lexer heredoc arraysschneems2021-12-021-10/+12
| | | | The last element in the `@buf` may be either an array or an `Elem`. In the case it is an `Elem` we iterate over every element, when we do not need to. This check guards that case by ensuring that we only iterate over an array of elements.
* ~1.10x faster Change Ripper.lex structs to classesschneems2021-12-021-8/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ## Concept I am proposing we replace the Struct implementation of data structures inside of ripper with real classes. This will improve performance and the implementation is not meaningfully more complicated. ## Example Struct versus class comparison: ```ruby Elem = Struct.new(:pos, :event, :tok, :state, :message) do def initialize(pos, event, tok, state, message = nil) super(pos, event, tok, State.new(state), message) end # ... def to_a a = super a.pop unless a.empty? a end end class ElemClass attr_accessor :pos, :event, :tok, :state, :message def initialize(pos, event, tok, state, message = nil) @pos = pos @event = event @tok = tok @state = State.new(state) @message = message end def to_a if @message [@pos, @event, @tok, @state, @message] else [@pos, @event, @tok, @state] end end end # stub state class creation for now class State; def initialize(val); end; end ``` ## MicroBenchmark creation ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG Benchmark.ips do |x| x.report("struct") { Elem.new(pos, event, tok, state) } x.report("class ") { ElemClass.new(pos, event, tok, state) } x.compare! end; nil ``` Gives ~1.2x faster creation: ``` Warming up -------------------------------------- struct 263.983k i/100ms class 303.367k i/100ms Calculating ------------------------------------- struct 2.638M (± 5.9%) i/s - 13.199M in 5.023460s class 3.171M (± 4.6%) i/s - 16.078M in 5.082369s Comparison: class : 3170690.2 i/s struct: 2638493.5 i/s - 1.20x (± 0.00) slower ``` ## MicroBenchmark `to_a` (Called by Ripper.lex for every element) ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG struct = Elem.new(pos, event, tok, state) from_class = ElemClass.new(pos, event, tok, state) Benchmark.ips do |x| x.report("struct") { struct.to_a } x.report("class ") { from_class.to_a } x.compare! end; nil ``` Gives 1.46x faster `to_a`: ``` Warming up -------------------------------------- struct 612.094k i/100ms class 893.233k i/100ms Calculating ------------------------------------- struct 6.121M (± 5.4%) i/s - 30.605M in 5.015851s class 8.931M (± 7.9%) i/s - 44.662M in 5.039733s Comparison: class : 8930619.0 i/s struct: 6121358.9 i/s - 1.46x (± 0.00) slower ``` ## MicroBenchmark data access ```ruby require 'benchmark/ips' require 'ripper' pos = [1, 2] event = :on_nl tok = "\n".freeze state = Ripper::EXPR_BEG struct = Elem.new(pos, event, tok, state) from_class = ElemClass.new(pos, event, tok, state) Benchmark.ips do |x| x.report("struct") { struct.pos[1] } x.report("class ") { from_class.pos[1] } x.compare! end; nil ``` Gives ~1.17x faster data access: ``` Warming up -------------------------------------- struct 1.694M i/100ms class 1.868M i/100ms Calculating ------------------------------------- struct 16.149M (± 6.8%) i/s - 81.318M in 5.060633s class 18.886M (± 2.9%) i/s - 95.262M in 5.048359s Comparison: class : 18885669.6 i/s struct: 16149255.8 i/s - 1.17x (± 0.00) slower ``` ## Full benchmark integration of this inside of Ripper.lex Inside of this repo with this commit ``` $ cd ext/ripper $ make $ cat test.rb file = File.join(__dir__, "../../array.rb") source = File.read(file) bench = Benchmark.measure do 10_000.times.each do Ripper.lex(source) end end puts bench ``` Then execute with and without this change 50 times: ``` rm new.txt rm old.txt for i in {0..50} do `ruby -Ilib -rripper -rbenchmark ./test.rb >> new.txt` `ruby -rripper -rbenchmark ./test.rb >> old.txt` done ``` I used derailed benchmarks internals to compare the results: ``` dir = Pathname(".") branch_info = {} branch_info["old"] = { desc: "Struct lex", time: Time.now, file: dir.join("old.txt"), name: "old" } branch_info["new"] = { desc: "Class lex", time: Time.now, file: dir.join("new.txt"), name: "new" } stats = DerailedBenchmarks::StatsFromDir.new(branch_info) stats.call.banner ``` Which gave us: ``` ❤️ ❤️ ❤️ (Statistically Significant) ❤️ ❤️ ❤️ [new] (3.3139 seconds) "Class lex" ref: "new" FASTER 🚀🚀🚀 by: 1.1046x [older/newer] 9.4700% [(older - newer) / older * 100] [old] (3.6606 seconds) "Struct lex" ref: "old" Iterations per sample: Samples: 51 Test type: Kolmogorov Smirnov Confidence level: 99.0 % Is significant? (max > critical): true D critical: 0.30049534876137013 D max: 0.9607843137254902 Histograms (time ranges are in seconds): [new] description: [old] description: "Class lex" "Struct lex" ┌ ┐ ┌ ┐ [3.0, 3.3) ┤▇ 1 [3.0, 3.3) ┤ 0 [3.3, 3.6) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 47 [3.3, 3.6) ┤ 0 [3.5, 3.8) ┤▇▇ 2 [3.5, 3.8) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 46 [3.8, 4.1) ┤▇ 1 [3.8, 4.1) ┤▇▇▇ 4 [4.0, 4.3) ┤ 0 [4.0, 4.3) ┤ 0 [4.3, 4.6) ┤ 0 [4.3, 4.6) ┤▇ 1 └ ┘ └ ┘ # of runs in range # of runs in range ``` To sum this up, the "new" version of this code (using real classes instead of structs) is 10% faster across 50 runs with a statistical significance confidence level of 99%. Histograms are for visual checksum.
* Keep the generated source files when clean [Bug #18363]Nobuyoshi Nakada2021-11-251-1/+2
|
* Update dependenciesNobuyoshi Nakada2021-11-211-2/+0
|
* ruby tool/update-deps --fix卜部昌平2021-10-051-0/+9
|
* dependency updates卜部昌平2021-04-131-1/+0
|
* ripper: fix a bug of Ripper::Lexer with syntax error and heredoc [Bug #17644]Shugo Maeda2021-02-191-1/+1
|
* Fix Ripper with heredoc.manga_osyo2021-01-171-0/+1
|
* ripper: call #pretty_print on also `state`Nobuyoshi Nakada2021-01-041-1/+1
|
* ripper: fix `#tok` on some error events [Bug 17345]Nobuhiro IMAI2020-12-191-4/+9
| | | | sorting alias target by event arity, and setup suitable `Elem` for error.