path: root/string.c
Commit message | Author | Age | Files | Lines
* drop-in type check for rb_define_singleton_method | 卜部昌平 | 2019-08-29 | 1 | -1/+23
| We can check the function pointer passed to rb_define_singleton_method the same way we do in rb_define_method. Doing so revealed many arity mismatches.
* Fixed heap-use-after-free | Nobuyoshi Nakada | 2019-08-15 | 1 | -1/+2
| * string.c (rb_str_sub_bang): retrieves a pointer to the replacement string buffer just before using it, for the case of replacement with the receiver string itself. [Bug #16105]
* * expand tabs. [ci skip] | git | 2019-08-15 | 1 | -2/+2
|
* Fold to lowercase instead of uppercase for String#casecmp | Jeremy Evans | 2019-08-14 | 1 | -4/+4
| strcasecmp(3) and String#casecmp? both fold to lowercase.
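The direction of the fold only matters for ASCII characters that sort between `Z` and `a`, such as `_`. The sketch below does not call `String#casecmp` itself; `fold_cmp` is a hypothetical helper, introduced here purely for illustration, that compares two strings byte by byte after an ASCII-only fold in either direction.

```ruby
# Hypothetical helper (not part of Ruby): compare two ASCII strings
# byte by byte after folding with the given method (:upcase or :downcase).
def fold_cmp(a, b, fold)
  fa = a.public_send(fold, :ascii).bytes
  fb = b.public_send(fold, :ascii).bytes
  fa <=> fb
end

# "_" (0x5F) sorts after "A" (0x41) but before "a" (0x61),
# so the direction of the fold changes the sign of the result.
puts "fold to uppercase: #{fold_cmp('_', 'a', :upcase)}"
puts "fold to lowercase: #{fold_cmp('_', 'a', :downcase)}"
```

For strings made only of letters the two folds agree, which is why the change is visible only with punctuation in that narrow range.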
* Update docs to use more natural English | Aaron Patterson | 2019-08-12 | 1 | -10/+10
| Just a few updates to make the English sound a bit more natural
* string.c (rb_str_sub, _gsub): improve the rdoc | Yusuke Endoh | 2019-08-12 | 1 | -21/+58
| This change:
| * Added an explanation about back references except \n and \k<n> (\` \& \' \+ \0)
| * Added an explanation about an escape (\\)
| * Added some rdoc references
| * Rephrased and clarified the reason why double escape is needed, added some examples, and moved the note to the last (because it is not specific to the method itself).
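The back references and the escape listed in this commit can be tried out directly. The following is a small sketch, not taken from the rdoc itself; the method name `backref_examples` and the sample strings are assumptions made for illustration.

```ruby
# Illustrative only: collects one String#sub call per back reference.
def backref_examples
  text = "hello"
  {
    # \0 and \& both stand for the whole match.
    whole_zero: text.sub(/ll/, '[\0]'),
    whole_amp:  text.sub(/ll/, '[\&]'),
    # \` is the text before the match, \' is the text after it.
    pre_match:  text.sub(/ll/, '[\`]'),
    post_match: text.sub(/ll/, "[\\']"),
    # \+ is the last group that matched.
    last_group: text.sub(/(h)(e)/, '[\+]'),
    # \k<name> refers to a named group.
    named:      text.sub(/(?<first>h)/, '\k<first>\k<first>'),
    # A literal backslash is \\ in the replacement, which takes four
    # backslashes inside a single-quoted Ruby literal (the double escape).
    escaped:    "a.b".sub(".", '\\\\')
  }
end

backref_examples.each { |name, result| puts "#{name}: #{result}" }
```

The last entry shows why the double escape is needed: the Ruby literal first reduces four backslashes to two, and the replacement syntax then reduces those two to one.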
* leafify opt_plus | 卜部昌平 | 2019-08-06 | 1 | -0/+31
| Inspired by 346aa557b31fe96760e505d30da26eb7a846bac9
| Closes: https://github.com/ruby/ruby/pull/2321
* Make opt_eq and opt_neq insns leaf | Takashi Kokubun | 2019-08-04 | 1 | -18/+2
| # Benchmark zero?
|
| ```
| require 'benchmark/ips'
|
| Numeric.class_eval do
|   def ruby_zero?
|     self == 0
|   end
| end
|
| Benchmark.ips do |x|
|   x.report('0.zero?') { 0.ruby_zero? }
|   x.report('1.zero?') { 1.ruby_zero? }
|   x.compare!
| end
| ```
|
| ## VM
|
| No significant impact for VM.
|
| ### before
|
| ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]
|
| 0.zero?: 21855445.5 i/s
| 1.zero?: 21770817.3 i/s - same-ish: difference falls within error
|
| ### after
|
| ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]
|
| 1.zero?: 21958912.3 i/s
| 0.zero?: 21881625.9 i/s - same-ish: difference falls within error
|
| ## JIT
|
| The performance improves about 1.23x.
|
| ### before
|
| ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]
|
| 0.zero?: 36343111.6 i/s
| 1.zero?: 36295153.3 i/s - same-ish: difference falls within error
|
| ### after
|
| ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]
|
| 0.zero?: 44740467.2 i/s
| 1.zero?: 44363616.1 i/s - same-ish: difference falls within error
|
| # Benchmark str == str / str != str
|
| ```
| # frozen_string_literal: true
| require 'benchmark/ips'
|
| Benchmark.ips do |x|
|   x.report('a == a') { 'a' == 'a' }
|   x.report('a == b') { 'a' == 'b' }
|   x.report('a != a') { 'a' != 'a' }
|   x.report('a != b') { 'a' != 'b' }
|   x.compare!
| end
| ```
|
| ## VM
|
| No significant impact for VM.
|
| ### before
|
| ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) [x86_64-linux]
|
| a == a: 27286219.0 i/s
| a != a: 24892389.5 i/s - 1.10x slower
| a == b: 23623635.8 i/s - 1.16x slower
| a != b: 21800958.0 i/s - 1.25x slower
|
| ### after
|
| ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) [x86_64-linux]
|
| a == a: 27224016.2 i/s
| a != a: 24490109.5 i/s - 1.11x slower
| a == b: 23391052.4 i/s - 1.16x slower
| a != b: 21811321.7 i/s - 1.25x slower
|
| ## JIT
|
| The performance improves on JIT a little.
|
| ### before
|
| ruby 2.7.0dev (2019-08-04T02:56:02Z master 2d8c037e97) +JIT [x86_64-linux]
|
| a == a: 42010674.7 i/s
| a != a: 38920311.2 i/s - same-ish: difference falls within error
| a == b: 32574262.2 i/s - 1.29x slower
| a != b: 32099790.3 i/s - 1.31x slower
|
| ### after
|
| ruby 2.7.0dev (2019-08-04T11:17:10Z opt-eq-leaf 6404bebd6a) +JIT [x86_64-linux]
|
| a == a: 46902738.8 i/s
| a != a: 43097258.6 i/s - 1.09x slower
| a == b: 35822018.4 i/s - 1.31x slower
| a != b: 33377257.8 i/s - 1.41x slower
|
| This is needed towards Bug #15589.
|
| Closes: https://github.com/ruby/ruby/pull/2318
* Reuse match data | Nobuyoshi Nakada | 2019-07-28 | 1 | -2/+5
| * string.c (rb_str_split_m): reuse occupied match data. [Bug #16024]
* Occupy match data | Nobuyoshi Nakada | 2019-07-27 | 1 | -1/+3
| * string.c (rb_str_split_m): occupy match data so that it is not modified while yielding to the block. [Bug #16024]
* string.c (str_succ): refactoring | Yusuke Endoh | 2019-07-14 | 1 | -3/+3
| Use more communicative variable name
* string.c (str_succ): remove an unnecessary assignment | Yusuke Endoh | 2019-07-14 | 1 | -1/+0
| This change will suppress Coverity Scan warnings
* * expand tabs. | git | 2019-07-14 | 1 | -1/+1
|
* Prefer `rb_error_arity` to `rb_check_arity` when it can be used | Yusuke Endoh | 2019-07-14 | 1 | -1/+1
|
* Check that String#scrub block does not modify receiver | Jeremy Evans | 2019-07-02 | 1 | -7/+12
| Similar to the check used for String#gsub. Can fix possible segfault.
| Fixes [Bug #15941]
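From the Ruby side, the check can be observed roughly as follows. This is a sketch under assumptions: it presumes a Ruby that includes this change and that the check surfaces as a `RuntimeError`, as the analogous check in `String#gsub` does; the method names are made up for the example.

```ruby
# Returns the exception class raised when the block passed to
# String#scrub mutates the receiver, or nil if nothing is raised.
def scrub_with_mutating_block
  str = +"a\xFFb"        # \xFF is an invalid byte in UTF-8
  str.scrub do |_bad|
    str << "more data"   # mutates the receiver while it is being scrubbed
    "?"
  end
  nil
rescue RuntimeError => e
  e.class
end

# A block that leaves the receiver alone works as usual.
def scrub_with_pure_block
  (+"a\xFFb").scrub { |bad| "<#{bad.unpack1('H*')}>" }
end

p scrub_with_mutating_block
p scrub_with_pure_block
```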
* Make String#-@ not freeze receiver if called on unfrozen subclass instance | Jeremy Evans | 2019-07-02 | 1 | -0/+3
| rb_fstring behavior in this case is to freeze the receiver. I'm not sure if that should be changed, so this takes the conservative approach of duping the receiver in String#-@ before passing to rb_fstring.
| Fixes [Bug #15926]
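The visible effect can be checked with a short script. This is a sketch that assumes a Ruby containing this fix; `MyString` is a hypothetical subclass defined only for the example.

```ruby
# Hypothetical String subclass used only for this illustration.
class MyString < String; end

# Applies unary minus to an unfrozen subclass instance and reports
# what happened to the receiver and to the returned string.
def uminus_on_subclass
  original = MyString.new("abc")
  deduped  = -original
  {
    original_frozen: original.frozen?,  # receiver should stay unfrozen
    deduped_frozen:  deduped.frozen?,   # the returned string is frozen
    same_content:    original == deduped
  }
end

p uminus_on_subclass
```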
* * expand tabs. | git | 2019-06-29 | 1 | -2/+2
|
* Fixed String#grapheme_clusters with wide encodings | Nobuyoshi Nakada | 2019-06-29 | 1 | -2/+23
| * string.c (get_reg_grapheme_cluster): make regexp from properly encoded sources for wide-char encodings. [Bug #15965]
| * regparse.c (node_extended_grapheme_cluster): suppress false duplicated range warning for the time being.
* Resize capacity for fstring | John Hawthorn | 2019-06-26 | 1 | -0/+3
| When a string is frozen with #freeze, its capacity is resized to fit (if it is much larger), since we know it will no longer be mutated.
|
| > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000))
| {"type":"STRING", "class":"0x7feaf00b7bf0", "bytesize":30, "capacity":1000, "value":"...
|
| > puts ObjectSpace.dump(String.new("a"*30, capacity: 1000).freeze)
| {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "bytesize":30, "value":"...
|
| (ObjectSpace.dump doesn't show capacity if capacity is equal to bytesize)
|
| Previously, if we dedup into an fstring, using String#-@, capacity would not be reduced.
|
| > puts ObjectSpace.dump(-String.new("a"*30, capacity: 1000))
| {"type":"STRING", "class":"0x7feaf00b7bf0", "frozen":true, "fstring":true, "bytesize":30, "capacity":1000, "value":"...
|
| This commit makes rb_fstring call rb_str_resize, the same as rb_str_freeze does.
|
| Closes: https://github.com/ruby/ruby/pull/2256
* * expand tabs. | git | 2019-06-21 | 1 | -1/+1
|
* Get rid of undefined behavior | Nobuyoshi Nakada | 2019-06-21 | 1 | -1/+1
| * string.c (rb_str_sub_bang): str and repl can be same. [Bug #15946]
* New buffer for shared string | Nobuyoshi Nakada | 2019-06-19 | 1 | -0/+9
| * string.c (rb_str_init): allocate new buffer if the string is shared. [Bug #15937]
* Preserve the string content at self-copying | Nobuyoshi Nakada | 2019-06-19 | 1 | -1/+4
| * string.c (rb_str_init): preserve the embedded content when self-copying with a capacity. [Bug #15937]
* Fix memory leak | Nobuyoshi Nakada | 2019-06-18 | 1 | -1/+4
| * string.c (str_make_independent_expand): free independent buffer. [Bug #15935]
| Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
* * expand tabs. | git | 2019-06-18 | 1 | -1/+1
|
* String#b: Don't depend on dependent string | Alan Wu | 2019-06-18 | 1 | -4/+11
| Registering a string that depends on a dependent string as fstring can lead to use-after-free. See c06ddfe and 3f95620 for details.
|
| The following script triggers use-after-free on trunk, 2.4.6, 2.5.5 and 2.6.3. Credits to @wanabe for using eval as a cross-version way of registering a fstring.
|
| ```ruby
| a = ('j' * 24).b.b
| eval('', binding, a)
| p a
| 4.times { GC.start }
| p a
| ```
|
| - string.c (str_replace_shared_without_enc): when given a dependent string, depend on the root of the dependent string.
|
| [Bug #15934]
* Fix memory leak | Nobuyoshi Nakada | 2019-06-16 | 1 | -0/+7
| * string.c (str_replace_shared_without_enc): free previous buffer before replaced.
| * parse.y (gettable): make sure in advance that the `__FILE__` object shares a fstring, to get rid of replacement with the fstring later. TODO: this hack may be needed in other places.
| [Bug #15916]
| Co-Authored-By: luke-gru (Luke Gruber) <luke.gru@gmail.com>
* Symbol just represents a name | Nobuyoshi Nakada | 2019-05-14 | 1 | -2/+2
|
* str_duplicate: Don't share with a frozen shared string | Alan Wu | 2019-05-09 | 1 | -9/+7
| This is a follow up for 3f9562015e651735bfc2fdd14e8f6963b673e22a. Before this commit, it was possible to create a shared string which shares with another shared string by passing a frozen shared string to `str_duplicate`.
|
| Such string looks like:
|
| ```
|  --------                    -----------------
|  | root | ------ owns -----> | root's buffer |
|  --------                    -----------------
|      ^                             ^   ^
|  -----------                       |   |
|  | shared1 | ------ references -----   |
|  -----------                           |
|      ^                                 |
|  -----------                           |
|  | shared2 | ------ references ---------
|  -----------
| ```
|
| This is bad news because `rb_fstring(shared2)` can make `shared1` independent, which severs the reference from `shared1` to `root`:
|
| ```c
| /* from fstr_update_callback() */
| str = str_new_frozen(rb_cString, shared2);  /* can return shared1 */
| if (STR_SHARED_P(str)) { /* shared1 is also a shared string */
|     str_make_independent(str);  /* no frozen check */
| }
| ```
|
| If `shared1` was the only reference to `root`, then `root` can be reclaimed by the GC, leaving `shared2` in a corrupted state:
|
| ```
|  -----------                         --------------------
|  | shared1 | -------- owns --------> | shared1's buffer |
|  -----------                         --------------------
|       ^
|       |
|  -----------                         -------------------------
|  | shared2 | ------ references ----> | root's buffer (freed) |
|  -----------                         -------------------------
| ```
|
| Here is a reproduction script for the situation this commit fixes.
|
| ```ruby
| a = ('a' * 24).strip.freeze.strip
| -a
| p a
| 4.times { GC.start }
| p a
| ```
|
| - string.c (str_duplicate): always share with the root string when the original is a shared string.
| - test_rb_str_dup.rb: specifically test `rb_str_dup` to make sure it does not try to share with a shared string.
|
| [Bug #15792]
|
| Closes: https://github.com/ruby/ruby/pull/2159
* Revert "UTF-8 is one of byte based encodings" | Nobuyoshi Nakada | 2019-05-06 | 1 | -1/+1
| This reverts commit 5776ae347540ac19c40d146a3566a806cd176bf1.
| Mistook `max` for `min`.
* Improve documentation for String#{dump,undump} | Marcus Stollsteimer | 2019-05-05 | 1 | -4/+6
|
* * expand tabs. | git | 2019-05-03 | 1 | -2/+2
|
* Improve performance of case-conversion methods | Nobuyoshi Nakada | 2019-05-03 | 1 | -57/+160
|
* UTF-8 is one of byte based encodings | Nobuyoshi Nakada | 2019-05-03 | 1 | -2/+2
|
* * expand tabs. | git | 2019-05-02 | 1 | -2/+2
|
* Fix potential memory leak | Nobuyoshi Nakada | 2019-05-02 | 1 | -17/+32
|
* this variable is not guaranteed aligned | Urabe, Shyouhei | 2019-04-29 | 1 | -1/+1
| The lack of alignment is not a problem because we never dereference it.
* fix typo | Urabe, Shyouhei | 2019-04-29 | 1 | -1/+1
|
* Get rid of indirect sharing | Nobuyoshi Nakada | 2019-04-27 | 1 | -3/+8
| * string.c (str_duplicate): share the root shared string if the original string is already sharing, so that all shared strings refer to the root shared string directly. Indirect sharing can cause a dangling pointer. [Bug #15792]
* string.c: warn non-nil $; | nobu | 2019-04-18 | 1 | -0/+6
| * string.c (rb_str_split_m): warn use of non-nil $;.
| * string.c (rb_fs_setter): warn when set to non-nil value.
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67603 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: improve splitting into chars | nobu | 2019-04-17 | 1 | -10/+20
| * string.c (rb_str_split_m): improve splitting into chars by an empty string, without a regexp.
|
| Comparison:
|
| to_chars-1
|   built-ruby:   1273527.6 i/s
|   compare-ruby:  189423.3 i/s - 6.72x slower
|
| to_chars-10
|   built-ruby:    120993.5 i/s
|   compare-ruby:   37075.8 i/s - 3.26x slower
|
| to_chars-100
|   built-ruby:     15646.4 i/s
|   compare-ruby:    4012.1 i/s - 3.90x slower
|
| to_chars-1000
|   built-ruby:      1295.1 i/s
|   compare-ruby:     408.5 i/s - 3.17x slower
|
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
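The code path this commit speeds up is the one taken when the separator is an empty string. A minimal sketch of that call follows; the wrapper name `split_into_chars` is an assumption made for the example and is not part of Ruby.

```ruby
# Illustrative wrapper: splitting on an empty string yields one
# element per character, the same result String#chars gives.
def split_into_chars(str)
  str.split("")
end

sample = "h\u00e9llo"            # includes a multibyte character
p split_into_chars(sample)
p split_into_chars(sample) == sample.chars
```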
* string.c: [DOC] fix reference to sprintf [ci skip] | nobu | 2019-03-20 | 1 | -1/+1
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67312 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: [DOC] remove unnecessary markups [ci skip] | nobu | 2019-03-20 | 1 | -97/+98
| * string.c: remove <code> markups, which are not only unnecessary but also prevented cross-references.
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67311 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: [DOC] fix indent [ci skip] | nobu | 2019-03-20 | 1 | -43/+43
| * string.c (rb_str_crypt): fix indent not to make the whole list verbatim entirely.
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: respect the actual encoding | nobu | 2019-03-05 | 1 | -2/+3
| * string.c (rb_enc_str_coderange): respect the actual encoding if a BOM is present, and scan for the actual code range. [ruby-core:91662] [Bug #15635]
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67167 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (chopped_length): early return for empty strings | nobu | 2019-02-07 | 1 | -1/+1
| [Bug #11391]
| From: Franck Verrot <franck@verrot.fr>
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Add more example of `String#dump` | kazu | 2019-01-22 | 1 | -2/+3
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Improvements to documentation. | samuel | 2019-01-21 | 1 | -4/+4
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66897 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c (rb_str_dump): Fix the rdoc | mame | 2019-01-21 | 1 | -1/+4
| * Officially state that String#dump is intended for round-trip.
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66894 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
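The round trip mentioned here pairs `String#dump` with `String#undump`. The sketch below illustrates it; the method name `dump_round_trip` and the sample string are assumptions made for the example.

```ruby
# Illustrative helper: dump a string to a printable ASCII form and
# restore it with undump.
def dump_round_trip(str)
  dumped = str.dump
  {
    dumped:     dumped,              # quoted, escaped representation
    ascii_only: dumped.ascii_only?,  # non-ASCII characters are escaped
    restored:   dumped.undump        # should equal the original string
  }
end

original = "tab\there \u00e9\n"
result = dump_round_trip(original)
puts result[:dumped]
p result[:restored] == original
```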
* Use `&` instead of `modulo` | nobu | 2019-01-15 | 1 | -1/+1
| git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66830 b2dd03c8-39d4-4d8f-98ff-823fe69b080e