aboutsummaryrefslogtreecommitdiffstats
path: root/string.c
Commit message (Collapse)AuthorAgeFilesLines
* string.c: avoid signed integer overflowtopic/string-integer-overflowKazuki Yamaguchi2016-09-131-48/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The behavior on signed integer overflow is undefined. On platform with sizeof(long)==4, it's fairly easy that 'len + termlen' overflows, where len is the string length and termlen is the terminator length. So, prevent the integer overflow by avoiding adding to a string length, or casting to size_t before adding where the total size is passed to {RE,}ALLOC*(). * string.c (STR_HEAP_SIZE, RESIZE_CAPA_TERM, str_new0, rb_str_buf_new, str_shared_replace, rb_str_init, str_make_independent_expand, rb_str_resize): Avoid overflow by casting the length to size_t. size_t should be able to represent LONG_MAX+termlen. * string.c (rb_str_modify_expand): Check that the new length is in the range of long before resizing. Also refactor to use RESIZE_CAPA_TERM macro. * string.c (str_buf_cat): Fix so that it does not create a negative length String. Also fix the condition for 'string sizes too big', the total length can be up to LONG_MAX. * string.c (rb_str_plus): Check the resulting String length does not exceed LONG_MAX. * string.c (rb_str_dump): Fix integer overflow. The dump result will be longer then the original String.
* string.c: rename STR_EMBEDABLE_P to STR_EMBEDDABLE_Prhe2016-09-131-15/+17
| | | | | | | * string.c (STR_EMBEDDABLE_P): Renamed from STR_EMBEDABLE_P(). And use it in more places. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56155 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: STR_EMBEDABLE_Pnobu2016-09-131-3/+6
| | | | | | | | * string.c (STR_EMBEDABLE_P): extract the predicate macro to tell if the given length is capable in an embedded string, and fix possible integer overflow. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56151 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: fix integer overflownobu2016-09-131-1/+2
| | | | | | | | * string.c (rb_str_change_terminator_length): fix integer overflow in the case growing the terminator length and the string length is around LONG_MAX. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56149 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: fix buffer overflow check condition in rb_str_set_len()rhe2016-09-131-1/+1
| | | | | | | | | | | | * string.c (rb_str_set_len): The buffer overflow check is wrong. The space for termlen is allocated outside the capacity returned by rb_str_capacity(). This fixes r41920 ("string.c: multi-byte terminator", 2013-07-11). [ruby-core:77257] [Bug #12757] * test/-ext-/string/test_set_len.rb (test_capacity_equals_to_new_size): Test for this change. Applying only the test will trigger [BUG]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56148 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* replace fixnum by integer in documents.akr2016-09-081-16/+16
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56102 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* multiple argumentsnobu2016-08-271-16/+50
| | | | | | | | * array.c (rb_ary_concat_multi): take multiple arguments. based on the patch by Satoru Horie. [Feature #12333] * string.c (rb_str_concat_multi, rb_str_prepend_multi): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56021 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: rb_fs_setternobu2016-08-231-3/+15
| | | | | | | * string.c (rb_fs_setter): check and convert $; value at assignment. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55990 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: $; name in error messagenobu2016-08-221-6/+19
| | | | | | | * string.c (rb_str_split_m): show $; name in error message when it is a wrong object. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55986 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (String#downcase), NEWS: Mentioned that case mapping for allduerst2016-07-301-1/+1
| | | | | | | of ISO-8859-1~16 is now supported. [ci skip] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55777 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* rb_funcallvnobu2016-07-291-1/+1
| | | | | | | * *.c: rename rb_funcall2 to rb_funcallv, except for extensions which are/will be/may be gems. [Fix GH-1406] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55773 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * vm_core.h: revisit the structure of frame, block and env.ko12016-07-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | [Bug #12628] This patch introduce many changes. * Introduce concept of "Block Handler (BH)" to represent passed blocks. * move rb_control_frame_t::flag to ep[0] (as a special local variable). This flags represents not only frame type, but also env flags such as escaped. * rename `rb_block_t` to `struct rb_block`. * Make Proc, Binding and RubyVM::Env objects wb-protected. Check [Bug #12628] for more details. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55766 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Fix Issues reported by PVS-Studio static analyzernobu2016-07-221-1/+1
| | | | | | | | | | * vm.c (vm_set_main_stack): remove unnecessary check. toplevel binding must be initialized. [Bug #12611] (N1) * win32/win32.c (w32_symlink): fix return type. [Bug #12611] (N3) * string.c (rb_str_split_m): simplify the condition. [Bug #12611](N4) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55729 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (String#dump): Change escaping of non-ASCII characters induerst2016-07-221-4/+11
| | | | | | | | | UTF-8 to use upper-case four-digit hexadecimal escapes without braces where possible [Feature #12419]. * test/ruby/test_string.rb (test_dump): Add tests for above. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55728 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (str_buf_cat): Fix potential interger overflow of capa.ngoto2016-07-151-2/+3
| | | | | | | In addition, termlen is used instead of +1. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55692 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (str_buf_cat): Fix capa size for embed string.ngoto2016-07-151-1/+1
| | | | | | | Fix bug in r55547. [Bug #12536] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55691 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: reduce malloc overhead for default buffer sizenormal2016-07-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * string.c (STR_BUF_MIN_SIZE): reduce from 128 to 127 [ruby-core:76371] [Feature #12025] * string.c (rb_str_buf_new): adjust for above reduction From Jeremy Evans <code@jeremyevans.net>: This changes the minimum buffer size for string buffers from 128 to 127. The underlying C buffer is always 1 more than the ruby buffer, so this changes the actual amount of memory used for the minimum string buffer from 129 to 128. This makes it much easier on the malloc implementation, as evidenced by the following code (note that time -l is used here, but Linux systems may need time -v). $ cat bench_mem.rb i = ARGV.first.to_i Array.new(1000000){" " * i} $ /usr/bin/time -l ruby bench_mem.rb 128 3.10 real 2.19 user 0.46 sys 289080 maximum resident set size 72673 minor page faults 13 block output operations 29 voluntary context switches $ /usr/bin/time -l ruby bench_mem.rb 127 2.64 real 2.09 user 0.27 sys 162720 maximum resident set size 40966 minor page faults 2 block output operations 4 voluntary context switches To try to ensure a power-of-2 growth, when a ruby string capacity needs to be increased, after doubling the capacity, add one. This ensures the ruby capacity will be odd, which means actual amount of memory used will be even, which is probably better than the current case of the ruby capacity being even and the actual amount of memory used being odd. A very similar patch was proposed 4 years ago in feature #5875. It ended up being rejected, because no performance increase was shown. One reason for that is that ruby does not use STR_BUF_MIN_SIZE unless rb_str_buf_new is called, and that previously did not have a ruby API, only a C API, so unless you were using a C extension that called it, there would be no performance increase. With the recently proposed feature #12024, String.buffer is added, which is a ruby API for creating string buffers. Using String.buffer(100) wastes much less memory with this patch, as the malloc implementation can more easily deal with the power-of-2 sized memory usage. As measured above, memory usage is 44% less, and performance is 17% better. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55686 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (rb_str_change_terminator_length): New function to changengoto2016-07-051-24/+43
| | | | | | | | | | | | | | | | | | | | | | | | | termlen and resize heap for the terminator. This is split from rb_str_fill_terminator (str_fill_term) because filling terminator and changing terminator length are different things. [Bug #12536] * internal.h: declaration for rb_str_change_terminator_length. * string.c (str_fill_term): Simplify only to zero-fill the terminator. For non-shared strings, it assumes that (capa + termlen) bytes of heap is allocated. This partially reverts r55557. * encoding.c (rb_enc_associate_index): rb_str_change_terminator_length is used, and it should be called whenever the termlen is changed. * string.c (str_capacity): New static function to return capacity of a string with the given termlen, because the termlen may sometimes be different from TERM_LEN(str) especially during changing termlen or filling terminator with specific termlen. * string.c (rb_str_capacity): Use str_capacity. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Partially reverts r55547 and r55555.ngoto2016-07-011-14/+6
| | | | | | | | ChangeLog about the reverted changes are also deleted in this file. [Bug #12536] [ruby-dev:49699] [ruby-dev:49702] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55559 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (str_fill_term): When termlen increases, re-allocationngoto2016-07-011-3/+22
| | | | | | | | | of memory for termlen should always be needed. In this fix, if possible, decrease capa instead of realloc. [Bug #12536] [ruby-dev:49699] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55557 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Specify termlen as far as possible.ngoto2016-07-011-6/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Additional fix for [Bug #12536] [ruby-dev:49699]. * string.c (rb_usascii_str_new, rb_utf8_str_new): Specify termlen which is apparently 1 for the encodings. * string.c (str_new0_cstr): New static function to create a String object from a C string with specifying termlen. * string.c (rb_usascii_str_new_cstr, rb_utf8_str_new_cstr): Specify termlen by using new str_new0_cstr(). * string.c (str_new_static): Specify termlen from the given encoding when creating a new String object is needed. * string.c (rb_tainted_str_new_with_enc): New function to create a tainted String object with the given encoding. This means that the termlen is correctly specified. Curretly static function. The function name might be renamed to rb_tainted_enc_str_new or rb_enc_tainted_str_new. * string.c (rb_external_str_new_with_enc): Use encoding by using the above rb_tainted_str_new_with_enc(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55555 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (rb_str_subseq, str_substr): When RSTRING_EMBED_LEN_MAXngoto2016-07-011-2/+2
| | | | | | | | | is used, TERM_LEN(str) should be considered with it because embedded strings are also processed by TERM_FILL. Additional fix for [Bug #12536] [ruby-dev:49699]. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55552 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: Add parentheses to avoid C source code ambiguity. [Bug #12536]ngoto2016-07-011-7/+7
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Fix memory corruptions when using UTF-16/32 strings.ngoto2016-06-301-17/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [Bug #12536] [ruby-dev:49699] * string.c (TERM_LEN_MAX): Macro for the longest TERM_FILL length, the same as largest value of rb_enc_mbminlen(enc) among encodings. * string.c (str_new, rb_str_buf_new, str_shared_replace): Allocate +TERM_LEN_MAX bytes instead of +1. This change may increase memory usage. * string.c (rb_str_new_with_class): Use TERM_LEN of the "obj". * string.c (rb_str_plus, rb_str_justify): Use str_new0 which is aware of termlen. * string.c (str_shared_replace): Copy +termlen bytes instead of +1. * string.c (rb_str_times): termlen should not be included in capa. * string.c (RESIZE_CAPA_TERM): When using RSTRING_EMBED_LEN_MAX, termlen should be counted with it because embedded strings are also processed by TERM_FILL. * string.c (rb_str_capacity, str_shared_replace, str_buf_cat): ditto. * string.c (rb_str_drop_bytes, rb_str_setbyte, str_byte_substr): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55547 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* CASEMAP_DEBUG [ci skip]nobu2016-06-211-6/+15
| | | | | | | | * string.c (rb_str_casemap, rb_str_ascii_casemap): move debug/tuning messages under a preprocessor condition, CASEMAP_DEBUG. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Fix garbage allocationnobu2016-06-211-2/+4
| | | | | | | | * string.c (rb_str_casemap): do not put code with side effects inside RSTRING_PTR() macro which evaluates the argument multiple times. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55481 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (rb_str_casemap): fix memory leak.naruse2016-06-211-1/+3
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55480 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (rb_str_casemap): int is too small for string size.naruse2016-06-211-2/+2
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55479 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: adjust buffer sizenobu2016-06-161-6/+4
| | | | | | | * string.c (tr_trans): adjust buffer size by processed and rest lengths, instead of doubling repeatedly. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55428 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: fix terminatornobu2016-06-161-8/+10
| | | | | | | * string.c (tr_trans): consider terminator length and fix heap overflow. reported by Guido Vranken <guido AT guidovranken.nl>. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55427 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Fix typo in string.c [ci skip]nobu2016-06-111-1/+1
| | | | | | | * string.c (rb_str_oct): [DOC] fix typo, hornored -> honored. [Fix GH-1379] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55378 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * enc/iso_8859_1.c: Implement non-ASCII case mapping.duerst2016-06-111-1/+1
| | | | | | | | * test/ruby/enc/test_case_comprehensive.rb: Tests for above. * string.c: Add iso-8859-1 to supported encodings. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55373 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Special-case :ascii option in rb_str_capitalize_bang andduerst2016-06-101-2/+8
| | | | | | | rb_str_swapcase_bang. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55361 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Special-case :ascii option in rb_str_upcase_bang (retry).duerst2016-06-101-6/+5
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* hash.c: ensure NUL-terminated for ENVnobu2016-06-101-2/+2
| | | | | | | | | * hash.c (get_env_cstr): ensure NUL-terminated. [ruby-dev:49655] [Bug #12475] * string.c (rb_str_fill_terminator): return the pointer to the NUL-terminated content. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55345 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c (rb_str_ascii_casemap): fix compile error.kazu2016-06-081-1/+1
| | | | | | error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55332 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Revert previous commit (possibility of endless loop).duerst2016-06-081-5/+6
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Special-case :ascii option in rb_str_upcase_bang.duerst2016-06-081-6/+5
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55330 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: New static function rb_str_ascii_casemap; special-casingduerst2016-06-081-8/+32
| | | | | | | | | | :ascii option in rb_str_upcase_bang and rb_str_downcase_bang. * regenc.c: Fix a bug (wrong use of unnecessary slack at end of string). * regenc.h -> include/ruby/oniguruma.h: Move declaration of onigenc_ascii_only_case_map so that it is visible in string.c. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55329 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (rb_str_upcase_bang, rb_str_capitalize_bang,duerst2016-06-071-82/+11
| | | | | | | rb_str_swapcase_bang): Switch to use primitive. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c (rb_str_downcase_bang): Switch to use primitive except ifduerst2016-06-071-28/+3
| | | | | | | conversion can be done ASCII-only. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Added UTF-16BE/LE and UTF-32BE/LE to supported encodingsduerst2016-06-061-6/+7
| | | | | | | | | for Unicode case mapping. * test/ruby/enc/test_case_comprehensive.rb: Tests for above functionality; fixed an encoding issue in assertion error message. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55296 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c Change rb_str_casemap to use encoding primitiveduerst2016-06-061-12/+1
| | | | | | | case_map instead of directly calling onigenc_unicode_case_map. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55293 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Remove :lithuanian guard for Unicode case mapping.duerst2016-06-051-17/+21
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55277 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* crypt.h: remove initializednobu2016-06-041-0/+2
| | | | | | | | | | | | * missing/crypt.h (struct crypt_data): remove unnecessary member "initialized". * missing/crypt.c (des_setkey_r): nothing to be initialized in crypt_data. * configure.in (struct crypt_data): check for "initialized" in struct crypt_data, which may be only in glibc, and isn't on AIX at least. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * string.c: Raise ArgumentError when invalid string is detected induerst2016-06-021-7/+19
| | | | | | | | | | | case mapping methods. * enc/unicode.c: Check for invalid string and signal with negative length value. * test/ruby/enc/test_case_mapping.rb: Add tests for above. * test/ruby/test_m17n_comb.rb: Add a message to clarify test failure. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55253 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* string.c: fallback to crypt_rnobu2016-06-011-2/+5
| | | | | | | * string.c: prefer crypt_r to crypt iff system crypt nor crypt_r are not provided. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55250 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* use system cryptnobu2016-06-011-0/+21
| | | | | | | | | * configure.in: revert r55237. replace crypt, not crypt_r, and check if crypt is broken more. * missing/crypt.c: move crypt_r.c * string.c (rb_str_crypt): use crypt_r if provided by the system. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55245 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* use crypt_rnobu2016-06-011-13/+5
| | | | | | * string.c (rb_str_crypt): use reentrant crypt_r. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55237 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Revert r55225naruse2016-05-311-19/+17
| | | | | | | | | | Run test-all before large commit: "* string.c: Activate full Unicode case mapping for UTF-8 by removing" This reverts commit 3fb0fcd1e881c1f6dd74db73a64e8623208acb77. http://rubyci.s3.amazonaws.com/centos5-64/ruby-trunk/log/20160531T013303Z.fail.html.gz git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55226 b2dd03c8-39d4-4d8f-98ff-823fe69b080e