path: root/enc/unicode.c
Commit message | Author | Age | Files | Lines
* * reg*.c: Merge Onigmo 5.15.0 38a870960aa7370051a3544 | naruse | 2014-09-15 | 1 | -1/+0
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@47598 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* constify rb_encoding and OnigEncoding | nobu | 2014-06-01 | 1 | -1/+1
    * include/ruby/encoding.h: constify `rb_encoding` arguments.
    * include/ruby/oniguruma.h: constify `OnigEncoding` arguments.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46309 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* unicode.c: no initialization | nobu | 2014-05-30 | 1 | -18/+0
    * enc/unicode.c (init_case_fold_table): no longer need to initialize tables at runtime.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46273 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* case-folding.rb: perfect hash for case unfolding3 | nobu | 2014-05-30 | 1 | -44/+10
    * enc/unicode/case-folding.rb (lookup_hash): make perfect hash to lookup case unfolding table 3.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* case-folding.rb: perfect hash for case unfolding2 | nobu | 2014-05-30 | 1 | -39/+15
    * enc/unicode/case-folding.rb (lookup_hash): make perfect hash to lookup case unfolding table 2.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46271 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* case-folding.rb: perfect hash for case unfolding1 | nobu | 2014-05-30 | 1 | -19/+1
    * enc/unicode/case-folding.rb (lookup_hash): make perfect hash to lookup case unfolding table 1.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46270 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* case-folding.rb: perfect hash for case folding | nobu | 2014-05-30 | 1 | -19/+15
    * enc/unicode/case-folding.rb (lookup_hash): make perfect hash to lookup case folding table.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* case-folding.rb: merge tables | nobu | 2014-05-30 | 1 | -23/+7
    * enc/unicode/case-folding.rb (print_table): merge non-locale and locale tables, and reduce initializing loops.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46268 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/unicode.c: lookup functions | nobu | 2014-05-23 | 1 | -15/+51
    * enc/unicode.c (onigenc_unicode_{fold,unfold{1,2,3}}_lookup): abstract lookup functions.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46057 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* enc/unicode.c: constify | nobu | 2014-05-23 | 1 | -7/+13
    * enc/unicode.c (code{2,3}_{cmp,hash}): constify and adjust argument types.
    * enc/unicode.c (onigenc_unicode_fold_lookup): constify.
    * enc/unicode.c (onigenc_unicode_apply_all_case_fold): ditto.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46056 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * regparse.c (is_onechar_cclass): optimize character class | naruse | 2012-02-29 | 1 | -4/+4
    Merge Onigmo 27278c12e6674043cc8affca6507e20e119a86ee.
    * regparse.c (is_onechar_cclass): [bug] unexpected match occurs when a char class contains no char
    * enc/unicode.c (init_case_fold_table): define the sizes of case folding tables in casefold.h
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34860 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * Merge Onigmo-5.13.1. [ruby-dev:45057] [Feature #5820] | naruse | 2012-02-17 | 1 | -1935/+6
    https://github.com/k-takata/Onigmo
    cp reg{comp,enc,error,exec,parse,syntax}.c reg{enc,int,parse}.h
    cp oniguruma.h
    cp tool/enc-unicode.rb
    cp -r enc/
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@34663 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * enc/unicode.c (PROPERTY_NAME_MAX_SIZE): +1. | naruse | 2011-11-20 | 1 | -1/+1
    reported by Ken Takata. [ruby-dev:44894] [Bug #5652]
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@33797 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * remove trailing spaces. | nobu | 2011-05-15 | 1 | -4/+4
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@31573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * enc/unicode.c (onigenc_unicode_property_name_to_ctype): | naruse | 2010-10-03 | 1 | -1/+0
    remove useless assignment.
    * vm.c (vm_make_proc_from_block): ditto.
    * variable.c (rb_ivar_count): ditto.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29405 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h: updated to follow Oniguruma 5.9.2. | matz | 2010-03-01 | 1 | -1/+1
    * re.c (make_regexp): use onig_new() instead of onig_alloc_init().
    * re.c (rb_reg_to_s): ditto.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26791 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * unicode.c (onigenc_unicode_property_name_to_ctype): | naruse | 2009-09-10 | 1 | -4/+4
    ignore case of properties.
    * tool/enc-unicode.rb: downcase properties list.
    * enc/unicode/name2ctype.h, enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: follow above.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24836 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/st.h (st_hash_func): use st_index_t. | nobu | 2009-09-08 | 1 | -4/+4
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24792 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * unicode.c (PROPERTY_NAME_MAX_SIZE): use MAX_WORD_LENGTH. | naruse | 2009-08-26 | 1 | -1/+1
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24677 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * enc/unicode.c (onigenc_unicode_mbc_case_fold): balanced braces. | nobu | 2009-08-26 | 1 | -2/+3
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24658 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Update Oniguruma's UnicodeData to 5.1. | naruse | 2009-08-25 | 1 | -8575/+1
    * tool/enc-unicode.rb: added to generate name2ctype.kwd.
      contributed by Run Paint Run Run [ruby-core:24775]
      use as follows:
        ruby19 tool/enc-unicode.rb enc/unicode/UnicodeData.txt \
          enc/unicode/Scripts.txt > enc/unicode/name2ctype.kwd
    * enc/unicode.c (CodeRanges): move definitions to name2ctype.h.
    * enc/unicode/name2ctype.h.blt, enc/unicode/name2ctype.kwd, enc/unicode/name2ctype.src: updated to v5.1.
    * enc/unicode/UnicodeData.txt, enc/unicode/Scripts.txt: added v5.1.
    * Makefile.in: add rule to generate name2ctype.kwd from UnicodeData.txt and Scripts.txt.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24651 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * enc/unicode/name2ctype.h: split from enc/unicode.c and made a | nobu | 2009-08-21 | 1 | -150/+4
    perfect hash.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24613 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * enc/unicode.c (CodeRanges): initialized statically. | nobu | 2009-08-19 | 1 | -163/+133
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24582 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * grapheme cluster implementation reverted. [ruby-dev:36375] | akr | 2008-09-18 | 1 | -10/+10
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19417 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (OnigEncodingTypeST): add precise_ret | akr | 2008-09-16 | 1 | -10/+10
    argument for mbc_to_code.
      (ONIGENC_MBC_TO_CODE): provide NULL for precise_ret.
      (ONIGENC_MBC_PRECISE_CODEPOINT): defined.
    * include/ruby/encoding.h (rb_enc_mbc_precise_codepoint): defined.
    * regenc.h (onigenc_single_byte_mbc_to_code): precise_ret argument added.
      (onigenc_mbn_mbc_to_code): ditto.
    * regenc.c (onigenc_single_byte_mbc_to_code): precise_ret argument added.
      (onigenc_mbn_mbc_to_code): ditto.
    * string.c (count_utf8_lead_bytes_with_word): removed.
      (str_utf8_nth): removed.
      (str_utf8_offset): removed.
      (str_strlen): UTF-8 codepoint oriented optimization removed.
      (rb_str_substr): ditto.
      (enc_succ_char): use rb_enc_mbc_precise_codepoint.
      (enc_pred_char): ditto.
      (rb_str_succ): ditto.
    * encoding.c (rb_enc_ascget): check length with rb_enc_mbc_precise_codepoint.
      (rb_enc_codepoint): use rb_enc_mbc_precise_codepoint.
    * regexec.c (string_cmp_ic): add text_end argument.
      (match_at): check end of character after exact string matches.
    * enc/utf_8.c (graphme_table): defined for extended grapheme cluster boundary.
      (grapheme_cmp): defined.
      (get_grapheme_properties): defined.
      (grapheme_boundary_p): defined.
      (MAX_BYTES_LENGTH): defined.
      (comb_char_enc_len): defined.
      (mbc_to_code0): extracted from mbc_to_code.
      (mbc_to_code): use mbc_to_code0.
      (left_adjust_combchar_head): defined.
      (utf_8): use an extended grapheme cluster as a unit.
    * enc/unicode.c (onigenc_unicode_mbc_case_fold): use ONIGENC_MBC_PRECISE_CODEPOINT to extract codepoints.
      (onigenc_unicode_get_case_fold_codes_by_str): ditto.
    * enc/euc_jp.c (mbc_to_code): follow mbc_to_code field change. use onigenc_mbn_mbc_to_code.
    * enc/shift_jis.c (mbc_to_code): ditto.
    * enc/emacs_mule.c (mbc_to_code): ditto.
    * enc/gbk.c (gbk_mbc_to_code): follow mbc_to_code field and onigenc_mbn_mbc_to_code change.
    * enc/cp949.c (cp949_mbc_to_code): ditto.
    * enc/big5.c (big5_mbc_to_code): ditto.
    * enc/euc_tw.c (euctw_mbc_to_code): ditto.
    * enc/euc_kr.c (euckr_mbc_to_code): ditto.
    * enc/gb18030.c (gb18030_mbc_to_code): ditto.
    * enc/utf_32be.c (utf32be_mbc_to_code): follow mbc_to_code field change.
    * enc/utf_16be.c (utf16be_mbc_to_code): ditto.
    * enc/utf_32le.c (utf32le_mbc_to_code): ditto.
    * enc/utf_16le.c (utf16le_mbc_to_code): ditto.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19389 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * enc/euc_jp.c (property_name_to_ctype): core dumped when sizeof(int) | mame | 2008-06-17 | 1 | -2/+3
    differs from sizeof(long).
    * enc/shift_jis.c (property_name_to_ctype): ditto.
    * enc/unicode.c (onigenc_unicode_property_name_to_ctype): ditto.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17381 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * enc/koi8_u.c: added. | naruse | 2008-01-19 | 1 | -2/+2
    * regenc.c, enc/utf_8.c, enc/unicode.c, enc/gb18030.c: add ARG_UNUSED.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15130 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * regenc.c (onigenc_strlen_null, onigenc_str_bytelen_null): suppressed | nobu | 2008-01-08 | 1 | -1/+2
    warnings.
    * regenc.h, enc/unicode.c (onigenc_unicode_ctype_code_range): added encoding argument.
    * enc/utf{16,32}_{be,le}.c: added init functions.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14946 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h: Oniguruma 1.9.1 merged. | matz | 2008-01-03 | 1 | -22/+34
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14874 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * Makefile.in, */Makefile.sub (VPATH): add enc directory. | nobu | 2007-10-10 | 1 | -0/+11345
    * common.mk (ENCOBJS): encoding objects.
    * enc: directory for encodings.
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13675 b2dd03c8-39d4-4d8f-98ff-823fe69b080e