aboutsummaryrefslogtreecommitdiffstats
path: root/include/ruby/oniguruma.h
Commit message (Collapse)AuthorAgeFilesLines
* \d, \s and \w are now non Unicode class. [ruby-dev:39026]naruse2009-08-151-0/+8
| | | | | | | | | | | | | | | | | | | | * include/ruby/oniguruma.h (ONIGENC_CTYPE_SPECIAL_MASK): added. (ONIGENC_CTYPE_D): ditto. (ONIGENC_CTYPE_S): ditto. (ONIGENC_CTYPE_W): ditto. * regparse.c: \d, \s and \w are now non Unicode class. [ruby-dev:39026] (fetch_token_in_cc): use ONIGENC_CTYPE_[DSW] for \d/\s/\w. (fetch_token): ditto. (add_ctype_to_cc): add routines for ONIGENC_CTYPE_[DSW]. (parse_exp): ditto. * test/ruby/test_regexp.rb (TestRegexp#test_char_class): add tests for above. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24544 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Warn duplicated characters in character class of regexp. [ruby-core:24593]naruse2009-08-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | * include/ruby/oniguruma.h (ONIG_SYN_WARN_CC_DUP): defined. * regparse.h (ScanEnv): add warnings_flag. * regparse.c (CC_DUP_WARN): defined for warn duplicated characters in character class of regexp. [ruby-core:24593] (add_code_range_to_buf): add CC_DUP_WARN. (next_state_val): add CC_DUP_WARN. (OnigSyntaxRuby): add ONIG_SYN_WARN_CC_DUP. (SET_ALL_MULTI_BYTE_RANGE): add env to arguments. (add_code_range): ditto. (add_code_range_to_buf): ditto. (not_code_range_buf): ditto. (or_code_range_buf): ditto. (and_code_range1): ditto. (and_code_range_buf): ditto. (and_cclass): ditto. (or_cclass): ditto. (add_ctype_to_cc_by_range): ditto. (add_ctype_to_cc): ditto. (parse_char_class): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24387 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h, include/ruby/re.h, re.c, regcomp.c,nobu2009-06-301-3/+3
| | | | | | | regenc.c, regerror.c, regexec.c, regint.h, regparse.c: use long. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23907 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * id.h, include/ruby/{intern,oniguruma}.h, regenc.h, regparse.h,nobu2008-12-091-2/+2
| | | | | | | template/*.tmpl: removed trailing garbage spaces. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@20596 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * grapheme cluster implementation reverted. [ruby-dev:36375]akr2008-09-181-3/+2
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19417 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (OnigEncodingTypeST): add precise_retakr2008-09-161-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | argument for mbc_to_code. (ONIGENC_MBC_TO_CODE): provide NULL for precise_ret. (ONIGENC_MBC_PRECISE_CODEPOINT): defined. * include/ruby/encoding.h (rb_enc_mbc_precise_codepoint): defined. * regenc.h (onigenc_single_byte_mbc_to_code): precise_ret argument added. (onigenc_mbn_mbc_to_code): ditto. * regenc.c (onigenc_single_byte_mbc_to_code): precise_ret argument added. (onigenc_mbn_mbc_to_code): ditto. * string.c (count_utf8_lead_bytes_with_word): removed. (str_utf8_nth): removed. (str_utf8_offset): removed. (str_strlen): UTF-8 codepoint oriented optimization removed. (rb_str_substr): ditto. (enc_succ_char): use rb_enc_mbc_precise_codepoint. (enc_pred_char): ditto. (rb_str_succ): ditto. * encoding.c (rb_enc_ascget): check length with rb_enc_mbc_precise_codepoint. (rb_enc_codepoint): use rb_enc_mbc_precise_codepoint. * regexec.c (string_cmp_ic): add text_end argument. (match_at): check end of character after exact string matches. * enc/utf_8.c (graphme_table): defined for extended graphme cluster boundary. (grapheme_cmp): defined. (get_grapheme_properties): defined. (grapheme_boundary_p): defined. (MAX_BYTES_LENGTH): defined. (comb_char_enc_len): defined. (mbc_to_code0): extracted from mbc_to_code. (mbc_to_code): use mbc_to_code0. (left_adjust_combchar_head): defined. (utf_8): use a extended graphme cluster as a unit. * enc/unicode.c (onigenc_unicode_mbc_case_fold): use ONIGENC_MBC_PRECISE_CODEPOINT to extract codepoints. (onigenc_unicode_get_case_fold_codes_by_str): ditto. * enc/euc_jp.c (mbc_to_code): follow mbc_to_code field change. use onigenc_mbn_mbc_to_code. * enc/shift_jis.c (mbc_to_code): ditto. * enc/emacs_mule.c (mbc_to_code): ditto. * enc/gbk.c (gbk_mbc_to_code): follow mbc_to_code field and onigenc_mbn_mbc_to_code change. * enc/cp949.c (cp949_mbc_to_code): ditto. * enc/big5.c (big5_mbc_to_code): ditto. * enc/euc_tw.c (euctw_mbc_to_code): ditto. * enc/euc_kr.c (euckr_mbc_to_code): ditto. * enc/gb18030.c (gb18030_mbc_to_code): ditto. * enc/utf_32be.c (utf32be_mbc_to_code): follow mbc_to_code field change. * enc/utf_16be.c (utf16be_mbc_to_code): ditto. * enc/utf_32le.c (utf32le_mbc_to_code): ditto. * enc/utf_16le.c (utf16le_mbc_to_code): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19389 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (OnigEncodingTypeST): add end argument forakr2008-09-131-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | left_adjust_char_head. (ONIGENC_LEFT_ADJUST_CHAR_HEAD): add end argument. (onigenc_get_left_adjust_char_head): ditto. * include/ruby/encoding.h (rb_enc_left_char_head): add end argument. * regenc.h (onigenc_single_byte_left_adjust_char_head): ditto. * regenc.c (onigenc_get_right_adjust_char_head): follow the interface change. (onigenc_get_right_adjust_char_head_with_prev): ditto. (onigenc_get_prev_char_head): ditto. (onigenc_step_back): ditto. (onigenc_get_left_adjust_char_head): ditto. (onigenc_single_byte_code_to_mbc): ditto. * re.c: ditto. * string.c: ditto. * io.c: ditto. * regexec.c: ditto. * enc/euc_jp.c: ditto. * enc/cp949.c: ditto. * enc/shift_jis.c: ditto. * enc/gbk.c: ditto. * enc/big5.c: ditto. * enc/euc_tw.c: ditto. * enc/euc_kr.c: ditto. * enc/emacs_mule.c: ditto. * enc/gb18030.c: ditto. * enc/utf_8.c: ditto. * enc/utf_16le.c: ditto. * enc/utf_16be.c: ditto. * enc/utf_32le.c: ditto. * enc/utf_32be.c: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19334 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (ONIGENC_STEP_BACK): add end argument.akr2008-09-131-3/+3
| | | | | | | | | | | | (onigenc_step_back): ditto. * regenc.c (onigenc_step_back): add end argument. * regexec.c: follow the interface change. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19333 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (onigenc_get_prev_char_head): add endakr2008-09-131-1/+1
| | | | | | | | | | | | | | | | | | | | argument. * include/ruby/encoding.h (rb_enc_prev_char): ditto. * regenc.c (onigenc_get_prev_char_head): add end argument. * regparse.c: follow the interface change. * regexec.c: ditto. * string.c: ditto. * parse.y: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19332 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.hakr2008-09-131-1/+1
| | | | | | | | | | | | | (onigenc_get_right_adjust_char_head_with_prev): add end argument. * regenc.c (onigenc_get_right_adjust_char_head_with_prev): use end argument. * regexec.c (forward_search_range): follow the interface change. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (onigenc_get_right_adjust_char_head): addakr2008-09-131-1/+1
| | | | | | | | | | | | | | | | | | | end argument. * include/ruby/encoding.h (rb_enc_right_char_head): add end argument. * regenc.c (onigenc_get_right_adjust_char_head): use end argument. * re.c (rb_reg_adjust_startpos): follow the interface change. * string.c (rb_str_index): ditto. * regexec.c (backward_search_range): ditto. (onig_search): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19330 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (OnigCodePoint): unsigned long to unsigned int.naruse2008-09-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * include/ruby/encoding.h (rb_enc_codepoint): ditto. * encoding.c (rb_enc_codepoint): signed int to unsigned int. * encoding.c (rb_enc_ascget): ditto. * string.c (rb_str_casecmp): ditto. * string.c (enc_succ_alnum_char): ditto. * string.c (rb_str_inspect): ditto. * string.c (rb_str_upcase_bang): ditto. * string.c (rb_str_downcase_bang): ditto. * string.c (rb_str_capitalize_bang): ditto. * string.c (rb_str_swapcase_bang): ditto. * string.c (struct tr): ditto. * string.c (trnext): ditto. * string.c (tr_trans): ditto. * string.c (tr_setup_table): ditto. * string.c (tr_find): ditto. * string.c (rb_str_delete_bang): ditto. * string.c (rb_str_squeeze_bang): ditto. * string.c (rb_str_count): ditto. * string.c (rb_str_split_m): ditto. * string.c (rb_str_each_line): ditto. * string.c (rb_str_lstrip_bang): ditto. * string.c (rb_str_rstrip_bang): ditto. * string.c (rb_str_intern): ditto. * dir.c (char_casecmp): ditto. * sprintf.c (rb_str_format): ditto. * enc/emacs_mule.c (mbc_to_code): to be 32bit clean. * enc/emacs_mule.c (code_to_mbc): ditto. * enc/gb18030.c (mbc_to_code): ditto. * enc/gb18030.c (code_to_mbc): ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@19295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (OnigEncoding): removed auxiliary_data.nobu2008-07-041-1/+0
| | | | | | | | | | | | | * include/ruby/encoding.h (ENC_DUMMY_P): moved dummy encoding flag to rb_encoding from Encoding instance. * encoding.c (rb_encoding_list): list of Encoding instances. * encoding.c (struct rb_encoding_entry): moved base encoding from instance variable. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17875 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * regexec.c (stack_double): use MatchStackLimitSize atomically.nobu2008-07-011-16/+16
| | | | | | | | | | * regparse.c (onig_free_shared_cclass_table): OnigTypeCClassTable needs atomicity * regsyntax.c: constified all predefined OnigSyntaxTypes. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@17765 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h: precise mbclen API redesigned to avoidakr2008-01-271-19/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inline functions. (onigenc_mbclen_charfound): removed. (onigenc_mbclen_needmore): removed. (onigenc_mbclen_recover): removed. (ONIGENC_MBCLEN_CHARFOUND): removed. (ONIGENC_MBCLEN_CHARFOUND_P): defined. (ONIGENC_MBCLEN_CHARFOUND_LEN): defined. (ONIGENC_MBCLEN_INVALID): removed. (ONIGENC_MBCLEN_INVALID_P): defined. (ONIGENC_MBCLEN_NEEDMORE): removed. (ONIGENC_MBCLEN_NEEDMORE_P): defined. (ONIGENC_MBCLEN_NEEDMORE_LEN): defined. (ONIGENC_MBC_ENC_LEN): use onigenc_mbclen_approximate. * regenc.c (onigenc_mbclen_approximate): defined. * include/ruby/encoding.h (MBCLEN_CHARFOUND): removed. (MBCLEN_INVALID): removed. (MBCLEN_NEEDMORE): removed. (MBCLEN_CHARFOUND_P): defined. (MBCLEN_INVALID_P): defined. (MBCLEN_NEEDMORE_P): defined. (MBCLEN_CHARFOUND_LEN): defined. (MBCLEN_NEEDMORE_LEN): defined. * encoding.c: use new API. * re.c: ditto. * string.c: ditto. * parse.y: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15280 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * encoding.c (ENC_REGISTER): use &OnigEncoding*.naruse2008-01-151-2/+0
| | | | | | | | | | | | | (ENCINDEX_UTF_8): renamed from ENCINDEX_UTF8. (rb_enc_init): use ENC_REGISTER. * include/ruby/oniguruma.h (OnigEncodingUTF8, ONIG_ENCODING_UTF8): removed. * enc/*.c: remove use of &encoding_*; use enc argument instead. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h: remove ONIG_ENCODING_* and OnigEncoding*naruse2008-01-131-58/+0
| | | | | | | | | | | | | which are not builtin. * regenc.{c,h} (onigenc_mb2_code_to_mbclen, onigenc_mb4_code_to_mbclen): fix prototype. * enc/big5.c, enc/euc_kr.c, enc/euc_tw.c, enc/gb18030.c, enc/koi8_r.c, enc/windows_1251.c: imported from Oniguruma. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@15026 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * encoding.c, Makefile.in, include/ruby/oniguruma.h,naruse2008-01-081-8/+8
| | | | | | | enc/Makefile.in: fix rules for UTF-{16,32}{BE,LE}. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14956 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (OnigEncodingType): new memberakr2008-01-071-0/+1
| | | | | | | | | | | | | ruby_encoding_index to avoid linear search in rb_enc_to_index. * include/ruby/encoding.h (rb_enc_to_index): macro defined to use ruby_encoding_index. * encoding.c (rb_enc_to_index): removed. (enc_register_at): initialize ruby_encoding_index member. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14931 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h: Oniguruma 1.9.1 merged.matz2008-01-031-6/+13
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14874 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * regenc.h (onigenc_ascii_is_code_ctype): put back.akr2008-01-031-2/+0
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14866 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/encoding.h (rb_isascii): simplified.akr2008-01-031-0/+2
| | | | | | | | | | | | | | | | | | | | | (rb_isalnum): call onigenc_ascii_is_code_ctype without indirect call. (rb_isalpha): ditto. (rb_isblank): ditto. (rb_iscntrl): ditto. (rb_isdigit): ditto. (rb_isgraph): ditto. (rb_islower): ditto. (rb_isprint): ditto. (rb_ispunct): ditto. (rb_isspace): ditto. (rb_isupper): ditto. (rb_isxdigit): ditto. * include/ruby/oniguruma.h (onigenc_ascii_is_code_ctype): declaration moved from regenc.h. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14864 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE):akr2007-12-111-1/+1
| | | | | | | parenthesize an argument. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14189 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * encoding.c (rb_enc_precise_mbclen): new function for mbclen withakr2007-12-061-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | validation. * include/ruby/encoding.h (rb_enc_precise_mbclen): declared. (MBCLEN_CHARFOUND): new macro. (MBCLEN_INVALID): new macro. (MBCLEN_NEEDMORE): new macro. * include/ruby/oniguruma.h (OnigEncodingTypeST): replace mbc_enc_len by precise_mbc_enc_len. (ONIGENC_PRECISE_MBC_ENC_LEN): new macro. (ONIGENC_CONSTRUCT_MBCLEN_CHARFOUND): new macro. (ONIGENC_CONSTRUCT_MBCLEN_INVALID): new macro. (ONIGENC_CONSTRUCT_MBCLEN_NEEDMORE): new macro. (ONIGENC_MBCLEN_CHARFOUND): new macro. (ONIGENC_MBCLEN_INVALID): new macro. (ONIGENC_MBCLEN_NEEDMORE): new macro. (ONIGENC_MBC_ENC_LEN): use ONIGENC_PRECISE_MBC_ENC_LEN. * enc/euc_jp.c: validation implemented. * enc/sjis.c: ditto. * enc/utf8.c: ditto. * string.c (rb_str_inspect): use rb_enc_precise_mbclen for invalid encoding. (rb_str_valid_encoding_p): new method String#valid_encoding?. * io.c (rb_io_getc): use rb_enc_precise_mbclen. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@14119 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h (OnigEncodingTypeST): add OnigEncodingmatz2007-10-101-24/+25
| | | | | | | | | parameter to every function members. * include/ruby/oniguruma.h (OnigEncodingTypeST): add auxiliary data member to provide user defined data for an encoding. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13674 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * array.c (rb_ary_cycle): typo in rdoc. a patch from Yuguimatz2007-09-061-4/+4
| | | | | | <yugui@yugui.sakura.ne.jp>. [ruby-dev:31748] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@13348 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/oniguruma.h: upgrade to Oniguruma 5.9.0. fixesmatz2007-07-231-1/+1
| | | | | | some memory violation. [ruby-dev:31070] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12841 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby/onigiruma.h (ONIG_EXTERN): use RUBY_EXTERN if defined.usa2007-07-031-0/+4
| | | | | | | | | | * regenc.h: include ruby/defines.h. * regint.h: x64-mswin64 support. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12682 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * include/ruby: moved public headers.nobu2007-06-101-0/+816
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@12501 b2dd03c8-39d4-4d8f-98ff-823fe69b080e