| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* enc/unicode/9.0.0/name2ctype.h: update due to merger of Onigmo
6.0.0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@58064 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* enc/unicode.c: Remove special processing for U+03B9/U+03BC/U+A64B
(GREEK SMALL LETTERs IOTA/MU, CYRILLIC SMALL LETTER MONOGRAPH UK)
from onigenc_unicode_case_map and simplify code.
* enc/unicode/case-folding.rb: Remove check for U+03B9/U+03BC/U+A64B.
This and the previous few related commits make sure that we won't hit
the equivalent of bug #12990 anymore for future updates of Unicode versions.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56976 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
| |
* enc/unicode/case-folding.rb: Reorder codepoints so that the upper-case
mapping comes first.
* enc/unicode/9.0.0/casefold.h: Codepoints reordered, upper-case mapping
flag added.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56975 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56952 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56951 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* meta character \X matches Unicode 9.0.0 characters with some workarounds
for UTR #51 Unicode Emoji, Version 4.0 emoji zwj sequences.
[Feature #12831] [ruby-core:77586]
The term "character" can have many meanings bytes, codepoints, combined
characters, and so on. "grapheme cluster" is highest one of such words,
which means user-perceived characters.
Unicode Standard Annex #29 UNICODE TEXT SEGMENTATION specifies how to
handle grapheme clusters (extended grapheme cluster).
But some specs aren't updated to current situation because Unicode Emoji
is rapidly extended without well definition.
It breaks the precondition of UTR#29 "Grapheme cluster boundaries can be
easily tested by looking at immediately adjacent characters". (the
sentence will be removed in the next version)
Though some of its detail are described in Unicode Technical Report #51
UNICODE EMOJI but it is not merged into UTR#29 yet.
http://unicode.org/reports/tr29/
http://unicode.org/reports/tr51/
http://unicode.org/Public/emoji/4.0/
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56949 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* enc/unicode.c: Add U+A64B to the special cases 03B9 and 03BC
at the end of onigenc_unicode_case_map (Bug #12990).
* enc/unicode/case-folding.rb: Add U+A64B to the special cases
03B9 and 03BC. Add a comment pointing to enc/unicode.c.
Change warnings to exceptions for unpredicted cases,
because this would have been more easily noticed
(the warning was not noticed when upgrading to Unicode 9.0.0).
* test/ruby/enc/test_case_comprehensive.rb: Remove temporary
exclusion of U+A64B from testing.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56941 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
removing directories/files related to Unicode version 8.0.0
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
* unicode/9.0.0/casefold.h, name2ctype.h, unicode/data/9.0.0:
new directories/files for Unicode version 9.0.0
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* common.mk (UNICODE_HDR_DIR): separate unicode header files from
unicode data files. [ruby-core:76879] [Bug #12677]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55942 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* common.mk, enc/depend (casefold.h, name2ctype.h): move to
unicode data directory per version.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55701 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* enc/unicode/case-folding.rb, tool/enc-unicode.rb: check if
Unicode versions are consistent with each other.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55687 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
* Makefile.in (enc/unicode/name2ctype.h): remove stale recipe,
which did not support Unicode age properties.
* common.mk (enc/unicode/name2ctype.h): update by --header option
of tool/enc-unicode.rb. enc/unicode/name2ctype.kwd file has not
been used.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55678 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
* enc/unicode/case-folding.rb: define Unicode version numbers.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55546 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* enc/unicode/case-folding.rb: check if version numbers in each
data files match.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55545 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
It is wrong commit.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55518 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55514 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* enc/unicode/case-folding.rb (CaseFolding#load): read in binary
mode to deal with non-ASCII charater in CaseFolding.txt.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55496 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
* enc/unicode/case-folding.rb: touch the destination file.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55494 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
| |
* common.mk (lib/unicode_normalize/tables.rb): should not depend
on Unicode data files unless ALWAYS_UPDATE_UNICODE=yes, to get
rid of downloading Unicode data unnecessary. [ruby-dev:49681]
* common.mk (enc/unicode/casefold.h): update Unicode files in a
sub-make, not to let the header depend on the files always.
* enc/unicode/case-folding.rb: if gperf is not usable, assume the
existing file is OK.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55492 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swapcase functionality for titlecase characters. Swapcase isn't defined
by Unicode, because the purpose/usage of swapcase is unclear anyway.
The implementation follows a proposal from Nobu, swaping the case of
each component of a titlecase character individually.
This means that the titlecase characters have to be decomposed.
* enc/unicode.c: Code using the above data.
* test/ruby/enc/test_case_mapping.rb: Tests for the above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54469 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
special cases in CaseUnfold_11_Table.
* enc/unicode.c: Adjustments for above.
* test/ruby/enc/test_case_mapping.rb: Tests for the above: Some tests in
test_titlecase activated; test_greek added. A test in test_cherokee fixed.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54383 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
titlecasing.
* enc/unicode.c: Adjust code to data removal.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54347 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54230 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
* enc/unicode/case-folding.rb, casefold.h: Using above flag in data.
* enc/unicode.c: Marking capitalized character as unmodified if it is
already titlecase.
* test/ruby/enc/test_case_mapping.rb: Tests for above functionality.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54229 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
case mapping data not available from case folding by unifying all
three cases (special title, special upper, special lower).
* enc/unicode.c: Adjust macro names for above (macros are currently inactive).
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54085 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
table by eliminating duplicates.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53957 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
for TitleCase table in casefold.h.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53930 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
| |
data (new table, with indices from other tables).
* enc/unicode.c: Ignoring titlecase data indices for the moment.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
SpecialCasing.txt.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53904 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
not yet operational.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53891 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
of compatibility characters in uppper-/lower-case mappings.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53890 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
(rather than all) of target in CaseUnfold_11 array.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53843 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53833 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
upper/lower conversion added (titlecase and SpecialCasing still missing)
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53779 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
| |
single-letter; use flags in casefold.h for logic.
* enc/unicode/case-folding.rb: Added flag for case folding.
Changed parameter passing.
* enc/unicode/casefold.h: New flags added.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53775 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53768 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
| |
* enc/unicode.c: Added shortening macros for enc/unicode/casefold.h
* enc/unicode/case-folding.rb: Fixed file encoding for CaseFolding.txt
to ASCII-8BIT (should fix some ci errors). Clarified usage. Created
class MapItem. Partially implemented class CaseMapping.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53767 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53765 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
to pass as parameters; not yet implemented or used.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
* enc/unicode/case-folding.rb: Correctly specify argument to new option.
(with Kimihito Matsui)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53762 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53760 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53127 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
test/ruby/test_transcode.rb: Fixed encoding name
to the correct one in the IANA registry (IBM037)
and added an alias (ebcdic-cp-us)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53124 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
regular expressions from 7.0.0 to 8.0.0
(with help from Kimihito Matsui) [Feature #11563]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52612 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@49292 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
this includes Support for Unicode 7.0 [Bug #9092].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46831 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|