| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add encoding conversion (transcoding) from UTF-8 to CESU-8
and back. CESU-8 is an encoding similar to UTF-8, but encodes
codepoints above U+FFFF as two surrogates, these surrogates
again being encoded as if they were UTF-8 codepoints. This
preserves the same binary sorting order as in UTF-16. It is
also somewhat similar (although not exactly identical) to an
encoding used internally by Java.
This completes issue #15995.
enc/trans/cesu_8.trans: Add encoding conversion from/to CESU-8
test/ruby/test_transcode.rb: Add tests for above
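
The encoding rule described above can be sketched by hand as follows. This is only an illustration of the byte layout, not the code in enc/trans/cesu_8.trans; the helper names cesu8_bytes and three_byte_form are hypothetical and exist only for this example.

```ruby
# Hypothetical helper: encode a 16-bit value as a three-byte UTF-8-style sequence.
def three_byte_form(unit)
  [0xE0 | (unit >> 12), 0x80 | ((unit >> 6) & 0x3F), 0x80 | (unit & 0x3F)]
end

# Hypothetical helper: return the CESU-8 bytes for a UTF-8 string.
# Codepoints up to U+FFFF keep their ordinary UTF-8 bytes; codepoints above
# U+FFFF are split into a UTF-16 surrogate pair, and each surrogate is then
# written as if it were a codepoint of its own.
def cesu8_bytes(str)
  str.each_codepoint.flat_map do |cp|
    if cp > 0xFFFF
      offset = cp - 0x10000
      high = 0xD800 + (offset >> 10)
      low  = 0xDC00 + (offset & 0x3FF)
      three_byte_form(high) + three_byte_form(low)
    else
      [cp].pack("U").bytes
    end
  end
end

# U+10400 becomes the surrogates D801 DC00, i.e. six bytes instead of four.
puts cesu8_bytes("\u{10400}").map { |b| format("%02X", b) }.join(" ")
```

On a Ruby build that includes this transcoder, String#encode with "CESU-8" as the target is expected to produce the same bytes; that is an assumption about the build, so the sketch does not rely on it.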
|
|
|
|
|
|
|
| |
Replace the copyrights and explanatory texts in the data files used for mapping
GB2312/GB12345 to/from Unicode with short explanatory texts. [Bug #12598] [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59715 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* enc/trans/windows-1255-tbl.rb: update mapping from 0xCA to
U+05BA. [Feature #12877]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56516 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* enc/trans/single_byte.trans (transcode_tblgen_singlebyte):
remove useless code. returned value is not used.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56513 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54130 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54129 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53128 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
| |
* enc/trans/ebcdic.trans: transcodings between EBCDIC-US
and iso-8859-1 [with code from Andrea Ribuoli]
* test/ruby/test_transcode.rb: tests for above
* tool/transcode_tablegen.rb: additional argument for
method transcode_tblgen
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53112 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
* enc/trans/euckr-tbl.rb (EUCKR_TO_UCS_TBL): add missing euro and
registered signs. [ruby-core:64452] [Bug #10149]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@47221 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46044 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
Fix conversion table and logic. [ruby-dev:47680]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42823 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
Previous table is used on Mac OS X 10.1 or prior.
This table is used on 10.2 or later. [ruby-dev:47680]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@42789 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
tool/transcode-tblgen.rb: change EUC-JP-2004 to EUC-JIS-2004.
This is a follow-up to the changes in r41024.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@41035 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes --with-static-linked-ext.
Patch by Google Inc. [ruby-core:45073].
* Makefile.in (ENCOBJS, EXTOBJS): New variables to specify static
linked libraries. Also reintroduces extinit.o and introduces encinit.o.
* common.mk: Builds static libraries rather than shared objects if
specified.
* configure.in (LD): new substitution.
Avoids PIE if s
* enc/depend: Supports static linked libraries
(libencs, libenc, libtrans): New target.
* enc/encinit.c.erb: new template to generate the initialization of
statically linked encodings.
* enc/make_encmake.rb (--module): new flag to specify whether static
or dynamic.
* transcode_data.h (TRANS_INIT): New macro to get rid of the name
collision of encoding initializers and transcoder initializers.
* ext/extmk.rb: Fixes the behavior when $extstatic is true.
* lib/mkmf.rb (clean-static): new target to clean up static linked
libraries.
* ruby.c (process_options): Now initializes statically linked
encodings here.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@35662 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ruby-dev:45571] [Feature #6349]
Requested by Kyouhei Yanagita <yanagi@shakenbu.org>.
* enc/trans/japanese_euc.trans: ditto.
* enc/trans/JIS/JISX0213-[12]%UCS@{BMP,SIP}.src: JIS X 0213:2004 ->
Unicode mapping table from NetBSD.
* enc/trans/JIS/UCS@{BMP,SIP}%JISX0213-[12].src: Unicode -> JIS X
0213:2004 mapping table from NetBSD.
* tool/transcode-tblgen.rb: added SIP support.
* test/ruby/test_transcode.rb: tests of above changes.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@35460 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
* enc/trans/single_byte.trans: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@33993 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@31644 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
CP932 UDA. Another reason is emacs-mule: the implementation of
stateless-iso-2022-jp doesn't support beyond 94x94 (0x7fxx);
but CP932 UDA is in 7Fxx-92xx.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@31366 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30780 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
patched by oCameLo oTnTh [ruby-core:33256]
* enc/big5.c: add alias Big5-HKSCS:2008 to Big5-HKSCS.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29922 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29895 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29892 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
surrogates.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29891 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
* enc/trans/utf_16_32.trans: add a converter from UTF-16 to UTF-8.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29889 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29871 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* enc/big5.c: split CP951 from Big5-HKSCS.
* enc/trans/big5.trans: import conversion table of Big5, Big5-HKSCS,
CP950, and CP951 from ICU. They need fallback conversions.
ref [ruby-core:33256]
http://source.icu-project.org/repos/icu/data/trunk/charset/data/ucm/
* tool/transcode-tblgen.rb (import_ucm): add to import ucm files.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29869 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
CP936, which is the de facto definition of GBK, has it.
http://msdn.microsoft.com/en-us/goglobal/cc305153.aspx
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29713 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
whose result is 2 bytes. [ruby-core:30751]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@28307 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
* enc/trans/iso2022.trans: add converter for CP50220.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27860 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27149 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
casts for suppressing warnings.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@27040 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
valid_encoding mandatory unless from_encoding is registered in
ValidEncoding.
(transcode_tbl_only): ditto.
(transcode_tblgen): ditto.
(ValidEncoding): new function.
* enc/trans/escape.trans: specify valid_encoding.
* enc/trans/emoji_sjis_docomo.trans: ditto.
* enc/trans/emoji.trans: ditto.
* enc/trans/emoji_iso2022_kddi.trans: ditto.
* enc/trans/big5.trans: ditto.
* enc/trans/emoji_sjis_softbank.trans: ditto.
* enc/trans/emoji_sjis_kddi.trans: ditto.
* enc/trans/chinese.trans: use ValidEncoding() instead of
ValidEncoding[].
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26995 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26955 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
optional argument.
* enc/trans/single_byte.trans: use valid_encoding argument for
transcode_tblgen.
* enc/trans/chinese.trans: ditto.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26941 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26915 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
* enc/trans/single_byte.trans: remove ambiguous mappings such as
\xD6 -> U+05F2 and \xD6\xC7 -> U+FB1F in Windows-1255.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26912 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
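
Why such a pair is ambiguous: in a byte-wise table, \xD6 is a complete entry and also the first byte of \xD6\xC7, so a converter cannot tell whether to emit a character after one byte or wait for the next. A minimal sketch of a check for this situation follows; ambiguous_prefixes is a hypothetical helper and the table is a small illustrative sample, not the real Windows-1255 data.

```ruby
# Hypothetical helper: return every key that is a proper prefix of another key.
def ambiguous_prefixes(table)
  keys = table.keys
  keys.select do |key|
    keys.any? do |other|
      other.bytesize > key.bytesize && other.byteslice(0, key.bytesize) == key
    end
  end
end

# Illustrative sample only: the two entries named in the commit message plus
# one unrelated single-byte entry.
sample = {
  "\xD6".b     => 0x05F2,
  "\xD6\xC7".b => 0xFB1F,
  "\xE0".b     => 0x05D0,
}

p ambiguous_prefixes(sample)
```

For this sample only the \xD6 key is reported, since it is the one entry that is a prefix of a longer entry.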
|
|
|
|
|
|
|
|
| |
test/ruby/enc/test_emoji.rb, tool/enc-emoji-citrus-gen.rb,
tool/enc-emoji4unicode.rb, tool/jisx0208.rb, tool/test/test_jisx0208.rb:
new encodings to support emoji charsets, which are used by Japanese
mobile phones [ruby-dev:40528]. Thanks to Yoji Shidara for many
contributions.
* tool/transcode-tblgen.rb: modified for enc-emoji4unicode.rb.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26856 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
value. [ruby-dev:40233]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26464 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
| |
support for new transcoding instruction FUNsio (with Tatsuya Mizuno)
* enc/trans/gb18030.trans: Significantly reduced GB18030 conversion
table footprint using FUNsio and differences (with Tatsuya Mizuno)
* test/ruby/test_transcode.rb: Minor name fix (from Tatsuya Mizuno)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@26065 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
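
As a rough illustration of why computing a mapping can be much smaller than tabulating it: the four-byte GB18030 sequences for supplementary-plane codepoints (U+10000 and above) are laid out linearly, so one offset covers the whole range. The sketch below is not the FUNsio implementation, whose details are not given here; the method names are hypothetical.

```ruby
# Linear position of a four-byte GB18030 sequence. Bytes 1 and 3 range over
# 0x81..0xFE (126 values); bytes 2 and 4 range over 0x30..0x39 (10 values).
def gb18030_linear(b1, b2, b3, b4)
  (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)
end

# Position of 90 30 81 30, the sequence that corresponds to U+10000.
def supplementary_base
  gb18030_linear(0x90, 0x30, 0x81, 0x30)
end

# Hypothetical helper: map a four-byte sequence in the supplementary range
# to its codepoint by adding a constant difference, with no table lookup.
def gb18030_supplementary_to_codepoint(bytes)
  0x10000 + gb18030_linear(*bytes) - supplementary_base
end

printf("U+%X\n", gb18030_supplementary_to_codepoint([0xE3, 0x32, 0x9A, 0x35]))
```

The sketch does no range checking; a real converter would have to reject sequences outside 90 30 81 30 through E3 32 9A 35.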
|
|
|
|
|
|
|
|
|
|
| |
(from Tatsuya Mizuno)
* test/ruby/test_transcode.rb: Added test for converting full range of
Unicode codepoints from/to GB18030 (from Tatsuya Mizuno)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25980 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
after \r\n detection instead of just after \r.
[ruby-list:45988] [ruby-core:25881] [ruby-core:26788]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25883 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
test/ruby/test-transcode.rb: Added Encoding 'Big5-UAO' and transcoding
for it (from Tatsuya Mizuno) (see Bug #1784)
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@25822 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
new Chinese BIG5-HKSCS transcoding (with Tatsuya Mizuno)
* test/ruby/test_transcode.rb: added tests for the above
(with Tatsuya Mizuno)
* enc/big5.c: Added BIG5-HKSCS as a replicate encoding of BIG5
(short term solution, needs more work; with Tatsuya Mizuno)
* tool/transcode-tblgen.rb: made 'pat' directly accessible in
class StrSet
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@24264 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23686 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* transcode.c: NOMAP is now multibyte direct map.
* transcode.c: remove ASIS.
* transcode_data.h: ditto.
* tool/transcode-tb (ActionMap#generate_info): remove :asis.
* tool/transcode-tb (ActionMap#generate_info): add :nomap0.
* enc/trans/utf8_mac.trans: replace :asis by :nomap0.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23344 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
| |
* enc/trans/utf8_mac.trans: follow above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23325 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
compile.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23309 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
| |
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23307 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
|
| |
* enc/trans/utf8_mac-tbl.rb: ditto.
* test/ruby/test_econv.rb: tests for above.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@23296 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
|
|
|
|
|
| |
compile. [ruby-core:21345]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@21512 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|