Chromium Code Reviews

Side by Side Diff: source/data/mappings/convrtrs.txt

Issue 587833004: Turn on UCONFIG_NO_NON_HTML5_CONVERTER to save 100kB (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/deps/third_party/icu52/
Patch Set: more tests added to desc Created 6 years, 3 months ago
OLD | NEW
1 # ****************************************************************************** 1 # ******************************************************************************
2 # * 2 # *
3 # * Copyright (C) 1995-2013, International Business Machines 3 # * Copyright (C) 1995-2013, International Business Machines
4 # * Corporation and others. All Rights Reserved. 4 # * Corporation and others. All Rights Reserved.
5 # * 5 # *
6 # ****************************************************************************** 6 # ******************************************************************************
7 7
8 # If this converter alias table looks very confusing, a much easier to 8 # If this converter alias table looks very confusing, a much easier to
9 # understand view can be found at this demo: 9 # understand view can be found at this demo:
10 # http://demo.icu-project.org/icu-bin/convexp 10 # http://demo.icu-project.org/icu-bin/convexp
(...skipping 499 matching lines...)
510 cp923 { JAVA } 510 cp923 { JAVA }
511 923 { JAVA } 511 923 { JAVA }
512 windows-28605 { WINDOWS* } 512 windows-28605 { WINDOWS* }
513 513
514 # CJK encodings 514 # CJK encodings
515 515
516 # Chrome: Instead of ibm-943_P15A-2003, we use what's specified in the WHATWG 516 # Chrome: Instead of ibm-943_P15A-2003, we use what's specified in the WHATWG
517 # encoding standard (HTML5) for Shift_JIS. Keep all the aliases (even though 517 # encoding standard (HTML5) for Shift_JIS. Keep all the aliases (even though
518 not all of them not required by the encoding spec) for now. 518 not all of them not required by the encoding spec) for now.
519 519
520 shift_jis-html5 { UTR22* } 520 shift_jis-html
521 ibm-943 # Leave untagged because this isn't the default 521 ibm-943 # Leave untagged because this isn't the default
522 Shift_JIS { IANA* MIME* WINDOWS JAVA } 522 Shift_JIS { IANA* MIME* WINDOWS JAVA }
523 MS_Kanji { IANA WINDOWS JAVA } 523 MS_Kanji { IANA WINDOWS JAVA }
524 csShiftJIS { IANA WINDOWS JAVA } 524 csShiftJIS { IANA WINDOWS JAVA }
525 windows-31j { IANA JAVA } # A further extension of Shift_JIS to include NEC special characters (Row 13) 525 windows-31j { IANA JAVA } # A further extension of Shift_JIS to include NEC special characters (Row 13)
526 csWindows31J { IANA WINDOWS JAVA } # A further extension of Shift_JIS to include NEC special characters (Row 13) 526 csWindows31J { IANA WINDOWS JAVA } # A further extension of Shift_JIS to include NEC special characters (Row 13)
527 x-sjis { WINDOWS JAVA } 527 x-sjis { WINDOWS JAVA }
528 x-ms-cp932 { WINDOWS } 528 x-ms-cp932 { WINDOWS }
529 cp932 { WINDOWS } 529 cp932 { WINDOWS }
530 windows-932 { WINDOWS* } 530 windows-932 { WINDOWS* }
531 cp943c { JAVA* } # This is slightly different, but the backslash mapping is the same. 531 cp943c { JAVA* } # This is slightly different, but the backslash mapping is the same.
532 IBM-943C #{ AIX* } # Add this tag once AIX aliases becomes available 532 IBM-943C #{ AIX* } # Add this tag once AIX aliases becomes available
533 ms932 533 ms932
534 pck # Probably SOLARIS 534 pck # Probably SOLARIS
535 sjis # This might be for ibm-1351 535 sjis # This might be for ibm-1351
536 ibm-943_VSUB_VPUA 536 ibm-943_VSUB_VPUA
537 x-MS932_0213 { JAVA } 537 x-MS932_0213 { JAVA }
538 x-JISAutoDetect { JAVA } 538 x-JISAutoDetect { JAVA }
539 539
540 # Chrome: Instead of ibm-33722_P*, we use what's specified in the WHATWG 540 # Chrome: Instead of ibm-33722_P*, we use what's specified in the WHATWG
541 # encoding standard (HTML5). All the 541 # encoding standard (HTML5). All the
542 # 3-byte seqeunces in the normative EUC-JP are now decode-only. 542 # 3-byte seqeunces in the normative EUC-JP are now decode-only.
543 euc-jp-html5 { UTR22* } 543 euc-jp-html
544 EUC-JP { MIME* IANA JAVA* WINDOWS*} 544 EUC-JP { MIME* IANA JAVA* WINDOWS*}
545 Extended_UNIX_Code_Packed_Format_for_Japanese { IANA* JAVA WINDOWS } 545 Extended_UNIX_Code_Packed_Format_for_Japanese { IANA* JAVA WINDOWS }
546 csEUCPkdFmtJapanese { IANA JAVA WINDOWS } 546 csEUCPkdFmtJapanese { IANA JAVA WINDOWS }
547 windows-51932 { WINDOWS } 547 windows-51932 { WINDOWS }
548 X-EUC-JP { MIME JAVA WINDOWS } # Japan EUC. x-euc-jp is a MIME name 548 X-EUC-JP { MIME JAVA WINDOWS } # Japan EUC. x-euc-jp is a MIME name
549 eucjis {JAVA} 549 eucjis {JAVA}
550 ujis # Linux sometimes uses this name. This is an unfortunate generic and rarely used name. Its use is discouraged. 550 ujis # Linux sometimes uses this name. This is an unfortunate generic and rarely used name. Its use is discouraged.
551 551
552 552
553 windows-950-2000 { UTR22* } 553 windows-950-2000 { UTR22* }
554 Big5 { IANA* MIME* JAVA* WINDOWS } 554 Big5 { IANA* MIME* JAVA* WINDOWS }
555 csBig5 { IANA WINDOWS } 555 csBig5 { IANA WINDOWS }
556 windows-950 { WINDOWS* } 556 windows-950 { WINDOWS* }
557 x-windows-950 { JAVA } 557 x-windows-950 { JAVA }
558 x-big5 558 x-big5
559 ms950 559 ms950
560 # Chrome: HTML5 has big5-hkscs as an alias for big5
561 # TODO(jshin): Decide if Chrome should follow spec. crbug.com/277040
560 ibm-1375_P100-2007 { UTR22* } # Big5-HKSCS-2004 with Unicode 3.1 mappings. This uses supplementary characters. 562 ibm-1375_P100-2007 { UTR22* } # Big5-HKSCS-2004 with Unicode 3.1 mappings. This uses supplementary characters.
561 ibm-1375 { IBM* } 563 ibm-1375 { IBM* }
562 Big5-HKSCS { IANA* JAVA* } 564 Big5-HKSCS { IANA* JAVA* }
563 big5hk { JAVA } 565 big5hk { JAVA }
564 HKSCS-BIG5 # From http://www.openi18n.org/localenameguide/ 566 HKSCS-BIG5 # From http://www.openi18n.org/localenameguide/
565 567
566 # Chrome: HTML5 has big5-hkscs as an alias for big5
567 # TODO(jshin): Decide if Chrome should follow spec. crbug.com/277040
568 ibm-5471_P100-2006 { UTR22* } # Big5-HKSCS-2001 with Unicode 3.0 mappings. This uses many PUA characters. 568 ibm-5471_P100-2006 { UTR22* } # Big5-HKSCS-2001 with Unicode 3.0 mappings. This uses many PUA characters.
569 ibm-5471 { IBM* } 569 ibm-5471 { IBM* }
570 Big5-HKSCS 570 Big5-HKSCS
571 MS950_HKSCS { JAVA* } 571 MS950_HKSCS { JAVA* }
572 hkbig5 # from HP-UX 11i, which can't handle supplementary characters. 572 hkbig5 # from HP-UX 11i, which can't handle supplementary characters.
573 big5-hkscs:unicode3.0 573 big5-hkscs:unicode3.0
574 x-MS950-HKSCS { JAVA } 574 x-MS950-HKSCS { JAVA }
575 # windows-950 # Windows-950 can be w/ or w/o HKSCS extensions. By default it's not. 575 # windows-950 # Windows-950 can be w/ or w/o HKSCS extensions. By default it's not.
576 # windows-950_hkscs 576 # windows-950_hkscs
577 # GBK 577 # GBK
(...skipping 49 matching lines...)
627 TIS-620 { IANA* WINDOWS MIME* } 627 TIS-620 { IANA* WINDOWS MIME* }
628 windows-874 { JAVA* WINDOWS* MIME } 628 windows-874 { JAVA* WINDOWS* MIME }
629 MS874 { JAVA } 629 MS874 { JAVA }
630 x-windows-874 { JAVA } 630 x-windows-874 { JAVA }
631 iso-8859-11 { IANA WINDOWS MIME } # iso-8859-11 is similar to TIS-620. ibm-13162 is a closer match. 631 iso-8859-11 { IANA WINDOWS MIME } # iso-8859-11 is similar to TIS-620. ibm-13162 is a closer match.
632 632
633 # Platform codepages 633 # Platform codepages
634 # Chrome: only keep ibm-878 for KOI8-R, ibm-1168 for KOI8-RU and ibm-866 634 # Chrome: only keep ibm-878 for KOI8-R, ibm-1168 for KOI8-RU and ibm-866
635 ibm-878_P100-1996 { UTR22* } ibm-878 { IBM* } KOI8-R { IANA* MIME* WINDOWS JAVA* } koi8 { WINDOWS JAVA } csKOI8R { IANA WINDOWS JAVA } windows-20866 { WINDOWS* } cp878 # Russian internet 635 ibm-878_P100-1996 { UTR22* } ibm-878 { IBM* } KOI8-R { IANA* MIME* WINDOWS JAVA* } koi8 { WINDOWS JAVA } csKOI8R { IANA WINDOWS JAVA } windows-20866 { WINDOWS* } cp878 # Russian internet
636 # Chrome: Use the table from the WHATWG encoding standard (HTML5). 636 # Chrome: Use the table from the WHATWG encoding standard (HTML5).
637 ibm-866_html5-2012 { UTR22* } ibm-866 { IBM* } IBM866 { IANA* MIME* JAVA } cp866 { IANA MIME WINDOWS JAVA* } 866 { IANA JAVA } csIBM866 { IANA JAVA } # PC Russian (w/o euro update) 637 ibm866-html ibm-866 { IBM* } IBM866 { IANA* MIME* JAVA } cp866 { IANA MIME WINDOWS JAVA* } 866 { IANA JAVA } csIBM866 { IANA JAVA } # PC Russian (w/o euro update)
638 ibm-1168_P100-2002 { UTR22* } ibm-1168 { IBM* } KOI8-U { IANA* WINDOWS } windows-21866 { WINDOWS* } # Ukrainian KOI8. koi8-ru != KOI8-U and Microsoft is wrong for aliasing them as the same. 638 ibm-1168_P100-2002 { UTR22* } ibm-1168 { IBM* } KOI8-U { IANA* WINDOWS } windows-21866 { WINDOWS* } # Ukrainian KOI8. koi8-ru != KOI8-U and Microsoft is wrong for aliasing them as the same.
639 639
640 # The cp aliases in this section aren't really windows aliases, but it was used by ICU for Windows. 640 # The cp aliases in this section aren't really windows aliases, but it was used by ICU for Windows.
641 # cp is usually used to denote IBM in Java, and that is why we don't do that any more. 641 # cp is usually used to denote IBM in Java, and that is why we don't do that any more.
642 # The windows-* aliases mean windows codepages. 642 # The windows-* aliases mean windows codepages.
643 ibm-5346_P100-1998 { UTR22* } ibm-5346 { IBM* } windows-1250 { IANA* JAVA* WINDOWS* } cp1250 { WINDOWS JAVA } # Windows Latin2 (w/ euro update) 643 ibm-5346_P100-1998 { UTR22* } ibm-5346 { IBM* } windows-1250 { IANA* JAVA* WINDOWS* } cp1250 { WINDOWS JAVA } # Windows Latin2 (w/ euro update)
644 ibm-5347_P100-1998 { UTR22* } ibm-5347 { IBM* } windows-1251 { IANA* JAVA* WINDOWS* } cp1251 { WINDOWS JAVA } ANSI1251 # Windows Cyrillic (w/ euro update). ANSI1251 is from Solaris 644 ibm-5347_P100-1998 { UTR22* } ibm-5347 { IBM* } windows-1251 { IANA* JAVA* WINDOWS* } cp1251 { WINDOWS JAVA } ANSI1251 # Windows Cyrillic (w/ euro update). ANSI1251 is from Solaris
645 ibm-5348_P100-1997 { UTR22* } ibm-5348 { IBM* } windows-1252 { IANA* JAVA* WINDOWS* } cp1252 { JAVA } # Windows Latin1 (w/ euro update) 645 ibm-5348_P100-1997 { UTR22* } ibm-5348 { IBM* } windows-1252 { IANA* JAVA* WINDOWS* } cp1252 { JAVA } # Windows Latin1 (w/ euro update)
646 ibm-5349_P100-1998 { UTR22* } ibm-5349 { IBM* } windows-1253 { IANA* JAVA* WINDOWS* } cp1253 { JAVA } # Windows Greek (w/ euro update) 646 ibm-5349_P100-1998 { UTR22* } ibm-5349 { IBM* } windows-1253 { IANA* JAVA* WINDOWS* } cp1253 { JAVA } # Windows Greek (w/ euro update)
647 647
(...skipping 17 matching lines...)
665 macos-7_3-10.2 { UTR22* } x-mac-cyrillic { MIME* WINDOWS } windows-10007 { WINDOWS* } mac-cyrillic maccy x-MacCyrillic { JAVA } x-MacUkraine { JAVA* } # Apple Cyrillic 665 macos-7_3-10.2 { UTR22* } x-mac-cyrillic { MIME* WINDOWS } windows-10007 { WINDOWS* } mac-cyrillic maccy x-MacCyrillic { JAVA } x-MacUkraine { JAVA* } # Apple Cyrillic
666 666
667 # Partially algorithmic converters 667 # Partially algorithmic converters
668 668
669 # [U_ENABLE_GENERIC_ISO_2022] 669 # [U_ENABLE_GENERIC_ISO_2022]
670 # The _generic_ ISO-2022 converter is disabled starting 2003-dec-03 (ICU 2.8). 670 # The _generic_ ISO-2022 converter is disabled starting 2003-dec-03 (ICU 2.8).
671 # For details see the icu mailing list from 2003-dec-01 and the ucnv2022.c file. 671 # For details see the icu mailing list from 2003-dec-01 and the ucnv2022.c file.
672 # Language-specific variants of ISO-2022 continue to be available as listed below. 672 # Language-specific variants of ISO-2022 continue to be available as listed below.
673 # ISO_2022 ISO-2022 673 # ISO_2022 ISO-2022
674 674
675 # Chrome: The encoding standard only supports ISO-2022-JP and HZ-GB. 675 # Chrome: The encoding standard only supports ISO-2022-JP.
676 # Keep ISO-2022-{KR,CN,CN-Ext} until we're sure what to do about 676 # Remove ISO-2022-{KR,CN,CN-Ext} and HZ-GB from the alias table.
677 # replacement encodings. See crbug.com/277037 677 # See crbug.com/277037 and https://www.w3.org/Bugs/Public/show_bug.cgi?id=25339
678 # TODO(jshin): Remove them when the bug is resolved. 678 # about HZ-GB.
679 ISO_2022,locale=ja,version=0 ISO-2022-JP { IANA* MIME* JAVA* } csISO2022JP { IANA JAVA } x-windows-iso2022jp { JAVA } x-windows-50220 { JAVA } 679 ISO_2022,locale=ja,version=0 ISO-2022-JP { IANA* MIME* JAVA* } csISO2022JP { IANA JAVA } x-windows-iso2022jp { JAVA } x-windows-50220 { JAVA }
680 ISO_2022,locale=ko,version=0 ISO-2022-KR { IANA* MIME* JAVA* } csISO2022KR { IANA JAVA } # This uses ibm-949
681 ISO_2022,locale=zh,version=0 ISO-2022-CN { IANA* JAVA* } csISO2022CN { JAVA } x-ISO-2022-CN-GB { JAVA }
682 ISO_2022,locale=zh,version=1 ISO-2022-CN-EXT { IANA* }
683 HZ HZ-GB-2312 { IANA* }
684 680
685 # Chrome: HTML5 does not need ISCII. 681 # Chrome: HTML5 does not need ISCII.
686 # Remove all Lotus entries as well. 682 # Remove all Lotus entries as well.
687 683
688 # EBCDIC codepages according to the CDRA 684 # EBCDIC codepages according to the CDRA
689 # Chrome: Removed all EBCDIC code pages. 685 # Chrome: Removed all EBCDIC code pages.
690 686
691 # These are not installed by default. They are rarely used. 687 # These are not installed by default. They are rarely used.
692 # Many of them can be added through the online ICU Data Library Customization tool 688 # Many of them can be added through the online ICU Data Library Customization tool
693 # Chrome: Removed all these entries except for ISO-8859-16 required by HTML5. 689 # Chrome: Removed all these entries except for ISO-8859-16 required by HTML5.
694 690
695 iso-8859_16-2001 { UTR22* } ISO-8859-16 { IANA* } iso-ir-226 { IANA } ISO_8859-16:2001 { IANA } latin10 { IANA } l10 { IANA } 691 iso-8859_16-2001 { UTR22* } ISO-8859-16 { IANA* } iso-ir-226 { IANA } ISO_8859-16:2001 { IANA } latin10 { IANA } l10 { IANA }
696 692
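
For readers unfamiliar with the alias table syntax in this diff: each entry starts with the converter's canonical name, followed by aliases, and each name may carry a `{ ... }` list of standard tags, where a trailing `*` marks the preferred name for that standard. The following is a minimal Python sketch of that reading, not ICU's own parser. The function names `parse_entry` and `default_name` are hypothetical, and the sketch assumes the caller passes exactly one logical entry (the indentation that marks continuation lines is not visible in this review page, so entry boundaries are not detected here).

```python
import re

# A token is either a "{ TAG TAG* }" group or a bare converter/alias name.
TOKEN = re.compile(r"\{[^}]*\}|[^\s{}]+")


def parse_entry(text):
    """Parse one logical alias-table entry.

    Returns (canonical_name, {name: [tags]}). The first name in the entry
    is treated as the canonical converter name.
    """
    names = []  # (name, tags) pairs in file order
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop the trailing comment, if any
        for tok in TOKEN.findall(line):
            if tok.startswith("{"):
                if names:  # a tag group applies to the name just before it
                    names[-1][1].extend(tok.strip("{}").split())
            else:
                names.append((tok, []))
    if not names:
        return None, {}
    return names[0][0], {name: tags for name, tags in names}


def default_name(aliases, standard):
    """Return the first name tagged 'STANDARD*', or None if there is none."""
    wanted = standard + "*"
    for name, tags in aliases.items():
        if wanted in tags:
            return name
    return None


# The new-side line 637 from the diff above.
entry = (
    "ibm866-html ibm-866 { IBM* } IBM866 { IANA* MIME* JAVA } "
    "cp866 { IANA MIME WINDOWS JAVA* } 866 { IANA JAVA } "
    "csIBM866 { IANA JAVA } # PC Russian (w/o euro update)"
)
canonical, aliases = parse_entry(entry)
print(canonical)                      # canonical converter name
print(default_name(aliases, "MIME"))  # preferred MIME name
```

Run against the new-side line 637, the sketch reports `ibm866-html` as the canonical name with an empty tag list, which matches the patch dropping `{ UTR22* }` from the renamed converter.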