grit/tool/xmb.py - Issue 1424933018: Allow higher unicode characters in XMB files.

Side by Side Diff: grit/tool/xmb.py

Issue 1424933018: Allow higher unicode characters in XMB files. (Closed) Base URL: http://grit-i18n.googlecode.com/svn/trunk

Patch Set: Created 5 years, 1 month ago


OLD	NEW
1 #!/usr/bin/env python	1 #!/usr/bin/env python

2 # Copyright (c) 2012 The Chromium Authors. All rights reserved.	2 # Copyright (c) 2012 The Chromium Authors. All rights reserved.

3 # Use of this source code is governed by a BSD-style license that can be	3 # Use of this source code is governed by a BSD-style license that can be

4 # found in the LICENSE file.	4 # found in the LICENSE file.

5	5

6 """The 'grit xmb' tool.	6 """The 'grit xmb' tool.

7 """	7 """

8	8

9 import getopt	9 import getopt

10 import os	10 import os

(...skipping 10 matching lines...)
21 # Used to collapse presentable content to determine if	21 # Used to collapse presentable content to determine if

22 # xml:space="preserve" is needed.	22 # xml:space="preserve" is needed.

23 _WHITESPACES_REGEX = lazy_re.compile(ur'\s\s*')	23 _WHITESPACES_REGEX = lazy_re.compile(ur'\s\s*')

24	24

25	25

26 # See XmlEscape below.	26 # See XmlEscape below.

27 _XML_QUOTE_ESCAPES = {	27 _XML_QUOTE_ESCAPES = {

28 u"'": u'&apos;',	28 u"'": u'&apos;',

29 u'"': u'&quot;',	29 u'"': u'&quot;',

30 }	30 }

	31 # See http://www.w3.org/TR/xml/#charsets

31 _XML_BAD_CHAR_REGEX = lazy_re.compile(u'[^\u0009\u000A\u000D'	32 _XML_BAD_CHAR_REGEX = lazy_re.compile(u'[^\u0009\u000A\u000D'

32 u'\u0020-\uD7FF\uE000-\uFFFD]')	33 u'\u0020-\uD7FF\uE000-\uFFFD'

	34 u'\U00010000-\U0010FFFF]')

33	35

34	36

35 def _XmlEscape(s):	37 def _XmlEscape(s):

36 """Returns text escaped for XML in a way compatible with Google's	38 """Returns text escaped for XML in a way compatible with Google's

37 internal Translation Console tool. May be used for attributes as	39 internal Translation Console tool. May be used for attributes as

38 well as for contents.	40 well as for contents.

39 """	41 """

40 if not type(s) == unicode:	42 if not type(s) == unicode:

41 s = unicode(s)	43 s = unicode(s)

42 result = saxutils.escape(s, _XML_QUOTE_ESCAPES)	44 result = saxutils.escape(s, _XML_QUOTE_ESCAPES)

43 return _XML_BAD_CHAR_REGEX.sub(u'', result).encode('utf-8')	45 illegal_chars = _XML_BAD_CHAR_REGEX.search(result)

	46 if illegal_chars:
	newt (away) 2015/11/10 17:22:52 All of Chrome's grd files pass this stricter error checking (as you'd hope!). It's possible that some other project contains illegal characters, but I think they'd rather know about that than just continue to get translations with missing characters.
	47 raise Exception('String contains characters disallowed in XML: %s' %

	48 repr(result))

	49 return result.encode('utf-8')

44	50

45	51

46 def _WriteAttribute(file, name, value):	52 def _WriteAttribute(file, name, value):

47 """Writes an XML attribute to the specified file.	53 """Writes an XML attribute to the specified file.

48	54

49 Args:	55 Args:

50 file: file to write to	56 file: file to write to

51 name: name of the attribute	57 name: name of the attribute

52 value: (unescaped) value of the attribute	58 value: (unescaped) value of the attribute

53 """	59 """

(...skipping 228 matching lines...)
282 messages.sort(key=lambda x:x.GetId())	288 messages.sort(key=lambda x:x.GetId())

283	289

284 if self.format == self.FORMAT_IDS_ONLY:	290 if self.format == self.FORMAT_IDS_ONLY:

285 # We just print the list of IDs to the output file.	291 # We just print the list of IDs to the output file.

286 for msg in messages:	292 for msg in messages:

287 output_file.write(msg.GetId())	293 output_file.write(msg.GetId())

288 output_file.write('\n')	294 output_file.write('\n')

289 else:	295 else:

290 assert self.format == self.FORMAT_XMB	296 assert self.format == self.FORMAT_XMB

291 WriteXmbFile(output_file, messages)	297 WriteXmbFile(output_file, messages)
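
For readers who want to try the behavior change in _XmlEscape outside of grit, here is a minimal standalone sketch. It is not the patch itself: it assumes Python 3 (the patch targets Python 2 and uses `unicode` and `lazy_re`), swaps in the standard `re` module, and uses the hypothetical name `xml_escape_strict`. It raises `ValueError` rather than the bare `Exception` used in the patch. The character class is the one from the NEW side, which follows the Char production at http://www.w3.org/TR/xml/#charsets.

```python
import re
from xml.sax import saxutils

# Quote escapes passed to saxutils.escape, mirroring _XML_QUOTE_ESCAPES.
_XML_QUOTE_ESCAPES = {"'": '&apos;', '"': '&quot;'}

# Matches any character NOT allowed in XML 1.0. The last range
# (U+10000 to U+10FFFF) is what the patch adds, so supplementary-plane
# characters such as emoji are no longer treated as illegal.
_XML_BAD_CHAR_REGEX = re.compile(
    '[^\u0009\u000A\u000D'
    '\u0020-\uD7FF\uE000-\uFFFD'
    '\U00010000-\U0010FFFF]')


def xml_escape_strict(s):
    """Escapes s for XML and returns UTF-8 bytes.

    Hypothetical Python 3 analogue of the patched _XmlEscape: illegal
    characters cause an error instead of being silently dropped.
    """
    result = saxutils.escape(str(s), _XML_QUOTE_ESCAPES)
    if _XML_BAD_CHAR_REGEX.search(result):
        raise ValueError(
            'String contains characters disallowed in XML: %r' % result)
    return result.encode('utf-8')
```

With this sketch, a string containing U+1F600 passes through and is encoded as UTF-8, while a string containing U+0000 or U+FFFE raises. Under the OLD side, both kinds of character would have been stripped without any error.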
