third_party/Python-Markdown/markdown/__init__.py - Issue 1356203004: Check in a simple pure-python based Markdown previewer.

Side by Side Diff: third_party/Python-Markdown/markdown/__init__.py

Issue 1356203004: Check in a simple pure-python based Markdown previewer. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@add

Patch Set: fix license file Created 5 years, 2 months ago

OLD	NEW
1 # markdown is released under the BSD license

2 # Copyright 2007, 2008 The Python Markdown Project (v. 1.7 and later)

3 # Copyright 2004, 2005, 2006 Yuri Takhteyev (v. 0.2-1.6b)

4 # Copyright 2004 Manfred Stienstra (the original version)

5 #

6 # All rights reserved.

7 #

8 # Redistribution and use in source and binary forms, with or without

9 # modification, are permitted provided that the following conditions are met:

10 #

11 # * Redistributions of source code must retain the above copyright

12 # notice, this list of conditions and the following disclaimer.

13 # * Redistributions in binary form must reproduce the above copyright

14 # notice, this list of conditions and the following disclaimer in the

15 # documentation and/or other materials provided with the distribution.

16 # * Neither the name of the <organization> nor the

17 # names of its contributors may be used to endorse or promote products

18 # derived from this software without specific prior written permission.

19 #

20 # THIS SOFTWARE IS PROVIDED BY THE PYTHON MARKDOWN PROJECT ''AS IS'' AND ANY

21 # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED

22 # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE

23 # DISCLAIMED. IN NO EVENT SHALL ANY CONTRIBUTORS TO THE PYTHON MARKDOWN PROJECT

24 # BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR

25 # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF

26 # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS

27 # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN

28 # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)

29 # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE

30 # POSSIBILITY OF SUCH DAMAGE.

31

32

33 """	1 """

34 Python Markdown	2 Python Markdown

35 ===============	3 ===============

36	4

37 Python Markdown converts Markdown to HTML and can be used as a library or	5 Python Markdown converts Markdown to HTML and can be used as a library or

38 called from the command line.	6 called from the command line.

39	7

40 ## Basic usage as a module:	8 ## Basic usage as a module:

41	9

42 import markdown	10 import markdown

43 html = markdown.markdown(your_text_string)	11 html = markdown.markdown(your_text_string)

44	12

45 See <http://packages.python.org/Markdown/> for more	13 See <https://pythonhosted.org/Markdown/> for more

46 information and instructions on how to extend the functionality of	14 information and instructions on how to extend the functionality of

47 Python Markdown. Read that before you try modifying this file.	15 Python Markdown. Read that before you try modifying this file.

48	16

49 ## Authors and License	17 ## Authors and License

50	18

51 Started by [Manfred Stienstra](http://www.dwerg.net/). Continued and	19 Started by [Manfred Stienstra](http://www.dwerg.net/). Continued and

52 maintained by [Yuri Takhteyev](http://www.freewisdom.org), [Waylan	20 maintained by [Yuri Takhteyev](http://www.freewisdom.org), [Waylan

53 Limberg](http://achinghead.com/) and [Artem Yunusov](http://blog.splyer.com).	21 Limberg](http://achinghead.com/) and [Artem Yunusov](http://blog.splyer.com).

54	22

55 Contact: markdown@freewisdom.org	23 Contact: markdown@freewisdom.org

56	24

57 Copyright 2007-2013 The Python Markdown Project (v. 1.7 and later)	25 Copyright 2007-2013 The Python Markdown Project (v. 1.7 and later)

58 Copyright 200? Django Software Foundation (OrderedDict implementation)	26 Copyright 200? Django Software Foundation (OrderedDict implementation)

59 Copyright 2004, 2005, 2006 Yuri Takhteyev (v. 0.2-1.6b)	27 Copyright 2004, 2005, 2006 Yuri Takhteyev (v. 0.2-1.6b)

60 Copyright 2004 Manfred Stienstra (the original version)	28 Copyright 2004 Manfred Stienstra (the original version)

61	29

62 License: BSD (see LICENSE for details).	30 License: BSD (see LICENSE for details).

63 """	31 """

64	32

65 from __future__ import absolute_import	33 from __future__ import absolute_import

66 from __future__ import unicode_literals	34 from __future__ import unicode_literals

67 from .__version__ import version, version_info	35 from .__version__ import version, version_info # noqa

68 import re

69 import codecs	36 import codecs

70 import sys	37 import sys

71 import logging	38 import logging

	39 import warnings

	40 import importlib

72 from . import util	41 from . import util

73 from .preprocessors import build_preprocessors	42 from .preprocessors import build_preprocessors

74 from .blockprocessors import build_block_parser	43 from .blockprocessors import build_block_parser

75 from .treeprocessors import build_treeprocessors	44 from .treeprocessors import build_treeprocessors

76 from .inlinepatterns import build_inlinepatterns	45 from .inlinepatterns import build_inlinepatterns

77 from .postprocessors import build_postprocessors	46 from .postprocessors import build_postprocessors

78 from .extensions import Extension	47 from .extensions import Extension

79 from .serializers import to_html_string, to_xhtml_string	48 from .serializers import to_html_string, to_xhtml_string

80	49

81 __all__ = ['Markdown', 'markdown', 'markdownFromFile']	50 __all__ = ['Markdown', 'markdown', 'markdownFromFile']

82	51

	52

83 logger = logging.getLogger('MARKDOWN')	53 logger = logging.getLogger('MARKDOWN')

84	54

85	55

86 class Markdown(object):	56 class Markdown(object):

87 """Convert Markdown to HTML."""	57 """Convert Markdown to HTML."""

88	58

89 doc_tag = "div" # Element used to wrap document - later removed	59 doc_tag = "div" # Element used to wrap document - later removed

90	60

91 option_defaults = {	61 option_defaults = {

92 'html_replacement_text' : '[HTML_REMOVED]',	62 'html_replacement_text': '[HTML_REMOVED]',

93 'tab_length' : 4,	63 'tab_length': 4,

94 'enable_attributes' : True,	64 'enable_attributes': True,

95 'smart_emphasis' : True,	65 'smart_emphasis': True,

96 'lazy_ol' : True,	66 'lazy_ol': True,

97 }	67 }

98	68

99 output_formats = {	69 output_formats = {

100 'html' : to_html_string,	70 'html': to_html_string,

101 'html4' : to_html_string,	71 'html4': to_html_string,

102 'html5' : to_html_string,	72 'html5': to_html_string,

103 'xhtml' : to_xhtml_string,	73 'xhtml': to_xhtml_string,

104 'xhtml1': to_xhtml_string,	74 'xhtml1': to_xhtml_string,

105 'xhtml5': to_xhtml_string,	75 'xhtml5': to_xhtml_string,

106 }	76 }

107	77

108 ESCAPED_CHARS = ['\\', '`', '*', '_', '{', '}', '[', ']',	78 ESCAPED_CHARS = ['\\', '`', '*', '_', '{', '}', '[', ']',

109 '(', ')', '>', '#', '+', '-', '.', '!']	79 '(', ')', '>', '#', '+', '-', '.', '!']

110	80

111 def __init__(self, *args, **kwargs):	81 def __init__(self, *args, **kwargs):

112 """	82 """

113 Creates a new Markdown instance.	83 Creates a new Markdown instance.

114	84

115 Keyword arguments:	85 Keyword arguments:

116	86

117 * extensions: A list of extensions.	87 * extensions: A list of extensions.

118 If they are of type string, the module mdx_name.py will be loaded.	88 If they are of type string, the module mdx_name.py will be loaded.

119 If they are a subclass of markdown.Extension, they will be used	89 If they are a subclass of markdown.Extension, they will be used

120 as-is.	90 as-is.

121 * extension_configs: Configuration settingis for extensions.	91 * extension_configs: Configuration settings for extensions.

122 * output_format: Format of output. Supported formats are:	92 * output_format: Format of output. Supported formats are:

123 * "xhtml1": Outputs XHTML 1.x. Default.	93 * "xhtml1": Outputs XHTML 1.x. Default.

124 * "xhtml5": Outputs XHTML style tags of HTML 5	94 * "xhtml5": Outputs XHTML style tags of HTML 5

125 * "xhtml": Outputs latest supported version of XHTML (currently XHTML 1.1).	95 * "xhtml": Outputs latest supported version of XHTML

	96 (currently XHTML 1.1).

126 * "html4": Outputs HTML 4	97 * "html4": Outputs HTML 4

127 * "html5": Outputs HTML style tags of HTML 5	98 * "html5": Outputs HTML style tags of HTML 5

128 * "html": Outputs latest supported version of HTML (currently HTML 4).	99 * "html": Outputs latest supported version of HTML

	100 (currently HTML 4).

129 Note that it is suggested that the more specific formats ("xhtml1"	101 Note that it is suggested that the more specific formats ("xhtml1"

130 and "html4") be used as "xhtml" or "html" may change in the future	102 and "html4") be used as "xhtml" or "html" may change in the future

131 if it makes sense at that time.	103 if it makes sense at that time.

132 * safe_mode: Disallow raw html. One of "remove", "replace" or "escape".	104 * safe_mode: Deprecated! Disallow raw html. One of "remove", "replace"

133 * html_replacement_text: Text used when safe_mode is set to "replace".	105 or "escape".

	106 * html_replacement_text: Deprecated! Text used when safe_mode is set

	107 to "replace".

134 * tab_length: Length of tabs in the source. Default: 4	108 * tab_length: Length of tabs in the source. Default: 4

135 * enable_attributes: Enable the conversion of attributes. Default: True	109 * enable_attributes: Enable the conversion of attributes. Default: True

136 * smart_emphasis: Treat `_connected_words_` intelegently Default: True	110 * smart_emphasis: Treat `_connected_words_` intelligently Default: True

137 * lazy_ol: Ignore number of first item of ordered lists. Default: True	111 * lazy_ol: Ignore number of first item of ordered lists. Default: True

138	112

139 """	113 """

140	114

141 # For backward compatibility, loop through old positional args	115 # For backward compatibility, loop through old positional args

142 pos = ['extensions', 'extension_configs', 'safe_mode', 'output_format']	116 pos = ['extensions', 'extension_configs', 'safe_mode', 'output_format']

143 c = 0	117 for c, arg in enumerate(args):

144 for arg in args:

145 if pos[c] not in kwargs:	118 if pos[c] not in kwargs:

146 kwargs[pos[c]] = arg	119 kwargs[pos[c]] = arg

147 c += 1	120 if c+1 == len(pos): # pragma: no cover

148 if c == len(pos):

149 # ignore any additional args	121 # ignore any additional args

150 break	122 break

	123 if len(args):

	124 warnings.warn('Positional arguments are deprecated in Markdown. '

	125 'Use keyword arguments only.',

	126 DeprecationWarning)

151	127

152 # Loop through kwargs and assign defaults	128 # Loop through kwargs and assign defaults

153 for option, default in self.option_defaults.items():	129 for option, default in self.option_defaults.items():

154 setattr(self, option, kwargs.get(option, default))	130 setattr(self, option, kwargs.get(option, default))

155	131

156 self.safeMode = kwargs.get('safe_mode', False)	132 self.safeMode = kwargs.get('safe_mode', False)

157 if self.safeMode and 'enable_attributes' not in kwargs:	133 if self.safeMode and 'enable_attributes' not in kwargs:

158 # Disable attributes in safeMode when not explicitly set	134 # Disable attributes in safeMode when not explicitly set

159 self.enable_attributes = False	135 self.enable_attributes = False

160	136

	137 if 'safe_mode' in kwargs:

	138 warnings.warn('"safe_mode" is deprecated in Python-Markdown. '

	139 'Use an HTML sanitizer (like '

	140 'Bleach http://bleach.readthedocs.org/) '

	141 'if you are parsing untrusted markdown text. '

	142 'See the 2.6 release notes for more info',

	143 DeprecationWarning)

	144

	145 if 'html_replacement_text' in kwargs:

	146 warnings.warn('The "html_replacement_text" keyword is '

	147 'deprecated along with "safe_mode".',

	148 DeprecationWarning)

	149

161 self.registeredExtensions = []	150 self.registeredExtensions = []

162 self.docType = ""	151 self.docType = ""

163 self.stripTopLevelTags = True	152 self.stripTopLevelTags = True

164	153

165 self.build_parser()	154 self.build_parser()

166	155

167 self.references = {}	156 self.references = {}

168 self.htmlStash = util.HtmlStash()	157 self.htmlStash = util.HtmlStash()

169 self.set_output_format(kwargs.get('output_format', 'xhtml1'))

170 self.registerExtensions(extensions=kwargs.get('extensions', []),	158 self.registerExtensions(extensions=kwargs.get('extensions', []),

171 configs=kwargs.get('extension_configs', {}))	159 configs=kwargs.get('extension_configs', {}))

	160 self.set_output_format(kwargs.get('output_format', 'xhtml1'))

172 self.reset()	161 self.reset()

173	162

174 def build_parser(self):	163 def build_parser(self):

175 """ Build the parser from the various parts. """	164 """ Build the parser from the various parts. """

176 self.preprocessors = build_preprocessors(self)	165 self.preprocessors = build_preprocessors(self)

177 self.parser = build_block_parser(self)	166 self.parser = build_block_parser(self)

178 self.inlinePatterns = build_inlinepatterns(self)	167 self.inlinePatterns = build_inlinepatterns(self)

179 self.treeprocessors = build_treeprocessors(self)	168 self.treeprocessors = build_treeprocessors(self)

180 self.postprocessors = build_postprocessors(self)	169 self.postprocessors = build_postprocessors(self)

181 return self	170 return self

182	171

183 def registerExtensions(self, extensions, configs):	172 def registerExtensions(self, extensions, configs):

184 """	173 """

185 Register extensions with this instance of Markdown.	174 Register extensions with this instance of Markdown.

186	175

187 Keyword arguments:	176 Keyword arguments:

188	177

189 * extensions: A list of extensions, which can either	178 * extensions: A list of extensions, which can either

190 be strings or objects. See the docstring on Markdown.	179 be strings or objects. See the docstring on Markdown.

191 * configs: A dictionary mapping module names to config options.	180 * configs: A dictionary mapping module names to config options.

192	181

193 """	182 """

194 for ext in extensions:	183 for ext in extensions:

195 if isinstance(ext, util.string_type):	184 if isinstance(ext, util.string_type):

196 ext = self.build_extension(ext, configs.get(ext, []))	185 ext = self.build_extension(ext, configs.get(ext, {}))

197 if isinstance(ext, Extension):	186 if isinstance(ext, Extension):

198 ext.extendMarkdown(self, globals())	187 ext.extendMarkdown(self, globals())

	188 logger.debug(

	189 'Successfully loaded extension "%s.%s".'

	190 % (ext.__class__.__module__, ext.__class__.__name__)

	191 )

199 elif ext is not None:	192 elif ext is not None:

200 raise TypeError(	193 raise TypeError(

201 'Extension "%s.%s" must be of type: "markdown.Extension"'	194 'Extension "%s.%s" must be of type: "markdown.Extension"'

202 % (ext.__class__.__module__, ext.__class__.__name__))	195 % (ext.__class__.__module__, ext.__class__.__name__))

203	196

204 return self	197 return self

205	198

206 def build_extension(self, ext_name, configs = []):	199 def build_extension(self, ext_name, configs):

207 """Build extension by name, then return the module.	200 """Build extension by name, then return the module.

208	201

209 The extension name may contain arguments as part of the string in the	202 The extension name may contain arguments as part of the string in the

210 following format: "extname(key1=value1,key2=value2)"	203 following format: "extname(key1=value1,key2=value2)"

211	204

212 """	205 """

213	206

	207 configs = dict(configs)

	208

214 # Parse extensions config params (ignore the order)	209 # Parse extensions config params (ignore the order)

215 configs = dict(configs)	210 pos = ext_name.find("(") # find the first "("

216 pos = ext_name.find("(") # find the first "("

217 if pos > 0:	211 if pos > 0:

218 ext_args = ext_name[pos+1:-1]	212 ext_args = ext_name[pos+1:-1]

219 ext_name = ext_name[:pos]	213 ext_name = ext_name[:pos]

220 pairs = [x.split("=") for x in ext_args.split(",")]	214 pairs = [x.split("=") for x in ext_args.split(",")]

221 configs.update([(x.strip(), y.strip()) for (x, y) in pairs])	215 configs.update([(x.strip(), y.strip()) for (x, y) in pairs])

	216 warnings.warn('Setting configs in the Named Extension string is '

	217 'deprecated. It is recommended that you '

	218 'pass an instance of the extension class to '

	219 'Markdown or use the "extension_configs" keyword. '

	220 'The current behavior will raise an error in version 2.7. '

	221 'See the Release Notes for Python-Markdown version '

	222 '2.6 for more info.', DeprecationWarning)

222	223

223 # Setup the module name	224 # Get class name (if provided): `path.to.module:ClassName`

224 module_name = ext_name	225 ext_name, class_name = ext_name.split(':', 1) \

225 if '.' not in ext_name:	226 if ':' in ext_name else (ext_name, '')

226 module_name = '.'.join(['third_party.markdown.extensions', ext_name])

227	227

228 # Try loading the extension first from one place, then another	228 # Try loading the extension first from one place, then another

229 try: # New style (markdown.extensons.<extension>)	229 try:

230 module = __import__(module_name, {}, {}, [module_name.rpartition('.')[0]])	230 # Assume string uses dot syntax (`path.to.some.module`)

	231 module = importlib.import_module(ext_name)

	232 logger.debug(

	233 'Successfuly imported extension module "%s".' % ext_name

	234 )

	235 # For backward compat (until deprecation)

	236 # check that this is an extension.

	237 if ('.' not in ext_name and not (hasattr(module, 'makeExtension') or

	238 (class_name and hasattr(module, class_name)))):

	239 # We have a name conflict

	240 # eg: extensions=['tables'] and PyTables is installed

	241 raise ImportError

231 except ImportError:	242 except ImportError:

232 module_name_old_style = '_'.join(['mdx', ext_name])	243 # Preppend `markdown.extensions.` to name

233 try: # Old style (mdx_<extension>)	244 module_name = '.'.join(['markdown.extensions', ext_name])

234 module = __import__(module_name_old_style)	245 try:

235 except ImportError as e:	246 module = importlib.import_module(module_name)

236 message = "Failed loading extension '%s' from '%s' or '%s'" \	247 logger.debug(

237 % (ext_name, module_name, module_name_old_style)	248 'Successfuly imported extension module "%s".' %

	249 module_name

	250 )

	251 warnings.warn('Using short names for Markdown\'s builtin '

	252 'extensions is deprecated. Use the '

	253 'full path to the extension with Python\'s dot '

	254 'notation (eg: "%s" instead of "%s"). The '

	255 'current behavior will raise an error in version '

	256 '2.7. See the Release Notes for '

	257 'Python-Markdown version 2.6 for more info.' %

	258 (module_name, ext_name),

	259 DeprecationWarning)

	260 except ImportError:

	261 # Preppend `mdx_` to name

	262 module_name_old_style = '_'.join(['mdx', ext_name])

	263 try:

	264 module = importlib.import_module(module_name_old_style)

	265 logger.debug(

	266 'Successfuly imported extension module "%s".' %

	267 module_name_old_style)

	268 warnings.warn('Markdown\'s behavior of prepending "mdx_" '

	269 'to an extension name is deprecated. '

	270 'Use the full path to the '

	271 'extension with Python\'s dot notation '

	272 '(eg: "%s" instead of "%s"). The current '

	273 'behavior will raise an error in version 2.7. '

	274 'See the Release Notes for Python-Markdown '

	275 'version 2.6 for more info.' %

	276 (module_name_old_style, ext_name),

	277 DeprecationWarning)

	278 except ImportError as e:

	279 message = "Failed loading extension '%s' from '%s', '%s' " \

	280 "or '%s'" % (ext_name, ext_name, module_name,

	281 module_name_old_style)

	282 e.args = (message,) + e.args[1:]

	283 raise

	284

	285 if class_name:

	286 # Load given class name from module.

	287 return getattr(module, class_name)(**configs)

	288 else:

	289 # Expect makeExtension() function to return a class.

	290 try:

	291 return module.makeExtension(**configs)

	292 except AttributeError as e:

	293 message = e.args[0]

	294 message = "Failed to initiate extension " \

	295 "'%s': %s" % (ext_name, message)

238 e.args = (message,) + e.args[1:]	296 e.args = (message,) + e.args[1:]

239 raise	297 raise

240	298

241 # If the module is loaded successfully, we expect it to define a

242 # function called makeExtension()

243 try:

244 return module.makeExtension(configs.items())

245 except AttributeError as e:

246 message = e.args[0]

247 message = "Failed to initiate extension " \

248 "'%s': %s" % (ext_name, message)

249 e.args = (message,) + e.args[1:]

250 raise

251

252 def registerExtension(self, extension):	299 def registerExtension(self, extension):

253 """ This gets called by the extension """	300 """ This gets called by the extension """

254 self.registeredExtensions.append(extension)	301 self.registeredExtensions.append(extension)

255 return self	302 return self

256	303

257 def reset(self):	304 def reset(self):

258 """	305 """

259 Resets all state variables so that we can start with a new text.	306 Resets all state variables so that we can start with a new text.

260 """	307 """

261 self.htmlStash.reset()	308 self.htmlStash.reset()

262 self.references.clear()	309 self.references.clear()

263	310

264 for extension in self.registeredExtensions:	311 for extension in self.registeredExtensions:

265 if hasattr(extension, 'reset'):	312 if hasattr(extension, 'reset'):

266 extension.reset()	313 extension.reset()

267	314

268 return self	315 return self

269	316

270 def set_output_format(self, format):	317 def set_output_format(self, format):

271 """ Set the output format for the class instance. """	318 """ Set the output format for the class instance. """

272 self.output_format = format.lower()	319 self.output_format = format.lower()

273 try:	320 try:

274 self.serializer = self.output_formats[self.output_format]	321 self.serializer = self.output_formats[self.output_format]

275 except KeyError as e:	322 except KeyError as e:

276 valid_formats = list(self.output_formats.keys())	323 valid_formats = list(self.output_formats.keys())

277 valid_formats.sort()	324 valid_formats.sort()

278 message = 'Invalid Output Format: "%s". Use one of %s.' \	325 message = 'Invalid Output Format: "%s". Use one of %s.' \

279 % (self.output_format,	326 % (self.output_format,

280 '"' + '", "'.join(valid_formats) + '"')	327 '"' + '", "'.join(valid_formats) + '"')

281 e.args = (message,) + e.args[1:]	328 e.args = (message,) + e.args[1:]

282 raise	329 raise

283 return self	330 return self

284	331

285 def convert(self, source):	332 def convert(self, source):

286 """	333 """

287 Convert markdown to serialized XHTML or HTML.	334 Convert markdown to serialized XHTML or HTML.

288	335

289 Keyword arguments:	336 Keyword arguments:

290	337

(...skipping 28 matching lines...)
319 self.lines = source.split("\n")	366 self.lines = source.split("\n")

320 for prep in self.preprocessors.values():	367 for prep in self.preprocessors.values():

321 self.lines = prep.run(self.lines)	368 self.lines = prep.run(self.lines)

322	369

323 # Parse the high-level elements.	370 # Parse the high-level elements.

324 root = self.parser.parseDocument(self.lines).getroot()	371 root = self.parser.parseDocument(self.lines).getroot()

325	372

326 # Run the tree-processors	373 # Run the tree-processors

327 for treeprocessor in self.treeprocessors.values():	374 for treeprocessor in self.treeprocessors.values():

328 newRoot = treeprocessor.run(root)	375 newRoot = treeprocessor.run(root)

329 if newRoot:	376 if newRoot is not None:

330 root = newRoot	377 root = newRoot

331	378

332 # Serialize _properly_. Strip top-level tags.	379 # Serialize _properly_. Strip top-level tags.

333 output = self.serializer(root)	380 output = self.serializer(root)

334 if self.stripTopLevelTags:	381 if self.stripTopLevelTags:

335 try:	382 try:

336 start = output.index('<%s>'%self.doc_tag)+len(self.doc_tag)+2	383 start = output.index(

337 end = output.rindex('</%s>'%self.doc_tag)	384 '<%s>' % self.doc_tag) + len(self.doc_tag) + 2

	385 end = output.rindex('</%s>' % self.doc_tag)

338 output = output[start:end].strip()	386 output = output[start:end].strip()

339 except ValueError:	387 except ValueError: # pragma: no cover

340 if output.strip().endswith('<%s />'%self.doc_tag):	388 if output.strip().endswith('<%s />' % self.doc_tag):

341 # We have an empty document	389 # We have an empty document

342 output = ''	390 output = ''

343 else:	391 else:

344 # We have a serious problem	392 # We have a serious problem

345 raise ValueError('Markdown failed to strip top-level tags. Document=%r' % output.strip())	393 raise ValueError('Markdown failed to strip top-level '

	394 'tags. Document=%r' % output.strip())

346	395

347 # Run the text post-processors	396 # Run the text post-processors

348 for pp in self.postprocessors.values():	397 for pp in self.postprocessors.values():

349 output = pp.run(output)	398 output = pp.run(output)

350	399

351 return output.strip()	400 return output.strip()

352	401

353 def convertFile(self, input=None, output=None, encoding=None):	402 def convertFile(self, input=None, output=None, encoding=None):

354 """Converts a markdown file and returns the HTML as a unicode string.	403 """Converts a Markdown file and returns the HTML as a Unicode string.

355	404

356 Decodes the file using the provided encoding (defaults to utf-8),	405 Decodes the file using the provided encoding (defaults to utf-8),

357 passes the file content to markdown, and outputs the html to either	406 passes the file content to markdown, and outputs the html to either

358 the provided stream or the file with provided name, using the same	407 the provided stream or the file with provided name, using the same

359 encoding as the source file. The 'xmlcharrefreplace' error handler is	408 encoding as the source file. The 'xmlcharrefreplace' error handler is

360 used when encoding the output.	409 used when encoding the output.

361	410

362 Note: This is the only place that decoding and encoding of unicode	411 Note: This is the only place that decoding and encoding of Unicode

363 takes place in Python-Markdown. (All other code is unicode-in /	412 takes place in Python-Markdown. (All other code is Unicode-in /

364 unicode-out.)	413 Unicode-out.)

365	414

366 Keyword arguments:	415 Keyword arguments:

367	416

368 * input: File object or path. Reads from stdin if `None`.	417 * input: File object or path. Reads from stdin if `None`.

369 * output: File object or path. Writes to stdout if `None`.	418 * output: File object or path. Writes to stdout if `None`.

370 * encoding: Encoding of input and output files. Defaults to utf-8.	419 * encoding: Encoding of input and output files. Defaults to utf-8.

371	420

372 """	421 """

373	422

374 encoding = encoding or "utf-8"	423 encoding = encoding or "utf-8"

375	424

376 # Read the source	425 # Read the source

377 if input:	426 if input:

378 if isinstance(input, util.string_type):	427 if isinstance(input, util.string_type):

379 input_file = codecs.open(input, mode="r", encoding=encoding)	428 input_file = codecs.open(input, mode="r", encoding=encoding)

380 else:	429 else:

381 input_file = codecs.getreader(encoding)(input)	430 input_file = codecs.getreader(encoding)(input)

382 text = input_file.read()	431 text = input_file.read()

383 input_file.close()	432 input_file.close()

384 else:	433 else:

385 text = sys.stdin.read()	434 text = sys.stdin.read()

386 if not isinstance(text, util.text_type):	435 if not isinstance(text, util.text_type):

387 text = text.decode(encoding)	436 text = text.decode(encoding)

388	437

389 text = text.lstrip('\ufeff') # remove the byte-order mark	438 text = text.lstrip('\ufeff') # remove the byte-order mark

390	439

391 # Convert	440 # Convert

392 html = self.convert(text)	441 html = self.convert(text)

393	442

394 # Write to file or stdout	443 # Write to file or stdout

395 if output:	444 if output:

396 if isinstance(output, util.string_type):	445 if isinstance(output, util.string_type):

397 output_file = codecs.open(output, "w",	446 output_file = codecs.open(output, "w",

398 encoding=encoding,	447 encoding=encoding,

399 errors="xmlcharrefreplace")	448 errors="xmlcharrefreplace")

400 output_file.write(html)	449 output_file.write(html)

401 output_file.close()	450 output_file.close()

402 else:	451 else:

403 writer = codecs.getwriter(encoding)	452 writer = codecs.getwriter(encoding)

404 output_file = writer(output, errors="xmlcharrefreplace")	453 output_file = writer(output, errors="xmlcharrefreplace")

405 output_file.write(html)	454 output_file.write(html)

406 # Don't close here. User may want to write more.	455 # Don't close here. User may want to write more.

407 else:	456 else:

408 # Encode manually and write bytes to stdout.	457 # Encode manually and write bytes to stdout.

409 html = html.encode(encoding, "xmlcharrefreplace")	458 html = html.encode(encoding, "xmlcharrefreplace")

410 try:	459 try:

411 # Write bytes directly to buffer (Python 3).	460 # Write bytes directly to buffer (Python 3).

412 sys.stdout.buffer.write(html)	461 sys.stdout.buffer.write(html)

413 except AttributeError:	462 except AttributeError:

414 # Probably Python 2, which works with bytes by default.	463 # Probably Python 2, which works with bytes by default.

415 sys.stdout.write(html)	464 sys.stdout.write(html)

416	465

417 return self	466 return self

418	467

419	468

420 """	469 """

421 EXPORTED FUNCTIONS	470 EXPORTED FUNCTIONS

422 =============================================================================	471 =============================================================================

423	472

424 Those are the two functions we really mean to export: markdown() and	473 Those are the two functions we really mean to export: markdown() and

425 markdownFromFile().	474 markdownFromFile().

426 """	475 """

427	476

	477

428 def markdown(text, *args, **kwargs):	478 def markdown(text, *args, **kwargs):

429 """Convert a markdown string to HTML and return HTML as a unicode string.	479 """Convert a Markdown string to HTML and return HTML as a Unicode string.

430	480

431 This is a shortcut function for `Markdown` class to cover the most	481 This is a shortcut function for `Markdown` class to cover the most

432 basic use case. It initializes an instance of Markdown, loads the	482 basic use case. It initializes an instance of Markdown, loads the

433 necessary extensions and runs the parser on the given text.	483 necessary extensions and runs the parser on the given text.

434	484

435 Keyword arguments:	485 Keyword arguments:

436	486

437 * text: Markdown formatted text as Unicode or ASCII string.	487 * text: Markdown formatted text as Unicode or ASCII string.

438 * Any arguments accepted by the Markdown class.	488 * Any arguments accepted by the Markdown class.

439	489

(...skipping 20 matching lines...)
460 """	510 """

461 # For backward compatibility loop through positional args	511 # For backward compatibility loop through positional args

462 pos = ['input', 'output', 'extensions', 'encoding']	512 pos = ['input', 'output', 'extensions', 'encoding']

463 c = 0	513 c = 0

464 for arg in args:	514 for arg in args:

465 if pos[c] not in kwargs:	515 if pos[c] not in kwargs:

466 kwargs[pos[c]] = arg	516 kwargs[pos[c]] = arg

467 c += 1	517 c += 1

468 if c == len(pos):	518 if c == len(pos):

469 break	519 break

	520 if len(args):

	521 warnings.warn('Positional arguments are depreacted in '

	522 'Markdown and will raise an error in version 2.7. '

	523 'Use keyword arguments only.',

	524 DeprecationWarning)

470	525

471 md = Markdown(**kwargs)	526 md = Markdown(**kwargs)

472 md.convertFile(kwargs.get('input', None),	527 md.convertFile(kwargs.get('input', None),

473 kwargs.get('output', None),	528 kwargs.get('output', None),

474 kwargs.get('encoding', None))	529 kwargs.get('encoding', None))

475
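
Note on the extension-name handling in the NEW column of build_extension(): the string is parsed in two steps, first the deprecated "extname(key1=value1,key2=value2)" form, then the optional ":ClassName" suffix. The following is a minimal sketch of just that parsing, with the import step left out. parse_extension_name is a hypothetical helper written for illustration, not a function in Python-Markdown, and the warning text is shortened.

```python
import warnings


def parse_extension_name(ext_name):
    """Split an extension string into (module_path, class_name, configs).

    Hypothetical helper that mirrors the parsing order in the NEW column
    of build_extension(); it does not import the module.
    """
    configs = {}

    # Step 1 (deprecated form): "extname(key1=value1,key2=value2)"
    pos = ext_name.find("(")
    if pos > 0:
        ext_args = ext_name[pos + 1:-1]   # text between "(" and the final ")"
        ext_name = ext_name[:pos]
        pairs = [x.split("=") for x in ext_args.split(",")]
        configs.update((k.strip(), v.strip()) for (k, v) in pairs)
        # Shortened message; the patch uses a longer one.
        warnings.warn("Setting configs in the extension string is deprecated.",
                      DeprecationWarning)

    # Step 2 (new form): "path.to.module:ClassName"
    if ":" in ext_name:
        ext_name, class_name = ext_name.split(":", 1)
    else:
        class_name = ""

    return ext_name, class_name, configs


if __name__ == "__main__":
    print(parse_extension_name("markdown.extensions.toc:TocExtension"))
    print(parse_extension_name("tables"))
```

As in the patch, config values parsed from the string stay plain strings, and a value that itself contains "=" fails to unpack, which is one reason the patch steers callers toward the "extension_configs" keyword instead.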
