Side by Side Diff: third_party/Python-Markdown/markdown/__init__.py

Issue 1356203004: Check in a simple pure-python based Markdown previewer. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@add
Patch Set: fix license file Created 5 years, 2 months ago
OLD NEW
1 # markdown is released under the BSD license
2 # Copyright 2007, 2008 The Python Markdown Project (v. 1.7 and later)
3 # Copyright 2004, 2005, 2006 Yuri Takhteyev (v. 0.2-1.6b)
4 # Copyright 2004 Manfred Stienstra (the original version)
5 #
6 # All rights reserved.
7 #
8 # Redistribution and use in source and binary forms, with or without
9 # modification, are permitted provided that the following conditions are met:
10 #
11 # * Redistributions of source code must retain the above copyright
12 # notice, this list of conditions and the following disclaimer.
13 # * Redistributions in binary form must reproduce the above copyright
14 # notice, this list of conditions and the following disclaimer in the
15 # documentation and/or other materials provided with the distribution.
16 # * Neither the name of the <organization> nor the
17 # names of its contributors may be used to endorse or promote products
18 # derived from this software without specific prior written permission.
19 #
20 # THIS SOFTWARE IS PROVIDED BY THE PYTHON MARKDOWN PROJECT ''AS IS'' AND ANY
21 # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
22 # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
23 # DISCLAIMED. IN NO EVENT SHALL ANY CONTRIBUTORS TO THE PYTHON MARKDOWN PROJECT
24 # BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
25 # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
26 # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
27 # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
28 # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
29 # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
30 # POSSIBILITY OF SUCH DAMAGE.
31
32
33 """ 1 """
34 Python Markdown 2 Python Markdown
35 =============== 3 ===============
36 4
37 Python Markdown converts Markdown to HTML and can be used as a library or 5 Python Markdown converts Markdown to HTML and can be used as a library or
38 called from the command line. 6 called from the command line.
39 7
40 ## Basic usage as a module: 8 ## Basic usage as a module:
41 9
42 import markdown 10 import markdown
43 html = markdown.markdown(your_text_string) 11 html = markdown.markdown(your_text_string)
44 12
45 See <http://packages.python.org/Markdown/> for more 13 See <https://pythonhosted.org/Markdown/> for more
46 information and instructions on how to extend the functionality of 14 information and instructions on how to extend the functionality of
47 Python Markdown. Read that before you try modifying this file. 15 Python Markdown. Read that before you try modifying this file.
48 16
49 ## Authors and License 17 ## Authors and License
50 18
51 Started by [Manfred Stienstra](http://www.dwerg.net/). Continued and 19 Started by [Manfred Stienstra](http://www.dwerg.net/). Continued and
52 maintained by [Yuri Takhteyev](http://www.freewisdom.org), [Waylan 20 maintained by [Yuri Takhteyev](http://www.freewisdom.org), [Waylan
53 Limberg](http://achinghead.com/) and [Artem Yunusov](http://blog.splyer.com). 21 Limberg](http://achinghead.com/) and [Artem Yunusov](http://blog.splyer.com).
54 22
55 Contact: markdown@freewisdom.org 23 Contact: markdown@freewisdom.org
56 24
57 Copyright 2007-2013 The Python Markdown Project (v. 1.7 and later) 25 Copyright 2007-2013 The Python Markdown Project (v. 1.7 and later)
58 Copyright 200? Django Software Foundation (OrderedDict implementation) 26 Copyright 200? Django Software Foundation (OrderedDict implementation)
59 Copyright 2004, 2005, 2006 Yuri Takhteyev (v. 0.2-1.6b) 27 Copyright 2004, 2005, 2006 Yuri Takhteyev (v. 0.2-1.6b)
60 Copyright 2004 Manfred Stienstra (the original version) 28 Copyright 2004 Manfred Stienstra (the original version)
61 29
62 License: BSD (see LICENSE for details). 30 License: BSD (see LICENSE for details).
63 """ 31 """
64 32
65 from __future__ import absolute_import 33 from __future__ import absolute_import
66 from __future__ import unicode_literals 34 from __future__ import unicode_literals
67 from .__version__ import version, version_info 35 from .__version__ import version, version_info # noqa
68 import re
69 import codecs 36 import codecs
70 import sys 37 import sys
71 import logging 38 import logging
39 import warnings
40 import importlib
72 from . import util 41 from . import util
73 from .preprocessors import build_preprocessors 42 from .preprocessors import build_preprocessors
74 from .blockprocessors import build_block_parser 43 from .blockprocessors import build_block_parser
75 from .treeprocessors import build_treeprocessors 44 from .treeprocessors import build_treeprocessors
76 from .inlinepatterns import build_inlinepatterns 45 from .inlinepatterns import build_inlinepatterns
77 from .postprocessors import build_postprocessors 46 from .postprocessors import build_postprocessors
78 from .extensions import Extension 47 from .extensions import Extension
79 from .serializers import to_html_string, to_xhtml_string 48 from .serializers import to_html_string, to_xhtml_string
80 49
81 __all__ = ['Markdown', 'markdown', 'markdownFromFile'] 50 __all__ = ['Markdown', 'markdown', 'markdownFromFile']
82 51
52
83 logger = logging.getLogger('MARKDOWN') 53 logger = logging.getLogger('MARKDOWN')
84 54
85 55
86 class Markdown(object): 56 class Markdown(object):
87 """Convert Markdown to HTML.""" 57 """Convert Markdown to HTML."""
88 58
89 doc_tag = "div" # Element used to wrap document - later removed 59 doc_tag = "div" # Element used to wrap document - later removed
90 60
91 option_defaults = { 61 option_defaults = {
92 'html_replacement_text' : '[HTML_REMOVED]', 62 'html_replacement_text': '[HTML_REMOVED]',
93 'tab_length' : 4, 63 'tab_length': 4,
94 'enable_attributes' : True, 64 'enable_attributes': True,
95 'smart_emphasis' : True, 65 'smart_emphasis': True,
96 'lazy_ol' : True, 66 'lazy_ol': True,
97 } 67 }
98 68
99 output_formats = { 69 output_formats = {
100 'html' : to_html_string, 70 'html': to_html_string,
101 'html4' : to_html_string, 71 'html4': to_html_string,
102 'html5' : to_html_string, 72 'html5': to_html_string,
103 'xhtml' : to_xhtml_string, 73 'xhtml': to_xhtml_string,
104 'xhtml1': to_xhtml_string, 74 'xhtml1': to_xhtml_string,
105 'xhtml5': to_xhtml_string, 75 'xhtml5': to_xhtml_string,
106 } 76 }
107 77
108 ESCAPED_CHARS = ['\\', '`', '*', '_', '{', '}', '[', ']', 78 ESCAPED_CHARS = ['\\', '`', '*', '_', '{', '}', '[', ']',
109 '(', ')', '>', '#', '+', '-', '.', '!'] 79 '(', ')', '>', '#', '+', '-', '.', '!']
110 80
111 def __init__(self, *args, **kwargs): 81 def __init__(self, *args, **kwargs):
112 """ 82 """
113 Creates a new Markdown instance. 83 Creates a new Markdown instance.
114 84
115 Keyword arguments: 85 Keyword arguments:
116 86
117 * extensions: A list of extensions. 87 * extensions: A list of extensions.
118 If they are of type string, the module mdx_name.py will be loaded. 88 If they are of type string, the module mdx_name.py will be loaded.
119 If they are a subclass of markdown.Extension, they will be used 89 If they are a subclass of markdown.Extension, they will be used
120 as-is. 90 as-is.
121 * extension_configs: Configuration settingis for extensions. 91 * extension_configs: Configuration settings for extensions.
122 * output_format: Format of output. Supported formats are: 92 * output_format: Format of output. Supported formats are:
123 * "xhtml1": Outputs XHTML 1.x. Default. 93 * "xhtml1": Outputs XHTML 1.x. Default.
124 * "xhtml5": Outputs XHTML style tags of HTML 5 94 * "xhtml5": Outputs XHTML style tags of HTML 5
125 * "xhtml": Outputs latest supported version of XHTML (currently XHTML 1.1). 95 * "xhtml": Outputs latest supported version of XHTML
96 (currently XHTML 1.1).
126 * "html4": Outputs HTML 4 97 * "html4": Outputs HTML 4
127 * "html5": Outputs HTML style tags of HTML 5 98 * "html5": Outputs HTML style tags of HTML 5
128 * "html": Outputs latest supported version of HTML (currently HTML 4). 99 * "html": Outputs latest supported version of HTML
100 (currently HTML 4).
129 Note that it is suggested that the more specific formats ("xhtml1" 101 Note that it is suggested that the more specific formats ("xhtml1"
130 and "html4") be used as "xhtml" or "html" may change in the future 102 and "html4") be used as "xhtml" or "html" may change in the future
131 if it makes sense at that time. 103 if it makes sense at that time.
132 * safe_mode: Disallow raw html. One of "remove", "replace" or "escape". 104 * safe_mode: Deprecated! Disallow raw html. One of "remove", "replace"
133 * html_replacement_text: Text used when safe_mode is set to "replace". 105 or "escape".
106 * html_replacement_text: Deprecated! Text used when safe_mode is set
107 to "replace".
134 * tab_length: Length of tabs in the source. Default: 4 108 * tab_length: Length of tabs in the source. Default: 4
135 * enable_attributes: Enable the conversion of attributes. Default: True 109 * enable_attributes: Enable the conversion of attributes. Default: True
136 * smart_emphasis: Treat `_connected_words_` intelegently Default: True 110 * smart_emphasis: Treat `_connected_words_` intelligently Default: True
137 * lazy_ol: Ignore number of first item of ordered lists. Default: True 111 * lazy_ol: Ignore number of first item of ordered lists. Default: True
138 112
139 """ 113 """
140 114
141 # For backward compatibility, loop through old positional args 115 # For backward compatibility, loop through old positional args
142 pos = ['extensions', 'extension_configs', 'safe_mode', 'output_format'] 116 pos = ['extensions', 'extension_configs', 'safe_mode', 'output_format']
143 c = 0 117 for c, arg in enumerate(args):
144 for arg in args:
145 if pos[c] not in kwargs: 118 if pos[c] not in kwargs:
146 kwargs[pos[c]] = arg 119 kwargs[pos[c]] = arg
147 c += 1 120 if c+1 == len(pos): # pragma: no cover
148 if c == len(pos):
149 # ignore any additional args 121 # ignore any additional args
150 break 122 break
123 if len(args):
124 warnings.warn('Positional arguments are deprecated in Markdown. '
125 'Use keyword arguments only.',
126 DeprecationWarning)
151 127
152 # Loop through kwargs and assign defaults 128 # Loop through kwargs and assign defaults
153 for option, default in self.option_defaults.items(): 129 for option, default in self.option_defaults.items():
154 setattr(self, option, kwargs.get(option, default)) 130 setattr(self, option, kwargs.get(option, default))
155 131
156 self.safeMode = kwargs.get('safe_mode', False) 132 self.safeMode = kwargs.get('safe_mode', False)
157 if self.safeMode and 'enable_attributes' not in kwargs: 133 if self.safeMode and 'enable_attributes' not in kwargs:
158 # Disable attributes in safeMode when not explicitly set 134 # Disable attributes in safeMode when not explicitly set
159 self.enable_attributes = False 135 self.enable_attributes = False
160 136
137 if 'safe_mode' in kwargs:
138 warnings.warn('"safe_mode" is deprecated in Python-Markdown. '
139 'Use an HTML sanitizer (like '
140 'Bleach http://bleach.readthedocs.org/) '
141 'if you are parsing untrusted markdown text. '
142 'See the 2.6 release notes for more info',
143 DeprecationWarning)
144
145 if 'html_replacement_text' in kwargs:
146 warnings.warn('The "html_replacement_text" keyword is '
147 'deprecated along with "safe_mode".',
148 DeprecationWarning)
149
161 self.registeredExtensions = [] 150 self.registeredExtensions = []
162 self.docType = "" 151 self.docType = ""
163 self.stripTopLevelTags = True 152 self.stripTopLevelTags = True
164 153
165 self.build_parser() 154 self.build_parser()
166 155
167 self.references = {} 156 self.references = {}
168 self.htmlStash = util.HtmlStash() 157 self.htmlStash = util.HtmlStash()
169 self.set_output_format(kwargs.get('output_format', 'xhtml1'))
170 self.registerExtensions(extensions=kwargs.get('extensions', []), 158 self.registerExtensions(extensions=kwargs.get('extensions', []),
171 configs=kwargs.get('extension_configs', {})) 159 configs=kwargs.get('extension_configs', {}))
160 self.set_output_format(kwargs.get('output_format', 'xhtml1'))
172 self.reset() 161 self.reset()
173 162
174 def build_parser(self): 163 def build_parser(self):
175 """ Build the parser from the various parts. """ 164 """ Build the parser from the various parts. """
176 self.preprocessors = build_preprocessors(self) 165 self.preprocessors = build_preprocessors(self)
177 self.parser = build_block_parser(self) 166 self.parser = build_block_parser(self)
178 self.inlinePatterns = build_inlinepatterns(self) 167 self.inlinePatterns = build_inlinepatterns(self)
179 self.treeprocessors = build_treeprocessors(self) 168 self.treeprocessors = build_treeprocessors(self)
180 self.postprocessors = build_postprocessors(self) 169 self.postprocessors = build_postprocessors(self)
181 return self 170 return self
182 171
183 def registerExtensions(self, extensions, configs): 172 def registerExtensions(self, extensions, configs):
184 """ 173 """
185 Register extensions with this instance of Markdown. 174 Register extensions with this instance of Markdown.
186 175
187 Keyword arguments: 176 Keyword arguments:
188 177
189 * extensions: A list of extensions, which can either 178 * extensions: A list of extensions, which can either
190 be strings or objects. See the docstring on Markdown. 179 be strings or objects. See the docstring on Markdown.
191 * configs: A dictionary mapping module names to config options. 180 * configs: A dictionary mapping module names to config options.
192 181
193 """ 182 """
194 for ext in extensions: 183 for ext in extensions:
195 if isinstance(ext, util.string_type): 184 if isinstance(ext, util.string_type):
196 ext = self.build_extension(ext, configs.get(ext, [])) 185 ext = self.build_extension(ext, configs.get(ext, {}))
197 if isinstance(ext, Extension): 186 if isinstance(ext, Extension):
198 ext.extendMarkdown(self, globals()) 187 ext.extendMarkdown(self, globals())
188 logger.debug(
189 'Successfully loaded extension "%s.%s".'
190 % (ext.__class__.__module__, ext.__class__.__name__)
191 )
199 elif ext is not None: 192 elif ext is not None:
200 raise TypeError( 193 raise TypeError(
201 'Extension "%s.%s" must be of type: "markdown.Extension"' 194 'Extension "%s.%s" must be of type: "markdown.Extension"'
202 % (ext.__class__.__module__, ext.__class__.__name__)) 195 % (ext.__class__.__module__, ext.__class__.__name__))
203 196
204 return self 197 return self
205 198
206 def build_extension(self, ext_name, configs = []): 199 def build_extension(self, ext_name, configs):
207 """Build extension by name, then return the module. 200 """Build extension by name, then return the module.
208 201
209 The extension name may contain arguments as part of the string in the 202 The extension name may contain arguments as part of the string in the
210 following format: "extname(key1=value1,key2=value2)" 203 following format: "extname(key1=value1,key2=value2)"
211 204
212 """ 205 """
213 206
207 configs = dict(configs)
208
214 # Parse extensions config params (ignore the order) 209 # Parse extensions config params (ignore the order)
215 configs = dict(configs) 210 pos = ext_name.find("(") # find the first "("
216 pos = ext_name.find("(") # find the first "("
217 if pos > 0: 211 if pos > 0:
218 ext_args = ext_name[pos+1:-1] 212 ext_args = ext_name[pos+1:-1]
219 ext_name = ext_name[:pos] 213 ext_name = ext_name[:pos]
220 pairs = [x.split("=") for x in ext_args.split(",")] 214 pairs = [x.split("=") for x in ext_args.split(",")]
221 configs.update([(x.strip(), y.strip()) for (x, y) in pairs]) 215 configs.update([(x.strip(), y.strip()) for (x, y) in pairs])
216 warnings.warn('Setting configs in the Named Extension string is '
217 'deprecated. It is recommended that you '
218 'pass an instance of the extension class to '
219 'Markdown or use the "extension_configs" keyword. '
220 'The current behavior will raise an error in version 2.7. '
221 'See the Release Notes for Python-Markdown version '
222 '2.6 for more info.', DeprecationWarning)
222 223
223 # Setup the module name 224 # Get class name (if provided): `path.to.module:ClassName`
224 module_name = ext_name 225 ext_name, class_name = ext_name.split(':', 1) \
225 if '.' not in ext_name: 226 if ':' in ext_name else (ext_name, '')
226 module_name = '.'.join(['third_party.markdown.extensions', ext_name])
227 227
228 # Try loading the extension first from one place, then another 228 # Try loading the extension first from one place, then another
229 try: # New style (markdown.extensons.<extension>) 229 try:
230 module = __import__(module_name, {}, {}, [module_name.rpartition('.')[0]]) 230 # Assume string uses dot syntax (`path.to.some.module`)
231 module = importlib.import_module(ext_name)
232 logger.debug(
233 'Successfully imported extension module "%s".' % ext_name
234 )
235 # For backward compat (until deprecation)
236 # check that this is an extension.
237 if ('.' not in ext_name and not (hasattr(module, 'makeExtension') or
238 (class_name and hasattr(module, class_name)))):
239 # We have a name conflict
240 # eg: extensions=['tables'] and PyTables is installed
241 raise ImportError
231 except ImportError: 242 except ImportError:
232 module_name_old_style = '_'.join(['mdx', ext_name]) 243 # Prepend `markdown.extensions.` to name
233 try: # Old style (mdx_<extension>) 244 module_name = '.'.join(['markdown.extensions', ext_name])
234 module = __import__(module_name_old_style) 245 try:
235 except ImportError as e: 246 module = importlib.import_module(module_name)
236 message = "Failed loading extension '%s' from '%s' or '%s'" \ 247 logger.debug(
237 % (ext_name, module_name, module_name_old_style) 248 'Successfully imported extension module "%s".' %
249 module_name
250 )
251 warnings.warn('Using short names for Markdown\'s builtin '
252 'extensions is deprecated. Use the '
253 'full path to the extension with Python\'s dot '
254 'notation (eg: "%s" instead of "%s"). The '
255 'current behavior will raise an error in version '
256 '2.7. See the Release Notes for '
257 'Python-Markdown version 2.6 for more info.' %
258 (module_name, ext_name),
259 DeprecationWarning)
260 except ImportError:
261 # Prepend `mdx_` to name
262 module_name_old_style = '_'.join(['mdx', ext_name])
263 try:
264 module = importlib.import_module(module_name_old_style)
265 logger.debug(
266 'Successfully imported extension module "%s".' %
267 module_name_old_style)
268 warnings.warn('Markdown\'s behavior of prepending "mdx_" '
269 'to an extension name is deprecated. '
270 'Use the full path to the '
271 'extension with Python\'s dot notation '
272 '(eg: "%s" instead of "%s"). The current '
273 'behavior will raise an error in version 2.7. '
274 'See the Release Notes for Python-Markdown '
275 'version 2.6 for more info.' %
276 (module_name_old_style, ext_name),
277 DeprecationWarning)
278 except ImportError as e:
279 message = "Failed loading extension '%s' from '%s', '%s' " \
280 "or '%s'" % (ext_name, ext_name, module_name,
281 module_name_old_style)
282 e.args = (message,) + e.args[1:]
283 raise
284
285 if class_name:
286 # Load given class name from module.
287 return getattr(module, class_name)(**configs)
288 else:
289 # Expect makeExtension() function to return a class.
290 try:
291 return module.makeExtension(**configs)
292 except AttributeError as e:
293 message = e.args[0]
294 message = "Failed to initiate extension " \
295 "'%s': %s" % (ext_name, message)
238 e.args = (message,) + e.args[1:] 296 e.args = (message,) + e.args[1:]
239 raise 297 raise
240 298
241 # If the module is loaded successfully, we expect it to define a
242 # function called makeExtension()
243 try:
244 return module.makeExtension(configs.items())
245 except AttributeError as e:
246 message = e.args[0]
247 message = "Failed to initiate extension " \
248 "'%s': %s" % (ext_name, message)
249 e.args = (message,) + e.args[1:]
250 raise
251
252 def registerExtension(self, extension): 299 def registerExtension(self, extension):
253 """ This gets called by the extension """ 300 """ This gets called by the extension """
254 self.registeredExtensions.append(extension) 301 self.registeredExtensions.append(extension)
255 return self 302 return self
256 303
257 def reset(self): 304 def reset(self):
258 """ 305 """
259 Resets all state variables so that we can start with a new text. 306 Resets all state variables so that we can start with a new text.
260 """ 307 """
261 self.htmlStash.reset() 308 self.htmlStash.reset()
262 self.references.clear() 309 self.references.clear()
263 310
264 for extension in self.registeredExtensions: 311 for extension in self.registeredExtensions:
265 if hasattr(extension, 'reset'): 312 if hasattr(extension, 'reset'):
266 extension.reset() 313 extension.reset()
267 314
268 return self 315 return self
269 316
270 def set_output_format(self, format): 317 def set_output_format(self, format):
271 """ Set the output format for the class instance. """ 318 """ Set the output format for the class instance. """
272 self.output_format = format.lower() 319 self.output_format = format.lower()
273 try: 320 try:
274 self.serializer = self.output_formats[self.output_format] 321 self.serializer = self.output_formats[self.output_format]
275 except KeyError as e: 322 except KeyError as e:
276 valid_formats = list(self.output_formats.keys()) 323 valid_formats = list(self.output_formats.keys())
277 valid_formats.sort() 324 valid_formats.sort()
278 message = 'Invalid Output Format: "%s". Use one of %s.' \ 325 message = 'Invalid Output Format: "%s". Use one of %s.' \
279 % (self.output_format, 326 % (self.output_format,
280 '"' + '", "'.join(valid_formats) + '"') 327 '"' + '", "'.join(valid_formats) + '"')
281 e.args = (message,) + e.args[1:] 328 e.args = (message,) + e.args[1:]
282 raise 329 raise
283 return self 330 return self
284 331
285 def convert(self, source): 332 def convert(self, source):
286 """ 333 """
287 Convert markdown to serialized XHTML or HTML. 334 Convert markdown to serialized XHTML or HTML.
288 335
289 Keyword arguments: 336 Keyword arguments:
290 337
(...skipping 28 matching lines...)
319 self.lines = source.split("\n") 366 self.lines = source.split("\n")
320 for prep in self.preprocessors.values(): 367 for prep in self.preprocessors.values():
321 self.lines = prep.run(self.lines) 368 self.lines = prep.run(self.lines)
322 369
323 # Parse the high-level elements. 370 # Parse the high-level elements.
324 root = self.parser.parseDocument(self.lines).getroot() 371 root = self.parser.parseDocument(self.lines).getroot()
325 372
326 # Run the tree-processors 373 # Run the tree-processors
327 for treeprocessor in self.treeprocessors.values(): 374 for treeprocessor in self.treeprocessors.values():
328 newRoot = treeprocessor.run(root) 375 newRoot = treeprocessor.run(root)
329 if newRoot: 376 if newRoot is not None:
330 root = newRoot 377 root = newRoot
331 378
332 # Serialize _properly_. Strip top-level tags. 379 # Serialize _properly_. Strip top-level tags.
333 output = self.serializer(root) 380 output = self.serializer(root)
334 if self.stripTopLevelTags: 381 if self.stripTopLevelTags:
335 try: 382 try:
336 start = output.index('<%s>'%self.doc_tag)+len(self.doc_tag)+2 383 start = output.index(
337 end = output.rindex('</%s>'%self.doc_tag) 384 '<%s>' % self.doc_tag) + len(self.doc_tag) + 2
385 end = output.rindex('</%s>' % self.doc_tag)
338 output = output[start:end].strip() 386 output = output[start:end].strip()
339 except ValueError: 387 except ValueError: # pragma: no cover
340 if output.strip().endswith('<%s />'%self.doc_tag): 388 if output.strip().endswith('<%s />' % self.doc_tag):
341 # We have an empty document 389 # We have an empty document
342 output = '' 390 output = ''
343 else: 391 else:
344 # We have a serious problem 392 # We have a serious problem
345 raise ValueError('Markdown failed to strip top-level tags. Document=%r' % output.strip()) 393 raise ValueError('Markdown failed to strip top-level '
394 'tags. Document=%r' % output.strip())
346 395
347 # Run the text post-processors 396 # Run the text post-processors
348 for pp in self.postprocessors.values(): 397 for pp in self.postprocessors.values():
349 output = pp.run(output) 398 output = pp.run(output)
350 399
351 return output.strip() 400 return output.strip()
352 401
353 def convertFile(self, input=None, output=None, encoding=None): 402 def convertFile(self, input=None, output=None, encoding=None):
354 """Converts a markdown file and returns the HTML as a unicode string. 403 """Converts a Markdown file and returns the HTML as a Unicode string.
355 404
356 Decodes the file using the provided encoding (defaults to utf-8), 405 Decodes the file using the provided encoding (defaults to utf-8),
357 passes the file content to markdown, and outputs the html to either 406 passes the file content to markdown, and outputs the html to either
358 the provided stream or the file with provided name, using the same 407 the provided stream or the file with provided name, using the same
359 encoding as the source file. The 'xmlcharrefreplace' error handler is 408 encoding as the source file. The 'xmlcharrefreplace' error handler is
360 used when encoding the output. 409 used when encoding the output.
361 410
362 **Note:** This is the only place that decoding and encoding of unicode 411 **Note:** This is the only place that decoding and encoding of Unicode
363 takes place in Python-Markdown. (All other code is unicode-in / 412 takes place in Python-Markdown. (All other code is Unicode-in /
364 unicode-out.) 413 Unicode-out.)
365 414
366 Keyword arguments: 415 Keyword arguments:
367 416
368 * input: File object or path. Reads from stdin if `None`. 417 * input: File object or path. Reads from stdin if `None`.
369 * output: File object or path. Writes to stdout if `None`. 418 * output: File object or path. Writes to stdout if `None`.
370 * encoding: Encoding of input and output files. Defaults to utf-8. 419 * encoding: Encoding of input and output files. Defaults to utf-8.
371 420
372 """ 421 """
373 422
374 encoding = encoding or "utf-8" 423 encoding = encoding or "utf-8"
375 424
376 # Read the source 425 # Read the source
377 if input: 426 if input:
378 if isinstance(input, util.string_type): 427 if isinstance(input, util.string_type):
379 input_file = codecs.open(input, mode="r", encoding=encoding) 428 input_file = codecs.open(input, mode="r", encoding=encoding)
380 else: 429 else:
381 input_file = codecs.getreader(encoding)(input) 430 input_file = codecs.getreader(encoding)(input)
382 text = input_file.read() 431 text = input_file.read()
383 input_file.close() 432 input_file.close()
384 else: 433 else:
385 text = sys.stdin.read() 434 text = sys.stdin.read()
386 if not isinstance(text, util.text_type): 435 if not isinstance(text, util.text_type):
387 text = text.decode(encoding) 436 text = text.decode(encoding)
388 437
389 text = text.lstrip('\ufeff') # remove the byte-order mark 438 text = text.lstrip('\ufeff') # remove the byte-order mark
390 439
391 # Convert 440 # Convert
392 html = self.convert(text) 441 html = self.convert(text)
393 442
394 # Write to file or stdout 443 # Write to file or stdout
395 if output: 444 if output:
396 if isinstance(output, util.string_type): 445 if isinstance(output, util.string_type):
397 output_file = codecs.open(output, "w", 446 output_file = codecs.open(output, "w",
398 encoding=encoding, 447 encoding=encoding,
399 errors="xmlcharrefreplace") 448 errors="xmlcharrefreplace")
400 output_file.write(html) 449 output_file.write(html)
401 output_file.close() 450 output_file.close()
402 else: 451 else:
403 writer = codecs.getwriter(encoding) 452 writer = codecs.getwriter(encoding)
404 output_file = writer(output, errors="xmlcharrefreplace") 453 output_file = writer(output, errors="xmlcharrefreplace")
405 output_file.write(html) 454 output_file.write(html)
406 # Don't close here. User may want to write more. 455 # Don't close here. User may want to write more.
407 else: 456 else:
408 # Encode manually and write bytes to stdout. 457 # Encode manually and write bytes to stdout.
409 html = html.encode(encoding, "xmlcharrefreplace") 458 html = html.encode(encoding, "xmlcharrefreplace")
410 try: 459 try:
411 # Write bytes directly to buffer (Python 3). 460 # Write bytes directly to buffer (Python 3).
412 sys.stdout.buffer.write(html) 461 sys.stdout.buffer.write(html)
413 except AttributeError: 462 except AttributeError:
414 # Probably Python 2, which works with bytes by default. 463 # Probably Python 2, which works with bytes by default.
415 sys.stdout.write(html) 464 sys.stdout.write(html)
416 465
417 return self 466 return self
418 467
419 468
420 """ 469 """
421 EXPORTED FUNCTIONS 470 EXPORTED FUNCTIONS
422 ============================================================================= 471 =============================================================================
423 472
424 Those are the two functions we really mean to export: markdown() and 473 Those are the two functions we really mean to export: markdown() and
425 markdownFromFile(). 474 markdownFromFile().
426 """ 475 """
427 476
477
428 def markdown(text, *args, **kwargs): 478 def markdown(text, *args, **kwargs):
429 """Convert a markdown string to HTML and return HTML as a unicode string. 479 """Convert a Markdown string to HTML and return HTML as a Unicode string.
430 480
431 This is a shortcut function for `Markdown` class to cover the most 481 This is a shortcut function for `Markdown` class to cover the most
432 basic use case. It initializes an instance of Markdown, loads the 482 basic use case. It initializes an instance of Markdown, loads the
433 necessary extensions and runs the parser on the given text. 483 necessary extensions and runs the parser on the given text.
434 484
435 Keyword arguments: 485 Keyword arguments:
436 486
437 * text: Markdown formatted text as Unicode or ASCII string. 487 * text: Markdown formatted text as Unicode or ASCII string.
438 * Any arguments accepted by the Markdown class. 488 * Any arguments accepted by the Markdown class.
439 489
(...skipping 20 matching lines...)
460 """ 510 """
461 # For backward compatibility loop through positional args 511 # For backward compatibility loop through positional args
462 pos = ['input', 'output', 'extensions', 'encoding'] 512 pos = ['input', 'output', 'extensions', 'encoding']
463 c = 0 513 c = 0
464 for arg in args: 514 for arg in args:
465 if pos[c] not in kwargs: 515 if pos[c] not in kwargs:
466 kwargs[pos[c]] = arg 516 kwargs[pos[c]] = arg
467 c += 1 517 c += 1
468 if c == len(pos): 518 if c == len(pos):
469 break 519 break
520 if len(args):
521 warnings.warn('Positional arguments are deprecated in '
522 'Markdown and will raise an error in version 2.7. '
523 'Use keyword arguments only.',
524 DeprecationWarning)
470 525
471 md = Markdown(**kwargs) 526 md = Markdown(**kwargs)
472 md.convertFile(kwargs.get('input', None), 527 md.convertFile(kwargs.get('input', None),
473 kwargs.get('output', None), 528 kwargs.get('output', None),
474 kwargs.get('encoding', None)) 529 kwargs.get('encoding', None))
475
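
The largest behavioral change in this patch is how `build_extension` interprets an extension name string. The sketch below restates that parsing outside the class so it can be read in one piece: it handles the deprecated `extname(key1=value1,key2=value2)` form, the new `path.to.module:ClassName` form, and lists the module paths the new code tries in order. The helper names `split_extension_name` and `candidate_modules` are hypothetical and are not part of Python-Markdown. The sketch does not import anything or emit the deprecation warnings.

```python
def split_extension_name(ext_name, configs=None):
    """Split an extension string into (module_path, class_name, configs).

    Hypothetical helper mirroring the parsing steps in the new
    build_extension; it performs no imports and emits no warnings.
    """
    configs = dict(configs or {})

    # Deprecated form: "extname(key1=value1,key2=value2)"
    pos = ext_name.find("(")
    if pos > 0:
        ext_args = ext_name[pos + 1:-1]
        ext_name = ext_name[:pos]
        pairs = [x.split("=") for x in ext_args.split(",")]
        configs.update((k.strip(), v.strip()) for k, v in pairs)

    # Optional class name: "path.to.module:ClassName"
    if ":" in ext_name:
        ext_name, class_name = ext_name.split(":", 1)
    else:
        class_name = ""

    return ext_name, class_name, configs


def candidate_modules(ext_name):
    """Module paths in the order the new code attempts to import them."""
    return [
        ext_name,                          # dotted path given as-is
        "markdown.extensions." + ext_name,  # deprecated short builtin name
        "mdx_" + ext_name,                 # deprecated "mdx_" prefix
    ]
```

Values parsed from the parenthesized form stay strings (for example `"2"`, not `2`), which matches the `configs.update(...)` call in the diff. The patch's docstring for `extension_configs` and its new warning both point callers toward passing a config dictionary or an extension instance instead.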