|
|
Created:
5 years, 6 months ago by Tiger (Sony Mobile) Modified:
5 years, 5 months ago CC:
blink-reviews, blink-reviews-dom_chromium.org, dglazkov+blink, sof, eae+blinkwatch, rwlbuis Base URL:
https://chromium.googlesource.com/chromium/blink.git@master Target Ref:
refs/heads/master Project:
blink Visibility:
Public. |
DescriptionAdd link rewrite functionality to page serializer
To be able to take over the work of WebPageSerializerImpl we need to be
able to rewrite links to match a given structure.
--
This review is part of set aiming to merge the two existing page
serializers, WebPageSerializerImpl and PageSerializer into the latter
one.
In addition to this it moves all the old tests from
WebPageNewSerializerTest and WebPageSerializerTest to the newer
PageSerializerTest structure and splits out a test for MHTML into the
MHTMLTest file.
Note: This is a rebase and split of
https://codereview.chromium.org/68613003 from Opera which was reverted
due to failing tests. In addition to the rebase some smaller fixes was
done to accommodate to changes in the code from when it was first
landed.
BUG=328354
R=hajimehoshi@chromium.org
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=198143
Patch Set 1 #
Total comments: 14
Patch Set 2 : Fix issues #
Total comments: 18
Patch Set 3 : Moar issues #
Total comments: 8
Patch Set 4 : More issues #
Total comments: 8
Patch Set 5 : Rebase + issues #
Total comments: 35
Patch Set 6 : Even more issues #Patch Set 7 : Forgot about .h #Patch Set 8 : Undo shouldIgnoreElement and remove isJavascriptURLAttribute #Patch Set 9 : Add TODO for future coding #
Total comments: 1
Patch Set 10 : Adjust TODO comment style #
Messages
Total messages: 27 (4 generated)
PTAL
https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:116: RawPtrWillBeMember<const Document> m_document; Create a getter and make member variables private. https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:228: // FIXME: We could move the uncommenting to appendOpenTag and appendCloseTag, or just remove it Use TODO(yourname) https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:231: result.appendLiteral("<!--"); Can you avoid appending <base> element instead of commenting out in this case? https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:235: result.append(String::format("\n<!-- saved from url=(%04d)%s -->\n", Use MarkupFormatter::appendComment https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:246: // FIXME: Refactor MarkupAccumulator so it is easier to append an element like this, without special cases for XHTML Use TODO(yourname) https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:251: result.append(m_document->baseTarget()); Use MarkupFormatter::appendAttributeValue https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:277: result.append(m_replaceLinks->get(completeURL)); Use MarkupFormatter::appendAttributeValue
https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:231: result.appendLiteral("<!--"); On 2015/06/11 01:55:33, hajimehoshi wrote: > Can you avoid appending <base> element instead of commenting out in this case? It was done like this in the old serializer, but yes - I don't actually think it is worth commenting out the old base tag. I'll add it to the shouldIgnoreElement function instead.
https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:116: RawPtrWillBeMember<const Document> m_document; On 2015/06/11 01:55:32, hajimehoshi wrote: > Create a getter and make member variables private. Done. https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:228: // FIXME: We could move the uncommenting to appendOpenTag and appendCloseTag, or just remove it On 2015/06/11 01:55:33, hajimehoshi wrote: > Use TODO(yourname) Done. https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:231: result.appendLiteral("<!--"); On 2015/06/11 12:23:29, Tiger (Sony Mobile) wrote: > On 2015/06/11 01:55:33, hajimehoshi wrote: > > Can you avoid appending <base> element instead of commenting out in this case? > > It was done like this in the old serializer, but yes - I don't actually think it > is worth commenting out the old base tag. I'll add it to the shouldIgnoreElement > function instead. Done. https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:246: // FIXME: Refactor MarkupAccumulator so it is easier to append an element like this, without special cases for XHTML On 2015/06/11 01:55:32, hajimehoshi wrote: > Use TODO(yourname) Done. https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:251: result.append(m_document->baseTarget()); On 2015/06/11 01:55:32, hajimehoshi wrote: > Use MarkupFormatter::appendAttributeValue Done. https://codereview.chromium.org/1177733002/diff/1/Source/core/page/PageSerial... Source/core/page/PageSerializer.cpp:277: result.append(m_replaceLinks->get(completeURL)); On 2015/06/11 01:55:32, hajimehoshi wrote: > Use MarkupFormatter::appendAttributeValue Done.
hajimehoshi@chromium.org changed reviewers: + yosin@chromium.org
+yosin (I'm ooo on Monday)
I think all issues here are resolved now.
https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:110: virtual bool shouldIgnoreElement(const Element&); Can we mark this function |const|? https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:211: PageSerializer* SerializerMarkupAccumulator::getPageSerializer() In Blink, we don't use "get" prefix. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:216: const Document& SerializerMarkupAccumulator::getDocument() In Blink, we don't use "get" prefix. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:221: class LinkChangeSerializerMarkupAccumulator : public SerializerMarkupAccumulator { |final|? https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:226: virtual void appendElement(StringBuilder&, Element&, Namespaces*) override; no need to have |virtual|. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:227: virtual void appendAttribute(StringBuilder&, const Element&, const Attribute&, Namespaces*) override; no need to have |virtual|. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:229: virtual bool shouldIgnoreElement(const Element&); Please use |override| instead of |virtual|. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... File Source/core/page/PageSerializer.h (right): https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:75: PageSerializer(Vector<SerializedResource>*, PassOwnPtr<Delegate>, LinkLocalPathMap* = 0, String directory = ""); nit: s/0/nullptr/ https://codereview.chromium.org/1177733002/diff/20001/Source/web/tests/PageSe... File Source/web/tests/PageSerializerTest.cpp (right): https://codereview.chromium.org/1177733002/diff/20001/Source/web/tests/PageSe... Source/web/tests/PageSerializerTest.cpp:122: m_rewriteURLs.isEmpty() ? 0: &m_rewriteURLs, m_rewriteFolder); s/0:/nullptr :/
https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:110: virtual bool shouldIgnoreElement(const Element&); On 2015/06/15 09:34:37, Yosi_UTC9 wrote: > Can we mark this function |const|? Done. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:211: PageSerializer* SerializerMarkupAccumulator::getPageSerializer() On 2015/06/15 09:34:37, Yosi_UTC9 wrote: > In Blink, we don't use "get" prefix. Done. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:216: const Document& SerializerMarkupAccumulator::getDocument() On 2015/06/15 09:34:37, Yosi_UTC9 wrote: > In Blink, we don't use "get" prefix. Done. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:221: class LinkChangeSerializerMarkupAccumulator : public SerializerMarkupAccumulator { On 2015/06/15 09:34:36, Yosi_UTC9 wrote: > |final|? Done. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:226: virtual void appendElement(StringBuilder&, Element&, Namespaces*) override; On 2015/06/15 09:34:36, Yosi_UTC9 wrote: > no need to have |virtual|. Done. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:227: virtual void appendAttribute(StringBuilder&, const Element&, const Attribute&, Namespaces*) override; On 2015/06/15 09:34:37, Yosi_UTC9 wrote: > no need to have |virtual|. Done. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:229: virtual bool shouldIgnoreElement(const Element&); On 2015/06/15 09:34:37, Yosi_UTC9 wrote: > Please use |override| instead of |virtual|. Done. https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... File Source/core/page/PageSerializer.h (right): https://codereview.chromium.org/1177733002/diff/20001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:75: PageSerializer(Vector<SerializedResource>*, PassOwnPtr<Delegate>, LinkLocalPathMap* = 0, String directory = ""); On 2015/06/15 09:34:37, Yosi_UTC9 wrote: > nit: s/0/nullptr/ Done. https://codereview.chromium.org/1177733002/diff/20001/Source/web/tests/PageSe... File Source/web/tests/PageSerializerTest.cpp (right): https://codereview.chromium.org/1177733002/diff/20001/Source/web/tests/PageSe... Source/web/tests/PageSerializerTest.cpp:122: m_rewriteURLs.isEmpty() ? 0: &m_rewriteURLs, m_rewriteFolder); On 2015/06/15 09:34:37, Yosi_UTC9 wrote: > s/0:/nullptr :/ Done.
https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:225: protected: Since this class is marked |final|, please move protected members to private. https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... File Source/core/page/PageSerializer.h (right): https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:75: PageSerializer(Vector<SerializedResource>*, PassOwnPtr<Delegate>, LinkLocalPathMap* = nullptr, String directory = ""); It seems that it is better to have member functions |registerRewriteURL()| and |setRewriteURLFolder()| in |PageSerializer| rather than passing them in ctor, to make |PageSerializer| API not to depend on |HashMap<K, V>|.
https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:264: result.append(document().baseTarget()); MarkupFormatter::appendAttributeValue https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:288: result.append('/'); MarkupFormatter::appendAttributeValue(result, "./" + m_directoryName + "/", document().isHTMLDocument());
https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:225: protected: On 2015/06/16 01:21:50, Yosi_UTC9 wrote: > Since this class is marked |final|, please move protected members to private. Done. https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:264: result.append(document().baseTarget()); On 2015/06/16 04:46:54, hajimehoshi wrote: > MarkupFormatter::appendAttributeValue Done. https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:288: result.append('/'); On 2015/06/16 04:46:55, hajimehoshi wrote: > MarkupFormatter::appendAttributeValue(result, "./" + m_directoryName + "/", > document().isHTMLDocument()); Done. https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... File Source/core/page/PageSerializer.h (right): https://codereview.chromium.org/1177733002/diff/40001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:75: PageSerializer(Vector<SerializedResource>*, PassOwnPtr<Delegate>, LinkLocalPathMap* = nullptr, String directory = ""); On 2015/06/16 01:21:50, Yosi_UTC9 wrote: > It seems that it is better to have member functions |registerRewriteURL()| and > |setRewriteURLFolder()| in |PageSerializer| rather than passing them in ctor, to > make |PageSerializer| API not to depend on |HashMap<K, V>|. Done.
lgtm
https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:223: LinkChangeSerializerMarkupAccumulator(PageSerializer*, const Document&, WillBeHeapVector<RawPtrWillBeMember<Node>>&, HashMap<String, String>&, String&); |const String&| for the last parameter? I think it is better to have name of parameter unless PRESUBMIT doesn't warn. https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:285: MarkupFormatter::appendAttributeValue(result, "./" + m_rewriteFolder + "/", document().isHTMLDocument()); Do we need to have "./"? https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... File Source/core/page/PageSerializer.h (right): https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:78: void registerRewriteURL(String, String); Could you use |const String&|? https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:79: void setRewriteURLFolder(String); Could you use |const String&|?
https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:223: LinkChangeSerializerMarkupAccumulator(PageSerializer*, const Document&, WillBeHeapVector<RawPtrWillBeMember<Node>>&, HashMap<String, String>&, String&); On 2015/06/18 06:45:46, Yosi_UTC9 wrote: > |const String&| for the last parameter? > I think it is better to have name of parameter unless PRESUBMIT doesn't warn. Done. https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:285: MarkupFormatter::appendAttributeValue(result, "./" + m_rewriteFolder + "/", document().isHTMLDocument()); On 2015/06/18 06:45:46, Yosi_UTC9 wrote: > Do we need to have "./"? I don't think it makes a difference no. This is how it looks in the old WebPageSerializerImpl code. Removing. https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... File Source/core/page/PageSerializer.h (right): https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:78: void registerRewriteURL(String, String); On 2015/06/18 06:45:46, Yosi_UTC9 wrote: > Could you use |const String&|? Done. https://codereview.chromium.org/1177733002/diff/60001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:79: void setRewriteURLFolder(String); On 2015/06/18 06:45:46, Yosi_UTC9 wrote: > Could you use |const String&|? Done.
lgtm
philipj@opera.com changed reviewers: + philipj@opera.com
https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:112: PageSerializer* pageSerializer(); I can't see this actually used in the patch... me blind? https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:221: class LinkChangeSerializerMarkupAccumulator final : public SerializerMarkupAccumulator { Should this be STACK_ALLOCATED()? https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:231: // m_rewriteURLs include all pair of local resource path and corresponding original link. Plural pairs, paths and links. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:236: LinkChangeSerializerMarkupAccumulator::LinkChangeSerializerMarkupAccumulator(PageSerializer* serializer, const Document& document, WillBeHeapVector<RawPtrWillBeMember<Node>>& nodes, HashMap<String, String>& rewriteURLs, const String& rewriteFolder) Double space on this line. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:247: // See http://msdn2.microsoft.com/en-us/library/ms537628(VS.85).aspx. This format is wonderfully strange :) https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:250: static_cast<int>(document().url().string().utf8().length()), Can you use %u or whatever the type is to avoid the cast? https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:259: // Append a new base tag declaration. I'm not sure I understand this. Is the main thing to change the URL in the href attribute, or to do something with the target attribute? It does look like if there are other attributes on the base element, those will simply be lost by this step. Probably doesn't matter, but seems odd since overall it seems like all attributes are serialized, whether they do something useful or not. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:275: if (!m_rewriteURLs.isEmpty() && element.isURLAttribute(attribute) && !element.isJavaScriptURLAttribute(attribute)) { Does the javascript: case matter here? There's no test for it, and it doesn't look like this was copied from some existing code. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:280: // TODO(tiger): Refactor MarkupAccumulator so it is easier to append an attribute like this. Do you mean to refactor this so that one can rewrite attributes values without also having to duplicate the steps for outputting the attribute name, = and ""? That would be very nice indeed. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:284: if (!m_rewriteFolder.isEmpty()) { Don't need the {} here. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:297: return SerializerMarkupAccumulator::shouldIgnoreElement(element) || isHTMLBaseElement(element); Instead of overriding this, can't LinkChangeSerializerMarkupAccumulator::appendElement simply not call SerializerMarkupAccumulator::appendElement for base elements since it does its own thing instead? https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:348: WTF::TextEncoding textEncoding(document.charset()); document.encoding() is already a WTF::TextEncoding. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... File Source/core/page/PageSerializer.h (right): https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:78: void registerRewriteURL(const String&, const String&); It would be nice with argument names here, it's pretty opaque otherwise.
https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:112: PageSerializer* pageSerializer(); On 2015/06/26 09:07:55, philipj wrote: > I can't see this actually used in the patch... me blind? And now, neither can I. Removing. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:221: class LinkChangeSerializerMarkupAccumulator final : public SerializerMarkupAccumulator { On 2015/06/26 09:07:55, philipj wrote: > Should this be STACK_ALLOCATED()? Probably, yes. Adding. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:231: // m_rewriteURLs include all pair of local resource path and corresponding original link. On 2015/06/26 09:07:56, philipj wrote: > Plural pairs, paths and links. Done. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:236: LinkChangeSerializerMarkupAccumulator::LinkChangeSerializerMarkupAccumulator(PageSerializer* serializer, const Document& document, WillBeHeapVector<RawPtrWillBeMember<Node>>& nodes, HashMap<String, String>& rewriteURLs, const String& rewriteFolder) On 2015/06/26 09:07:55, philipj wrote: > Double space on this line. Done. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:247: // See http://msdn2.microsoft.com/en-us/library/ms537628(VS.85).aspx. On 2015/06/26 09:07:56, philipj wrote: > This format is wonderfully strange :) It is :) https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:250: static_cast<int>(document().url().string().utf8().length()), On 2015/06/26 09:07:56, philipj wrote: > Can you use %u or whatever the type is to avoid the cast? Is there a format like PRIuS from format_macros.h in Chromium for Blink? The type here is a size_t which is a bit tricky but we probably won't see any large numbers here. The format here is a bit ambiguous as well, should the length be bytes or characters? This is how it was in the old serializer so perhaps we should stick with it for now, but I'd have guessed characters. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:259: // Append a new base tag declaration. On 2015/06/26 09:07:56, philipj wrote: > I'm not sure I understand this. Is the main thing to change the URL in the href > attribute, or to do something with the target attribute? It does look like if > there are other attributes on the base element, those will simply be lost by > this step. Probably doesn't matter, but seems odd since overall it seems like > all attributes are serialized, whether they do something useful or not. The main thing is to change the URL. This could be done by using shouldIgnoreAttribute and shouldIgnoreAttribute (need to add element/namespace paramters) instead. The latter would keep all other attributes but I don't think that is common. Not sure which way is better here. Thoughts? https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:275: if (!m_rewriteURLs.isEmpty() && element.isURLAttribute(attribute) && !element.isJavaScriptURLAttribute(attribute)) { It doesn't really matter no, it was just added to avoid the completeURL and map lookup for javascript in href attributes and such. Can be removed. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:280: // TODO(tiger): Refactor MarkupAccumulator so it is easier to append an attribute like this. On 2015/06/26 09:07:55, philipj wrote: > Do you mean to refactor this so that one can rewrite attributes values without > also having to duplicate the steps for outputting the attribute name, = and ""? > That would be very nice indeed. Exactly so. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:284: if (!m_rewriteFolder.isEmpty()) { On 2015/06/26 09:07:56, philipj wrote: > Don't need the {} here. Done. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:297: return SerializerMarkupAccumulator::shouldIgnoreElement(element) || isHTMLBaseElement(element); On 2015/06/26 09:07:56, philipj wrote: > Instead of overriding this, can't > LinkChangeSerializerMarkupAccumulator::appendElement simply not call > SerializerMarkupAccumulator::appendElement for base elements since it does its > own thing instead? True! Fixing. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:348: WTF::TextEncoding textEncoding(document.charset()); On 2015/06/26 09:07:55, philipj wrote: > document.encoding() is already a WTF::TextEncoding. Done. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... File Source/core/page/PageSerializer.h (right): https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.h:78: void registerRewriteURL(const String&, const String&); On 2015/06/26 09:07:56, philipj wrote: > It would be nice with argument names here, it's pretty opaque otherwise. Done.
https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:250: static_cast<int>(document().url().string().utf8().length()), On 2015/06/30 11:31:53, Tiger (Sony Mobile) wrote: > On 2015/06/26 09:07:56, philipj wrote: > > Can you use %u or whatever the type is to avoid the cast? > > Is there a format like PRIuS from format_macros.h in Chromium for Blink? The > type here is a size_t which is a bit tricky but we probably won't see any large > numbers here. > > The format here is a bit ambiguous as well, should the length be bytes or > characters? This is how it was in the old serializer so perhaps we should stick > with it for now, but I'd have guessed characters. I guess the assumption is that the URL will be 7-bit ASCII? Is it, are there tests for how URLs with non-ASCII in the source are serialized? https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:259: // Append a new base tag declaration. On 2015/06/30 11:31:53, Tiger (Sony Mobile) wrote: > On 2015/06/26 09:07:56, philipj wrote: > > I'm not sure I understand this. Is the main thing to change the URL in the > href > > attribute, or to do something with the target attribute? It does look like if > > there are other attributes on the base element, those will simply be lost by > > this step. Probably doesn't matter, but seems odd since overall it seems like > > all attributes are serialized, whether they do something useful or not. > > The main thing is to change the URL. This could be done by using > shouldIgnoreAttribute and shouldIgnoreAttribute (need to add element/namespace > paramters) instead. The latter would keep all other attributes but I don't think > that is common. Not sure which way is better here. Thoughts? The original code for this is WebPageSerializerImpl::postActionAfterSerializeEndTag I guess? That was really quite weird, commenting out the <base> tag and then generating a new one. It seems to me like a good overall structure for this URL rewriting would be a callback given the element, attribute name and attribute value which returns a new attribute value, or perhaps null to remove the attribute if that's ever needed. Would it be possible to implement both the URL rewriting of <base> and in the general context with that approach? Perhaps not here, but would it make sense? https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:275: if (!m_rewriteURLs.isEmpty() && element.isURLAttribute(attribute) && !element.isJavaScriptURLAttribute(attribute)) { On 2015/06/30 11:31:52, Tiger (Sony Mobile) wrote: > It doesn't really matter no, it was just added to avoid the completeURL and map > lookup for javascript in href attributes and such. Can be removed. Forgot to remove it, or do you mean to do it later? https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:297: return SerializerMarkupAccumulator::shouldIgnoreElement(element) || isHTMLBaseElement(element); On 2015/06/30 11:31:52, Tiger (Sony Mobile) wrote: > On 2015/06/26 09:07:56, philipj wrote: > > Instead of overriding this, can't > > LinkChangeSerializerMarkupAccumulator::appendElement simply not call > > SerializerMarkupAccumulator::appendElement for base elements since it does its > > own thing instead? > > True! Fixing. With that, I think you can also undo the other changes around shouldIgnoreElement
https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:250: static_cast<int>(document().url().string().utf8().length()), On 2015/06/30 11:56:27, philipj wrote: > On 2015/06/30 11:31:53, Tiger (Sony Mobile) wrote: > > On 2015/06/26 09:07:56, philipj wrote: > > > Can you use %u or whatever the type is to avoid the cast? > > > > Is there a format like PRIuS from format_macros.h in Chromium for Blink? The > > type here is a size_t which is a bit tricky but we probably won't see any > large > > numbers here. > > > > The format here is a bit ambiguous as well, should the length be bytes or > > characters? This is how it was in the old serializer so perhaps we should > stick > > with it for now, but I'd have guessed characters. > > I guess the assumption is that the URL will be 7-bit ASCII? Is it, are there > tests for how URLs with non-ASCII in the source are serialized? That sounds more reasonable yes, and it seems that the URL will always be in this format from the manual tests I did. No automatic tests for this. Worth adding? Ideally a test with IE to see if the page opens correctly and is in the right security zone would be nice but that can't be done. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:259: // Append a new base tag declaration. On 2015/06/30 11:56:27, philipj wrote: > On 2015/06/30 11:31:53, Tiger (Sony Mobile) wrote: > > On 2015/06/26 09:07:56, philipj wrote: > > > I'm not sure I understand this. Is the main thing to change the URL in the > > href > > > attribute, or to do something with the target attribute? It does look like > if > > > there are other attributes on the base element, those will simply be lost by > > > this step. Probably doesn't matter, but seems odd since overall it seems > like > > > all attributes are serialized, whether they do something useful or not. > > > > The main thing is to change the URL. This could be done by using > > shouldIgnoreAttribute and shouldIgnoreAttribute (need to add element/namespace > > paramters) instead. The latter would keep all other attributes but I don't > think > > that is common. Not sure which way is better here. Thoughts? > > The original code for this is > WebPageSerializerImpl::postActionAfterSerializeEndTag I guess? That was really > quite weird, commenting out the <base> tag and then generating a new one. > > It seems to me like a good overall structure for this URL rewriting would be a > callback given the element, attribute name and attribute value which returns a > new attribute value, or perhaps null to remove the attribute if that's ever > needed. Would it be possible to implement both the URL rewriting of <base> and > in the general context with that approach? Perhaps not here, but would it make > sense? I think it would make sense, and it would work for both those cases. In the future URL rewriting would need to be expanded to CSS as well to fix some existing bugs that exists in the old serializer (e.g. for @import URLs and similar) that still exists for 'Webpage, Complete' even after moving to the newer serializer. With a more general approach I think the two SerializerMarkupAccumulator classes here could be merged as well. I'm a bit hesitant to do this in this CL, but it is would be good to do in the future. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:275: if (!m_rewriteURLs.isEmpty() && element.isURLAttribute(attribute) && !element.isJavaScriptURLAttribute(attribute)) { On 2015/06/30 11:56:27, philipj wrote: > On 2015/06/30 11:31:52, Tiger (Sony Mobile) wrote: > > It doesn't really matter no, it was just added to avoid the completeURL and > map > > lookup for javascript in href attributes and such. Can be removed. > > Forgot to remove it, or do you mean to do it later? Removing now. https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:297: return SerializerMarkupAccumulator::shouldIgnoreElement(element) || isHTMLBaseElement(element); On 2015/06/30 11:56:27, philipj wrote: > On 2015/06/30 11:31:52, Tiger (Sony Mobile) wrote: > > On 2015/06/26 09:07:56, philipj wrote: > > > Instead of overriding this, can't > > > LinkChangeSerializerMarkupAccumulator::appendElement simply not call > > > SerializerMarkupAccumulator::appendElement for base elements since it does > its > > > own thing instead? > > > > True! Fixing. > > With that, I think you can also undo the other changes around > shouldIgnoreElement Done.
https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/80001/Source/core/page/PageSe... Source/core/page/PageSerializer.cpp:259: // Append a new base tag declaration. On 2015/06/30 12:59:02, Tiger (Sony Mobile) wrote: > On 2015/06/30 11:56:27, philipj wrote: > > On 2015/06/30 11:31:53, Tiger (Sony Mobile) wrote: > > > On 2015/06/26 09:07:56, philipj wrote: > > > > I'm not sure I understand this. Is the main thing to change the URL in the > > > href > > > > attribute, or to do something with the target attribute? It does look like > > if > > > > there are other attributes on the base element, those will simply be lost > by > > > > this step. Probably doesn't matter, but seems odd since overall it seems > > like > > > > all attributes are serialized, whether they do something useful or not. > > > > > > The main thing is to change the URL. This could be done by using > > > shouldIgnoreAttribute and shouldIgnoreAttribute (need to add > element/namespace > > > paramters) instead. The latter would keep all other attributes but I don't > > think > > > that is common. Not sure which way is better here. Thoughts? > > > > The original code for this is > > WebPageSerializerImpl::postActionAfterSerializeEndTag I guess? That was really > > quite weird, commenting out the <base> tag and then generating a new one. > > > > It seems to me like a good overall structure for this URL rewriting would be a > > callback given the element, attribute name and attribute value which returns a > > new attribute value, or perhaps null to remove the attribute if that's ever > > needed. Would it be possible to implement both the URL rewriting of <base> and > > in the general context with that approach? Perhaps not here, but would it make > > sense? > > I think it would make sense, and it would work for both those cases. In the > future URL rewriting would need to be expanded to CSS as well to fix some > existing bugs that exists in the old serializer (e.g. for @import URLs and > similar) that still exists for 'Webpage, Complete' even after moving to the > newer serializer. > > With a more general approach I think the two SerializerMarkupAccumulator classes > here could be merged as well. I'm a bit hesitant to do this in this CL, but it > is would be good to do in the future. I've added a small TODO section about this.
LGTM % consistency (sorry) https://codereview.chromium.org/1177733002/diff/160001/Source/core/page/PageS... File Source/core/page/PageSerializer.cpp (right): https://codereview.chromium.org/1177733002/diff/160001/Source/core/page/PageS... Source/core/page/PageSerializer.cpp:214: /* TODO(tiger): Right now there is no support for rewriting URLs inside CSS http://google-styleguide.googlecode.com/svn/trunk/cppguide.html#Comments wants consistency and the other comments in this file are // Thumbs up on the actual content though!
The CQ bit was checked by gustav.tiger@sonymobile.com
The patchset sent to the CQ was uploaded after l-g-t-m from hajimehoshi@chromium.org, yosin@chromium.org, philipj@opera.com Link to the patchset: https://codereview.chromium.org/1177733002/#ps180001 (title: "Adjust TODO comment style")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1177733002/180001
Message was sent while issue was closed.
Committed patchset #10 (id:180001) as https://src.chromium.org/viewvc/blink?view=rev&revision=198143 |