|
|
Created:
3 years, 6 months ago by karandeepb Modified:
3 years, 5 months ago CC:
chromium-reviews, subresource-filter-reviews_chromium.org Target Ref:
refs/heads/master Project:
chromium Visibility:
Public. |
DescriptionUrlPatternIndex: Introduce ElementType and ActivationType enums to url_pattern_index schema.
Currently, the UrlPatternIndex flatbuffer structure stores the element types and
activation types as bitmasks of the corresponding proto types. This CL
introduces ElementType and ActivationType enums to the UrlPatternIndex
flatbuffer schema, hence removing the dependence on the corresponding proto
types. This should allow the url_pattern_index clients to work without a proto
dependency.
BUG=713774
Patch Set 1 #Patch Set 2 : -- #Patch Set 3 : Format #
Total comments: 12
Patch Set 4 : Address review. #
Messages
Total messages: 20 (11 generated)
The CQ bit was checked by karandeepb@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_arm64_dbg_recipe on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_arm6...)
Description was changed from ========== Remove proto dependency. ========== to ========== UrlPatternIndex: Introduce ElementType and ActivationType enums to url_pattern_index schema. Currently, the UrlPatternIndex flatbuffer structure stores the element types and activation types as bitmasks of the corresponding proto types. This CL introduces ElementType and ActivationType enums to the UrlPatternIndex flatbuffer schema, hence removing the dependence on the corresponding proto types. BUG=713774 ==========
The CQ bit was checked by karandeepb@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== UrlPatternIndex: Introduce ElementType and ActivationType enums to url_pattern_index schema. Currently, the UrlPatternIndex flatbuffer structure stores the element types and activation types as bitmasks of the corresponding proto types. This CL introduces ElementType and ActivationType enums to the UrlPatternIndex flatbuffer schema, hence removing the dependence on the corresponding proto types. BUG=713774 ========== to ========== UrlPatternIndex: Introduce ElementType and ActivationType enums to url_pattern_index schema. Currently, the UrlPatternIndex flatbuffer structure stores the element types and activation types as bitmasks of the corresponding proto types. This CL introduces ElementType and ActivationType enums to the UrlPatternIndex flatbuffer schema, hence removing the dependence on the corresponding proto types. This should allow the url_pattern_index clients to work without a proto dependency. BUG=713774 ==========
karandeepb@chromium.org changed reviewers: + pkalinnikov@chromium.org
PTAL Pavel.
Ping.
I can't comment inline on the .fbs file, it is displayed as binary. I took a look at it manually, here are some comments: 1. Lines 35-36: Please align the comments (just like clang-format does for enums). 2. Line 79. Drop "Should not be an empty array.", this is not true. Both arrays can be empty. Could you please either mark this file as non-binary (not sure whether it's possible), or try putting this CL to gerrit? https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... File components/url_pattern_index/url_pattern_index.cc (right): https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:31: const std::map<proto::ActivationType, flat::ActivationType> kActivationTypeMap = For performance reasons (queries should be fast), can you please replace these maps by one of the following? - Just switch-case clauses in the below functions, remove #includes. Iterating through all proto::ElementTypes is doable by looping an int from OTHER to MAX. The DCHECK_IS_ON checks would become redundant because ProtoToFlat*Type called from these loops would do the thing. - I bet "base/containers/flat_map.h" should be more compact and fast than std::map for such a use-case. Once the proto-specific code is detached and moved back to subresource_filter, this mapping will no longer be needed in queries, since flat::ElementType can be used directly. A list of pairs will still be needed to convert from proto wire format to FlatBuffers. Alternative idea. Could url_pattern_index be agnostic to the types? It could assume that they are a bitmask (as currently), without knowing which bits correspond to which types. In this case there would be no need in conversion. Subresource filter and DNR could use independent typing systems. E.g., DNR could use WebRequestResourceType. Additional benefit is the ability to get rid of the notion of element type and activation type in UrlPatternIndex, and simply speak about "types". https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:226: uint16_t mask = 0; nit: Can you make it int? Same below for ActivationType. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:242: if (element_types_ & flat::ElementType_OBJECT_SUBREQUEST) Should we consider merging the 2 types together? Subresource Filter never uses OBJECT_SUBREQUEST for queries (see ToElementType in "web_document_subresource_filter_impl.cc"), i.e., "the other way round" never happens, thus the 2 types are effectively equivalent. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... File components/url_pattern_index/url_pattern_index.h (right): https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.h:119: const flat::UrlRule* FindMatch(const GURL& url, Do you mean this method and proto-specific code to be removed from url_pattern_index in a follow-up?
pkalinnikov@chromium.org changed reviewers: + csharrison@chromium.org
+Charlie
It generally looks good, I'll do a final pass once Pavel's comments are addressed and the binary fbs is fixed.
Patchset #4 (id:60001) has been deleted
1. Lines 35-36: Please align the comments (just like clang-format does for enums). Done 2. Line 79. Drop "Should not be an empty array.", this is not true. Both arrays can be empty. This is enforced by the DCHECK(is_generic || rule.domains_included()->size()); here - https://cs.chromium.org/chromium/src/components/url_pattern_index/url_pattern.... I think this makes sense since the default value for a flatbuffer vector is null (I think) and using an empty array would lead to more space. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... File components/url_pattern_index/url_pattern_index.cc (right): https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:31: const std::map<proto::ActivationType, flat::ActivationType> kActivationTypeMap = On 2017/06/28 15:59:43, pkalinnikov wrote: > For performance reasons (queries should be fast), can you please replace these > maps by one of the following? > - Just switch-case clauses in the below functions, remove #includes. Iterating > through all proto::ElementTypes is doable by looping an int from OTHER to MAX. > The DCHECK_IS_ON checks would become redundant because ProtoToFlat*Type called > from these loops would do the thing. > - I bet "base/containers/flat_map.h" should be more compact and fast than > std::map for such a use-case. I preferred to use a flat map, since I think it leads to cleaner code. > Once the proto-specific code is detached and moved back to subresource_filter, > this mapping will no longer be needed in queries, since flat::ElementType can be > used directly. A list of pairs will still be needed to convert from proto wire > format to FlatBuffers. Correct. > Alternative idea. Could url_pattern_index be agnostic to the types? It could > assume that they are a bitmask (as currently), without knowing which bits > correspond to which types. In this case there would be no need in conversion. > Subresource filter and DNR could use independent typing systems. E.g., DNR could > use WebRequestResourceType. Additional benefit is the ability to get rid of the > notion of element type and activation type in UrlPatternIndex, and simply speak > about "types". I prefer the current form. url_pattern_index is meant to match Url rules, and hence it isn't incorrect for it to know about resource types. Also, the same argument can be used for OptionFlag. For url_pattern_index to support a generic bitmask w/o knowing anything about it, seems weird to me. For e.g., how do we know how many bits to support etc. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:226: uint16_t mask = 0; On 2017/06/28 15:59:43, pkalinnikov wrote: > nit: Can you make it int? Same below for ActivationType. Isn't this better being a more concrete type? |element_types_| also uses uint16_t. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:242: if (element_types_ & flat::ElementType_OBJECT_SUBREQUEST) On 2017/06/28 15:59:42, pkalinnikov wrote: > Should we consider merging the 2 types together? Subresource Filter never uses > OBJECT_SUBREQUEST for queries (see ToElementType in > "web_document_subresource_filter_impl.cc"), i.e., "the other way round" never > happens, thus the 2 types are effectively equivalent. Yeah I thought of that as well. We can add DCHECKs to FindMatch to ensure it is not queried with POPUP and OBJECT_SUBREQUEST types. However since FindMatch takes in a proto::ElementType currently, I think it is better to support all types for now. Once, proto code is moved back to subresource filter and we only support flat:: types, we can remove OBJECT_SUBREQUEST. Added a TODO to that effect. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... File components/url_pattern_index/url_pattern_index.h (right): https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.h:119: const flat::UrlRule* FindMatch(const GURL& url, On 2017/06/28 15:59:43, pkalinnikov wrote: > Do you mean this method and proto-specific code to be removed from > url_pattern_index in a follow-up? Eventually yeah this should be removed, however I am currently not working on a followup to this. This CL is meant to allow url_pattern_index clients to not have a proto dependency. I'll try and take it on once I get the time, or maybe you can work on it?
I can't seem to find a fix for the fbs detected as binary issue. Even one of the difftools I use locally, detects it as binary. So this is probably not just limited to reitveld and should be fixed. Will upload this to gerrit for now.
On 2017/06/29 00:58:18, karandeepb wrote: > 2. Line 79. Drop "Should not be an empty array.", this is not true. Both arrays > can be empty. > This is enforced by the DCHECK(is_generic || rule.domains_included()->size()); > here - > https://cs.chromium.org/chromium/src/components/url_pattern_index/url_pattern.... > I think this makes sense since the default value for a flatbuffer vector is null > (I think) and using an empty array would lead to more space. Makes sense, but see my comment in gerrit.
https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... File components/url_pattern_index/url_pattern_index.cc (right): https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:31: const std::map<proto::ActivationType, flat::ActivationType> kActivationTypeMap = On 2017/06/29 00:58:18, karandeepb wrote: > On 2017/06/28 15:59:43, pkalinnikov wrote: > > For performance reasons (queries should be fast), can you please replace these > > maps by one of the following? > > - Just switch-case clauses in the below functions, remove #includes. > Iterating > > through all proto::ElementTypes is doable by looping an int from OTHER to MAX. > > The DCHECK_IS_ON checks would become redundant because ProtoToFlat*Type called > > from these loops would do the thing. > > - I bet "base/containers/flat_map.h" should be more compact and fast than > > std::map for such a use-case. > I preferred to use a flat map, since I think it leads to cleaner code. > > > > Once the proto-specific code is detached and moved back to subresource_filter, > > this mapping will no longer be needed in queries, since flat::ElementType can > be > > used directly. A list of pairs will still be needed to convert from proto wire > > format to FlatBuffers. > Correct. > > > Alternative idea. Could url_pattern_index be agnostic to the types? It could > > assume that they are a bitmask (as currently), without knowing which bits > > correspond to which types. In this case there would be no need in conversion. > > Subresource filter and DNR could use independent typing systems. E.g., DNR > could > > use WebRequestResourceType. Additional benefit is the ability to get rid of > the > > notion of element type and activation type in UrlPatternIndex, and simply > speak > > about "types". > I prefer the current form. url_pattern_index is meant to match Url rules, and > hence it isn't incorrect for it to know about resource types. Also, the same > argument can be used for OptionFlag. For url_pattern_index to support a generic > bitmask w/o knowing anything about it, seems weird to me. For e.g., how do we > know how many bits to support etc. Acknowledged. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:226: uint16_t mask = 0; On 2017/06/29 00:58:18, karandeepb wrote: > On 2017/06/28 15:59:43, pkalinnikov wrote: > > nit: Can you make it int? Same below for ActivationType. > > Isn't this better being a more concrete type? |element_types_| also uses > uint16_t. I agree that *current* proto::ElementType values fit into uint16_t. However, the type of |pair.first| below, which is an enum, is only <= int (not specified further by the standard). The static_assert above doesn't check proto::ElementType boundary, so you can either add another static_cast for that, or (which is shorter) just use int in this line. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.cc:242: if (element_types_ & flat::ElementType_OBJECT_SUBREQUEST) On 2017/06/29 00:58:18, karandeepb wrote: > On 2017/06/28 15:59:42, pkalinnikov wrote: > > Should we consider merging the 2 types together? Subresource Filter never uses > > OBJECT_SUBREQUEST for queries (see ToElementType in > > "web_document_subresource_filter_impl.cc"), i.e., "the other way round" never > > happens, thus the 2 types are effectively equivalent. > > Yeah I thought of that as well. We can add DCHECKs to FindMatch to ensure it is > not queried with POPUP and OBJECT_SUBREQUEST types. > However since FindMatch takes in a proto::ElementType currently, I think it is > better to support all types for now. Once, proto code is moved back to > subresource filter and we only support flat:: types, we can remove > OBJECT_SUBREQUEST. Added a TODO to that effect. Acknowledged. https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... File components/url_pattern_index/url_pattern_index.h (right): https://codereview.chromium.org/2954613002/diff/40001/components/url_pattern_... components/url_pattern_index/url_pattern_index.h:119: const flat::UrlRule* FindMatch(const GURL& url, On 2017/06/29 00:58:18, karandeepb wrote: > On 2017/06/28 15:59:43, pkalinnikov wrote: > > Do you mean this method and proto-specific code to be removed from > > url_pattern_index in a follow-up? > > Eventually yeah this should be removed, however I am currently not working on a > followup to this. This CL is meant to allow url_pattern_index clients to not > have a proto dependency. > > I'll try and take it on once I get the time, or maybe you can work on it? I can do that, no worries. |