|
|
Created:
7 years, 7 months ago by jrummell Modified:
7 years, 6 months ago Reviewers:
jrummell, xhwang, acolwell GONE FROM CHROMIUM, commit-bot: I haz the power, Ilya Sherman, scherkus (not reviewing) CC:
chromium-reviews, feature-media-reviews_chromium.org, MAD, Ilya Sherman, jar (doing other things), Alexei Svitkine (slow), Mark P, SteveT, vadimb Base URL:
https://chromium.googlesource.com/chromium/src.git@master Visibility:
Public. |
DescriptionAdd UMA stats for audio/video containers.
BUG=238108
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=202275
Patch Set 1 #
Total comments: 46
Patch Set 2 : #
Total comments: 40
Patch Set 3 : #
Total comments: 67
Patch Set 4 : #Patch Set 5 : #
Total comments: 30
Patch Set 6 : #
Total comments: 75
Patch Set 7 : #
Total comments: 26
Patch Set 8 : #Patch Set 9 : #Patch Set 10 : #Patch Set 11 : #Patch Set 12 : #Patch Set 13 : #Patch Set 14 : #
Messages
Total messages: 35 (0 generated)
PTAL.
Thanks for the nice patch! Haven't looked at the detailed check_xxx functions. This is my initial comments. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:275: } indent https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:14: // FFmpeg code. Not all of these are enabled in FFmpeg. Add a comment that this list must be in sync with the histograms.xml file? https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:260: static const ContainerNameMapping container_name_mapping[215]; 215 seems a magic number? Is it CONTAINER_MAX? https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:260: static const ContainerNameMapping container_name_mapping[215]; Hide this as a file scope constant in the .cc file? Then we can make all member functions of ContainerNames static. If we do that, drop the ctor/dtor and replace DISALLOW_COPY_AND_ASSIGN with DISALLOW_IMPLICIT_CONSTRUCTORS. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:265: size_t buffer_size); We discourage the use of function overloading: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Functi... The second one is not actually simple table lookup. Use something like ExtractContainer()? https://codereview.chromium.org/14495010/diff/1/media/base/container_names_un... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names_un... media/base/container_names_unittest.cc:13: template <typename T, size_t N> inline size_t size(T(&)[N]) { return N; } Can you use arraysize directly? https://code.google.com/p/chromium/codesearch#chromium/src/base/basictypes.h&... https://codereview.chromium.org/14495010/diff/1/media/base/container_names_un... media/base/container_names_unittest.cc:15: #define BOM "\xef\xbb\xbf" what is BOM? Use a more readable name? Or add a comment if this is a common abbreviation in it's context. Also move this to line 115 (before it's used)? https://codereview.chromium.org/14495010/diff/1/media/base/container_names_un... media/base/container_names_unittest.cc:20: ASSERT_LT(strcmp(ContainerNames::container_name_mapping[i - 1].name, Use base::strcasecmp ? https://code.google.com/p/chromium/codesearch#chromium/src/base/string_util.h... https://codereview.chromium.org/14495010/diff/1/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (left): https://codereview.chromium.org/14495010/diff/1/media/filters/ffmpeg_glue.cc#... media/filters/ffmpeg_glue.cc:6: keep this empty line: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Names_... https://codereview.chromium.org/14495010/diff/1/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (right): https://codereview.chromium.org/14495010/diff/1/media/filters/ffmpeg_glue.cc#... media/filters/ffmpeg_glue.cc:174: uint8 buf[8192]; This code path only happens in the failure case. But we always use a 8k buffer on the stack, which seems inefficient. Allocate this on the heap? https://codereview.chromium.org/14495010/diff/1/tools/metrics/histograms/hist... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/14495010/diff/1/tools/metrics/histograms/hist... tools/metrics/histograms/histograms.xml:939: <histogram name="Media.Container" enum="FFmpegContainers"> nit: does it make sense to use Media.DetectedContainer to be consistent with the following two?
/cc other UMA folks as an FYI. Vadim, is a histogram with roughly 450 buckets going to cause any issues with the backend processing? (Recall that this discussion is in a public forum, so please switch to an internal medium if answering would require discussing the backend pipeline beyond what's ok in a public forum.) Apologies for my drive-by comments on the non-UMA related parts of this CL, but I'm concerned by all the code that seems to be re-implementing existing code. Please scan through the code that you've written and consider whether it seems like the sort of code that you'd expect to already exist somewhere. For code that you think probably ought to already exist, please look around for it, or ask your teammates or others on the Chromium team whether there's existing code that handles what you're aiming to do. Thanks :) https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:15: template <typename T, size_t N> inline size_t size(T(&)[N]) { return N; } Any reason not to use the shared arraysize macro defined in base/basictypes.h? https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:19: ((const uint8_t*)(p))[1]) Please use C++-style casts, rather than C-style casts. Also, it seems really likely that this code already exists somewhere in base, perhaps under base/strings/string_number_conversions.h. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:38: const ContainerNameMapping ContainerNames::container_name_mapping[] = { Any reason not to tuck this into an anonymous namespace, rather than having it be a static class member? https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:261: DCHECK_NE(container_name, (const char*)NULL); nit: This is usually written as "DCHECK(container_name);" https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:263: // Do a binary search through supported_mapping looking for a match. Any reason not to just use binary_search() from std::algorithm? https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:281: #define BOM "\xef\xbb\xbf" Any reason not to define this as a named constant, i.e. "const char kByteOrderMark[] = "\xef\xbb\xbf";", rather than a macro? https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:315: static const char IDF_SIGNATURE[] = nit: This should use kConstantName casing rather than MACRO_NAME casing. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:1982: "Media.Container", container * 2, CONTAINER_MAX * 2); Can you use a sparse histogram for this histogram? An enumerated histogram with ~450 buckets costs 450*(12 bytes per bucket) = 5.4KB, which is relatively expensive. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:1990: "Media.Container", container * 2 + 1, CONTAINER_MAX * 2); Please define a helper function along the lines of void LogContainer(FFmpegContainerName container, bool is_guess) { int metric = 2 * container; if (is_guess) ++metric; UMA_HISTOGRAM_SPARSE_SLOWLY("Media.Container", metric); } This way, it's harder for the two UMA_HISTOGRAM_ENUMERATION macros to drift out of sync with each other, e.g. by having a typo that affects only one of them. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:246: } ContainerNameMapping; nit: No need for typedef nor the second instance of "ContainerNameMapping"
Updated CL. PTAL. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:15: template <typename T, size_t N> inline size_t size(T(&)[N]) { return N; } On 2013/04/29 23:48:06, Ilya Sherman wrote: > Any reason not to use the shared arraysize macro defined in base/basictypes.h? Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:19: ((const uint8_t*)(p))[1]) On 2013/04/29 23:48:06, Ilya Sherman wrote: > Please use C++-style casts, rather than C-style casts. Also, it seems really > likely that this code already exists somewhere in base, perhaps under > base/strings/string_number_conversions.h. Unable to find a similar function in base. Converted them to inline functions for better type checking. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:38: const ContainerNameMapping ContainerNames::container_name_mapping[] = { On 2013/04/29 23:48:06, Ilya Sherman wrote: > Any reason not to tuck this into an anonymous namespace, rather than having it > be a static class member? This is used by the unittests to verify it is sorted properly. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:261: DCHECK_NE(container_name, (const char*)NULL); On 2013/04/29 23:48:06, Ilya Sherman wrote: > nit: This is usually written as "DCHECK(container_name);" Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:263: // Do a binary search through supported_mapping looking for a match. On 2013/04/29 23:48:06, Ilya Sherman wrote: > Any reason not to just use binary_search() from std::algorithm? My understanding of binary_search is that it simply returns true/false. I need to get the appropriate CONTAINER value. I guess I could use lower_bound, but I suspect it will be just as much code to get the array into a vector and write comparison methods. There is a unit test that verifies calling this function with each string returns the expected result. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:275: } On 2013/04/29 21:00:25, xhwang wrote: > indent Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:281: #define BOM "\xef\xbb\xbf" On 2013/04/29 23:48:06, Ilya Sherman wrote: > Any reason not to define this as a named constant, i.e. "const char > kByteOrderMark[] = "\xef\xbb\xbf";", rather than a macro? This is used in several definitions below, where it is prepended on a signature to make an alternate signature. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:315: static const char IDF_SIGNATURE[] = On 2013/04/29 23:48:06, Ilya Sherman wrote: > nit: This should use kConstantName casing rather than MACRO_NAME casing. Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:1982: "Media.Container", container * 2, CONTAINER_MAX * 2); On 2013/04/29 23:48:06, Ilya Sherman wrote: > Can you use a sparse histogram for this histogram? An enumerated histogram with > ~450 buckets costs 450*(12 bytes per bucket) = 5.4KB, which is relatively > expensive. Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:1990: "Media.Container", container * 2 + 1, CONTAINER_MAX * 2); On 2013/04/29 23:48:06, Ilya Sherman wrote: > Please define a helper function along the lines of > > void LogContainer(FFmpegContainerName container, bool is_guess) { > int metric = 2 * container; > if (is_guess) > ++metric; > > UMA_HISTOGRAM_SPARSE_SLOWLY("Media.Container", metric); > } > > This way, it's harder for the two UMA_HISTOGRAM_ENUMERATION macros to drift out > of sync with each other, e.g. by having a typo that affects only one of them. Great idea. Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:14: // FFmpeg code. Not all of these are enabled in FFmpeg. On 2013/04/29 21:00:25, xhwang wrote: > Add a comment that this list must be in sync with the histograms.xml file? Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:246: } ContainerNameMapping; On 2013/04/29 23:48:06, Ilya Sherman wrote: > nit: No need for typedef nor the second instance of "ContainerNameMapping" Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:260: static const ContainerNameMapping container_name_mapping[215]; On 2013/04/29 21:00:25, xhwang wrote: > 215 seems a magic number? Is it CONTAINER_MAX? It's the actual array size. Since the tests need access to the size, added a function to return the size. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:260: static const ContainerNameMapping container_name_mapping[215]; On 2013/04/29 21:00:25, xhwang wrote: > Hide this as a file scope constant in the .cc file? Then we can make all member > functions of ContainerNames static. If we do that, drop the ctor/dtor and > replace DISALLOW_COPY_AND_ASSIGN with DISALLOW_IMPLICIT_CONSTRUCTORS. The mapping needs to be accessed by the test code. However, I've made everything static. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.h#... media/base/container_names.h:265: size_t buffer_size); On 2013/04/29 21:00:25, xhwang wrote: > We discourage the use of function overloading: > http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Functi... > > The second one is not actually simple table lookup. Use something like > ExtractContainer()? Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names_un... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names_un... media/base/container_names_unittest.cc:13: template <typename T, size_t N> inline size_t size(T(&)[N]) { return N; } On 2013/04/29 21:00:25, xhwang wrote: > Can you use arraysize directly? > > https://code.google.com/p/chromium/codesearch#chromium/src/base/basictypes.h&... Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names_un... media/base/container_names_unittest.cc:15: #define BOM "\xef\xbb\xbf" On 2013/04/29 21:00:25, xhwang wrote: > what is BOM? Use a more readable name? Or add a comment if this is a common > abbreviation in it's context. > > Also move this to line 115 (before it's used)? Done. https://codereview.chromium.org/14495010/diff/1/media/base/container_names_un... media/base/container_names_unittest.cc:20: ASSERT_LT(strcmp(ContainerNames::container_name_mapping[i - 1].name, On 2013/04/29 21:00:25, xhwang wrote: > Use base::strcasecmp ? > > https://code.google.com/p/chromium/codesearch#chromium/src/base/string_util.h... Done. Also in container_names.cc. https://codereview.chromium.org/14495010/diff/1/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (left): https://codereview.chromium.org/14495010/diff/1/media/filters/ffmpeg_glue.cc#... media/filters/ffmpeg_glue.cc:6: On 2013/04/29 21:00:25, xhwang wrote: > keep this empty line: > http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Names_... Done. https://codereview.chromium.org/14495010/diff/1/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (right): https://codereview.chromium.org/14495010/diff/1/media/filters/ffmpeg_glue.cc#... media/filters/ffmpeg_glue.cc:174: uint8 buf[8192]; On 2013/04/29 21:00:25, xhwang wrote: > This code path only happens in the failure case. But we always use a 8k buffer > on the stack, which seems inefficient. Allocate this on the heap? Done. https://codereview.chromium.org/14495010/diff/1/tools/metrics/histograms/hist... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/14495010/diff/1/tools/metrics/histograms/hist... tools/metrics/histograms/histograms.xml:939: <histogram name="Media.Container" enum="FFmpegContainers"> On 2013/04/29 21:00:25, xhwang wrote: > nit: does it make sense to use Media.DetectedContainer to be consistent with the > following two? Done.
Mostly nits. Not reviewed the detailed parsing code yet :) Will do it tomorrow. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names.cc File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:32: } I'll be helpful to document the endianness of these functions. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:256: // Return the number of elements in container_name_mapping. nit: s/Return/Returns This applies to declaration comments but not in implementation. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:283: } re to the other comment about using std::binary_search. You could probably use std::equal_range. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:287: #define BYTE_ORDER_MARK "\xef\xbb\xbf" Here and below. We don't usually use white space for alignment. I can see that you have a lot of them so alignment might help readability a bit... scherkus@: thoughts? https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:1328: enum FFmpegContainerName LookupContainerByFirst4(const uint8 buffer[], here and below, drop "enum" https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:1759: case YUV4_TAG: // TAG('Y','U','V','4') Won't just using TAG('Y','U','V','4') be more clear? ditto for below... https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:1949: memcmp(buffer + offset, ipm_string, sizeof(ipm_string) - 1) == 0) s/sizeof/arraysize https://codereview.chromium.org/14495010/diff/6001/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:244: // Mapping of a ID to name nit: Add "." at end of comment. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:246: enum FFmpegContainerName id; nit: you can drop "enum" here https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:256: static void LogContainer(const uint8 buffer[], size_t buffer_size); Here and everywhere else, why not use const uint8* and int? We use "int" as much as we can: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Intege... https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:260: static const size_t ContainerNameMappingSize(); You can move these two to "private:" and make the unittest a friend. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:15: for (size_t i = 1; i < ContainerNames::ContainerNameMappingSize(); ++i) { s/size_t/int https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:72: ContainerNames::DetermineContainer((uint8_t*)buffer3, strlen(buffer3))); Write a helper function to do this EXPECT_EQ()? https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:112: enum FFmpegContainerName id; drop enum https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:168: while (fixed_strings[index].id != CONTAINER_UNKNOWN) { why not use a for loop? https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:179: ContainerNames::DetermineContainer((uint8_t*)buffer, sizeof(buffer))); sizeof(buffer) is always 256. is this what we want? If yes, just use the const instead of sizeof()? https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:192: size_t read = fread(buffer, sizeof(uint8_t), sizeof(buffer), f); Use ReadFile? https://code.google.com/p/chromium/codesearch#chromium/src/base/file_util.h&q... https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:199: TEST(ContainerNamesTest, FileCheck) { Nice test!
Thanks, the histogram changes LGTM. I'll let the media/ reviewers comment on the rest of the code, with a final few suggestions there: https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:38: const ContainerNameMapping ContainerNames::container_name_mapping[] = { On 2013/04/30 21:36:50, jrummell wrote: > On 2013/04/29 23:48:06, Ilya Sherman wrote: > > Any reason not to tuck this into an anonymous namespace, rather than having it > > be a static class member? > > This is used by the unittests to verify it is sorted properly. If you're only exposing it for testing, then I'd recommend moving it out of the class into an "internal" namespace, with a comment indicating that it's exposed just for testing. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:263: // Do a binary search through supported_mapping looking for a match. On 2013/04/30 21:36:50, jrummell wrote: > On 2013/04/29 23:48:06, Ilya Sherman wrote: > > Any reason not to just use binary_search() from std::algorithm? > My understanding of binary_search is that it simply returns true/false. I need > to get the appropriate CONTAINER value. I guess I could use lower_bound, but I > suspect it will be just as much code to get the array into a vector and write > comparison methods. There is a unit test that verifies calling this function > with each string returns the expected result. lower_bound should work with raw pointers into an array just as well as with a vector. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names.cc File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:2088: } nit: Please move this function definition into an anonymous namespace, typically placed at the top of the file. Once you do, there's no need to mark the function as static. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:269: }; nit: This seems more like a namespace than a class. How about actually defining it as such?
Updated based on the comments received. Have also added much better checking that tests don't go off of the end of the buffer, both for memcmp and sscanf (very simple scanf like method added to verify formatted strings). https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:38: const ContainerNameMapping ContainerNames::container_name_mapping[] = { On 2013/05/02 06:34:09, Ilya Sherman wrote: > On 2013/04/30 21:36:50, jrummell wrote: > > On 2013/04/29 23:48:06, Ilya Sherman wrote: > > > Any reason not to tuck this into an anonymous namespace, rather than having > it > > > be a static class member? > > > > This is used by the unittests to verify it is sorted properly. > > If you're only exposing it for testing, then I'd recommend moving it out of the > class into an "internal" namespace, with a comment indicating that it's exposed > just for testing. Moved to internal, test class is a friend. https://codereview.chromium.org/14495010/diff/1/media/base/container_names.cc... media/base/container_names.cc:263: // Do a binary search through supported_mapping looking for a match. On 2013/05/02 06:34:09, Ilya Sherman wrote: > On 2013/04/30 21:36:50, jrummell wrote: > > On 2013/04/29 23:48:06, Ilya Sherman wrote: > > > Any reason not to just use binary_search() from std::algorithm? > > My understanding of binary_search is that it simply returns true/false. I need > > to get the appropriate CONTAINER value. I guess I could use lower_bound, but I > > suspect it will be just as much code to get the array into a vector and write > > comparison methods. There is a unit test that verifies calling this function > > with each string returns the expected result. > > lower_bound should work with raw pointers into an array just as well as with a > vector. Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names.cc File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:32: } On 2013/05/01 01:49:11, xhwang wrote: > I'll be helpful to document the endianness of these functions. Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:256: // Return the number of elements in container_name_mapping. On 2013/05/01 01:49:11, xhwang wrote: > nit: s/Return/Returns > > This applies to declaration comments but not in implementation. Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:287: #define BYTE_ORDER_MARK "\xef\xbb\xbf" On 2013/05/01 01:49:11, xhwang wrote: > Here and below. We don't usually use white space for alignment. I can see that > you have a lot of them so alignment might help readability a bit... > > scherkus@: thoughts? I converted these to const char arrays, so the spacing issue has gone away. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:1328: enum FFmpegContainerName LookupContainerByFirst4(const uint8 buffer[], On 2013/05/01 01:49:11, xhwang wrote: > here and below, drop "enum" Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:1759: case YUV4_TAG: // TAG('Y','U','V','4') On 2013/05/01 01:49:11, xhwang wrote: > Won't just using TAG('Y','U','V','4') be more clear? > > ditto for below... I defined the YUV4_TAG next to YUV4_SIGNATURE, with the intent that if you need to change the signature, you will see the tag directly below it and change it as well. However, it is unlikely that these strings will change, so I have removed the _TAG values. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:1949: memcmp(buffer + offset, ipm_string, sizeof(ipm_string) - 1) == 0) On 2013/05/01 01:49:11, xhwang wrote: > s/sizeof/arraysize Since I'm doing a memcmp, I think sizeof is better. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.cc:2088: } On 2013/05/02 06:34:09, Ilya Sherman wrote: > nit: Please move this function definition into an anonymous namespace, typically > placed at the top of the file. Once you do, there's no need to mark the > function as static. I've been told that in media code we use static rather than anonymous namespace. So I've added static to all the other methods that need them. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:244: // Mapping of a ID to name On 2013/05/01 01:49:11, xhwang wrote: > nit: Add "." at end of comment. Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:246: enum FFmpegContainerName id; On 2013/05/01 01:49:11, xhwang wrote: > nit: you can drop "enum" here Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:260: static const size_t ContainerNameMappingSize(); On 2013/05/01 01:49:11, xhwang wrote: > You can move these two to "private:" and make the unittest a friend. Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:269: }; On 2013/05/02 06:34:09, Ilya Sherman wrote: > nit: This seems more like a namespace than a class. How about actually defining > it as such? I'll let xhwang@ or scherkus@ make the call. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:15: for (size_t i = 1; i < ContainerNames::ContainerNameMappingSize(); ++i) { On 2013/05/01 01:49:11, xhwang wrote: > s/size_t/int Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:72: ContainerNames::DetermineContainer((uint8_t*)buffer3, strlen(buffer3))); On 2013/05/01 01:49:11, xhwang wrote: > Write a helper function to do this EXPECT_EQ()? Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:112: enum FFmpegContainerName id; On 2013/05/01 01:49:11, xhwang wrote: > drop enum Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:168: while (fixed_strings[index].id != CONTAINER_UNKNOWN) { On 2013/05/01 01:49:11, xhwang wrote: > why not use a for loop? Done. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:179: ContainerNames::DetermineContainer((uint8_t*)buffer, sizeof(buffer))); On 2013/05/01 01:49:11, xhwang wrote: > sizeof(buffer) is always 256. is this what we want? If yes, just use the const > instead of sizeof()? Yes. Some of the tests need at least 128 characters in the buffer (which shouldn't be a problem for real media files). https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:192: size_t read = fread(buffer, sizeof(uint8_t), sizeof(buffer), f); On 2013/05/01 01:49:11, xhwang wrote: > Use ReadFile? > https://code.google.com/p/chromium/codesearch#chromium/src/base/file_util.h&q... I don't want the whole file ... just the first 8K (if it exists). Some of the files are huge, although it looks like I picked mostly smaller files.
I looked at part of the check_xxx functions. Are these all copied from ffmpeg as-is or extracted from there with modifications? I have no way to know if the implementation is correct, but we should make sure all functions are secure: no access violation. I have pointed out a few cases where we start to read without checking the buffer size, and where we return true even the buffer size is zero. Please help check all functions on these issues. This is a great CL that requires a lot of effort. Thanks for the nice work and patience! https://codereview.chromium.org/14495010/diff/6001/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names.h:269: }; On 2013/05/02 21:53:09, jrummell wrote: > On 2013/05/02 06:34:09, Ilya Sherman wrote: > > nit: This seems more like a namespace than a class. How about actually > defining > > it as such? > > I'll let xhwang@ or scherkus@ make the call. I saw both in chromium code. Then I searched around and found using a namespace is more recommended. If we do that, we have to expose container_name_mapping[] so that it can be used by the test, which seems bad (no access control in namespaces). But then we can hide LookupContainer() and DetermineContainer() in the .cc file, which seems nice. With this, I am a little bit leaning towards using namespace. WDYT? https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:192: size_t read = fread(buffer, sizeof(uint8_t), sizeof(buffer), f); On 2013/05/02 21:53:09, jrummell wrote: > On 2013/05/01 01:49:11, xhwang wrote: > > Use ReadFile? > > > https://code.google.com/p/chromium/codesearch#chromium/src/base/file_util.h&q... > > I don't want the whole file ... just the first 8K (if it exists). Some of the > files are huge, although it looks like I picked mostly smaller files. In that case, use ReadFile()? https://code.google.com/p/chromium/codesearch#chromium/src/base/file_util.h&q... https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:34: static bool CompareEqual(const uint8_t* buffer, This function is checking that "search" is a prefix of "buffer", not checking they are equal. Rename as such? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:43: // if the first argument is less than the second. s/less than the second/less than the second in lexicographical order/ https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:106: const uint16_t ac3_frame_size_table[38][3] = { use kCamelCase for consts? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:121: static bool check_aac(const uint8_t buffer[], size_t buffer_size) { use const uint8_t* and int? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:126: uint32_t syncword = (Read16(buffer + offset) >> 4) & 0xfff; we are using uint32_t in this function even for 16bit integer. Then we are also using "int" for the result of bitwise operations (e.g. on line 147). Are these all coming from FFmpeg? If so, let's just keep the way it is. If it's our new code, let's be consistent. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:134: return true; So if buffer_size is <= 5, we return true here. Is this what we wanted? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:161: return true; ditto https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:188: return true; ditto https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:239: index += buffer[i + 2]; just double check, so we are not using buffer[i + 3]? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:240: } indent https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:294: if (buffer_size < 20) s/</<=/ ? It seems we only need 20 bytes. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:310: #define HLS3 "#EXT-X-MEDIA-SEQUENCE:" here and 359-362: use const instead of macro? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:356: return true; ditto about buffer_size check https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:419: uint32_t type = (length + 2 < buffer_size) ? Read16(buffer + length) : 0; here, length looks like type_offset.. then on line 420, length must be 22 or 24. so buffer_size should at least be 24? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:428: static bool check_mov(const uint8_t buffer[], size_t buffer_size) { ditto, shall we check buffer_size? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:660: { 0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448, 0 }; here and below, 4 space indentation. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:725: (buffer[9] & 0x7f) + indent https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:735: bool seenHeader) { ditto, check buffer_size? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:751: static int64_t get_mpc8_header_size(const uint8_t* b, size_t* count) { s/b/buffer and s/count/buffer_size to be consistent? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:767: while (offset < buffer_size - 14) { size_t -14 is dangerous. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:768: if (!isupper(buffer[offset]) || !isupper(buffer[offset + 1])) include what you use, add #include <cctype> ? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:789: uint32_t width = Read16(buffer + 2); check buffer_size before starting to read? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:811: uint32_t size = Read16(buffer + 5); check buffer_size before reading? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:829: if (buffer[4] != 0) check buffer_size before read? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:36: static void Verify(FFmpegContainerName name, Verify seems too generic. How about ExpectContainer? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:104: char buffer4[256] = ".snd"; will 128 work? https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:107: // HLS has it's own loop, so try it Here and below, comments should be complete sentences. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:109: "some other random stuff" indent https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:115: "some filler\n" indent https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:129: struct Mapping { Document what this is. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:131: int length; Comment about what this length is. Why you have 0 and non-zero lengthes... https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:138: const Mapping fixed_strings[] = { Use kCamelCase for consts. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:207: size_t read = fread(buffer, sizeof(char), sizeof(buffer), f); copy from other comment: Use file_util::ReadFile()
Changes due to comments, plus moved the code into it's own namespace and removed the class. https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/6001/media/base/container_names... media/base/container_names_unittest.cc:192: size_t read = fread(buffer, sizeof(uint8_t), sizeof(buffer), f); On 2013/05/03 19:59:06, xhwang wrote: > On 2013/05/02 21:53:09, jrummell wrote: > > On 2013/05/01 01:49:11, xhwang wrote: > > > Use ReadFile? > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/base/file_util.h&q... > > > > I don't want the whole file ... just the first 8K (if it exists). Some of the > > files are huge, although it looks like I picked mostly smaller files. > > In that case, use ReadFile()? > https://code.google.com/p/chromium/codesearch#chromium/src/base/file_util.h&q... Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:34: static bool CompareEqual(const uint8_t* buffer, On 2013/05/03 19:59:06, xhwang wrote: > This function is checking that "search" is a prefix of "buffer", not checking > they are equal. Rename as such? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:43: // if the first argument is less than the second. On 2013/05/03 19:59:06, xhwang wrote: > s/less than the second/less than the second in lexicographical order/ Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:106: const uint16_t ac3_frame_size_table[38][3] = { On 2013/05/03 19:59:06, xhwang wrote: > use kCamelCase for consts? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:121: static bool check_aac(const uint8_t buffer[], size_t buffer_size) { On 2013/05/03 19:59:06, xhwang wrote: > use const uint8_t* and int? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:126: uint32_t syncword = (Read16(buffer + offset) >> 4) & 0xfff; On 2013/05/03 19:59:06, xhwang wrote: > we are using uint32_t in this function even for 16bit integer. Then we are also > using "int" for the result of bitwise operations (e.g. on line 147). Are these > all coming from FFmpeg? If so, let's just keep the way it is. If it's our new > code, let's be consistent. Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:134: return true; On 2013/05/03 19:59:06, xhwang wrote: > So if buffer_size is <= 5, we return true here. Is this what we wanted? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:161: return true; On 2013/05/03 19:59:06, xhwang wrote: > ditto Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:188: return true; On 2013/05/03 19:59:06, xhwang wrote: > ditto Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:239: index += buffer[i + 2]; On 2013/05/03 19:59:06, xhwang wrote: > just double check, so we are not using buffer[i + 3]? This is what is in ffmpeg (https://code.google.com/p/chromium/codesearch#chromium/src/third_party/ffmpeg...). https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:240: } On 2013/05/03 19:59:06, xhwang wrote: > indent Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:294: if (buffer_size < 20) On 2013/05/03 19:59:06, xhwang wrote: > s/</<=/ ? It seems we only need 20 bytes. Correct. If we have 20 bytes we can test. If it's only 19 then we go off the end of the buffer. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:356: return true; On 2013/05/03 19:59:06, xhwang wrote: > ditto about buffer_size check Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:419: uint32_t type = (length + 2 < buffer_size) ? Read16(buffer + length) : 0; On 2013/05/03 19:59:06, xhwang wrote: > here, length looks like type_offset.. then on line 420, length must be 22 or 24. > so buffer_size should at least be 24? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:428: static bool check_mov(const uint8_t buffer[], size_t buffer_size) { On 2013/05/03 19:59:06, xhwang wrote: > ditto, shall we check buffer_size? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:660: { 0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448, 0 }; On 2013/05/03 19:59:06, xhwang wrote: > here and below, 4 space indentation. Reformatted to what clang-format does. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:725: (buffer[9] & 0x7f) + On 2013/05/03 19:59:06, xhwang wrote: > indent Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:735: bool seenHeader) { On 2013/05/03 19:59:06, xhwang wrote: > ditto, check buffer_size? Not sure why? We only return true if we see at least 10 headers, so if it's too small it won't match. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:751: static int64_t get_mpc8_header_size(const uint8_t* b, size_t* count) { On 2013/05/03 19:59:06, xhwang wrote: > s/b/buffer and s/count/buffer_size to be consistent? Done s/b/buffer. However, the second is the number processed, not the buffer size. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:767: while (offset < buffer_size - 14) { On 2013/05/03 19:59:06, xhwang wrote: > size_t -14 is dangerous. Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:768: if (!isupper(buffer[offset]) || !isupper(buffer[offset + 1])) On 2013/05/03 19:59:06, xhwang wrote: > include what you use, add #include <cctype> ? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:789: uint32_t width = Read16(buffer + 2); On 2013/05/03 19:59:06, xhwang wrote: > check buffer_size before starting to read? All the checks tagged with "Additional" are only called from LookupContainerByFirst4, so we know the size is at least 12 bytes. But it doesn't hurt to make it explicit. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:811: uint32_t size = Read16(buffer + 5); On 2013/05/03 19:59:06, xhwang wrote: > check buffer_size before reading? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:829: if (buffer[4] != 0) On 2013/05/03 19:59:06, xhwang wrote: > check buffer_size before read? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:36: static void Verify(FFmpegContainerName name, On 2013/05/03 19:59:06, xhwang wrote: > Verify seems too generic. How about ExpectContainer? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:104: char buffer4[256] = ".snd"; On 2013/05/03 19:59:06, xhwang wrote: > will 128 work? Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:107: // HLS has it's own loop, so try it On 2013/05/03 19:59:06, xhwang wrote: > Here and below, comments should be complete sentences. Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:109: "some other random stuff" On 2013/05/03 19:59:06, xhwang wrote: > indent Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:115: "some filler\n" On 2013/05/03 19:59:06, xhwang wrote: > indent Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:129: struct Mapping { On 2013/05/03 19:59:06, xhwang wrote: > Document what this is. Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:131: int length; On 2013/05/03 19:59:06, xhwang wrote: > Comment about what this length is. Why you have 0 and non-zero lengthes... Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:138: const Mapping fixed_strings[] = { On 2013/05/03 19:59:06, xhwang wrote: > Use kCamelCase for consts. Done. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names_unittest.cc:207: size_t read = fread(buffer, sizeof(char), sizeof(buffer), f); On 2013/05/03 19:59:06, xhwang wrote: > copy from other comment: Use file_util::ReadFile() Done.
Looking really good. Just a few more comments. https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:239: index += buffer[i + 2]; On 2013/05/03 23:56:21, jrummell wrote: > On 2013/05/03 19:59:06, xhwang wrote: > > just double check, so we are not using buffer[i + 3]? > > This is what is in ffmpeg > (https://code.google.com/p/chromium/codesearch#chromium/src/third_party/ffmpeg...). ok, thanks for the link. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:8: #include <limits> nit: extra line after this https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:16: ((static_cast<uint8_t>(a) << 24) | (static_cast<uint8_t>(b) << 16) | \ sorry for not mentioning this earlier. we use uint8 instead of uint8_t. It's confusing to look at the style guide for this. Probably "be consistent" rule applies here. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:41: memcmp(buffer, search, search_size) == 0); It looks like we always call StartWith with an C string as the third parameter so we always use StartWith(buffer, size, Foo, sizeof(Foo) - 1). If so, we should be able to drop the last parameter. How about using strnlen in this function and drop the last parameter. This can help you get simpler formatting in most call sites below. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:131: int size = (Read24(buffer + offset + 3) >> 5) & 0x1fff; shall we make sure size>0 here? Otherwise we'll have a infinite loop. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:136: return (offset > 0); In a lot of functions we are checking the "buffer_size" at the beginning of the call, e.g. in CheckDnxhd, CheckIdcin and CheckMp3. Can we also do this in those functions that "return (offset > 0)" for better consistency? Also we prefer handling abnormal cases upfront to leaving it at the end of the function. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:1180: sizeof(kSmjpegSignature) - 1)) "conditional or loop statements with complex conditions or statements may be more readable with curly braces" http://www.corp.google.com/eng/doc/cppguide.xml?showone=Conditionals#Conditio... We typically use {} when the condition or statements have multiple lines. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:2262: size_t buffer_size) { indent https://codereview.chromium.org/14495010/diff/31001/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.h:11: namespace container_names { How about put container_names namespace within media namespace? https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names_unittest.cc:13: // Verify the container name matches the result nit: wrapping too early? https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names_unittest.cc:15: void ExpectContainer(FFmpegContainerName name, nit: make this function static.
https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/16001/media/base/container_name... media/base/container_names.cc:239: index += buffer[i + 2]; On 2013/05/03 23:56:21, jrummell wrote: > On 2013/05/03 19:59:06, xhwang wrote: > > just double check, so we are not using buffer[i + 3]? > > This is what is in ffmpeg > (https://code.google.com/p/chromium/codesearch#chromium/src/third_party/ffmpeg...). Just double-checking: Is it consistent with ffmpeg's licensing for us to copy code out of ffmpeg into the Chromium tree?
Apologies for not getting to this sooner -- I was under the impression we'd focus on our container parsing code. We can (and should) simplify this CL before landing. Let's do the following: * Simplify container_names.cc (see my comments there) * Share a doc / spreadsheet with containers with the team and we'll whittle down the list of containers to a smaller, more managable subset https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:16: ((static_cast<uint8_t>(a) << 24) | (static_cast<uint8_t>(b) << 16) | \ nit: this should be a 4-space indent https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:55: if (is_guess) I don't think we're gaining much by logging both what FFmpeg guesses and (if it can't guess) logging what we detect. I found the resulting impact to histograms.xml where we have to repeat values confusing. Not only that, but we have to expand the enum to handle many cases that don't have valid real-world file representations (e.g., dshow, alsa, pulse) and it also places an implicit dependency on those strings not changing in the future. Let's focus on our own container guessing code: * Remove the FFmpeg string-to-ID mapping * Remove the duplicate entries in histograms.xml * Remove any entries from the enum that no longer make sense https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:1812: // ipmovie, mxf, subviewer1 I can say with a good deal of certainty that we don't care about any of these containers. Same with the ones that are checked by LookupContainerByStringLine(). I'd much rather err on the side of less code / buffer parsing code and remove LookupContainerByStringScan(), LookupContainerByStringLine(), and SaferScanf(). https://codereview.chromium.org/14495010/diff/31001/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (right): https://codereview.chromium.org/14495010/diff/31001/media/filters/ffmpeg_glue... media/filters/ffmpeg_glue.cc:172: // position unchanged. As mentioned in container_names.cc, let's focus on our own container guessing code and not rely on AVFormatContext. In that case, we should alter this code to run prior to avformat_open_input(). https://codereview.chromium.org/14495010/diff/31001/media/filters/ffmpeg_glue... media/filters/ffmpeg_glue.cc:178: int numRead = protocol_->Read(8192, buffer.get()); it's possible for this to return an error if (for example) the network connection terminates before returning 8192 bytes
On 2013/04/29 23:48:06, Ilya Sherman wrote: > /cc other UMA folks as an FYI. Vadim, is a histogram with roughly 450 buckets > going to cause any issues with the backend processing? (Recall that this > discussion is in a public forum, so please switch to an internal medium if > answering would require discussing the backend pipeline beyond what's ok in a > public forum.) Sorry for the late response due to my vacation. The short answer is that it could be a problem especially if used together with lots of experiments. At some point the data could be filtered out and if that happens you may need to use your own analysis tools. But then you probably wanted to have your own analysis tools anyway because I don't think a front end normally displaying 5-10 enum values is a good tool to display 450 values. If you want a deeper explanation we may start an internal discussion.
+ acolwell@ since scherkus@ is on vacation Multiple changes to reduce the number of formats detected to match the list discussed. I have verified the code against "real" files for all formats except DTSHD. Major change to the code to use BitReader rather than bitmasks as it simplified things. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:8: #include <limits> On 2013/05/06 23:51:27, xhwang wrote: > nit: extra line after this Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:16: ((static_cast<uint8_t>(a) << 24) | (static_cast<uint8_t>(b) << 16) | \ On 2013/05/06 23:51:27, xhwang wrote: > sorry for not mentioning this earlier. we use uint8 instead of uint8_t. It's > confusing to look at the style guide for this. Probably "be consistent" rule > applies here. Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:16: ((static_cast<uint8_t>(a) << 24) | (static_cast<uint8_t>(b) << 16) | \ On 2013/05/07 00:50:20, scherkus wrote: > nit: this should be a 4-space indent Done. I just let clang-format do it's thing. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:41: memcmp(buffer, search, search_size) == 0); On 2013/05/06 23:51:27, xhwang wrote: > It looks like we always call StartWith with an C string as the third parameter > so we always use StartWith(buffer, size, Foo, sizeof(Foo) - 1). If so, we should > be able to drop the last parameter. How about using strnlen in this function and > drop the last parameter. > > This can help you get simpler formatting in most call sites below. Done. However, some of the "strings" contain embedded \0, so I left them as passing buffer + size. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:55: if (is_guess) On 2013/05/07 00:50:20, scherkus wrote: > I don't think we're gaining much by logging both what FFmpeg guesses and (if it > can't guess) logging what we detect. > > I found the resulting impact to histograms.xml where we have to repeat values > confusing. Not only that, but we have to expand the enum to handle many cases > that don't have valid real-world file representations (e.g., dshow, alsa, pulse) > and it also places an implicit dependency on those strings not changing in the > future. > > Let's focus on our own container guessing code: > * Remove the FFmpeg string-to-ID mapping > * Remove the duplicate entries in histograms.xml > * Remove any entries from the enum that no longer make sense Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:131: int size = (Read24(buffer + offset + 3) >> 5) & 0x1fff; On 2013/05/06 23:51:27, xhwang wrote: > shall we make sure size>0 here? Otherwise we'll have a infinite loop. Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:136: return (offset > 0); On 2013/05/06 23:51:27, xhwang wrote: > In a lot of functions we are checking the "buffer_size" at the beginning of the > call, e.g. in CheckDnxhd, CheckIdcin and CheckMp3. Can we also do this in those > functions that "return (offset > 0)" for better consistency? Also we prefer > handling abnormal cases upfront to leaving it at the end of the function. Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:1180: sizeof(kSmjpegSignature) - 1)) On 2013/05/06 23:51:27, xhwang wrote: > "conditional or loop statements with complex conditions or statements may be > more readable with curly braces" > http://www.corp.google.com/eng/doc/cppguide.xml?showone=Conditionals#Conditio... > > We typically use {} when the condition or statements have multiple lines. Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:1812: // ipmovie, mxf, subviewer1 On 2013/05/07 00:50:20, scherkus wrote: > I can say with a good deal of certainty that we don't care about any of these > containers. Same with the ones that are checked by > LookupContainerByStringLine(). > > I'd much rather err on the side of less code / buffer parsing code and remove > LookupContainerByStringScan(), LookupContainerByStringLine(), and SaferScanf(). Partially done. Still need LookupContainerByStringLine for SRT format. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.cc:2262: size_t buffer_size) { On 2013/05/06 23:51:27, xhwang wrote: > indent Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names.h:11: namespace container_names { On 2013/05/06 23:51:27, xhwang wrote: > How about put container_names namespace within media namespace? Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... File media/base/container_names_unittest.cc (right): https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names_unittest.cc:13: // Verify the container name matches the result On 2013/05/06 23:51:27, xhwang wrote: > nit: wrapping too early? Done. https://codereview.chromium.org/14495010/diff/31001/media/base/container_name... media/base/container_names_unittest.cc:15: void ExpectContainer(FFmpegContainerName name, On 2013/05/06 23:51:27, xhwang wrote: > nit: make this function static. Done. https://codereview.chromium.org/14495010/diff/31001/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (right): https://codereview.chromium.org/14495010/diff/31001/media/filters/ffmpeg_glue... media/filters/ffmpeg_glue.cc:172: // position unchanged. On 2013/05/07 00:50:20, scherkus wrote: > As mentioned in container_names.cc, let's focus on our own container guessing > code and not rely on AVFormatContext. > > In that case, we should alter this code to run prior to avformat_open_input(). Done. https://codereview.chromium.org/14495010/diff/31001/media/filters/ffmpeg_glue... media/filters/ffmpeg_glue.cc:178: int numRead = protocol_->Read(8192, buffer.get()); On 2013/05/07 00:50:20, scherkus wrote: > it's possible for this to return an error if (for example) the network > connection terminates before returning 8192 bytes Done.
Sorry I didn't get to a full review today. Here are some initial comments from browsing over the code. I'll do a more detailed review of the parsing code on Monday. https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader.cc File media/base/bit_reader.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader.cc#... media/base/bit_reader.cc:19: while (num_remaining_bits_in_curr_byte_ != 0 && num_bits != 0) { You might want to take a stab at implementing this w/o a loop. I think it could be pretty straight forward if you do the following. - Use ReadBitsInternal() to byte align. - Update curr_byte_ and friends to reflect full byte reads. - Use ReadBitsInternal() again to get any remaining bits. https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader_uni... File media/base/bit_reader_unittest.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader_uni... media/base/bit_reader_unittest.cc:57: EXPECT_TRUE(reader1.SkipBits(0)); Isn't this supposed to return false since the previous line reads past the end? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:19: ((static_cast<uint8>(a) << 24) | (static_cast<uint8>(b) << 16) | \ nit: Is it always safe to shift a uint8 more than 8 bits? Any reason you can't cast to uint32 instead? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:23: static uint32 Read16(const uint8* p) { I think these ReadXXX methods should just be replaced with BitStream reader calls. I think that would reduce a bunch of the offset calculations in the methods that use them. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:39: return p[3] << 24 | p[2] << 16 | p[1] << 8 | p[0]; It would be nice to see this method rewritten in terms of BitReader as well. https://codereview.chromium.org/14495010/diff/44001/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.h:19: enum FFmpegContainerName { nit: Is there any reason to keep the FFmpeg here? There is no dependency on ffmpeg at this point right? s/FFmpeg/Media/? https://codereview.chromium.org/14495010/diff/44001/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/filters/ffmpeg_glue... media/filters/ffmpeg_glue.cc:166: protocol_->GetPosition(&pos); nit: I think you can use avio_context_->opaque to retrieve the protocol pointer. It might be better to do that instead so protocol_ and avio_context_ can't get out of sync. It might even be worth using avio_tell(), avio_seek(), and avio_read() here so you don't even have to worry about FFmpegURLProtocol. https://codereview.chromium.org/14495010/diff/44001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/14495010/diff/44001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:2108: <histogram name="Media.DetectedContainer" enum="FFmpegContainers"> nit: Is there any particular reason to tie this to FFmpeg? s/FFmpeg/Media/ ?
Sorry I didn't get to a full review today. Here are some initial comments from browsing over the code. I'll do a more detailed review of the parsing code on Monday.
You might also want to check out media/mp4/box_definitions.cc and media/mp4/rcheck.h . I think if you applied the RCHECK() macro to your code it could clean things up quite a bit. You'd need to move the definition to media/base, but it might be worth considering.
Looks much better/cleaner than the previous patch! Just a few more comments. Due to the length of this CL, I didn't go through all the spec to check the details/correctness of all parsing code. I was just trying to make sure we don't have access violation in those code. https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader.cc File media/base/bit_reader.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader.cc#... media/base/bit_reader.cc:18: bool BitReader::SkipBits(int num_bits) { This seems mostly duplicate of ReadBitsInternal. How about calling ReadBitsInternal and just drop the result? If num_bits > 64, we can call ReadBitsInternal multiple times. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:59: memcmp(buffer, search, search_size) == 0); nit: how about s/search/prefix ? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:63: static int ReadBits(BitReader* reader, int num_bits) { I wonder whether it's always more convenient to wrap BitReader::ReadBits like this (so that we DCHECK the size and return the read value instead of return boolean). If so, we probably want to change BitReader::ReadBits to do so. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:74: #define BYTE_ORDER_MARK "\xef\xbb\xbf" drop extra white spaces? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:82: static const char kHlsSignature[] = "#EXTM3U"; Define consts right before where where it's used. For example, define kHlsSignature around line 660. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:98: { 2304, 2508, 3456 }, { 2560, 2768, 3840 }, { 2560, 2770, 3840 }, nit: drop the ',' at EOL? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:109: while (offset + 6 < buffer_size) { It's a bit confusing that we use 6 here but the header is 7 or 9 bytes. I read the spec and understand why, but can use use 7 here for less confusion? Or add a comment? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:144: #define AC3_SYNCWORD 0x0b77 const int instead of macro? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:169: if (sample_rate_code == 3) // Reserved. merge the above two lines? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:174: if (frame_size_code >= 38) // Undefined. ditto? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:285: // 01.03.01_60/ts_102114v010301p.pdf) here and elsewhere, URL is not subject to 80-char limitation. http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Line_L... https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:291: BitReader reader(buffer + offset, 11); Again, requiring 12 bytes but only parsing 11 bytes is a bit confusing. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:440: offset = offset + 33; nit: offset += 33; https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:450: // Checks for a H.261 container. nit: s/a/an https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:465: } Can we move this block into the bit while loop on line 470? It seems duplicate to have this start code searching code in two places. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:527: } ditto about duplicate start code searching code. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:608: } ditto. Also can we make all startcode searching code consistent? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:943: #define VOP_START_CODE 0xb6 static const int? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:949: // Expect to see SEQ | VO1 | VOL* | VO2 ... Are those "..." something to finish? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1153: case TAG('x','m','l',' '): Up to this point all tags are ordered. But the tags after this point aren't. Any reason for that? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1200: if (atomsize <= 0) Since we read |atomsize| first, can be handle it first before reading |atomtype|? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1226: }; style nit: We use UPPER_CASE_0 for enums in chromium. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1245: static bool ValidMpegAudioFrameHeader(const uint8* header, int* framesize) { It's a bit scary to have a function reading memory without specifying size. Can we DCHECK or at least comment about this? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1332: // Read a Matroska TAG, updating offset to point past it. I am not sure what a TAG or VTAG type is. Use spec language/terminology so that it's easier to follow? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1431: #define VC1_FRAME_START_CODE 0x0d static const int? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1531: } break; nit: Put break on a separate line. nit: Here and below, empty line after "break"? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1555: int buffer_size) { s/First4/First4Bytes It looks like we need at least 12 bytes for this function. Why it's called "First4"? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1615: case TAG('\x64','\xa3','\x01','\x00'): s/'\x64'/0x64? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1821: static FFmpegContainerName LookupContainerByStringLine(const uint8* buffer, Can we rename this to something like CheckSrt to be consistent with other CheckXxx functions? https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1832: ptr, remaining, "%d:%2d:%2d%[,.]%3d --> %d:%2d:%2d%[,.]%3d")) { The function StringFormatCompare() looks cool! But I wonder, since we only use it here for SRT parsing, can we hard-code the parsing instead of using a generic function to make the parsing simpler?
Changes to use BitReader more, and use macro for checks to improve readability. https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader.cc File media/base/bit_reader.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader.cc#... media/base/bit_reader.cc:18: bool BitReader::SkipBits(int num_bits) { On 2013/05/20 23:24:24, xhwang wrote: > This seems mostly duplicate of ReadBitsInternal. How about calling > ReadBitsInternal and just drop the result? If num_bits > 64, we can call > ReadBitsInternal multiple times. Implemented Aaron's suggestion instead. https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader.cc#... media/base/bit_reader.cc:19: while (num_remaining_bits_in_curr_byte_ != 0 && num_bits != 0) { On 2013/05/18 01:22:09, acolwell wrote: > You might want to take a stab at implementing this w/o a loop. I think it could > be pretty straight forward if you do the following. > - Use ReadBitsInternal() to byte align. > - Update curr_byte_ and friends to reflect full byte reads. > - Use ReadBitsInternal() again to get any remaining bits. Done. https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader_uni... File media/base/bit_reader_unittest.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/base/bit_reader_uni... media/base/bit_reader_unittest.cc:57: EXPECT_TRUE(reader1.SkipBits(0)); On 2013/05/18 01:22:09, acolwell wrote: > Isn't this supposed to return false since the previous line reads past the end? The original comment for ReadBits states "When return false, the stream will enter a state where further ReadBits/SkipBits operations will always return false unless |num_bits| is 0." Since it already referred to SkipBits (which didn't exist), I figured that was the desired behavior. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:19: ((static_cast<uint8>(a) << 24) | (static_cast<uint8>(b) << 16) | \ On 2013/05/18 01:22:09, acolwell wrote: > nit: Is it always safe to shift a uint8 more than 8 bits? Any reason you can't > cast to uint32 instead? The problem with casting to uint32 is that each value gets sign-extended (and thus mess up the result). The alternative would be to cast to uint32 and then AND each value with 0xff. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:23: static uint32 Read16(const uint8* p) { On 2013/05/18 01:22:09, acolwell wrote: > I think these ReadXXX methods should just be replaced with BitStream reader > calls. I think that would reduce a bunch of the offset calculations in the > methods that use them. The problem with that is that most formats have a frame size in the header (that includes all the optional bits). So I would need to keep track of how many bits have been read in order to advance to the next header start, which I think complicates things too much. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:59: memcmp(buffer, search, search_size) == 0); On 2013/05/20 23:24:24, xhwang wrote: > nit: how about s/search/prefix ? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:63: static int ReadBits(BitReader* reader, int num_bits) { On 2013/05/20 23:24:24, xhwang wrote: > I wonder whether it's always more convenient to wrap BitReader::ReadBits like > this (so that we DCHECK the size and return the read value instead of return > boolean). If so, we probably want to change BitReader::ReadBits to do so. This is used in the MP4 code. They want to fail if the bits aren't in the stream. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:74: #define BYTE_ORDER_MARK "\xef\xbb\xbf" On 2013/05/20 23:24:24, xhwang wrote: > drop extra white spaces? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:82: static const char kHlsSignature[] = "#EXTM3U"; On 2013/05/20 23:24:24, xhwang wrote: > Define consts right before where where it's used. For example, define > kHlsSignature around line 660. Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:98: { 2304, 2508, 3456 }, { 2560, 2768, 3840 }, { 2560, 2770, 3840 }, On 2013/05/20 23:24:24, xhwang wrote: > nit: drop the ',' at EOL? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:109: while (offset + 6 < buffer_size) { On 2013/05/20 23:24:24, xhwang wrote: > It's a bit confusing that we use 6 here but the header is 7 or 9 bytes. I read > the spec and understand why, but can use use 7 here for less confusion? Or add a > comment? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:144: #define AC3_SYNCWORD 0x0b77 On 2013/05/20 23:24:24, xhwang wrote: > const int instead of macro? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:169: if (sample_rate_code == 3) // Reserved. On 2013/05/20 23:24:24, xhwang wrote: > merge the above two lines? sample_rate_code is used below to determine the frame size. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:174: if (frame_size_code >= 38) // Undefined. On 2013/05/20 23:24:24, xhwang wrote: > ditto? frame_size_code is used below to determine the frame size. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:285: // 01.03.01_60/ts_102114v010301p.pdf) On 2013/05/20 23:24:24, xhwang wrote: > here and elsewhere, URL is not subject to 80-char limitation. > http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Line_L... Guide says "that line may be longer than 80 characters". However, I will change it (and see if presubmit allows it). https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:291: BitReader reader(buffer + offset, 11); On 2013/05/20 23:24:24, xhwang wrote: > Again, requiring 12 bytes but only parsing 11 bytes is a bit confusing. Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:440: offset = offset + 33; On 2013/05/20 23:24:24, xhwang wrote: > nit: offset += 33; Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:450: // Checks for a H.261 container. On 2013/05/20 23:24:24, xhwang wrote: > nit: s/a/an Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:465: } On 2013/05/20 23:24:24, xhwang wrote: > Can we move this block into the bit while loop on line 470? It seems duplicate > to have this start code searching code in two places. Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:527: } On 2013/05/20 23:24:24, xhwang wrote: > ditto about duplicate start code searching code. Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:608: } On 2013/05/20 23:24:24, xhwang wrote: > ditto. Also can we make all startcode searching code consistent? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:943: #define VOP_START_CODE 0xb6 On 2013/05/20 23:24:24, xhwang wrote: > static const int? Converted to an enum. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:949: // Expect to see SEQ | VO1 | VOL* | VO2 ... On 2013/05/20 23:24:24, xhwang wrote: > Are those "..." something to finish? Removed. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1153: case TAG('x','m','l',' '): On 2013/05/20 23:24:24, xhwang wrote: > Up to this point all tags are ordered. But the tags after this point aren't. Any > reason for that? This is the order on www.mp4ra.org. However, list adjusted to just have top-level ids in it. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1200: if (atomsize <= 0) On 2013/05/20 23:24:24, xhwang wrote: > Since we read |atomsize| first, can be handle it first before reading > |atomtype|? Not really. If the size is extended, it follows the |atomtype| anyway. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1226: }; On 2013/05/20 23:24:24, xhwang wrote: > style nit: We use UPPER_CASE_0 for enums in chromium. Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1245: static bool ValidMpegAudioFrameHeader(const uint8* header, int* framesize) { On 2013/05/20 23:24:24, xhwang wrote: > It's a bit scary to have a function reading memory without specifying size. Can > we DCHECK or at least comment about this? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1332: // Read a Matroska TAG, updating offset to point past it. On 2013/05/20 23:24:24, xhwang wrote: > I am not sure what a TAG or VTAG type is. Use spec language/terminology so that > it's easier to follow? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1431: #define VC1_FRAME_START_CODE 0x0d On 2013/05/20 23:24:24, xhwang wrote: > static const int? Converted to an enum. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1531: } break; On 2013/05/20 23:24:24, xhwang wrote: > nit: Put break on a separate line. > nit: Here and below, empty line after "break"? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1555: int buffer_size) { On 2013/05/20 23:24:24, xhwang wrote: > s/First4/First4Bytes > > It looks like we need at least 12 bytes for this function. Why it's called > "First4"? Uses the first 4 bytes as an index. Since second4 and third4 are rarely used, they have been removed. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1615: case TAG('\x64','\xa3','\x01','\x00'): On 2013/05/20 23:24:24, xhwang wrote: > s/'\x64'/0x64? Simplified even farther. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1821: static FFmpegContainerName LookupContainerByStringLine(const uint8* buffer, On 2013/05/20 23:24:24, xhwang wrote: > Can we rename this to something like CheckSrt to be consistent with other > CheckXxx functions? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.cc:1832: ptr, remaining, "%d:%2d:%2d%[,.]%3d --> %d:%2d:%2d%[,.]%3d")) { On 2013/05/20 23:24:24, xhwang wrote: > The function StringFormatCompare() looks cool! But I wonder, since we only use > it here for SRT parsing, can we hard-code the parsing instead of using a generic > function to make the parsing simpler? Done. https://codereview.chromium.org/14495010/diff/44001/media/base/container_names.h File media/base/container_names.h (right): https://codereview.chromium.org/14495010/diff/44001/media/base/container_name... media/base/container_names.h:19: enum FFmpegContainerName { On 2013/05/18 01:22:09, acolwell wrote: > nit: Is there any reason to keep the FFmpeg here? There is no dependency on > ffmpeg at this point right? s/FFmpeg/Media/? Done. https://codereview.chromium.org/14495010/diff/44001/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (right): https://codereview.chromium.org/14495010/diff/44001/media/filters/ffmpeg_glue... media/filters/ffmpeg_glue.cc:166: protocol_->GetPosition(&pos); On 2013/05/18 01:22:09, acolwell wrote: > nit: I think you can use avio_context_->opaque to retrieve the protocol pointer. > It might be better to do that instead so protocol_ and avio_context_ can't get > out of sync. > > It might even be worth using avio_tell(), avio_seek(), and avio_read() here so > you don't even have to worry about FFmpegURLProtocol. Done. https://codereview.chromium.org/14495010/diff/44001/tools/metrics/histograms/... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/14495010/diff/44001/tools/metrics/histograms/... tools/metrics/histograms/histograms.xml:2108: <histogram name="Media.DetectedContainer" enum="FFmpegContainers"> On 2013/05/18 01:22:09, acolwell wrote: > nit: Is there any particular reason to tie this to FFmpeg? s/FFmpeg/Media/ ? Done.
lgtm % nits . https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:133: RCHECK(size > 0); nit: Do you really need this? 13 bits shouldn't go negative right? https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:289: BitReader reader(buffer + offset, 11); nit: If you put all the SkipBits() inside RCHECK() and use buffer_size - offset here and all the similar situations elsewhere, then you don't have to track these special constants like 11 here and in the line above. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:681: // Special handling of SOS (start of scan) marker since the entrophy nit: s/entrophy/entropy https://codereview.chromium.org/14495010/diff/56001/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (right): https://codereview.chromium.org/14495010/diff/56001/media/filters/ffmpeg_glue... media/filters/ffmpeg_glue.cc:162: scoped_ptr<uint8[]> buffer(new uint8[8192]); nit: Could std::vector<uint8> work here? If so then you can replace the 8192 below w/ buffer.size() so we only have the constant here once.
This CL looks much cleaner now. Thanks! lgtm % a few comments/nits. https://codereview.chromium.org/14495010/diff/56001/media/base/bit_reader.cc File media/base/bit_reader.cc (right): https://codereview.chromium.org/14495010/diff/56001/media/base/bit_reader.cc#... media/base/bit_reader.cc:26: UpdateCurrByte(); nit: can we skip multiple bytes in one operation instead of skip one byte for multiple times? https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:71: static int ReadBits(BitReader* reader, int num_bits) { s/int/uint32 ? Otherwise could it overflow when num_bits == 32? https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:80: static uint64 ReadBits64(BitReader* reader, int num_bits) { I am not a fan of this new name. It actually means "ReadUpTo64Bits". Since BitReader actually uses uint64 internally regardless of the type of the integer to be written to. Can we just use uint64 everywhere to make things simpler? https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:133: RCHECK(size > 0); On 2013/05/22 21:03:14, acolwell wrote: > nit: Do you really need this? 13 bits shouldn't go negative right? I wonder should the result of ReadBits() ever be negative? https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:787: } nit: curly braces not necessary/required in this case. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:800: return (offset + 14 > buffer_size || Read32(buffer + offset) == 0x000001b9); In which case can "offset + 14 > buffer_size" be true? https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:853: } nit: curly braces not required now. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:981: case TAG('b','l','o','c'): I <3 this change! https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:1182: ; How about while (foo) {} so we explicitly say we do nothing in the while loop. See: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Horizo... https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:1675: return CONTAINER_SRT; nit: is there a reason for the ordering of all checking functions? Why can't we put CheckSrt() and CheckGsm after line 1660?
Thanks for the reviews. Updating for the nits mentioned. https://codereview.chromium.org/14495010/diff/56001/media/base/bit_reader.cc File media/base/bit_reader.cc (right): https://codereview.chromium.org/14495010/diff/56001/media/base/bit_reader.cc#... media/base/bit_reader.cc:26: UpdateCurrByte(); On 2013/05/22 22:18:32, xhwang wrote: > nit: can we skip multiple bytes in one operation instead of skip one byte for > multiple times? I added a DLOG message to print a warning if large skips are encountered, so that the code could be optimized. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... File media/base/container_names.cc (right): https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:71: static int ReadBits(BitReader* reader, int num_bits) { On 2013/05/22 22:18:32, xhwang wrote: > s/int/uint32 ? Otherwise could it overflow when num_bits == 32? Now returns uint64 (and removed ReadBits64). https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:80: static uint64 ReadBits64(BitReader* reader, int num_bits) { On 2013/05/22 22:18:32, xhwang wrote: > I am not a fan of this new name. It actually means "ReadUpTo64Bits". Since > BitReader actually uses uint64 internally regardless of the type of the integer > to be written to. Can we just use uint64 everywhere to make things simpler? Done. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:133: RCHECK(size > 0); On 2013/05/22 21:03:14, acolwell wrote: > nit: Do you really need this? 13 bits shouldn't go negative right? True. But I want to avoid an infinite loop in case this is random data that looks good up till here. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:289: BitReader reader(buffer + offset, 11); On 2013/05/22 21:03:14, acolwell wrote: > nit: If you put all the SkipBits() inside RCHECK() and use buffer_size - offset > here and all the similar situations elsewhere, then you don't have to track > these special constants like 11 here and in the line above. Since the code only processes a portion of the file, the code is written this way so that if there happens to be a header that is truncated at the end of the buffer, it gets ignored. Otherwise it would consistently fail when it discovers a truncated header. I could use buffer_size - offset in the constructor to BitReader, but I wanted the code to verify that the header size is correct (as testing should cause errors if it too small). https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:681: // Special handling of SOS (start of scan) marker since the entrophy On 2013/05/22 21:03:14, acolwell wrote: > nit: s/entrophy/entropy Done. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:787: } On 2013/05/22 22:18:32, xhwang wrote: > nit: curly braces not necessary/required in this case. Done. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:800: return (offset + 14 > buffer_size || Read32(buffer + offset) == 0x000001b9); On 2013/05/22 22:18:32, xhwang wrote: > In which case can "offset + 14 > buffer_size" be true? Changed. It will always be the case that getting here means the buffer is used up (or at least < 14 bytes remaining). So it is unlikely that the 0x1b9 will ever be seen. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:853: } On 2013/05/22 22:18:32, xhwang wrote: > nit: curly braces not required now. Done. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:1182: ; On 2013/05/22 22:18:32, xhwang wrote: > How about > > while (foo) {} > > so we explicitly say we do nothing in the while loop. > > See: > http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Horizo... Done. https://codereview.chromium.org/14495010/diff/56001/media/base/container_name... media/base/container_names.cc:1675: return CONTAINER_SRT; On 2013/05/22 22:18:32, xhwang wrote: > nit: is there a reason for the ordering of all checking functions? Why can't we > put CheckSrt() and CheckGsm after line 1660? Done. https://codereview.chromium.org/14495010/diff/56001/media/filters/ffmpeg_glue.cc File media/filters/ffmpeg_glue.cc (right): https://codereview.chromium.org/14495010/diff/56001/media/filters/ffmpeg_glue... media/filters/ffmpeg_glue.cc:162: scoped_ptr<uint8[]> buffer(new uint8[8192]); On 2013/05/22 21:03:14, acolwell wrote: > nit: Could std::vector<uint8> work here? If so then you can replace the 8192 > below w/ buffer.size() so we only have the constant here once. Done.
Final update.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/jrummell@chromium.org/14495010/78001
Retried try job too often on chromium_presubmit for step(s) presubmit http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=chromium_p...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/jrummell@chromium.org/14495010/84001
Sorry for I got bad news for ya. Compile failed with a clobber build on linux_chromeos_clang. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=linux_chro... Your code is likely broken or HEAD is junk. Please ensure your code is not broken then alert the build sheriffs. Look at the try server FAQ for more details.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/jrummell@chromium.org/14495010/94002
Sorry for I got bad news for ya. Compile failed with a clobber build on android_clang_dbg. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=android_cl... Your code is likely broken or HEAD is junk. Please ensure your code is not broken then alert the build sheriffs. Look at the try server FAQ for more details.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/jrummell@chromium.org/14495010/118001
Retried try job too often on android_dbg for step(s) slave_steps http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=android_db...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/jrummell@chromium.org/14495010/56003
Retried try job too often on win_rel for step(s) media_unittests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_rel&nu...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/jrummell@chromium.org/14495010/160001
Message was sent while issue was closed.
Change committed as 202275 |