|
|
Created:
5 years, 1 month ago by johngro Modified:
5 years, 1 month ago CC:
mojo-reviews_chromium.org, gregsimon, qsr+mojo_chromium.org, viettrungluu+watch_chromium.org, abarth-chromium, Aaron Boodman, darin (slow to review), ben+mojo_chromium.org Base URL:
https://github.com/domokit/mojo.git@change4 Target Ref:
refs/heads/master Project:
mojo Visibility:
Public. |
DescriptionAdd an initial revision of an audio server.
Add a functional skeleton for a Motown audio server. Currently, the server
allows clients to create and configure AudioTracks, and push packets of audio
information to the tracks. The mixer will mix for multiple outputs, and there
is a many to many relationship between tracks and outputs. Right now, there is
only a single, generic, audio output implementation whose job in life is to
proved backpressure for audio track pipelines when no other outputs are present.
R=jeffbrown@google.com, jamesr@chromium.org
BUG=
Committed: https://chromium.googlesource.com/external/mojo/+/1cff1d372574761491c915137cab8928036b5deb
Patch Set 1 #Patch Set 2 : #Patch Set 3 : #Patch Set 4 : refactor MixerKernel into a class to prepare for the addition of a linear interpolation sampler #
Total comments: 124
Patch Set 5 : #Patch Set 6 : #Patch Set 7 : final rebase and trybot preflight #Patch Set 8 : fix issues discovered with initial preflight #Messages
Total messages: 11 (2 generated)
Description was changed from ========== Add an initial revision of an audio server. Add a functional skeleton for a Motown audio server. Currently, the server allows clients to create and configure AudioTracks, and push packets of audio information to the tracks. The mixer will mix for multiple outputs, and there is a many to many relationship between tracks and outputs. Right now, there is only a single, generic, audio output implementation whose job in life is to proved backpressure for audio track pipelines when no other outputs are present. R=jamesr@chromium.org, jeffbrown@google.com BUG= ========== to ========== Add an initial revision of an audio server. Add a functional skeleton for a Motown audio server. Currently, the server allows clients to create and configure AudioTracks, and push packets of audio information to the tracks. The mixer will mix for multiple outputs, and there is a many to many relationship between tracks and outputs. Right now, there is only a single, generic, audio output implementation whose job in life is to proved backpressure for audio track pipelines when no other outputs are present. R=jamesr@chromium.org, jeffbrown@google.com BUG= ==========
johngro@google.com changed reviewers: + dalesat@chromium.org
https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; I thought we were only going to support push? https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:58: // kInvalidArgs: close the connection https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:63: // kUnsupportedConfig: if the client was able to determine this ahead of time (by looking at the descriptor) then we can treat this as invalid arguments and close the connection https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:73: // kBadState: This suggests coupling the creation of an audio track with its configuration. I think it might also be reasonable to say that an audio track can never be reconfigured. If you need a new configuration you can just create a new audio track. Audio tracks are fungible. https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:88: GetRateControl(RateControl& rate_control) => (MediaResult result); I can't think of a meaningful error to report here. Note that it's not possible for one end of message pipe to be bound the same thing twice because it is only ever moved and never duplicated. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_output.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.cc:43: DCHECK(this == link->GetOutput().get()); Not sure I understand why this has to happen inside the lock given that the InitializeLink method manipulates the link outside of the lock. Would it make sense for the initialization to happen on the worker thread too? We could say that everything to do with processing and link management is effectively asynchronous. That would make it easier to understand what state gets manipulated where. (Does processing touch the link?) https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.cc:92: // to schedule callbacks. There's a race condition here given that shutdown may be initiated both by the processing function and by the output manager. When the output manager initiates it, it's possible the processing function is still running and may try to schedule another callback. This is one reason why it might be a good idea for shutdown to be more directed. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.cc:191: processing_lock_.Release(); Obviously this blocks the main thread but I have to wonder whether that's strictly necessary. If the processing state is fully decoupled from the output object itself then there might not be any harm in letting it run to completion and kill itself asynchronously. We would only need to update the bookkeeping here. This assumes there isn't any other shared state to worry about (and it might be better if there weren't). https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_output.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:31: // by this outputs. Called only from the main message loop. Obtains the outputs -> output https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:32: // processing_lock and may block for the time it takes the derrived class to derrived -> derived https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:34: // called. In general it's safer to avoid immediately calling back into object here. Are there any downsides to simply scheduling the task to be run at the next available opportunity? https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:53: virtual MediaResult Init(); Consider using a convention such as OnInit() to distinguish methods which are intended to be overridden by implementors from others which are not. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:65: // Called from within the context of the processing lock any time a scheduled Technically you don't need a lock to guarantee serialization. Does this run on the main thread too or on a worker? If it's on a worker, you might want to make the subclass set the processing callback in some clearly distinguishable way, such as by passing a closure to a function to register or schedule it. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:86: // from the within a processing callback, the processing_lock will always be This here suggests that these methods should be provided as part of a context object to the processing callback rather than mixed in here alongside everything else. ie. The code could invoke the processing function and pass an object with the relevant state and functions. Process(ProcessingContext* context) https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:115: // priorities will be the trick). Could just use a heap. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:138: // Called from Shutdown (main message loop) and ShutdowSelf (processing I think it's a little weird for there to be two different sources of shutdown events (the processing callback and the output manager). It runs the risk of the output getting into states that the output manager doesn't expect. Consider providing some kind of callback for the output to tell the output manager that it's in a bad state and needs to be shut down so that the output manager is always the one driving the shutdown and can stay in sync with everything. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:153: base::Lock processing_lock_; We won't need this lock if the processing state is encapsulated elsewhere. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:154: base::Lock shutdown_lock_; We won't need this lock is shutdown is always initiated by the output manager on the main thread. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_output_manager.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output_manager.cc:69: // TODO(johngro): Probably should log something about this, assuming that Another approach would be to handle this symmetrically with other kinds of asynchronous self-shutdown of the output. We could assume Init always succeeds here and wait for a callback to tell us otherwise. Bonus of that is that the output could actually do its real hardcore initialization in the background on some other thread. Useful in case it's got to open up and configure audio devices which could take a *very* long time. (USB Audio anyone?) https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output_manager.cc:93: // but it should not take long. Famous last words. :) https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_output_manager.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output_manager.h:44: // Very Seriously Wrong. Instead of this comment, maybe just make it blow up. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output_manager.h:92: scoped_refptr<base::SequencedWorkerPool> thread_pool_; Alternately if the number of audio outputs is small then perhaps it would be ok to give each output its own dedicated thread. That's kind of nice from the perspective of treating audio outputs as autonomous agents so that init and shutdown can also be managed asynchronously just like processing work. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_pipe.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:32: server_->SchedulePacketCleanup(std::move(state_)); Destructor side-effects like these make be a little nervous, but ok. Assuming this is RAII-ish. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:61: std::vector<AudioPacketRef::Region> regions; I was slightly surprised to see you allocating on the heap here. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:68: state->SetResult(MediaResult::INVALID_ARGUMENT); can we just close the pipe? https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:80: static_cast<const uint8_t*>(buffer()) + (*region)->offset, Make sure to check that the offset is within the buffer and is properly aligned too. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:117: std::move(regions), So this object contains a pointer to an mmap'd region of the media pipe buffer. Are we guaranteed that the pointer will be discarded before the buffer is unmapped? I have the feeling this buffer is going to flow deep into the system and be sent to worker threads which are operating asynchronously. Consider including a reference counted reference to the buffer along the way so we don't screw it up. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_pipe.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.h:42: // of 0. Might be good to define a struct for manipulating fixed-point values of this type. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_server_impl.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_server_impl.h:67: scoped_refptr<base::SingleThreadTaskRunner> task_runner_; Do we need to reference count this? Can we ensure everything that uses it gets shut down before we destroy it? It seems to me that if anyone is handing onto the task runner longer than they should then they're possibly holding onto other things they shouldn't too. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_track_impl.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:109: // Are we already configured? Here's a place where it would be nice to just configure at the same time the track is created. Reduces the state space. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:118: cbk.Run(MediaResult::UNSUPPORTED_CONFIG); Assuming clients should have known better before asking for this, just close the connection. I presume clients will need to know a-priori what kinds of configs are supported so if they got this far then they already screwed up. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:144: int32_t N = static_cast<int32_t>(configuration->audio_frame_ratio); numerator / denominator, or num / denom Also, seeing int32_t and uin32_t used together for these fractions makes me worry that the wrong types might get used someplace (possibly in the client if not here). We spoke earlier of using a separate bool to encode whether playback is forward or reversed. I think that might be worth exploring further in future patches. (Or make both the numerator and denominator signed at the cost of 1 bit of precision.) https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:204: format_ = cfg->Clone(); Looks to me like the config is just going to be dropped on the floor when we return. So we might as well just Pass() it and take ownership. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:252: // something about this? DCHECK! We failed an internal invariant. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:256: void AudioTrackImpl::OnPacketReceived(AudioPipe::AudioPacketRefPtr packet) { Do we need to DCHECK the packet? https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:265: DCHECK(output); I feel like this DCHECK is kind of redundant given we're already enforcing invariants on what goes into the outputs list, but it's not harmful. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_track_impl.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.h:28: static constexpr size_t PTS_FRACTIONAL_BITS = 12; Maybe make a new fixed point datatype instead. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.h:35: void Describe(const DescribeCallback& cbk) override; These should be private. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.h:68: AudioTrackImplWeakPtr weak_this_; Using lots of typedefs makes it harder to understand the semantics of these values without having to grep the header files. Is this a WeakPtr? I'm not sure. You might want to tone it down a bit before this turns into Hungarian notation. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.h:70: Binding<AudioTrack> binding_; alignment is a little wonky here More importantly, I don't think git cl format will let you get away with this... https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_track_to_output_link.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_to_output_link.cc:53: // queued in. FWIW, we might actually find it worthwhile to push all packets through the audio output itself as usual even when flushed. Just mark them as flushed so that they can skip the main processing stages. That way there will be one consistent path by which packets are queued, processed, and retired. I've found this kind of pattern useful for other pipelines and it's no more expensive in the end. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_track_to_output_link.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_to_output_link.h:34: // system. No assumptions may be made about threading when destructing. This could be amended by making the cleanup signal come from only one place. Might be worth exploring in a future patch. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/fw... File services/media/audio/fwd_decls.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/fw... services/media/audio/fwd_decls.h:21: using AudioOutputPtr = std::shared_ptr<AudioOutput>; Are we actually taking advantage of the fact that these pointer are reference counted or do we only need weak semantics? If the latter, consider using base::WeakPtr perhaps. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixer.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixer.cc:19: if (!src_format) { return nullptr; } Seems worthy of a DCHECK. Otherwise the caller has to deal with nullptr which might be worse. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixer.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixer.h:32: static MixerPtr Select(const LpcmMediaTypeDetailsPtr& src_format, This is looking much cleaner than the first patch. Nice. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixer.h:65: // How much to increment the fractional sampling position for each output Is this actually a fraction? Like a fixed point value of some kind? https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixers/mixer_utils.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:19: // Template to read samples and normalize them into signed 32 bit integers. What you're not saying in this comment is that those signed 32 bit integers actually contain 16 bit samples. That was a little surprising to me. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:30: register SType tmp = static_cast<const SType*>(src)[0]; *src instead of src[0]? I don't see why you need the static cast here given that the argument is already of the same type. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:43: return static_cast<int32_t>(static_cast<const SType*>(src)[0]); Why isn't this good enough? (Of course even the cast is redundant here.) return static_cast<int32_t>(*src); https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:47: // Template to read normalized source samples, and combine channels if required. Doesn't this need a downmix matrix in the general case? https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:52: class SrcReader; SrcReader -> SampleReader? https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:85: template <typename DType, typename Enable = void> class DstConverter; DstConverter -> SampleWriter? https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:120: DoAccumulate == false, Instead of testing for DoAccumulate == false maybe have the caller treat accumulation as a special case, where the initial value of "sample" is initialized to the normalized destination sample. Otherwise I don't see what this particular template is buying you. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:135: static inline int32_t Mix(const DType* dst, int32_t sample) { I'll admit to being deeply confused by this function. It has a parameter call dst that we are reading from. Then it converts the summed output into a value of the destination type and widens it to int32_t. So the input is normalized but the output is in the destination space. Kind of weird... The clamping seems to be done on a sample-by-sample basis but we should probably do this at the very end after looping over all samples to be accumulated. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixers/no_op.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/no_op.cc:28: frames_produced = (dst_frames - *dst_offset); The mixer functions themselves might be a little simpler if the caller did all of this clamping for them. After all, it should know whether an overflow will happen. Although I could see some argument for letting the mixer calculate it as it goes since we'll probably determine these results while the mixer's loop executes too (at least for a non-noop mixer). https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/no_op.cc:34: return (*frac_src_offset >= frac_src_frames); I don't see why we need to return this value since it's something the caller could determine directly. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixers/point_sampler.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/point_sampler.cc:72: uint32_t S = (soff >> AudioTrackImpl::PTS_FRACTIONAL_BITS) * SChCount; variables shouldn't be uppercase https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/point_sampler.cc:75: for (size_t D = 0; D < DChCount; ++D) { We could handle all of the specialization for accumulation at this level instead of burying it down in DM::Mix. We know the function just turns into reading from Dest and doing an extra mix. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/standard_output_base.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/standard_output_base.cc:137: for (auto iter = links_.begin(); iter != links_.end(); ) { I'm actually kind of surprised that you designed the mixer with an outer loop over the tracks instead of over the outputs. This means as we go through the tracks, we have to accumulate all of the samples into the destination buffer which may cause a loss of precision and bad clipping artifacts. With that in mind, perhaps we should only support accumulation buffers that have a normalized representation (say, int32_t), and convert them into the destination representation at the very last moment. That should make your Mix functions a little simpler since they won't have to deal with converting the accumulated samples to a destination format immediately. They can do all of the math in the normalized domain. Another alternative might be to try to exchange the order of the loops but I suspect that the bookkeeping involved in doing that efficiently might be a little much! https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/standard_output_base.cc:304: bool consumed_source = info->mixer->Mix(buf, As designed, we're going to have problems replacing the point sampler with one that does interpolation. An interpolating sampler may need access to data that comes from multiple regions within the packet (or from prior packets depending on how the audio got sliced up). In order words, the sampler will need access to a window of samples within the track, not just one. To work around this problem, consider passing the mix function a random access iterator for accessing the Nth sample of the input rather than a pointer to it. You'll have to define what the window size is upfront though. The window might need to depend on properties of the timeline transformation. Alternately, it's possible that copying a moving window's worth of the input into a linear buffer will actually end up being faster than trying to process it in small chunks when crossing region boundaries. (Best of both worlds, assuming the windows are small compared to typical regions so we don't have to deal with overlap very often.)
https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/04 23:43:32, jeffbrown wrote: > I thought we were only going to support push? copied directly from Dale's design doc. Let's bring this up when we get together to talk about overall MediaPipe design stuff. I'll leave a comment here and add the topic to my design todo list. https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:58: // kInvalidArgs: On 2015/11/04 23:43:32, jeffbrown wrote: > close the connection Acknowledged. FWIW - I'm putting these all in my close the post V0 close-the-connection bucket. From here on out, I'll be acking these comments, and adding them to my list of cleanup tasks, but I'm not going to change them in this CL. Once this is in, I'll file a bug and get to work cleaning this up. https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:63: // kUnsupportedConfig: On 2015/11/04 23:43:32, jeffbrown wrote: > if the client was able to determine this ahead of time (by looking at the > descriptor) then we can treat this as invalid arguments and close the connection Acknowledged. https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:73: // kBadState: On 2015/11/04 23:43:32, jeffbrown wrote: > This suggests coupling the creation of an audio track with its configuration. I > think it might also be reasonable to say that an audio track can never be > reconfigured. If you need a new configuration you can just create a new audio > track. Audio tracks are fungible. Acknowledged. Agreed, I do not think we currently have a plans to reconfigure audio pipes on the fly. Per our face to face discussion, you currently need to have a track interface in order to query the server's track capabilities. I'm going to add this to the design discussion list, but assuming that we all agree, I will come back later and hoist track capability enumeration up to the server interface level, and make configuration part of CreateTrack (eliminating this method in the process). https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:88: GetRateControl(RateControl& rate_control) => (MediaResult result); On 2015/11/04 23:43:32, jeffbrown wrote: > I can't think of a meaningful error to report here. > > Note that it's not possible for one end of message pipe to be bound the same > thing twice because it is only ever moved and never duplicated. It is possible for anyone with access to the AudioTrack interface instance to create a new interface request (which gives them a new unbound message pipe endpoint) and call this method again. The rate control interface facet exposed by audio tracks currently supports only 1-to-1 relationships, not one to many, so it will refuse to bind to a second message pipe (if asked to do so) and return BadState in response. In theory, there might be some merit in supporting a 1-to-many for the rate control interface, but I cannot come up with a practical use case yet, so it is limited to 1-1 for the moment. I'm going to lump this into the close-connection bucket. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_output.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.cc:43: DCHECK(this == link->GetOutput().get()); On 2015/11/04 23:43:32, jeffbrown wrote: > Not sure I understand why this has to happen inside the lock given that the > InitializeLink method manipulates the link outside of the lock. > > Would it make sense for the initialization to happen on the worker thread too? > We could say that everything to do with processing and link management is > effectively asynchronous. That would make it easier to understand what state > gets manipulated where. (Does processing touch the link?) Link initialization happens outside of the scope of the processing lock in the context of the message loop thread. For a base AudioOutput, this is a no-op. For derived classes, it is an opportunity to perform tasks involving initialization of link specific state (such as selecting mixer kernels and initialization of filter state) without holding up the output's mixing tasks in any way. As described in earlier feedback, the processing lock serves two purposes. In this context, it serving to protect the links_ collection which must be held static during a processing callback. As mentioned earlier, the scope of this protection could be further limited, but would require the introduction of a separate lock which would need to be acquired and released in the more limited scope by the derived class. Re: Does processing touch the link? Yes, but only after the link has been added to the set of active links. This is why initialization and teardown happen outside of the scope of the processing lock, and only when we can prove that the processing scope can not be holding any references to the link object itself. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.cc:92: // to schedule callbacks. On 2015/11/04 23:43:32, jeffbrown wrote: > There's a race condition here given that shutdown may be initiated both by the > processing function and by the output manager. When the output manager > initiates it, it's possible the processing function is still running and may try > to schedule another callback. > > This is one reason why it might be a good idea for shutdown to be more directed. No, I do not think that there is. I took care to ensure that there was not. Please refer to my extensive response in the header file feed back to see why there is no race here. If you disagree with my analysis (I cannot bring myself to refer to it as a "proof") given there, please provide me the hypothetical sequence of actions which would lead to the thread-safety violation. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.cc:191: processing_lock_.Release(); On 2015/11/04 23:43:32, jeffbrown wrote: > Obviously this blocks the main thread but I have to wonder whether that's > strictly necessary. > > If the processing state is fully decoupled from the output object itself then > there might not be any harm in letting it run to completion and kill itself > asynchronously. We would only need to update the bookkeeping here. > > This assumes there isn't any other shared state to worry about (and it might be > better if there weren't). I believe that I have already address this in responses given previously. Processing state cannot be completely decoupled from the output object as it is a function of the track/output relationship. Please review my earlier analysis. If you still disagree, lets get together and work this out face to face; for complex interactions like this, I feel that coming to a mutual understanding is more efficiently done in the context of an interactive dialog instead of via email/code-review with its high latency. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_output.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:31: // by this outputs. Called only from the main message loop. Obtains the On 2015/11/04 23:43:32, jeffbrown wrote: > outputs -> output Done. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:32: // processing_lock and may block for the time it takes the derrived class to On 2015/11/04 23:43:33, jeffbrown wrote: > derrived -> derived Done. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:34: // called. On 2015/11/04 23:43:32, jeffbrown wrote: > In general it's safer to avoid immediately calling back into object here. Are > there any downsides to simply scheduling the task to be run at the next > available opportunity? I'm not sure that I understand the question. As the comment mentions, things which call these methods are always scheduled to run on the main message loop thread; IOW - if an output want to shut itself down, for whatever reason, the act of unlinking will be scheduled on the message loop thread, and "run at the next available opportunity". If an output is being added or removed because of an action taken on the server side of thing, it is already in the context of the message loop thread, and this is the "next available opportunity". https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:53: virtual MediaResult Init(); On 2015/11/04 23:43:32, jeffbrown wrote: > Consider using a convention such as OnInit() to distinguish methods which are > intended to be overridden by implementors from others which are not. If I had not intended for the method to be overridden, I would not have marked it virtual, nor would I have added the helpful comment telling people that they should override this method and do particular things here. Furthermore, if this was a method which was meant to be overridden, and I was overriding it in a derived class, and I didn't want subsequent derivations to override the method, I would mark it "final" instead of marking it as "override". Finally, if I wanted to added a non-virtual method, and force the client to never have a method with the same signature, I could also mark it as "final". This has the unfortunate side effect of forcing the method to become virtual, and non-overridable. I probably would not bother to do this because it.. 1) costs something. 2) protects nothing. Derived classes can always add methods with whatever name they want, but someone with a base class pointer cannot accidentally call them without first casting to the derived class. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:65: // Called from within the context of the processing lock any time a scheduled On 2015/11/04 23:43:32, jeffbrown wrote: > Technically you don't need a lock to guarantee serialization. > > Does this run on the main thread too or on a worker? If it's on a worker, you > might want to make the subclass set the processing callback in some clearly > distinguishable way, such as by passing a closure to a function to register or > schedule it. The lock is not being used to guarantee serialization. Serialization is guaranteed by the base::SequencedTaskRunner which acts as the thread pool and imposes the message driven architecture encouraged by the mojo folks. The lock is used for the following two purposes... 1) Protecting the client/link set. 2) Synchronizing with callbacks in-flight during shutdown. re #1 Right now, the client set is held static for the duration of a "process" operation. This scope can be tightened (by adding a specific lock to protect the set), but would shift the burden of properly protecting the link set to the derived classes which actually process their links during the processing callback. re #2 When final cleanup is happening in the context of the main message loop thread, we need to reach a point in time where we have a strong guarantee that any callbacks in flight have either been safely canceled, or run to completion. A classic pattern is implemented here... The guy doing the shutdown (main message loop) begins by flagging the TaskRunner to shutdown an cancel all callbacks, then sets a volatile flag in the output indicating that any callback which had already launched should fast-abort. Then, it attempts to enter the callback lock. On the other side of things, A callback in flight begins by promoting its weak pointer to the output object, and then entering the processing lock and fast aborting if the shutdown flag is set. This flag is made available to derived classes so that they may also abort their computation early if a shutdown has been requested. Back on the message loop thread, if an output process task has made it past all of these checks, it will end up blocking until the output process task notices the shutdown flag, or terminates normally. After it has successfully entered the lock, it has achieved its strong guarantee that all in-flight callbacks have been effectively canceled, or run to completion. A couple more small points. + Memory safety from the processing callback perspective is ensured by the weak pointer. If a callback is in flight, and the task runner shutdown does not manage to cancel the task before a thread picks it up and is running, and the message loop side of things gets all of the way to the point where it has destroyed the object, the weak pointer promotion will fail and the callback will do nothing. + OTOH, if the process callback successfully promotes the weak pointer, and then the message loop runs to the point of releasing its reference before the processing callback takes any other steps, the process callback will enter an uncontested lock, notice the shutdown flag and immediately exit the lock having done nothing. When the promoted weak pointer goes out of scope, the output's memory will be returned to the heap. No complex interactions take place in the destructor of an audio output, just freeing memory. All freeing of driver specific resources happened in the context of the message loop thread during the shutdown process after the processing callbacks had been successfully neutralized. Re: thread context. It runs strictly within the scope of on of the threads which comes from the base::SequencedTaskManager thread pool. This base:: object ensures that for any given output, all scheduled callbacks are serialized (although, not guaranteed to be the same thread between successive invocations). I will update the comment to attempt to make this more clear. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:86: // from the within a processing callback, the processing_lock will always be On 2015/11/04 23:43:32, jeffbrown wrote: > This here suggests that these methods should be provided as part of a context > object to the processing callback rather than mixed in here alongside everything > else. > > ie. The code could invoke the processing function and pass an object with the > relevant state and functions. > > Process(ProcessingContext* context) I don't understand. The context, in this case, is the base object itself. Actions need to be taken to schedule the next callback, and to begin the process of deferring shutdown to the message loop thread which require the private state of the base object itself. Why would it be advantageous to add a layer of indirection between the derived object and its base class? https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:115: // priorities will be the trick). On 2015/11/04 23:43:33, jeffbrown wrote: > Could just use a heap. sure, I pretty much already am; right now I am using a std::set which is probably a red/black tree internally. I could either overload the less operator used to sort the set, or I could just use a std::map (I would prefer the former, but need to think about it some). What I am really concerned about here (and expressed in previous versions of this comment, but somehow lost by the time it got here) is optimizing for indexing this set two ways. There is a short list of operations which need to be optimized here, IMO. 1) Traversing the list in priority order. 2) Adding a track/link to the set. 3) Removing a track/link from the set. 4) Altering the priority of a track which is in the set. re #1: This is the primary operation as it will happen far more often than other ops. It must be O(k). Indexing the set based on priority is the obvious solution. re #2: This is an insertion operation based on the priority index. As the insertion into the underlying std::set/std::map is guaranteed to be O(log2), we should be good here simply by indexing by priority. (most implementations I have every looked at of set/map use a red-black tree under the hood, so are guaranteed to be 2*log(N) at worst) re #3: Removing a track from the set involves looking it up by id/pointer/handle/whatever. Right now, this is O(log2) because my current set is indexed by pointer. If I change that to be indexed by link priority, then removal becomes O(N). This is undesirable, not because I am overly concerned with the client (it is probably shutting down) but because it need to hold the link set static for the duration of the lookup and removal operation which means that if we want to perform a mix operation, we may have to wait for the O(N) operation to complete. re #4: This is basically an atomic remove-then-insert operation. In the outline above this would be O(N). If #3 can be optimized, it becomes O(log2) So; there are a few different ways I could optimize #3. One way would be to index the set two different ways. Given that std::set/map do not support this, you would need to accomplish this by making the object a member of two different set-like objects. In order to make this an improvement over what is described here, you need to be able to get to the object in Index #2 after looking it up in Index #1 in O(k) time (and vice-versa). The best way I know to do this is to use intrusive containers, and make the set bookkeeping for the indices part of the object itself. std:: containers do not support intrusive semantics. boost containers do, but are clearly not allowed here. Rolling my own is unappealing, but if I had to resort to such a thing, I would want to make it part of the overall project, not just this. Another option is what I am attempting to describe above. If I can guarantee uniqueness of priority, the set could be indexed based on priority alone and we could leave it at that. How the uniqueness is determined is the tricky part. One option would be to build a key which is made of the priority in the high order bits, and the this pointer of the link in the low order bits. This would end up being a 96 bit object and would suffer some lookup performance hit as a result. Another way to do it would be to use the sequence number I mention above. Instead of the pointer, use a 32 bit sequence number generated at link instantiate time appended to the priority in order to uniquify the key. This has the advantage to keeping our key to 64 bits, but has the disadvantage of making sequence number generation complex. Sequence numbers must either be assumed to be unique (likely to be OK, but strictly speaking not OK), or they need to be tested for uniqueness. Testing them using this container alone is technically O(N); a separate set of active sequence numbers would need to be introduced to guarantee no collisions in O(log2) during generation. Additional complexity, but performed entirely on the message loop thread, so no locks are required. If the message loop ever becomes multi-threaded, the scope of lock contention is limited to the input domain and does not affect the output domain. TL;DR? Yeah, I have thought about this some and there are some fine points which deserve consideration. I'm going to wait until we implement the priority feature before implementing a solution. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:138: // Called from Shutdown (main message loop) and ShutdowSelf (processing On 2015/11/04 23:43:32, jeffbrown wrote: > I think it's a little weird for there to be two different sources of shutdown > events (the processing callback and the output manager). It runs the risk of > the output getting into states that the output manager doesn't expect. > > Consider providing some kind of callback for the output to tell the output > manager that it's in a bad state and needs to be shut down so that the output > manager is always the one driving the shutdown and can stay in sync with > everything. That is how things handled now. Shutdown self posts a message to the main message loop (executed by the output manager) which actually performs the shutdown through the central Shutdown choke-point. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:153: base::Lock processing_lock_; On 2015/11/04 23:43:33, jeffbrown wrote: > We won't need this lock if the processing state is encapsulated elsewhere. I disagree. See my comments regarding synchronization above. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output.h:154: base::Lock shutdown_lock_; On 2015/11/04 23:43:32, jeffbrown wrote: > We won't need this lock is shutdown is always initiated by the output manager on > the main thread. Unfortunately, I think it is an important requirement that shutdown be initiated from either side of the threading-fence. There are many very good reasons that an output may need to shut down in a timely fashion due to some unrecoverable error situation. Note: shutdown is not executed in the context of the thread-pool thread, it is simply scheduled to execute on the main message loop thread via BeginShutdown. The shutdown_lock_ is used to make this operation thread safe in the unlikely case that both side of the system decide that they want to shut down an output at the same time. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_output_manager.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output_manager.cc:69: // TODO(johngro): Probably should log something about this, assuming that On 2015/11/04 23:43:33, jeffbrown wrote: > Another approach would be to handle this symmetrically with other kinds of > asynchronous self-shutdown of the output. We could assume Init always succeeds > here and wait for a callback to tell us otherwise. > > Bonus of that is that the output could actually do its real hardcore > initialization in the background on some other thread. Useful in case it's got > to open up and configure audio devices which could take a *very* long time. > (USB Audio anyone?) Perhaps, but I do not feel that there is ever a good solution to badly behaved device drivers. Currently, pushing this task into the scope of an "output thread context" (loosely defined term here) it not a solution to the problem as the output threading context is done withing the context of a thread pool, and badly behaved drivers will just end up pissing the the thread pool and potentially damaging other outputs because of their bad behavior. A thread-per-output is not a universal solution to this problem either. At some point, the main system needs to synchronize with the state of the output, whether the output owns a dedicated thread or not. The dependencies mentioned above can be broken down to the point where we require a thread per output, or dedicated init/shutdown threads, and impose an architecture where all interactions with the driver/os happen on the dedicated thread. We are almost already there with this design, but badly behaved drivers can still DoS our service or the whole system even in this hypothetical design (examples available upon request). Fundamentally, hardware and drivers are highly privileged. I feel that our first line of defense should be a solid OS (see reasons for FNL), high levels of compliance testing (see Android CTS, Apple's monopoly on hardware/kernel/drivers, Microsoft's WHQL compliance testing), and draconian consequences in the driver model for breaking the rules (see WDM/BSoD, and... well... I guess Linux and therefor Android has no such thing... but hopefully FNL will be better). DIRQ_LEVEL_NOT_LESS_THAN_OR_EQUAL anyone? Seriously though, this is just a start. We will refine as time goes on and we all have a better idea of what guarantees the HAL/OSAL can provide, and what practical mitigations we can implement of the underlying OS is less than we had hoped for. At the end of the day, the OS/Drivers must be solid. If they cannot be, then it really does not matter what we do here. Windows will BSoD if a driver shuts off interrupts and deadlocks (or makes any attempt to acquire any lock for that matter). Linux will not, and Android will always be vulnerable to this as a result. It is upon us to ensure that FNL is built to a higher standard than what Linux currently offers. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output_manager.cc:93: // but it should not take long. On 2015/11/04 23:43:33, jeffbrown wrote: > Famous last words. :) See above. Hopefully not, but we will adapt if they prove to be incorrect. I have already been instructed not to concern myself too much with bad linux behavior, but to focus instead on making sure that FNL does its job correctly (which is to say it does provide guarantees, instead of shirking those responsibilities and make them the responsibilities of higher levels). https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_output_manager.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output_manager.h:44: // Very Seriously Wrong. On 2015/11/04 23:43:33, jeffbrown wrote: > Instead of this comment, maybe just make it blow up. I already have some DCHECK protection for the idempotency requirement. Concrete suggestions for a universal DCHECK approach to enforcing safe-to-call from output manager destructor requirement are welcome. I'm all about DCHECKing my internal invariants, but this one seems like a difficult one to universally express in code. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_output_manager.h:92: scoped_refptr<base::SequencedWorkerPool> thread_pool_; On 2015/11/04 23:43:33, jeffbrown wrote: > Alternately if the number of audio outputs is small then perhaps it would be ok > to give each output its own dedicated thread. That's kind of nice from the > perspective of treating audio outputs as autonomous agents so that init and > shutdown can also be managed asynchronously just like processing work. Acknowledged. We may end up at this point. Per-our face to face discussion, this does not eliminate OSAL issues. I'm making a good faith effort here to use the exiting Mojo OS abstractions, and to use the message-centric architecture that the Mojo team seems to prefer and has encouraged me to try. If it turns out that hard data shows that this message-centric architecture is not sufficiently performant, and I cannot make it so by working with the Mojo team, I will move in the direction of a message free event/signal architecture (generally, my preferred architecture). Currently, Mojo has limitations when it comes OS abstraction which prevent me from doing this. This is orthogonal to the thread pool vs. not thread pool issue. Even if I do not use a thread pool, I already have this problem with attempting to use Mojo as my OSAL. Before resorting declaring the Mojo framework to be incapable of the task, I am going to attempt to use it as is, and work with the team to bring the framework to the point where it will meet my requirements. This list above is an initial list of where I have some issues with the support provided by the framework. While I am willing to make a best effort attempt to meet in the middle regarding performance and timing requirements, I do have some hard requirements. I feel that these are requirements not frivolous. If they cannot (or will not) be met by the framework's capabilities, I will reluctantly embark upon the task of rolling-my-own solution. It has not come to that point yet, and I hope it does not. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_pipe.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:32: server_->SchedulePacketCleanup(std::move(state_)); On 2015/11/04 23:43:33, jeffbrown wrote: > Destructor side-effects like these make be a little nervous, but ok. Assuming > this is RAII-ish. yeah, I am not thrilled with this either. It is a necessary consequence of not being able to safely send messages on a Mojo message pipe from anything but the main message loop thread. It also adds unfortunate latency where we really want to have none at all. Clients can currently mitigate this using a strictly feed forward timing model for buffering, and over-allocating their buffer just a bit, but I would very much like to eliminate this trampoline behavior in general. For now, this consequence of what the mojo team has asked me to do based on current framework architecture. Moving forward, I would like to collect some hard data on the performance penalty, and work to address the issue (presuming it is real and significant, and I expect that it will be at scale), either in the fundamental design of inter-process media transport, or in the thread-safety capabilities of the mojo framework. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:61: std::vector<AudioPacketRef::Region> regions; On 2015/11/04 23:43:33, jeffbrown wrote: > I was slightly surprised to see you allocating on the heap here. I was made slightly ill by my actions in doing so. At this point, even thinking about what it costs to move media from point A to point B makes me want to go lie down in a dark room and cry. What makes me hate myself even more is that I have (repeatedly) caved into logic from others which says (in effect) "Its already bad, whats wrong with a little bit more bad?" At this point, the situation is not good. Making it better will require hard data and load requirements to drive the system in the direction of efficiency as a whole. This is a post V0 thing, but rest assured that I have a long mental list of the places where I already know we can come back and start to make things better. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:68: state->SetResult(MediaResult::INVALID_ARGUMENT); On 2015/11/04 23:43:33, jeffbrown wrote: > can we just close the pipe? Acknowledged. Placing the general close-the-pipe-after-V0 bucket. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:80: static_cast<const uint8_t*>(buffer()) + (*region)->offset, On 2015/11/04 23:43:33, jeffbrown wrote: > Make sure to check that the offset is within the buffer and is properly aligned > too. Offset and length containment has already been handled by the MediaPipe level of things before we got here. Alignment and length-divisibility restrictions belongs at that level as well, I have multiple comments all over that code calling that out. I have not bothered to address this so far because... 1) This is V0, and I was concerned with unblocking to enable parallel development. 2) There are already cries to scrap this whole thing and redesign. Until I get to the other side of that debate and have reached consensus, I am not particularly motivated to continue to improve something which may be completely replaced in the near term anyway. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.cc:117: std::move(regions), On 2015/11/04 23:43:33, jeffbrown wrote: > So this object contains a pointer to an mmap'd region of the media pipe buffer. > Are we guaranteed that the pointer will be discarded before the buffer is > unmapped? > > I have the feeling this buffer is going to flow deep into the system and be sent > to worker threads which are operating asynchronously. Consider including a > reference counted reference to the buffer along the way so we don't screw it up. FWIW - enforcement like this belong at the MediaPipeBase level, not at this level. MediaPipeBase manages the mapping and unmapping of the buffer, as well as the sanity checking of the MediaPackets and MediaPacketState objects created to manage the lifetime of the objects whcih refer to the memory as well as the callback which *must* be made to inform the client that the shared buffer is available for writing again. Re: lifecycle management issues. It is currently guaranteed based on they way things used in the system as a whole, but I agree that this is a weak and non-obvious guarantee. Having a stronger, more direct, guarantee would require one of a few different techniques... 1) MediaPacketState objects could hold a strong reference to the state in the MediaPipeBase which controls the mapping/unmapping of the shared buffer. Expensive, but would provide a strong guarantee. 2) MediaPacketState objects could pursue a #1 style solution, but hold a weak reference instead. This puts further requirements on the consumer of the media in the form of promoting the weak reference and releasing generated strong reference, but it provides an equally strong guarantee along with better guarantees on timely unmapping of the shared buffer. 3) MediaPipeBase objects could implement some form of debug only ref counting checks. This would require MediaPacketState objects to hold some form of reference to the MediaPipeBase object (currently, they hold none). This still relies on "everyone managing the lifecycle properly" which is the current state of affairs, but provides some debug sanity checking. I actually considered implementing this early on, but I think that I decided that if I was going to pay the price of holding a reference in the MediaPacketState object, that I should go with #1 or #2 and have a strong guarantee instead of #3 which provides only some debug checking on an already weak promise (its not even a guarantee, TBH). So; I'm going to put this on my list of TODOs, and will most likely pursue #1 assuming that any of this code survives the up-and-coming MediaPipe design post-mortem. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_pipe.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_pipe.h:42: // of 0. On 2015/11/04 23:43:33, jeffbrown wrote: > Might be good to define a struct for manipulating fixed-point values of this > type. I'm not sure that a struct is what you want here (in fact, I'm almost 100% sure that it is not, but am more than willing to concrete proposals for how they could make things better). I'm not even sure that we want to keep the concept of fractional timestamps expressed in track frames at this level. Unfortunately, allowing media timelines to be expressed in units which are not fundamentally divisible by the underlying frame rates of the substreams (think audio and video, but there are other valid types of substreams in the media world) and an un-avoidable consequence of this decision. The MPEG-2 system layer (both PS and TS) fundamentally disallows this given their specific choices of media clock rates (27MHz for TS and 90KHz for PS), and restricted choices of audio and video frame rates (easy to do when audio and video are restricted to MP1L[123] audio and MP[12] video). Of course, this immediately becomes undone by people who then turn around and encapsulate other audio and video codecs which support broader options for frame rates in the container. Other systems just give up and spec high resolution timestamps (usually in expressed nanoseconds, but sometimes things like 100s or 10s of nanoseconds) as their universal, internal clock resolution. Gstreamer and (IIRC) DirectShow took this approach. I have found this to be a reasonable approach, provided that PTS interpolation is handled properly (see below). Other containers (I'm looking at you FLV) just fundamentally drop the ball and spec millisecond as their media time, which is entirely insufficient, regardless of the nature of the encoded media they are attempting to transport. TL;DR? It basically becomes the problem of the player/renderer combination to deal with. I provide a solution by allowing... 1) Unsynchronized audio to just not deal with timestamps at all (I have a bit of work to do to properly handle this while guaranteeing no truncation on output, but I promise you that this is not too difficult). 2) Synchronized audio where initial timestamps may be provided at the container resolutions at random access points, and interpolated from there on out. The solution I use below deals properly with interpolation by using transformations from the initial, explicitly provided, PTS. 3) Strictly obeying PTS explicitly provided by the user/player. Right now, this requires the player to deal with the problem of creeping rounding error. Allowing some level of slop in this to account for explicit (but incorrect), timestamps is a subject I bring up in some of my initial requirements documents, but have not handled here. We need to decide if we are going to do anything at all about it, and if we do, design/document the system first. For now, either get your PTSs right, or let me interpolate them for you. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_server_impl.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_server_impl.h:67: scoped_refptr<base::SingleThreadTaskRunner> task_runner_; On 2015/11/04 23:43:33, jeffbrown wrote: > Do we need to reference count this? Can we ensure everything that uses it gets > shut down before we destroy it? It seems to me that if anyone is handing onto > the task runner longer than they should then they're possibly holding onto other > things they shouldn't too. It is a side effect of best practice for using a base::TaskRunner in general. Tasks scheduled on the TaskRunner need to be bound to it. The way base:: bindings work requires the object to be refcounted (All task runners are) and for users to hold scoped_refptr to them. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_track_impl.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:109: // Are we already configured? On 2015/11/04 23:43:33, jeffbrown wrote: > Here's a place where it would be nice to just configure at the same time the > track is created. Reduces the state space. Acknowledged. Its on the TODO list. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:118: cbk.Run(MediaResult::UNSUPPORTED_CONFIG); On 2015/11/04 23:43:33, jeffbrown wrote: > Assuming clients should have known better before asking for this, just close the > connection. > > I presume clients will need to know a-priori what kinds of configs are supported > so if they got this far then they already screwed up. Acknowledged. In the close-the-connection TODO bucket. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:144: int32_t N = static_cast<int32_t>(configuration->audio_frame_ratio); On 2015/11/04 23:43:33, jeffbrown wrote: > numerator / denominator, or num / denom > > Also, seeing int32_t and uin32_t used together for these fractions makes me > worry that the wrong types might get used someplace (possibly in the client if > not here). We spoke earlier of using a separate bool to encode whether playback > is forward or reversed. I think that might be worth exploring further in future > patches. > > (Or make both the numerator and denominator signed at the cost of 1 bit of > precision.) re: future efforts... Precision is important here. I am loathe to sacrifice it. I am much more inclined to add the bool to gain another bit of precision rather than losing one and adding redundant states. Before I go either route, I want some consensus on whether or not signed scaling is important at all or not. For a generalized linear transformation library, it is. For media, it can be, but only in the context of reverse trick play (a phrase which makes most people make "ewwwww, yucky!" faces) and usually requires a ton of tricksy-ness at the player level to get right anyway (where you can make an argument that perhaps the timestamps should just be re-written there; the one case where they should be). It should be noted that, with some minor modification, the sampler core here could actually (rather easily) play audio backwards without needing to re-order the samples in the buffer or re-write timestamps. I'm probably the only person who thinks that is cool, however, so... re: numerator and denominator.. Dude... You are killing me.... The values came from audio_frame_RATIO and media_time_RATIO. The are then fed into a LinearTransform::***RATIO*** and multiplied by other RATIOs. If you don't know what N and D mean in this context, ffs how did you manage to pass the Google interviews? Done. (but my soul is crying inside). https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:204: format_ = cfg->Clone(); On 2015/11/04 23:43:33, jeffbrown wrote: > Looks to me like the config is just going to be dropped on the floor when we > return. So we might as well just Pass() it and take ownership. Done. The fact that this is even allowed is a bit surprising/disturbing. Had an interesting conversation with Vardhan about it. Someday over drinks, we should discuss the ramifications which allowing this might have on reaching a minimal copy/minimal heap allocation world when it comes to marshalling parameters across mojom interfaces. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:252: // something about this? On 2015/11/04 23:43:33, jeffbrown wrote: > DCHECK! We failed an internal invariant. Done. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:256: void AudioTrackImpl::OnPacketReceived(AudioPipe::AudioPacketRefPtr packet) { On 2015/11/04 23:43:33, jeffbrown wrote: > Do we need to DCHECK the packet? no, but it does not hurt to do so. I have been called out by many engineers many times for DCHECKing pointers which I am just about to de-reference. Their argument is that de-reffing NULL is just going to crash anyway (and mine is that the DCHECK/ASSERT implementation usually produces much nicer output) Done. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.cc:265: DCHECK(output); On 2015/11/04 23:43:33, jeffbrown wrote: > I feel like this DCHECK is kind of redundant given we're already enforcing > invariants on what goes into the outputs list, but it's not harmful. Acknowledged. Can't have it both ways. Either we are checking at levels like this, or we are not. If we are it says, and I add the DCHECK(packet) above. If not, I strike both of them. I lean in the cautious direction, so DCHECKs all around! https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_track_impl.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.h:28: static constexpr size_t PTS_FRACTIONAL_BITS = 12; On 2015/11/04 23:43:33, jeffbrown wrote: > Maybe make a new fixed point datatype instead. perhaps, but I would rather not at the moment. I can make an entire templated fixed point math library, but right now my use cases require nothing that sophisticated. I will put it on the list of things to consider doing eventually (esp. as the team grows and we have more bandwidth) but right now I think we have much more important things to do. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.h:35: void Describe(const DescribeCallback& cbk) override; On 2015/11/04 23:43:33, jeffbrown wrote: > These should be private. Done. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.h:68: AudioTrackImplWeakPtr weak_this_; On 2015/11/04 23:43:33, jeffbrown wrote: > Using lots of typedefs makes it harder to understand the semantics of these > values without having to grep the header files. Is this a WeakPtr? I'm not > sure. You might want to tone it down a bit before this turns into Hungarian > notation. I'm not sure of exactly what you are concerned about here... Is it the type name or the member variable name? Re; the type name... I am attempting to follow the mojo style here. They (frequently) generate pointer types for their smart pointers to their interfaces and other objects. It follows the form MyObjectOrInterfacePtr. Re: Is this a WeakPtr? If you are asking about the type name, I had hoped that... AudioTrackImpl**WeakPtr** had made it clear that this type was a weak pointer to an AudioTrackImpl. Re: typedefs. The relationships between objects in this system mandates that many of the objects hold various references and pointers to other objects in the system. Unless I wanted to end up in circular include file hell (is there a special circle for that sin?), this requires many forward declarations, which is why I centralized it all in the fwd_decls.h header. I went ahead and defined types for pointers and weak pointers and sets and the like because.. 1) It seems to be the local mojo style. 2) The std pointer are collections are great! Their syntax gets to be a little gross (go ahead and type std::set<std::shared_ptr<AudioTrackToOutputLink>, std::owner_less<std::shared_ptr<AudioTrackToOutputLink>>>; five times fast.) 3) Using typedefs help to enforce top level ownership design decisions. Is this supposed to be a unique_ptr or a shared_ptr? The decision has already been made for you, and is enforced for you. In the unlikely case that we want to change that decision, or switch from std::shared_ptr to mojo::even_better_shared_ptr, we can do it in one place instead of all over the place. Finally, re: weak_this_ This is an unfortunate side effect of the non-intrusive nature of shared/weak pointers in the std:: library. Sorry. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_impl.h:70: Binding<AudioTrack> binding_; On 2015/11/04 23:43:33, jeffbrown wrote: > alignment is a little wonky here > > More importantly, I don't think git cl format will let you get away with this... yeah, this happened because of the global rename from Sink to Track. It has been fixed. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_track_to_output_link.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_to_output_link.cc:53: // queued in. On 2015/11/04 23:43:34, jeffbrown wrote: > FWIW, we might actually find it worthwhile to push all packets through the audio > output itself as usual even when flushed. Just mark them as flushed so that > they can skip the main processing stages. > > That way there will be one consistent path by which packets are queued, > processed, and retired. I've found this kind of pattern useful for other > pipelines and it's no more expensive in the end. As discussed F2F yesterday, this will add a ton of statefulness to the front end of the flush operation. We can consider it later on, but I still feel that I can make number a strong argument for handling it like this, including (but not limited to) 1) Performance. 2) Centralization of implementation (and reducing the responsibilities of all the various output implementations) 3) General complexity and maintainability. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... File services/media/audio/audio_track_to_output_link.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au... services/media/audio/audio_track_to_output_link.h:34: // system. No assumptions may be made about threading when destructing. On 2015/11/04 23:43:34, jeffbrown wrote: > This could be amended by making the cleanup signal come from only one place. > Might be worth exploring in a future patch. Acknowledged. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/fw... File services/media/audio/fwd_decls.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/fw... services/media/audio/fwd_decls.h:21: using AudioOutputPtr = std::shared_ptr<AudioOutput>; On 2015/11/04 23:43:34, jeffbrown wrote: > Are we actually taking advantage of the fact that these pointer are reference > counted or do we only need weak semantics? If the latter, consider using > base::WeakPtr perhaps. I'm not sure I completely understand your question... One or more strong references to the Output and Track objects exist in a functioning system, as well as (potentially) many weak ones. Whether or not this is implemented using a reference count or faerie dust seems like an implementation details to me. The strongest argument I have received from James R. about using base:: pointer classes is that they are instrusive and save allocation overhead (an argument which I find near and dear to my heart). However, whatever form the bookkeeping takes (reference count, faeries, whatever), in order to have weak semtantics, it cannot be part of the referenced object itself (otherwise, how would the object go away, but still allow the weak references to exist). Assuming that this is correct, I am also trying to abide by the directive "to use std:: c++11 wherever I can and to use base:: only when there is not equivalent functionality in c++11 std::". So: Help me to understand what base::WeakPtr does which std::weak_ptr does not. Right now, using std:: seems to give me... 1) standardized shared/weak interface semantics. 2) interoperability with other components of the std:: library (see std::owner_less being used as my sorting predicate for std:: container classes below) 3) No more overhead (to my knowledge) than another implementation would require (note my assertion that some sort of out-of-object bookkeeping seems like a requirement) https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixer.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixer.cc:19: if (!src_format) { return nullptr; } On 2015/11/04 23:43:34, jeffbrown wrote: > Seems worthy of a DCHECK. Otherwise the caller has to deal with nullptr which > might be worse. which he will immediately DCHECK about :) I'll add another one here though. This operation only happens when adding a Track to an Output, so DCHECKs all around! https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixer.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixer.h:32: static MixerPtr Select(const LpcmMediaTypeDetailsPtr& src_format, On 2015/11/04 23:43:34, jeffbrown wrote: > This is looking much cleaner than the first patch. Nice. Acknowledged. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixer.h:65: // How much to increment the fractional sampling position for each output On 2015/11/04 23:43:34, jeffbrown wrote: > Is this actually a fraction? Like a fixed point value of some kind? yes, that is exactly what it is. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixers/mixer_utils.h (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:19: // Template to read samples and normalize them into signed 32 bit integers. On 2015/11/04 23:43:34, jeffbrown wrote: > What you're not saying in this comment is that those signed 32 bit integers > actually contain 16 bit samples. That was a little surprising to me. because they don't They contain either signed 16 bit samples or unsigned 8 bit samples, normalized into signed 32 bit samples. Someday, this can be extended to normalize signed/unsigned 24 bit samples, unsigned 16 bit samples, and so on. The decision for now is to SW scale and mix using 32 math for everything. As supported sample formats grow, more specialized mixer implementation will probably start to show up which deal with more expensive internal representations of audio (double precision floating point audio anyone?) https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:30: register SType tmp = static_cast<const SType*>(src)[0]; On 2015/11/04 23:43:34, jeffbrown wrote: > *src instead of src[0]? > > I don't see why you need the static cast here given that the argument is already > of the same type. Done. I think that this is a leftover from an earlier form of this. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:43: return static_cast<int32_t>(static_cast<const SType*>(src)[0]); On 2015/11/04 23:43:34, jeffbrown wrote: > Why isn't this good enough? (Of course even the cast is redundant here.) > > return static_cast<int32_t>(*src); Done. it is, see above. I think this was leftover from some ancient proto form of this code. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:47: // Template to read normalized source samples, and combine channels if required. On 2015/11/04 23:43:34, jeffbrown wrote: > Doesn't this need a downmix matrix in the general case? Absolutely. Its on the feature list, I'm not sure what form an efficient implementation will eventually take. Even the implementation here may be considered to be wrong in some cases... For example, when going from 2 -> 1 channels, I average the signal. The proper thing to do may be to add and saturate, I'm honestly not sure. When we give down/up-mix matrices a more formal treatment, all of this will need to be done more rigorously (and compliance tests will need to be defined/written) https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:52: class SrcReader; On 2015/11/04 23:43:34, jeffbrown wrote: > SrcReader -> SampleReader? its not really a generalized sampler reader, however. SampleNormalizer is what fills that role. This template is specially designed to read one, or both of a source's input channel, and either combine (downmix) them into 1 channel, or keep them as independent channels. IOW - its definition is dependent on the concept of the parameters of both the source and the destination, and it specifically samples the source and puts it into a format compatible with the destination. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:85: template <typename DType, typename Enable = void> class DstConverter; On 2015/11/04 23:43:34, jeffbrown wrote: > DstConverter -> SampleWriter? This is not a writer at all. This is a template which (as the name implies) converts from a normalized sample (signed 32 bit is what we are using here) to a destination format (in the case of signed int16, this is a case, in the case of unsigned 8, this is a shift and offset operation). https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:120: DoAccumulate == false, On 2015/11/04 23:43:34, jeffbrown wrote: > Instead of testing for DoAccumulate == false maybe have the caller treat > accumulation as a special case, where the initial value of "sample" is > initialized to the normalized destination sample. Otherwise I don't see what > this particular template is buying you. Bear in mind, the testing here is happening at compile time. The purpose of this template is to allow us to produce two versions of the Mix function; one which reads from the dest, normalizes the sample, sums with the provided sample and then clips (the next specialization), and this version which becomes a no-op (just take the provided sample is, don't actually accumulate/mix). The invocation ends up looking something like... *dst = DstMixer<DoAccumulate>(dst, sample); One version becomes the complicated read, normalize, add, threshold, de-nomrmalize, writeback. The other becomes de-normalize and write. So, why the specialization? Two main reasons. First, it is the style I am using here. Style consistency uber alles! Second, I could write the following... if (DoAccumulate) { complicated; } else { less_complicated; } Since DoAccumulate is compile time constant, any level of optimization should produce something equivalent to what I have here. That said, its not a guarantee (small comfort considering how much we are depending on compiler optimization here), and -O0 may actually be forced to keep the check like-it-or-not. This form removes the ambiguity. There will be no branch based on DoAccumulate no matter what. Its not a huge thing, but whatever. We are already deep in template-meta-hell, whats a little more? :P https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/mixer_utils.h:135: static inline int32_t Mix(const DType* dst, int32_t sample) { On 2015/11/04 23:43:34, jeffbrown wrote: > I'll admit to being deeply confused by this function. It has a parameter call > dst that we are reading from. Then it converts the summed output into a value > of the destination type and widens it to int32_t. So the input is normalized > but the output is in the destination space. Kind of weird... > > The clamping seems to be done on a sample-by-sample basis but we should probably > do this at the very end after looping over all samples to be accumulated. See above. Clamping on a sample by sample basis is not an option here; our output buffer is of destination sample size. We would need to have allocated an intermediate buffer to hold the 32 bit normalized results, and we have not here. As we start to optimize these mixer, especially to use vector instruction sets (SSE/Neon) and intermediate buffer is almost certain. I leave that as an optimization for the future. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixers/no_op.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/no_op.cc:28: frames_produced = (dst_frames - *dst_offset); On 2015/11/04 23:43:34, jeffbrown wrote: > The mixer functions themselves might be a little simpler if the caller did all > of this clamping for them. After all, it should know whether an overflow will > happen. > > Although I could see some argument for letting the mixer calculate it as it goes > since we'll probably determine these results while the mixer's loop executes too > (at least for a non-noop mixer). I'm confused; there is no sample clamping going on here. The goal (and what is going on here) is to let the mixer/filter say how much of the input it has consumed and how much of the output it has produced. The reason for this is that as we add more complicated temporal filters, the understanding of this relationship really belong in the mixer class, and not in the outer loop. The outer loop should just keep giving the mixer more data until it has produced all the sample it needs, or the mixer has decided that it has seen all of the input it is interested in. FWIW - I have some work in progress to allow filters of non-epsilon width, and as part of that work, this NoOp Mix implementation actually disappears. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/no_op.cc:34: return (*frac_src_offset >= frac_src_frames); On 2015/11/04 23:43:34, jeffbrown wrote: > I don't see why we need to return this value since it's something the caller > could determine directly. See above; the caller needs to know details about the filter implemented by the mixer to actually do that. The filters implemented so far are trivial, but they will not always be so. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/mixers/point_sampler.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/point_sampler.cc:72: uint32_t S = (soff >> AudioTrackImpl::PTS_FRACTIONAL_BITS) * SChCount; On 2015/11/04 23:43:34, jeffbrown wrote: > variables shouldn't be uppercase Done. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/mixers/point_sampler.cc:75: for (size_t D = 0; D < DChCount; ++D) { On 2015/11/04 23:43:34, jeffbrown wrote: > We could handle all of the specialization for accumulation at this level instead > of burying it down in DM::Mix. We know the function just turns into reading > from Dest and doing an extra mix. I disagree. I don't want to write two versions of this loop which differ only in whether they overwrite or accumulate and saturate. This approach produces the same code with no cut-n-paste. Adding a pseudo-runtime check could avoid the duplication (something like "if (DoAccumulate) ...") but leaves to door open for the compiler to not actually optimize. I realize that this is a small argument given how much we are depending on the optimizer here, but a -O0 build might actually leave the branch in (for no good reason). The style here is to do everything with compile time template specialization, I feel that this is consistent with that style. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... File services/media/audio/platform/generic/standard_output_base.cc (right): https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/standard_output_base.cc:137: for (auto iter = links_.begin(); iter != links_.end(); ) { On 2015/11/04 23:43:34, jeffbrown wrote: > I'm actually kind of surprised that you designed the mixer with an outer loop > over the tracks instead of over the outputs. > > This means as we go through the tracks, we have to accumulate all of the samples > into the destination buffer which may cause a loss of precision and bad clipping > artifacts. > > With that in mind, perhaps we should only support accumulation buffers that have > a normalized representation (say, int32_t), and convert them into the > destination representation at the very last moment. > > That should make your Mix functions a little simpler since they won't have to > deal with converting the accumulated samples to a destination format > immediately. They can do all of the math in the normalized domain. > > Another alternative might be to try to exchange the order of the loops but I > suspect that the bookkeeping involved in doing that efficiently might be a > little much! Keep in mind that the outputs are independent and mix in parallel. They also have completely independent timing/buffering requirements. We never loop over outputs. That said, I suspect that picking a canonical output sample format, or small set of them (int32_t/double) and separating the mix from the clip operation is a likely direction that this will end up moving in. Things get a bit more complicated when 24 bit audio comes in (int32_t internal allows a max of 127 tracks before internal clipping becomes a possibility), and when greater-than-unity gain is permitted on input tracks (something which Dale has expressed interest in supporting). At this point, we are looking at int64_t as our intermediate format, and its starting to look seriously expensive. TL;DR? I expect that an intermediate mix buffer will be coming soon, its just not here yet. Finally; re: permuting the order of the loops.. If what you are trying to say is that you are puzzled why I would write foreach(track : tracks_) foreach(packet : track.pending_packets_) I'm not sure I see a way to permute the order of these loops. The pending_packet_ concept would need to be homogeneous across the loops and it is not. Even if it was, I would still go over tracks first in order to implement priority properly (something which Carlos has requested). His request (and I think it is reasonable/good) is that if this starts to get so overloaded because of the number of tracks in the system, that we cut-n-run during the mix operation. This means that low priority tracks go silent, but we do not underflow our output DMA. In the first model, we punish low priority tracks when we are overloaded. In the second, we punish all tracks. Carlos asserts that this is what Apple does and Microsoft does not, which is one of the reasons he preferred the Apple audio system over the system offered by MSFT. https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl... services/media/audio/platform/generic/standard_output_base.cc:304: bool consumed_source = info->mixer->Mix(buf, On 2015/11/04 23:43:34, jeffbrown wrote: > As designed, we're going to have problems replacing the point sampler with one > that does interpolation. An interpolating sampler may need access to data that > comes from multiple regions within the packet (or from prior packets depending > on how the audio got sliced up). In order words, the sampler will need access > to a window of samples within the track, not just one. > > To work around this problem, consider passing the mix function a random access > iterator for accessing the Nth sample of the input rather than a pointer to it. > You'll have to define what the window size is upfront though. The window might > need to depend on properties of the timeline transformation. > > Alternately, it's possible that copying a moving window's worth of the input > into a linear buffer will actually end up being faster than trying to process it > in small chunks when crossing region boundaries. (Best of both worlds, assuming > the windows are small compared to typical regions so we don't have to deal with > overlap very often.) Acknowledged. This is a complicated optimization subject. I don't know where we will eventually end up, but right now I am trying to pursue a model where we minimize copies (one of Dale's primary concerns). I have a WiP of a linear interpolation resampler which holds just the last sample from the last region/packet, and resamples in place for everything else. As filters grow wider, and buffers grow smaller (to hit extreme low latency requirements) the overhead of handling the edges of the system is going to start to grow, however. Pursuing a strategy of copying into a temporally contiguous input buffer may become the most efficient thing to do. This could also serve to allow simpler implementations of graceful ramp down/up after/before discontinuities and still preserve the ability to allow sparse audio, something that can really benefit an application like the announcer. We'll see what the future holds. Generally, optimizing the core mix loops is going to be a full-time job for someone someday soon.
https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/06 02:20:23, johngro wrote: > On 2015/11/04 23:43:32, jeffbrown wrote: > > I thought we were only going to support push? > > copied directly from Dale's design doc. Let's bring this up when we get > together to talk about overall MediaPipe design stuff. I'll leave a comment > here and add the topic to my design todo list. Push-only came up in the doc review...no longer there.
https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/06 06:14:11, dalesat wrote: > On 2015/11/06 02:20:23, johngro wrote: > > On 2015/11/04 23:43:32, jeffbrown wrote: > > > I thought we were only going to support push? > > > > copied directly from Dale's design doc. Let's bring this up when we get > > together to talk about overall MediaPipe design stuff. I'll leave a comment > > here and add the topic to my design todo list. > > Push-only came up in the doc review...no longer there. Acknowledged. What do you think I should do here? Keep these fields, kill them, or keep the comment and discuss later?
https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/06 20:24:25, johngro wrote: > On 2015/11/06 06:14:11, dalesat wrote: > > On 2015/11/06 02:20:23, johngro wrote: > > > On 2015/11/04 23:43:32, jeffbrown wrote: > > > > I thought we were only going to support push? > > > > > > copied directly from Dale's design doc. Let's bring this up when we get > > > together to talk about overall MediaPipe design stuff. I'll leave a comment > > > here and add the topic to my design todo list. > > > > Push-only came up in the doc review...no longer there. > > Acknowledged. > > What do you think I should do here? Keep these fields, kill them, or keep the > comment and discuss later? If you're not using them, you might as well delete them.
https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud... mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/06 21:03:56, dalesat wrote: > On 2015/11/06 20:24:25, johngro wrote: > > On 2015/11/06 06:14:11, dalesat wrote: > > > On 2015/11/06 02:20:23, johngro wrote: > > > > On 2015/11/04 23:43:32, jeffbrown wrote: > > > > > I thought we were only going to support push? > > > > > > > > copied directly from Dale's design doc. Let's bring this up when we get > > > > together to talk about overall MediaPipe design stuff. I'll leave a > comment > > > > here and add the topic to my design todo list. > > > > > > Push-only came up in the doc review...no longer there. > > > > Acknowledged. > > > > What do you think I should do here? Keep these fields, kill them, or keep the > > comment and discuss later? > > If you're not using them, you might as well delete them. Done.
lgtm. let's just move forward with this
Message was sent while issue was closed.
Committed patchset #8 (id:130001) manually as 1cff1d372574761491c915137cab8928036b5deb (presubmit successful). |