Issue 1424933002: Add an initial revision of an audio server.

johngro

5 years, 1 month ago (2015-10-28 00:28:13 UTC) #1

johngro

Description was changed from ========== Add an initial revision of an audio server. Add a ...

5 years, 1 month ago (2015-10-28 16:55:29 UTC) #2

johngro

johngro@google.com changed reviewers: + dalesat@chromium.org

5 years, 1 month ago (2015-10-28 16:55:29 UTC) #3

jeffbrown

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom#newcode20 mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; I thought we were only going to ...

5 years, 1 month ago (2015-11-04 23:43:34 UTC) #4

johngro

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom#newcode20 mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/04 23:43:32, jeffbrown wrote: > I ...

5 years, 1 month ago (2015-11-06 02:20:27 UTC) #5

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud...
File mojo/services/media/audio/interfaces/audio_track.mojom (right):

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud...
mojo/services/media/audio/interfaces/audio_track.mojom:20: bool
supports_pull_transport;
On 2015/11/04 23:43:32, jeffbrown wrote:
> I thought we were only going to support push?

copied directly from Dale's design doc.  Let's bring this up when we get
together to talk about overall MediaPipe design stuff.  I'll leave a comment
here and add the topic to my design todo list.

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud...
mojo/services/media/audio/interfaces/audio_track.mojom:58: // kInvalidArgs:
On 2015/11/04 23:43:32, jeffbrown wrote:
> close the connection

Acknowledged.

FWIW - I'm putting these all in my close the post V0 close-the-connection
bucket.  From here on out, I'll be acking these comments, and adding them to my
list of cleanup tasks, but I'm not going to change them in this CL.  Once this
is in, I'll file a bug and get to work cleaning this up.

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud...
mojo/services/media/audio/interfaces/audio_track.mojom:63: //
kUnsupportedConfig:
On 2015/11/04 23:43:32, jeffbrown wrote:
> if the client was able to determine this ahead of time (by looking at the
> descriptor) then we can treat this as invalid arguments and close the
connection

Acknowledged.

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud...
mojo/services/media/audio/interfaces/audio_track.mojom:73: // kBadState:
On 2015/11/04 23:43:32, jeffbrown wrote:
> This suggests coupling the creation of an audio track with its configuration. 
I
> think it might also be reasonable to say that an audio track can never be
> reconfigured.  If you need a new configuration you can just create a new audio
> track.  Audio tracks are fungible.

Acknowledged.

Agreed, I do not think we currently have a plans to reconfigure audio pipes on
the fly.  Per our face to face discussion, you currently need to have a track
interface in order to query the server's track capabilities.  I'm going to add
this to the design discussion list, but assuming that we all agree, I will come
back later and hoist track capability enumeration up to the server interface
level, and make configuration part of CreateTrack (eliminating this method in
the process).

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/aud...
mojo/services/media/audio/interfaces/audio_track.mojom:88:
GetRateControl(RateControl& rate_control) => (MediaResult result);
On 2015/11/04 23:43:32, jeffbrown wrote:
> I can't think of a meaningful error to report here.
> 
> Note that it's not possible for one end of message pipe to be bound the same
> thing twice because it is only ever moved and never duplicated.

It is possible for anyone with access to the AudioTrack interface instance to
create a new interface request (which gives them a new unbound message pipe
endpoint) and call this method again.  The rate control interface facet exposed
by audio tracks currently supports only 1-to-1 relationships, not one to many,
so it will refuse to bind to a second message pipe (if asked to do so) and
return BadState in response.

In theory, there might be some merit in supporting a 1-to-many for the rate
control interface, but I cannot come up with a practical use case yet, so it is
limited to 1-1 for the moment.

I'm going to lump this into the close-connection bucket.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_output.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.cc:43: DCHECK(this ==
link->GetOutput().get());
On 2015/11/04 23:43:32, jeffbrown wrote:
> Not sure I understand why this has to happen inside the lock given that the
> InitializeLink method manipulates the link outside of the lock.
> 
> Would it make sense for the initialization to happen on the worker thread too?

> We could say that everything to do with processing and link management is
> effectively asynchronous.  That would make it easier to understand what state
> gets manipulated where.  (Does processing touch the link?)

Link initialization happens outside of the scope of the processing lock in the
context of the message loop thread.  For a base AudioOutput, this is a no-op. 
For derived classes, it is an opportunity to perform tasks involving
initialization of link specific state (such as selecting mixer kernels and
initialization of filter state) without holding up the output's mixing tasks in
any way.

As described in earlier feedback, the processing lock serves two purposes.  In
this context, it serving to protect the links_ collection which must be held
static during a processing callback.

As mentioned earlier, the scope of this protection could be further limited, but
would require the introduction of a separate lock which would need to be
acquired and released in the more limited scope by the derived class.

Re: Does processing touch the link?
Yes, but only after the link has been added to the set of active links.  This is
why initialization and teardown happen outside of the scope of the processing
lock, and only when we can prove that the processing scope can not be holding
any references to the link object itself.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.cc:92: // to schedule callbacks.
On 2015/11/04 23:43:32, jeffbrown wrote:
> There's a race condition here given that shutdown may be initiated both by the
> processing function and by the output manager.  When the output manager
> initiates it, it's possible the processing function is still running and may
try
> to schedule another callback.
> 
> This is one reason why it might be a good idea for shutdown to be more
directed.

No, I do not think that there is.  I took care to ensure that there was not. 
Please refer to my extensive response in the header file feed back to see why
there is no race here.

If you disagree with my analysis (I cannot bring myself to refer to it as a
"proof") given there, please provide me the hypothetical sequence of actions
which would lead to the thread-safety violation.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.cc:191: processing_lock_.Release();
On 2015/11/04 23:43:32, jeffbrown wrote:
> Obviously this blocks the main thread but I have to wonder whether that's
> strictly necessary.
> 
> If the processing state is fully decoupled from the output object itself then
> there might not be any harm in letting it run to completion and kill itself
> asynchronously.  We would only need to update the bookkeeping here.
> 
> This assumes there isn't any other shared state to worry about (and it might
be
> better if there weren't).

I believe that I have already address this in responses given previously. 
Processing state cannot be completely decoupled from the output object as it is
a function of the track/output relationship.  Please review my earlier analysis.
 If you still disagree, lets get together and work this out face to face; for
complex interactions like this, I feel that coming to a mutual understanding is
more efficiently done in the context of an interactive dialog instead of via
email/code-review with its high latency.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_output.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:31: // by this outputs.  Called only from
the main message loop.  Obtains the
On 2015/11/04 23:43:32, jeffbrown wrote:
> outputs -> output

Done.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:32: // processing_lock and may block for the
time it takes the derrived class to
On 2015/11/04 23:43:33, jeffbrown wrote:
> derrived -> derived

Done.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:34: // called.
On 2015/11/04 23:43:32, jeffbrown wrote:
> In general it's safer to avoid immediately calling back into object here.  Are
> there any downsides to simply scheduling the task to be run at the next
> available opportunity?

I'm not sure that I understand the question.  As the comment mentions, things
which call these methods are always scheduled to run on the main message loop
thread; IOW - if an output want to shut itself down, for whatever reason, the
act of unlinking will be scheduled on the message loop thread, and "run at the
next available opportunity".

If an output is being added or removed because of an action taken on the server
side of thing, it is already in the context of the message loop thread, and this
is the "next available opportunity".

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:53: virtual MediaResult Init();
On 2015/11/04 23:43:32, jeffbrown wrote:
> Consider using a convention such as OnInit() to distinguish methods which are
> intended to be overridden by implementors from others which are not.

If I had not intended for the method to be overridden, I would not have marked
it virtual, nor would I have added the helpful comment telling people that they
should override this method and do particular things here.

Furthermore, if this was a method which was meant to be overridden, and I was
overriding it in a derived class, and I didn't want subsequent derivations to
override the method, I would mark it "final" instead of marking it as
"override".

Finally, if I wanted to added a non-virtual method, and force the client to
never have a method with the same signature, I could also mark it as "final". 
This has the unfortunate side effect of forcing the method to become virtual,
and non-overridable.  I probably would not bother to do this because it..
1) costs something.
2) protects nothing.  Derived classes can always add methods with whatever name
they want, but someone with a base class pointer cannot accidentally call them
without first casting to the derived class.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:65: // Called from within the context of the
processing lock any time a scheduled
On 2015/11/04 23:43:32, jeffbrown wrote:
> Technically you don't need a lock to guarantee serialization.
> 
> Does this run on the main thread too or on a worker?  If it's on a worker, you
> might want to make the subclass set the processing callback in some clearly
> distinguishable way, such as by passing a closure to a function to register or
> schedule it.

The lock is not being used to guarantee serialization.  Serialization is
guaranteed by the base::SequencedTaskRunner which acts as the thread pool and
imposes the message driven architecture encouraged by the mojo folks.

The lock is used for the following two purposes...
1) Protecting the client/link set.
2) Synchronizing with callbacks in-flight during shutdown.

re #1
Right now, the client set is held static for the duration of a "process"
operation.  This scope can be tightened (by adding a specific lock to protect
the set), but would shift the burden of properly protecting the link set to the
derived classes which actually process their links during the processing
callback.

re #2
When final cleanup is happening in the context of the main message loop thread,
we need to reach a point in time where we have a strong guarantee that any
callbacks in flight have either been safely canceled, or run to completion.  A
classic pattern is implemented here... The guy doing the shutdown (main message
loop) begins by flagging the TaskRunner to shutdown an cancel all callbacks,
then sets a volatile flag in the output indicating that any callback which had
already launched should fast-abort.  Then, it attempts to enter the callback
lock.

On the other side of things, A callback in flight begins by promoting its weak
pointer to the output object, and then entering the processing lock and fast
aborting if the shutdown flag is set.  This flag is made available to derived
classes so that they may also abort their computation early if a shutdown has
been requested.

Back on the message loop thread, if an output process task has made it past all
of these checks, it will end up blocking until the output process task notices
the shutdown flag, or terminates normally.  After it has successfully entered
the lock, it has achieved its strong guarantee that all in-flight callbacks have
been effectively canceled, or run to completion.

A couple more small points.
+ Memory safety from the processing callback perspective is ensured by the weak
pointer.  If a callback is in flight, and the task runner shutdown does not
manage to cancel the task before a thread picks it up and is running, and the
message loop side of things gets all of the way to the point where it has
destroyed the object, the weak pointer promotion will fail and the callback will
do nothing.
+ OTOH, if the process callback successfully promotes the weak pointer, and then
the message loop runs to the point of releasing its reference before the
processing callback takes any other steps, the process callback will enter an
uncontested lock, notice the shutdown flag and immediately exit the lock having
done nothing.  When the promoted weak pointer goes out of scope, the output's
memory will be returned to the heap.  No complex interactions take place in the
destructor of an audio output, just freeing memory.  All freeing of driver
specific resources happened in the context of the message loop thread during the
shutdown process after the processing callbacks had been successfully
neutralized.

Re: thread context.  It runs strictly within the scope of on of the threads
which comes from the base::SequencedTaskManager thread pool.  This base:: object
ensures that for any given output, all scheduled callbacks are serialized
(although, not guaranteed to be the same thread between successive invocations).
 I will update the comment to attempt to make this more clear.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:86: // from the within a processing
callback, the processing_lock will always be
On 2015/11/04 23:43:32, jeffbrown wrote:
> This here suggests that these methods should be provided as part of a context
> object to the processing callback rather than mixed in here alongside
everything
> else.
> 
> ie. The code could invoke the processing function and pass an object with the
> relevant state and functions.
> 
> Process(ProcessingContext* context)

I don't understand.  The context, in this case, is the base object itself. 
Actions need to be taken to schedule the next callback, and to begin the process
of deferring shutdown to the message loop thread which require the private state
of the base object itself.  Why would it be advantageous to add a layer of
indirection between the derived object and its base class?

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:115: // priorities will be the trick).
On 2015/11/04 23:43:33, jeffbrown wrote:
> Could just use a heap.

sure, I pretty much already am; right now I am using a std::set which is
probably a red/black tree internally.  I could either overload the less operator
used to sort the set, or I could just use a std::map (I would prefer the former,
but need to think about it some).

What I am really concerned about here (and expressed in previous versions of
this comment, but somehow lost by the time it got here) is optimizing for
indexing this set two ways.  There is a short list of operations which need to
be optimized here, IMO.

1) Traversing the list in priority order.
2) Adding a track/link to the set.
3) Removing a track/link from the set.
4) Altering the priority of a track which is in the set.

re #1: This is the primary operation as it will happen far more often than other
ops.  It must be O(k).  Indexing the set based on priority is the obvious
solution.

re #2: This is an insertion operation based on the priority index.  As the
insertion into the underlying std::set/std::map is guaranteed to be O(log2), we
should be good here simply by indexing by priority.  (most implementations I
have every looked at of set/map use a red-black tree under the hood, so are
guaranteed to be 2*log(N) at worst)

re #3: Removing a track from the set involves looking it up by
id/pointer/handle/whatever.  Right now, this is O(log2) because my current set
is indexed by pointer.  If I change that to be indexed by link priority, then
removal becomes O(N).  This is undesirable, not because I am overly concerned
with the client (it is probably shutting down) but because it need to hold the
link set static for the duration of the lookup and removal operation which means
that if we want to perform a mix operation, we may have to wait for the O(N)
operation to complete.

re #4: This is basically an atomic remove-then-insert operation.  In the outline
above this would be O(N).  If #3 can be optimized, it becomes O(log2)

So; there are a few different ways I could optimize #3.  One way would be to
index the set two different ways.  Given that std::set/map do not support this,
you would need to accomplish this by making the object a member of two different
set-like objects.  In order to make this an improvement over what is described
here, you need to be able to get to the object in Index #2 after looking it up
in Index #1 in O(k) time (and vice-versa).  The best way I know to do this is to
use intrusive containers, and make the set bookkeeping for the indices part of
the object itself.  std:: containers do not support intrusive semantics.  boost
containers do, but are clearly not allowed here.  Rolling my own is unappealing,
but if I had to resort to such a thing, I would want to make it part of the
overall project, not just this.

Another option is what I am attempting to describe above.  If I can guarantee
uniqueness of priority, the set could be indexed based on priority alone and we
could leave it at that.  How the uniqueness is determined is the tricky part. 
One option would be to build a key which is made of the priority in the high
order bits, and the this pointer of the link in the low order bits.  This would
end up being a 96 bit object and would suffer some lookup performance hit as a
result.  Another way to do it would be to use the sequence number I mention
above.  Instead of the pointer, use a 32 bit sequence number generated at link
instantiate time appended to the priority in order to uniquify the key.  This
has the advantage to keeping our key to 64 bits, but has the disadvantage of
making sequence number generation complex.  Sequence numbers must either be
assumed to be unique (likely to be OK, but strictly speaking not OK), or they
need to be tested for uniqueness.  Testing them using this container alone is
technically O(N); a separate set of active sequence numbers would need to be
introduced to guarantee no collisions in O(log2) during generation.  Additional
complexity, but performed entirely on the message loop thread, so no locks are
required.  If the message loop ever becomes multi-threaded, the scope of lock
contention is limited to the input domain and does not affect the output domain.

TL;DR?
Yeah, I have thought about this some and there are some fine points which
deserve consideration.  I'm going to wait until we implement the priority
feature before implementing a solution.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:138: // Called from Shutdown (main message
loop) and ShutdowSelf (processing
On 2015/11/04 23:43:32, jeffbrown wrote:
> I think it's a little weird for there to be two different sources of shutdown
> events (the processing callback and the output manager).  It runs the risk of
> the output getting into states that the output manager doesn't expect.
> 
> Consider providing some kind of callback for the output to tell the output
> manager that it's in a bad state and needs to be shut down so that the output
> manager is always the one driving the shutdown and can stay in sync with
> everything.

That is how things handled now.  Shutdown self posts a message to the main
message loop (executed by the output manager) which actually performs the
shutdown through the central Shutdown choke-point.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:153: base::Lock processing_lock_;
On 2015/11/04 23:43:33, jeffbrown wrote:
> We won't need this lock if the processing state is encapsulated elsewhere.

I disagree.  See my comments regarding synchronization above.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output.h:154: base::Lock shutdown_lock_;
On 2015/11/04 23:43:32, jeffbrown wrote:
> We won't need this lock is shutdown is always initiated by the output manager
on
> the main thread.

Unfortunately, I think it is an important requirement that shutdown be initiated
from either side of the threading-fence.  There are many very good reasons that
an output may need to shut down in a timely fashion due to some unrecoverable
error situation.

Note: shutdown is not executed in the context of the thread-pool thread, it is
simply scheduled to execute on the main message loop thread via BeginShutdown. 
The shutdown_lock_ is used to make this operation thread safe in the unlikely
case that both side of the system decide that they want to shut down an output
at the same time.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_output_manager.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output_manager.cc:69: // TODO(johngro): Probably
should log something about this, assuming that
On 2015/11/04 23:43:33, jeffbrown wrote:
> Another approach would be to handle this symmetrically with other kinds of
> asynchronous self-shutdown of the output.  We could assume Init always
succeeds
> here and wait for a callback to tell us otherwise.
> 
> Bonus of that is that the output could actually do its real hardcore
> initialization in the background on some other thread.  Useful in case it's
got
> to open up and configure audio devices which could take a *very* long time. 
> (USB Audio anyone?)

Perhaps, but I do not feel that there is ever a good solution to badly behaved
device drivers.  Currently, pushing this task into the scope of an "output
thread context" (loosely defined term here) it not a solution to the problem as
the output threading context is done withing the context of a thread pool, and
badly behaved drivers will just end up pissing the the thread pool and
potentially damaging other outputs because of their bad behavior.

A thread-per-output is not a universal solution to this problem either.  At some
point, the main system needs to synchronize with the state of the output,
whether the output owns a dedicated thread or not.

The dependencies mentioned above can be broken down to the point where we
require a thread per output, or dedicated init/shutdown threads, and impose an
architecture where all interactions with the driver/os happen on the dedicated
thread.  We are almost already there with this design, but badly behaved drivers
can still DoS our service or the whole system even in this hypothetical design
(examples available upon request).

Fundamentally, hardware and drivers are highly privileged.  I feel that our
first line of defense should be a solid OS (see reasons for FNL), high levels of
compliance testing (see Android CTS, Apple's monopoly on
hardware/kernel/drivers, Microsoft's WHQL compliance testing), and draconian
consequences in the driver model for breaking the rules (see WDM/BSoD, and...
well... I guess Linux and therefor Android has no such thing... but hopefully
FNL will be better).

DIRQ_LEVEL_NOT_LESS_THAN_OR_EQUAL anyone?

Seriously though, this is just a start.  We will refine as time goes on and we
all have a better idea of what guarantees the HAL/OSAL can provide, and what
practical mitigations we can implement of the underlying OS is less than we had
hoped for.  At the end of the day, the OS/Drivers must be solid.  If they cannot
be, then it really does not matter what we do here.  Windows will BSoD if a
driver shuts off interrupts and deadlocks (or makes any attempt to acquire any
lock for that matter).  Linux will not, and Android will always be vulnerable to
this as a result.  It is upon us to ensure that FNL is built to a higher
standard than what Linux currently offers.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output_manager.cc:93: // but it should not take long.
On 2015/11/04 23:43:33, jeffbrown wrote:
> Famous last words.  :)

See above.  Hopefully not, but we will adapt if they prove to be incorrect.  I
have already been instructed not to concern myself too much with bad linux
behavior, but to focus instead on making sure that FNL does its job correctly
(which is to say it does provide guarantees, instead of shirking those
responsibilities and make them the responsibilities of higher levels).

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_output_manager.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output_manager.h:44: // Very Seriously Wrong.
On 2015/11/04 23:43:33, jeffbrown wrote:
> Instead of this comment, maybe just make it blow up.

I already have some DCHECK protection for the idempotency requirement.  Concrete
suggestions for a universal DCHECK approach to enforcing safe-to-call from
output manager destructor requirement are welcome.  I'm all about DCHECKing my
internal invariants, but this one seems like a difficult one to universally
express in code.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_output_manager.h:92:
scoped_refptr<base::SequencedWorkerPool> thread_pool_;
On 2015/11/04 23:43:33, jeffbrown wrote:
> Alternately if the number of audio outputs is small then perhaps it would be
ok
> to give each output its own dedicated thread.  That's kind of nice from the
> perspective of treating audio outputs as autonomous agents so that init and
> shutdown can also be managed asynchronously just like processing work.

Acknowledged.
We may end up at this point.  Per-our face to face discussion, this does not
eliminate OSAL issues.  I'm making a good faith effort here to use the exiting
Mojo OS abstractions, and to use the message-centric architecture that the Mojo
team seems to prefer and has encouraged me to try.

If it turns out that hard data shows that this message-centric architecture is
not sufficiently performant, and I cannot make it so by working with the Mojo
team, I will move in the direction of a message free event/signal architecture
(generally, my preferred architecture).  Currently, Mojo has limitations when it
comes OS abstraction which prevent me from doing this.  This is orthogonal to
the thread pool vs. not thread pool issue.  Even if I do not use a thread pool,
I already have this problem with attempting to use Mojo as my OSAL.  Before
resorting declaring the Mojo framework to be incapable of the task, I am going
to attempt to use it as is, and work with the team to bring the framework to the
point where it will meet my requirements.

This list above is an initial list of where I have some issues with the support
provided by the framework.  While I am willing to make a best effort attempt to
meet in the middle regarding performance and timing requirements, I do have some
hard requirements.  I feel that these are requirements not frivolous.  If they
cannot (or will not) be met by the framework's capabilities, I will reluctantly
embark upon the task of rolling-my-own solution.  It has not come to that point
yet, and I hope it does not.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_pipe.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_pipe.cc:32:
server_->SchedulePacketCleanup(std::move(state_));
On 2015/11/04 23:43:33, jeffbrown wrote:
> Destructor side-effects like these make be a little nervous, but ok.  Assuming
> this is RAII-ish.

yeah, I am not thrilled with this either.  It is a necessary consequence of not
being able to safely send messages on a Mojo message pipe from anything but the
main message loop thread.  It also adds unfortunate latency where we really want
to have none at all.  Clients can currently mitigate this using a strictly feed
forward timing model for buffering, and over-allocating their buffer just a bit,
but I would very much like to eliminate this trampoline behavior in general.

For now, this consequence of what the mojo team has asked me to do based on
current framework architecture.  Moving forward, I would like to collect some
hard data on the performance penalty, and work to address the issue (presuming
it is real and significant, and I expect that it will be at scale), either in
the fundamental design of inter-process media transport, or in the thread-safety
capabilities of the mojo framework.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_pipe.cc:61: std::vector<AudioPacketRef::Region>
regions;
On 2015/11/04 23:43:33, jeffbrown wrote:
> I was slightly surprised to see you allocating on the heap here.

I was made slightly ill by my actions in doing so.  At this point, even thinking
about what it costs to move media from point A to point B makes me want to go
lie down in a dark room and cry.  What makes me hate myself even more is that I
have (repeatedly) caved into logic from others which says (in effect) "Its
already bad, whats wrong with a little bit more bad?"

At this point, the situation is not good.  Making it better will require hard
data and load requirements to drive the system in the direction of efficiency as
a whole.  This is a post V0 thing, but rest assured that I have a long mental
list of the places where I already know we can come back and start to make
things better.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_pipe.cc:68:
state->SetResult(MediaResult::INVALID_ARGUMENT);
On 2015/11/04 23:43:33, jeffbrown wrote:
> can we just close the pipe?

Acknowledged.
Placing the general close-the-pipe-after-V0 bucket.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_pipe.cc:80: static_cast<const uint8_t*>(buffer()) +
(*region)->offset,
On 2015/11/04 23:43:33, jeffbrown wrote:
> Make sure to check that the offset is within the buffer and is properly
aligned
> too.

Offset and length containment has already been handled by the MediaPipe level of
things before we got here.  Alignment and length-divisibility restrictions
belongs at that level as well, I have multiple comments all over that code
calling that out.

I have not bothered to address this so far because...
1) This is V0, and I was concerned with unblocking to enable parallel
development.
2) There are already cries to scrap this whole thing and redesign.  Until I get
to the other side of that debate and have reached consensus, I am not
particularly motivated to continue to improve something which may be completely
replaced in the near term anyway.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_pipe.cc:117: std::move(regions),
On 2015/11/04 23:43:33, jeffbrown wrote:
> So this object contains a pointer to an mmap'd region of the media pipe
buffer. 
> Are we guaranteed that the pointer will be discarded before the buffer is
> unmapped?
> 
> I have the feeling this buffer is going to flow deep into the system and be
sent
> to worker threads which are operating asynchronously.  Consider including a
> reference counted reference to the buffer along the way so we don't screw it
up.

FWIW - enforcement like this belong at the MediaPipeBase level, not at this
level.  MediaPipeBase manages the mapping and unmapping of the buffer, as well
as the sanity checking of the MediaPackets and MediaPacketState objects created
to manage the lifetime of the objects whcih refer to the memory as well as the
callback which *must* be made to inform the client that the shared buffer is
available for writing again.

Re: lifecycle management issues.
It is currently guaranteed based on they way things used in the system as a
whole, but I agree that this is a weak and non-obvious guarantee.

Having a stronger, more direct, guarantee would require one of a few different
techniques...

1) MediaPacketState objects could hold a strong reference to the state in the
MediaPipeBase which controls the mapping/unmapping of the shared buffer. 
Expensive, but would provide a strong guarantee.
2) MediaPacketState objects could pursue a #1 style solution, but hold a weak
reference instead.  This puts further requirements on the consumer of the media
in the form of promoting the weak reference and releasing  generated strong
reference, but it provides an equally strong guarantee along with better
guarantees on timely unmapping of the shared buffer.
3) MediaPipeBase objects could implement some form of debug only ref counting
checks.  This would require MediaPacketState objects to hold some form of
reference to the MediaPipeBase object (currently, they hold none).  This still
relies on "everyone managing the lifecycle properly" which is the current state
of affairs, but provides some debug sanity checking.  I actually considered
implementing this early on, but I think that I decided that if I was going to
pay the price of holding a reference in the MediaPacketState object, that I
should go with #1 or #2 and have a strong guarantee instead of #3 which provides
only some debug checking on an already weak promise (its not even a guarantee,
TBH).

So;
I'm going to put this on my list of TODOs, and will most likely pursue #1
assuming that any of this code survives the up-and-coming MediaPipe design
post-mortem.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_pipe.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_pipe.h:42: // of 0.
On 2015/11/04 23:43:33, jeffbrown wrote:
> Might be good to define a struct for manipulating fixed-point values of this
> type.

I'm not sure that a struct is what you want here (in fact, I'm almost 100% sure
that it is not, but am more than willing to concrete proposals for how they
could make things better).  I'm not even sure that we want to keep the concept
of fractional timestamps expressed in track frames at this level. 
Unfortunately, allowing media timelines to be expressed in units which are not
fundamentally divisible by the underlying frame rates of the substreams (think
audio and video, but there are other valid types of substreams in the media
world) and an un-avoidable consequence of this decision.

The MPEG-2 system layer (both PS and TS) fundamentally disallows this given
their specific choices of media clock rates (27MHz for TS and 90KHz for PS), and
restricted choices of audio and video frame rates (easy to do when audio and
video are restricted to MP1L[123] audio and MP[12] video).  Of course, this
immediately becomes undone by people who then turn around and encapsulate other
audio and video codecs which support broader options for frame rates in the
container.

Other systems just give up and spec high resolution timestamps (usually in
expressed nanoseconds, but sometimes things like 100s or 10s of nanoseconds) as
their universal, internal clock resolution.  Gstreamer and (IIRC) DirectShow
took this approach.  I have found this to be a reasonable approach, provided
that PTS interpolation is handled properly (see below).

Other containers (I'm looking at you FLV) just fundamentally drop the ball and
spec millisecond as their media time, which is entirely insufficient, regardless
of the nature of the encoded media they are attempting to transport.

TL;DR?  It basically becomes the problem of the player/renderer combination to
deal with.  I provide a solution by allowing...

1) Unsynchronized audio to just not deal with timestamps at all (I have a bit of
work to do to properly handle this while guaranteeing no truncation on output,
but I promise you that this is not too difficult).
2) Synchronized audio where initial timestamps may be provided at the container
resolutions at random access points, and interpolated from there on out.  The
solution I use below deals properly with interpolation by using transformations
from the initial, explicitly provided, PTS.
3) Strictly obeying PTS explicitly provided by the user/player.  Right now, this
requires the player to deal with the problem of creeping rounding error. 
Allowing some level of slop in this to account for explicit (but incorrect),
timestamps is a subject I bring up in some of my initial requirements documents,
but have not handled here.  We need to decide if we are going to do anything at
all about it, and if we do, design/document the system first.  For now, either
get your PTSs right, or let me interpolate them for you.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_server_impl.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_server_impl.h:67:
scoped_refptr<base::SingleThreadTaskRunner> task_runner_;
On 2015/11/04 23:43:33, jeffbrown wrote:
> Do we need to reference count this?  Can we ensure everything that uses it
gets
> shut down before we destroy it?  It seems to me that if anyone is handing onto
> the task runner longer than they should then they're possibly holding onto
other
> things they shouldn't too.

It is a side effect of best practice for using a base::TaskRunner in general. 
Tasks scheduled on the TaskRunner need to be bound to it.  The way base::
bindings work requires the object to be refcounted (All task runners are) and
for users to hold scoped_refptr to them.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_track_impl.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.cc:109: // Are we already configured?
On 2015/11/04 23:43:33, jeffbrown wrote:
> Here's a place where it would be nice to just configure at the same time the
> track is created.  Reduces the state space.

Acknowledged.
Its on the TODO list.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.cc:118:
cbk.Run(MediaResult::UNSUPPORTED_CONFIG);
On 2015/11/04 23:43:33, jeffbrown wrote:
> Assuming clients should have known better before asking for this, just close
the
> connection.
> 
> I presume clients will need to know a-priori what kinds of configs are
supported
> so if they got this far then they already screwed up.

Acknowledged.
In the close-the-connection TODO bucket.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.cc:144: int32_t  N =
static_cast<int32_t>(configuration->audio_frame_ratio);
On 2015/11/04 23:43:33, jeffbrown wrote:
> numerator / denominator, or num / denom
> 
> Also, seeing int32_t and uin32_t used together for these fractions makes me
> worry that the wrong types might get used someplace (possibly in the client if
> not here).  We spoke earlier of using a separate bool to encode whether
playback
> is forward or reversed.  I think that might be worth exploring further in
future
> patches.
> 
> (Or make both the numerator and denominator signed at the cost of 1 bit of
> precision.)

re: future efforts...
Precision is important here.  I am loathe to sacrifice it.  I am much more
inclined to add the bool to gain another bit of precision rather than losing one
and adding redundant states.

Before I go either route, I want some consensus on whether or not signed scaling
is important at all or not.  For a generalized linear transformation library, it
is.  For media, it can be, but only in the context of reverse trick play (a
phrase which makes most people make "ewwwww, yucky!" faces) and usually requires
a ton of tricksy-ness at the player level to get right anyway (where you can
make an argument that perhaps the timestamps should just be re-written there;
the one case where they should be).

It should be noted that, with some minor modification, the sampler core here
could actually (rather easily) play audio backwards without needing to re-order
the samples in the buffer or re-write timestamps.  I'm probably the only person
who thinks that is cool, however, so...

re: numerator and denominator..
Dude... You are killing me....  The values came from audio_frame_RATIO and
media_time_RATIO.  The are then fed into a LinearTransform::***RATIO*** and
multiplied by other RATIOs.

If you don't know what N and D mean in this context, ffs how did you manage to
pass the Google interviews?

Done. (but my soul is crying inside).

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.cc:204: format_ = cfg->Clone();
On 2015/11/04 23:43:33, jeffbrown wrote:
> Looks to me like the config is just going to be dropped on the floor when we
> return.  So we might as well just Pass() it and take ownership.

Done.

The fact that this is even allowed is a bit surprising/disturbing.  Had an
interesting conversation with Vardhan about it.  Someday over drinks, we should
discuss the ramifications which allowing this might have on reaching a minimal
copy/minimal heap allocation world when it comes to marshalling parameters
across mojom interfaces.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.cc:252: // something about this?
On 2015/11/04 23:43:33, jeffbrown wrote:
> DCHECK!  We failed an internal invariant.

Done.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.cc:256: void
AudioTrackImpl::OnPacketReceived(AudioPipe::AudioPacketRefPtr packet) {
On 2015/11/04 23:43:33, jeffbrown wrote:
> Do we need to DCHECK the packet?

no, but it does not hurt to do so.  I have been called out by many engineers
many times for DCHECKing pointers which I am just about to de-reference.  Their
argument is that de-reffing NULL is just going to crash anyway (and mine is that
the DCHECK/ASSERT implementation usually produces much nicer output)

Done.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.cc:265: DCHECK(output);
On 2015/11/04 23:43:33, jeffbrown wrote:
> I feel like this DCHECK is kind of redundant given we're already enforcing
> invariants on what goes into the outputs list, but it's not harmful.

Acknowledged.

Can't have it both ways.  Either we are checking at levels like this, or we are
not.  If we are it says, and I add the DCHECK(packet) above.  If not, I strike
both of them.

I lean in the cautious direction, so DCHECKs all around!

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_track_impl.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.h:28: static constexpr size_t
PTS_FRACTIONAL_BITS = 12;
On 2015/11/04 23:43:33, jeffbrown wrote:
> Maybe make a new fixed point datatype instead.

perhaps, but I would rather not at the moment.  I can make an entire templated
fixed point math library, but right now my use cases require nothing that
sophisticated.

I will put it on the list of things to consider doing eventually (esp. as the
team grows and we have more bandwidth) but right now I think we have much more
important things to do.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.h:35: void Describe(const
DescribeCallback& cbk) override;
On 2015/11/04 23:43:33, jeffbrown wrote:
> These should be private.

Done.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.h:68: AudioTrackImplWeakPtr     
weak_this_;
On 2015/11/04 23:43:33, jeffbrown wrote:
> Using lots of typedefs makes it harder to understand the semantics of these
> values without having to grep the header files.  Is this a WeakPtr?  I'm not
> sure.  You might want to tone it down a bit before this turns into Hungarian
> notation.

I'm not sure of exactly what you are concerned about here...  Is it the type
name or the member variable name?

Re; the type name...
I am attempting to follow the mojo style here.  They (frequently) generate
pointer types for their smart pointers to their interfaces and other objects. 
It follows the form MyObjectOrInterfacePtr.

Re: Is this a WeakPtr?
If you are asking about the type name, I had hoped that...
AudioTrackImpl**WeakPtr** had made it clear that this type was a weak pointer to
an AudioTrackImpl.

Re: typedefs.
The relationships between objects in this system mandates that many of the
objects hold various references and pointers to other objects in the system. 
Unless I wanted to end up in circular include file hell (is there a special
circle for that sin?), this requires many forward declarations, which is why I
centralized it all in the fwd_decls.h header.  I went ahead and defined types
for pointers and weak pointers and sets and the like because..
1) It seems to be the local mojo style.
2) The std pointer are collections are great!  Their syntax gets to be a little
gross (go ahead and type    std::set<std::shared_ptr<AudioTrackToOutputLink>,
std::owner_less<std::shared_ptr<AudioTrackToOutputLink>>>;
five times fast.)
3) Using typedefs help to enforce top level ownership design decisions.  Is this
supposed to be a unique_ptr or a shared_ptr?  The decision has already been made
for you, and is enforced for you.  In the unlikely case that we want to change
that decision, or switch from std::shared_ptr to mojo::even_better_shared_ptr,
we can do it in one place instead of all over the place.

Finally, re: weak_this_
This is an unfortunate side effect of the non-intrusive nature of shared/weak
pointers in the std:: library.  Sorry.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_impl.h:70: Binding<AudioTrack>        binding_;
On 2015/11/04 23:43:33, jeffbrown wrote:
> alignment is a little wonky here
> 
> More importantly, I don't think git cl format will let you get away with
this...

yeah, this happened because of the global rename from Sink to Track.  It has
been fixed.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_track_to_output_link.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_to_output_link.cc:53: // queued in.
On 2015/11/04 23:43:34, jeffbrown wrote:
> FWIW, we might actually find it worthwhile to push all packets through the
audio
> output itself as usual even when flushed.  Just mark them as flushed so that
> they can skip the main processing stages.
> 
> That way there will be one consistent path by which packets are queued,
> processed, and retired.  I've found this kind of pattern useful for other
> pipelines and it's no more expensive in the end.

As discussed F2F yesterday, this will add a ton of statefulness to the front end
of the flush operation.  We can consider it later on, but I still feel that I
can make number a strong argument for handling it like this, including (but not
limited to)

1) Performance.
2) Centralization of implementation (and reducing the responsibilities of all
the various output implementations)
3) General complexity and maintainability.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
File services/media/audio/audio_track_to_output_link.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/au...
services/media/audio/audio_track_to_output_link.h:34: // system.  No assumptions
may be made about threading when destructing.
On 2015/11/04 23:43:34, jeffbrown wrote:
> This could be amended by making the cleanup signal come from only one place. 
> Might be worth exploring in a future patch.

Acknowledged.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/fw...
File services/media/audio/fwd_decls.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/fw...
services/media/audio/fwd_decls.h:21: using AudioOutputPtr =
std::shared_ptr<AudioOutput>;
On 2015/11/04 23:43:34, jeffbrown wrote:
> Are we actually taking advantage of the fact that these pointer are reference
> counted or do we only need weak semantics?  If the latter, consider using
> base::WeakPtr perhaps.

I'm not sure I completely understand your question...  One or more strong
references  to the Output and Track objects exist in a functioning system, as
well as (potentially) many weak ones.  Whether or not this is implemented using
a reference count or faerie dust seems like an implementation details to me.

The strongest argument I have received from James R. about using base:: pointer
classes is that they are instrusive and save allocation overhead (an argument
which I find near and dear to my heart).  However, whatever form the bookkeeping
takes (reference count, faeries, whatever), in order to have weak semtantics, it
cannot be part of the referenced object itself (otherwise, how would the object
go away, but still allow the weak references to exist).

Assuming that this is correct, I am also trying to abide by the directive "to
use std:: c++11 wherever I can and to use base:: only when there is not
equivalent functionality in c++11 std::".

So: Help me to understand what base::WeakPtr does which std::weak_ptr does not. 
Right now, using std:: seems to give me...

1) standardized shared/weak interface semantics.
2) interoperability with other components of the std:: library (see
std::owner_less being used as my sorting predicate for std:: container classes
below)
3) No more overhead (to my knowledge) than another implementation would require
(note my assertion that some sort of out-of-object bookkeeping seems like a
requirement)

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
File services/media/audio/platform/generic/mixer.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixer.cc:19: if (!src_format) { return
nullptr; }
On 2015/11/04 23:43:34, jeffbrown wrote:
> Seems worthy of a DCHECK.  Otherwise the caller has to deal with nullptr which
> might be worse.

which he will immediately DCHECK about :)

I'll add another one here though.  This operation only happens when adding a
Track to an Output, so DCHECKs all around!

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
File services/media/audio/platform/generic/mixer.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixer.h:32: static MixerPtr Select(const
LpcmMediaTypeDetailsPtr& src_format,
On 2015/11/04 23:43:34, jeffbrown wrote:
> This is looking much cleaner than the first patch.  Nice.

Acknowledged.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixer.h:65: // How much to increment the
fractional sampling position for each output
On 2015/11/04 23:43:34, jeffbrown wrote:
> Is this actually a fraction?  Like a fixed point value of some kind?

yes, that is exactly what it is.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
File services/media/audio/platform/generic/mixers/mixer_utils.h (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/mixer_utils.h:19: // Template to
read samples and normalize them into signed 32 bit integers.
On 2015/11/04 23:43:34, jeffbrown wrote:
> What you're not saying in this comment is that those signed 32 bit integers
> actually contain 16 bit samples.  That was a little surprising to me.

because they don't  They contain either signed 16 bit samples or unsigned 8 bit
samples, normalized into signed 32 bit samples.  Someday, this can be extended
to normalize signed/unsigned 24 bit samples, unsigned 16 bit samples, and so on.

The decision for now is to SW scale and mix using 32 math for everything.  As
supported sample formats grow, more specialized mixer implementation will
probably start to show up which deal with more expensive internal
representations of audio (double precision floating point audio anyone?)

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/mixer_utils.h:30: register SType
tmp = static_cast<const SType*>(src)[0];
On 2015/11/04 23:43:34, jeffbrown wrote:
> *src instead of src[0]?
> 
> I don't see why you need the static cast here given that the argument is
already
> of the same type.

Done.

I think that this is a leftover from an earlier form of this.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/mixer_utils.h:43: return
static_cast<int32_t>(static_cast<const SType*>(src)[0]);
On 2015/11/04 23:43:34, jeffbrown wrote:
> Why isn't this good enough?  (Of course even the cast is redundant here.)
> 
> return static_cast<int32_t>(*src);

Done.

it is, see above.  I think this was leftover from some ancient proto form of
this code.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/mixer_utils.h:47: // Template to
read normalized source samples, and combine channels if required.
On 2015/11/04 23:43:34, jeffbrown wrote:
> Doesn't this need a downmix matrix in the general case?

Absolutely.  Its on the feature list, I'm not sure what form an efficient
implementation will eventually take.  Even the implementation here may be
considered to be wrong in some cases... For example, when going from 2 -> 1
channels, I average the signal.  The proper thing to do may be to add and
saturate, I'm honestly not sure.

When we give down/up-mix matrices a more formal treatment, all of this will need
to be done more rigorously (and compliance tests will need to be
defined/written)

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/mixer_utils.h:52: class SrcReader;
On 2015/11/04 23:43:34, jeffbrown wrote:
> SrcReader -> SampleReader?

its not really a generalized sampler reader, however.  SampleNormalizer is what
fills that role.  This template is specially designed to read one, or both of a
source's input channel, and either combine (downmix) them into 1 channel, or
keep them as independent channels.

IOW - its definition is dependent on the concept of the parameters of both the
source and the destination, and it specifically samples the source and puts it
into a format compatible with the destination.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/mixer_utils.h:85: template
<typename DType, typename Enable = void> class DstConverter;
On 2015/11/04 23:43:34, jeffbrown wrote:
> DstConverter -> SampleWriter?

This is not a writer at all.  This is a template which (as the name implies)
converts from a normalized sample (signed 32 bit is what we are using here) to a
destination format (in the case of signed int16, this is a case, in the case of
unsigned 8, this is a shift and offset operation).

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/mixer_utils.h:120: DoAccumulate ==
false,
On 2015/11/04 23:43:34, jeffbrown wrote:
> Instead of testing for DoAccumulate == false maybe have the caller treat
> accumulation as a special case, where the initial value of "sample" is
> initialized to the normalized destination sample.  Otherwise I don't see what
> this particular template is buying you.

Bear in mind, the testing here is happening at compile time.  The purpose of
this template is to allow us to produce two versions of the Mix function; one
which reads from the dest, normalizes the sample, sums with the provided sample
and then clips (the next specialization), and this version which becomes a no-op
(just take the provided sample is, don't actually accumulate/mix).

The invocation ends up looking something like...

*dst = DstMixer<DoAccumulate>(dst, sample);

One version becomes the complicated read, normalize, add, threshold,
de-nomrmalize, writeback.  The other becomes de-normalize and write.

So, why the specialization?
Two main reasons.  First, it is the style I am using here.  Style consistency
uber alles!  Second, I could write the following...

if (DoAccumulate) {
  complicated;
} else {
  less_complicated;
}

Since DoAccumulate is compile time constant, any level of optimization should
produce something equivalent to what I have here.  That said, its not a
guarantee (small comfort considering how much we are depending on compiler
optimization here), and -O0 may actually be forced to keep the check
like-it-or-not.

This form removes the ambiguity.  There will be no branch based on DoAccumulate
no matter what.  Its not a huge thing, but whatever.  We are already deep in
template-meta-hell, whats a little more?  :P

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/mixer_utils.h:135: static inline
int32_t Mix(const DType* dst, int32_t sample) {
On 2015/11/04 23:43:34, jeffbrown wrote:
> I'll admit to being deeply confused by this function.  It has a parameter call
> dst that we are reading from.  Then it converts the summed output into a value
> of the destination type and widens it to int32_t.  So the input is normalized
> but the output is in the destination space.  Kind of weird...
> 
> The clamping seems to be done on a sample-by-sample basis but we should
probably
> do this at the very end after looping over all samples to be accumulated.

See above.

Clamping on a sample by sample basis is not an option here; our output buffer is
of destination sample size.  We would need to have allocated an intermediate
buffer to hold the 32 bit normalized results, and we have not here.

As we start to optimize these mixer, especially to use vector instruction sets
(SSE/Neon) and intermediate buffer is almost certain.  I leave that as an
optimization for the future.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
File services/media/audio/platform/generic/mixers/no_op.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/no_op.cc:28: frames_produced =
(dst_frames - *dst_offset);
On 2015/11/04 23:43:34, jeffbrown wrote:
> The mixer functions themselves might be a little simpler if the caller did all
> of this clamping for them.  After all, it should know whether an overflow will
> happen.
> 
> Although I could see some argument for letting the mixer calculate it as it
goes
> since we'll probably determine these results while the mixer's loop executes
too
> (at least for a non-noop mixer).

I'm confused; there is no sample clamping going on here.  The goal (and what is
going on here) is to let the mixer/filter say how much of the input it has
consumed and how much of the output it has produced.  The reason for this is
that as we add more complicated temporal filters, the understanding of this
relationship really belong in the mixer class, and not in the outer loop.  The
outer loop should just keep giving the mixer more data until it has produced all
the sample it needs, or the mixer has decided that it has seen all of the input
it is interested in.

FWIW - I have some work in progress to allow filters of non-epsilon width, and
as part of that work, this NoOp Mix implementation actually disappears.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/no_op.cc:34: return
(*frac_src_offset >= frac_src_frames);
On 2015/11/04 23:43:34, jeffbrown wrote:
> I don't see why we need to return this value since it's something the caller
> could determine directly.

See above; the caller needs to know details about the filter implemented by the
mixer to actually do that.  The filters implemented so far are trivial, but they
will not always be so.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
File services/media/audio/platform/generic/mixers/point_sampler.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/point_sampler.cc:72: uint32_t S =
(soff >> AudioTrackImpl::PTS_FRACTIONAL_BITS) * SChCount;
On 2015/11/04 23:43:34, jeffbrown wrote:
> variables shouldn't be uppercase

Done.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/mixers/point_sampler.cc:75: for (size_t D
= 0; D < DChCount; ++D) {
On 2015/11/04 23:43:34, jeffbrown wrote:
> We could handle all of the specialization for accumulation at this level
instead
> of burying it down in DM::Mix.  We know the function just turns into reading
> from Dest and doing an extra mix.

I disagree.  I don't want to write two versions of this loop which differ only
in whether they overwrite or accumulate and saturate.  This approach produces
the same code with no cut-n-paste.

Adding a pseudo-runtime check could avoid the duplication (something like "if
(DoAccumulate) ...") but leaves to door open for the compiler to not actually
optimize.  I realize that this is a small argument given how much we are
depending on the optimizer here, but a -O0 build might actually leave the branch
in (for no good reason).

The style here is to do everything with compile time template specialization, I
feel that this is consistent with that style.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
File services/media/audio/platform/generic/standard_output_base.cc (right):

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/standard_output_base.cc:137: for (auto
iter = links_.begin(); iter != links_.end(); ) {
On 2015/11/04 23:43:34, jeffbrown wrote:
> I'm actually kind of surprised that you designed the mixer with an outer loop
> over the tracks instead of over the outputs.
> 
> This means as we go through the tracks, we have to accumulate all of the
samples
> into the destination buffer which may cause a loss of precision and bad
clipping
> artifacts.
> 
> With that in mind, perhaps we should only support accumulation buffers that
have
> a normalized representation (say, int32_t), and convert them into the
> destination representation at the very last moment.
> 
> That should make your Mix functions a little simpler since they won't have to
> deal with converting the accumulated samples to a destination format
> immediately.  They can do all of the math in the normalized domain.
> 
> Another alternative might be to try to exchange the order of the loops but I
> suspect that the bookkeeping involved in doing that efficiently might be a
> little much!

Keep in mind that the outputs are independent and mix in parallel.  They also
have completely independent timing/buffering requirements.  We never loop over
outputs.

That said, I suspect that picking a canonical output sample format, or small set
of them (int32_t/double) and separating the mix from the clip operation is a
likely direction that this will end up moving in.  Things get a bit more
complicated when 24 bit audio comes in (int32_t internal allows a max of 127
tracks before internal clipping becomes a possibility), and when
greater-than-unity gain is permitted on input tracks (something which Dale has
expressed interest in supporting).  At this point, we are looking at int64_t as
our intermediate format, and its starting to look seriously expensive.

TL;DR?  I expect that an intermediate mix buffer will be coming soon, its just
not here yet.

Finally; re: permuting the order of the loops..

If what you are trying to say is that you are puzzled why I would write
foreach(track : tracks_)
  foreach(packet : track.pending_packets_)

I'm not sure I see a way to permute the order of these loops.  The
pending_packet_ concept would need to be homogeneous across the loops and it is
not.  Even if it was, I would still go over tracks first in order to implement
priority properly (something which Carlos has requested).  His request (and I
think it is reasonable/good) is that if this starts to get so overloaded because
of the number of tracks in the system, that we cut-n-run during the mix
operation.  This means that low priority tracks go silent, but we do not
underflow our output DMA.  In the first model, we punish low priority tracks
when we are overloaded.  In the second, we punish all tracks.  Carlos asserts
that this is what Apple does and Microsoft does not, which is one of the reasons
he preferred the Apple audio system over the system offered by MSFT.

https://codereview.chromium.org/1424933002/diff/60001/services/media/audio/pl...
services/media/audio/platform/generic/standard_output_base.cc:304: bool
consumed_source = info->mixer->Mix(buf,
On 2015/11/04 23:43:34, jeffbrown wrote:
> As designed, we're going to have problems replacing the point sampler with one
> that does interpolation.  An interpolating sampler may need access to data
that
> comes from multiple regions within the packet (or from prior packets depending
> on how the audio got sliced up).  In order words, the sampler will need access
> to a window of samples within the track, not just one.
> 
> To work around this problem, consider passing the mix function a random access
> iterator for accessing the Nth sample of the input rather than a pointer to
it. 
> You'll have to define what the window size is upfront though.  The window
might
> need to depend on properties of the timeline transformation.
> 
> Alternately, it's possible that copying a moving window's worth of the input
> into a linear buffer will actually end up being faster than trying to process
it
> in small chunks when crossing region boundaries.  (Best of both worlds,
assuming
> the windows are small compared to typical regions so we don't have to deal
with
> overlap very often.)

Acknowledged.

This is a complicated optimization subject.  I don't know where we will
eventually end up, but right now I am trying to pursue a model where we minimize
copies (one of Dale's primary concerns).  I have a WiP of a linear interpolation
resampler which holds just the last sample from the last region/packet, and
resamples in place for everything else.  As filters grow wider, and buffers grow
smaller (to hit extreme low latency requirements) the overhead of handling the
edges of the system is going to start to grow, however.  Pursuing a strategy of
copying into a temporally contiguous input buffer may become the most efficient
thing to do.  This could also serve to allow simpler implementations of graceful
ramp down/up after/before discontinuities and still preserve the ability to
allow sparse audio, something that can really benefit an application like the
announcer.

We'll see what the future holds.  Generally, optimizing the core mix loops is
going to be a full-time job for someone someday soon.

dalesat

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom#newcode20 mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/06 02:20:23, johngro wrote: > On ...

5 years, 1 month ago (2015-11-06 06:14:12 UTC) #6

johngro

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom#newcode20 mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/06 06:14:11, dalesat wrote: > On ...

5 years, 1 month ago (2015-11-06 20:24:25 UTC) #7

dalesat

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom#newcode20 mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/06 20:24:25, johngro wrote: > On ...

5 years, 1 month ago (2015-11-06 21:03:57 UTC) #8

johngro

https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom File mojo/services/media/audio/interfaces/audio_track.mojom (right): https://codereview.chromium.org/1424933002/diff/60001/mojo/services/media/audio/interfaces/audio_track.mojom#newcode20 mojo/services/media/audio/interfaces/audio_track.mojom:20: bool supports_pull_transport; On 2015/11/06 21:03:56, dalesat wrote: > On ...

5 years, 1 month ago (2015-11-09 20:25:16 UTC) #9

johngro

5 years, 1 month ago (2015-11-11 02:35:56 UTC) #11

Message was sent while issue was closed.

Committed patchset #8 (id:130001) manually as
1cff1d372574761491c915137cab8928036b5deb (presubmit successful).

Issue 1424933002: Add an initial revision of an audio server. (Closed)

Description

Patch Set 1 #

Patch Set 2 : #

Patch Set 3 : #

Patch Set 4 : refactor MixerKernel into a class to prepare for the addition of a linear interpolation sampler #

Patch Set 5 : #

Patch Set 6 : #

Patch Set 7 : final rebase and trybot preflight #

Patch Set 8 : fix issues discovered with initial preflight #

Messages