Why Does Audio Latency Matter?

The human ear can distinguish sounds with a granularity of roughly 25 ms (±5 ms). If two sounds occur within 25 ms of one another, our ears hear them as simultaneous; if they are separated by more than 25 ms, we hear two distinct sounds. That’s why audio latency is more of a problem for multi-track audio apps than for other music apps. Since the user is trying to play in time with a backing track, they need to hear their note played “simultaneously” with the drummer playing the beat. Any audible mismatch makes the instrument challenging to play, so even a relatively small delay gets magnified in the user’s ear.

Audio Latency On Android

The problem with low latency audio on Android is not the hardware or even the SDK. Modern mobile devices have more than enough CPU power to play audio. The problem is scheduling (i.e. thread priority). For example, below is a method trace of a Java SoundPool play call. Note the 17 ms delay, and that’s just the SoundPool delay; it doesn’t include touch detection or anything else. Also note that only 2.998 ms of actual work happens during that time. So what’s happening?

SoundPool Systrace

Getting low latency audio on Android isn’t about processing the data faster; it’s about processing the audio buffers consistently. So this isn’t going to be a deep dive into FX algorithms. Rather, it’s all about maintaining thread priority on Android.

Use OpenSL ES, not Java

So why does a 2.998 ms task take 17 ms? A big part of the delay is the Java layer and, specifically, garbage collection. To get low latency audio working, you need to bypass the Java API and go directly to OpenSL ES.
Doing that alone will solve most of your audio latency problems, but it also brings some challenges. You’ll be using a callback buffer to feed audio to the DAC. The buffer is just a subset of the primary audio recording array, representing a short amount of time (as defined by the system; see below). OpenSL feeds the individual samples to the DAC and issues a callback for more data. And that’s where it gets a little tricky: you need to make sure the buffer is ready when OpenSL issues the callback. Otherwise, the DAC has no data and plays silence, which you hear as a pop or glitch.
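The callback loop looks roughly like this. It’s a plain-C sketch of the pattern with the OpenSL ES call stubbed out into a comment (the real code would call Enqueue on an SLAndroidSimpleBufferQueueItf); the buffer sizes, source array, and names are all illustrative, not from any SDK.

```c
#include <stdint.h>
#include <string.h>

#define FRAMES_PER_BUFFER 192   /* device-preferred size, queried at runtime */
#define CHANNELS 2

/* Illustrative source: the full recording, decoded into memory up front. */
static int16_t source[48000 * CHANNELS]; /* 1 second at 48 kHz stereo */
static size_t sourceFrames = 48000;
static size_t cursor = 0;                /* next frame to play */

static int16_t callbackBuffer[FRAMES_PER_BUFFER * CHANNELS];

/* Called by OpenSL ES each time the previous buffer has been consumed.
 * It must return quickly: just copy the next slice and re-enqueue. */
void bufferQueueCallback(void *bq /* SLAndroidSimpleBufferQueueItf */)
{
    (void)bq;
    size_t remaining = sourceFrames - cursor;
    size_t frames = remaining < FRAMES_PER_BUFFER ? remaining : FRAMES_PER_BUFFER;

    memcpy(callbackBuffer, source + cursor * CHANNELS,
           frames * CHANNELS * sizeof(int16_t));
    /* Zero-pad the tail so end-of-source plays silence, not stale data. */
    memset(callbackBuffer + frames * CHANNELS, 0,
           (FRAMES_PER_BUFFER - frames) * CHANNELS * sizeof(int16_t));
    cursor += frames;

    /* Real code: (*bq)->Enqueue(bq, callbackBuffer, sizeof(callbackBuffer)); */
}
```

The design point is that the callback does nothing but copy and re-enqueue. Any decoding, mixing, or allocation has to happen elsewhere, or you’ll miss the deadline.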

Meeting that deadline consistently means running on a high priority thread. Starting with Jelly Bean (4.1), Android added a special high priority thread just for audio, the “FastMixer”, in addition to the pre-existing, lower priority, dedicated audio mixer thread. Consistently using the FastMixer is, in my experience, the single most important factor in getting high performance audio on Android. To get your audio into the FastMixer, start by setting AUDIO_OUTPUT_FLAG_FAST in the audio_output_flags_t parameter of the AudioTrack constructor. Then make sure you meet all of the FastMixer qualifications. The FastMixer thread has more limitations than the normal dedicated audio mixer thread, and if you don’t meet the requirements, you get kicked down to the regular low priority mixer thread. You’ll know your audio has been rejected by the FastMixer when you see this warning:

W/AudioTrack: AUDIO_OUTPUT_FLAG_FAST denied by client

So what can get you rejected by the FastMixer?

Sample Rates

The FastMixer thread can’t use the re-sampler, so you need to set the sample rate correctly. You would think that would be easy: just set it to 44.1k, right? Unfortunately, it doesn’t work like that on Android. The basic problem is device fragmentation. Every device has its own preferred sample rate, as defined by the system, so you can’t assume any industry standard. That means you need to set the sample rate at runtime, which means fetching the device’s preferred sample rate in the Java layer like this:

AudioManager audioManager = (AudioManager) getSystemService(Context.AUDIO_SERVICE);
String deviceSampleRate = audioManager.getProperty(AudioManager.PROPERTY_OUTPUT_SAMPLE_RATE);

Now you need to pass that value to the JNI code and set it explicitly when you set up your player object. You do that by setting the samplesPerSec field of SLDataFormat_PCM. Note that, despite its name, samplesPerSec is in milliHertz, so you have to multiply the rate by 1000.
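As a sketch (the helper name is mine, not from OpenSL ES), the conversion is just a parse and a multiply. OpenSL ES even defines constants such as SL_SAMPLINGRATE_44_1, whose value is 44100000, that make the milliHertz unit explicit.

```c
#include <stdint.h>
#include <stdlib.h>

/* Convert the device's preferred rate (received from Java as a string,
 * e.g. "48000") into the milliHertz unit that SLDataFormat_PCM expects. */
uint32_t toMilliHz(const char *deviceSampleRate)
{
    return (uint32_t)atol(deviceSampleRate) * 1000u;
}
```

So toMilliHz("44100") yields 44100000, the same value as SL_SAMPLINGRATE_44_1.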

If you try to play a recording with a different sample rate, the system automatically invokes the re-sampler. But that really isn’t the direct cause of the delay. Rather, invoking the re-sampler precludes using the super special high priority “FastMixer” audio thread. THAT’S what kills your performance.

BTW – Unless you plan to include a different audio source for each device, you’ll need to re-sample your primary source before buffering. Otherwise it will play back at the wrong speed. Just don’t ask me how I know, k?
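If you do that conversion yourself, a linear-interpolation resampler is the simplest possible sketch. This one is illustrative only (the function name and signature are mine, and a shipping app would use a proper windowed-sinc or polyphase filter for quality):

```c
#include <stddef.h>
#include <stdint.h>

/* Resample a mono 16-bit source from srcRate to dstRate using linear
 * interpolation. Returns the number of output frames written; dst must
 * be large enough to hold roughly srcFrames * dstRate / srcRate samples. */
size_t resampleLinear(const int16_t *src, size_t srcFrames, int srcRate,
                      int16_t *dst, int dstRate)
{
    size_t out = 0;
    double step = (double)srcRate / (double)dstRate;
    for (double pos = 0.0; pos < (double)(srcFrames - 1); pos += step) {
        size_t i = (size_t)pos;
        double frac = pos - (double)i;
        /* Weighted average of the two neighboring source samples. */
        dst[out++] = (int16_t)((1.0 - frac) * src[i] + frac * src[i + 1]);
    }
    return out;
}
```

The point is to do this once, offline or at load time, so the audio path itself never touches a re-sampler.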

Callback Buffer Size

This is another symptom of device fragmentation: like the sample rate, every device has a preferred output buffer size, and getting it wrong excludes you from the FastMixer thread. Also like the sample rate, the value can be read from a system property:

AudioManager audioManager = (AudioManager) getSystemService(Context.AUDIO_SERVICE);
String deviceBufferSize = audioManager.getProperty(AudioManager.PROPERTY_OUTPUT_FRAMES_PER_BUFFER);

Like the sample rate, you’ll need to pass the buffer size to the JNI initialization as a jint. Then use it to allocate the size of the audio callback buffer.
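A sketch of that allocation, with illustrative names: one frame is one sample per channel, so the byte size is frames times channels times bytes-per-sample (2 for 16-bit PCM).

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate the OpenSL ES callback buffer from the device's preferred
 * frames-per-buffer value, passed in from Java as a jint. */
int16_t *allocCallbackBuffer(int framesPerBuffer, int channels, size_t *outBytes)
{
    size_t bytes = (size_t)framesPerBuffer * (size_t)channels * sizeof(int16_t);
    *outBytes = bytes;
    return (int16_t *)calloc(1, bytes); /* zeroed: plays silence if enqueued early */
}
```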

So that’s the key to high performance audio on Android. It’s not about floating point optimized compile options or esoteric DSP algorithms. Rather, it is all about getting on the FastMixer thread and staying there.

My GitHub Portfolio

