Tag Archives: audio

Improving The Quality Of Your Video Compositions With Creative Cloud

I’ve been spending a lot of time with Adobe video tools lately… everything from videos for the blog, to promotional videos, to help/technical videos.  Here are two topics that beginners in video production need to think about: audio processing and color correction.

First, you can make so-so video look great with a few simple color correction techniques. Second, a video is only as good as its audio, so you need solid audio to keep viewers engaged.  Hopefully this post helps you improve your videos with simple steps on both of these topics.

To give you an idea what I’m talking about, check out this before and after video. It’s the exact same clip played twice.  The first run through is just the raw video straight from the camera and mic.  Colors don’t “pop”, it’s a little grainy, and the audio is very quiet.  The second run through has color correction applied to enhance the visuals, and also has processed audio to enhance tone, increase volume, and clean up artifacts.

Let’s first look at color correction.  Below you can see a “before” and “after” still showing the effects of color correction.  The background is darker and has less grain, there is more contrast, and the colors are warmer.

Before and After - Color Correction

The visual treatment was achieved using two simple effects in Adobe Premiere Pro.  First I used the Fast Color Corrector to adjust the input levels.  Bringing up the black and gray input levels made the background darker and reduced grain in the darker areas.  Then, I applied the “Warm Overall” Lumetri effect, which enhances the reds to add warmth to the image.

Color Correction Effects in Adobe Premiere

You can enhance colors even further using color correction tools inside of Premiere Pro, or open the Premiere Pro project directly within SpeedGrade for fine tuning.

Next, let’s focus on audio…

You can get by with a mediocre video with good audio, but nobody wants to sit through a nice looking video with terrible audio. Here are three simple tips for Adobe Audition to help improve your audio, and hopefully keep viewers engaged.

In this case, I thought the audio was too quiet and could be difficult to understand.  My goal was to enhance audio volume and dynamics to make this easier to hear.

I first used Dynamics Processing to create a noise gate. This process removes quiet sounds from the audio, leaving us with the louder sounds, and generally cleaner audio.  You could also use Noise Reduction or the Sound Remover effects… the effect that works best will depend on your audio source.

Dynamics Processing (Noise Gate) in Adobe Audition
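
As a rough illustration of what a noise gate does (this is not Audition's actual algorithm), the sketch below silences any sample whose magnitude falls below a threshold. The function name `noiseGate` and the threshold value are assumptions made for the example. A real gate follows a smoothed level with attack and release times rather than judging individual samples.

```javascript
// Hypothetical, simplified noise gate: zero out samples quieter than a threshold.
// Real gates track a smoothed level and fade in/out; this only shows the idea.
function noiseGate(samples, threshold) {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Keep the sample only if its magnitude reaches the threshold.
    out[i] = Math.abs(samples[i]) >= threshold ? samples[i] : 0;
  }
  return out;
}

// Quiet hiss (0.01, -0.02) around a louder burst (0.5, -0.4).
const gated = noiseGate(Float32Array.from([0.01, 0.5, -0.4, -0.02]), 0.05);
console.log(Array.from(gated));
```

Setting the threshold too high would also cut the quiet tails of words, which is one reason the best effect depends on the audio source.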

Next I used the 10-band graphic equalizer to enhance sounds in specific frequency ranges.  I brought up mid-range sounds to give more depth to the audio track.

10 Band EQ in Adobe Audition

Finally, I used the Multiband Compressor to even out the dynamic range of the audio.  Quieter sounds were brought up and louder sounds were brought down to create more level audio that is easier to hear and understand.  However, be careful not to make your audio too loud when using the compressor!  If you’ve ever been watching TV and the advertisements practically blow out your eardrums, this is because of overly compressed audio.

Multi-band Compressor in Adobe Audition
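
To make "quieter sounds brought up, louder sounds brought down" concrete, here is a minimal sketch of a single-band compressor curve working on levels in decibels. It is not how Audition's Multiband Compressor is implemented; `compressDb` and all of the settings below are assumptions chosen for illustration.

```javascript
// Hypothetical static compressor curve working on levels in decibels (dBFS).
// Levels above the threshold are reduced by the ratio; make-up gain then
// raises everything, so quiet passages end up louder than before.
function compressDb(levelDb, thresholdDb, ratio, makeupDb) {
  const over = levelDb - thresholdDb;
  const compressed = over > 0 ? thresholdDb + over / ratio : levelDb;
  return compressed + makeupDb;
}

// Threshold -20 dB, 4:1 ratio, 6 dB of make-up gain.
const quiet = compressDb(-40, -20, 4, 6); // below threshold: only make-up gain applies
const loud = compressDb(-4, -20, 4, 6);   // above threshold: squeezed, then raised
console.log(quiet, loud); // -34 -10
```

With these settings, the gap between the quiet and loud passages shrinks from 36 dB to 24 dB, which is the "more level" result described above.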

Want to learn more?  Don’t miss the Creative Cloud Learn resources to learn more about all of the Creative Cloud tools – the learning resources are free for everyone! If you aren’t already a member, join Creative Cloud today to access all Adobe media production tools.

Sound Remover in Adobe Audition CC

The update to Creative Cloud that is coming in June is loaded with awesome tools and incredible new features. I recently demonstrated shake reduction in Photoshop, which can greatly enhance photos that are blurred from a shaky camera, but that’s not the only great update coming in June. Another feature that I wanted to highlight is the Sound Remover process in Adobe Audition CC.

The Sound Remover enables you to select specific sound frequencies and patterns, and remove them from a sound file/composition. Imagine that you have a great recording which was ruined by a cell phone ringing, birds chirping in the background, or someone slamming a door. Now it is possible to easily remove those specific frequencies and patterns without losing or damaging the entire audio file. Check out the video below for an example.

In a nutshell, the process is this:

  • In the Spectral Frequency Display, use the paintbrush selection tool to select the frequencies and patterns you want to remove.
  • Go to the Effects menu and select Noise Reduction/Restoration -> Learn Sound Model.
  • Select the clip/segments that you want to be affected.
  • Go to the Effects menu and select Noise Reduction/Restoration -> Sound Remover (process).
  • Next, change the settings as appropriate for your sounds/compositions, and hit “Apply”.

Sound Remover

You can learn more about some of the other great features in Adobe Audition CC in the video below from Adobe TV:

Or, you can learn more about the exciting new features coming in June’s Creative Cloud update on adobe.com and see video previews of a lot of the new Audio/Video features.

Connected Second-Screen App Experiences with PhoneGap & Audio Watermarks

In January 2012, I started the year with a post on multi-screen applications developed with PhoneGap. In that post, I describe an approach for creating mobile applications that extend the app experience onto a second screen – where the mobile device drives the content on that second screen… essentially having an “external monitor” capability on a mobile device.

Mobile App Drives External Screen

Now, I’m going to turn things around… I’ve been experimenting with a few ideas of connected secondary-experience applications, and I figured this would be a great way to come full circle and end 2012. I see the secondary app experience as having huge potential for our connected/media-centric world. The secondary app experience is different in that the application is your “second screen”, perhaps a companion to something else that you are doing. For example, the secondary screen is a mobile application that augments the experience of watching television. Perhaps it is a mobile application that augments the experience of playing a video game, along the same concept as Xbox Smart Glass though not tied to a particular platform. The key element is that the mobile application is not only an augmented experience to the television-based content, but that it is also updated in real time as you watch the program, or as you play the game.

External Screen Drives Mobile App

In this post I’ll show a proof-of-concept second screen experience where the content of a mobile PhoneGap application is being driven by an external source (a video) using audio watermarks. In this case, the mobile application is the “second screen”, and your TV is the primary screen. I’d also like to emphasize that this is just a proof of concept – the methods and code in this example are not yet suitable for a production-quality use case for reasons I’ll describe below, but are a great starting point for further exploration.

The Concept

Let’s start with the core concept: a synchronized experience between a content source (TV or other) and a mobile application. Since we are talking about TV or media-based content, you can’t rely on being able to create a client-server architecture for synchronization between the media source and the mobile app. This just wouldn’t be possible due to legacy TV hardware, and the fact that there is no way to digitally synchronize the content. However, TVs are great at producing sounds, and it is very possible to use sound-based cues to invoke actions within a mobile application.

Now let’s focus on audio watermarks: Audio watermarks are markers embedded within an audio signal. They may be either human-imperceptible or within the range of human hearing. In general, humans can hear frequencies between 20Hz and 20kHz, with that range decreasing with age. While we may not be able to hear the markers, mobile devices are able to detect them. When these markers are “heard” by your device, they can invoke an action within your application.

Next, let’s take a look at my proof of concept application. The proof of concept is a mobile application themed with content from the HBO series Game of Thrones, synchronized with the opening scene from the TV series. As castles & cities are shown in the video, the content within the mobile application is updated to show details about each location. In a nutshell:

Proof of Concept

In the video below, you can see the proof of concept in action. It shows the synchronization between the video and the PhoneGap-based application, with a brief description from yours truly.

Note: I have no association with Game of Thrones, HBO, George R.R. Martin, or the book series “A Song of Ice and Fire“. I just thought this made a compelling example. I enjoyed both the book series and the show and recommend them. Full credit for video and mobile content available.

The Application Implementation

There are several ways that you can do audio watermarks. The most basic is to embed a single tone in the audio stream and check for the presence of that tone. The first thing that I started exploring was how to identify the dominant frequency of a sound. A quick Google search yielded an answer in the first result. This post not only describes how to detect the dominant sound frequency on iOS, but also has a downloadable project on GitHub that you can use to get started. No exaggeration, I had this project up and running within minutes. It operates kind of like a guitar tuner… the application detects the dominant frequency of a sound and displays that in the UI.
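
To show what "dominant frequency" means in practice, here is a minimal JavaScript sketch. It is not the code from the linked iOS project: it uses a naive discrete Fourier transform rather than an FFT, and the function name `dominantFrequency` is an assumption for illustration.

```javascript
// Naive DFT peak search: returns the frequency (Hz) of the strongest bin.
// O(N^2), fine for a short illustrative frame; real code would use an FFT.
function dominantFrequency(samples, sampleRate) {
  const n = samples.length;
  let bestBin = 0;
  let bestPower = 0;
  for (let k = 1; k < n / 2; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += samples[t] * Math.cos(angle);
      im -= samples[t] * Math.sin(angle);
    }
    const power = re * re + im * im;
    if (power > bestPower) {
      bestPower = power;
      bestBin = k;
    }
  }
  return (bestBin * sampleRate) / n;
}

// Build a 1024-sample frame containing a 19 kHz tone and analyze it.
const sampleRate = 44100;
const frame = new Float32Array(1024);
for (let t = 0; t < frame.length; t++) {
  frame[t] = Math.sin((2 * Math.PI * 19000 * t) / sampleRate);
}
const detected = dominantFrequency(frame, sampleRate);
console.log(detected); // within one bin width (about 43 Hz) of 19000
```

The result is only as precise as the bin width (sample rate divided by frame length), so any comparison against a watermark frequency needs a tolerance.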

At a much later time, I also discovered this sample project from Apple, which also demonstrates how to detect frequencies from an audio stream (used for the frequency waveform visualization). This will be useful for maturing the concepts shown here.

Creating Watermarks

Once I had the sample native iOS project up and running, I started exploring inaudible audio “tones” and testing what the devices could accurately detect. I initially started using tones above the 20kHz frequency range, so that humans would not be able to hear the watermark. Tones in the range of 20-22kHz worked great for the iPhone, but I quickly realized that the iPad microphone was incapable of detecting these watermarks, so I dropped down to the 18-20kHz range, which the iPad was able to pick up without any problems. Most adults won’t be able to hear these frequencies, but small children may hear them, and they may drive your pets crazy.

The first thing I did was create “pure” audio tones at specific frequencies using Adobe Audition. In Audition, create a new waveform, then go to the “Effects” menu and select “Generate Tones”. From here, you can create audio tones at any frequency. Just specify your frequency and the tone duration, and hit “OK”. I used 3-second tones to make sure that the tone was long enough to be perceived by the device.

Generate Tones in Adobe Audition
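
For readers without Audition, a pure tone is just a sine wave sampled at a fixed rate. The sketch below generates one in JavaScript; `generateTone` and the amplitude value are assumptions for illustration, and writing the samples out to a wav file is left aside.

```javascript
// Hypothetical helper: generate a pure sine tone as floating-point samples.
// The sample rate must be more than twice the tone frequency (Nyquist),
// so a 19 kHz tone needs at least a 44.1 kHz sample rate.
function generateTone(frequencyHz, durationSec, sampleRate, amplitude) {
  const length = Math.round(durationSec * sampleRate);
  const samples = new Float32Array(length);
  for (let n = 0; n < length; n++) {
    samples[n] = amplitude * Math.sin((2 * Math.PI * frequencyHz * n) / sampleRate);
  }
  return samples;
}

// A 3-second, 19 kHz tone at 44.1 kHz, kept well below full scale to avoid clipping.
const tone = generateTone(19000, 3, 44100, 0.25);
console.log(tone.length); // 132300
```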

I did this for tones in the range of 18-22kHz, and saved each in a separate wav file, some of which you can find in the GitHub sample. These files were used for testing, and were embedded in the final video.

To embed the audio watermarks in the video, I fired up Adobe Premiere and started adding the inaudible tones at specific points in time within the video.

Audio Tones in Adobe Premiere

By playing specific tones at specific times, you can synchronize events within your application to those specific tones. This means that you can reliably synchronize in-app content with video content.

Let me reiterate… this is only a proof of concept implementation. These watermarks worked great locally, but wouldn’t work in a real-world solution as-is. I also ran into a few major issues when embedding the watermarks – see the “lessons learned” below for details.

The PhoneGap Implementation

The next logical step was to take the native code example and turn it into a PhoneGap native plugin so that it can be used within a PhoneGap application. I stripped out the native user interface and exposed an API that would allow the PhoneGap/JavaScript content to register to listen for specific frequencies. If one of these frequencies is detected as the dominant frequency of a sound, the native plugin invokes the JavaScript callback function that is mapped to that particular frequency. Using this approach, a unique JavaScript function can be assigned to each frequency.
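
The mapping of frequencies to callbacks can be sketched in plain JavaScript. This is not the plugin's actual API: `createFrequencyRegistry`, `register`, `dispatch`, the tolerance, and the placeholder content names are all assumptions made for illustration.

```javascript
// Hypothetical JavaScript-side registry: maps a target frequency to a callback
// and fires it when a detected dominant frequency falls within a tolerance.
function createFrequencyRegistry(toleranceHz) {
  const listeners = [];
  return {
    register(frequencyHz, callback) {
      listeners.push({ frequencyHz, callback });
    },
    // Would be called with each dominant frequency reported by native code.
    dispatch(detectedHz) {
      let fired = 0;
      for (const { frequencyHz, callback } of listeners) {
        if (Math.abs(detectedHz - frequencyHz) <= toleranceHz) {
          callback(detectedHz);
          fired++;
        }
      }
      return fired;
    },
  };
}

const registry = createFrequencyRegistry(50);
const shown = [];
registry.register(18000, () => shown.push("location-1"));
registry.register(19000, () => shown.push("location-2"));
registry.dispatch(18990); // close enough to 19000
registry.dispatch(440);   // ordinary audio, ignored
console.log(shown);
```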

The final step was to build a user interface using HTML, CSS, & JavaScript that could respond to the audio watermarks. This was the easy part. First, I created a basic project that showed the reception and handling of specific audio frequencies. Next, I created the actual application content themed around the Game Of Thrones video.

Audio Watermark Enabled Applications

The Final Product

You can view the completed project, the sample tones, and the video containing the embedded tones on GitHub.

Lessons Learned

This was a really fun experiment, and I definitely learned a lot while doing it. Below are just a few of my findings:

Dominant Frequency

Dominant Frequency watermarks are not the way to go in a real-world solution for many reasons. The main reason is that the watermark has to be the loudest and most predominant frequency in the captured audio spectrum. If there is a lot of other audio content, such as music, sound effects, or talking, then the watermark has to be louder than all of the other content, otherwise it will not be detectable. This alone is problematic. If you are normalizing or compressing your audio stream, this can cause even more problems. A multi-frequency watermark that is within the audible range, but is unnoticeable, would be a more reliable solution.

High-Frequency Watermarks

High-frequency watermarks are also problematic. High-pitch frequencies may be beyond the capabilities of hardware devices. Speakers may have problems playing these frequencies, or microphones may have problems detecting these frequencies, as I discovered with the iPad. High-pitch frequencies also may have issues when encoding your media. Many compression formats/codecs will remove frequencies that are beyond human hearing, thus removing your watermarks. Without those watermarks, there can be no synchronization of content.

Time-Duration or Sequential Tones

The current implementation only detects a dominant frequency, without regard to duration. If that frequency is encountered, it triggers the listening JavaScript function regardless of how long the sound was actually being played. All of my experimental tones lasted 3 seconds, so I could ensure it played long enough to be detected. However, I noticed that some of my frequency listeners would be triggered if I slid my mouse across the desk. While the action of moving my mouse across my desk was very brief and I could not hear it, the action apparently generated a frequency that the application could detect. This triggered some of the frequencies that the app was listening for. If there was a minimum duration for the watermark frequency, this erroneous triggering of the event would not have occurred. You could also prevent misfires of audio watermarks by requiring a specific series of tones in sequence to trigger the action.
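
A minimum-duration check could look something like the sketch below, which is not part of the sample project. It assumes the detector reports one dominant frequency per analysis frame; `createDurationFilter` and its parameters are hypothetical.

```javascript
// Hypothetical debouncer: only reports a frequency after it has been the
// detected value for `minFrames` consecutive analysis frames.
function createDurationFilter(targetHz, toleranceHz, minFrames) {
  let streak = 0;
  let fired = false;
  return function onFrame(detectedHz) {
    if (Math.abs(detectedHz - targetHz) <= toleranceHz) {
      streak++;
    } else {
      streak = 0;
      fired = false; // tone ended, so allow a future trigger
    }
    if (streak >= minFrames && !fired) {
      fired = true;
      return true; // trigger exactly once per sustained tone
    }
    return false;
  };
}

const onFrame = createDurationFilter(19000, 50, 3);
// One stray frame (like the mouse on the desk), then a sustained tone.
const results = [19010, 300, 19000, 18995, 19005, 19000].map((hz) => onFrame(hz));
console.log(results); // [false, false, false, false, true, false]
```

The stray first frame never reaches three in a row, so it is ignored, while the sustained tone triggers once.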

Media Production and Encoding

If you are using audio frequencies that are near the upper-range of human hearing, you have to be careful when you encode your media content. If the “inaudible” sound waveforms are over-amplified and are clipped, it has the potential to cause an extremely unpleasant high frequency noise that you can hear. I strongly recommend that you do not do this – I learned this from experience.

Additionally, if you are using high-frequency tones, be careful if you transcode between 16 and 32 bit formats or if you transcode sample rates. Transcoding between 16/32 bit depth or between sample rates can cause the inaudible sounds to become audible with very unpleasant artifacts. I found that I had the best results if the Sequence settings in Premiere, the export format, and the source waveform all had the exact same bit depth (32) and sample rate (44100Hz).


From this exploration, some reading, and a lot of trial and error, I think it would be better to have a multi-frequency watermark for a minimum duration. Rather than having one specific frequency that dominates the audio sample, the application would detect elevated levels of specific frequencies for a minimum period of time. This way the watermark frequencies don’t have to overpower any other frequencies, and the watermark frequencies can be within the normal range of human hearing without being noticed. This also gives you the ability to have significantly more watermarks by using combinations of frequencies. Since the watermark tones would be within the normal range of human hearing, you also would be better able to rely on common hardware to be able to accurately detect those watermarks.
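
One possible way to detect elevated levels of specific frequencies, sketched here as an assumption rather than something from the sample project, is the Goertzel algorithm, which measures the strength of a single frequency without computing a full spectrum. The names `goertzelPower` and `hasWatermark`, the marker frequencies, and the threshold are all illustrative.

```javascript
// Goertzel algorithm: squared magnitude of one frequency bin in a frame.
function goertzelPower(samples, sampleRate, targetHz) {
  const n = samples.length;
  const k = Math.round((n * targetHz) / sampleRate); // nearest DFT bin
  const coeff = 2 * Math.cos((2 * Math.PI * k) / n);
  let s1 = 0;
  let s2 = 0;
  for (let i = 0; i < n; i++) {
    const s0 = samples[i] + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  return Math.max(0, s1 * s1 + s2 * s2 - coeff * s1 * s2);
}

// True only if every watermark frequency reaches the minimum amplitude.
function hasWatermark(samples, sampleRate, frequencies, minAmplitude) {
  return frequencies.every((hz) => {
    const amplitude = (2 * Math.sqrt(goertzelPower(samples, sampleRate, hz))) / samples.length;
    return amplitude >= minAmplitude;
  });
}

// 0.1 s frame: a loud 440 Hz tone plus two quiet marker tones at 3 and 5 kHz.
const rate = 44100;
const mixed = new Float32Array(4410);
for (let i = 0; i < mixed.length; i++) {
  const t = i / rate;
  mixed[i] =
    0.8 * Math.sin(2 * Math.PI * 440 * t) +
    0.05 * Math.sin(2 * Math.PI * 3000 * t) +
    0.05 * Math.sin(2 * Math.PI * 5000 * t);
}
console.log(hasWatermark(mixed, rate, [3000, 5000], 0.02)); // true
console.log(hasWatermark(mixed, rate, [3000, 7000], 0.02)); // false
```

The markers are found even though the 440 Hz tone is sixteen times louder, which a dominant-frequency check could not do. In a real recording, ordinary program audio would also contain energy at these frequencies, so a practical threshold would have to be tuned against it.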


The main conclusion: not only is it really cool to control your mobile app from an audio source, this can be incredibly powerful for connected experiences. There are already TV shows and apps out in the real world employing the audio-watermarking technique to achieve a synchronized multi-screen experience. My guess is that you will start to see more of these experiences in the not-so-distant future. This is an inexpensive low-fi solution that has potential to work extremely well, and has applications far beyond just the synchronization of app content with a TV show.

Here are just a few ideas where this could be applied:

  • TV & Movies: connected app and media experiences
  • Video games: connected “companion” applications to augment the gaming experience
  • Targeted advertising: Imagine you are using an app while in a retail store & you receive advertisements just by being in the store. The watermarks could be embedded within the music playing in the store.
  • Product placement: Imagine that you are watching a movie, and your favorite actor is drinking your favorite soda… you look down at your device, and you also see an advertisement for that same brand of soda.
  • Museums: Imagine you have a mobile app for your favorite museum. While in the museum, there is an audio track describing the exhibits, or just playing background music. When you approach an exhibit, your app shows you details about that exhibit, all triggered by the sound being played within the museum.

The applications of audio watermarking are only limited by our imaginations. This is a low-cost solution that could enable connected experiences pretty much everywhere that you go. The goal of this experiment was to see if these techniques are possible within PhoneGap apps, and yes, they are.

While PhoneGap is a multi-platform solution, you may have noticed that this proof of concept is iOS only. I’m planning on developing this idea further on iOS, and if successful, I’ll consider porting it to other platforms.


iPad designed by Jason Schmitt from The Noun Project
Television designed by Andy Fuchs from The Noun Project

Low Latency & Polyphonic Audio in PhoneGap

UPDATE 10/07/2013: This plugin has been updated to support PhoneGap 3.0 method signatures and command line interface.  You can access the latest at: https://github.com/triceam/LowLatencyAudio

If you have ever tried to develop any kind of application using HTML5 audio that is widely supported, then you have likely pulled all the hair from your head. In its current state, HTML5 Audio is fraught with issues… lack of consistent codec support across browsers & operating systems, no polyphony (a single audio clip can not be played on top of itself), and lack of concurrency (on some of the leading mobile browsers you can only play one audio file at a time, if at all). Even the leading HTML5 games for desktop browsers don’t use HTML5 audio (they use Flash). Don’t believe me? Just take a look at Angry Birds, Cut the Rope, or Bejeweled in a proxy/resource monitor…

The Problem

You want fast & responsive audio for your mobile applications.   This is especially the case for multimedia intensive and/or gaming applications.

HTML5 audio is not *yet* ready for prime-time. There are some great libraries like SoundManager, which can help you try to use HTML5 audio with a failover to Flash, but you are still limited without polyphony or concurrency. In desktop browsers, Flash fixes these issues, and Flash is still vastly superior to HTML5 for audio programming.

If you are building mobile applications, you can have great audio capabilities by developing apps with AIR. However, what if you aren’t using AIR? In native applications, you can access the underlying audio APIs and have complete control.

If you are developing mobile applications with PhoneGap, you can use the Media class, which works great. If you want polyphony, then you will have to do some work managing audio files for yourself, which can get tricky. You can also write native plugins that integrate with the audio APIs for the native operating systems, which is what I will be covering in this post.

Before continuing further, let’s take a minute to understand what I am talking about when I refer to concurrency, polyphony, and low-latency…


Concurrency

Concurrency in audio programming refers to the ability to play multiple audio resources simultaneously.  HTML5 in most mobile devices does not support this – not in iOS, not in Android.  In fact, HTML5 Audio does not work *at all* in Android 2.x and earlier.  Native APIs do support this, and so does PhoneGap’s Media class, which is based on Android MediaPlayer and iOS AVAudioPlayer.


Polyphony

Producing many sounds simultaneously; many-voiced.

In this case, polyphony is the production of multiple sounds simultaneously (I’m not referring to the concept of polyphony in music theory). In describing concurrency, I referred to the ability to play 2 separate sounds at the same time, whereas with polyphony I refer to the ability to play the same sound “on top” of itself. There can be multiple “voices” of the same sound. In the most literal of definitions concurrency could be considered a part of polyphony, and polyphony a part of concurrency… Hopefully you get what I’m trying to say. In its current state, HTML5 audio supports neither concurrency nor polyphony.  The PhoneGap Media class does not support polyphony; however, you can probably manage multiple media instances via JavaScript to achieve polyphonic behavior – this requires additional work on the JavaScript side of things to juggle resources.

Low Latency

Low latency refers to “human-unnoticeable delays between an input being processed and the corresponding output providing real time characteristics” according to Wikipedia.   In this case, I refer to low latency audio, meaning that there is an imperceptible delay between when a sound is triggered, and when it actually plays.   This means that sounds will play when expected, not after a wait.   A bouncing ball sound should be heard as you see the ball bouncing on the screen, not after it has already bounced.

In HTML5, you can auto-load a sound so that it is ready when you need it, but don’t expect to play more than one at a time.  With the PhoneGap Media class, the audio file isn’t actually requested until you invoke “play”.   This occurs inside “startPlaying” on Android, and “play” on iOS.   What I wanted was a way to preload the audio so that it is immediately ready for use at the time it is needed.

The Solution

PhoneGap makes it really easy to build natively installed applications using a familiar paradigm: HTML & JavaScript.   Luckily, PhoneGap also allows you to tie into native code using the native plugin model.   This enables you to write your own native code and expose that code to your PhoneGap application via a JavaScript interface… and that is exactly what I did to enable low-latency, concurrent, and polyphonic audio in a PhoneGap experience.

I created PhoneGap native plugins for Android and iOS that allow you to preload audio and play it back quickly, with a very simple-to-use API.   I’ll get into the details of how this works later in the post, but you can get a pretty good idea of what I mean by viewing the following two videos.

The first is a basic “Drum Machine”.  You just tap the pads to play an audio sample.

The second is a simple user interface that allows you to layer lots of complex audio, mimicking scenarios that may occur within a video gaming context.

Assets used in this example from freesound.org.  See README for specific links & attribution.

You may have noticed a slight delay in this second video between the tap and the actual sounds.  This is because I am using “touchstart” events in the first example, and just using a normal <a href="javascript:foo()"> link in the second.  There is always a delay for “normal” links in all multi-touch devices/environments because there has to be time for the device to detect a gesture event. You can bypass this delay in mobile web browsers by using touch events for all input.

Side Note:  I have also noticed that touch events are slightly slower to be recognized on Android devices than iOS.   My assumption is that this is related to specific device capabilities – this is more noticeable on the Amazon Kindle Fire than the Motorola Atrix.   The delay does not appear to be a delay in the actual audio playback.

How it works

The native plugins expose a very simple API for hooking into native Audio capabilities.   The basic usage is:

  • Preload the audio asset
  • Play the audio asset
  • When done, unload the audio asset to conserve resources

The basic components of a PhoneGap native plugin are:

  • A JavaScript interface
  • Corresponding Native Code classes

You can learn more about getting started with native plugins on the PhoneGap wiki.

Let’s start by examining the native plugin’s JavaScript API.  You can see that it just hands off the JavaScript calls to the native layer via PhoneGap:

var PGLowLatencyAudio = {

    // Each method simply forwards its arguments to the native plugin.
    preloadFX: function (id, assetPath, success, fail) {
        return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "preloadFX", [id, assetPath]);
    },

    preloadAudio: function (id, assetPath, voices, success, fail) {
        return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "preloadAudio", [id, assetPath, voices]);
    },

    play: function (id, success, fail) {
        return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "play", [id]);
    },

    stop: function (id, success, fail) {
        return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "stop", [id]);
    },

    loop: function (id, success, fail) {
        return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "loop", [id]);
    },

    unload: function (id, success, fail) {
        return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "unload", [id]);
    }
};

You would invoke the native functionality by first preloading the audio files BEFORE you need them:

PGLowLatencyAudio.preloadAudio('background', 'assets/background.mp3', 1);
PGLowLatencyAudio.preloadFX('explosion', 'assets/explosion.mp3');
PGLowLatencyAudio.preloadFX('machinegun', 'assets/machine gun.mp3');
PGLowLatencyAudio.preloadFX('missilestrike', 'assets/missle strike.mp3');
PGLowLatencyAudio.preloadAudio('thunder', 'assets/thunder.mp3', 1);

When you need to play an effect you just call either the play or loop functions, passing in the unique sound ID:
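
For example, using the IDs from the preload calls above, the calls could look like the sketch below. Because the plugin only exists inside a PhoneGap app, a small stand-in object (an assumption for illustration, not part of the plugin) records the calls so the snippet can run anywhere.

```javascript
// Stand-in for the plugin object so this snippet runs outside a device;
// in a PhoneGap app, the real PGLowLatencyAudio from the plugin is used instead.
const calls = [];
const PGLowLatencyAudio = {
  play: (id) => calls.push("play:" + id),
  loop: (id) => calls.push("loop:" + id),
  stop: (id) => calls.push("stop:" + id),
};

// Loop the background track, fire a one-shot effect, then stop the loop.
PGLowLatencyAudio.loop("background");
PGLowLatencyAudio.play("explosion");
PGLowLatencyAudio.stop("background");
console.log(calls);
```

Note that 'background' was loaded with preloadAudio, which matters because looping and stopping only work for that asset type on iOS.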


Next, let’s examine some intricacies of the plugin…   One thing to keep in mind is that I do not have callbacks to the PhoneGap app once a media asset is loaded.   If you need “loaded” callbacks, you will need to add those yourself.

preloadFX: function ( id, assetPath, success, fail)

id – string unique ID for the audio file
assetPath – the relative path to the audio asset within the www directory
success – success callback function
fail – error/fail callback function


The preloadFX function loads an audio file into memory.  These are lower-level audio methods and have minimal overhead. These assets should be short (less than 5 seconds). These assets are fully concurrent and polyphonic.

On Android, assets that are loaded using preloadFX are managed/played using the Android SoundPool class. Sound files longer than 5 seconds may have errors, including not playing, clipped content, or not looping – all will fail silently on the device (debug output will be visible if connected to a debugger).

On iOS, assets that are loaded using preloadFX are managed/played using System Sound Services from the AudioToolbox framework. Audio loaded using this function is played using AudioServicesPlaySystemSound. These assets should be short, and are not intended to be looped or stopped.

preloadAudio: function ( id, assetPath, voices, success, fail)

id – string unique ID for the audio file
assetPath – the relative path to the audio asset within the www directory
voices – the number of polyphonic voices available
success – success callback function
fail – error/fail callback function


The preloadAudio function loads an audio file into memory.  These have more overhead than assets loaded via preloadFX, and can be looped/stopped. By default, there is a single “voice” – only one instance that will be stopped & restarted when you hit play. If there are multiple voices (number greater than 1), it will cycle through voices to play overlapping audio.  You must specify multiple voices to have polyphonic audio – keep in mind, this takes up more device resources.

On Android, assets that are loaded using preloadAudio are managed/played using the Android MediaPlayer.

On iOS, assets that are loaded using preloadAudio are managed/played using AVAudioPlayer.
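
The idea of cycling through voices can be sketched in JavaScript as a round-robin pool. The plugin does this in native code, so this is only an illustration: `createVoicePool`, `createPlayer`, and `restart` are hypothetical names, and the players here are stand-ins that just record when they are used.

```javascript
// Hypothetical round-robin voice pool, sketching how several "voices" of the
// same sound could overlap. `createPlayer` stands in for whatever creates a
// native player (the plugin does this in native code, not in JavaScript).
function createVoicePool(voices, createPlayer) {
  const players = [];
  for (let i = 0; i < voices; i++) {
    players.push(createPlayer(i));
  }
  let next = 0;
  return {
    play() {
      const player = players[next];
      next = (next + 1) % players.length; // advance to the next voice
      player.restart(); // when the pool wraps, the oldest voice is restarted
      return player;
    },
  };
}

// Stand-in players that simply record when they are restarted.
const log = [];
const pool = createVoicePool(3, (index) => ({
  restart() { log.push("voice " + index); },
}));
for (let i = 0; i < 4; i++) pool.play();
console.log(log); // ["voice 0", "voice 1", "voice 2", "voice 0"]
```

With three voices, the fourth play reuses the first voice, which is why more voices allow more overlap at the cost of more device resources.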

play: function (id, success, fail)

id – string unique ID for the audio file
success – success callback function
fail – error/fail callback function


Plays an audio asset.  You only need to pass the audio ID, and the native plugin will determine the type of asset and play it.

loop: function (id, success, fail)

id – string unique ID for the audio file
success – success callback function
fail – error/fail callback function


Loops an audio asset infinitely.  On iOS, this only works for assets loaded via preloadAudio.  This works for all asset types for Android, however it is recommended to keep usage consistent between platforms.

stop: function (id, success, fail)

id – string unique ID for the audio file
success – success callback function
fail – error/fail callback function


Stops an audio file.  On iOS, this only works for assets loaded via preloadAudio.  This works for all asset types for Android, however it is recommended to keep usage consistent between platforms.

unload: function (id, success, fail)

id – string unique ID for the audio file
success – success callback function
fail – error/fail callback function


Unloads an audio file from memory.   DO NOT FORGET THIS!  Otherwise, you will cause memory leaks.

I’m not just doing this for myself; the plugin is completely open source for you to take advantage of as well.  You can download the full code, as well as all examples, from GitHub:

UPDATE 10/07/2013: This plugin has been updated to support PhoneGap 3.0 method signatures and command line interface.  You can access the latest at: https://github.com/triceam/LowLatencyAudio