What is Audio Mining?
Have you ever used the app Shazam? If not, it’s great. A song comes on the radio and you think, “Wow! I want to download this song. I wonder who sings it?” You wait to hear the DJ announce it, but you are in the middle of an hour-long, commercial-free set. You’ll never know. But if you have the Shazam app, you just open it, hit ‘tap to Shazam,’ quietly stand by while your smartphone listens, and within seconds you are given the name of the song and the artist.
That is musical audio mining. Shazam reduces the snippet it is listening to into an acoustic fingerprint, a compact summary of the most prominent frequencies in the recording and when they occur, and then searches its database for the song whose fingerprint matches.
Or…
How many times have you called customer service or support and been told, “This call may be monitored or recorded for quality assurance”? Too many to count, right?
That is also audio mining.
Audio mining is the process of searching large volumes of recorded audio for occurrences of specific words and/or phrases.
There are three ways a program can analyze a conversation: Large Vocabulary Continuous Speech Recognition (LVCSR), Phonetic Recognition, and Hybrid Solutions.
Large Vocabulary Continuous Speech Recognition (LVCSR)
LVCSR relies on a database, or dictionary if you will, of words. It uses this database to understand what is being said during a call. You can enhance your LVCSR database with industry-specific terminology or words or phrases that are unique to your organization. You can also add new terms and phrases to your database as needed.
For example, imagine you are a company that makes specialty dog treats and you are considering adding an organic line, but you are not sure whether your customer base is interested in this type of product. You just add “organic” to your database, pick a time period to analyze (say, the last year) and then reprocess the calls recorded during that period. You will be given a report that tells you how many calls in that time frame included the word “organic.”
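The report described above can be sketched in a few lines of Python. This is only an illustration, assuming the LVCSR engine has already turned each recorded call into a text transcript; the Call record, the count_calls_mentioning function and the sample calls are all hypothetical and do not come from any particular product.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class Call:
    call_date: date
    transcript: str  # assumed output of an LVCSR engine (hypothetical)


def count_calls_mentioning(calls, term, start, end):
    """Count calls dated within [start, end] whose transcript contains `term` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return sum(
        1
        for call in calls
        if start <= call.call_date <= end and pattern.search(call.transcript)
    )


# Made-up sample data for illustration only.
calls = [
    Call(date(2023, 3, 1), "Do you have any organic treats for puppies?"),
    Call(date(2023, 6, 15), "My order arrived late."),
    Call(date(2023, 9, 9), "Is the chicken flavor Organic or not?"),
    Call(date(2021, 1, 5), "I only buy organic food."),  # outside the period
]

organic_calls = count_calls_mentioning(
    calls, "organic", date(2023, 1, 1), date(2023, 12, 31)
)
print(organic_calls)  # 2
```

The count is per call, not per mention, which matches the question being asked: how many customers brought the topic up.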
Phonetic Recognition
A system that relies on phonetic recognition does not search for words or phrases, nor does it attempt to understand the meat of the conversation. It strictly searches for the sounds that make up words and language.
It is quicker than LVCSR, but it is less accurate. This system cannot tell apart the different meanings of the word “stock.” It also cannot recognize the difference between “buy,” “bye” and “by.” So imagine you want to search for calls in which your customers say “buy.” You will have to sort through conversations in which a customer says “bye” or “by” as well. You could waste a lot of time listening to calls that won’t provide the insight you were hoping to gain.
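To see why the homophone problem arises, here is a toy sketch in Python. It assumes a small, hand-written pronunciation table (the PRONUNCIATIONS dictionary, with ARPAbet-style symbols) and builds the sound stream from text; a real phonetic engine would derive the sounds from the audio itself, so treat every name here as hypothetical.

```python
# Hypothetical toy pronunciation table, for illustration only.
PRONUNCIATIONS = {
    "buy": ("B", "AY"),
    "bye": ("B", "AY"),
    "by": ("B", "AY"),
    "i": ("AY",),
    "want": ("W", "AA", "N", "T"),
    "to": ("T", "UW"),
    "more": ("M", "AO", "R"),
    "thanks": ("TH", "AE", "NG", "K", "S"),
    "stop": ("S", "T", "AA", "P"),
    "the": ("DH", "AH"),
    "store": ("S", "T", "AO", "R"),
}


def to_phonemes(utterance):
    """Flatten an utterance into one stream of sounds, discarding word boundaries."""
    stream = []
    for word in utterance.lower().split():
        stream.extend(PRONUNCIATIONS[word])
    return stream


def phonetic_hit(utterance, query_word):
    """True if the query's sound sequence occurs anywhere in the utterance's stream."""
    stream = to_phonemes(utterance)
    target = list(PRONUNCIATIONS[query_word])
    n = len(target)
    return any(stream[i:i + n] == target for i in range(len(stream) - n + 1))


utterances = ["i want to buy more", "thanks bye", "stop by the store"]
hits = [u for u in utterances if phonetic_hit(u, "buy")]
print(hits)  # all three utterances match, although only one contains "buy"
```

Because “buy,” “bye” and “by” share the same sounds, a search on sounds alone returns all three calls.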
Hybrid Solutions
Hybrid solutions rely on the best of both of these worlds. They combine a large database of words with phonetic analysis. The result is faster, more reliable search results with better comprehension of the conversation. This system can analyze calls made to your business and organize them into the categories you choose: customer complaint, billing, products…you name it!
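The sorting step might look something like the Python sketch below. It covers only the categorization, assuming the calls have already been transcribed; the CATEGORY_PHRASES table and the categorize function are hypothetical, and a real product would likely use far richer rules than simple phrase matching.

```python
# Hypothetical phrase lists; in practice you would choose your own categories.
CATEGORY_PHRASES = {
    "customer complaint": ["speak to a manager", "unresolved", "same problem"],
    "billing": ["invoice", "charged", "refund"],
    "products": ["organic", "flavor", "ingredients"],
}


def categorize(transcript):
    """Return every category with at least one phrase found in the transcript."""
    text = transcript.lower()
    return sorted(
        category
        for category, phrases in CATEGORY_PHRASES.items()
        if any(phrase in text for phrase in phrases)
    )


print(categorize("I was charged twice and I need to speak to a manager."))
# ['billing', 'customer complaint']
print(categorize("Hello, what time do you open?"))
# []
```

A call can land in more than one category, and a call that matches nothing is left uncategorized rather than forced into a bucket.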
Imagine you receive a report showing a spike in calls containing the following words or phrases: “I need to speak to a manager,” “unresolved,” “same problem.” You can feel fairly certain that your customers are having issues that customer service is unable to resolve. You can then take steps to determine whether someone in customer service is not doing their job, whether your customer service reps need more training to assist customers, or whether there is an issue with your product that needs to be addressed. In any case, you will be able to contain the issue before it spirals out of control.
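One simple way to define a “spike” is to compare the latest week against the average of the weeks before it. The Python sketch below does that; the is_spike function, the factor of 2.0 and the weekly counts are assumptions made for illustration, not how any specific audio mining product works.

```python
from statistics import mean


def is_spike(weekly_counts, factor=2.0):
    """Flag the latest week if it exceeds `factor` times the average of the earlier weeks.

    `weekly_counts` must contain at least one week, oldest first.
    """
    *history, latest = weekly_counts
    if not history:
        return False  # nothing to compare against yet
    return latest > factor * mean(history)


# Made-up weekly counts of calls containing "I need to speak to a manager".
manager_requests = [4, 6, 5, 5, 14]
print(is_spike(manager_requests))  # True: 14 is more than twice the earlier average of 5
```

The threshold is a judgment call: a lower factor catches problems sooner but raises more false alarms.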
Prior to audio mining solutions, businesses often did not become aware of problems until they started losing customers. By then, it was too late.