Audio selection tool for Speaker Identification

Executive Summary

The main objective of this application is to generate an audio signature of a speaker by analyzing an audio file, and to store that signature in the database.

To do this, the user works with an existing audio file in which the speaker speaks. In the application, the user selects a few seconds where the speaker speaks clearly and without interruptions, and uses the selected time range to train the system. Once the system has been trained on that range, the user fills in a web form where more information about the speaker can be added.

The web application interface has been built using only HTML5/CSS and JavaScript. The application is invoked with a string, a unique ID, that identifies the audio file it will work with. Using that unique ID and an API, it retrieves the audio file encoded in WAV, OGG or MP3 format (depending on the browser).
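As a rough illustration of the retrieval step, the sketch below picks one of the three formats based on what the browser reports it can play, then builds a URL from the unique ID. The base URL, the URL layout, the preference order and the function names are all assumptions made for this example; the actual API is not described here.

```javascript
// Hypothetical base URL; the real API endpoint is not given in this document.
const API_BASE = "https://example.com/api/audio";

// The three formats named in the text with their usual MIME types.
// The preference order is an assumption.
const FORMATS = [
  { ext: "ogg", mime: "audio/ogg" },
  { ext: "mp3", mime: "audio/mpeg" },
  { ext: "wav", mime: "audio/wav" },
];

// canPlay mirrors HTMLMediaElement.canPlayType, which returns
// "", "maybe" or "probably" for a given MIME type.
function pickFormat(canPlay) {
  const supported = FORMATS.find((f) => canPlay(f.mime) !== "");
  return supported ? supported.ext : null;
}

// Builds the (hypothetical) retrieval URL for a given audio ID and format.
function buildAudioUrl(audioId, ext) {
  return `${API_BASE}/${encodeURIComponent(audioId)}.${ext}`;
}
```

In a browser, the probe could be wired up as `pickFormat((mime) => document.createElement("audio").canPlayType(mime))`. Passing the probe in as a function keeps the selection logic testable outside a browser.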

The app shows a waveform of the retrieved audio, and the user can select part of that waveform, marking a time frame of at least X seconds of audio. The user can listen to the whole file, play from a point in the middle, or play just the selected part. The app then invokes an existing API, passing the audioID and the (start, end) seconds of the selected part, to train the system and add a speaker to it. The waveform is drawn once the audio is loaded. In WebKit-based browsers such as Chrome, the waveform is drawn with wavesurfer.js; in other browsers, a waveform image is fetched by calling an API.
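The selection check and the training call described above could look roughly like the following sketch. The `minSeconds` parameter stands in for the unspecified minimum of X seconds, and the field names of the returned object are hypothetical, since the schema of the existing training API is not given.

```javascript
// Validates a selected time range and builds the body of the training request.
// minSeconds stands in for the "X seconds" minimum, whose value is not specified.
function buildTrainingRequest(audioID, start, end, minSeconds) {
  if (!Number.isFinite(start) || !Number.isFinite(end) || start < 0 || end <= start) {
    throw new RangeError("Selection must satisfy 0 <= start < end");
  }
  if (end - start < minSeconds) {
    throw new RangeError(`Selection must be at least ${minSeconds} seconds long`);
  }
  // Field names are hypothetical; the real API may expect a different shape.
  return { audioID, start, end };
}
```

Rejecting a selection that is too short before calling the API gives the user immediate feedback and avoids a round trip for a request that would fail anyway.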

Once both the audio and the waveform are loaded, the user can analyze the audio. The user can play, pause, fast-forward and rewind the audio using the control buttons. The user can also choose a position to play from simply by clicking on the waveform, or change the play position by dragging the play-position line on the waveform.
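Click-to-seek and dragging the play-position line both come down to converting a horizontal pixel offset on the waveform into a time in the audio. The helper below is a minimal sketch of that mapping; the function name and the clamping behaviour are assumptions, not the application's actual code.

```javascript
// Converts a horizontal offset on the waveform (in pixels) to a playback
// time (in seconds), clamping to the start and end of the audio.
function positionToTime(clickX, waveformWidth, duration) {
  if (waveformWidth <= 0 || duration <= 0) return 0;
  const fraction = Math.min(Math.max(clickX / waveformWidth, 0), 1);
  return fraction * duration;
}

// A click a quarter of the way across a 1000 px waveform of a 120 s file:
positionToTime(250, 1000, 120); // 30
```

In the browser, the result could be assigned to the `currentTime` property of the audio element to move playback to that point.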

About our Client

Software Product and Services