Interrupting Sphinx-4 speech recognition in continuous recognition mode

by timvasil 5/14/2011 8:20:00 PM

Sphinx-4 is a speech recognizer developed at Carnegie Mellon University. Out of the box, it offers two modes of operation: batch ("frontend") and continuous ("epFrontEnd"). In continuous mode, it decodes live input, such as audio from a microphone.

Unfortunately for me, epFrontEnd turns the Recognizer.recognize() method into a blocking call, and Sphinx-4's API provides no way of interrupting this method. I find this problematic in various scenarios, such as automated tests. In such a test, I want to determine whether the recognizer recognizes a command correctly, recognizes it incorrectly, or misses it entirely. The "miss" case is the tricky one: recognize() just hangs indefinitely, waiting for more audio input.

I found a way to work around this problem.  It involves inserting a custom data processor into Sphinx-4's data processing stack.

Here's how to do it in three steps:

Step 1:  Implement a custom data processor 

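import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import edu.cmu.sphinx.frontend.BaseDataProcessor;
import edu.cmu.sphinx.frontend.Data;
import edu.cmu.sphinx.frontend.DataEndSignal;
import edu.cmu.sphinx.frontend.DataProcessingException;

// InterruptException is not part of Sphinx-4 or the JDK; it is assumed to be
// an unchecked exception defined elsewhere in the project.
public class InsertableDataBlocker extends BaseDataProcessor
{
    // Thread-safe queue: injectInterrupt() runs on a different thread than getData().
    private final Queue<Data> insertionDatas = new ConcurrentLinkedQueue<Data>();

    @Override
    public Data getData() throws DataProcessingException
    {
        // A pending marker means an interrupt was requested: consume it and
        // abort the recognize() call that is pulling data through the pipeline.
        if (insertionDatas.poll() != null)
        {
            throw new InterruptException();
        }
        return getPredecessor().getData();
    }

    // Call from another thread to make the blocked recognize() call throw.
    public void injectInterrupt()
    {
        insertionDatas.add(new DataEndSignal(0));
    }
}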
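
The class above throws InterruptException, which this post does not define. A minimal sketch of what it could look like, assuming it is a plain unchecked exception (the message text is a placeholder of my own choosing):

```java
/**
 * Hypothetical definition of the exception thrown by InsertableDataBlocker.
 * It is unchecked, so it can propagate out of getData() and recognize()
 * without appearing in their throws clauses.
 */
class InterruptException extends RuntimeException {
    InterruptException() {
        super("recognition interrupted on request");
    }
}
```

Whoever calls recognize() would then catch InterruptException and treat it as "nothing was recognized".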

Step 2:  Add this data processor to the processing stack

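In the Sphinx-4 XML configuration file, place the processor right after the microphone processor in the stack. The name insertableDataBlocker must also be declared as a component in the same file, with the fully qualified class name of InsertableDataBlocker as its type, so that the pipeline item can be resolved.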

    <component name="epFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
            <item>microphone </item>
            <item>insertableDataBlocker </item> 
            <item>dataBlocker </item>
            <item>speechClassifier </item>
            <item>speechMarker </item>
            <item>nonSpeechDataFilter </item>
            <item>preemphasizer </item>
            <item>windower </item>
            <item>fft </item>
            <item>melFilterBank </item>
            <item>dct </item>
            <item>liveCMN </item>
            <item>featureExtraction </item>
        </propertylist>
    </component> 
 

Step 3:  Interrupt the recognize() method when desired

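// Use the same ConfigurationManager instance that created the recognizer, so the lookup returns the blocker that is actually in the pipeline.
ConfigurationManager cm = new ConfigurationManager(getClass().getResource("config.xml"));
InsertableDataBlocker inserter = (InsertableDataBlocker)cm.lookup("insertableDataBlocker");
// Call from a thread other than the one blocked in recognize(); that recognize() call then throws InterruptException.
inserter.injectInterrupt();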
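
For the automated-test scenario, one way to use the interrupt is to run recognize() on a worker thread and inject the interrupt if no result arrives within a timeout. The following is only a sketch of that pattern: FakeRecognizer, recognizeOrMiss, and the nested InterruptException are hypothetical stand-ins so that it runs without Sphinx-4. In a real test, the recognize() and injectInterrupt() calls would go to the Recognizer and the InsertableDataBlocker instead.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class RecognizeWithTimeout {

    /** Stand-in for the unchecked exception thrown by InsertableDataBlocker. */
    static class InterruptException extends RuntimeException {}

    /** Hypothetical stand-in for a recognizer whose pipeline contains the blocker. */
    static class FakeRecognizer {
        private final AtomicBoolean interruptRequested = new AtomicBoolean(false);

        /** Blocks, as recognize() does on silent input, until an interrupt is injected. */
        String recognize() {
            while (!interruptRequested.getAndSet(false)) {
                try {
                    Thread.sleep(5); // simulate waiting for more audio
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            throw new InterruptException();
        }

        void injectInterrupt() {
            interruptRequested.set(true);
        }
    }

    /** Returns the recognized text, or null if nothing arrived within the timeout. */
    static String recognizeOrMiss(FakeRecognizer recognizer, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = recognizer::recognize;
            Future<String> result = worker.submit(task);
            try {
                return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException miss) {
                recognizer.injectInterrupt(); // unblock the worker thread
                return result.get();          // a late result is still returned
            }
        } catch (ExecutionException e) {
            if (e.getCause() instanceof InterruptException) {
                return null;                  // the "miss" case
            }
            throw new IllegalStateException(e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String text = recognizeOrMiss(new FakeRecognizer(), 200);
        System.out.println(text == null ? "miss" : text);
    }
}
```

The interrupt is injected from the test thread while the worker thread is the one blocked in recognize(), which matches how the blocker is meant to be used. A test can then assert on null for the "miss" case instead of hanging.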

Tags:

Java | Speech
