Interrupting Sphinx-4 speech recognition in continuous recognition mode

by timvasil 5/14/2011 8:20:00 PM

Sphinx-4 is a speech recognizer developed at Carnegie Mellon University.  Out of the box, it offers two modes of operation: batch ("frontend") and continuous ("epFrontEnd").  In continuous mode, it decodes live audio, such as microphone input, as it arrives.

Unfortunately for me, epFrontEnd turns the Recognizer.recognize() method into a blocking call, and Sphinx-4's API provides no way of interrupting this method. I find this problematic in various scenarios, such as automated tests.  In such a test, I want to determine whether the recognizer recognizes the command correctly, incorrectly, or misses it entirely.  The "miss" case is the tricky one, as in this case the recognize() method just hangs indefinitely, waiting for more audio input.  

I found a way to work around this problem.  It involves inserting a custom data processor into Sphinx-4's data processing stack.

Here's how to do it in three steps:

Step 1:  Implement a custom data processor 

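import java.util.LinkedList;
import java.util.List;

import edu.cmu.sphinx.frontend.BaseDataProcessor;
import edu.cmu.sphinx.frontend.Data;
import edu.cmu.sphinx.frontend.DataEndSignal;
import edu.cmu.sphinx.frontend.DataProcessingException;

public class InsertableDataBlocker extends BaseDataProcessor
{
    // Pending interrupt requests.  Access is synchronized because injectInterrupt()
    // is called from a different thread than the one blocked in recognize().
    private final List<Data> insertionDatas = new LinkedList<Data>();

    @Override
    public Data getData() throws DataProcessingException
    {
        synchronized (insertionDatas)
        {
            if (!insertionDatas.isEmpty())
            {
                insertionDatas.remove(0);
                // InterruptException is not defined in this post; it has to be an
                // unchecked exception, since getData() declares only DataProcessingException.
                throw new InterruptException();
            }
        }
        return getPredecessor().getData();
    }

    public void injectInterrupt()
    {
        synchronized (insertionDatas)
        {
            insertionDatas.add(new DataEndSignal(0));
        }
    }
}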
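
To see the mechanism in isolation, here is a minimal sketch that does not depend on Sphinx-4 at all. Every type in it (Stage, InterruptException, InterruptibleStage) is a hypothetical stand-in that I made up for illustration, and it uses an AtomicBoolean flag instead of a list, so treat it as one possible shape for the idea rather than as Sphinx-4 code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-ins for illustration only; none of these types come from Sphinx-4.
class InterruptSketch
{
    /** Hypothetical stand-in for a pipeline stage. */
    interface Stage
    {
        Object getData();
    }

    /** Hypothetical unchecked exception used to unwind the blocked caller. */
    static class InterruptException extends RuntimeException
    {
    }

    /** Passes data through until an interrupt is requested, then throws once. */
    static class InterruptibleStage implements Stage
    {
        private final Stage predecessor;
        private final AtomicBoolean interruptRequested = new AtomicBoolean(false);

        InterruptibleStage(Stage predecessor)
        {
            this.predecessor = predecessor;
        }

        @Override
        public Object getData()
        {
            // getAndSet(false) consumes the request, so only one call throws.
            if (interruptRequested.getAndSet(false))
            {
                throw new InterruptException();
            }
            return predecessor.getData();
        }

        /** Safe to call from any thread. */
        void injectInterrupt()
        {
            interruptRequested.set(true);
        }
    }

    /** Stand-in for a blocking consumer: pulls data until the stage throws. */
    static int pullUntilInterrupted(Stage stage)
    {
        int frames = 0;
        try
        {
            while (true)
            {
                stage.getData();
                frames++;
            }
        }
        catch (InterruptException e)
        {
            // Number of frames that were read before the interrupt arrived.
            return frames;
        }
    }
}
```

The point of the design is that the consumer never has to poll a flag itself: the exception travels up through whatever is pulling data from the stage, which is what lets it escape a call you do not control.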

Step 2:  Add this data processor to the processing stack

In the Sphinx-4 XML configuration file, place the processor right after the microphone processor in the stack.  

    <component name="epFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
            <item>microphone </item>
            <item>insertableDataBlocker </item> 
            <item>dataBlocker </item>
            <item>speechClassifier </item>
            <item>speechMarker </item>
            <item>nonSpeechDataFilter </item>
            <item>preemphasizer </item>
            <item>windower </item>
            <item>fft </item>
            <item>melFilterBank </item>
            <item>dct </item>
            <item>liveCMN </item>
            <item>featureExtraction </item>
        </propertylist>
    </component> 
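
The pipeline refers to the processor by name, so the configuration presumably also needs a component entry that binds that name to the class. A sketch of what that entry might look like, assuming the class lives in a package called com.example (a placeholder of mine, not something from my actual project):

```xml
    <!-- com.example is a placeholder; use the package that actually contains the class -->
    <component name="insertableDataBlocker" type="com.example.InsertableDataBlocker"/>
```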
 

Step 3:  Interrupt the recognize() method when desired

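// Use the same ConfigurationManager that created the recognizer, so that the
// lookup returns the processor instance that is actually in the pipeline.
ConfigurationManager cm = new ConfigurationManager(getClass().getResource("config.xml"));
InsertableDataBlocker inserter = (InsertableDataBlocker)cm.lookup("insertableDataBlocker");
// Call this from a thread other than the one blocked in recognize().
inserter.injectInterrupt();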
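
For the automated-test scenario, the interrupt usually needs to fire on a timeout. Below is a rough sketch of a helper that does this with a ScheduledExecutorService from the JDK. The class name RecognizeWithTimeout and the method callOrInterrupt are hypothetical names of mine, and the helper is deliberately generic so that it does not depend on Sphinx-4.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper; not part of Sphinx-4.
class RecognizeWithTimeout
{
    /**
     * Runs blockingCall on the current thread and schedules interrupter to run
     * after timeoutMillis.  Returns the call's result, or null if the call was
     * unwound by an exception of type interruptType (the "miss" case).
     */
    static <T> T callOrInterrupt(Supplier<T> blockingCall,
                                 Runnable interrupter,
                                 long timeoutMillis,
                                 Class<? extends RuntimeException> interruptType)
    {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        ScheduledFuture<?> pending =
            timer.schedule(interrupter, timeoutMillis, TimeUnit.MILLISECONDS);
        try
        {
            return blockingCall.get();
        }
        catch (RuntimeException e)
        {
            if (interruptType.isInstance(e))
            {
                return null;
            }
            // Anything else is a genuine failure; let it propagate.
            throw e;
        }
        finally
        {
            // If the call returned in time, make sure the interrupt never fires.
            pending.cancel(false);
            timer.shutdownNow();
        }
    }
}
```

In a test, I would expect the call to look something like callOrInterrupt(() -> recognizer.recognize(), inserter::injectInterrupt, 5000, InterruptException.class), where the five-second timeout is an arbitrary choice and a null return means the recognizer missed the command.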

Tags:

Java | Speech
