I downloaded the library and the demo project, and spent two hours understanding the flow of the demo and figuring out how to use it in a service for an Android app. Below I list the details I can still remember, in the hope that they will be helpful for anyone who wants to use the library.
I am assuming readers of this post are familiar with the basic concepts of the Android platform and have already set up the development environment on their computer. The Eclipse IDE is used in this project.
1. Go to http://cmusphinx.sourceforge.net/wiki/download/ and download the library from http://sourceforge.net/projects/cmusphinx/files/pocketsphinx/0.8/
2. Go to http://cmusphinx.sourceforge.net/wiki/tutorialandroid and download the demo project; follow the steps there to set up the development environment if it has not been set up yet.
3. Create a service that implements RecognitionListener. Remember to import the required packages in the Java code. Most of the code below is copied from the PocketSphinx demo project. Pay attention to the lines I have commented out: since this is a service, it must not contain UI-related code. I have kept some of those lines from the original demo for reference.
package com.me.android.test;

import static edu.cmu.pocketsphinx.SpeechRecognizerSetup.defaultSetup;

import java.io.File;
import java.io.IOException;
import java.util.HashMap;

import android.app.Service;
import android.content.Context;
import android.content.Intent;
import android.os.AsyncTask;
import android.os.IBinder;
import android.util.Log;

import edu.cmu.pocketsphinx.Assets;
import edu.cmu.pocketsphinx.Hypothesis;
import edu.cmu.pocketsphinx.RecognitionListener;
import edu.cmu.pocketsphinx.SpeechRecognizer;

/*
 * Based on the sample from PocketSphinxDemo.
 */
public class PocketSphinxVoiceRecognitionService extends Service implements RecognitionListener {

    public static final String TAG = "PocketSphinxVoiceRecognitionService";

    private static final String KWS_SEARCH = "wakeup";
    private static final String FORECAST_SEARCH = "forecast";
    private static final String DIGITS_SEARCH = "digits";
    private static final String MENU_SEARCH = "menu";
    private static final String KEYPHRASE = "oh mighty computer";

    private SpeechRecognizer recognizer;
    private HashMap<String, Integer> captions;
    public Context context;

    @Override
    public void onCreate() {
        super.onCreate();
        context = getApplicationContext();

        // Map each search name to its caption string resource.
        Log.i(TAG, "onCreate: setup search options");
        captions = new HashMap<String, Integer>();
        captions.put(KWS_SEARCH, R.string.kws_caption);
        captions.put(MENU_SEARCH, R.string.menu_caption);
        captions.put(DIGITS_SEARCH, R.string.digits_caption);
        captions.put(FORECAST_SEARCH, R.string.forecast_caption);
        //setContentView(R.layout.main);
        //((TextView) findViewById(R.id.caption_text)).setText("Preparing the recognizer");

        // Recognizer initialization is time-consuming and involves IO,
        // so we execute it in an async task.
        new AsyncTask<Void, Void, Exception>() {
            @Override
            protected Exception doInBackground(Void... params) {
                try {
                    Log.i(TAG, "AsyncTask: doInBackground: setup recognizer");
                    Assets assets = new Assets(context);
                    File assetDir = assets.syncAssets();
                    setupRecognizer(assetDir);
                } catch (IOException e) {
                    return e;
                }
                return null;
            }

            @Override
            protected void onPostExecute(Exception result) {
                if (result != null) {
                    //((TextView) findViewById(R.id.caption_text)).setText("Failed to init recognizer " + result);
                    Log.e(TAG, "onPostExecute: failed to init recognizer: " + result);
                } else {
                    Log.i(TAG, "AsyncTask: onPostExecute: switch to the digits search");
                    switchSearch(/*KWS_SEARCH*/DIGITS_SEARCH);
                }
            }
        }.execute();
    }

    private void switchSearch(String searchName) {
        recognizer.stop();
        recognizer.startListening(searchName);
        String caption = getResources().getString(captions.get(searchName));
        //((TextView) findViewById(R.id.caption_text)).setText(caption);
    }

    private void setupRecognizer(File assetsDir) {
        File modelsDir = new File(assetsDir, "models");
        recognizer = defaultSetup()
                .setAcousticModel(new File(modelsDir, "hmm/en-us-semi"))
                .setDictionary(new File(modelsDir, "dict/cmu07a.dic"))
                .setRawLogDir(assetsDir)
                .setKeywordThreshold(1e-20f)
                .getRecognizer();
        recognizer.addListener(this);

        // Create keyword-activation search.
        recognizer.addKeyphraseSearch(KWS_SEARCH, KEYPHRASE);

        // Create grammar-based searches.
        File menuGrammar = new File(modelsDir, "grammar/menu.gram");
        recognizer.addGrammarSearch(MENU_SEARCH, menuGrammar);
        File digitsGrammar = new File(modelsDir, "grammar/digits.gram");
        recognizer.addGrammarSearch(DIGITS_SEARCH, digitsGrammar);

        // Create language model search.
        File languageModel = new File(modelsDir, "lm/weather.dmp");
        recognizer.addNgramSearch(FORECAST_SEARCH, languageModel);
    }

    @Override
    public void onBeginningOfSpeech() {
    }

    @Override
    public void onEndOfSpeech() {
        if (DIGITS_SEARCH.equals(recognizer.getSearchName())
                || FORECAST_SEARCH.equals(recognizer.getSearchName())) {
            switchSearch(/*KWS_SEARCH*/DIGITS_SEARCH);
        }
    }

    @Override
    public void onPartialResult(Hypothesis hypothesis) {
        // Partial results can arrive with a null hypothesis; guard against it.
        if (hypothesis == null)
            return;
        String text = hypothesis.getHypstr();
        if (text.equals(KEYPHRASE))
            switchSearch(MENU_SEARCH);
        else if (text.equals(DIGITS_SEARCH))
            switchSearch(DIGITS_SEARCH);
        else if (text.equals(FORECAST_SEARCH))
            switchSearch(FORECAST_SEARCH);
        else {
            //((TextView) findViewById(R.id.result_text)).setText(text);
        }
    }

    @Override
    public void onResult(Hypothesis hypothesis) {
        //((TextView) findViewById(R.id.result_text)).setText("");
        if (hypothesis != null) {
            String text = hypothesis.getHypstr();
            //makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
            Log.i(TAG, "onResult: " + text);
        }
    }

    @Override
    public IBinder onBind(Intent intent) {
        // This is a started service; binding is not supported.
        return null;
    }
}
4. Add the service to the manifest file
<service android:name="com.me.android.test.PocketSphinxVoiceRecognitionService"
android:enabled="true"
android:exported="false" />
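The recognizer records audio from the microphone, so the manifest also needs the RECORD_AUDIO permission (placed outside the <application> element):

<uses-permission android:name="android.permission.RECORD_AUDIO" />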
5. Copy the following folders (the native libraries) from the demo project into the "libs" folder of the current project:
armeabi
armeabi-v7a
mips
x86
6. Copy the library pocketsphinx-android-0.8-nolib.jar into the "libs" folder of the current project.
7. Copy the "assets" folder from the demo project to the current project
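After copying, the recognizer code in step 3 expects roughly the following layout inside the project (paths reconstructed from setupRecognizer() and step 9 below, so treat this as a sketch rather than an exact listing):

assets/
    sync/
        models/
            dict/cmu07a.dic
            grammar/digits.gram
            grammar/menu.gram
            hmm/en-us-semi/    (acoustic model files)
            lm/weather.dmp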
8. Copy the values in strings.xml under the "res/values" folder of the demo project, and paste them into the strings.xml of the current project.
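The service code in step 3 looks up four caption strings by name, so at a minimum the copied strings.xml must contain entries with these names. The texts below are placeholders of my own; only the names matter:

<resources>
    <string name="kws_caption">To wake up, say oh mighty computer</string>
    <string name="menu_caption">To continue, say digits or forecast</string>
    <string name="digits_caption">Say a sequence of digits</string>
    <string name="forecast_caption">Say something about the weather</string>
</resources>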
9. Add your own words to be used in the current project
Open the file digits.gram under "assets/sync/models/grammar" and follow the existing format to add your own words.
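For reference, digits.gram is a JSGF grammar; a sketch of the format (the word list here is illustrative, not the demo's exact content) looks like this:

#JSGF V1.0;

grammar digits;

public <digits> = ( oh | zero | one | two | three | four | five | six | seven | eight | nine )+;

The recognizer cannot match words that are missing from the dictionary (dict/cmu07a.dic), so make sure every word you add is listed there as well.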
10. Copy the following files from the demo project to the current project
assets.xml
custom_rules.xml
build.xml
and adjust project-specific settings in them, such as the project name.
11. In other code, such as the main activity or an app widget provider, check whether this service is already running; if not, start it, as shown below.
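A minimal sketch of that check for the main activity (this assumes the Eclipse-era APIs this project targets, where getRunningServices was the common way to test for a running service):

// Additional imports needed: android.app.ActivityManager, android.content.Context, android.content.Intent
private boolean isRecognitionServiceRunning() {
    ActivityManager am = (ActivityManager) getSystemService(Context.ACTIVITY_SERVICE);
    // Walk the list of running services and look for ours.
    for (ActivityManager.RunningServiceInfo info : am.getRunningServices(Integer.MAX_VALUE)) {
        if (PocketSphinxVoiceRecognitionService.class.getName().equals(
                info.service.getClassName())) {
            return true;
        }
    }
    return false;
}

// For example, in onCreate() of the main activity:
if (!isRecognitionServiceRunning()) {
    startService(new Intent(this, PocketSphinxVoiceRecognitionService.class));
}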
At this point, the project should compile and build, and the APK can be installed and launched on a device.
12. The last but most important step to make the service work and take voice commands:
Project -> Properties -> Builders -> New
Configure the builders and add the asset list builder to the project. Refer to the parameters used by the "Asset List Builder" of the demo project. The asset list builder must be at the top of the builder list.
This builder generates the list of voice recognition assets so they can be synced to the device.
13. Build the project and install the APK on the Android device.
14. The app will now be able to take voice commands that use the words listed in the digits.gram file.