Notes on the process of installing Kaldi and Kaldi-GStreamer-server on Ubuntu 16.04 LTS. The steps were modified somewhat, since I documented this retroactively for my own benefit.
Kaldi is a state-of-the-art speech transcription engine, geared towards researchers and people who already know what they're doing. I'm just trying to set it up.
Decide where to put Kaldi and make that your new working directory.
mkdir ~/tools/
cd ~/tools/
Clone Kaldi from GitHub.
git clone https://github.com/kaldi-asr/kaldi.git
cd into this new location.
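cd ./kaldi-master/tools
Note: a plain git clone names the folder kaldi, not kaldi-master. Either rename it or substitute kaldi for kaldi-master in the paths in these notes.
Check for any dependencies. There were a few things I needed to add to my Ubuntu installation, but I don't remember what they were. Do whatever this output instructs.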
extras/check_dependencies.sh
Now comes the actual installation.
make
Run this next to install the online extensions.
make ext
Note: if you have more than one core in your machine, you can run make -j 4 to do make in parallel.
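Rather than hard-coding 4, the job count can be taken from the machine itself. This is only a sketch: it assumes nproc (from GNU coreutils, present on a stock Ubuntu) is available, and build_jobs is a hypothetical helper name, not something Kaldi provides.

```shell
# Print a job count for a parallel build.
# Assumption: nproc exists; if it does not, fall back to a single job.
build_jobs() {
    nproc 2>/dev/null || echo 1
}

# Usage, from the Kaldi tools directory:
#   make -j "$(build_jobs)"
```

Using every core makes the build faster but also uses more memory, so a smaller number may be the safer choice on a small machine.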
Congratulations. Kaldi is installed.
Installing Kaldi-GStreamer-server:
Before actually installing kaldi-gstreamer-server, there are a few more things to do with Kaldi itself.
Compile the GStreamer plugin. First, install its dependencies. Note that they are older versions of the packages, so make sure you get the right versions. On Ubuntu/Debian, run:
sudo apt-get install libgstreamer1.0-dev gstreamer1.0-plugins-good gstreamer1.0-tools gstreamer1.0-pulseaudio
Kaldi-GStreamer-server requires the GStreamer plugin to be compiled (makes sense).
cd ~/tools/kaldi-master/src/gst-plugin/
Once the plugin is built, this folder (gst-plugin) should contain the file libgstkaldi.so, which is the GStreamer plugin.
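A quick way to confirm the plugin file is really there before moving on. This is a minimal sketch; has_kaldi_plugin is a hypothetical helper name, and the path in the usage comment assumes the folder layout used in these notes.

```shell
# Succeed only if the compiled plugin file exists in the given directory.
# Assumption: has_kaldi_plugin is an illustrative name, not part of Kaldi.
has_kaldi_plugin() {
    [ -f "$1/libgstkaldi.so" ]
}

# Usage:
#   has_kaldi_plugin ~/tools/kaldi-master/src/gst-plugin && echo "plugin found"
```

This only checks that the file exists; it does not prove that GStreamer can load it.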
Now it's time to install the kaldi-gstreamer-server package. First, more dependencies.
sudo apt-get install python-pip python-yaml python-gi
Note: You might need to run pip as sudo, e.g. sudo pip install tornado, for the command below.
pip install tornado ws4py==0.3.2 pyyaml
Note: I couldn't figure out which YAML package to install, so I installed both python-yaml (via apt-get) and pyyaml (via pip). I don't remember which one I actually needed. If I do this again, I'll try to remember to update this.
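One way to see whether the dependencies landed, whichever package supplied them, is to try importing each module. This is a rough sketch: missing_modules is a hypothetical helper, and it assumes that python on your PATH is the interpreter pip installed into (set PYTHON to override). Both YAML packages provide the same yaml module, so a successful import does not tell you which one is in use.

```shell
# Print the name of every module that the interpreter fails to import.
# Assumption: "python" is the right interpreter; override with PYTHON=...
missing_modules() {
    for mod in "$@"; do
        "${PYTHON:-python}" -c "import $mod" >/dev/null 2>&1 || echo "$mod"
    done
}

# Usage (prints nothing if all four modules import cleanly):
#   missing_modules tornado ws4py yaml gi
```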
Clone kaldi-gstreamer-server from GitHub into your tools folder.
git clone https://github.com/alumae/kaldi-gstreamer-server.git
This completes the installation.
cd into the main folder.
cd ./kaldi-gstreamer-server/
Open the README file, peruse until understood.
gedit ./README.md
Now you'll understand what I mean by server and worker. You can start the server with:
python kaldigstserver/master_server.py --port=8888
Before starting a worker, make sure that the GST plugin path includes the GStreamer plugin you compiled. If you put everything where I recommended, this is all you have to do:
export GST_PLUGIN_PATH=~/tools/kaldi-master/src/gst-plugin
Test to make sure it worked. This command should spit out a bunch of information. If it just says something like 'not found', you did something wrong. I have no idea what. Take a look at the README file again.
gst-inspect-1.0 onlinegmmdecodefaster
Now you can start a worker.
python kaldigstserver/worker.py -u ws://localhost:8888/worker/ws/speech -c sample_worker.yaml
Example of how to use the server to transcribe audio:
python kaldigstserver/client.py -r 32000 ~/tools/kaldi-gstreamer-server/test/data/english_test.raw
You can also use a Deep Neural Network (DNN) to process the data, but at the time of writing the README walkthrough was giving me errors.
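The plugin check from the worker step can be wrapped so that a worker is only started when GStreamer actually finds the decoder. This is a sketch under assumptions: plugin_ready is a hypothetical helper name, and it relies on gst-inspect-1.0 exiting with a non-zero status when the element is missing, which is what I would expect but have not verified on every version.

```shell
# Succeed only if GStreamer can find the Kaldi decoder element when
# the given directory is used as the plugin path.
# Assumption: gst-inspect-1.0 returns non-zero if the element is not found.
plugin_ready() {
    GST_PLUGIN_PATH="$1" gst-inspect-1.0 onlinegmmdecodefaster >/dev/null 2>&1
}

# Usage, from the kaldi-gstreamer-server folder:
#   if plugin_ready ~/tools/kaldi-master/src/gst-plugin; then
#       export GST_PLUGIN_PATH=~/tools/kaldi-master/src/gst-plugin
#       python kaldigstserver/worker.py -u ws://localhost:8888/worker/ws/speech -c sample_worker.yaml
#   else
#       echo "onlinegmmdecodefaster not found; recheck the plugin build" >&2
#   fi
```

The helper passes the path only to the one gst-inspect-1.0 call, so it does not change GST_PLUGIN_PATH in your shell; the export still has to happen before the worker starts.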