TOOLS

Wav2letter

Wav2letter is an end-to-end automatic speech recognition (ASR) system that lets researchers and developers transcribe speech.

Simple and efficient

Wav2letter implements the architectures proposed in "Wav2Letter: an End-to-End ConvNet-based Speech Recognition System" and "Letter-Based Speech Recognition with Gated ConvNets". It provides pre-trained models for the LibriSpeech dataset to help developers start transcribing speech right away.

Get Started

1. If you plan to train on a CPU, install Intel MKL; for GPU training, install the NVIDIA CUDA Toolkit. Also install LuaJIT and LuaRocks, plus KenLM (needed only if you will use the decoder) and OpenMPI and TorchMPI (needed only for distributed training), as sketched below.
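
For reference, a minimal sketch of installing the optional dependencies on a Debian/Ubuntu machine; the package names, paths, and KenLM build layout below are assumptions, so adjust them for your system.

# Build tools plus Boost (needed to compile KenLM) and OpenMPI
# (needed only for multi-GPU / multi-node training).
sudo apt-get install build-essential cmake libboost-all-dev libopenmpi-dev openmpi-bin

# KenLM, kept in $HOME/kenlm to match the decoder build step further below;
# only required if you plan to use the decoder.
git clone https://github.com/kpu/kenlm.git $HOME/kenlm
cd $HOME/kenlm && mkdir -p build && cd build && cmake .. && make -j4

# LuaJIT, LuaRocks, and TorchMPI are best installed following their own READMEs.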

2. Install Torch and the Torch packages.

luarocks install torch
luarocks install cudnn # for GPU support
luarocks install cunn # for GPU support
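
As a quick sanity check, you can confirm the packages load from LuaJIT before moving on; this is a sketch that assumes the LuaRocks tree needs to be added to your Lua path first.

eval "$(luarocks path)"   # expose the installed rocks to LuaJIT, if not already on the path
luajit -e "require('torch'); print('torch ok')"
luajit -e "require('cudnn'); require('cunn'); print('gpu ok')"   # GPU builds only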

3. Install the wav2letter packages.

git clone https://github.com/facebookresearch/wav2letter.git
cd wav2letter
cd gtn && luarocks make rocks/gtn-scm-1.rockspec && cd ..
cd speech && luarocks make rocks/speech-scm-1.rockspec && cd ..
cd torchnet-optim && luarocks make rocks/torchnet-optim-scm-1.rockspec && cd ..
cd wav2letter && luarocks make rocks/wav2letter-scm-1.rockspec && cd ..
# Assuming here you got KenLM in $HOME/kenlm
# And only if you plan to use the decoder:
cd beamer && KENLM_INC=$HOME/kenlm luarocks make rocks/beamer-scm-1.rockspec && cd ..
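
Optionally, verify that the rocks built above are visible to LuaJIT; this sketch assumes the Lua module names match the rock names, which may not hold for every package.

luajit -e "require('gtn'); require('speech'); require('wav2letter'); print('wav2letter ok')"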

4. Download pre-trained models and iterate on them, or build and train new models.
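
The pre-trained model links and the exact training and decoding commands are documented in the repository README. If you plan to train or evaluate your own models you will also need the LibriSpeech audio itself; below is a minimal sketch that fetches the small dev-clean split from OpenSLR, assuming you want the data under $HOME/librispeech (the larger training splits are fetched the same way).

mkdir -p $HOME/librispeech && cd $HOME/librispeech
wget http://www.openslr.org/resources/12/dev-clean.tar.gz
tar xzf dev-clean.tar.gz   # unpacks into LibriSpeech/dev-clean/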

More Tools