Florian joined FAIAR in 2019 to work on multi-modal understanding of speech, video, and language. He has a background in speech and audio processing, including end-to-end processing of conversations, analysis of voice properties, and multi-lingual or low-resource speech recognition. He is also on the faculty of Carnegie Mellon's Language Technologies Institute (currently on leave of absence), where he continues to supervise a team of students.
For low-resource languages, collecting sufficient training data to build acoustic and language models is time-consuming and often expensive. However, large amounts of text data, such as online newspapers, web forums, or online encyclopedias, usually exist for languages that have a large population of native speakers.
Ankur Gandhe, Long Qin, Florian Metze, Alexander Rudnicky, Ian Lane, Matthias Eck