VizSeq is a Python toolkit that simplifies visual analysis across a wide variety of text generation tasks. The outputs of models for tasks such as machine translation, image captioning, and speech recognition are blocks of text, which aren’t easy to inspect with the naked eye. Existing automatic evaluation tools generally rely on task-specific metrics, such as BLEU (bilingual evaluation understudy, a common machine translation metric), but these abstract numbers don’t always match up with human assessment. VizSeq provides a unified, scalable solution. With a user-friendly interface and the latest NLP advancements, VizSeq improves productivity via visualization in Jupyter Notebook and a web app. It also provides a collection of multiprocess scorers for fast evaluation on large data sets.
Read our paper: VizSeq: A Visual Analysis Toolkit for Text Generation Tasks.
VizSeq visualizes text generation outputs in an interface where you can filter, sort, and inspect examples, with multimodal data, highlighted differences, and various metrics displayed all in one place. It allows users to explore data set characteristics and compare models holistically under various metrics.
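To give a feel for what highlighting differences between a reference and a model output involves, here is a minimal sketch built on Python's standard `difflib` module. It is an illustration only, not VizSeq's implementation: the function name `highlight_differences` and the `[-...-]` / `{+...+}` markers are assumptions made for this example.

```python
from difflib import SequenceMatcher


def highlight_differences(reference: str, hypothesis: str) -> str:
    """Mark token-level differences between a reference and a hypothesis.

    Tokens found only in the reference are wrapped as [-...-];
    tokens found only in the hypothesis are wrapped as {+...+}.
    (Hypothetical helper for illustration; not part of VizSeq.)
    """
    ref_tokens = reference.split()
    hyp_tokens = hypothesis.split()
    # Compare whitespace-separated tokens rather than characters.
    matcher = SequenceMatcher(a=ref_tokens, b=hyp_tokens, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.extend(hyp_tokens[j1:j2])
            continue
        if i1 < i2:  # tokens dropped or replaced from the reference
            pieces.append("[-" + " ".join(ref_tokens[i1:i2]) + "-]")
        if j1 < j2:  # tokens added or substituted in the hypothesis
            pieces.append("{+" + " ".join(hyp_tokens[j1:j2]) + "+}")
    return " ".join(pieces)


if __name__ == "__main__":
    print(highlight_differences("the cat sat on the mat",
                                "the cat is on the mat"))
```

Comparing at the token level keeps the markers readable for sentence-length outputs; a character-level comparison would be more precise but much noisier.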
VizSeq also performs speedy, multiprocess-accelerated evaluation on large data sets, covering a wide collection of metrics: BLEU, NIST, METEOR, TER, RIBES, chrF, GLEU, ROUGE, CIDEr, WER, LASER, and BERTScore. It also has a simple API to help define new metrics.
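The general idea behind multiprocess scoring can be sketched with the standard library alone. The example below is a rough illustration under stated assumptions, not VizSeq's scorer API: `unigram_f1` is a toy sentence-level metric invented for this sketch, and `score_corpus`, `n_workers`, and `chunksize` are hypothetical names. It distributes sentence pairs across worker processes with `concurrent.futures.ProcessPoolExecutor`.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor
from typing import List, Tuple


def unigram_f1(pair: Tuple[str, str]) -> float:
    """Toy sentence-level metric: F1 over lowercased unigram overlap.

    (Stand-in for a real metric; invented for this illustration.)
    """
    hypothesis, reference = pair
    hyp_tokens = hypothesis.lower().split()
    ref_tokens = reference.lower().split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def score_corpus(hypotheses: List[str], references: List[str],
                 n_workers: int = 2, chunksize: int = 64) -> List[float]:
    """Score every (hypothesis, reference) pair, optionally in parallel."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have equal length")
    pairs = list(zip(hypotheses, references))
    if n_workers <= 1:
        # Serial fallback: avoids process start-up cost on small inputs.
        return [unigram_f1(pair) for pair in pairs]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves input order; chunksize batches the pairs sent
        # to each worker to reduce inter-process overhead.
        return list(pool.map(unigram_f1, pairs, chunksize=chunksize))


if __name__ == "__main__":
    hyps = ["the cat is on the mat", "hello world"]
    refs = ["the cat sat on the mat", "goodbye"]
    print(score_corpus(hyps, refs, n_workers=2))
```

Because each sentence pair is scored independently, the work parallelizes cleanly. Corpus-level metrics that aggregate statistics across sentences would need an extra reduction step after the parallel map.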
The field of text generation has a large research community and is useful for a wide range of industrial applications. Existing open source analysis tools often lack integrated functionality and are not optimized for productivity or scalability. With VizSeq, we are offering a thoughtfully designed and actively evolving solution. Our long-term goal is to build an open and unified analysis platform that accelerates text generation research and facilitates the daily work of academic and industrial researchers. VizSeq is under active development, and we welcome any feedback or code contributions via GitHub.