Dzongkha Automatic Speech Recognition

Record or Upload Audio







Limitations

The Automatic Speech Recognition (ASR) model has two main limitations. First, it was trained on a relatively small dataset of only 5,000 samples, so its accuracy may drop on accents, dialects, or speech patterns that are poorly represented in the training data. Second, the model is not specifically trained for word segmentation and may segment words incorrectly, especially when it is used with a language model (LM). As a result, transcriptions may contain incorrect grammar, spelling errors, and unrecognizable words or characters. Keep these limitations in mind when relying on the model's output. Larger datasets, specialized language models, and post-processing techniques are possible directions for improvement.
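As a rough illustration of the kind of post-processing mentioned above, the following is a minimal sketch and not part of this project. It assumes the transcription is Tibetan-script text in which syllables are separated by the tsheg mark (U+0F0B). The function names `clean_transcript` and `split_syllables` are hypothetical. The sketch only removes stray characters and repeated separators; it does not attempt word segmentation or spelling correction.

```python
import re

TSHEG = "\u0f0b"  # Tibetan-script syllable separator (assumed to appear in the output)


def clean_transcript(text: str) -> str:
    """Hypothetical cleanup step for a raw transcription string."""
    # Drop anything outside the Tibetan Unicode block (U+0F00 to U+0FFF),
    # keeping whitespace so it can be collapsed afterwards.
    text = re.sub(r"[^\u0f00-\u0fff\s]", "", text)
    # Collapse runs of two or more tsheg marks into a single one.
    text = re.sub(TSHEG + r"{2,}", TSHEG, text)
    # Collapse whitespace runs and trim both ends.
    return re.sub(r"\s+", " ", text).strip()


def split_syllables(text: str) -> list[str]:
    """Split cleaned text into syllables on tsheg marks and spaces."""
    return [s for s in re.split(TSHEG + r"|\s", text) if s]
```

Splitting on the tsheg yields syllables, not words. Grouping syllables into words would need a separate segmentation step, which is the task the paragraph above says the model is not trained for.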

Developers

Pema Galey

Guide

Mr. Pema Galey is an associate lecturer at the Information Department, College of Science and Technology. He holds a Master of Engineering in Software Systems Engineering with a specialization in NLP (speech recognition) and Software Engineering.

Ngawang Samten Pelzang

Developer

Ngawang Samten Pelzang is a student at the College of Science and Technology, pursuing an undergraduate degree in Information Technology (2019-2023).

Kinley Rabgay

Developer

Kinley Rabgay is a student at the College of Science and Technology, pursuing an undergraduate degree in Information Technology (2019-2023).

Khentse Dorji

Developer

Khentse Dorji is a student at the College of Science and Technology, pursuing an undergraduate degree in Information Technology (2019-2023).