Tech Sentences For ASR Training

Hugging Face Dataset

TechVoice Dataset

Work in Progress: This dataset is actively being expanded with new recordings.

Dataset Statistics

Metric     Current    Target     Progress
Duration   38m 43s    5h 0m 0s   ██░░░░░░░░░░░░░░░░░░ 12.9%
Words      10,412     50,000     ████░░░░░░░░░░░░░░░░ 20.8%

Total Recordings: 205 samples
Total Characters: 74,312

A specialized speech dataset for fine-tuning Automatic Speech Recognition (ASR) models on technical and developer vocabulary. Contains human-recorded… See the full description on the dataset page: https://huggingface.co/datasets/danielrosehill/Tech-Sentences-For-ASR-Training.
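The progress figures in the Dataset Statistics can be reproduced from the current and target values. The following is a minimal sketch, not the dataset author's own script: the helper names `parse_duration` and `progress` are hypothetical, and the 20-cell bar width and round-down fill rule are assumptions inferred from the bars shown in the statistics.

```python
import re


def parse_duration(text: str) -> int:
    """Convert a string such as '38m 43s' or '5h 0m 0s' to whole seconds."""
    seconds_per_unit = {"h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)\s*([hms])", text)
    return sum(int(number) * seconds_per_unit[unit] for number, unit in parts)


def progress(current: float, target: float, width: int = 20) -> str:
    """Render a text progress bar followed by the percentage complete.

    Assumption: filled cells are rounded down and the bar is 20 cells wide,
    which matches the bars displayed in the statistics.
    """
    fraction = current / target
    filled = min(width, int(fraction * width))
    return "█" * filled + "░" * (width - filled) + f" {fraction:.1%}"


# Values taken from the Dataset Statistics.
duration_line = progress(parse_duration("38m 43s"), parse_duration("5h 0m 0s"))
words_line = progress(10_412, 50_000)

print("Duration", duration_line)
print("Words   ", words_line)
```

With the listed values, 38m 43s is 2,323 seconds against a 18,000-second target (12.9%), and 10,412 words against 50,000 gives 20.8%.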

Project Information

Categories

Tags

task_categories: automatic-speech-recognition
language: en
license: mit
size_categories: n<1k
format: json
modality: audio, tabular, text
library: datasets, pandas, mlcroissant, polars
doi: 10.57967/hf/7099
region: us
Other tags: speech, whisper, asr, fine-tuning, technical-vocabulary, developer-speech, software-engineering, programming