For centuries, engineers and scientists have been intrigued by the idea of creating a machine that can converse fluently with humans in natural spoken language. Early examples of this fascination can be found in the science fiction movies of the 1960s and 1970s.
Stanley Kubrick’s 1968 movie “2001: A Space Odyssey” featured an intelligent computer named “HAL”, and George Lucas’ famous Star Wars saga had mobile droids such as R2-D2 and C-3PO that could move around interacting with people and other droids in natural human language.
Among the earliest genuine attempts at speech recognition were Audrey (1952), a “digit recognizer” from Bell Laboratories that could only recognize spoken numbers; Shoebox (1962) by IBM, which could understand 16 English words; and Harpy (1976) by Carnegie Mellon, which could comprehend 1011 words.
In a significant breakthrough, the adoption of the hidden Markov model (HMM) in the 1980s made it possible to estimate the most likely word for an unknown sound statistically, without relying on fixed speech patterns or templates. This led to several industrial and business applications. But speech recognition systems in the 1980s had a big flaw: you had to pause between each spoken word.
It was not until 1997 that users got a glimpse of the world’s first continuous speech recognizer, Dragon NaturallySpeaking. Capable of understanding about 100 words per minute without pauses between words, it is still in use today.
In a human-to-machine interface, a speech signal is captured as an analog waveform and then converted into a digital waveform that the machine can process. Speech technologies are widely used across a broad range of applications. These technologies enable devices to respond correctly and reliably to human voices and provide useful services.
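To make the analog-to-digital step concrete, here is a minimal sketch of sampling and quantization in Python. The 16 kHz sample rate, the 16-bit depth, the 440 Hz test tone, and the function names are illustrative assumptions, not details taken from any of the systems mentioned above.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz=16000, bit_depth=16):
    """Sample a continuous signal and quantize each sample to a signed integer.

    `signal` is any function of time (in seconds) returning an amplitude,
    nominally in [-1.0, 1.0]. The default rate and depth are assumed values.
    """
    max_level = 2 ** (bit_depth - 1) - 1          # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # time of the n-th sample
        amplitude = max(-1.0, min(1.0, signal(t)))  # clip to the valid range
        samples.append(round(amplitude * max_level))
    return samples

def tone(t, freq_hz=440.0):
    """Stand-in for an analog input: a pure sine tone (hypothetical test signal)."""
    return math.sin(2 * math.pi * freq_hz * t)

# 10 ms of the test tone becomes 160 integer samples at 16 kHz.
pcm = sample_and_quantize(tone, duration_s=0.01)
```

A real recognizer would read such samples from a microphone or audio file and then extract acoustic features from them; this sketch only covers the digitization step.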
In this post, we list 35 top research papers and projects in speech recognition that were published recently. Feel free to download them, and share your own research papers with us to have them added to this list.
- Scaling Up Online Speech Recognition Using Convnets
- Recurrent neural network based speech recognition using MATLAB
- Minute Meeting System Using Speech Recognition
- Improving Children Speech Recognition through Feature Learning from Raw Speech Signal
- Speech Recognition in High Noise Environment
- Boosting Neuro Evolutionary Techniques for Speech Recognition
- Segment-level training of ANNs based on acoustic confidence measures for hybrid HMM/ANN Speech Recognition
- Recurrent Poisson Process Unit for Speech Recognition
- On the Effect of the Implementation of Human Auditory Systems on Q-Log-Based Features for Robustness of Speech Recognition Against Noise
- Speech Recognition using AANN
- Robust feature extraction for visual speech and speaker recognition
- Hindi Speech Vowel Recognition Using Hidden Markov Model
- A Study of Formation and Recognition of Speech in Speech Signal Processing
- Deep Variational Filter Learning Models For Speech Recognition
- The Art of Developing Accurate Speech Recognition for Military Training
- Application of Speech Recognition Technology on the Evaluation of English Pronunciation Teaching
- On Using 2d Sequence-To-Sequence Models For Speech Recognition
- Interspeech 2018 Low Resource Automatic Speech Recognition Challenge for Indian Languages
- Recurrent neural network language model adaptation for conversational speech recognition
- Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition
- Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition
- Semantic Lattice Processing in Contextual Automatic Speech Recognition for Google Assistant
- A Pruned RNNLM Lattice-Rescoring Algorithm for Automatic Speech Recognition
- Data Augmentation Improves Recognition of Foreign Accented Speech
- Zero-shot keyword spotting for visual speech recognition in-the-wild
- Output-Gate Projected Gated Recurrent Unit for Speech Recognition
- The combination of Sparse Principle Component Analysis and Kernel Ridge Regression methods applied to speech recognition problem
- Domain Adaptation Using Factorized Hidden Layer for Robust Automatic Speech Recognition
- Combined Speaker Clustering and Role Recognition in Conversational Speech
- Temporal Sensitivity Measured Shortly After Cochlear Implantation Predicts 6-Month Speech Recognition Outcome
- Domain-Adversarial Training for Session Independent EMG-based Speech Recognition
- End-to-end speech recognition using lattice-free MMI
- Language Recognition for Telephone and Video Speech: The JHU HLTCOE Submission for NIST LRE17
- Automatic speech recognition system development in the wild
- Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning