Human speech can be broken down into sounds produced by the interaction of different physiological structures. The objective of this project is to model these anatomical movements and classify them into articulatory features using machine learning, specifically artificial neural networks.
The motivation is to support speech synthesis for low-resource languages, which is difficult when no transcription exists for the language. Our goal is to parameterize human voices by modeling a language's articulatory features. This should lead to more robust synthetic voices, which are also flexible enough to be adjusted for greater realism.
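To make the idea concrete, here is a minimal sketch of the classification step described above. The feature inventory (voiced, nasal, bilabial), the phone set, and the one-hot "acoustic" inputs are all hypothetical toy assumptions, not the project's actual data; real inputs would be spectral or articulatory measurements, and the network would be correspondingly larger.

```python
# Toy sketch: map phones to binary articulatory-feature vectors and
# train a tiny single-layer network to predict those features.
# The feature inventory and phone set below are hypothetical examples.
import numpy as np

FEATURES = ["voiced", "nasal", "bilabial"]
PHONE_FEATURES = {
    "p": [0, 0, 1],
    "b": [1, 0, 1],
    "m": [1, 1, 1],
    "t": [0, 0, 0],
    "d": [1, 0, 0],
    "n": [1, 1, 0],
}

def phone_to_features(phone):
    """Look up the binary articulatory-feature vector for a phone."""
    return np.array(PHONE_FEATURES[phone], dtype=float)

# Stand-in inputs: one-hot encode each phone. In practice these would
# be acoustic or articulatory measurements, not identities.
phones = list(PHONE_FEATURES)
X = np.eye(len(phones))
Y = np.stack([phone_to_features(p) for p in phones])

# Single-layer network with sigmoid outputs: one independent binary
# classifier per articulatory feature, trained by gradient descent.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(X.shape[1], Y.shape[1]))
b = np.zeros(Y.shape[1])
for _ in range(2000):
    probs = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid activations
    grad = probs - Y                            # d(loss)/d(logits)
    W -= 0.5 * X.T @ grad / len(X)
    b -= 0.5 * grad.mean(axis=0)

pred = (probs > 0.5).astype(int)  # predicted feature vectors
```

On this separable toy data the predicted feature vectors match the lookup table; the same structure (inputs in, per-feature probabilities out) carries over to a deeper network on real measurements.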