Synthetic Speech Detection by Deep Learning Technique: Test Phase, SRS, Design Phase and Source Code Final Deliverable
Project Domain/ Category
AI/ Deep Learning/ Machine Learning
Abstract/ Introduction
With the wide use of social media, the spread of fake content over the internet is currently a serious concern. There are numerous fake videos in which the audio/speech of the target speaker is manipulated by various means in order to defame or to perform illegal activities (e.g. https://www.youtube.com/watch?v=oxXpB9pSETo).
The speech of the target speaker is manipulated mostly by a TTS (text-to-speech) system or by a voice synthesizer. First the voice/speech is extracted from the video, and then the uttered words of the speaker are modified. Tacotron 2, Deep Voice 3, and Wav2Lip are a few of the speech synthesis applications that produce nearly natural voice/speech of the speaker.
As synthetic speech can be used for illegal and fraudulent activities, it is the need of the hour to detect it with AI methods. We will use deep learning techniques to detect the artificially generated speech/voices in a video. We will use the fake audio data from the given FakeAVCeleb dataset: https://drive.google.com/drive/folders/1SYMs44Z1W7rlrn0W7t-4LcPPusiBlNEB. The goal is to develop a model to detect the synthetic/fake audio/speech in the given videos of the dataset.
Functional Requirements
The following are the functional requirements of the project:
1. The tool-based application/software must download the given dataset that contains the database of real and fake audio in videos.
2. The system must consist of a neural network model that contains hidden layers for synthetic speech detection.
3. Whenever a video containing synthesized speech is given as input to the detection system, the system must identify the speech as real or fake as output.
4. The detection system must be able to detect fake audio generated by any of the Android apps such as MadLipz, speaKer, etc.
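The second and third requirements above can be illustrated with a minimal sketch. It assumes the audio track has already been extracted from the video and converted to a fixed-size log-spectrogram. The input shape (128, 128, 1), the layer sizes, the 0.5 threshold, the convention that an output near 1 means fake, and the helper names `build_detector` and `label_from_score` are all illustrative assumptions and not part of this specification.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_detector(input_shape=(128, 128, 1)):
    """Small CNN with hidden layers; outputs an assumed P(fake) in [0, 1]."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),           # one log-spectrogram per clip
        layers.Conv2D(16, 3, activation="relu"),  # hidden convolutional layer
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),  # hidden convolutional layer
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),      # hidden dense layer
        layers.Dense(1, activation="sigmoid"),    # single real/fake score
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


def label_from_score(score, threshold=0.5):
    """Map a sigmoid score to the 'real' / 'fake' output of requirement 3."""
    return "fake" if score >= threshold else "real"


# Untrained model on random input, only to show the input/output shapes.
model = build_detector()
dummy_batch = np.random.rand(2, 128, 128, 1).astype("float32")
scores = model.predict(dummy_batch, verbose=0)
labels = [label_from_score(float(s[0])) for s in scores]
```

Since the model here is untrained, the labels are meaningless until it has been fitted on the real and fake audio of the dataset; the sketch only shows how a video's audio features would flow through the hidden layers to a real/fake decision.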
Tools
● Python (programming language)
● Keras (API)
● TensorFlow (open source software library for machine learning)
● Jupyter Notebook (open source web application)
● Matplotlib (library)
● NumPy (library for Python)
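As a hedged illustration of how NumPy from the list above might be used, the sketch below turns a raw waveform into a log-magnitude spectrogram, a common input representation for audio classifiers. The function name `log_spectrogram`, the 16 kHz sampling rate, the 400-sample frame, and the 160-sample hop are assumptions chosen for the example, not values required by this project.

```python
import numpy as np


def log_spectrogram(waveform, frame_len=400, hop=160, eps=1e-10):
    """Return a (num_frames, frame_len // 2 + 1) log-magnitude spectrogram."""
    waveform = np.asarray(waveform, dtype=np.float64)
    if waveform.size < frame_len:
        raise ValueError("waveform is shorter than one frame")
    num_frames = 1 + (waveform.size - frame_len) // hop
    # Index matrix: each row holds the sample positions of one frame.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(num_frames)[:, None]
    # Apply a Hann window to each frame to reduce spectral leakage.
    frames = waveform[idx] * np.hanning(frame_len)
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # eps avoids log(0) for silent bins.
    return np.log(magnitude + eps)


# One second of a synthetic 440 Hz tone at an assumed 16 kHz sampling rate.
sr = 16000
t = np.arange(sr) / sr
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
```

The resulting array could be displayed in a Jupyter Notebook with Matplotlib's `imshow`, and resized or cropped to whatever fixed shape the detection model expects.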
Pre-Requisite
A basic understanding of deep learning concepts is required for this project. The following links may help students who do not have a basic understanding of deep learning and speech recognition:
1. Deep Learning Tutorial: https://www.youtube.com/watch?v=VyWAvY2CF9c
2. Automatic Speech Recognition: https://towardsdatascience.com/audio-deep-learning-made-simple-automatic-speech-recognition-asr-how-it-works-716cfce4c706