Synthetic Speech Detection by Deep Learning Technique
Deliverables: test phase, SRS, design phase, and final source code
Project Domain / Category
AI/Deep Learning/Machine Learning

Abstract / Introduction
With the widespread use of social media, the dissemination of fake content over the internet has become a serious problem. There are numerous fake videos in which the audio/speech of the target speaker is manipulated by various means in order to defame the speaker or to carry out unlawful activities (e.g.: https://www.youtube.com/watch?v=oxXpB9pSETo). The speech of the target speaker is manipulated mainly through a text-to-speech (TTS) technique or a voice synthesizer. First the voice/speech is extracted from the video, and then the words uttered by the speaker are modified. Tacotron 2, Deep Voice 3, and Wav2Lip are a few of the speech synthesis applications that produce nearly natural voice/speech of the speaker. Because synthetic speech can be used for unlawful and fraudulent activities, it is the need of the hour to detect it with AI methods. We will use deep learning techniques to detect artificially generated speech/voices in a video. We will use the fake audio data from the given FakeAVCeleb dataset: https://drive.google.com/drive/folders/1SYMs44Z1W7rlrn0W7t-4LcPPusiBlNEB. The aim is to develop a model that detects the synthetic/fake audio/speech in the videos of the given dataset.

Functional Requirements
The following are the functional requirements of the project:
1. The system-based application/software must download the given dataset, which contains the database of real and fake audio in videos.
2. The system must include a neural network model with hidden layers for synthetic speech detection.
3. Whenever a video containing synthesized speech is given as input to the detection system, the system must identify its audio as real or fake as output.
4. The detection system must be able to detect fake audio generated by Android apps such as MadLipz, speaKer, etc.

Tools
● Python (programming language)
● Keras (API)
● TensorFlow (open-source software library for machine learning)
● Jupyter Notebook (open-source web application)
● Matplotlib (library)
● NumPy (library for Python)

Pre-Requisite:
A basic understanding of deep learning concepts is required for this project. The following links may help students who do not have a basic understanding of deep learning and speech recognition:
1. Deep Learning Tutorial: https://www.youtube.com/watch?v=VyWAvY2CF9c
2. Automatic Speech Recognition: https://towardsdatascience.com/audio-deeplearning-made-simple-automatic-speechrecognition-asr-how-it-works-716cfce4c706
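
As a rough illustration of functional requirements 2 and 3, the sketch below shows one possible data flow from a waveform to a real/fake decision. It is not the project's prescribed design: it uses only NumPy, the function names (log_spectrogram, init_params, predict_fake_probability) are hypothetical, the 16 kHz sample rate, frame length, hop size, and 32 hidden units are assumed values, the weights are random and untrained, and random noise stands in for audio extracted from a video. In the actual project the network would presumably be built and trained with Keras/TensorFlow on the FakeAVCeleb audio.

```python
import numpy as np


def log_spectrogram(wave, frame_len=400, hop=160):
    """Return a (frames, bins) log-magnitude spectrogram of a 1-D waveform."""
    n_frames = 1 + (len(wave) - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack(
        [wave[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-6)  # small offset avoids log(0)


def init_params(n_inputs, n_hidden, seed=0):
    """Random, UNTRAINED weights for one hidden layer and one output unit."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, 1)),
        "b2": np.zeros(1),
    }


def predict_fake_probability(features, params):
    """Average features over time, then ReLU hidden layer and sigmoid output."""
    x = features.mean(axis=0)
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])
    logit = (hidden @ params["W2"] + params["b2"])[0]
    return 1.0 / (1.0 + np.exp(-logit))


# Stand-in for one second of 16 kHz audio (assumption: real input would be
# the audio track extracted from a dataset video).
wave = np.random.default_rng(1).normal(size=16000)
features = log_spectrogram(wave)
params = init_params(n_inputs=features.shape[1], n_hidden=32)
p = predict_fake_probability(features, params)
label = "fake" if p >= 0.5 else "real"
print(features.shape, label)
```

Because the weights are untrained, the printed label carries no meaning; the point is only the shape of the pipeline (features, hidden layer, thresholded probability). Averaging over time discards temporal detail, so a trained model would more likely consume the full spectrogram.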