NOISE SUPPRESSION USING DEEP LEARNING (DL) FOR REAL-TIME SPEECH ENHANCEMENT

Authors

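  • Xudayberganov Jo‘rabek Davlatboyevich, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi (TATU)
  • Berdiyev Alisher Alikulovich, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi (TATU)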

Keywords:

CNN, RNN, DNN, LSTM, TFLite

Abstract

This paper presents a deep learning-based approach to noise suppression in speech signals and compares convolutional neural networks with recurrent architectures. Inspired by the SEGAN model, a fully convolutional network with residual connections is developed and optimized for deployment with TensorFlow Lite (TFLite). The results show that the network denoises speech effectively at low latency. LSTM models are also evaluated, and the evaluation indicates that they handle temporal dependencies in variable-length audio sequences better than the convolutional model.

References

Khudayberganov J. Design of digital filters to reduce acoustic echo and simulation in the Matlab program. Descendants of Muhammad al-Khwarizmi Scientific-Practical and Information-Analytical Journal, No. 4 (26), December 2023.

Raximov B.N., Khudayberganov J.D. Comparative analysis of modern methods of eliminating acoustic echo interference in telecommunication systems. “World of science” republican scientific journal, Volume 6, Issue 5, May 2023, pp. 74-79.

Ivan Malashin, Vadim Tynchenko, Andrei Gantimurov, Vladimir Nelyub and Aleksei Borodulin. Applications of Long Short-Term Memory (LSTM) Networks in Polymeric Sciences: A Review. MDPI Polymers 2024, 16, 2607. https://doi.org/10.3390/polym16182607.

Praveen Damacharla, Hamid Rajabalipanah and Mohammad Hosein Fakheri. LSTM-CNN Network for Audio Signature Analysis in Noisy Environments. 2023 International Conference on Computational Science and Computational Intelligence (CSCI). https://doi.org/10.31224/3312.

Ivan Pisa, Antoni Morell, Jose Lopez Vicario and Ramon Vilanova. Denoising Autoencoders and LSTM-Based Artificial Neural Networks Data Processing for Its Application to Internal Model Control in Industrial Environments—The Wastewater Treatment Plant Control Case. MDPI Sensors 2020, 20, 3743; doi:10.3390/s20133743.

Jiakang Li, Xiongwei Zhang, Meng Sun, Xia Zou and Changyan Zheng. Attention-Based LSTM Algorithm for Audio Replay Detection in Noisy Environments. MDPI Appl. Sci. 2019, 9, 1539; doi:10.3390/app9081539.

Marvin Coto-Jiménez. Improving Post-Filtering of Artificial Speech Using Pre-Trained LSTM Neural Networks. MDPI Biomimetics 2019, 4, 39; doi:10.3390/biomimetics4020039.

Haohan Shi, Xiyu Shi and Safak Dogan. Speech Inpainting Based on Multi-Layer Long Short-Term Memory Networks. MDPI Future Internet 2024, 16, 63. https://doi.org/10.3390/fi16020063.

Santiago Pascual, Antonio Bonafonte and Joan Serrà. SEGAN: Speech Enhancement Generative Adversarial Network. Interspeech 2017. doi:10.21437/Interspeech.2017-1428.

Raximov B.N., Berdiyev A.A., Khudayberganov J.D. Modeling an Acoustic Noise Cancellation Filter Based on the Raspberry Pi 4B Model. 2024 International Conference on Information Science and Communications Technologies (ICISCT), Kookmin University, Republic of Korea, November 7–8, 2024.

Michelle Gutiérrez-Muñoz, Astryd González-Salazar and Marvin Coto-Jiménez. Evaluation of Mixed Deep Neural Networks for Reverberant Speech Enhancement. MDPI Biomimetics 2020, 5, 1; doi:10.3390/biomimetics5010001.

Raximov B.N., Khudayberganov J.D. Acoustic echo cancellation for the development of telecommunications (in Russian). Multidisciplinary Scientific Journal “Innovative development in educational activities”, Volume 2, Issue 9, May 2023, pp. 182-190.

Mustaqeem and Soonil Kwon. A CNN-Assisted Enhanced Audio Signal Processing for Speech Emotion Recognition. MDPI Sensors 2020, 20, 183; doi:10.3390/s20010183.

TensorFlow Lite Documentation. (2024). https://www.tensorflow.org/lite

Published

2024-12-28

How to Cite

Xudayberganov, J., & Berdiyev, A. (2024). NOISE SUPPRESSION USING DEEP LEARNING (DL) FOR REAL-TIME SPEECH ENHANCEMENT. DIGITAL TRANSFORMATION AND ARTIFICIAL INTELLIGENCE, 2(6), 262–269. Retrieved from https://dtai.tsue.uz/index.php/dtai/article/view/v2i636