AI-Driven Near-Lossless Audio Compression Modeling via Autoencoders

Hind Khalid

doi:10.61268/c23c6z11

Published: 26-09-2025

Keywords:

Voice data compression, neural network-based compression, sound wave compression, data compression techniques

Hind Khalid

University of Nahrain, College of Political Science, Baghdad, Iraq

https://orcid.org/0000-0002-8318-097X

Abstract

Near-lossless audio compression is important for efficient data storage and transmission in audio applications. Traditional compression algorithms rely on mathematical and signal-processing techniques to reduce file size while preserving the original audio quality. Deep learning-based methods, however, have shown promising results in achieving better compression performance. This study explores the application of deep learning techniques to near-lossless audio compression. Deep neural networks (DNNs) and recurrent neural networks (RNNs) are used to learn compressed representations of audio data that can be reconstructed efficiently with minimal information loss. The models were trained on a large dataset of unannotated audio samples to capture complex patterns and dependencies in the data. Experimental results show a compression ratio of 0.0333 (about 30:1) with a mean squared error (MSE) of 4.3957e-06, outperforming traditional compression algorithms such as FLAC (compression ratio: 0.1879) in both compression efficiency and reconstruction quality. In addition, the trained models generalized robustly to unseen audio samples. Overall, this study contributes to the advancement of near-lossless audio compression using deep learning.
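The two figures quoted in the abstract can be illustrated with a minimal Python sketch. It assumes the compression ratio is defined as compressed size divided by original size, which is consistent with the reported 0.0333 corresponding to roughly 30:1, and that MSE is the mean of squared sample differences. The function names, byte counts, and sample values are hypothetical and are not taken from the study.

```python
def compression_ratio(compressed_bytes: int, original_bytes: int) -> float:
    """Compressed size divided by original size (smaller is better)."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return compressed_bytes / original_bytes


def mean_squared_error(original, reconstructed) -> float:
    """Average squared difference between two equal-length sample sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    if not original:
        raise ValueError("signals must be non-empty")
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)


# Hypothetical sizes, chosen only to reproduce a 30:1 reduction.
ratio = compression_ratio(compressed_bytes=1_000, original_bytes=30_000)
reduction_factor = 1 / ratio  # how many times smaller the compressed data is

# Toy samples in [-1, 1]; these are not data from the study.
x = [0.0, 0.5, -0.5, 0.25]
x_hat = [0.0, 0.502, -0.498, 0.25]
mse = mean_squared_error(x, x_hat)

print(f"ratio={ratio:.4f}, reduction={reduction_factor:.1f}:1, mse={mse:.3e}")
```

Under this convention a lower ratio means stronger compression, so 0.0333 is a larger size reduction than 0.1879, and a bit-exact (lossless) reconstruction would give an MSE of exactly zero.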

Issue

Al-Rafidain Journal of Engineering Sciences Vol.3 Issue 2 (2025)

Section

Computer Engineering

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/

How to Cite

[1] H. Khalid, “AI-Driven Near-Lossless Audio Compression Modeling via Autoencoders”, Rafidain J. Eng. Sci., vol. 3, no. 2, pp. 406–422, Sep. 2025, doi: 10.61268/c23c6z11.
