
Dang Thoai Phan
Software engineer, acoustic recognition researcher
[Google Scholar] - [WoS] - [GitHub]
Personal email: thoai.phandang@gmail.com
Behind every success is a series of failures!
Publications
2025
3. Phan, D. T., Huynh, T. A., Pham, V. T., Tran, C. M., Mai, V. T., & Tran, N. Q. (2025). Optimal Scalogram for Computational Complexity Reduction in Acoustic Recognition Using Deep Learning. IEEE International Conference on Consumer Electronics (ICCE), 2025. [preprint]
Abstract—The Continuous Wavelet Transform (CWT) is an effective tool for feature extraction in acoustic recognition using Convolutional Neural Networks (CNNs), particularly when applied to non-stationary audio. However, its high computational cost poses a significant challenge, often leading researchers to prefer alternative methods such as the Short-Time Fourier Transform (STFT). To address this issue, this paper proposes a method to reduce the computational complexity of CWT by optimizing the length of the wavelet kernel and the hop size of the output scalogram. Experimental results demonstrate that the proposed approach significantly reduces computational cost while maintaining the robust performance of the trained model in acoustic recognition tasks.
Index Terms—Continuous Wavelet Transform, Wavelet Kernel Length, Scalogram, Hop Size, Acoustic Recognition.
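To give a rough sense of why kernel length and hop size both matter, the following Python sketch counts multiply-accumulate operations for a direct time-domain CWT. It is a simplified cost model assumed here for illustration, not the analysis from the paper. The function name cwt_mac_count and all numbers in the usage lines are hypothetical.

```python
import math


def cwt_mac_count(n_samples, kernel_lengths, hop=1):
    """Estimate multiply-accumulate operations for a direct time-domain CWT.

    Assumed model (not taken from the paper): each scale has one kernel,
    and each output column costs one dot product per scale whose length
    equals that scale's kernel length.
    """
    # Number of scalogram columns when only every hop-th sample is evaluated.
    n_frames = math.ceil(n_samples / hop)
    return sum(length * n_frames for length in kernel_lengths)


# Hypothetical example: 16000 input samples and 4 scales.
baseline = cwt_mac_count(16000, [1000, 1000, 1000, 1000], hop=1)
reduced = cwt_mac_count(16000, [500, 500, 500, 500], hop=4)
print(baseline, reduced, baseline / reduced)
```

Under this model, halving every kernel length and using a hop size of 4 lowers the operation count by a factor of 8, because the two savings multiply.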
2024
2. Phan, D.T. (2025). Comparison Performance of Spectrogram and Scalogram as Input of Acoustic Recognition Task. In: Arai, K. (eds) Advances in Information and Communication. FICC 2025. Lecture Notes in Networks and Systems, vol 1283. Springer, Cham. https://doi.org/10.1007/978-3-031-84457-7_41 [preprint PDF] [published version]
Abstract—Acoustic recognition has emerged as a prominent task in deep learning research, frequently utilizing spectral feature extraction techniques such as the spectrogram from the Short-Time Fourier Transform and the scalogram from the Wavelet Transform. However, there is a notable deficiency in studies that comprehensively discuss the advantages, drawbacks, and performance comparisons of these methods. This paper aims to evaluate the characteristics of these two transforms as input data for acoustic recognition using Convolutional Neural Networks. The performance of the trained models employing both transforms is documented for comparison. Through this analysis, the paper elucidates the advantages and limitations of each method, provides insights into their respective application scenarios, and identifies potential directions for further research.
Index Terms—spectrogram and scalogram, acoustic recognition, Convolutional Neural Networks
1. D. Thoai Phan, "Reduce Computational Complexity For Continuous Wavelet Transform in Acoustic Recognition Using Hop Size", 2024 International Symposium on Electronics and Telecommunications (ISETC), Timisoara, Romania, 2024, pp. 1-4, doi: 10.1109/ISETC63109.2024.10797444. [preprint PDF] [published version]
Abstract—In recent years, the continuous wavelet transform (CWT) has been employed as a spectral feature extractor for acoustic recognition tasks in conjunction with machine learning and deep learning models. However, applying the CWT to each individual audio sample is computationally intensive. This paper proposes an approach that applies the CWT to a subset of samples, spaced according to a specified hop size. Experimental results demonstrate that this method significantly reduces computational costs while maintaining the robust performance of the trained models.
Index Terms—Continuous Wavelet Transform, Hop Size, Acoustic Recognition.
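The hop-size idea described in the abstract above can be illustrated with a short Python sketch using NumPy. This is not the paper's implementation. The complex Morlet wavelet, the parameter w0 = 6, the truncation at four scale widths, and the function names morlet_kernel and cwt_with_hop are all assumptions made for illustration.

```python
import numpy as np


def morlet_kernel(scale, fs, w0=6.0, half_width=4.0):
    """Complex Morlet wavelet at the given scale (seconds), sampled at fs.

    Assumption: the kernel is truncated to +/- half_width * scale seconds.
    """
    n_half = int(np.ceil(half_width * scale * fs))
    u = np.arange(-n_half, n_half + 1) / fs / scale
    return np.exp(1j * w0 * u) * np.exp(-0.5 * u**2) / np.sqrt(scale)


def cwt_with_hop(x, scales, fs, hop=1):
    """Scalogram magnitudes evaluated only at samples 0, hop, 2*hop, ..."""
    x = np.asarray(x, dtype=float)
    centers = np.arange(0, len(x), hop)
    out = np.empty((len(scales), len(centers)))
    for i, scale in enumerate(scales):
        kernel = morlet_kernel(scale, fs)
        n_half = len(kernel) // 2
        # Zero-pad so every selected sample has a full window around it.
        padded = np.pad(x, n_half)
        windows = np.lib.stride_tricks.sliding_window_view(padded, len(kernel))
        # Only the rows at the hop positions are multiplied with the kernel.
        out[i] = np.abs(windows[centers] @ np.conj(kernel)) / fs
    return out


# Hypothetical usage: a 50 Hz tone analysed at scales tuned to 25, 50, 100 Hz.
fs = 1000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 50 * t)
scales = 6.0 / (2 * np.pi * np.array([25.0, 50.0, 100.0]))
full = cwt_with_hop(signal, scales, fs, hop=1)
hopped = cwt_with_hop(signal, scales, fs, hop=8)
print(full.shape, hopped.shape)
```

In this sketch the hopped scalogram has the same columns as every eighth column of the full one, while the number of dot products per scale drops by about a factor of 8.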


