
Robust deep convolutional neural network against image distortions

Published online by Cambridge University Press:  11 October 2021

Liang-Yao Wang
Affiliation:
Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
Sau-Gee Chen
Affiliation:
Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
Feng-Tsun Chien*
Affiliation:
Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
*Corresponding author: Feng-Tsun Chien. Email: ftchien@mail.nctu.edu.tw

Abstract

Many approaches have been proposed in the literature to enhance the robustness of convolutional neural network (CNN)-based architectures against image distortions. Various types of distortions can be combated by combining multiple expert networks, each trained on a certain type of distorted image; this, however, leads to a large model with high complexity. In this paper, we propose a CNN-based architecture with a pre-processing unit in which only undistorted data are used for training. The pre-processing unit employs the discrete cosine transform (DCT) and the discrete wavelet transform (DWT) to remove high-frequency components while capturing prominent high-frequency features in the undistorted data by means of random selection. We further utilize the singular value decomposition (SVD) to extract features before feeding the pre-processed data into the CNN for training. During testing, distorted images enter the CNN directly for classification without going through the hybrid module. Five different types of distortions are applied to the SVHN dataset and the CIFAR-10/100 datasets. Experimental results show that the proposed DCT-DWT-SVD module built upon the CNN architecture provides a classifier robust to input image distortions, outperforming the state-of-the-art approaches in terms of accuracy under different types of distortions.
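
The pre-processing chain described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the three stages are applied sequentially to a single 2D channel, the DCT stage keeps a fixed triangular low-frequency region (`keep` is a hypothetical parameter), the DWT stage uses a level-1 Haar wavelet with all detail sub-bands zeroed, and the random selection of high-frequency features is omitted. The default `k=6` echoes the six principal components mentioned in the caption of Fig. 5. All function names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct_lowpass(img, keep):
    """Zero every 2D-DCT coefficient with u + v >= keep, then invert.

    The retained triangle roughly matches the first diagonals of a
    zig-zag scan (assumed region shape, not taken from the paper).
    """
    n, m = img.shape
    cr, cc = dct_matrix(n), dct_matrix(m)
    coeffs = cr @ img @ cc.T
    u, v = np.meshgrid(np.arange(n), np.arange(m), indexing="ij")
    coeffs[u + v >= keep] = 0.0
    return cr.T @ coeffs @ cc

def haar_lowpass(img):
    """Level-1 Haar DWT with the detail sub-bands zeroed, then inverted.

    For the Haar wavelet this equals replacing each 2x2 block by its mean.
    Assumes even height and width.
    """
    n, m = img.shape
    means = img.reshape(n // 2, 2, m // 2, 2).mean(axis=(1, 3))
    return np.repeat(np.repeat(means, 2, axis=0), 2, axis=1)

def svd_truncate(img, k):
    """Best rank-k approximation built from the k largest singular values."""
    u, s, vt = np.linalg.svd(img, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]

def preprocess(img, keep=16, k=6):
    """Hypothetical DCT -> DWT -> SVD chain applied to one image channel."""
    return svd_truncate(haar_lowpass(dct_lowpass(img, keep)), k)

# Usage on a random 32 x 32 channel (the CIFAR and SVHN image size).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
out = preprocess(image)
print(out.shape)
```

In a training setup along the lines of the abstract, such a function would be applied only to the clean training images; distorted test images would be passed to the network unchanged.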

Information

Type
Original Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press in association with Asia Pacific Signal and Information Processing Association

Fig. 1. (a) FDCT and IDCT. (b) The zig-zag scanning.

Fig. 2. Multi-resolution decomposition of the 2D-DWT.

Fig. 3. (a) Single-level decomposition (level-1 DWT). (b) Two-level decomposition (level-2 DWT).

Fig. 4. The “smooth” signal extension mode.

Fig. 5. Images processed by the hybrid DCT-DWT-SVD module with different amounts of low-frequency components in the DCT-DWT stage and with six principal components in the SVD stage.

Fig. 6. The proposed DCNN architecture based on the VGG-16 [3]. Clean data first pass through the DCT-DWT-SVD module before entering the VGG-16 for training. The DCT-DWT-SVD module is inactive during testing, and distorted images enter the model directly for classification.

Fig. 7. Five different distortion levels for five different types of distortions in CIFAR-10.

Fig. 8. The classification accuracy of different DWT-module levels under Gaussian noise and speckle noise in CIFAR-10.

Fig. 9. The classification accuracy of various zig-zag selection regions in various types and degrees of noise for images in CIFAR-10.

Table 1. The accuracy ($\%$) of different random selection regions in the DCT-stage of DCT-DWT-SVD module for image data in CIFAR-10.

Fig. 10. The classification accuracy of six different models in various types and degrees of noise in SVHN.

Fig. 11. The classification accuracy of six different models in various types and degrees of noise in CIFAR-10.

Fig. 12. The classification accuracy of six different models in various types and degrees of noise in CIFAR-100.

Table 2. The accuracy ($\%$) of different random selection stages of DCT-DWT-SVD module in SVHN.

Table 3. The accuracy ($\%$) of different random selection stages of DCT-DWT-SVD module in CIFAR-10.

Table 4. The accuracy ($\%$) of different random selection stages of DCT-DWT-SVD module in CIFAR-100.

Table 5. Comparison of training parameters and accuracy between DCT-DWT-SVD module and MixQualNet in CIFAR-10 [6].