Data augmentation technique for degraded images without losing the classification ability of clean images

Abstract. Classification networks for degraded images need to deal with various strengths of degradation, referred to as degradation levels, in practical applications. However, there has been limited exploration of data augmentation techniques for degraded images with various degradation levels. We propose a data augmentation technique that applies distinct data augmentations to the clean and degraded image domains. Specifically, the proposed method uses random erasing and CutBlur data augmentations for clean and degraded images, respectively. Experimental results show that the proposed method can effectively train a classification network for degraded images without losing the classification ability of clean images. Furthermore, the results also confirm the proposed method's efficacy across various degradations, multiple network architectures, and several datasets.


Introduction
Convolutional neural networks (CNNs) have achieved high performance in image recognition. 2-7 Typically, these CNNs are trained with only clean images and are designed to take clean images as input. However, in real-world applications, such as autonomous driving, the input images of the networks often contain various degradations, such as noise, blur, and compression. Prior studies 8,9 have pointed out that CNNs trained with only clean images cannot recognize degraded images well. Therefore, recognizing degraded images is a more critical and realistic challenge than recognizing only clean images. This paper focuses on the CNN-based classification of degraded images because classification is the typical task of image recognition.
A straightforward approach to address the classification of degraded images is to train a classification network using degraded images. 10-19 Notably, even if these images include a single type of degradation, that degradation usually has various strengths. Therefore, the classification network should be trained over various strengths of degradation. This paper uses a degradation level as a parameter representing the strength of degradation. For instance, the degradation levels are noise levels for additive white Gaussian noise (AWGN). This paper assumes that the original clean image of a degraded image can be acquired. Consequently, degraded images are assumed to be synthesized from clean images without any degradations by applying a degradation operator while changing degradation levels.
Note that synthesizing degraded images can be regarded as a kind of data augmentation. To the best of the author's knowledge, there is limited literature on data augmentation methods for degraded images over various levels of degradation. One such method is mixed training, 10 which is a straightforward approach for augmenting degraded images with various levels of degradation. Mixed training involves the following four steps. (1) A clean image is randomly sampled from a training dataset. (2) A degradation level, following a uniform distribution, is randomly sampled. (3) A degraded image is acquired from the clean image by applying a degradation operator with the sampled degradation level. (4) A classification network is trained using the degraded image. However, a classification network trained by mixed training loses the classification ability of clean images compared with a network trained with only clean images 8 because mixed training trains a network on the average over various levels of degradation, as illustrated in Fig. 1. To overcome this drawback, Endo et al. 18,20 introduced a network structure termed the feature adjustor. This paper proposes a data augmentation technique that overcomes this drawback without relying on special network structures, in which degraded images are assumed to have a single known degradation with unknown levels of degradation. Therefore, our goal is to construct a data augmentation technique for degraded images that can train a classification network of degraded images without losing the classification ability of clean images, as depicted in Fig. 1.
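The four mixed-training steps above can be sketched as follows. This is a minimal illustration that assumes AWGN as the degradation operator; the noise levels, image size, and all function names are chosen only for the example and are not from the original implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def degrade_awgn(image, sigma, rng):
    """Example degradation operator: AWGN with standard deviation `sigma`
    on the 8-bit intensity scale (output clipped to [0, 255])."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 255.0)

def mixed_training_sample(clean, levels, rng):
    """One mixed-training draw: sample a degradation level from a discrete
    uniform distribution that also includes the clean image, then apply
    the degradation operator at that level (steps 2 and 3 above)."""
    i = int(rng.integers(0, len(levels) + 1))  # N levels + clean -> 1/(N+1) each
    if i == 0:
        return clean                           # the clean image is kept as-is
    return degrade_awgn(clean, levels[i - 1], rng)

clean = np.full((32, 32, 3), 128.0)            # toy "clean" image (step 1)
levels = list(range(1, 51))                    # AWGN sigma in {1, ..., 50}
augmented = mixed_training_sample(clean, levels, rng)  # fed to training (step 4)
```

The returned image would then be passed to the training loop in step (4).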
Figure 2 illustrates several data augmentations of degraded images. Mixed training, as shown in Fig. 2(a), has already been mentioned. Figure 2(b) shows mixed training with random erasing, 21 which first generates degraded images by mixed training and then applies random erasing to them. CutBlur 22 is presented in Fig. 2(c). CutBlur generates a degraded image with a clean region or a clean image with a degraded region. These regions are highlighted as rectangles with white edges in Fig. 2(c). Generally, the same data augmentation methods are applied to both clean and degraded images during the training of classification networks. However, this might not be the best strategy because clean images belong to a distinct domain from degraded images. A more intuitive approach is to apply augmentation methods appropriate to each image's domain. Based on this idea, this paper proposes a data augmentation technique that applies different operations to clean and degraded images. Specifically, as illustrated in Fig. 2(d), the proposed method is a combination of random erasing for clean images and CutBlur for degraded images. Applying different data augmentations to clean and degraded images trains a classification network of degraded images without losing the classification ability of clean images. This paper's contributions are as follows.
(1) Recognizing that clean and degraded images belong to distinct domains, this paper proposes a data augmentation technique that applies separate data augmentations to each domain. Specifically, the proposed method utilizes random erasing for clean images and CutBlur for degraded images.

Endo: Data augmentation technique for degraded images without losing. . .
(2) Unlike mixed training, a classification network of degraded images trained using the proposed method does not lose the classification ability of clean images. Furthermore, the proposed method shows more stable performance in classifying high-quality images than mixed training with random erasing and CutBlur. (3) To validate the effectiveness of the proposed method against various degradations, the proposed method was tested on four types of degradation: JPEG distortion, Gaussian blur, AWGN, and salt-and-pepper noise. (4) To evaluate the robustness of the proposed method across different network structures and datasets, the proposed method was confirmed using four classification CNNs, i.e., VGG16, 1 ResNet50, 4,5 ResNet56, 4,5 and PyramidNet110-270 6 with ShakeDrop regularization, 7 and on three datasets: CIFAR-10, 23 CIFAR-100, 23 and Tiny ImageNet. 24 The remainder of this paper is organized as follows. Section 2 describes the related works. Section 3 explains the proposed method. Then experiments are described in Sec. 4. Finally, conclusions are described in Sec. 5.

Data Augmentations of Degraded Images
There are few papers related to data augmentations of degraded images with various degradation levels. Peng et al. 10 investigated the fine-grained classification of low-resolution images. They proposed staged training, in which a classifier is trained with high-resolution images before being trained with low-resolution images. Their aim was to transfer the knowledge of high-resolution images to the classifier of low-resolution images rather than to provide data augmentation. In their experiments, they used mixed training, which involves randomly sampling both low- and high-resolution images to train a network. Their results showed that mixed training is superior to staged training for the classification of high-resolution images. In this paper, mixed training is used as the baseline data augmentation for degraded images with various levels of degradation. Meanwhile, Yoo et al. 22 introduced the data augmentation technique CutBlur for single-image super-resolution. CutBlur replaces a region of either a high-resolution or a low-resolution image with the corresponding region of its paired image, assuming a pair of high-resolution and low-resolution images exists. This paper applies CutBlur to the classification of degraded images with various levels of degradation. Specifically, in the proposed method, CutBlur is applied to only degraded images.
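The CutBlur operation described above can be sketched as follows. This is a minimal illustration of the direction used in the proposed method (pasting a clean region into the degraded image); the hyperparameter names and bounds are illustrative assumptions, not the original implementation.

```python
import numpy as np

def cutblur(clean, degraded, r_min=0.125, r_max=0.75, rng=None):
    """CutBlur-style mixing: paste a random rectangle of the clean image
    into its degraded counterpart. r_min/r_max bound the relative side
    lengths of the pasted rectangle (names are illustrative)."""
    rng = rng or np.random.default_rng()
    h, w = clean.shape[:2]
    rh = int(h * rng.uniform(r_min, r_max))          # random rectangle height
    rw = int(w * rng.uniform(r_min, r_max))          # random rectangle width
    y = int(rng.integers(0, h - rh + 1))             # random top-left corner
    x = int(rng.integers(0, w - rw + 1))
    out = degraded.copy()
    out[y:y + rh, x:x + rw] = clean[y:y + rh, x:x + rw]  # clean patch inside degraded image
    return out

clean = np.ones((32, 32, 3))      # toy clean image
degraded = np.zeros((32, 32, 3))  # toy degraded counterpart
mixed = cutblur(clean, degraded, rng=np.random.default_rng(0))
```

The symmetric direction (a degraded patch inside a clean image) follows by swapping the two arguments.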

Data Augmentations of Image Mixing and Deleting
There are many data augmentations of image mixing and deleting, as surveyed by Naveed et al. 25 DeVries et al. 26 introduced Cutout, a data augmentation technique that deletes a fixed-size square region from images. Cutout always deletes a square region but randomly selects the position of the region. Zhong et al. 21 proposed a data augmentation technique called random erasing, which randomly deletes a rectangular region inside images. Notably, the size of the rectangular region is randomly determined. Moreover, random erasing allows the rectangular region to be replaced with various colors or random noise. Yun et al. 27 proposed CutMix data augmentation, which replaces a rectangular region of an image with a region of another image. Cutout, random erasing, and CutMix do not consider the presence of image degradation. By contrast, the proposed method takes degradation into account and changes the data augmentation method based on the presence or absence of degradation in a training image. Specifically, the proposed method applies random erasing to only clean images without any degradation.
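The black-fill random-erasing variant used in this paper can be sketched as follows; the size bounds and sampling of an independent height and width (rather than a height and aspect ratio) follow the experimental settings described later, and the function name is invented for the example.

```python
import numpy as np

def random_erase_black(image, r_min=0.125, r_max=0.75, rng=None):
    """Random-erasing variant used here: a rectangle of random size and
    position is erased and always filled with black (zeros)."""
    rng = rng or np.random.default_rng()
    out = image.copy()
    h, w = out.shape[:2]
    eh = int(h * rng.uniform(r_min, r_max))  # random height
    ew = int(w * rng.uniform(r_min, r_max))  # random width (no aspect-ratio sampling)
    y = int(rng.integers(0, h - eh + 1))
    x = int(rng.integers(0, w - ew + 1))
    out[y:y + eh, x:x + ew] = 0              # always black fill
    return out

erased = random_erase_black(np.full((32, 32, 3), 255.0), rng=np.random.default_rng(0))
```

Standard random erasing would instead also sample the fill value (random noise or a color); the black fill here matches the comparison protocol of the experiments.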
Hendrycks et al. 28 introduced AUGMIX to boost robustness against a domain gap between training and testing images. First, AUGMIX performs multiple operations on an image independently and acquires the corresponding images. Then those images are merged into a single image. Notably, AUGMIX does not incorporate in its data augmentations any of the degradations that the testing images include. This paper uses the same degradation operators for the training and testing image domains. Thus the problem setting of this paper differs from that of AUGMIX.

Classification of Degraded Images
Three primary approaches exist for the classification of degraded images: a straightforward approach, a restoration approach, and a knowledge distillation approach. The straightforward approach 10,11 trains an image classification network directly with degraded images. By contrast, the restoration approach 9,12-15 uses a sequential network composed of a restoration network and a classification network, in which the classification network is trained with clean images without any degradation. First, degraded images are restored by the restoration network. Next, the restored images are input into the classification network trained with clean images. The classification network may be fine-tuned with restored images. The knowledge distillation approach for degraded images 16-20,29 transfers the knowledge of a teacher network into a student network. Typically, the teacher network is a classification network trained with only clean images. The student network is trained with degraded images so that its image features or predicted distribution coincide with those of the teacher network, which takes clean images as input. This paper focuses on the straightforward approach, which offers a clearer view of the impact of data augmentations than the other approaches.

Proposed Method
Although clean and degraded images belong to different domains, the same data augmentation methods are usually applied to both. However, it would be more reasonable to apply distinct data augmentations to each domain. Based on this concept, this paper proposes a data augmentation technique that applies distinct transformations to clean and degraded images, as illustrated in Fig. 3. Initially, mixed training 10 is employed to sample degradation levels using a discrete uniform distribution, in which a clean image is not sampled. With the determined degradation level, a degraded image is synthesized by applying a degradation operator to a clean image. Subsequently, either a clean or degraded image is randomly sampled, as demonstrated in Fig. 3. If a clean image is selected, random erasing 21 is applied to the clean image with a decision probability p d, in which random erasing erases a region inside the clean image. Although original random erasing can replace the erased region with random noise or various colors, the proposed method always fills the region with black. Conversely, when a degraded image is selected, CutBlur 22 is applied to the degraded image with decision probability p d, in which CutBlur replaces a region inside the degraded image with the corresponding region of its clean image. The detailed algorithm of the proposed method is shown in Algorithm 1.
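The flow just described can be sketched as follows. This is a minimal illustration, not a reproduction of Algorithm 1: the degradation operator is passed as a callback, the rectangle bounds 0.125 and 0.75 are taken from the experimental settings reported later, and all function names are invented for the example.

```python
import numpy as np

def _rand_rect(h, w, rng, r_min=0.125, r_max=0.75):
    """Random rectangle (top, left, height, width); relative sides in [r_min, r_max]."""
    rh = int(h * rng.uniform(r_min, r_max))
    rw = int(w * rng.uniform(r_min, r_max))
    return int(rng.integers(0, h - rh + 1)), int(rng.integers(0, w - rw + 1)), rh, rw

def proposed_augment(clean, degrade, levels, p_d=1.0, rng=None):
    """Sketch of the proposed method: sample a degradation level uniformly
    (the clean image is not sampled here), synthesize the degraded image,
    then sample a domain with probability 1/2 and apply the domain-specific
    augmentation: random erasing (black fill) for clean, CutBlur for degraded."""
    rng = rng or np.random.default_rng()
    level = levels[int(rng.integers(0, len(levels)))]  # uniform over levels only
    degraded = degrade(clean, level)
    h, w = clean.shape[:2]
    if rng.random() < 0.5:                       # clean domain sampled
        out = clean.copy()
        if rng.random() < p_d:                   # decision probability p_d
            y, x, rh, rw = _rand_rect(h, w, rng)
            out[y:y + rh, x:x + rw] = 0          # random erasing, always black
    else:                                        # degraded domain sampled
        out = degraded.copy()
        if rng.random() < p_d:
            y, x, rh, rw = _rand_rect(h, w, rng)
            out[y:y + rh, x:x + rw] = clean[y:y + rh, x:x + rw]  # CutBlur
    return out

# Toy usage with AWGN as the degradation operator
rng = np.random.default_rng(0)
awgn = lambda img, s: np.clip(img + rng.normal(0.0, s, img.shape), 0.0, 255.0)
aug = proposed_augment(np.full((32, 32, 3), 128.0), awgn, levels=range(1, 51), rng=rng)
```

Note the asymmetry that defines the method: the erased region of a clean image carries no degradation information, whereas the CutBlur region of a degraded image re-exposes clean pixels.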

Experiments
This section validates the efficacy of the proposed method. Table 1 shows four existing methods compared with the proposed method in terms of domain sampling probabilities and data augmentations, in which the domain sampling probability means the probability of sampling each domain, i.e., clean or degraded images. "Clean" signifies a naive training method using only clean images without any degradations. On the other hand, "Mixed" denotes mixed training, which randomly samples degradation levels, including a clean image, and applies degradation.
For "Mixed," the sampling probability of each degradation level is 1/(N + 1), where N denotes the number of degradation levels, excluding a clean image. "Clean" and "Mixed" are baselines for clean images and degraded images, respectively. Then "Mixed R.E." denotes mixed training combined with random erasing data augmentation. After generating degraded images using mixed training, "Mixed R.E." applies random erasing. Because it is based on mixed training, the domain sampling probability of "Mixed R.E." is the same as that of "Mixed." For a fair comparison with the proposed method, the region of an image was always erased and filled with black. Although standard random erasing randomly selects a height and an aspect ratio, "Mixed R.E." randomly samples a height and a width in the same way as Algorithm 1. Furthermore, "CutBlur" denotes the data augmentation method of the same name. "CutBlur" has a domain sampling probability of 1/2 for each of the clean and degraded image domains. For a fair comparison with the proposed method, the decision probability of "CutBlur" was always set to 1.0. This implies that clean and degraded images were always mixed. The implementation of "CutBlur" mirrored the CutBlur part of Algorithm 1. Finally, "Proposed" stands for the proposed method. The decision probability p d of "Proposed" was set to 1.0, as discussed in Appendix 6.2. The parameters R min and R max for "Mixed R.E.," "CutBlur," and "Proposed" were fixed at 0.125 and 0.75, respectively. All experiments were executed using Python 3.7.7, PyTorch 1.9.0, CUDA 11.7, and PIL 9.3 on an NVIDIA RTX A6000 GPU and an Intel Core i9-10940X CPU clocked at 3.3 GHz. Other detailed experiments are described in Appendix A.

Training and Evaluation Procedures
The training procedure for the experiments is as follows. First, a horizontal flip is randomly applied to a clean image sampled from a dataset. Then the clean image is transformed according to each training condition, as seen in Table 1. Subsequently, random cropping yields an augmented image. Finally, a classification network is trained with the augmented image while minimizing the expectation of the cross-entropy loss between estimated and true labels. The expectation is replaced by the sample mean over a minibatch during training. The details of the optimizer settings are explained in the subsequent analyses because they depend on the structure of the classification networks.
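The surrounding transforms (random horizontal flip first, random crop last) can be sketched as follows; the zero-padding size for the crop and the placeholder for the per-condition transform are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def train_augment(clean, rng, pad=4):
    """Wrapper transforms around the per-condition augmentation of Table 1:
    random horizontal flip first, random crop (with zero padding) last."""
    out = clean[:, ::-1] if rng.random() < 0.5 else clean   # random horizontal flip
    # ... per-condition transform (e.g., the proposed augmentation) goes here ...
    h, w = out.shape[:2]
    padded = np.pad(out, ((pad, pad), (pad, pad), (0, 0)))  # zero-pad the borders
    y = int(rng.integers(0, 2 * pad + 1))                   # random crop offset
    x = int(rng.integers(0, 2 * pad + 1))
    return padded[y:y + h, x:x + w]                         # crop back to h x w

crop = train_augment(np.ones((32, 32, 3)), np.random.default_rng(0))
```

The resulting image would then be fed to the network, with the minibatch mean of the cross-entropy loss standing in for the expectation.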
The evaluation procedure is as follows. First, degraded testing images are synthesized from clean testing images by applying a degradation operator at every degradation level. Then a trained classification network infers class labels for both clean and degraded images. Finally, two evaluation metrics are calculated: accuracy and interval mean accuracy. 8 The accuracy, denoted by Acc, gives the number of correct predictions divided by the total number of testing images. The interval mean accuracy is defined as

Acc(α, β) = [1 / (β − α + 1)] Σ_{i=α}^{β} Acc(f_θ(D(I_c, q_i)), Y), (1)

where f_θ denotes a classification network with parameter θ. D denotes a degradation operator, such as JPEG distortion, Gaussian blur, etc. I_c, Y, and q_i represent a clean image, the associated true label of the clean image, and the i'th degradation level, respectively. α and β are integers satisfying 0 ≤ α ≤ β. The rationale behind using the interval mean accuracy is as follows. The Acc calculated for each degradation level may fluctuate because of the variance in a classification network's predictions. This implies that Acc does not necessarily change smoothly with the degradation level, even if degradation levels are consecutive. Therefore, the interval mean accuracy, which averages accuracies over a range of degradation levels, provides a more coherent and effective metric than analyzing individual Acc values. The interval mean accuracy simplifies understanding of the classification performance over various levels of degradation.
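Given per-level accuracies, the interval mean accuracy of Eq. (1) reduces to a simple average over an index interval. The sketch below uses toy accuracy values purely for illustration.

```python
def interval_mean_accuracy(acc_per_level, alpha, beta):
    """Eq. (1): average the per-level accuracies Acc(f_theta(D(I_c, q_i)), Y)
    over degradation levels q_alpha..q_beta. Here acc_per_level[i] stores
    the accuracy measured at level q_i."""
    accs = [acc_per_level[i] for i in range(alpha, beta + 1)]
    return sum(accs) / (beta - alpha + 1)

# Toy per-level accuracies for JPEG quality factors 1..100 (illustrative only)
acc = {i: 0.5 + 0.004 * i for i in range(1, 101)}
high_quality = interval_mean_accuracy(acc, 81, 100)  # Acc(81,100)
low_quality = interval_mean_accuracy(acc, 1, 20)     # Acc(1,20)
```

Averaging over an interval smooths the level-to-level fluctuations that individual Acc values exhibit.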

Analysis for JPEG CIFAR-10
Now, experimental comparisons are performed to confirm the proposed method's effectiveness. Furthermore, we also demonstrate that the proposed method does not depend on a specific network architecture by evaluating three classification CNNs: VGG16, 1 ResNet50, 4,5 and PyramidNet110-270 6 with ShakeDrop regularization. 7 These CNNs differ in architecture and number of parameters. In these experiments, the following dataset, degradation, and optimizers were used.
Dataset. The CIFAR-10 dataset, 23 comprising 60,000 RGB images with a resolution of 32 × 32 pixels, was used. This dataset is split into 50,000 training images and 10,000 testing images across 10 classes.
Degradation. JPEG distortion was chosen because JPEG compression is the de facto standard of image compression. In all experiments for JPEG distortion, JPEG quality factors, ranging from 1 to 100, were used instead of degradation levels. For clarity, we refer to the CIFAR-10 dataset with JPEG distortion applied as "JPEG CIFAR-10."

Optimizers. For VGG16 and ResNet50, the RAdam 30 optimizer was used with an initial learning rate of 0.001 and a weight decay of 0.0001. On the other hand, PyramidNet110-270 with ShakeDrop regularization was trained using stochastic gradient descent (SGD), consistent with the approach reported by Yamada et al. 7 The learning rate was set to 0.1 initially and multiplied by 0.1 at the 75th and 150th epochs. Additionally, a momentum of 0.9 and a weight decay of 0.0001 were applied.
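A JPEG degradation operator at a given quality factor can be sketched with Pillow's JPEG encoder (the paper reports using PIL); the encode-decode round trip below is a minimal illustration, not the paper's implementation, and the toy image is random noise chosen only for the example.

```python
import io

import numpy as np
from PIL import Image

def jpeg_degrade(image_u8, quality):
    """Degradation operator for JPEG distortion: encode the clean image at
    the given JPEG quality factor (1..100) and decode it back."""
    buf = io.BytesIO()
    Image.fromarray(image_u8).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf))

rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
low_q = jpeg_degrade(clean, 5)    # heavily compressed (low quality factor)
high_q = jpeg_degrade(clean, 95)  # lightly compressed (high quality factor)
```

Note that a higher quality factor means a weaker degradation, which is why the accuracy tables index intervals by quality factor rather than by degradation level.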
Table 2 shows the interval mean accuracy of JPEG CIFAR-10 for VGG16, ResNet50, and PyramidNet110-270 with ShakeDrop regularization. Regarding the accuracy of JPEG CIFAR-10, the results are shown in Appendix B. First, focusing on the results of VGG16, the proposed method demonstrates that it can classify degraded images without losing the classification ability of clean images. Comparing "Clean" and "Mixed," "Clean" shows higher accuracy in Acc(81,100) and on clean images than "Mixed" but is worse for the other interval mean accuracies. In particular, the performance of "Mixed" drops significantly, by 0.052, in classifying clean images. "Mixed R.E." outperforms "Mixed" but still significantly underperforms "Clean," by 0.040, in the classification of clean images. "CutBlur" outperforms "Mixed" in most intervals but still underperforms "Clean" by 0.026 in the classification of clean images. When comparing "Proposed" with "Clean," "Proposed" exhibits a difference of only 0.006, indicating good classification performance on clean images. Moreover, "Proposed" outperforms the three existing methods, i.e., "Mixed," "Mixed R.E.," and "CutBlur," except for Acc(1,20). The results show that a classification CNN trained by the proposed method can classify degraded images without losing the classification ability of clean images.
Subsequently, we confirm that the proposed method is effective not only for VGG16 but also for other CNNs. Regarding ResNet50, the interval mean accuracy of JPEG CIFAR-10 is shown in the middle row of Table 2. Only "Proposed" achieves an accuracy almost equivalent to "Clean" in classifying clean images. However, "Proposed" underperforms "Mixed" in Acc(1,20). Furthermore, the interval mean accuracy for PyramidNet110-270 with ShakeDrop regularization is shown at the bottom of Table 2. Comparing "Proposed" and "Mixed," "Proposed" outperforms "Mixed" except for Acc(1,20). This tendency is similar to that of ResNet50. However, the relationship between "Proposed" and "Mixed R.E." differs slightly from VGG16 and ResNet50. Specifically, "Proposed" performs better than "Mixed R.E." for Acc(61,80), Acc(81,100), and clean images but worse for the other interval mean accuracies, as seen in Table 2. In other words, "Proposed" shows good performance for high-quality images, and "Mixed R.E." shows good performance for low-quality images. Notably, only "Proposed" outperforms "Clean" in classifying clean images. As a result, the proposed method is also effective for ResNet50 and PyramidNet110-270 with ShakeDrop regularization. This indicates that the proposed method does not depend on a specific network architecture.

Application to Other Degradations
The proposed method is evaluated for other degradations with the CIFAR-10 dataset: Gaussian blur, AWGN, and salt-and-pepper noise. In these evaluations, PyramidNet110-270 with ShakeDrop regularization was used because it showed the best performance in classifying JPEG CIFAR-10.
The second row of Table 3 shows the interval mean accuracy of Gaussian blurring CIFAR-10. The degradation level denotes the standard deviation of the Gaussian blur kernel, varying from 0 to 5 in increments of 0.1. In classifying clean images, "Proposed" outperforms "Clean" and "CutBlur." Moreover, "Proposed" is superior for high-quality images, whereas "Mixed R.E." is superior for low-quality images. This tendency is similar to that of JPEG CIFAR-10.
Subsequently, the evaluation shifts to CIFAR-10 with added white Gaussian noise, termed AWGN CIFAR-10. The interval mean accuracy of AWGN CIFAR-10 is presented in the third row of Table 3. The degradation level denotes the standard deviation of the Gaussian noise, varying from 0 to 50 in increments of 1.0 on the 8-bit intensity scale. "Proposed" outperforms "Clean" in classifying clean images. In addition, "Proposed" is superior for high-quality images, whereas "Mixed R.E." is superior for low-quality images. This tendency is similar to that of Gaussian blurring CIFAR-10. Finally, the last row of Table 3 shows the interval mean accuracy of CIFAR-10 with added salt-and-pepper noise. The degradation level signifies the density of salt-and-pepper noise, varying from 0 to 0.25 in increments of 0.01. "Proposed" outperforms the other existing methods for all interval mean accuracies.

Table 3 Interval mean accuracy of CIFAR-10 under several degradations using PyramidNet110-270 with ShakeDrop regularization. The degradation level denotes the standard deviation of the blur kernel for Gaussian blur, the noise standard deviation for AWGN, and the density for salt-and-pepper noise. "All" means the interval mean accuracy over all degradation levels, including a clean image. All results are averaged over three runs. Italic numbers represent standard deviations. In each interval, the bold value indicates the highest interval mean accuracy among all five methods.
These results show that the proposed method can train a classification network without losing the classification ability of high-quality images, including clean images, for the above three degradations; that is, the proposed method is effective not only for JPEG distortion but also for other degradations.
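Two of the degradation operators above can be sketched in NumPy; the blur-kernel truncation radius and the equal split between salt and pepper pixels are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def gaussian_blur(image, sigma, radius=7):
    """Gaussian blur with kernel standard deviation `sigma`, applied as
    separable 1-D convolutions along height and width (per channel,
    edge pixels replicated; `radius` truncates the kernel support)."""
    if sigma <= 0:
        return image.copy()
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()                                  # normalize to preserve brightness
    out = image.astype(float)
    for axis in (0, 1):
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"),
                                  axis, padded)
    return out

def salt_and_pepper(image, density, rng):
    """Salt-and-pepper noise: each pixel becomes black or white with
    probability density/2 each (all channels of the pixel are replaced)."""
    out = image.astype(float).copy()
    u = rng.random(out.shape[:2])
    out[u < density / 2] = 0.0                              # pepper
    out[(u >= density / 2) & (u < density)] = 255.0         # salt
    return out

rng = np.random.default_rng(0)
img = np.full((32, 32, 3), 128.0)
blurred = gaussian_blur(img, sigma=2.0)
noisy = salt_and_pepper(img, density=0.1, rng=rng)
```

An AWGN operator follows the same pattern by adding zero-mean Gaussian noise at the sampled standard deviation and clipping to [0, 255].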

Analysis for CIFAR-100
To evaluate the efficacy of the proposed method on other datasets, the proposed method is applied to CIFAR-100. 23 CIFAR-100 has 100 classes and contains 50,000 training images and 10,000 testing images. As the classification network, PyramidNet110-270 with ShakeDrop regularization was used. The training strategy was almost the same as that of CIFAR-10 except for the detailed settings of SGD. The learning rate was set to 0.5 initially and multiplied by 0.1 at the 150th and 225th epochs. Additionally, a momentum of 0.9 and a weight decay of 0.0001 were applied. Regarding degradations, four types were analyzed: JPEG distortion, Gaussian blur, AWGN, and salt-and-pepper noise. The range of JPEG quality factors and degradation levels was consistent with that of CIFAR-10 for each type.
Table 4 shows the interval mean accuracy of CIFAR-100 degraded by four types of degradation. Only "Proposed" attains the same level of performance as "Clean" in classifying clean images for all degradations. However, "Proposed" underperforms "Mixed" or "Mixed R.E." in the classification of low-quality images except for salt-and-pepper noise. For salt-and-pepper noise, "Mixed R.E." and "Proposed" show almost the same performance for every interval mean accuracy.
The results show that the proposed method can classify high-quality images well, including clean images, whereas "Mixed R.E." is superior in the classification of low-quality images. This tendency is the same as that observed in the CIFAR-10 evaluations; that is, the proposed method is also effective for the CIFAR-100 dataset.

Analysis for Tiny ImageNet
In this section, the efficacy of the proposed method is evaluated using Tiny ImageNet. 24 This dataset was chosen due to its higher resolution compared with the CIFAR datasets. Tiny ImageNet contains 100,000 training images and 10,000 testing images for 200 classes. Each class has 500 images for training and 50 images for testing. Tiny ImageNet images are RGB with a resolution of 64 × 64 pixels. For the network architecture, ResNet56 4,5 was utilized instead of PyramidNet110-270 with ShakeDrop regularization to reduce the training time. The optimization method was SGD. The learning rate was set to 0.1 initially and multiplied by 0.1 at epochs 60, 120, 160, 200, 240, and 280. Additionally, a momentum of 0.9 and a weight decay of 0.0001 were applied.
Table 5 shows the interval mean accuracy of Tiny ImageNet degraded by four types of degradation: JPEG distortion, Gaussian blur, AWGN, and salt-and-pepper noise. Regarding AWGN and salt-and-pepper noise, "Mixed" outperforms "Clean" in the classification of clean images and already attains this paper's goal. Consequently, "Mixed R.E." shows almost the best performance for AWGN and salt-and-pepper noise. This superior performance appears to result from the higher resolution of Tiny ImageNet in addition to these naive pixel-wise degradations. However, "Proposed" outperforms "Mixed R.E." in classifying clean images and outperforms "Mixed" in almost all intervals. Next, regarding JPEG distortion, the accuracy difference between "Clean" and "Mixed" is relatively minor (0.007) in classifying clean images. Thus "Mixed R.E." can outperform "Clean" in classifying clean images and is the best in the interval "All." This is because the image quality is not degraded as much as in JPEG CIFAR-10, owing to Tiny ImageNet's higher resolution, even when JPEG distortion is applied. However, "Proposed" outperforms "Mixed R.E." in classifying high-quality images, including clean images, and outperforms "Mixed" in almost all intervals. Finally, focusing on Gaussian blur, only "Proposed" outperforms "Clean" in classifying clean images as well as high-quality images. In addition, "Proposed" is the best in the interval "All." As a result, only "Proposed" stably outperforms "Clean" in classifying clean images for all types of degradation. The proposed method is effective not only for CIFAR-100 but also for Tiny ImageNet.
Table 4 Interval mean accuracy of CIFAR-100 under several degradations using PyramidNet110-270 with ShakeDrop regularization. For JPEG distortion, the JPEG quality factor is used instead of a degradation level. For the other degradations, the degradation level denotes the standard deviation of the blur kernel for Gaussian blur, the noise standard deviation for AWGN, and the density for salt-and-pepper noise. "All" means the interval mean accuracy over all quality factors or degradation levels, including a clean image. All results are based on a single run. In each interval, the bold value indicates the highest interval mean accuracy among all five methods.

Conclusions
This paper proposed a data augmentation technique for degraded images with various levels of degradation that applies distinct data augmentations to clean and degraded images. This paper also showed that the proposed method can effectively train a classification network of degraded images without losing the classification ability of clean images. In addition, experimental results showed that the proposed method delivers more stable performance in classifying high-quality images than mixed training with random erasing and CutBlur. Furthermore, the proposed method's effectiveness was confirmed for four types of degradation: JPEG distortion, Gaussian blur, AWGN, and salt-and-pepper noise. Finally, the robustness of the proposed method was demonstrated on three datasets, i.e., CIFAR-10, CIFAR-100, and Tiny ImageNet, and on four classification CNNs, i.e., VGG16, ResNet50, ResNet56, and PyramidNet110-270 with ShakeDrop regularization.

Table 5 Interval mean accuracy of Tiny ImageNet under several degradations with ResNet56. For JPEG distortion, the JPEG quality factor is used instead of a degradation level. For the other degradations, the degradation level denotes the standard deviation of the blur kernel for Gaussian blur, the noise standard deviation for AWGN, and the density for salt-and-pepper noise. "All" means the interval mean accuracy over all quality factors or degradation levels, including a clean image. All results are based on a single run. In each interval, the bold value indicates the highest interval mean accuracy among all five methods.
Although this paper demonstrated the effectiveness of the proposed method, we found an opportunity for improvement. The proposed method enhances the classification of high-quality images; however, it reduces the classification ability for low-quality images. A straightforward extension could treat the domain sampling probability as an adjustable parameter. Optimizing the domain sampling probability might reduce the observed trade-off. This trade-off should be further investigated in the near future, especially by analyzing the feature discrepancy between clean and degraded images.
of the proposed method, CutBlur boosts classification across all degradation levels, except for extremely low-quality images. Moreover, CutBlur might enhance the classification of clean images because degraded images processed with CutBlur retain a region of the clean image.

Decision Probability
In Sec. 4, a decision probability of 1.0 was used. This section examines how the proposed method performs when the decision probability is varied. Table 7 shows the interval mean accuracy when the decision probability is varied from 0 to 1.0 in increments of 0.2. The classification performance for high-quality images improves as the decision probability increases. Regarding the classification of clean images, a decision probability of 1.0 outperforms all other probabilities. Therefore, a decision probability of 1.0 seems to be the most reasonable choice.

Domain Sampling Probability
The proposed method always uses a domain sampling probability of 1/2 (= 0.5). Here, we numerically validate the impact of the domain sampling probability. Table 8 shows the interval mean accuracy when the domain sampling probability of degraded images is varied from 0.1 to 0.9 in increments of 0.1. In view of this paper's goal, the proposed method needs to be as close as possible to both "Clean" in the classification of clean images and "Mixed" in the classification of low-quality images. As seen in Table 2, "Clean" shows an accuracy of 0.928 for the classification of clean images, and "Mixed" shows an Acc(1,20) of 0.752. Compared with these two values, probabilities around 0.5 and 0.6 are good choices, as seen in Table 8; that is, using a domain sampling probability of 0.5 seems plausible.

Appendix B: Accuracy of JPEG CIFAR-10
The accuracy of JPEG CIFAR-10 is presented for each JPEG quality factor using three networks: VGG16, ResNet50, and PyramidNet110-270 with ShakeDrop regularization. Figure 4 shows the classification accuracy of JPEG CIFAR-10 for each degradation level, where the accuracy of clean images is plotted next to the JPEG quality factor of 100. For all networks, "Proposed" outperforms "Mixed" for JPEG quality factors above 20. Moreover, "Proposed" approaches "Clean" as the JPEG quality factor increases. These observations are consistent with the analysis using the interval mean accuracy.

Fig. 1
Fig. 1 Classification accuracy over various degradation levels with different training methods.

Fig. 2
Fig. 2 Data augmentations of degraded images: (a) mixed training, (b) mixed training with random erasing, (c) CutBlur, and (d) proposed method. The degradation is JPEG distortion, in which the degradation level denotes 101 minus the JPEG quality factor. Black rectangles denote erased regions. Rectangles with white edges denote regions replaced by other image qualities. "D.L." stands for degradation level.

Fig. 3
Fig. 3 Proposed method. p d denotes a decision probability to apply either random erasing or CutBlur. p is a random number sampled uniformly from [0, 1].

Fig. 4
Fig. 4 Accuracy of JPEG CIFAR-10 for (a) VGG16, (b) ResNet50, and (c) PyramidNet110-270 with ShakeDrop regularization. The accuracy of clean images is plotted next to the JPEG quality factor of 100. All values are averaged over three runs. The accuracy for JPEG quality factors over 90 is shown zoomed in the right-hand graphs.

Table 1
Comparison of existing and proposed methods in terms of domain sampling probabilities and data augmentations. "Mixed" and "Mixed R.E." denote mixed training 10 and mixed training with random erasing (R.E.), 21 respectively. N represents the number of degradation levels used in the experiments, excluding a clean image.

Table 2
Interval mean accuracy of JPEG CIFAR-10 with three networks: VGG16, ResNet50, and PyramidNet110-270 with ShakeDrop regularization. The JPEG quality factor is used instead of a degradation level. "All" means the interval mean accuracy over all quality factors, including a clean image. All results are averaged over three runs. Italic numbers represent standard deviations. In each interval, the bold value indicates the highest interval mean accuracy among all five methods.