Introduction
We know that there are two main problems when building or training a neural network: overfitting and underfitting. Overfitting means the model is over-trained on the given input samples: it can reach close to 100% accuracy on the training data, yet its accuracy on the test data is much lower. To reduce this problem, we have regularization techniques, and alongside them, the concept of noise injection can be used to reduce overfitting.
We will go through this concept of noise injection in this article.
Noise Injection
The concept of noise injection is simple. One of the main root causes of overfitting is the size of the dataset: if the dataset we are dealing with is too small, our model can reach near-perfect accuracy on the training data but will not perform nearly as well on a holdout dataset. So we need to expand the dataset, either by collecting new data or by creating additional samples with some noise added. Collecting new samples is a routine but effort-intensive task, which is why the technique of injecting noise into the existing dataset was developed. The type of noise you add should be chosen based on the actual dataset.
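To make this concrete, here is a minimal NumPy sketch that doubles a small dataset by stacking noisy copies onto the originals. The data is made up, and the 0.05 noise scale is an illustrative choice, not a recommended default:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
X = rng.random((100, 10))          # 100 samples, 10 features (toy data)
y = rng.integers(0, 2, size=100)   # binary labels

# Draw zero-mean Gaussian noise and create perturbed copies.
noise = rng.normal(loc=0.0, scale=0.05, size=X.shape)
X_noisy = X + noise

# Stack originals and noisy copies to double the training set.
X_augmented = np.vstack([X, X_noisy])
y_augmented = np.concatenate([y, y])
```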
Why add Noise?
This small-dataset problem is what motivated the development of this procedure. With very few samples, the model effectively memorizes all of them and works well on the training data. At the same time, because it has learned from so few samples, the model cannot build a good mapping between the input and output data, so its predictions for unseen inputs are poor.
Wait! Aren't you wondering why adding noise would improve the model? Doesn't it degrade model performance?
Well, the answer is no. Just as regularization constrains a model to prevent overfitting, adding noise to a dataset that is causing overfitting has a regularization effect during training and thus improves model performance.
Adding noise also effectively expands the training dataset. Each time a training sample is exposed to the model, some random noise is added to the input variables, making the sample slightly different on every exposure. In this way, adding noise to input samples is a simple form of "data augmentation". It prevents the model from memorizing the samples efficiently, resulting in a smoother mapping function.
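As a rough sketch of this on-the-fly augmentation, the loop below draws fresh Gaussian noise every epoch before each training pass, so the network never sees exactly the same inputs twice. The toy data, model size, and 0.1 standard deviation are all illustrative assumptions:

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X_train = rng.random((256, 20)).astype("float32")  # toy inputs
y_train = rng.random((256, 1)).astype("float32")   # toy targets

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

for epoch in range(10):
    # A fresh noise draw each epoch: the inputs differ every exposure.
    noise = rng.normal(0.0, 0.1, size=X_train.shape).astype("float32")
    model.fit(X_train + noise, y_train, epochs=1, verbose=0)
```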
Key Points on adding Noise
Well, the most common noise added during the training of the model is Gaussian noise, or white noise. Gaussian noise has a mean of zero, and its standard deviation is a hyperparameter we choose. The addition of this Gaussian noise to the inputs of a neural network is traditionally called "jitter".
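In Keras, jitter can be applied with the built-in GaussianNoise layer placed right after the input. The input shape and the 0.1 standard deviation below are illustrative choices:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.GaussianNoise(stddev=0.1),   # zero-mean noise on the inputs
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```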
The next and most important point is how much noise to add. If you add too little noise, it has no effect; if you add too much, the mapping becomes too difficult to learn and the model's performance degrades.
Example: Adding Gaussian Noise to Hidden Layers
The main advantage of Gaussian noise is that its standard deviation controls the spread of the noise, so we can tune exactly how much perturbation is added.
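Here is a minimal sketch of this example, with GaussianNoise layers inserted between hidden Dense layers so the noise perturbs the hidden activations rather than the raw inputs. All layer sizes and standard deviations are illustrative:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.GaussianNoise(stddev=0.1),   # noise on hidden activations
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.GaussianNoise(stddev=0.1),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```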
The major point is that noise should be added only during the training stage, whether it is added to the inputs, activations, weights, gradients, or outputs.
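Conveniently, Keras treats GaussianNoise as a regularization layer that is active only when training=True, so inference stays deterministic. The short snippet below demonstrates that behavior on a single noise layer:

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.GaussianNoise(stddev=0.5)
x = np.ones((1, 4), dtype="float32")

print(layer(x, training=True))   # noisy: values differ from 1.0
print(layer(x, training=False))  # clean: exactly the input
```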