
Batch normalization

Batch normalization is a technique used in neural networks to improve training speed and stability. For each mini-batch, it normalizes the activations of a layer to have a mean of zero and a standard deviation of one, then applies a learned scale and shift.

Explanation

Batch normalization (BatchNorm) addresses the internal covariate shift problem, where the distribution of network activations changes during training due to changing parameters in preceding layers. This shift can slow down training because subsequent layers must continually adapt to the new distribution. BatchNorm mitigates this by normalizing each layer's input.

Specifically, for each mini-batch, BatchNorm calculates the mean and variance of the activations. It then normalizes the activations by subtracting the mean and dividing by the standard deviation. To maintain representational capacity, BatchNorm introduces two learnable parameters, gamma (scale) and beta (shift), allowing the network to learn the optimal scaling and shifting of the normalized activations.

BatchNorm is typically applied after a linear transformation (e.g., a fully connected or convolutional layer) and before the activation function. It allows higher learning rates, makes the network less sensitive to initialization, and can act as a regularizer.
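The per-mini-batch computation described above can be sketched in a few lines of NumPy. This is a minimal illustration of the training-time forward pass only (the function name, the small epsilon for numerical stability, and the example values are choices made here, not part of any particular library's API; real frameworks also track running statistics for use at inference time):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-time BatchNorm forward pass for a (batch, features) array."""
    mean = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                      # per-feature mini-batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta              # learnable scale and shift

# Example: a mini-batch of 4 samples with 3 features each
x = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0],
              [4.0, 8.0, 12.0]])
gamma = np.ones(3)   # initialized so the layer starts as a pure normalization
beta = np.zeros(3)
y = batch_norm(x, gamma, beta)
```

With gamma at one and beta at zero, each output column has approximately zero mean and unit variance; during training, the network adjusts gamma and beta so it can recover any scale and offset the following layer needs.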

Related Terms