Neural Networks
Weight initialization
Weight initialization is the process of setting the initial values of the weights in a neural network before training begins. Proper initialization is crucial for efficient training and can significantly impact the model's ability to learn and converge to a good solution.
Explanation
In neural networks, weights are parameters that determine the strength of connections between neurons. The initial values assigned to these weights play a critical role in training: if the initial weights are too large or too small, signals can grow or shrink exponentially as they propagate through the layers, producing exploding or vanishing gradients that stall learning.

Common initialization strategies include:

- Random initialization: weights drawn from a Gaussian or uniform distribution with a small, hand-chosen scale.
- Xavier/Glorot initialization: scales the weight variance to 2 / (n_in + n_out), where n_in and n_out are the number of input and output units of the layer. This keeps the variance of activations and gradients roughly constant across layers and works well with tanh and sigmoid activations.
- He initialization: scales the variance to 2 / n_in, compensating for the fact that ReLU zeroes out roughly half of its inputs; it is the usual choice for ReLU-based networks.
- Pre-trained weights: initializing from weights learned on another task, known as transfer learning.

The choice of initialization method often depends on the network architecture and the activation functions used.
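The two variance-scaling schemes above can be sketched in a few lines of NumPy. This is a minimal illustration of the scaling rules, not a production implementation; the function names and the Gaussian (rather than uniform) variants are choices made here for clarity:

```python
import numpy as np

def xavier_init(n_in, n_out, rng=None):
    """Glorot/Xavier (normal variant): variance 2 / (n_in + n_out)
    keeps activation and gradient variance roughly constant across layers."""
    rng = rng or np.random.default_rng()
    std = np.sqrt(2.0 / (n_in + n_out))
    return rng.normal(0.0, std, size=(n_in, n_out))

def he_init(n_in, n_out, rng=None):
    """He (normal variant): variance 2 / n_in compensates for ReLU
    zeroing out roughly half of its inputs."""
    rng = rng or np.random.default_rng()
    std = np.sqrt(2.0 / n_in)
    return rng.normal(0.0, std, size=(n_in, n_out))

W = he_init(512, 256)
print(W.shape)  # (512, 256)
# Empirical std should be close to sqrt(2/512) ~ 0.0625
print(W.std())
```

Deep learning frameworks provide these schemes out of the box (for example, PyTorch's torch.nn.init module and Keras initializers), so in practice you would rarely write them by hand.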