Security
Adversarial example
An adversarial example is an input crafted to mislead a machine learning model into making an incorrect prediction. Such examples are typically created by adding small, carefully calculated perturbations to a legitimate input; the perturbations are imperceptible to humans but cause the model to misclassify the input.
Explanation
Adversarial examples exploit vulnerabilities in machine learning models, particularly deep neural networks, by leveraging the high dimensionality and non-linear nature of these models. Attackers use gradient-based techniques such as the single-step Fast Gradient Sign Method (FGSM) or the iterative Projected Gradient Descent (PGD) to adjust the input features, pushing the model's internal activations across decision boundaries and producing incorrect classifications. The existence of adversarial examples highlights the fragility of these models and their susceptibility to even minor input variations. This poses significant security risks in safety-critical applications such as autonomous driving and medical diagnosis, where misclassifications can have severe consequences. Defenses against adversarial attacks are an active area of research, with techniques including adversarial training (training the model on adversarial examples), defensive distillation, and input sanitization.
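The gradient-based idea behind FGSM can be sketched on a toy model. The snippet below is a minimal illustration, not an attack on a real network: it assumes a hand-built binary logistic-regression classifier with made-up weights, computes the gradient of the cross-entropy loss with respect to the input, and takes one step in the direction of the gradient's sign to increase the loss.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y_true, eps):
    """One-step Fast Gradient Sign Method for binary logistic regression.

    For cross-entropy loss L = -[y log p + (1-y) log(1-p)] with
    p = sigmoid(w.x + b), the gradient w.r.t. the input is (p - y) * w.
    The adversarial input x + eps * sign(grad) takes a small, bounded
    step that increases the loss.
    """
    p = sigmoid(np.dot(w, x) + b)
    grad_x = (p - y_true) * w            # dL/dx for cross-entropy loss
    return x + eps * np.sign(grad_x)

# Toy "model" (illustrative weights, not a trained network).
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.4, -0.3, 0.2])           # legitimate input with true label 1
y = 1.0

p_clean = sigmoid(np.dot(w, x) + b)      # ~0.75: correctly classified as 1
x_adv = fgsm_perturb(x, w, b, y, eps=0.6)
p_adv = sigmoid(np.dot(w, x_adv) + b)    # ~0.27: now misclassified as 0

print(p_clean > 0.5, p_adv > 0.5)        # → True False
```

Note that each component of the input moves by at most `eps`, which is how the attack keeps the perturbation small; real attacks on image classifiers use a much smaller `eps` relative to the pixel range, and PGD repeats this step several times with projection back into the allowed perturbation ball.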