
Ground truth

Ground truth is the reference data, such as labels or measurements, that is accepted as correct for a given task. It serves as the benchmark against which the predictions of machine learning models are evaluated.

Explanation

In supervised machine learning, ground truth is essential for both training and evaluating models. It is the 'correct answer' that the model learns to predict. For example, in image classification, the ground truth is the accurate label of what is depicted in an image (e.g., 'cat', 'dog').

During training, the model's predictions are compared to the ground truth, and the model's parameters are adjusted to reduce the difference between the two. During evaluation, metrics such as accuracy, precision, and recall are calculated by comparing the model's predictions to the ground truth, which indicates how well the model performs.

The quality of the ground truth directly affects the quality of the model: noisy, incomplete, or biased labels can produce inaccurate or unreliable models, and they can also make evaluation metrics misleading. Obtaining high-quality ground truth is often a significant challenge, because it typically requires manual annotation or expert knowledge.
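The comparison of predictions against ground truth can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `evaluate`, the `positive_label` parameter, and the toy 'cat'/'dog' labels are assumptions made for the example, and it treats the task as binary by measuring precision and recall for a single chosen class.

```python
def evaluate(ground_truth, predictions, positive_label):
    """Compare predictions to ground-truth labels (hypothetical helper).

    Returns accuracy over all examples, plus precision and recall
    for the class named by positive_label.
    """
    if len(ground_truth) != len(predictions):
        raise ValueError("ground_truth and predictions must have the same length")
    if not ground_truth:
        raise ValueError("at least one example is required")

    correct = 0
    true_pos = false_pos = false_neg = 0
    for truth, pred in zip(ground_truth, predictions):
        if truth == pred:
            correct += 1
        if pred == positive_label and truth == positive_label:
            true_pos += 1       # predicted positive, and it really is
        elif pred == positive_label:
            false_pos += 1      # predicted positive, ground truth disagrees
        elif truth == positive_label:
            false_neg += 1      # missed a positive example

    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    return {
        "accuracy": correct / len(ground_truth),
        # Fall back to 0.0 when the denominator is empty.
        "precision": true_pos / predicted_pos if predicted_pos else 0.0,
        "recall": true_pos / actual_pos if actual_pos else 0.0,
    }


# Toy data: labels assumed to come from human annotators.
ground_truth = ["cat", "dog", "cat", "cat", "dog", "dog"]
predictions = ["cat", "cat", "cat", "dog", "dog", "dog"]

metrics = evaluate(ground_truth, predictions, positive_label="cat")
print(metrics)
```

On this toy data the model matches the ground truth on 4 of 6 examples, so accuracy is 4/6, while precision and recall for 'cat' are both 2/3. Note that every metric here is only as trustworthy as the ground-truth list itself: if an annotator had mislabeled one of the images, the same predictions would yield different scores.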

Related Terms