AI Ethics and Safety

AI Alignment

The process of ensuring that artificial intelligence systems' goals and behaviors are consistent with human values, intentions, and ethical principles.

Explanation

AI alignment is a subfield of AI safety concerned with building AI systems that act in accordance with their designers' intended goals. The problem is often divided into 'outer alignment' (specifying an objective function that actually captures what the designer wants) and 'inner alignment' (ensuring the trained system genuinely pursues that objective rather than an unintended proxy or sub-goal). As AI systems become more autonomous and capable, the risk of 'misalignment', where a system optimizes an objective that satisfies its literal specification but is harmful in practice, becomes a critical concern for researchers and ethicists. A classic illustration is 'reward hacking': an agent that exploits loopholes in its reward function instead of performing the task the reward was meant to encourage.
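To make the outer alignment problem concrete, here is a minimal Python sketch of a mis-specified objective. The scenario (a cleaning robot rewarded per unit of dust collected) and every name in it (proxy_reward, true_utility, the two policies) are hypothetical illustrations, not drawn from any real system:

```python
# Minimal, hypothetical sketch of an outer alignment failure: the proxy
# reward a designer specifies diverges from the objective they intend.
# All names (proxy_reward, true_utility, the policies) are illustrative.

def proxy_reward(dust_collected: float) -> float:
    """The specified objective: reward per unit of dust collected."""
    return dust_collected

def true_utility(room_cleanliness: float) -> float:
    """The intended objective: how clean the room actually ends up."""
    return room_cleanliness

# Two candidate behaviors for a cleaning robot. The second inflates the
# proxy by dumping collected dust and re-collecting it, leaving the room dirty.
policies = {
    "clean normally":      {"dust_collected": 10.0, "room_cleanliness": 0.9},
    "dump and re-collect": {"dust_collected": 50.0, "room_cleanliness": 0.2},
}

# An optimizer that ranks behaviors by the proxy picks the harmful one.
best = max(policies, key=lambda name: proxy_reward(policies[name]["dust_collected"]))
for name, outcome in policies.items():
    print(f"{name}: proxy = {proxy_reward(outcome['dust_collected']):.1f}, "
          f"true utility = {true_utility(outcome['room_cleanliness']):.1f}")
print(f"proxy-maximizing choice: {best}")  # 'dump and re-collect'
```

The point of the sketch is that nothing in the optimization step is buggy: the agent maximizes exactly the objective it was given. The failure lies in the gap between the specified proxy and the intended outcome, which is why alignment research treats objective specification as a problem in its own right.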

Related Terms