Machine Learning, Alignment

Weak-to-Strong Generalization

Investigating whether weak supervision can elicit strong capabilities in models.

The Problem

In the classic teacher-student setup in machine learning, a strong teacher supervises a weaker student. But what happens when the teacher is weaker than the student? This question is the core of the weak-to-strong generalization problem.
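To make the setup concrete, here is a minimal toy sketch of weak-to-strong training. It is an illustration under stated assumptions, not the model analyzed in the paper: the weak teacher is assumed to be the ground-truth label flipped independently with probability `flip_prob`, the student is a plain logistic regression trained only on those weak labels, and the name `simulate_weak_to_strong` and all of its parameters are hypothetical.

```python
import math
import random


def simulate_weak_to_strong(n_train=4000, n_test=2000, dim=5,
                            flip_prob=0.3, epochs=10, lr=0.01, seed=0):
    """Toy simulation: a student trained only on noisy weak labels.

    Returns (weak_label_accuracy, student_test_accuracy), both measured
    against the ground-truth labels.
    """
    rng = random.Random(seed)

    # Assumed ground truth: a random linear rule, label = 1 if w_true . x > 0.
    w_true = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def true_label(row):
        return 1 if sum(w * x for w, x in zip(w_true, row)) > 0 else 0

    def sample(n):
        xs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
        return xs, [true_label(row) for row in xs]

    x_train, y_train = sample(n_train)
    x_test, y_test = sample(n_test)

    # Assumed weak teacher: the true label, flipped with probability flip_prob.
    weak_labels = [1 - y if rng.random() < flip_prob else y for y in y_train]
    weak_acc = sum(a == b for a, b in zip(weak_labels, y_train)) / n_train

    # Student: logistic regression fit by SGD. It never sees the true labels.
    w = [0.0] * dim
    for _ in range(epochs):
        for row, target in zip(x_train, weak_labels):
            z = sum(wi * xi for wi, xi in zip(w, row))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(dim):
                w[j] -= lr * (p - target) * row[j]

    # Evaluate the student against the ground truth on fresh data.
    preds = [1 if sum(wi * xi for wi, xi in zip(w, row)) > 0 else 0
             for row in x_test]
    student_acc = sum(a == b for a, b in zip(preds, y_test)) / n_test
    return weak_acc, student_acc


if __name__ == "__main__":
    weak_acc, student_acc = simulate_weak_to_strong()
    print(f"weak label accuracy: {weak_acc:.3f}")
    print(f"student accuracy:    {student_acc:.3f}")
```

In this toy setting the teacher's errors are independent of the input, so the student can average them out and end up more accurate than its supervision. A teacher whose mistakes are systematic would behave differently, because a sufficiently expressive student could learn to imitate those mistakes.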

Related Papers
International Conference on Learning Representations (ICLR), April 2025

Theoretical Findings

We study weak-to-strong generalization theoretically for binary and multilabel classification, and identify two asymptotic phases: successful generalization and random guessing.

Key Insight
The student can generalize even with random-guessing supervision under certain conditions.