Machine Learning, Alignment

Weak-to-Strong Generalization

Investigating whether weak supervision can elicit strong capabilities in models.

The Problem

In the classic teacher-student setup in machine learning, a strong teacher supervises a weak student. But what happens when the teacher is weaker than the student? This question is the core of the weak-to-strong generalization problem.
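As a rough illustration of this setup (and not the model analyzed in the paper), the Python sketch below trains a linear student on pseudolabels from a noisy teacher and compares both against the true labels. The Gaussian data, the noise level, the least-squares student, and every name used (`run_toy_experiment`, `teacher_noise`, and so on) are assumptions chosen for illustration only.

```python
import numpy as np


def run_toy_experiment(n_train=2000, n_test=5000, dim=20,
                       teacher_noise=2.0, seed=0):
    """Toy weak-to-strong experiment (hypothetical setup, not the paper's).

    True label: sign of the first coordinate of x.
    Weak teacher: sign of the first coordinate plus heavy Gaussian noise.
    Student: least-squares linear classifier fit to the teacher's labels.
    """
    rng = np.random.default_rng(seed)
    x_train = rng.standard_normal((n_train, dim))
    x_test = rng.standard_normal((n_test, dim))

    def true_label(x):
        return np.sign(x[:, 0])

    def weak_teacher(x):
        # The teacher sees the relevant feature only through heavy noise.
        noisy_feature = x[:, 0] + teacher_noise * rng.standard_normal(len(x))
        return np.sign(noisy_feature)

    # The student never sees true labels, only the teacher's pseudolabels.
    pseudolabels = weak_teacher(x_train)
    weights, *_ = np.linalg.lstsq(x_train, pseudolabels, rcond=None)

    # Evaluate both teacher and student against the true labels.
    y_test = true_label(x_test)
    teacher_acc = float(np.mean(weak_teacher(x_test) == y_test))
    student_acc = float(np.mean(np.sign(x_test @ weights) == y_test))
    return teacher_acc, student_acc


if __name__ == "__main__":
    teacher_acc, student_acc = run_toy_experiment()
    print(f"teacher accuracy: {teacher_acc:.3f}")
    print(f"student accuracy: {student_acc:.3f}")
```

In this toy the teacher's errors are independent of the irrelevant features, so they largely average out when the student fits many examples. That is a much simpler mechanism than the benign-overfitting argument in the paper, which concerns an overparameterized regime.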

Provable weak-to-strong generalization via benign overfitting

David Wu, Anant Sahai

International Conference on Learning Representations (ICLR), April 2025


Theoretical Findings

We theoretically investigate weak-to-strong generalization for binary and multilabel classification. We identify two asymptotic phases of the student's behavior after weak supervision: successful generalization and random guessing.

Key Insight

The student can generalize even with random-guessing supervision under certain conditions.
