Noise vs. Clean Data: What Should a Diffusion Model Learn? (Part 2)
An information-theoretic perspective on why x₀-prediction outperforms ε-prediction in high dimensions.
I'm a freshman at Stanford interested in machine learning and math. I'm broadly drawn to discovering ideas bridging math and physics with neural networks, as well as developing automated formal reasoning systems for higher math.
In the past, I competed in the USA Math Olympiad and International Physics Olympiad. In my free time, I enjoy reading history, solving puzzles, photographing stars, and watching Liverpool FC.
I share my thoughts, work, and favourite reads on this website. Feel free to contact me at alexh06 [at] stanford [dot] edu.
Exploring the trade-off between ε-prediction and x₀-prediction in diffusion models, and why different objectives lead to different training behavior.