🤗 Upvotes: 41 | cs.CV
Authors:
Minh-Quan Le, Yuanzhi Zhu, Vicky Kalogeiton, Dimitris Samaras
Title:
What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards
Arxiv:
http://arxiv.org/abs/2512.00425v1
Abstract:
Recent video diffusion models can synthesize visually compelling clips, yet often violate basic physical laws-objects float, accelerations drift, and collisions behave inconsistently-revealing a persistent gap between visual realism and physical realism. We propose $\texttt{NewtonRewards}$, the first physics-grounded post-training framework for video generation based on $\textit{verifiable rewards}$. Instead of relying on human or VLM feedback, $\texttt{NewtonRewards}$ extracts $\textit{measurable proxies}$ from generated videos using frozen utility models: optical flow serves as a proxy for velocity, while high-level appearance features serve as a proxy for mass. These proxies enable explicit enforcement of Newtonian structure through two complementary rewards: a Newtonian kinematic constraint enforcing constant-acceleration dynamics, and a mass conservation reward preventing trivial, degenerate solutions. We evaluate $\texttt{NewtonRewards}$ on five Newtonian Motion Primitives (free fall, horizontal/parabolic throw, and ramp sliding down/up) using our newly constructed large-scale benchmark, $\texttt{NewtonBench-60K}$. Across all primitives in visual and physics metrics, $\texttt{NewtonRewards}$ consistently improves physical plausibility, motion smoothness, and temporal coherence over prior post-training methods. It further maintains strong performance under out-of-distribution shifts in height, speed, and friction. Our results show that physics-grounded verifiable rewards offer a scalable path toward physics-aware video generation.
Fler avsnitt av Daily Paper Cast
Visa alla avsnitt av Daily Paper CastDaily Paper Cast med Jingwen Liang, Gengyu Wang finns tillgänglig på flera plattformar. Informationen på denna sida kommer från offentliga podd-flöden.
