The trajectory is hard to argue with, even if the messenger invites scepticism. xAI had no video product in July 2025. By late January 2026, Grok Imagine debuted at number one on the Artificial Analysis Video Arena for both text-to-video and image-to-video, ahead of Runway Gen-4.5, Sora 2 Pro, and Google’s Veo 3.1. It currently holds the top spot in image-to-video with an Elo of 1,329 and leads all three video categories on DesignArena. Kling 3.0 has since overtaken it in text-to-video, so the “best model” claim depends on which leaderboard you’re looking at and when. Musk, characteristically, highlights the ones that flatter him.
But the numbers underneath the hype are genuinely interesting. Grok Imagine generates at 720p with native audio, supports clip extension via an “Extend from Frame” feature launched in early March, and is available through an API priced at $4.20 per minute. That pricing deserves attention. Sora 2 Pro charges $30 per minute. Veo 3.1 charges $12. Kling 3.0 matches Grok at roughly $4.20 but without native audio. For creators producing short-form content at volume – social clips, product videos, quick-turn branded work – cost per minute of usable output matters more than Elo scores, and Grok Imagine is aggressively competitive on it.
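To make "cost per minute of usable output" concrete, here is a rough sketch of the arithmetic. The per-minute prices are the ones quoted above. The keep rate (the share of generated footage that survives review), the clip counts, and the function names are hypothetical placeholders, not measured figures or anything from xAI's API.

```python
# Per-minute API prices in USD, as quoted in the article.
PRICE_PER_MINUTE = {
    "Grok Imagine": 4.20,
    "Kling 3.0": 4.20,
    "Veo 3.1": 12.00,
    "Sora 2 Pro": 30.00,
}


def cost_per_usable_minute(price_per_minute: float, keep_rate: float) -> float:
    """Effective price of one minute of footage you actually keep.

    keep_rate is the fraction of generated footage that survives review.
    It is an assumption here; measure it on your own prompts.
    """
    if not 0 < keep_rate <= 1:
        raise ValueError("keep_rate must be in (0, 1]")
    return price_per_minute / keep_rate


def monthly_cost(price_per_minute: float, clips: int,
                 seconds_per_clip: float, keep_rate: float) -> float:
    """Spend needed to end up with `clips` usable clips of the given length."""
    usable_minutes = clips * seconds_per_clip / 60
    return usable_minutes * cost_per_usable_minute(price_per_minute, keep_rate)


if __name__ == "__main__":
    # Hypothetical workload: 200 usable ten-second clips, keeping 1 in 4 takes.
    for model, price in sorted(PRICE_PER_MINUTE.items(), key=lambda kv: kv[1]):
        total = monthly_cost(price, clips=200, seconds_per_clip=10, keep_rate=0.25)
        print(f"{model:<13} ${total:>9,.2f}")
```

Under those made-up assumptions the same workload comes to roughly $560 at $4.20 a minute, $1,600 at $12, and $4,000 at $30. The gap scales linearly with price only if the keep rate is the same for every model, which is exactly the thing to test before committing to one.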
The caveat is ecosystem. Grok Imagine lives inside xAI’s world, which means an audience of X Premium and SuperGrok subscribers. It doesn’t have Runway’s established production user base, Kling’s 30,000-plus enterprise clients, or Sora’s integration with OpenAI’s broader platform. Leaderboard performance gets attention. Ecosystem stickiness gets revenue. xAI hasn’t proved the second part yet.
Still, the speed matters. Most AI video tools iterated their way to competitive quality over two or three years. xAI did it in seven months, partly by acquiring video startup Hotshot and partly by leveraging the sheer volume of generations flowing through X – reportedly over a billion in a single 30-day window. That feedback loop is a training advantage that’s difficult to replicate. The question for practitioners isn’t whether Grok Imagine deserves the hype. It’s whether a top-tier video generation tool at $4.20 a minute with native audio is worth building into your workflow, even if the person promoting it is the same person who renamed Twitter.



