Midjourney launches video model V1 with 5 to 20-second clips

2025-06-23 · Technology
David
Welcome everyone to AI Unpacked! It's June 24, 2025, and we've got an exciting episode lined up for you. I'm David, your host, and today we're diving deep into the latest groundbreaking news from the world of artificial intelligence.
Ema
And I'm Ema, thrilled to be here! We're going to explore a major leap forward in AI-generated content that’s causing quite a stir: Midjourney, the company known for its stunning AI images, has just dropped its first-ever video model. Get ready to have your minds blown!
David
Exactly, Ema! This isn't just a small step; it's a giant leap into a new creative frontier. We'll unpack what this means for artists, creators, and even how we consume media. So, buckle up, because we're about to unpack Midjourney's new V1 video model and all its implications.
David
Alright, Ema, let's jump right into this. What exactly has Midjourney released, and why is everyone talking about it so much right now?
Ema
Well, David, the big news is that Midjourney, which most of our listeners know for incredible AI-generated images, officially launched its first video model, called V1, on June 18th! It's a game-changer because now any of their 20 million subscribers can literally click an 'Animate' button under a still image—whether it’s one they generated or uploaded—and instantly receive four 5-second video clips.
David
Wow, so it's that simple? Just 'Animate' a still image? That sounds incredibly user-friendly for such a complex technology, um, what about extending those clips?
Ema
Exactly! It's designed to be fun, easy, and beautiful, as Midjourney themselves put it. And yes, you can extend those initial 5-second clips in 5-second increments, all the way up to a maximum of 20 seconds. So you can go from a brief moment to a short, dynamic scene with just a few clicks. It's a fantastic starting point for creators.
David
That's really intuitive. And are there options for how the motion itself behaves, or is it just one standard type of movement?
Ema
Great question! There are actually two motion modes available: 'low' and 'high.' 'Low' motion involves more subtle movements, maybe just a slight camera pan or a gentle shift in the subject. 'High' motion, on the other hand, includes much larger camera movements or more significant subject shifts, giving you more dramatic effects.
David
So, you can choose between subtle and dramatic, right? That adds a lot of flexibility. And can users influence the motion beyond just choosing 'low' or 'high'?
Ema
Absolutely! The motion can either remain automatic—meaning Midjourney decides what looks best—or, for those who want more control, it can be directed by a text prompt. So you could write something like 'slow zoom out' or 'character turns head' to guide the animation, which is super powerful for specific creative visions.
David
That's fantastic. It sounds like they're giving users a lot of agency within this new tool. And what's the cost structure like for creating these clips, Ema?
Ema
Creating a clip costs roughly eight image credits. For context, Midjourney's Starter tier begins at $10 a month, which includes a certain number of credits. So, it's quite accessible. They're also testing a slower 'video relax' queue for Pro users, which might offer a more cost-effective option for less urgent projects, um, which is a nice touch.
David
That's really interesting, Ema, especially the pricing and accessibility. But before we dive deeper into V1 itself, can you give us a bit of background on Midjourney as a company? They've become a household name in AI art, but what's their story?
Ema
Of course, David! Midjourney was founded relatively recently, in 2022, as a San Francisco-based lab. They burst onto the scene with their incredible text-to-image AI, which quickly gained massive popularity. What's unique about them is that their service is primarily Discord-based, which created this huge, vibrant community around their image generation tool.
David
Discord-based, right. That's quite a distinctive approach compared to some other AI companies. And their success has been pretty remarkable, hasn't it?
Ema
Remarkable is an understatement, David! In 2024 alone, they reportedly earned $300 million from their image service. That's a huge testament to the demand for AI art and their ability to deliver a product that resonates with millions of users. That kind of revenue in just a couple of years is, um, really phenomenal for a tech startup.
David
Indeed. So they've built a very successful business on images. Now, with V1, they're venturing into video. How does this new V1 model stack up against the existing players in the AI video generation space? Because I know there are some big names out there already.
Ema
Exactly, it's a crowded field already. When we talk about AI video, the current leaders include Runway with their Gen-4 model, Luma's Dream Machine, and of course, OpenAI's highly anticipated Sora. These models have been pushing the boundaries of what's possible, especially in terms of video length, resolution, and overall realism.
David
So, Midjourney is entering a highly competitive market, right? Are they bringing anything unique to the table that sets them apart from these established players, other than the animation of still images?
Ema
Well, one of the most praised aspects of V1, according to early testers, is the incredible coherence it inherits from Midjourney's V6.1 image model. Think of coherence as the consistency and stability of the visuals throughout the video. If you've ever seen AI videos where things suddenly morph or flicker, coherence is what prevents that.
David
Ah, coherence. So, it's about maintaining a stable, believable visual narrative even when things are moving, um, like a character's face staying the same throughout a shot, right?
Ema
Exactly! It's like the AI has a really good memory of what everything should look like from frame to frame. This is a huge advantage, especially when many early AI video models struggled with maintaining visual consistency. So, for Midjourney to carry over that strength from their image model is a big deal.
David
That makes a lot of sense. So, they're leveraging their existing strengths in image generation to enter the video market. That's a smart move. But what about the technical underpinnings? How does V1's technology compare to, say, Sora or Runway Gen-4?
Ema
From a technical standpoint, V1 is still quite foundational. It's their first foray, after all. Unlike some of its rivals, V1 currently lacks audio capabilities, which is a significant difference for creating complete video experiences. Also, it's capped at 1080p resolution and, as we mentioned, a maximum of 20 seconds in length.
David
So, no sound and limited resolution and length. That sounds like it trails behind the big players in terms of overall scope, right? Like, Sora has shown much longer, higher-resolution clips.
Ema
Exactly. In terms of pure scope—length, resolution, and features like audio—V1 definitely trails behind Runway Gen-4, Luma Dream Machine, or OpenAI's upcoming Sora. Those models are designed for more expansive, feature-rich video generation. V1 is a more focused, initial offering, a 'version one' in the truest sense.
David
It sounds like a strategic entry, then, rather than trying to immediately outcompete on every technical spec. Perhaps they're focusing on a different segment or strategy?
Ema
That's exactly it, David. While it might trail in scope, it has a significant advantage in another area, which we'll get into soon, um, but it's clear Midjourney is playing a different game, focusing on accessibility and leveraging their existing user base.
David
Okay, Ema, you've hinted at this. If V1 trails behind rivals in scope—no audio, limited length, 1080p—what's its competitive edge? And what kind of reactions are we seeing from early users and the industry?
Ema
Its primary competitive edge, David, is cost. Midjourney V1 significantly undercuts many of its rivals on price, making AI video generation more accessible to a broader audience. This is huge for independent creators or small businesses who might find the higher costs of other models prohibitive.
David
So, affordability is their big play, right? That makes sense, especially given their large subscriber base. But what about the quality? Are users happy despite the limitations?
Ema
It's a mixed bag, honestly. Testers have largely praised the coherence, as we discussed, which is inherited from Midjourney V6.1. That's a big plus. However, there are notes about high-motion scenes sometimes experiencing flickering, which can be distracting.
David
Flickering in high-motion scenes, um, that sounds like a common challenge for early AI video. What about the general sentiment? Are people blown away or are they seeing gaps?
Ema
Early reactions really vary. Some, like Phi Hoang, have commented that it's 'surpassing expectations,' suggesting people are pleasantly surprised by what it can do for its price point and ease of use. On the other hand, Reddit users are noting 'gaps' when compared to Sora-class realism.
David
So, it’s good for what it is, but not quite up to the bleeding edge of realism from, say, OpenAI. How are industry analysts interpreting this move, then?
Ema
Analysts describe V1 as a 'quick entry into a crowded field' rather than a 'finished film tool.' It’s about establishing a foothold and iterating, not delivering a complete, polished product for professional filmmaking right out of the gate. This aligns with Midjourney's iterative development philosophy.
David
That's a very pragmatic view, right? Get in, learn, and then refine. But there's a significant legal storm brewing around Midjourney, isn't there, something about a copyright lawsuit?
Ema
You're absolutely right, David. This is a massive point of conflict. Midjourney now faces a significant copyright lawsuit from none other than Disney and Universal. The core of the lawsuit revolves around their training data—specifically, allegations that Midjourney used copyrighted material without permission to train their AI models.
David
Wow, Disney and Universal. Those are huge names with vast intellectual property. This isn't just about images anymore, it impacts the entire generative AI landscape, doesn't it? What are the implications of such a lawsuit?
Ema
Exactly. This isn't just about Midjourney; it's a test case for the entire AI industry. The outcome could set precedents for how AI companies can collect and use data for training their models, and what constitutes fair use or infringement. It touches on fundamental questions of intellectual property in the age of AI.
David
So, it could really reshape how AI models are built in the future, um, forcing companies to be much more careful about their data sources. That's a huge challenge to navigate while simultaneously launching new products like V1.
Ema
It's a tightrope walk, for sure. On one hand, they're innovating and expanding into new markets with V1. On the other, they're battling these critical legal challenges that could fundamentally alter their business model. It's a huge tension point for the company and the industry at large.
David
Ema, let's talk about the impact of all this. How does Midjourney's V1 launch, coupled with this significant lawsuit, affect the broader AI video market and, more specifically, creators?
Ema
The immediate impact on the AI video market is increased competition and a push towards affordability. Midjourney's lower cost point means rivals might need to reassess their own pricing strategies, um, which is good news for consumers and creators wanting to experiment with AI video without breaking the bank.
David
So, it's democratizing access, right? More people can get their hands on these tools. What about for the individual creator, the artist, or the small content producer?
Ema
For individual creators, V1 is a fantastic new tool, especially if they're already Midjourney users. Imagine quickly animating a character or a scene from their existing image portfolio for social media, or a rough storyboard. It lowers the barrier to entry for video creation significantly, even with its current limitations.
David
That's empowering. It sounds like it could really accelerate the prototyping and ideation phase for visual storytellers. But let's broaden this out a bit. What are the wider societal implications of AI video models, especially when we consider the copyright issues?
Ema
This is where it gets complex, David. The copyright lawsuit directly highlights the tension between innovation and intellectual property. If AI models are trained on vast datasets that include copyrighted works, how do we fairly compensate creators? This isn't just an abstract legal battle; it affects livelihoods and the future of creative industries.
David
Right, it's about valuing human creativity in an age where machines can generate so much. And then there's the 'deepfake' aspect, which is always a concern with generative video, isn't it?
Ema
Exactly. While V1's short clips and lack of audio limit its immediate deepfake potential compared to more advanced models, every step in AI video generation brings us closer to a world where realistic, fabricated media is easier to produce. This raises critical questions about misinformation, ethics, and the need for robust detection tools.
David
So, even as we celebrate the creative potential, we have to grapple with the societal responsibilities that come with it. It's a double-edged sword, um, that we're still figuring out how to wield.
Ema
Precisely. And Midjourney executives themselves describe V1 as just a 'step toward a future world model.' This 'world model' concept implies something even bigger—AI capable of creating entire explorable 3D scenes. The implications for gaming, VR, and even metaverse development are staggering.
David
Ema, that 'world model' vision sounds like something out of science fiction. Let's talk about that future. What does Midjourney's aspiration for 'explorable 3D scenes' truly mean, and how far away are we from that?
Ema
It's incredibly ambitious, David! Imagine not just generating a video, but generating an entire virtual environment that you can literally walk through, interact with, and explore from any angle. It's like the AI creating a fully fleshed-out, interactive movie set or a game world just from a text prompt.
David
So, beyond just passive viewing, we're talking about truly immersive, interactive experiences generated by AI. That's a massive leap from 20-second clips, right?
Ema
Exactly! It represents the ultimate goal of generative AI: not just creating media, but creating entire realities. This would revolutionize industries from filmmaking and gaming to architecture and virtual training simulations. We're talking about a future where storytelling becomes an explorable experience.
David
That's truly mind-bending. Given that V1 is capped at 20 seconds with no audio, how realistic is this 'world model' vision in the near future? Is it years away, or decades?
Ema
It's definitely not next year, but the pace of AI development is so rapid that what seems like decades away can become years. V1 is a foundational step, teaching the AI to understand motion and 3D space from 2D images. Each iteration builds on the last. So, while it's a complex undertaking, the trajectory is clear.
David
So, we should expect continuous, rapid advancements in this space. What about the immediate future of AI video for our listeners? What should they be looking out for?
Ema
Look for improvements in video length, resolution, and the introduction of audio. Also, watch how the legal landscape evolves with these copyright lawsuits—they could significantly shape the future of AI development. And definitely keep an eye on how these tools become integrated into broader creative workflows, making them even more powerful.
David
Fantastic, Ema, a truly insightful look at Midjourney's bold move into video and the complex landscape of AI innovation. It's clear that while V1 is just a start, its implications for accessibility and the future of creative tools are immense. We'll be keeping a close eye on both the technical advancements and the legal battles ahead.
David
That's all the time we have for today on AI Unpacked. Thank you, Ema, for breaking down this fascinating topic. And to all our listeners, thank you for tuning in. Join us next time for more on the exciting world of artificial intelligence!


Read original at TestingCatalog

Midjourney released its first video model, V1, on 18 June 2025. Any of its roughly 20 million subscribers can press “Animate” under a still image—whether generated or uploaded—to receive four 5-second clips. These clips are extendable in 5-second increments up to a total of 20 seconds.

There are two motion modes available: low, which involves subtle movements, and high, which includes larger camera or subject shifts.

Motion can either remain automatic or be directed by a text prompt. Creating a clip costs roughly eight image credits, with pricing starting at the $10-a-month Starter tier. A slower “video relax” queue is currently being tested for Pro users.

“Introducing our V1 Video Model. It's fun, easy, and beautiful. Available at 10$/month, it's the first video model for *everyone* and it's available now. pic.twitter.com/iBm0KAN8uy” — Midjourney (@midjourney), June 18, 2025

V1 lacks audio and is capped at 1080p/20 seconds, which means it trails behind Runway Gen-4, Luma Dream Machine, or OpenAI’s upcoming Sora in terms of scope.

However, it undercuts many rivals on cost. Testers have praised the coherence inherited from Midjourney V6.1, although high-motion scenes may experience flickering.

Early reactions vary, from Phi Hoang’s comment that it is “surpassing expectations” to Reddit users noting gaps when compared to Sora-class realism.

Analysts describe the move as a quick entry into a crowded field rather than a finished film tool.

Founded in 2022, the San Francisco lab earned $300 million in 2024 from its Discord-based image service. It now faces a Disney-Universal copyright lawsuit over its training data. Executives describe V1 as a step toward a future “world model” capable of creating explorable 3-D scenes.
