PixVerse, a startup backed by Alibaba, has launched a real-time AI video tool designed to let users steer scenes as the video is being generated, including changing what characters do in the moment. The launch comes as competition in AI video heats up, with Google also updating its Veo 3.1 model to support native vertical video creation and higher-resolution upscaling.
PixVerse says its new approach is built around real-time interaction, aiming to eliminate the wait that many AI video generators still impose before a clip is ready to watch. In an interview cited in the report, PixVerse co-founder Jaden Xie described the tool as a way for users to guide how a video unfolds while it is being produced, such as making characters dance, cry, or pose instantly.
Real-time video, as PixVerse describes it
PixVerse’s launch highlights a push toward more interactive AI-generated video, where users can adjust direction during generation rather than only before a render starts. Xie said this kind of capability could open new business models, including user-shaped narratives in micro-dramas or endlessly evolving video games not limited by traditional production constraints.
The company was founded in 2023 and has raised more than $60 million, with funding led primarily by Alibaba and supported by Antler, according to the report. Xie said PixVerse is nearing completion of another funding round but did not share details, and added that more than half of its investors are international.
User growth and platform reach
PixVerse’s AI tools are integrated into what the report describes as a social media-like platform, which had more than 16 million monthly active users as of October. Xie said the company aims to grow its registered user base to 200 million in the first half of the year, up from 100 million last August, and plans to expand the team to nearly 200 employees by year-end.
The report also says PixVerse primarily serves users outside China through web and mobile interfaces. PixVerse reported an estimated annual recurring revenue of $40 million as of October, according to the same report.
On its website, PixVerse describes itself as an AI video generator that can create videos from text prompts or from uploaded images such as selfies, portraits, or group photos. The site also says the company recently launched a v4.5 model, listing earlier versions (v1 through v4) and describing v4.5 as delivering higher quality, smoother animation, and more realistic transformations.
Competition around speed and cost
The PixVerse launch is framed within a wider contest among AI video tools, especially among Chinese companies. The report cites AI benchmarking firm Artificial Analysis as saying that most leading AI video generation models are built by Chinese companies, and that many offer faster processing and lower fees than OpenAI’s Sora 2 Pro.
OpenAI’s Sora drew international attention around two years ago for text-to-video capabilities, but only became publicly available in December 2024, according to the report. The report says that by the time Sora became broadly accessible, several Chinese startups had already launched rival tools globally.
Wei Sun, a principal analyst at Counterpoint, said Sora is strong on quality but is limited by speed and cost, which creates an opening for Chinese tools positioned as scalable and cost-effective. The report also points to Beijing-based Shengshu, saying the company demonstrated a TurboDiffusion video framework that can generate videos 100 to 200 times faster with little to no quality loss.
Google Veo 3.1 adds vertical video
While PixVerse is pitching real-time control, Google is also expanding what creators can do with its own video model. Google updated Veo 3.1 with a feature that lets users create native vertical videos using reference images, and said it also improved its “Ingredients to Video” feature and added upscaling to 1080p and 4K.
Google said the updated Ingredients to Video feature can generate “dynamic and engaging videos” from reference images, including with short prompts. The company also said Veo 3.1 improves dialogue and storytelling and increases visual consistency, so characters can look the same even when settings change, with the ability to reuse objects, backgrounds, and textures.
Google also added a 9:16 vertical option inside Ingredients to Video, positioning it for formats such as YouTube Shorts without forcing creators to crop or lose quality. Google said 1080p upscaling delivers a sharper, cleaner result, while 4K upscaling adds richer textures and clarity for large screens.
Where Veo 3.1 is rolling out
Google said Veo 3.1’s improved Ingredients to Video feature will be available in the YouTube Shorts app, the YouTube Create app, and the Gemini app. Google also said the enhanced Ingredients to Video and native vertical format are rolling out to Flow, Vertex AI, Google Vids, and the Gemini API.
Google said all videos generated with Veo 3.1 will include a SynthID digital watermark, and that users can verify whether content is AI-generated by uploading an image or video to Gemini. This watermarking and verification approach is presented as a way to help viewers identify AI-generated media.
A crowded market of AI video tools
A separate “best AI video generator” roundup published on Manus’ blog lists multiple AI video generators and compares their starting prices, including entries for OpenAI Sora and Google Veo 3.
In that roundup’s table, OpenAI Sora is shown with a starting price of $20 per month via ChatGPT Plus, while Google Veo 3 is shown at $28.99 per month via Google AI Pro.
The same roundup table lists several other tools and their starting monthly prices, including Runway (Gen 4.5) at $15, Kling AI at $10, Luma Dream Machine at $9.99, and Adobe Firefly at $9.99.
The roundup also lists “Manus” at $40 per month and describes it as focused on AI-powered workflow automation.
