Alibaba-backed startup PixVerse has launched a real-time AI video tool designed for interactive video creation, putting it in competition with OpenAI’s Sora. The tool is built around the idea that users can steer what happens while a video is being generated, instead of waiting for a finished output.
PixVerse co-founder Jaden Xie told CNBC that real-time AI video generation can create “new business models,” pointing to ideas like interactive micro-dramas and “infinite” video games that aren’t locked into a fixed storyline. CNBC’s Evelyn Cheng also discussed PixVerse’s funding approach and what the launch could mean for AI competition between the U.S. and China.
How the tool works
PixVerse says the new product lets users control how a video unfolds during generation, similar to giving directions on a set. Examples described include telling characters to cry, dance, freeze, or pose, with changes happening immediately as the video continues.
PixVerse has tied the real-time tool into its social-style sharing platform, which it said reached more than 16 million monthly active users in October. In PixVerse’s framing, real-time control reduces the gap between creating and distributing AI-generated video.
Funding and growth plans
PixVerse launched in 2023 and raised more than $60 million last fall in a round led by Alibaba, with Antler also joining, according to reporting that cited Xie. Xie said another funding round is nearly closed, but he declined to provide details.
Xie also said more than half of the incoming investors are based outside China. PixVerse’s user base is largely outside China as well, with people accessing the product through the company’s web platform and mobile app.
PixVerse has set aggressive growth goals: Xie is aiming for 200 million registered users by mid-year, up from 100 million last August. He also said headcount could double to nearly 200 employees by the end of the year.
China’s push in AI video
The PixVerse launch arrives as several China-based teams advance AI video generation. Data from benchmarking firm Artificial Analysis showed seven of the top eight video models coming from Chinese companies, with Israeli startup Lightricks as the exception.
Counterpoint principal analyst Wei Sun said, “Sora still defines the quality ceiling in video generation, but it is constrained by generation time and API cost,” adding that Chinese players are pursuing a different approach focused on scalable, low-cost, high-throughput production. Another example cited was Beijing-based startup Shengshu, which said its TurboDiffusion framework, developed with researchers from Tsinghua University, can generate videos 100 to 200 times faster with minimal quality loss.
Revenue and product priorities
PixVerse reported $40 million in annual recurring revenue in October, according to the reporting. The same reporting also said Kling—an AI video product built by Kuaishou—generated close to $100 million in revenue during the first three quarters of 2025, based on CNBC calculations from public filings.
Xie said PixVerse is prioritizing product over profit and claimed the company has enough capital to operate for a decade. Addressing criticism that AI video can become low-quality “slop,” he said: “At the beginning, there will be good and bad [content], but gradually the fittest will surely survive … and then some people will improve the technology, and truly meet human needs for emotional and spiritual value.”
What users see in the app
PixVerse’s Google Play listing describes it as an “AI video generator” that can turn text to video and image to video, and it also lists tools such as “AI Video Modify,” “Video Extension,” “Key Frame Control,” and a “Fusion Mode” that combines up to three images. PixVerse’s web app describes itself as an AI video generator that helps users create video content from text and photos.
