SenseTime Unveils Upgraded SenseNova 5.5 Large Model, China's Answer to GPT-4o

TMTPOST--Chinese AI company SenseTime has unveiled the upgraded SenseNova 5.5 Large Model series, which it bills as the first real-time multimodal model in China and claims is on par with GPT-4o’s streaming interaction capabilities.

The SenseNova 5.5 series, which includes SenseNova 5o, was released at the 2024 World Artificial Intelligence Conference (WAIC 2024) last Friday.

The series also includes an upgraded and cost-effective edge-side large model, reducing the cost per device to as low as RMB 9.90 per year, allowing for widespread deployment. Through continuous updates to its Cloud-to-Edge full-stack large model product matrix, SenseTime provides innovative solutions for generative applications across various scenarios and industries.

According to the company, the new real-time interaction model matches the experience of GPT-4o and brings a new mode of AI interaction.

Additionally, the new SenseNova 5.5 has 600 billion parameters, with overall performance improved by an average of 30% compared to SenseNova 5.0. Notably, its abilities in mathematical reasoning, English proficiency, and instruction following have improved significantly. The company says its interactive performance and various core metrics align with GPT-4o, positioning it as one of the leading domestic models comparable to GPT-4 Turbo.

Currently, the SenseNova Large Model has been deployed by more than 3,000 government and corporate customers in industries such as technology, healthcare, finance and programming.

Xu Li, the CEO of SenseTime, said: "This is a critical year for large models as they evolve from unimodal to multimodal. In line with users’ needs, SenseTime is also focused on boosting interactivity. With applications driving the development of models and their capabilities, coupled with technological advancements in multimodal streaming interactions, we will witness unprecedented transformations in human-AI interactions."

Large models fundamentally aim to memorize world knowledge, and their intelligence stems from understanding the higher-order thinking logic and memory behind this knowledge, said Xu. He added: “The 5.5 Lite version, designed for flagship mobile platforms, sees a 10% improvement in performance accuracy, a 40% reduction in initial package delay, and a 15% increase in inference efficiency, processing 90.2 Chinese characters per second. SenseTime’s model capabilities continually evolve with SenseNova’s constant iteration.”

To lower the barriers to entry for enterprise users who want to use the SenseNova Large Model, SenseTime has recently launched the "Project $0 Go" scheme. This is a free onboarding bundle for all new enterprise users migrating from the OpenAI platform, and it includes a 50-million-token package and API migration consulting services.

Over the past year, SenseTime has significantly expanded its investment in AI large models, establishing a “model-as-a-service” business model. This approach combines large models with large devices in pursuit of Artificial General Intelligence (AGI).

In March, Xu noted that, guided by the Scaling Law, large models are in a golden period of technological revolution and performance improvement.

Since the release of the SenseNova model in 2023, its capabilities have improved significantly every three months. According to the company, the model has reached leading domestic levels in foundation models, multimodal capabilities, programming and tool calling, lossless context for millions of words, and small terminal models.

According to the company’s 2023 financial report, the SenseNova model and large devices drove 200% growth in SenseTime’s generative AI business, which generated 1.2 billion yuan (US$165 million) in revenue and contributed 35% of overall revenue. This is the fastest-growing business within SenseTime, surpassing one billion yuan in revenue just over ten years after the company’s founding.

In the first half of 2024, SenseTime fostered close collaborations with several industry-leading companies that are adopting its large model technology. For instance, Kingsoft Office’s WPS AI has integrated SenseTime’s large model technology for intelligent upgrades in office software. Xiaomi’s Xiaoai Assistant has significantly improved user experience with the help of SenseTime’s large model technology. Haitong Securities has partnered with SenseTime to build financial AI applications, driving digital transformation. China Literature Group has collaborated with SenseTime to create an AI-native virtual social platform called Dream Island.

Xu emphasized that SenseTime’s large models have expanded from text to code, office automation, humanoid dialogue, finance, agriculture, and other vertical industries, with the company launching specialized models and integrated devices. This allows customers to use AI large models efficiently and economically. Currently, SenseTime’s large models serve over 3,000 clients.

“We are at a crucial turning point where AI’s super moment is dependent on our collective efforts to create super applications,” Xu remarked.

Earlier, SenseTime also launched Vimi, which it describes as China’s first “controllable” character video generation large model. Vimi can generate human-like videos from a single photo, driven by various inputs such as existing videos, animations, sounds, and texts.

“Connecting all smart speakers, smart car systems, and smart glasses to our terminal large models will make AI large models truly accessible to everyone, ushering in AI’s super moment,” Xu added.