SendTech Times
Analysis

Alibaba AI voice model beats OpenAI, xAI to bridge Chinese dialect gap

Article summary

Alibaba’s Fun-Realtime-TTS-Preview ranked fifth on Artificial Analysis’ Speech Arena, ahead of rivals including OpenAI and xAI, and was the only Chinese-engineered system in the global top five. A separate Artificial Analysis index placed Alibaba’s Fun-Realtime-ASR first on word error rate at 1.8 per cent. Alibaba says Fun-Realtime-TTS-Preview supports more than 30 languages, seven major Chinese dialects and over 20 regional accents, targeting a persistent weakness in speech systems trained on standard Mandarin.


Alibaba Group Holding’s new artificial intelligence voice model has beaten Western rivals OpenAI and xAI on a major global benchmark, with the result highlighting its strength in handling complex Chinese dialects and accents.

Fun-Realtime-TTS-Preview, developed by Alibaba’s Tongyi Lab, took fifth place on the Artificial Analysis Speech Arena leaderboard with a score of 1,190.

It was the only Chinese-engineered voice system in the global top five.

The benchmark is run by Artificial Analysis, a San Francisco-based AI evaluation organisation backed by investors including former GitHub chief executive Nat Friedman and Google Brain founder Andrew Ng.

The platform ranks models through blind user evaluations of generated speech clips using an Elo-based system.
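
The article does not describe how Artificial Analysis computes its scores beyond calling the system Elo-based, so the following is only a minimal sketch of a textbook Elo update after one blind head-to-head vote. The function names, the K-factor of 32 and the 400-point scale are illustrative assumptions, not details of the Speech Arena itself.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under textbook Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one blind comparison.

    k=32 and the 400-point scale are conventional defaults, assumed here
    for illustration; the leaderboard's real parameters are not given.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # A model rated 1,190 against a hypothetical rival rated 1,150.
    print(round(expected_score(1190, 1150), 3))
    print(update_elo(1190, 1150, a_won=True))
```

Under these assumptions, a model rated 1,190 would be expected to win roughly 56 per cent of blind comparisons against a hypothetical rival rated 1,150, which shows how close together neighbouring leaderboard scores can be in practice.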

Benchmark rankings and speech tasks

Speech Arena users test models across three core capabilities: converting speech into text, enabling end-to-end voice understanding and conversational interaction, and transforming text into natural-sounding speech.

In a separate Artificial Analysis Word Error Rate index, Alibaba’s Fun-Realtime-ASR ranked first with a word error rate of 1.8 per cent.

That means fewer than two errors for every 100 words spoken, counting words that were substituted, omitted or wrongly inserted.
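
Word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the transcript into the reference text, divided by the number of reference words. The sketch below computes it with a standard edit-distance table. The function name and the whitespace tokenisation are assumptions for illustration, and the article does not say how Artificial Analysis normalises text before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

In this example one word out of six is wrong, giving a rate of about 16.7 per cent; a 1.8 per cent rate corresponds to fewer than two such edits per 100 reference words.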

Bridging dialect and accent gaps

The result speaks to a long-running bottleneck for voice technology in Asia.

A May report by the Baidu Developer Centre said traditional speech systems trained on standard Mandarin see accuracy fall below 60 per cent for accented speakers and under 30 per cent for regional Chinese dialects.

Alibaba has been trying to bridge that gap.

According to its cloud unit, Fun-Realtime-TTS-Preview supports more than 30 languages, seven major Chinese dialects and over 20 regional accents.

The model also provides enterprise-level customisation interfaces for finance and healthcare use cases.

In medical settings, for example, Alibaba said the system can convert doctors’ spoken notes into structured clinical records in real time.

Wider push into speech AI

Alibaba’s expansion in speech AI comes as Chinese tech companies shift from general-purpose chatbots toward more specialised real-world applications.

Developers are increasingly embedding voice AI assistants into daily applications in search of broader commercial uses for generative AI.

That focus reflects expectations that voice interfaces could become a key gateway for deploying AI across industries.

Voice is widely seen as one of the most intuitive forms of human-computer interaction, requiring little user training and working naturally across smartphones, smart speakers and in-car assistants.

Even so, US companies including Google and ElevenLabs continue to dominate many global commercial voice applications and developer ecosystems.
