Chinese AI startup Z.ai, recognized for its highly capable open-source GLM family of large language models (LLMs), has launched GLM-5-Turbo, a new proprietary variant of its open-source GLM-5 model aimed at agent-driven workflows. The company positions it as a faster model tuned for OpenClaw-style tasks such as tool use, long-chain execution and persistent automation.
It is available now through Z.ai's application programming interface (API) on third-party provider OpenRouter, with roughly a 202.8K-token context window, 131.1K max output, and listed pricing of $0.96 per million input tokens and $3.20 per million output tokens. That makes it about $0.04 cheaper in combined input and output cost (at 1 million tokens each) than its predecessor, according to our calculations.
All prices are in USD per million tokens.

| Model | Input | Output | Total Cost |
| --- | --- | --- | --- |
| Grok 4.1 Fast | $0.20 | $0.50 | $0.70 |
| Gemini 3 Flash | $0.50 | $3.00 | $3.50 |
| Kimi-K2.5 | $0.60 | $3.00 | $3.60 |
| GLM-5-Turbo | $0.96 | $3.20 | $4.16 |
| GLM-5 | $1.00 | $3.20 | $4.20 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 |
| Qwen3-Max | $1.20 | $6.00 | $7.20 |
| Gemini 3 Pro | $2.00 | $12.00 | $14.00 |
| GPT-5.2 | $1.75 | $14.00 | $15.75 |
| GPT-5.4 | $2.50 | $15.00 | $17.50 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 |
| Claude Opus 4.6 | $5.00 | $25.00 | $30.00 |
| GPT-5.4 Pro | $30.00 | $180.00 | $210.00 |
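For teams comparing models programmatically, the per-million-token totals above reduce to simple arithmetic. A minimal sketch, with the two GLM prices hardcoded from the table (verify current rates on OpenRouter before relying on them):

```python
# Listed OpenRouter prices, USD per million tokens, from the table above.
PRICES = {
    "glm-5-turbo": {"input": 0.96, "output": 3.20},
    "glm-5": {"input": 1.00, "output": 3.20},
}

def total_cost(model: str, input_mtok: float = 1.0, output_mtok: float = 1.0) -> float:
    """Blended cost in USD for the given millions of input/output tokens."""
    p = PRICES[model]
    return p["input"] * input_mtok + p["output"] * output_mtok

print(round(total_cost("glm-5-turbo"), 2))  # 4.16
print(round(total_cost("glm-5"), 2))        # 4.2
# Turbo's saving versus GLM-5 at 1M input + 1M output tokens:
print(round(total_cost("glm-5") - total_cost("glm-5-turbo"), 2))  # 0.04
```

At higher volumes the gap scales linearly, so the saving stays modest relative to the cheaper non-GLM entries in the table.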
Second, Z.ai is also adding the model to its GLM Coding subscription product, its packaged coding assistant service. That service has three tiers: Lite at $27 per quarter, Pro at $81 per quarter, and Max at $216 per quarter.
Z.ai's March 15 rollout note says Pro subscribers get GLM-5-Turbo in March, while Lite subscribers get the base GLM-5 in March and must wait until April for GLM-5-Turbo. The company is also accepting early-access applications from enterprises via a Google Form, which suggests some customers may get access ahead of that schedule depending on capacity.
Z.ai describes GLM-5-Turbo as designed for "fast inference" and "deeply optimized for real-world agent workflows involving long execution chains," with improvements in complex instruction decomposition, tool use, scheduled and persistent execution, and stability across extended tasks.
The release offers developers a new option for building OpenClaw-style autonomous AI agents, and serves as a signal about where model vendors think enterprise demand is heading: away from chat interfaces and toward systems that can reliably execute multi-step work.
That is now where much of the competition is moving as well, especially among vendors trying to win over developers and enterprise teams building internal assistants, workflow orchestrators and coding agents.
Built for execution, not just conversation
Z.ai's materials frame GLM-5-Turbo as a model for production-like agent behavior rather than static prompt-response use.
The pitch centers on reliability in practical task flows: better command following, stronger tool invocation, improved handling of scheduled and persistent tasks, and faster execution across longer logical chains. That positioning puts the model squarely in the market for agents that do more than answer questions.
It is aimed at systems that can gather information, call tools, break down instructions and keep working through complex task sequences with less supervision.
Rather than a straightforward successor to GLM-5, GLM-5-Turbo appears to be a more execution-focused variant: tuned for speed, tool use and long-chain agent stability, while the base GLM-5 remains Z.ai's broader open-source flagship.
The company's materials show GLM-5-Turbo as especially competitive in OpenClaw scenarios such as information search and gathering, office and daily tasks, data analysis, development and operations, and automation. These are company-supplied materials, not independent validation, but they make the intended product positioning clear.
Background: Z.ai and GLM-5 set the stage for Turbo
Founded in 2019 as a Tsinghua University spinoff in Beijing, Z.ai (formerly Zhipu AI) is now one of China's best-known foundation model companies. The company remains headquartered in Beijing and is led by CEO Zhang Peng.
Z.ai listed on the Hong Kong Stock Exchange on January 8, 2026, with shares priced at HK$116.20 and opening at HK$120, for a stated market capitalization of HK$52.83 billion, making it China's largest independent large language model developer.
As of September 30, 2025, its models had reportedly been used by more than 12,000 enterprise customers, more than 80 million end-user devices and more than 45 million developers worldwide.
Z.ai's last major release, GLM-5, which debuted in February 2026, provides useful context for what the company is now trying to do with GLM-5-Turbo.
GLM-5 is an open-source flagship model carrying an MIT license. It posted a record-low hallucination score on the AA-Omniscience Index and debuted a native "Agent Mode" that could turn prompts or source materials into ready-to-use .docx, .pdf and .xlsx files.
That earlier release was also framed as a major technical step up for the company. GLM-5 scaled to 744 billion parameters with 40 billion active per token in a mixture-of-experts architecture, used 28.5 trillion pretraining tokens, and relied on a new asynchronous reinforcement-learning infrastructure called "slime" to reduce training bottlenecks and support more complex agentic behavior.
In that light, GLM-5-Turbo looks less like a replacement for GLM-5 than a narrower commercial offshoot: a variant that keeps the long-context, agentic orientation of the flagship line but emphasizes speed, stability and execution in real-world agent chains.
Developer features and model packaging
On the technical side, Z.ai has been packaging the GLM-5 family with the kinds of capabilities developers now expect from serious agent-facing models, including long-context handling, tools, reasoning support and structured integrations.
OpenRouter's GLM-5-Turbo page lists support for tools, tool choice and response formatting, while also surfacing live performance data including average throughput and latency.
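In practice, those tool-calling capabilities are exercised through OpenRouter's OpenAI-compatible chat-completions endpoint. A minimal sketch of building such a request, assuming a model slug of `z-ai/glm-5-turbo` and a hypothetical `search_web` tool (check OpenRouter's model page for the exact identifier before use):

```python
import json

# OpenRouter's OpenAI-compatible chat-completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(user_msg: str) -> dict:
    """Build a chat-completions payload that declares one example tool."""
    return {
        "model": "z-ai/glm-5-turbo",  # assumed slug; verify on OpenRouter
        "messages": [{"role": "user", "content": user_msg}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "search_web",  # hypothetical tool, for illustration
                "description": "Search the web and return top results.",
                "parameters": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            },
        }],
        "tool_choice": "auto",  # let the model decide when to call the tool
    }

payload = build_request("Find recent coverage of GLM-5-Turbo pricing.")
print(json.dumps(payload, indent=2))
# Send with any HTTP client, e.g.:
# requests.post(OPENROUTER_URL, json=payload,
#               headers={"Authorization": "Bearer <OPENROUTER_API_KEY>"})
```

If the model decides to invoke the tool, the response's message will carry a `tool_calls` entry rather than plain text, and the agent loop feeds the tool's result back as a `tool`-role message.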
OpenRouter's provider telemetry offers a useful deployment-level comparison between GLM-5 and GLM-5-Turbo, though the data is not perfectly apples-to-apples because GLM-5 appears across multiple providers while GLM-5-Turbo is served only by Z.ai.
On throughput, GLM-5-Turbo averages 48 tokens per second on OpenRouter, which puts it below the fastest GLM-5 endpoints shown in the screenshots, including Fireworks at 70 tok/s and Friendli at 58 tok/s, but above Together's 40 tok/s.
On raw first-token latency, GLM-5-Turbo is slower in the available data, posting 2.92 seconds versus 0.41 seconds for Friendli's GLM-5 endpoint, 1.00 second for Parasail and 1.08 seconds for DeepInfra.
But the picture improves on end-to-end completion time: GLM-5-Turbo is shown at 8.16 seconds, faster than the GLM-5 endpoints, which range from 9.34 seconds on Fireworks to 11.23 seconds on DeepInfra.
The most notable operational advantage is in tool reliability. GLM-5-Turbo shows a 0.67% tool call error rate, materially lower than the GLM-5 providers shown, where error rates range from 2.33% to 6.41%.
For enterprise teams, that suggests a model that may not win on initial responsiveness in its current OpenRouter routing, but could still be better suited to longer agent runs where completion stability and lower tool failure matter more than the fastest first token.
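Even a 0.67% per-call error rate compounds over long agent chains, which is why agent harnesses typically wrap tool invocations in a retry with backoff. A generic sketch, not tied to any particular SDK:

```python
import time

def call_with_retry(fn, *args, retries=3, base_delay=0.1, **kwargs):
    """Call fn, retrying with exponential backoff on failure.

    A 0.67% failure rate per call becomes a meaningful failure rate
    across a chain of hundreds of tool calls, so transient errors
    are usually retried rather than surfaced to the user.
    """
    for attempt in range(retries + 1):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == retries:
                raise  # exhausted retries: propagate the error
            time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...

# Example: a flaky tool that fails twice before succeeding.
calls = {"n": 0}
def flaky_tool():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient tool-call error")
    return "ok"

print(call_with_retry(flaky_tool))  # prints "ok" after two retries
```

The lower a model's baseline tool error rate, the fewer of these retries fire, which shortens end-to-end run time on long chains.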
Benchmarking and pricing
A ZClawBench radar chart released by Z.ai shows GLM-5-Turbo as especially competitive in OpenClaw scenarios such as information search and gathering, office and daily tasks, data analysis, development and operations, and automation.
These are company-supplied benchmark visuals, not independent validation, but they do help explain how Z.ai wants the two models understood: GLM-5 as the broader coding and open flagship, and Turbo as the more targeted agent-execution variant.
A more nuanced licensing signal
One notable caveat is licensing. Z.ai says GLM-5-Turbo is currently closed-source, but it also says the model's capabilities and findings will be folded into its next open-source model release. That is a crucial distinction: the company is not clearly promising to open-source GLM-5-Turbo itself.
Instead, it is saying that lessons, techniques and improvements from this release will inform a future open model. That makes the launch more nuanced than a clean break from openness.
Z.ai's earlier GLM strategy leaned heavily on open releases and open-weight distribution, which helped it build visibility among developers.
China's AI market may be rebalancing away from open source
GLM-5-Turbo's licensing posture also lands in a wider Chinese market context that makes the launch more notable than a simple product update.
In recent weeks, reporting around Alibaba's Qwen unit has raised fresh questions about how China's leading AI labs will balance open releases with commercial pressure.
Earlier this month, Qwen division head Lin Junyang stepped down, becoming the third senior Qwen executive to leave in 2026, though Alibaba's Qwen family remains one of the most prolific open-model efforts anywhere, with more than 400 open-source models released since 2023 and more than 1 billion downloads.
Reuters then reported on March 16 that Alibaba CEO Eddie Wu would take direct control of a newly formed AI-focused business group consolidating Qwen and other units, amid scrutiny over strategy, profitability and the brutal price competition surrounding open-model offerings in China.
Even without overstating these developments, they help frame the broader question hanging over the sector: whether the economics of frontier AI are starting to push even historically open-leaning Chinese labs toward a more segmented strategy.
That does not mean Chinese labs are abandoning open source. But the pattern is becoming harder to ignore: open models help drive adoption, developer goodwill and ecosystem reach, while certain high-value variants aimed at enterprise agents, coding workflows and other commercially attractive use cases may increasingly arrive first as proprietary products.
In that sense, GLM-5-Turbo fits a larger possible shift in China's AI market, one that looks increasingly like the playbook used by OpenAI, Anthropic and Google in the U.S.: openness as distribution, proprietary systems as business. Seen in that light, the release looks like more than a speed-focused product update; it may be another sign that parts of China's AI sector are converging on that same hybrid model.
That would not mark the end of open-source AI from Chinese labs, but it could mean their most strategically important agent-focused offerings appear first behind closed access, even if some of their underlying advances later make their way into open releases.
For developers evaluating agent platforms, that makes GLM-5-Turbo both a product launch and a useful signal. Z.ai is still speaking the language of open models. But with this release, it is also showing that some of its most commercially relevant work may arrive first as proprietary infrastructure for enterprise-grade agent systems.

