2025 © Madisony.com. All Rights Reserved.
Technology

Alibaba's Qwen 3.5 397B-A17 beats its bigger trillion-parameter model, at a fraction of the cost

Madisony
Last updated: February 18, 2026 9:27 pm
Contents
  • A New Architecture Built for Speed at Scale
  • Native Multimodal, Not Bolted On
  • Language Coverage and Tokenizer Efficiency
  • Agentic Capabilities and the OpenClaw Integration
  • Deployment Realities: What IT Teams Actually Need to Know
  • What Comes Next

Alibaba dropped Qwen3.5 earlier this week, timed to coincide with the Lunar New Year, and the headline numbers alone are enough to make enterprise AI buyers stop and pay attention.

The new flagship open-weight model, Qwen3.5-397B-A17B, packs 397 billion total parameters but activates only 17 billion per token. It claims benchmark wins against Alibaba's own previous flagship, Qwen3-Max, a model the company itself has acknowledged exceeded one trillion parameters.

The release marks a significant moment in enterprise AI procurement. For IT leaders evaluating AI infrastructure for 2026, Qwen 3.5 presents a different kind of argument: the model you can actually run, own, and control can now trade blows with the models you have to rent.

A New Architecture Built for Speed at Scale

The engineering story beneath Qwen3.5 begins with its ancestry. The model is a direct successor to last September's experimental Qwen3-Next, an ultra-sparse MoE model that was previewed but widely regarded as half-trained. Qwen3.5 takes that architectural route and scales it aggressively, jumping from 128 experts in the earlier Qwen3 MoE models to 512 experts in the new release.

The practical implication of this, combined with a better attention mechanism, is dramatically lower inference latency. Because only 17 billion of those 397 billion parameters are active for any given forward pass, the compute footprint is far closer to a 17B dense model than a 400B one, while the model can draw on the full depth of its expert pool for specialized reasoning.
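The arithmetic behind that claim is straightforward. Using the common rule of thumb of roughly two FLOPs per active parameter per generated token (an approximation, not a published Qwen figure), the sparse design cuts per-token decode compute by more than 23x versus a hypothetical dense model of the same total size:

```python
# Back-of-the-envelope decode compute for a sparse MoE vs. a dense model.
# Rule of thumb: ~2 FLOPs per active parameter per generated token.
TOTAL_PARAMS = 397e9   # Qwen3.5-397B-A17B total parameters
ACTIVE_PARAMS = 17e9   # parameters activated per token

flops_per_token_moe = 2 * ACTIVE_PARAMS    # sparse: only routed experts run
flops_per_token_dense = 2 * TOTAL_PARAMS   # hypothetical dense 397B model

compute_ratio = flops_per_token_dense / flops_per_token_moe
print(f"MoE decode needs ~{compute_ratio:.1f}x less compute per token")
```

The ratio is simply total over active parameters, about 23.4x, which is why the serving economics look closer to a mid-size model than a 400B-class one.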

These speed gains are substantial. At 256K context lengths, Qwen 3.5 decodes 19 times faster than Qwen3-Max and 7.2 times faster than Qwen 3's 235B-A22B model.

Alibaba is also claiming the model is 60% cheaper to run than its predecessor and eight times more capable of handling large concurrent workloads, figures that matter enormously to any organization paying attention to inference bills. It is also about 1/18th the cost of Google's Gemini 3 Pro.

Two other architectural choices compound these gains:

  1. Qwen3.5 adopts multi-token prediction, an approach pioneered in several proprietary models, which accelerates pre-training convergence and increases throughput.

  2. It also inherits the attention system from Qwen3-Next, released last year and designed specifically to reduce memory pressure at very long context lengths.

The result is a model that can comfortably operate within a 256K context window in the open-weight version, and up to 1 million tokens in the hosted Qwen3.5-Plus variant on Alibaba Cloud Model Studio.
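One way to see why memory pressure dominates at these lengths is to estimate the KV cache for a single 256K-token sequence. The layer and head counts below are illustrative assumptions for a model of this class, not Qwen3.5's published configuration:

```python
# KV-cache size estimate for one long sequence.
# LAYERS, KV_HEADS, and HEAD_DIM are ILLUSTRATIVE assumptions,
# not Qwen3.5's published architecture.
LAYERS = 60          # hypothetical transformer layer count
KV_HEADS = 8         # hypothetical KV heads (grouped-query attention)
HEAD_DIM = 128       # hypothetical per-head dimension
BYTES = 2            # fp16/bf16 cache entries
CONTEXT = 256_000    # the open-weight 256K context window

# 2x for the separate key and value tensors, per layer, per position.
kv_cache_bytes = 2 * LAYERS * KV_HEADS * HEAD_DIM * CONTEXT * BYTES
kv_cache_gb = kv_cache_bytes / 1e9
print(f"KV cache at 256K context: ~{kv_cache_gb:.1f} GB per sequence")
```

Even with grouped-query attention, a single long sequence can consume tens of gigabytes of cache, which is exactly the pressure the inherited Qwen3-Next attention design targets.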

Native Multimodal, Not Bolted On

For years, Alibaba took the standard industry approach: build a language model, then attach a vision encoder to create a separate VL variant. Qwen3.5 abandons that pattern entirely. The model is trained from scratch on text, images, and video simultaneously, meaning visual reasoning is woven into the model's core representations rather than grafted on.

This matters in practice. Natively multimodal models tend to outperform their adapter-based counterparts on tasks that require tight text-image reasoning: think analyzing a technical diagram alongside its documentation, processing UI screenshots for agentic tasks, or extracting structured data from complex visual layouts. On MathVista, the model scores 90.3; on MMMU, 85.0. It trails Gemini 3 on several vision-specific benchmarks but surpasses Claude Opus 4.5 on multimodal tasks and posts competitive numbers against GPT-5.2, all while carrying a fraction of the parameter count.

Qwen3.5's benchmark performance against larger proprietary models is the number that will drive enterprise conversations.

On the evaluations Alibaba has published, the 397B-A17B model outperforms Qwen3-Max, a model with over a trillion parameters, across several reasoning and coding tasks.

It also claims competitive results against GPT-5.2, Claude Opus 4.5, and Gemini 3 Pro on general reasoning and coding benchmarks.

Language Coverage and Tokenizer Efficiency

One underappreciated detail in the Qwen3.5 release is its expanded multilingual reach. The model's vocabulary has grown to 250k tokens, up from 150k in prior Qwen generations and now comparable to Google's ~256K tokenizer. Language support expands from 119 languages in Qwen 3 to 201 languages and dialects.

The tokenizer upgrade has direct cost implications for global deployments. Larger vocabularies encode non-Latin scripts (Arabic, Thai, Korean, Japanese, Hindi, and others) more efficiently, reducing token counts by 15–40% depending on the language. For IT organizations running AI at scale across multilingual user bases, this is not an academic detail. It translates directly into lower inference costs and faster response times.
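A quick sketch of what that range means for a bill. The monthly volume and per-token price below are hypothetical, chosen only to make the 15–40% reduction concrete:

```python
# What a 15-40% token reduction means for a multilingual inference bill.
# Volume and price are HYPOTHETICAL, for illustration only.
monthly_tokens = 2_000_000_000   # 2B tokens/month across non-Latin scripts
price_per_million = 0.50         # hypothetical $ per 1M tokens

def monthly_cost(tokens: float, reduction: float = 0.0) -> float:
    """Inference cost after a fractional token-count reduction."""
    return tokens * (1 - reduction) / 1e6 * price_per_million

baseline = monthly_cost(monthly_tokens)                   # no savings
worst_case = monthly_cost(monthly_tokens, reduction=0.15) # 15% fewer tokens
best_case = monthly_cost(monthly_tokens, reduction=0.40)  # 40% fewer tokens
print(f"${baseline:,.0f} -> ${worst_case:,.0f} to ${best_case:,.0f} per month")
```

At these assumed numbers, a $1,000 monthly bill drops to somewhere between $850 and $600, before counting the latency benefit of shorter sequences.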

Agentic Capabilities and the OpenClaw Integration

Alibaba is positioning Qwen3.5 explicitly as an agentic model, one designed not just to respond to queries but to take multi-step autonomous action on behalf of users and systems. The company has open-sourced Qwen Code, a command-line interface that lets developers delegate complex coding tasks to the model in natural language, roughly analogous to Anthropic's Claude Code.

The release also highlights compatibility with OpenClaw, the open-source agentic framework that has surged in developer adoption this year. With 15,000 distinct reinforcement learning training environments used to sharpen the model's reasoning and task execution, the Qwen team has made a deliberate bet on RL-based training to improve practical agentic performance, a trend in line with what MiniMax demonstrated with M2.5.

The Qwen3.5-Plus hosted variant also enables adaptive inference modes: a fast mode for latency-sensitive applications, a thinking mode that enables extended chain-of-thought reasoning for complex tasks, and an auto (adaptive) mode that selects dynamically. That flexibility matters for enterprise deployments where the same model may need to serve both real-time customer interactions and deep analytical workflows.
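In application code, that choice would surface as a request parameter. The sketch below builds an OpenAI-style chat payload; the `reasoning_mode` field name is an assumption made for illustration, so check Alibaba Cloud Model Studio's documentation for the actual parameter:

```python
# Sketch: routing requests between fast / thinking / auto inference modes.
# The "reasoning_mode" parameter name is HYPOTHETICAL; consult the
# Alibaba Cloud Model Studio docs for the real field name.
VALID_MODES = {"fast", "thinking", "auto"}

def build_request(prompt: str, mode: str = "auto") -> dict:
    """Build a chat-completions-style payload with an inference-mode hint."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    return {
        "model": "qwen3.5-plus",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_mode": mode,  # fast = low latency, thinking = long CoT
    }

latency_sensitive = build_request("Summarize this ticket.", mode="fast")
analytical = build_request("Audit this contract clause by clause.", mode="thinking")
```

The useful pattern is the routing itself: customer-facing paths default to the fast mode, analytical back-office jobs opt into extended reasoning, and everything else lets the auto mode decide.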

Deployment Realities: What IT Teams Actually Need to Know

Running Qwen3.5's open weights in-house requires serious hardware. A quantized version demands roughly 256GB of RAM, and realistically 512GB for comfortable headroom. This is not a model for a workstation or a modest on-prem server. What it is suitable for is a GPU node, a configuration that many enterprises already operate for inference workloads, and one that now offers a compelling alternative to API-dependent deployments.
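The RAM figures follow directly from parameter count and numeric precision. A rough sketch of weight storage alone (runtime overhead such as the KV cache and activations is why 512GB is the comfortable target):

```python
# Where the ~256GB RAM figure comes from: raw weight storage by precision.
# This counts weights only; KV cache and runtime overhead come on top.
PARAMS = 397e9  # Qwen3.5-397B-A17B total parameters

def weights_gb(bits_per_param: int) -> float:
    """Gigabytes needed just to hold the weights at a given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

fp16_gb = weights_gb(16)  # full-precision weights
int4_gb = weights_gb(4)   # 4-bit quantized weights
print(f"fp16: {fp16_gb:.0f} GB, 4-bit quantized: {int4_gb:.0f} GB")
```

At fp16 the weights alone are roughly 794GB, well beyond a single typical node; at 4-bit quantization they shrink to about 199GB, which is what makes the cited ~256GB envelope plausible.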

All open-weight Qwen 3.5 models are released under the Apache 2.0 license. This is a meaningful distinction from models with custom or restricted licenses: Apache 2.0 permits commercial use, modification, and redistribution without royalties, with no meaningful strings attached. For legal and procurement teams evaluating open models, that clean licensing posture simplifies the conversation considerably.

What Comes Next

Alibaba has confirmed this is the first release in the Qwen3.5 family, not the complete rollout. Based on the pattern from Qwen3, which featured models down to 600 million parameters, the industry expects smaller dense distilled models and more MoE configurations to follow over the next several weeks and months. The Qwen3-Next 80B model from last September was widely considered undertrained, suggesting a 3.5 variant at that scale is a likely near-term release.

For IT decision-makers, the trajectory is clear. Alibaba has demonstrated that open-weight models at the frontier are no longer a compromise. Qwen3.5 is a real procurement option for teams that want frontier-class reasoning, native multimodal capabilities, and a 1M token context window, without locking into a proprietary API. The next question is not whether this family of models is capable enough. It is whether your infrastructure and team are ready to take advantage of it.


Qwen 3.5 is available now on Hugging Face under the model ID Qwen/Qwen3.5-397B-A17B. The hosted Qwen3.5-Plus variant is available via Alibaba Cloud Model Studio. Qwen Chat at chat.qwen.ai provides free public access for evaluation.
