
Alibaba's Qwen 3.5 397B-A17B beats its larger trillion-parameter model, at a fraction of the cost

Madisony
Last updated: February 18, 2026 9:27 pm


Alibaba dropped Qwen3.5 earlier this week, timed to coincide with the Lunar New Year, and the headline numbers alone are enough to make enterprise AI buyers stop and pay attention.

The new flagship open-weight model, Qwen3.5-397B-A17B, packs 397 billion total parameters but activates only 17 billion per token. It is claiming benchmark wins against Alibaba's own previous flagship, Qwen3-Max, a model the company itself has acknowledged exceeded one trillion parameters.

The release marks a meaningful moment in enterprise AI procurement. For IT leaders evaluating AI infrastructure for 2026, Qwen 3.5 presents a different kind of argument: that the model you can actually run, own, and control can now trade blows with the models you have to rent.

A New Architecture Built for Speed at Scale

The engineering story underneath Qwen3.5 begins with its ancestry. The model is a direct successor to last September's experimental Qwen3-Next, an ultra-sparse mixture-of-experts (MoE) model that was previewed but widely regarded as half-trained. Qwen3.5 takes that architectural direction and scales it aggressively, jumping from 128 experts in the previous Qwen3 MoE models to 512 experts in the new release.
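To make the sparse-expert idea concrete, here is a minimal sketch of top-k routing, the general technique MoE layers use to pick a handful of experts per token. It is an illustration only and not Alibaba's implementation: the expert count of 512 comes from the release, but `TOP_K`, the function name `route_top_k`, and the stand-in router scores are all assumptions made for this example.

```python
import math

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Rank expert indices by router score and keep the best k.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

NUM_EXPERTS = 512  # expert count cited for Qwen3.5
TOP_K = 8          # hypothetical: the article does not say how many experts fire per token

# Deterministic stand-in for the scores a learned router would produce.
logits = [math.sin(i) for i in range(NUM_EXPERTS)]
chosen = route_top_k(logits, TOP_K)

for expert, weight in chosen:
    print(f"expert {expert:3d} weight {weight:.3f}")
```

Only the selected experts run for a given token, which is why the total parameter count can grow much faster than the per-token compute.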

The practical implication of this, together with a better attention mechanism, is dramatically lower inference latency. Because only 17 billion of those 397 billion parameters are active for any given forward pass, the compute footprint is much closer to that of a 17B dense model than a 400B one, while the model can still draw on the full depth of its expert pool for specialized reasoning.

These speed gains are substantial. At 256K context lengths, Qwen 3.5 decodes 19 times faster than Qwen3-Max and 7.2 times faster than Qwen 3's 235B-A22B model.

Alibaba is also claiming the model is 60% cheaper to run than its predecessor and eight times more capable of handling large concurrent workloads, figures that matter enormously to any team paying attention to inference bills. It is also about 1/18th the cost of Google's Gemini 3 Pro.

Two other architectural decisions compound these gains:

  1. Qwen3.5 adopts multi-token prediction, an approach pioneered in several proprietary models, which accelerates pre-training convergence and increases throughput.

  2. It also inherits the attention system from Qwen3-Next, released last year, which is designed specifically to reduce memory pressure at very long context lengths.

The result is a model that can comfortably operate within a 256K context window in the open-weight version, and up to 1 million tokens in the hosted Qwen3.5-Plus variant on Alibaba Cloud Model Studio.
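The active-versus-total comparison can be checked with back-of-the-envelope arithmetic. The sketch below uses the two parameter counts from the release and the common rule of thumb of roughly 2 FLOPs per active parameter per generated token. That rule is an approximation that ignores attention and routing overhead, so treat the output as an order-of-magnitude estimate, not a measured figure.

```python
TOTAL_PARAMS = 397e9   # total parameters, from the release
ACTIVE_PARAMS = 17e9   # parameters activated per token, from the release

def active_fraction(total, active):
    """Share of the model's parameters that participate in one forward pass."""
    return active / total

def approx_flops_per_token(active_params):
    """Rule-of-thumb estimate: about 2 FLOPs per active parameter per token."""
    return 2 * active_params

fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
sparse_flops = approx_flops_per_token(ACTIVE_PARAMS)
dense_flops = approx_flops_per_token(TOTAL_PARAMS)  # if every parameter were active

print(f"active fraction: {fraction:.1%}")
print(f"approx FLOPs per token (sparse): {sparse_flops:.2e}")
print(f"dense-equivalent compute ratio: {dense_flops / sparse_flops:.1f}x")
```

The estimate covers compute only. All 397 billion parameters still have to be stored and served, which is why the memory requirements remain large even though per-token compute is small.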

Native Multimodal, Not Bolted On

For years, Alibaba took the standard industry approach: build a language model, then attach a vision encoder to create a separate VL variant. Qwen3.5 abandons that pattern entirely. The model is trained from scratch on text, images, and video simultaneously, meaning visual reasoning is woven into the model's core representations rather than grafted on.

This matters in practice. Natively multimodal models tend to outperform their adapter-based counterparts on tasks that require tight text-image reasoning: think analyzing a technical diagram alongside its documentation, processing UI screenshots for agentic tasks, or extracting structured data from complex visual layouts. On MathVista, the model scores 90.3; on MMMU, 85.0. It trails Gemini 3 on several vision-specific benchmarks but surpasses Claude Opus 4.5 on multimodal tasks and posts competitive numbers against GPT-5.2, all while carrying a fraction of the parameter count.

Qwen3.5's benchmark performance against larger proprietary models is the number that will drive enterprise conversations.

On the evaluations Alibaba has published, the 397B-A17B model outperforms Qwen3-Max, a model with over a trillion parameters, across several reasoning and coding tasks.

It also claims competitive results against GPT-5.2, Claude Opus 4.5, and Gemini 3 Pro on general reasoning and coding benchmarks.

Language Coverage and Tokenizer Efficiency

One underappreciated detail in the Qwen3.5 release is its expanded multilingual reach. The model's vocabulary has grown to 250k tokens, up from 150k in prior Qwen generations and now comparable to Google's ~256K tokenizer. Language support expands from 119 languages in Qwen 3 to 201 languages and dialects.

The tokenizer upgrade has direct cost implications for global deployments. Larger vocabularies encode non-Latin scripts (Arabic, Thai, Korean, Japanese, Hindi, and others) more efficiently, reducing token counts by 15–40% depending on the language. For IT organizations running AI at scale across multilingual user bases, this is not an academic detail. It translates directly to lower inference costs and faster response times.
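A rough way to see what a 15–40% token reduction means for a bill is to apply it to a fixed workload. In the sketch below, the monthly token volume and the per-million-token price are hypothetical placeholders, not Alibaba's pricing. Only the 15% and 40% bounds come from the figures above.

```python
def tokens_after_reduction(baseline_tokens, reduction):
    """Token count once the tokenizer encodes the same text more compactly."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return baseline_tokens * (1.0 - reduction)

def cost(tokens, price_per_million_tokens):
    """Spend for a token volume at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million_tokens

BASELINE_TOKENS = 1_000_000_000   # hypothetical monthly volume with the older tokenizer
PRICE_PER_MILLION = 1.00          # hypothetical price, in arbitrary currency units

baseline_cost = cost(BASELINE_TOKENS, PRICE_PER_MILLION)
for reduction in (0.15, 0.40):    # bounds quoted for non-Latin scripts
    new_tokens = tokens_after_reduction(BASELINE_TOKENS, reduction)
    new_cost = cost(new_tokens, PRICE_PER_MILLION)
    print(f"{reduction:.0%} fewer tokens: cost {new_cost:,.2f} vs baseline {baseline_cost:,.2f}")
```

Because token-based pricing is linear, the percentage saved on tokens is the percentage saved on spend for that traffic. The actual reduction depends on the language mix of the workload.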

Agentic Capabilities and the OpenClaw Integration

Alibaba is positioning Qwen3.5 explicitly as an agentic model, one designed not just to respond to queries but to take multi-step autonomous action on behalf of users and systems. The company has open-sourced Qwen Code, a command-line interface that lets developers delegate complex coding tasks to the model in natural language, roughly analogous to Anthropic's Claude Code.

The release also highlights compatibility with OpenClaw, the open-source agentic framework that has surged in developer adoption this year. With 15,000 distinct reinforcement learning training environments used to sharpen the model's reasoning and task execution, the Qwen team has made a deliberate bet on RL-based training to improve practical agentic performance, a trend consistent with what MiniMax demonstrated with M2.5.

The Qwen3.5-Plus hosted variant also enables adaptive inference modes: a fast mode for latency-sensitive applications, a thinking mode that enables extended chain-of-thought reasoning for complex tasks, and an auto (adaptive) mode that selects dynamically. That flexibility matters for enterprise deployments where the same model may need to serve both real-time customer interactions and deep analytical workflows.

Deployment Realities: What IT Teams Actually Need to Know

Running Qwen3.5's open weights in-house requires serious hardware. A quantized version demands roughly 256GB of RAM, and realistically 512GB for comfortable headroom. This is not a model for a workstation or a modest on-prem server. What it is suitable for is a GPU node, a configuration that many enterprises already operate for inference workloads, and one that now offers a compelling alternative to API-dependent deployments.

All open-weight Qwen 3.5 models are released under the Apache 2.0 license. This is a meaningful distinction from models with custom or restricted licenses: Apache 2.0 permits commercial use, modification, and redistribution without royalties, with no meaningful strings attached. For legal and procurement teams evaluating open models, that clean licensing posture simplifies the conversation considerably.
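The memory figures follow from simple arithmetic: weight storage is roughly the parameter count times the bits per parameter. The sketch below assumes 4-bit, 8-bit, and 16-bit weights purely for illustration, since the article does not say which quantization the 256GB figure refers to. It also counts weights only, leaving out the KV cache, activations, and runtime overhead that the extra headroom has to cover.

```python
TOTAL_PARAMS = 397e9  # total parameters, from the release

def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# Bit widths are assumptions for illustration, not figures from Alibaba.
for bits in (4, 8, 16):
    gb = weight_memory_gb(TOTAL_PARAMS, bits)
    print(f"{bits:2d}-bit weights: ~{gb:,.1f} GB before KV cache and overhead")
```

Under these assumptions, a 4-bit copy of the weights alone comes to just under 200 GB, which is in line with the roughly 256GB quoted once working memory is added.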

What Comes Next

Alibaba has confirmed this is the first release in the Qwen3.5 family, not the complete rollout. Based on the pattern from Qwen3, which featured models down to 600 million parameters, the industry expects smaller dense distilled models and additional MoE configurations to follow over the next several weeks and months. The Qwen3-Next 80B model from last September was widely considered undertrained, suggesting a 3.5 variant at that scale is a likely near-term release.

For IT decision-makers, the trajectory is clear. Alibaba has demonstrated that open-weight models at the frontier are no longer a compromise. Qwen3.5 is a real procurement option for teams that want frontier-class reasoning, native multimodal capabilities, and a 1M token context window, without locking into a proprietary API. The next question is not whether this family of models is capable enough. It is whether your infrastructure and team are ready to take advantage of it.

Qwen 3.5 is available now on Hugging Face under the model ID Qwen/Qwen3.5-397B-A17B. The hosted Qwen3.5-Plus variant is available via Alibaba Cloud Model Studio. Qwen Chat at chat.qwen.ai provides free public access for evaluation.
