Technology

Weibo's new open-source AI model VibeThinker-1.5B outperforms DeepSeek-R1 on $7,800 post-training budget

Madisony
Last updated: November 12, 2025 8:08 pm

Another day in late 2025, another impressive result from a Chinese company in open-source artificial intelligence.

Chinese social networking company Weibo's AI division recently released its open-source VibeThinker-1.5B, a 1.5-billion-parameter large language model (LLM) that is a fine-tuned variant of rival Chinese tech firm Alibaba's Qwen2.5-Math-1.5B.

It's available now for free download and use by researchers and enterprise developers, even for commercial purposes, under a permissive MIT License on Hugging Face, GitHub and ModelScope, with a technical report on the open-access science publishing site arxiv.org.

And yet, despite its compact size, VibeThinker-1.5B achieves benchmark-topping reasoning performance on math and code tasks, rivaling or surpassing models hundreds of times its size. It even outperforms Chinese rival DeepSeek's famed R1, the 671-billion-parameter model that went viral at the start of this year, on formal reasoning benchmarks.

It further eclipses Mistral AI's Magistral Medium and holds its own against Anthropic's Claude Opus 4 and OpenAI's gpt-oss-20B Medium, all while requiring a fraction of the infrastructure and investment.

It also does so having been post-trained on a budget of merely $7,800 USD for compute resources (3,900 GPU hours on Nvidia H800s), far less than the tens, or even hundreds, of thousands of dollars typically required to fine-tune models of similar or larger scale.
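The hourly rate implied by those two figures follows from simple division. A minimal sketch of that arithmetic, assuming the $7,800 covers only the 3,900 H800 GPU hours and nothing else (the constant and function names are illustrative, not from the technical report):

```python
# Figures quoted in the article; the per-hour rate is derived here, not reported.
POST_TRAINING_COST_USD = 7_800
GPU_HOURS = 3_900


def implied_rate_per_gpu_hour(total_cost_usd: float, gpu_hours: float) -> float:
    """Return the average price per GPU hour implied by a total compute budget."""
    if gpu_hours <= 0:
        raise ValueError("gpu_hours must be positive")
    return total_cost_usd / gpu_hours


rate = implied_rate_per_gpu_hour(POST_TRAINING_COST_USD, GPU_HOURS)
print(f"Implied rate: ${rate:.2f} per H800 GPU hour")  # prints $2.00
```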

Recall that this isn’t the total cost of the model's development, however: LLMs are trained in stages. First comes pre-training, when the model learns basic language structure and general knowledge by predicting the next word across enormous amounts of text from the internet, books, and articles. This gives it fluency but not much sense of how to follow instructions or hold a conversation.

Post-training comes next, using much smaller, higher-quality datasets (typically collections of example questions, prompts, and expert-written answers) to teach the model how to respond helpfully, reason through problems, and align with human expectations. Still, Weibo's post-training cost-effectiveness on VibeThinker-1.5B is noteworthy and should be commended.

The open-source release upends assumptions about parameter scale, compute intensity, and the minimum viable size for high-performance LLMs.

A Different Training Approach: Spectrum-to-Signal

VibeThinker-1.5B owes its performance not to scale, but to the training framework behind it: the Spectrum-to-Signal Principle (SSP).

Instead of optimizing a model purely for single-answer correctness (Pass@1), the SSP framework decouples supervised fine-tuning (SFT) and reinforcement learning (RL) into two distinct phases with different goals:

  • SFT (“Spectrum Phase”): The model is trained to maximize diversity across potential correct answers, improving its Pass@K score. This builds a wide range of plausible solution paths.

  • RL (“Signal Phase”): A second-stage reinforcement learning system (called MaxEnt-Guided Policy Optimization, or MGPO) is used to identify and amplify the most correct paths from this diverse solution pool. MGPO prioritizes problems where the model is most uncertain, using entropy-based weighting to focus learning.
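To make the two objectives concrete, here is a minimal sketch. `pass_at_k` uses the standard unbiased estimator common in code-generation evaluation (draw n samples per problem, count the c that are correct). `uncertainty_weight` is only an illustrative stand-in for entropy-based weighting: it scores a problem by the binary entropy of the model's empirical pass rate, which peaks when the model is right about half the time. That function is an assumption made for illustration, not the MGPO formula from the technical report.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k).

    n: samples drawn per problem, c: samples judged correct, k: attempt budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is 1.0 in that case.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def uncertainty_weight(pass_rate: float) -> float:
    """Binary entropy (in bits) of a per-problem pass rate.

    Hypothetical weighting: 1.0 at a 50% pass rate, 0.0 when the model
    always or never solves the problem.
    """
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must lie in [0, 1]")
    if pass_rate in (0.0, 1.0):
        return 0.0
    fail_rate = 1.0 - pass_rate
    return -(pass_rate * math.log2(pass_rate) + fail_rate * math.log2(fail_rate))


# With 4 correct answers out of 16 samples, a single attempt succeeds 25% of
# the time, but a budget of 8 attempts almost always contains a correct one.
print(round(pass_at_k(16, 4, 1), 4))  # 0.25
print(round(pass_at_k(16, 4, 8), 4))  # 0.9615
print(uncertainty_weight(0.5))        # 1.0
```

The gap between the two pass@k values is the "spectrum" the SFT phase is said to widen; a weighting like the one above would then steer RL updates toward problems where that spectrum has not yet collapsed to a reliable answer.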

The authors argue this separation allows small models to explore the reasoning space more effectively, achieving signal amplification without relying on massive parameter counts.

VibeThinker-1.5B makes a compelling case that the industry’s reliance on parameter scaling as the only route to better reasoning performance may be outdated.

By adopting a diversity-first training pipeline, WeiboAI has shown that smaller, more accessible models can match and even outperform billion-dollar systems in logic-heavy tasks.

The low resource footprint is among the most significant aspects of VibeThinker-1.5B. At under $8,000, the post-training cost is 30–60x lower than that of models like DeepSeek R1 and MiniMax-M1, which cost between $294K and $535K to train.
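As a quick sanity check on that multiple, the quoted dollar figures can be divided directly. This is a rough sketch that assumes the $7,800 post-training budget is the right baseline; the dictionary labels are illustrative. Straight division gives roughly 38x and 69x, the same order of magnitude as the rounded 30–60x range.

```python
# Dollar figures quoted in the article; the ratios are derived, not reported.
VIBETHINKER_POST_TRAINING_USD = 7_800
COMPARISON_COSTS_USD = {
    "lower quoted figure": 294_000,
    "upper quoted figure": 535_000,
}


def cost_ratio(other_cost_usd: float, baseline_cost_usd: float) -> float:
    """Return how many times more expensive `other` is than the baseline."""
    if baseline_cost_usd <= 0:
        raise ValueError("baseline_cost_usd must be positive")
    return other_cost_usd / baseline_cost_usd


for label, cost in COMPARISON_COSTS_USD.items():
    ratio = cost_ratio(cost, VIBETHINKER_POST_TRAINING_USD)
    print(f"{label}: {ratio:.1f}x")  # 37.7x, then 68.6x
```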

Performance Across Domains

Despite its small size, VibeThinker-1.5B delivers cross-domain reasoning that outpaces many larger open-source and commercial models:

Model               | AIME25 | LiveCodeBench v6 | GPQA-Diamond
VibeThinker-1.5B    | 74.4   | 51.1             | 46.7
GPT-OSS-20B-Medium  | 72.1   | 54.9             | 66.0
Claude Opus 4       | 69.2   | 56.6             | 79.6
MiniMax M1 (456B)   | 74.6   | 62.3             | 69.2
DeepSeek R1 (671B)  | 70.0   | 65.9             | 71.5
Kimi K2 (1.09T)     | 49.5   | 53.7             | 75.1

VibeThinker was benchmarked against both reasoning-centric models (Magistral, Claude, OpenAI o3-mini) and non-reasoning LLMs (GPT-4.1, Kimi K2, DeepSeek V3). Across structured reasoning benchmarks, the model consistently outperformed non-reasoning models, regardless of size:

  • On AIME24 (math), it beat Kimi K2 (1.09T) by over 10 points (80.3 vs. 69.6).

  • On LiveCodeBench v6, it surpassed Claude Opus 4 (51.1 vs. 47.4).

  • On GPQA, it scored below GPT-4.1 and Claude, but still more than doubled its base model's score (from 16.4 to 46.7).

This supports the authors’ claim that size isn’t the only path to reasoning capability: with proper training design, smaller models can reach and even exceed the performance of far larger systems in targeted tasks.

Notably, it achieves parity with models hundreds of times larger on math and code, though it lags behind in general-knowledge reasoning (GPQA), where larger models maintain an edge.

This suggests a potential specialization trade-off: while VibeThinker excels at structured logical tasks, it has less capacity for wide-ranging encyclopedic recall, a known limitation of smaller architectures.

Guidance for Enterprise Adoption

The release includes recommended inference settings (temperature = 0.6, top_p = 0.95, max tokens = 40960).
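One way those settings might be wired into a script is sketched below using the Hugging Face `transformers` generation API. Several things here are assumptions rather than details from the release: the repository id `WeiboAI/VibeThinker-1.5B`, the mapping of "max tokens" onto `max_new_tokens`, and the decision to skip any chat template the model may expect.

```python
from typing import Any, Dict


def recommended_generation_kwargs() -> Dict[str, Any]:
    """Sampling settings quoted in the release, expressed as generate() kwargs."""
    return {
        "do_sample": True,        # temperature/top_p only apply when sampling
        "temperature": 0.6,
        "top_p": 0.95,
        "max_new_tokens": 40960,  # assumed mapping of "max tokens"
    }


def generate(prompt: str, model_id: str = "WeiboAI/VibeThinker-1.5B") -> str:
    """Generate a completion; model_id is an assumed Hugging Face repo id."""
    # Imported lazily so the settings can be inspected without the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **recommended_generation_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


print(recommended_generation_kwargs())
```

Keeping the settings in one function makes it easy to reuse them across serving stacks that accept the same parameter names under a different API.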

The model is small enough to be deployed on edge devices, including mobile phones and vehicle-embedded systems, while inference costs are estimated to be 20–70x cheaper than with large models.
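A back-of-the-envelope estimate helps show why a 1.5-billion-parameter model is plausible on such hardware. The sketch below counts weight storage only; it ignores the KV cache, activations, and runtime overhead, and the precisions listed are common choices assumed for illustration, not formats confirmed by the release.

```python
NUM_PARAMS = 1.5e9  # parameter count quoted in the article

# Bytes per parameter for common storage precisions (assumed, not from the release).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for precision, nbytes in BYTES_PER_PARAM.items():
    print(f"{precision}: {weight_memory_gb(NUM_PARAMS, nbytes):.2f} GB")
# fp32: 6.00 GB, fp16/bf16: 3.00 GB, int8: 1.50 GB, int4: 0.75 GB
```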

This positions VibeThinker-1.5B not just as a research achievement, but as a potential foundation for cost-efficient, locally deployable reasoning systems.

Weibo’s Strategy and Market Position

Weibo, launched by Sina Corporation in 2009, remains a cornerstone of China’s social media ecosystem. Often described as China’s version of X (formerly Twitter), the platform blends microblogging, multimedia content, and trending-topic features with a regulatory environment shaped by tight government oversight.

Despite counting 600 million monthly active users (more than twice that of X), investors are not optimistic about its advertising revenue growth potential in the near term, and Weibo is navigating intensifying competition from video-first platforms like Douyin, which are drawing younger users and increasing time spent elsewhere.

In response, Weibo has leaned into creator-economy monetization, live-streaming, and vertical video, adding tools for influencer engagement, e-commerce integration, and richer analytics for brands.

The platform’s role as a digital public square also makes it a focus of regulatory scrutiny. Chinese authorities continue to apply pressure on issues ranging from content governance to data security. In September 2025, Weibo was among the platforms cited in official warnings, highlighting its ongoing exposure to policy risks.

Weibo’s push into AI R&D, exemplified by the release of VibeThinker-1.5B, signals a shift in ambition. Beyond being a media platform, Weibo is positioning itself as a player in the next phase of Chinese AI development, using its capital reserves, user behavior data, and in-house research capacity to pursue adjacent technical domains.

What It Means for Enterprise Technical Decision Makers

For engineering leaders and enterprise AI teams, VibeThinker’s release has practical implications for everything from orchestration pipelines to cost modeling.

A 1.5B-parameter model that outperforms 100x larger models on math and programming tasks doesn’t just save compute; it shifts the architectural balance. It enables LLM inference on constrained infrastructure, reduces latency at the edge, and lowers the barrier to entry for applications that otherwise would have required API access to closed, frontier-scale models.

That matters for enterprise ML leads trying to deploy reasoning-capable agents within existing systems, or for platform owners tasked with integrating LLMs into automated workflows.

It also speaks to those running reinforcement learning from human feedback (RLHF) pipelines or managing inference optimization across hybrid cloud environments.

The model’s post-training methodology, particularly its entropy-targeted reinforcement learning approach, offers a roadmap for teams looking to refine smaller checkpoints instead of relying on large-scale pretraining.

VibeThinker’s benchmark transparency and data decontamination steps also address another emerging priority in enterprise AI: auditability. While its performance on general-knowledge tests still trails large frontier models, its task-specific reliability makes it an attractive candidate for controlled environments where correctness matters more than coverage.

In short, VibeThinker-1.5B isn’t just a research milestone; it’s a strong candidate for practical enterprise deployment and a source of useful lessons. It suggests that a new class of compact, reasoning-optimized models is viable for enterprise use cases that were previously the domain of far larger systems. For organizations trying to balance cost, latency, interpretability, and control, it’s a good new addition to the long, growing list of Chinese open-source offerings.
