LLMs vs. Geolocation: GPT-5 performs worse than other AI models

Madisony
Last updated: August 14, 2025 11:35 am

In June, Bellingcat ran 500 geolocation tests, comparing LLMs from various companies against one another, as well as Google Lens, a staple tool for finding the location of photos.

At the time, ChatGPT o4-mini-high emerged as the clear winner, with Google Lens outperforming most other models. Just two months later, with new versions of these AI tools available, we re-ran the trial, this time adding Google “AI Mode,” GPT-5, GPT-5 Thinking, and Grok 4 to the mix.

These five photos were excluded from our most recent trial as they were published in our previous article.

The original test used 25 of Bellingcat’s own holiday photos. From cities to remote countryside, the images included scenes both with and without recognisable features such as roads, signage, mountains, or architecture. Images were sourced from every continent.

For the updated trial, five test photos were excluded, as they had appeared in a previous article, compromising the integrity of the results.

All 24 models’ responses were ranked on a scale from 0 to 10, with 10 indicating an accurate and specific identification (such as a neighbourhood, path, or landmark) and 0 indicating no attempt to identify the location at all.
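
The scoring scheme above can be illustrated with a short sketch. The article mentions averages but does not describe how scores were aggregated, so the Python below is only an assumption about one simple way to do it. The model names, the scores and the helper names `mean_scores` and `rank_models` are hypothetical placeholders, not Bellingcat's data or method.

```python
from statistics import mean

MIN_SCORE, MAX_SCORE = 0, 10  # the 0-10 scale described in the article


def mean_scores(scores_by_model):
    """Return {model: mean score}, checking every score is on the 0-10 scale."""
    averages = {}
    for model, scores in scores_by_model.items():
        if not scores:
            raise ValueError(f"no scores recorded for {model}")
        for s in scores:
            if not MIN_SCORE <= s <= MAX_SCORE:
                raise ValueError(f"score {s} for {model} is outside 0-10")
        averages[model] = mean(scores)
    return averages


def rank_models(scores_by_model):
    """Return model names sorted from highest to lowest mean score."""
    averages = mean_scores(scores_by_model)
    return sorted(averages, key=averages.get, reverse=True)


# Hypothetical placeholder scores for three made-up models across four tests.
example = {
    "model_a": [10, 7, 0, 5],
    "model_b": [8, 8, 6, 6],
    "model_c": [2, 0, 4, 2],
}

print(mean_scores(example))
print(rank_models(example))
```

A mean hides the spread that the article notes for some models (both better and worse answers), so a fuller comparison might also report the minimum and maximum score per model.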

Google AI Mode was shown to be the most capable geolocation tool overall.

Grok 4 gave both better and worse answers compared with Grok 3 but, on average, scored marginally higher. However, it was still less accurate than older versions of Gemini and GPT.

GPT-5, even in ‘Thinking’ and ‘Pro’ modes, was a considerable downgrade when compared with the capabilities demonstrated by GPT o4-mini-high. In one example, of a city street with skyscrapers in the background, o4-mini-high correctly identified the street, whereas GPT-5 in Thinking mode pointed to the wrong country.

Despite delivering faster answers, GPT-5 appeared to sacrifice accuracy. A surprising number of errors and a general sense of disappointment in the new model have also been reported by other users.

Bellingcat tested GPT-5 and its ‘Thinking’ mode via the Plus subscription, which costs roughly the same as access to o4-mini-high prior to its retirement. Five of the most difficult test images were also run through GPT-5 Pro. But even Pro, with a premium price tag of €200 per month, did not geolocate the photos any more accurately than GPT o4-mini-high.

A Beach, a Resort and a Ferris Wheel

The disparity between Google and the GPT models became even more apparent in Test 25, a photo of a coastal resort in Noordwijk, the Netherlands, with a Ferris wheel rising just beyond the dunes.

Test 25: A photo of Noordwijk beach in the Netherlands. Credit: Bellingcat.

In the previous trial, most older models, including those from GPT, Claude, Gemini and Grok, accurately identified the country as the Netherlands but failed to locate the town. Many latched onto the Ferris wheel but pointed instead to the seaside town of Scheveningen, which also has a Ferris wheel, though situated on a pier, not among the sand dunes.

However, the newest models, GPT-5 Pro and Thinking, were even less accurate, identifying a beach in France, a completely different country.

Unfortunately for open source researchers, following the release of GPT-5, OpenAI removed the option to select older models such as o4-mini-high. After a wave of negative feedback, OpenAI reinstated GPT-4o as the default model for paid subscribers. However, the most capable geolocation models identified in Bellingcat’s testing remain inaccessible.

Google AI Mode, on the other hand, was the first, and so far the only, model to correctly identify Noordwijk as the location in Test 25.

Although AI Mode is powered by a version of Gemini 2.5, it outperformed Gemini 2.5 Pro Deep Research in these tests. Described by Google as its “most powerful AI search, with more advanced reasoning and multimodality,” AI Mode geolocated test images with higher accuracy than any GPT models, including our previous winner, o4-mini-high.

AI Mode is currently only available in India, the United Kingdom and the United States.

The majority of models, at some point, returned a hallucination. Users should not rely solely on the answers provided by LLMs. Even the best options, including Google AI Mode, still, at times, confidently point to the wrong location.

The difference in models’ capabilities compared with just two months ago shows how quickly this field is evolving. However, OpenAI’s recent changes also suggest that progress is not guaranteed, and that AI’s ability to geolocate may plateau or even worsen over time. As new models emerge, Bellingcat will continue to test them.

Thanks to Nathan Patin for contributing to the original benchmark tests.

