Builders & Makers Friday, 27 February 2026

Reverse Engineering ChatGPT - What It Actually Searches


A developer noticed something odd. When you ask ChatGPT a question with search enabled, the answer feels confident and comprehensive. But what is it actually searching for behind the scenes?

The answer, it turns out, is surprisingly different from what you typed. This developer built a Chrome extension to intercept the Server-Sent Events (SSE) stream and reveal the hidden queries. The data is fascinating.
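The SSE framing itself is standard: events arrive as blocks separated by blank lines, each carrying one or more `data:` lines. A minimal Python sketch of the decoding step such an extension performs follows; the `search_query` payload shape here is an assumption for illustration, since ChatGPT's internal event schema is undocumented.

```python
import json

def parse_sse(raw: str) -> list[dict]:
    """Decode a raw Server-Sent Events stream into JSON payloads.

    Events are separated by blank lines; each carries `data:` lines.
    The framing follows the SSE spec, but the JSON shape of ChatGPT's
    payloads is an assumption - we decode whatever parses.
    """
    events = []
    for block in raw.split("\n\n"):
        data_lines = [
            line[len("data:"):].strip()
            for line in block.splitlines()
            if line.startswith("data:")
        ]
        if not data_lines:
            continue
        payload = "\n".join(data_lines)
        if payload == "[DONE]":  # common end-of-stream sentinel
            continue
        try:
            events.append(json.loads(payload))
        except json.JSONDecodeError:
            pass  # ignore non-JSON keep-alives
    return events

# Hypothetical stream illustrating the reformulated queries
stream = (
    'data: {"type": "search_query", "query": "quantum computing breakthroughs 2024"}\n\n'
    'data: {"type": "search_query", "query": "recent quantum computing announcements"}\n\n'
    'data: [DONE]\n\n'
)
queries = [e["query"] for e in parse_sse(stream) if e.get("type") == "search_query"]
```

Once the hidden queries are visible as a list like this, the rest of the analysis is bookkeeping.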

The Reformulation Gap

ChatGPT does not search for what you asked. It reformulates your question into what it thinks will return better results - an average of 8.2 reformulated queries per question.

You ask: "What are the latest developments in quantum computing?"

ChatGPT searches: "quantum computing breakthroughs 2024", "recent quantum computing announcements", "quantum computing research papers January 2025", and five more variations.

The gap between what you ask and what gets searched is what the developer calls the Reformulation Gap. Across the dataset, it averaged 47 per cent: on average, the searched queries shared barely half their wording with the original question.

That is not necessarily bad. ChatGPT is optimising for results, not literalism. But it does mean you are not in control of the search. You are outsourcing query formulation to a model that may or may not share your intent.
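The write-up does not spell out how the gap is scored, so here is one plausible proxy: one minus the average token-overlap (Jaccard) similarity between the original question and each searched query. The metric choice is an assumption; the example numbers are from the quantum computing case above.

```python
import re

def tokens(s: str) -> set[str]:
    """Lowercased word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def token_jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two strings."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def reformulation_gap(question: str, searches: list[str]) -> float:
    """1 - mean similarity: 0.0 means searched verbatim, 1.0 means no overlap."""
    sims = [token_jaccard(question, q) for q in searches]
    return 1.0 - sum(sims) / len(sims)

question = "What are the latest developments in quantum computing?"
gap = reformulation_gap(question, ["quantum computing breakthroughs 2024"])
```

Under this proxy, a verbatim search scores 0.0 and a fully rewritten one approaches 1.0, so a 47 per cent average sits squarely in "loosely related" territory.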

The Consult-to-Cite Ratio

Here is where it gets messier. The extension tracked how many sources ChatGPT consulted versus how many it actually cited in the answer. The ratio was 3.2:1.

For every source ChatGPT cites in its response, it consulted more than two others and chose not to mention them. You are getting a curated view, filtered through the model's judgement of what matters.
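The arithmetic is worth making concrete. A sketch, with hypothetical URL lists standing in for the consulted and cited sources the extension would log:

```python
def consult_to_cite_ratio(consulted: list[str], cited: list[str]) -> float:
    """Ratio of distinct sources consulted to distinct sources cited."""
    return len(set(consulted)) / len(set(cited))

# Hypothetical logs: 16 sources fetched, 5 of them cited -> 3.2:1
consulted = [f"https://example.com/source/{i}" for i in range(16)]
cited = consulted[:5]

ratio = consult_to_cite_ratio(consulted, cited)
uncited = set(consulted) - set(cited)  # the sources you never see
```

The `uncited` set is the interesting part: those are the sources the model read and silently discarded.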

Again, this is not inherently wrong. Humans do the same thing when researching - we consult far more than we cite. But it does mean the confidence you feel reading a ChatGPT answer is not transparency. It is editorial judgement you cannot see.

What This Means for Builders

If you are building applications on top of ChatGPT's search capabilities, this data changes how you should think about reliability.

First, the model is rewriting your queries. If precision matters - legal research, medical information, technical documentation - you need to account for the fact that the system is interpreting intent, not executing instructions.

Second, you are not seeing the full search process. The sources ChatGPT consulted but did not cite might be exactly the ones you needed. There is no way to audit that decision after the fact.

Third, this behaviour varies across platforms. The developer compared ChatGPT, Perplexity, and Claude. Each one reformulates queries differently. Each one has a different consult-to-cite ratio. If you are comparing answers across systems, you are not comparing the same search process.

The Transparency Problem

The real issue here is not that ChatGPT reformulates queries. It is that most users have no idea it is happening.

When you search Google, you see the query you typed. When you search ChatGPT, you see the answer to a query you did not write. That gap is fine for casual use - asking about recipes or travel recommendations. It is less fine when the stakes matter.

The developer who built this extension is not arguing for removing reformulation. The argument is for visibility. Let users see what the model actually searched for. Let them understand why certain sources were cited and others were not.

That kind of transparency is not just useful for power users. It is how you build trust in systems that are making decisions on your behalf.

For now, the extension exists as a proof of concept. It works, but it is fragile - ChatGPT's SSE format could change at any time, breaking the intercept. What would be better is if this kind of visibility were built into the product itself.

Until then, this is a good reminder: when you ask an AI to search for something, you are not just outsourcing the search. You are outsourcing the question itself. Understanding that difference is the first step toward using these tools well.


Today's Sources

DEV.to AI
How I Reverse-Engineered ChatGPT's Hidden Search Behavior with a Chrome Extension
DEV.to AI
How to Build an Agent Skill: A Practical Guide
DEV.to AI
What Happens When IoT Application Development Meets AI at Scale?
Replit Blog
We Built a Video Rendering Engine by Lying to the Browser About What Time It Is
Towards Data Science
Designing Data and AI Systems That Hold Up in Production
The Robot Report
Intrinsic is joining Google to advance physical AI in robotics
Robohub
I developed an app that uses drone footage to track plastic litter on beaches
Hackaday Robotics
Robot Looks Exactly Like a Roll of Filament, If Filament Had Eyes
ROS Discourse
ROSConJP 2026 Call for Talk Proposals Now Open
ROS Discourse
Cerebel Robotics - Founding Engineers (AI / Embedded SW)
ROS Discourse
Looking for Assembled Differential Drive Robot with Camera & LiDAR for ROS2 Nav2
Hacker News Best
Statement from Dario Amodei on our discussions with the Department of War
Gary Marcus
Retired US Air Force General Jack Shanahan on the Anthropic-Pentagon tensions
Latent Space
[AINews] Nano Banana 2 aka Gemini 3.1 Flash Image Preview: the new SOTA Imagegen model
Gary Marcus
Historic statement from Dario Amodei
Latent Space
[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead)
Hacker News Best
Google workers seek 'red lines' on military A.I., echoing Anthropic

About the Curator

Richard Bland
Founder, Marbl Codes

27+ years in software development, curating the tech news that matters.


© 2026 MEM Digital Ltd t/a Marbl Codes