There is something about online search that should bother you more than it probably does:
When you type a question into a search engine or AI model, the results you get back aren't ranked by truth or accuracy. They're ranked by hundreds of visibility signals: how many other sites link to a page, how long the domain has been around, and how well the content is structured for the algorithm, to name a few. But it's what they aren't looking at that is most concerning: accuracy and credibility.
A major new study by AirOps and Kevin Indig, analyzing 16,851 queries and nearly 354,000 pages, has just confirmed what a lot of us have long suspected: the factors that determine whether a source gets cited have little to do with whether that source is credible and have everything to do with whether it's findable, formatted well, and matches the exact words in your query.
So, what does this mean for how we find information and conduct research?
We've been conditioned to treat Google as a first resort for facts. Someone makes a claim at dinner. You pull out your phone. You search. You find something. Case closed.
But search results don't tell you whether a source is credible. They only tell you whether a source is visible.
The AirOps study (currently one of the most comprehensive analyses of how ChatGPT retrieves and cites information) found that domain authority and backlinks show no positive correlation with citation rate. In fact, they're slightly inversely correlated. Pages that are always cited by ChatGPT have a lower domain authority (53) than pages that are never cited (56). Sites with three times as many backlinks are actually less likely to get cited.
What this means is that the signals we've been trained to associate with trustworthiness online, such as reputation, authority, and reach, don't predict whether AI systems will cite a source at all.
So, what does influence AI citations? Retrieval rank. A page at position one in ChatGPT's internal search results gets cited 58% of the time. By position ten, that drops to 14%. A four-times gap based purely on where a page lands in a ranking system.
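The size of that gap can be checked with a few lines of arithmetic. This is only a sketch: the two rates are the figures quoted above, and the variable names are illustrative.

```python
# Citation rates by retrieval position, as quoted from the study above.
citation_rate = {1: 0.58, 10: 0.14}

# Relative gap: how many times more often position 1 is cited than position 10.
ratio = citation_rate[1] / citation_rate[10]

# Absolute gap, in percentage points.
gap_points = (citation_rate[1] - citation_rate[10]) * 100

print(f"Position 1 is cited {ratio:.1f}x as often as position 10")
print(f"Absolute gap: {gap_points:.0f} percentage points")
```

The ratio comes out slightly above four, which is where the "four-times" figure comes from.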
An additional factor that isn’t often mentioned is the cost involved every time a retrieval happens. Providing updated information comes at a cost, so most AI platforms create logic around when and how often this happens. That means that the content they cite can be cached and out of date.
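To see how cost-saving logic can lead to stale citations, here is a minimal sketch of a time-to-live cache. It is an assumption for illustration only, not a description of how any particular AI platform works; the class name `CachedRetriever`, the `fetch` function, and the one-hour window are all hypothetical.

```python
import time


class CachedRetriever:
    """Hypothetical sketch: reuse a fetched page until its time-to-live expires."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self.fetch = fetch        # function that performs the expensive live retrieval
        self.ttl = ttl_seconds    # how long a cached copy is considered fresh enough
        self.clock = clock        # injectable clock, so the behavior is easy to test
        self._cache = {}          # url -> (fetched_at, content)

    def get(self, url):
        now = self.clock()
        entry = self._cache.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]       # served from cache, possibly out of date
        content = self.fetch(url)
        self._cache[url] = (now, content)
        return content


# Usage: the page changes after the first fetch, but the cache keeps
# serving the old copy until the time-to-live has passed.
page = {"text": "version 1"}
fake_now = [0.0]
retriever = CachedRetriever(
    fetch=lambda url: page["text"],
    ttl_seconds=3600,
    clock=lambda: fake_now[0],
)

first = retriever.get("https://example.com/a")
page["text"] = "version 2"      # the source is updated
fake_now[0] = 1800              # half an hour later
stale = retriever.get("https://example.com/a")
fake_now[0] = 3600              # a full hour later
fresh = retriever.get("https://example.com/a")
```

In this sketch, a longer time-to-live means fewer paid retrievals and a longer window in which an outdated page can still be cited.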
The bottom line is this: optimization gets a page cited; credibility doesn't.
Wikipedia is a bit of an anomaly. It achieves the highest citation rate at 59.2% despite having the lowest query match scores. Meanwhile, major news outlets, with high domain authorities and newsrooms full of trained journalists, earn a citation rate of just 32%. This is baffling given that Harvard University and other top institutions actually advise against using Wikipedia as a citable source.
Why does Wikipedia get cited more often? The study suggests the answer is its content density. The average Wikipedia page contains 4,383 words, 31 lists, and 6.6 tables. The structural density is so overwhelming that it overcomes a terrible retrieval position. The AI model doesn't care that Wikipedia is written and edited by volunteers with no editorial accountability. It cares that the page is long, well-structured, and exhaustively covers a topic.
The TL;DR? It's content density, not credibility, that gets Wikipedia cited more often.
By comparison, Reddit has a high domain authority, nearly identical to major news outlets. It has a citation rate of 29.9% and is the single largest source when ChatGPT cites sources from memory. That means when ChatGPT answers a question from what it already "knows," it's treating Reddit threads with essentially the same weight it gives to vetted journalism.
ChatGPT doesn't distinguish between a well-sourced news article and a Reddit thread where someone confidently stated something false. It knows which one appeared more frequently in its training data and which one matched the pattern of the question being asked. This highlights the crux of the issue: credibility is not a consideration in ChatGPT citations.
Another alarming finding of the AirOps study is that what gets a page cited is its structure, not the substance or factual accuracy of its information.
Pages whose headings closely match the user's query are cited 41% of the time, compared to 29% for weak heading matches. Pages with JSON-LD schema markup (a technical formatting signal invisible to readers) earn a 6.5 percentage point citation advantage. College-level writing (Flesch-Kincaid grade 16-17) outperforms simpler writing because AI models appear to treat linguistic complexity as a proxy for expertise.
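For readers unfamiliar with the Flesch-Kincaid grade mentioned above, the sketch below shows how the score is calculated. The formula is the standard published one; the syllable counter is a rough vowel-group heuristic assumed here for brevity, so the scores are approximate. Note that the inputs are only sentence length and syllable count. Nothing in the calculation looks at whether the text is correct.

```python
import re


def count_syllables(word):
    # Rough heuristic (an assumption of this sketch): count runs of
    # consecutive vowels, with a floor of one syllable per word.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    # Split on sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid grade-level formula.
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


simple_grade = flesch_kincaid_grade("The cat sat. The dog ran.")
dense_grade = flesch_kincaid_grade(
    "Institutional credibility necessitates comprehensive editorial accountability."
)
```

Longer sentences and longer words push the grade up, regardless of what the sentences claim.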
None of these signals have anything to do with whether the information is true. The system rewards pages that are precisely formatted to answer one specific question, not pages that have done the deepest, most thorough work.
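To make that concrete, here is roughly what JSON-LD schema markup looks like, generated with a short script. This is an illustrative sketch: the headline, author name, and date are placeholder values, and `Article` is just one of many schema.org types a page might use.

```python
import json

# Placeholder values for illustration; none of these come from a real page.
article_markup = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline that mirrors a likely query",
    "author": {"@type": "Person", "name": "Jane Example"},
    "datePublished": "2025-01-01",
}

# JSON-LD is embedded in a page inside a script tag with this MIME type.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(article_markup, indent=2)
    + "\n</script>"
)

print(script_tag)
```

Every field describes the page: what it is called, who wrote it, when it was published. No field asserts, let alone verifies, that the content is accurate.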
This isn't a criticism of AI systems for being broken. They're doing exactly what they were designed to do: retrieve and organize information at scale. But, "organizing information at scale" is not the same as "verifying whether information is credible." These are completely different problems, and right now, a lot of people are treating them as if they're the same thing.
Search optimizes for visibility. AI optimizes for fluency. AmICredible optimizes for something different: helping you decide whether a claim is credible.
In AmICredible, you're not getting a ranked list of pages. You're not getting a confident summary assembled from training data that treats Reddit threads with the same weight as peer-reviewed research. You're getting an analysis of whether a specific claim is grounded in sourced evidence, with transparent reasoning you can evaluate and push back on.
What AmICredible does is fundamentally different from what a search engine or an AI assistant does. It is designed to evaluate claims along four dimensions of credibility and show you where the evidence comes from. It surfaces sources across the political spectrum that agree on the underlying facts, flags when something is misleading even if it technically contains accurate elements, and links back to the journalism and reporting that underlies the analysis. Credibility doesn't exist in a vacuum; it traces back to the reporters and institutions that did the hard work of verifying information.
Visibility and credibility are not the same. And they shouldn't be confused with each other.
Right now, a lot of people think they're doing their own research when they Google something or ask ChatGPT. But, if the tool you're using rewards heading structure over accuracy, and treats a Reddit thread as a high-authority source because it appeared in training data at the right frequency, you're not verifying the claim. You're just finding more versions of it, delivered confidently.
Real credibility evaluation requires different questions: Who reported this? What evidence supports it? Has it been corroborated? Are sources across the ideological spectrum agreeing on the same underlying facts?
A search ranking is not built to answer these questions. And it's not what an AI summary is built to answer either. But it is what a credibility layer like AmICredible is built to answer.
The AirOps research is a useful reminder that the systems we use to find information were not built to evaluate it. They were built to retrieve it, rank it, and surface it quickly. They do that extremely well.
But citations are not credibility. Heading structure is not accuracy. A high domain authority score is not the same as being right.
The job of evaluating credibility still belongs to us. The question is whether we have tools that actually help us do it.
Sources