GEO GUIDE

Published on March 14, 2026

AI Citation Tracking API: Monitor What Sources LLMs Use and Why It Matters

TL;DR: When ChatGPT or Perplexity recommends a brand, it often cites sources. Those citations are gold — they tell you exactly WHY the AI trusts that brand and WHERE it got its information. The sellm API lets you track every cited URL and domain programmatically.

AI search engines don't just generate answers from thin air. When ChatGPT recommends your brand as "the best CRM for startups," it's drawing on specific sources — review sites, documentation pages, blog posts, news articles. Some providers, like Perplexity, show those sources explicitly with inline citations. Others, like ChatGPT, include cited URLs in their responses. These citations reveal the evidence trail behind every AI recommendation.

For anyone doing Generative Engine Optimization, citations are the most actionable data point available. They tell you which content the AI model actually trusts, which domains carry authority in the model's view, and what type of content gets referenced when users ask buying-intent questions. Pairing citation data with sentiment analysis reveals not just which sources the AI trusts, but how those sources shape the AI's opinion of your brand. If you're not tracking citations, you're missing the "why" behind your AI visibility.

What Are AI Citations?

When an AI search engine responds to a query, it may include references to external sources. These citations take different forms depending on the provider, but they generally contain the same core information: the URL that was referenced, the domain it belongs to, and sometimes a label or description of the source.

In the sellm API, citation data is captured in two key fields on every analysis result:

citedUrls

An array of specific URLs that the AI provider referenced in its response. Each entry includes the full URL, a label (the text used to describe or link to the source), and the position in the response where the citation appeared. For providers like Perplexity that use inline citations, these map directly to the numbered source references. For ChatGPT, these are extracted from any URLs included in the response body.

{
  "citedUrls": [
    {
      "url": "https://www.g2.com/products/acme-crm/reviews",
      "label": "G2 Reviews - Acme CRM",
      "position": 1
    },
    {
      "url": "https://www.acme.com/blog/enterprise-crm-guide",
      "label": "Acme Blog - Enterprise CRM Guide",
      "position": 3
    },
    {
      "url": "https://techcrunch.com/2026/01/acme-crm-series-b",
      "label": "TechCrunch - Acme CRM raises Series B",
      "position": 5
    }
  ]
}

citedDomains

A deduplicated list of domains that appeared in the cited URLs. This is useful for higher-level analysis — understanding which domains the AI model treats as authoritative for a given topic, without getting into the specifics of individual pages.

{
  "citedDomains": [
    "g2.com",
    "acme.com",
    "techcrunch.com"
  ]
}

Together, these two fields give you a complete picture of the AI's evidence trail. You can see both the specific pages that influenced the response and the broader domain-level authority patterns.
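
To make the relationship between the two fields concrete, here is a minimal sketch that derives a deduplicated domain list from citedUrls entries. It assumes that stripping a leading "www." approximates how citedDomains is produced; the actual normalization performed by the API is not described in this guide, and domainsFromCitedUrls is a hypothetical helper name.

```javascript
// Sketch: derive a deduplicated domain list from citedUrls entries.
// Assumption: removing a leading "www." approximates the API's own normalization.
function domainsFromCitedUrls(citedUrls) {
  const seen = new Set();
  for (const { url } of citedUrls) {
    let hostname;
    try {
      hostname = new URL(url).hostname;
    } catch {
      continue; // skip entries that are not valid absolute URLs
    }
    seen.add(hostname.replace(/^www\./, ""));
  }
  return [...seen]; // a Set keeps first-seen order
}

const sampleCitedUrls = [
  { url: "https://www.g2.com/products/acme-crm/reviews", label: "G2 Reviews - Acme CRM", position: 1 },
  { url: "https://www.acme.com/blog/enterprise-crm-guide", label: "Acme Blog - Enterprise CRM Guide", position: 3 },
  { url: "https://techcrunch.com/2026/01/acme-crm-series-b", label: "TechCrunch - Acme CRM raises Series B", position: 5 },
];

console.log(domainsFromCitedUrls(sampleCitedUrls));
// → [ 'g2.com', 'acme.com', 'techcrunch.com' ]
```

In practice you would read citedDomains directly from the response; a helper like this is mainly useful for cross-checking or for grouping by subdomain instead.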

Why Citations Matter for GEO

Citations are more than metadata. They're the mechanism through which AI models establish trust, and they have direct implications for your brand's visibility strategy.

Citations Are Trust Signals

When an AI model cites a source alongside a brand recommendation, it's signaling that the source contributed to its confidence in that recommendation. If ChatGPT consistently cites G2 reviews when recommending your competitor, that tells you the model trusts G2 as an authority in your category. To improve your own visibility, you need strong presence on the same authoritative domains.

Citations Drive Real Traffic

On platforms like Perplexity, citations are clickable links displayed alongside the response. Users who want to verify or explore the AI's recommendation click through to the cited sources. If your content is cited, you receive direct referral traffic from AI search. This is a growing traffic channel that most analytics tools don't yet track separately from organic search.

Citations Influence Future Training

AI models are periodically retrained or fine-tuned on new data. Content that gets cited frequently in AI responses signals relevance and authority, which can influence how future model versions treat your brand. Getting cited today creates a compounding advantage over time — your content becomes part of the evidence base that future models draw from.

How to Track Citations with the Sellm API

The sellm API returns citation data as part of every analysis result. When you pull results for a completed run, each result object includes the citedUrls and citedDomains fields alongside the standard visibility metrics.

Step 1: Submit an Analysis

Start by submitting a prompt for asynchronous analysis. The API will query the AI providers you specify and return citation data alongside visibility metrics:

curl -X POST "https://sellm.io/api/v1/async-analysis" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "best HR software for remote teams",
    "providers": ["chatgpt", "perplexity", "claude"],
    "locations": ["US"],
    "replicates": 3
  }'
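
The same request can be sketched in JavaScript. The endpoint, headers, and body fields mirror the curl command above; submitAnalysis and its fetchImpl parameter are hypothetical names introduced here so the function can be exercised with a stub instead of the network, and nothing is assumed about the shape of the response body.

```javascript
// Sketch: submit an analysis with fetch (global in Node 18+ and browsers).
// `fetchImpl` is a hypothetical injection point, not part of the sellm API.
async function submitAnalysis(apiKey, payload, fetchImpl = fetch) {
  const response = await fetchImpl("https://sellm.io/api/v1/async-analysis", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
  if (!response.ok) {
    throw new Error(`Submit failed with HTTP ${response.status}`);
  }
  return response.json(); // returned as-is; the body shape is not assumed here
}

// Usage (inside an async function or ES module):
// const body = await submitAnalysis("YOUR_API_KEY", {
//   prompt: "best HR software for remote teams",
//   providers: ["chatgpt", "perplexity", "claude"],
//   locations: ["US"],
//   replicates: 3,
// });
```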

Step 2: Poll for Results

Use the returned analysisId to poll until the status is "succeeded":

curl -X GET "https://sellm.io/api/v1/async-analysis/{analysisId}" \
  -H "Authorization: Bearer YOUR_API_KEY"
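
The polling step can be sketched as a small loop. This assumes the status is found at data.status in the response body, as in the sample response in this guide. The intervalMs, maxAttempts, and fetchImpl options are hypothetical knobs rather than API parameters, and because other status values are not listed here, the sketch treats anything other than "succeeded" as still pending.

```javascript
// Sketch: poll an analysis until data.status is "succeeded".
// Assumption: any other status means "still pending"; failure statuses are not documented here.
async function pollAnalysis(analysisId, apiKey, options = {}) {
  const { intervalMs = 5000, maxAttempts = 60, fetchImpl = fetch } = options;
  const url = `https://sellm.io/api/v1/async-analysis/${encodeURIComponent(analysisId)}`;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(url, {
      headers: { "Authorization": `Bearer ${apiKey}` },
    });
    if (!response.ok) {
      throw new Error(`Polling failed with HTTP ${response.status}`);
    }
    const body = await response.json();
    if (body.data && body.data.status === "succeeded") {
      return body;
    }
    if (attempt < maxAttempts) {
      // Wait before the next attempt
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Analysis ${analysisId} did not succeed after ${maxAttempts} attempts`);
}

// Usage (inside an async function or ES module):
// const analysis = await pollAnalysis("YOUR_ANALYSIS_ID", "YOUR_API_KEY");
```

Capping the number of attempts keeps a stuck analysis from looping forever; tune the interval to how long your runs typically take.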

Once the status is "succeeded", the response contains everything: summary, providerBreakdown, promptBreakdown, and results[]. Each result includes the prompt that was sent, the provider that responded, the brand mentions and rankings, and the full citation data. The example below is abridged to the summary and a single result:

{
  "data": {
    "status": "succeeded",
    "summary": {
      "sovPct": 22,
      "coveragePct": 67,
      "avgPos": 2.3,
      "sentiment": 0.74
    },
    "results": [
      {
        "prompt": "best HR software for remote teams",
        "provider": "chatgpt",
        "position": 2,
        "mentioned": true,
        "sentiment": 0.78,
        "citedUrls": [
          {
            "url": "https://www.g2.com/categories/hr-software",
            "label": "G2 - Best HR Software",
            "position": 1
          },
          {
            "url": "https://www.bamboohr.com/hr-software/",
            "label": "BambooHR - HR Software",
            "position": 2
          }
        ],
        "citedDomains": ["g2.com", "bamboohr.com"]
      }
    ]
  }
}

Different AI providers cite sources differently. Perplexity is the most citation-heavy, often including 5-10 source references per response — making it especially valuable to monitor with a Perplexity Tracker API. ChatGPT and Claude cite less frequently but still include URLs in many responses. You can filter the results[] array by provider to analyze citation patterns per platform.
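
Grouping results[] by provider might look like the sketch below. It relies only on the provider and citedUrls fields shown in the sample response; citationStatsByProvider is a hypothetical helper name, and the sample data is made up for illustration.

```javascript
// Sketch: count responses and cited URLs per provider, then average them.
function citationStatsByProvider(results) {
  const stats = {};
  for (const result of results) {
    if (!stats[result.provider]) {
      stats[result.provider] = { responses: 0, citedUrls: 0 };
    }
    stats[result.provider].responses += 1;
    stats[result.provider].citedUrls += (result.citedUrls || []).length;
  }
  for (const entry of Object.values(stats)) {
    entry.avgCitations = entry.citedUrls / entry.responses;
  }
  return stats;
}

// Illustrative data only, not real API output
const sampleResults = [
  { provider: "chatgpt", citedUrls: [{ url: "https://a.example/1" }, { url: "https://b.example/2" }] },
  { provider: "perplexity", citedUrls: [{ url: "https://a.example/1" }, { url: "https://c.example/3" }, { url: "https://d.example/4" }] },
  { provider: "perplexity", citedUrls: [{ url: "https://a.example/1" }] },
  { provider: "claude" }, // a response with no citations
];

console.log(citationStatsByProvider(sampleResults));
// chatgpt: 1 response, 2 URLs, avg 2; perplexity: 2 responses, 4 URLs, avg 2; claude: 1 response, 0 URLs, avg 0
```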

Citation Analysis: What to Look For

Raw citation data becomes valuable when you analyze it for patterns. Here are the key analyses that reveal actionable insights about your AI visibility.

Which Domains Get Cited Most

Aggregate citedDomains across all results in a run to build a frequency table. The domains that appear most often are the ones AI models treat as authoritative for your category. Common patterns include:

Review platforms (G2, Capterra, Trustpilot) — AI models lean heavily on aggregated user reviews for product recommendations
Industry publications (TechCrunch, Forbes, industry-specific blogs) — editorial coverage signals credibility
Official brand sites — product documentation and feature pages get cited for factual claims
Comparison sites — pages that directly compare products in a category are frequently referenced
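
To roll a domain frequency table up into the categories above, one option is a small lookup table. The table below is hypothetical and deliberately short: the categories follow the list above, but the domain assignments are illustrative and need to be extended for your own category. In this sketch only your own domain counts as an official brand site, and comparison sites are not detected at all, so both competitor sites and comparison sites fall under "other" unless you add them.

```javascript
// Hypothetical lookup table; extend it with the domains that matter in your category.
const DOMAIN_CATEGORIES = new Map([
  ["g2.com", "review platform"],
  ["capterra.com", "review platform"],
  ["trustpilot.com", "review platform"],
  ["techcrunch.com", "industry publication"],
  ["forbes.com", "industry publication"],
]);

// Sketch: count citations per category across all results in a run.
function countByCategory(results, ownDomain) {
  const counts = {};
  for (const result of results) {
    for (const domain of result.citedDomains || []) {
      const category = domain === ownDomain
        ? "official brand site"
        : DOMAIN_CATEGORIES.get(domain) || "other";
      counts[category] = (counts[category] || 0) + 1;
    }
  }
  return counts;
}

// Illustrative data only
const categorySample = [
  { citedDomains: ["g2.com", "bamboohr.com"] },
  { citedDomains: ["g2.com", "techcrunch.com", "example.org"] },
];

console.log(countByCategory(categorySample, "bamboohr.com"));
// → { 'review platform': 2, 'official brand site': 1, 'industry publication': 1, other: 1 }
```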

Which Content Types Get Cited

Look at the URL paths in citedUrls to identify what type of content AI models prefer. You'll typically find that certain content formats are disproportionately cited:

Product comparison pages and "best of" roundups
Detailed feature documentation with structured data
Case studies with specific metrics and outcomes
FAQ pages that directly answer common questions
Original research and data-driven reports
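
A rough way to bucket cited URLs into these formats is to match keywords in the URL path. The rules below are hypothetical heuristics, not something the API provides; they will misclassify pages whose paths are not descriptive, so treat the output as a starting point for manual review.

```javascript
// Hypothetical path heuristics: ordered [label, pattern] pairs, first match wins.
const CONTENT_TYPE_RULES = [
  ["comparison", /(^|[\/-])(best|vs|compare|comparison|alternatives)([\/-]|$)/],
  ["documentation", /\/(docs|documentation|features)(\/|$)/],
  ["case study", /case-stud(y|ies)/],
  ["faq", /\/(faq|faqs|help)(\/|$)/],
  ["research", /(report|research|survey|benchmark)/],
];

// Sketch: classify a cited URL by its path; unmatched paths become "other".
function classifyContentType(url) {
  const path = new URL(url).pathname.toLowerCase();
  for (const [label, pattern] of CONTENT_TYPE_RULES) {
    if (pattern.test(path)) return label;
  }
  return "other";
}

console.log(classifyContentType("https://example.com/blog/best-crm-for-startups")); // → comparison
console.log(classifyContentType("https://example.com/docs/api/authentication"));    // → documentation
console.log(classifyContentType("https://example.com/pricing"));                    // → other
```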

Citation Freshness Signals

Some citations include dates or come from time-stamped content. Tracking which citations are recent vs. older reveals how much AI models weight content freshness. If the model consistently cites articles from the last 6 months, that tells you publishing frequency matters. If it cites evergreen documentation pages regardless of age, that signals that depth and authority outweigh recency for your category.
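
One inexpensive freshness signal is a date embedded in the URL path. The sketch below assumes a /YYYY/MM/ segment, as in the TechCrunch-style sample URL earlier in this guide; pages without such a segment are reported as undated rather than guessed at. The helper names and the six-month default are illustrative choices.

```javascript
// Sketch: pull a year and month out of a /YYYY/MM/ path segment, if present.
function extractUrlDate(url) {
  const match = new URL(url).pathname.match(/\/((?:19|20)\d{2})\/(0[1-9]|1[0-2])(?:\/|$)/);
  if (!match) return null;
  return { year: Number(match[1]), month: Number(match[2]) };
}

// Sketch: bucket cited URLs as recent, older, or undated relative to a reference date.
function bucketByFreshness(citedUrls, referenceDate, recentMonths = 6) {
  const refIndex = referenceDate.getUTCFullYear() * 12 + referenceDate.getUTCMonth();
  const buckets = { recent: [], older: [], undated: [] };
  for (const { url } of citedUrls) {
    const date = extractUrlDate(url);
    if (!date) {
      buckets.undated.push(url);
      continue;
    }
    const ageInMonths = refIndex - (date.year * 12 + (date.month - 1));
    (ageInMonths <= recentMonths ? buckets.recent : buckets.older).push(url);
  }
  return buckets;
}

const freshnessSample = [
  { url: "https://techcrunch.com/2026/01/acme-crm-series-b" },
  { url: "https://example.com/2024/11/old-roundup" },
  { url: "https://www.g2.com/products/acme-crm/reviews" },
];

console.log(bucketByFreshness(freshnessSample, new Date(Date.UTC(2026, 2, 14))));
// recent: the 2026/01 article; older: the 2024/11 article; undated: the G2 page
```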

How to Get Your Content Cited by AI

Understanding citations is only half the equation. The real value comes from optimizing your content to earn citations. Based on patterns across thousands of AI responses, here are the strategies that consistently lead to more citations.

Use Structured Data

AI models are trained on web content, and structured data (Schema.org markup, JSON-LD) makes it easier for models to understand and reference your content. Pages with clear Product, FAQ, Review, and HowTo schema are more likely to be cited because the model can extract specific claims with confidence.

Build Comprehensive FAQ Pages

AI search queries are conversational questions. When your website has a well-structured FAQ page that directly answers those questions, AI models can cite your page as a source for specific answers. Make sure your FAQ content is detailed, not just one-line responses — the model needs enough context to trust the source.
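
For reference, FAQ markup in JSON-LD can be generated from your existing question and answer pairs. This is a minimal sketch using the Schema.org FAQPage, Question, and Answer types; buildFaqJsonLd and toJsonLdScriptTag are hypothetical helper names, and the sample question is taken from the FAQ at the end of this guide.

```javascript
// Sketch: build a Schema.org FAQPage object from { question, answer } pairs.
function buildFaqJsonLd(faqs) {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: faqs.map(({ question, answer }) => ({
      "@type": "Question",
      name: question,
      acceptedAnswer: { "@type": "Answer", text: answer },
    })),
  };
}

// Serialize for embedding in a page.
function toJsonLdScriptTag(jsonLd) {
  // Escape "<" so answer text cannot close the script element early.
  const json = JSON.stringify(jsonLd, null, 2).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">\n${json}\n</script>`;
}

const faqJsonLd = buildFaqJsonLd([
  {
    question: "What's the difference between citedUrls and citedDomains?",
    answer: "citedUrls contains the specific page URLs that were referenced. citedDomains is a deduplicated list of the root domains from those URLs.",
  },
]);

console.log(toJsonLdScriptTag(faqJsonLd));
```

Generating the markup from the same data that renders the visible FAQ keeps the two from drifting apart.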

Publish Original Research

Data and statistics are heavily cited by AI models because they provide concrete evidence for claims. If you publish original research — benchmark reports, industry surveys, usage statistics — AI models will reference your findings when answering related questions. This is one of the highest-leverage activities for earning citations.

Be Present on Authoritative Domains

AI models don't just cite your website. They cite reviews on G2, mentions in TechCrunch articles, comparisons on industry blogs, and discussions on forums. Building your presence across authoritative third-party domains increases the total number of citable sources that mention your brand, which increases the likelihood of being cited in AI responses.

Keep Content Fresh

Regularly update your key pages with current information, recent statistics, and new case studies. AI models are periodically retrained on newer data, and pages that are recently updated tend to be cited more frequently than stale content. Add "last updated" dates to your pages so the freshness signal is clear.

Monitoring Competitor Citations

Citation tracking isn't just about your own brand. Monitoring which sources AI models cite when recommending your competitors reveals opportunities you might otherwise miss.

When you set up competitors in your sellm project and run analysis, the results include citation data for all brand mentions — not just yours. If ChatGPT cites a specific Forbes article when recommending your competitor, that tells you:

Forbes is an authoritative domain in your category according to the model
Your competitor has earned editorial coverage that you may lack
Getting similar coverage on Forbes (or comparable publications) could improve your own AI visibility

Over time, tracking competitor citations reveals the full "source landscape" for your category. You can identify which domains carry the most weight with AI models, where your competitors are investing in content and PR, and where gaps exist that you can fill with your own content. Combined with share-of-voice (SOV) tracking, citation analysis helps you prioritize which content investments are most likely to improve visibility.

// Example: count cited domains across the results of an async analysis
// (top-level await requires running this as an ES module)
const analysisId = "YOUR_ANALYSIS_ID"; // the analysisId returned in Step 1
const response = await fetch(
  `https://sellm.io/api/v1/async-analysis/${analysisId}`,
  { headers: { "Authorization": "Bearer YOUR_API_KEY" } }
);
if (!response.ok) throw new Error(`Request failed: HTTP ${response.status}`);
const analysis = await response.json();

// Build a map of cited domain -> number of results that cited it
const domainCounts = {};
for (const result of analysis.data.results) {
  // Only count results where the `mentioned` flag is true
  if (!result.mentioned) continue;
  for (const domain of result.citedDomains || []) {
    domainCounts[domain] = (domainCounts[domain] || 0) + 1;
  }
}

// Sort by frequency to find the most-cited domains
const ranked = Object.entries(domainCounts)
  .sort((a, b) => b[1] - a[1]);
console.log("Most cited domains in your category:", ranked);
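
A complementary view is to look for gaps: domains that are cited only in responses where your brand does not appear. The sketch below assumes the mentioned flag on each result refers to your tracked brand, which this guide does not state explicitly; findCitationGaps is a hypothetical helper name and the sample data is illustrative.

```javascript
// Sketch: domains cited only in responses where the tracked brand is absent.
// Assumption: result.mentioned describes your own brand, not competitors.
function findCitationGaps(results) {
  const citedWithBrand = new Set();
  const citedWithoutBrand = new Map();
  for (const result of results) {
    for (const domain of result.citedDomains || []) {
      if (result.mentioned) {
        citedWithBrand.add(domain);
      } else {
        citedWithoutBrand.set(domain, (citedWithoutBrand.get(domain) || 0) + 1);
      }
    }
  }
  return [...citedWithoutBrand.entries()]
    .filter(([domain]) => !citedWithBrand.has(domain))
    .sort((a, b) => b[1] - a[1]);
}

// Illustrative data only
const gapSample = [
  { mentioned: true, citedDomains: ["g2.com", "bamboohr.com"] },
  { mentioned: false, citedDomains: ["g2.com", "forbes.com", "capterra.com"] },
  { mentioned: false, citedDomains: ["forbes.com"] },
];

console.log(findCitationGaps(gapSample));
// → [ [ 'forbes.com', 2 ], [ 'capterra.com', 1 ] ]
```

Domains at the top of this list are cited when the AI answers without you, which makes them candidates for outreach or new content.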

Pricing

Citation tracking is included in every sellm plan at no additional cost. Every analysis result — accessible through the LLM Mentions API — includes full citation data: citedUrls, citedDomains, and associated metadata. You don't pay extra for citation-level detail.

The cost of monitoring comes down to your prompt volume. Each prompt analysis costs less than 1 cent.

Frequently Asked Questions

Which AI providers include citations in their responses?

Perplexity is the most citation-rich, including numbered source references in nearly every response. ChatGPT includes URLs in many responses, especially for factual or product-related queries. Claude, Gemini, and Grok cite sources less consistently but still include them in certain contexts. Sellm captures citation data from all providers whenever it's present. You can also monitor Google AI Overviews to track citation patterns in Google's AI-generated answers.

Do I need a paid plan to access citation data?

No separate or higher-tier plan is needed. Citation data (citedUrls and citedDomains) is included in every sellm plan and in every API response at no extra cost.

Can I see which sources are cited for my competitors?

Yes. When you configure competitors in your sellm project, analysis results include citation data for all brand mentions in the response, not just your own brand. This lets you see which sources AI models reference when recommending competitors, giving you a roadmap for improving your own visibility.

How often should I check citation data?

Citation patterns tend to shift gradually as AI models are updated and new content gets indexed. Weekly monitoring is sufficient for most brands. The sellm API makes it easy to pull citation data after each scheduled run and compare it against previous weeks to spot trends.
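
A week-over-week comparison can be as simple as diffing two domain frequency tables. This sketch assumes you have already reduced each run to a plain object mapping domain to citation count; diffDomainCounts is a hypothetical helper name and the counts are made up.

```javascript
// Sketch: compare two { domain: count } tables and report what changed.
function diffDomainCounts(previous, current) {
  const domains = new Set([...Object.keys(previous), ...Object.keys(current)]);
  const changes = [];
  for (const domain of domains) {
    const before = previous[domain] || 0;
    const after = current[domain] || 0;
    if (before !== after) {
      changes.push({ domain, before, after, delta: after - before });
    }
  }
  // Largest movements first, whether gains or losses
  return changes.sort((a, b) => Math.abs(b.delta) - Math.abs(a.delta));
}

// Illustrative counts only
const lastWeek = { "g2.com": 5, "forbes.com": 2, "capterra.com": 1 };
const thisWeek = { "g2.com": 5, "forbes.com": 4, "reddit.com": 1 };

console.log(diffDomainCounts(lastWeek, thisWeek));
// forbes.com: 2 → 4 (+2), capterra.com: 1 → 0 (-1), reddit.com: 0 → 1 (+1)
```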

What's the difference between citedUrls and citedDomains?

citedUrls contains the specific page URLs that were referenced, including the full path, label text, and position in the response. citedDomains is a deduplicated list of just the root domains from those URLs. Use citedUrls for page-level analysis and citedDomains for domain-level authority analysis.