GUIDE

Published on March 14, 2026

Extract Brand Sentiment from LLM Responses via API

Key insight: AI engines don't just mention your brand - they express opinions about it. Sellm's 4-dimensional sentiment analysis quantifies how ChatGPT, Claude, and Perplexity perceive your brand across trustworthiness, authority, recommendation strength, and fit for query intent.

When someone asks an AI assistant "What's the best project management tool for remote teams?", the response does more than list brands. It frames each brand with implicit sentiment: one tool might be described as "widely trusted," another as "a strong contender," and a third as "worth considering but limited in scope." These subtle differences in how AI engines talk about your brand directly affect whether users click through, sign up, or move on to a competitor.

Sellm's sentiment analysis breaks down these qualitative signals into four measurable dimensions, each scored from 0 to 1. You can retrieve these scores through the API for every prompt, provider, and run - giving you a structured, trackable view of how AI perceives your brand over time.

What Is 4-Dimensional Sentiment?

Traditional sentiment analysis gives you a single positive/negative/neutral label. That's not enough for AI search optimization. When ChatGPT says your product is "reliable but expensive," a single sentiment score hides the nuance. Sellm decomposes brand sentiment into four independent dimensions that capture what AI engines actually communicate about your brand:

Trustworthiness (0-1)

Measures how much the AI response conveys that your brand can be relied upon. High trustworthiness scores correlate with language like "well-established," "proven track record," "trusted by thousands of teams," and "reliable choice." Low scores appear when the AI hedges with phrases like "relatively new," "limited reviews available," or "some users have reported issues."

A score of 0.0 means the response expresses no trust signals about your brand. A score of 1.0 means the AI presents your brand as highly dependable and credible.

Authority (0-1)

Captures whether the AI positions your brand as a leader or expert in its space. High authority scores appear when responses reference your brand as "industry-leading," "the standard," "used by top companies like X and Y," or "pioneered the approach." Low scores indicate the AI treats your brand as a follower or niche player - "one of several options," "a smaller alternative," or simply listing it without distinction.

A score of 0.0 means no authority signals. A score of 1.0 means the AI frames your brand as the definitive authority in the category.

Recommendation Strength (0-1)

Quantifies how strongly the AI actually recommends your brand to the user. This is the most direct measure of commercial intent in the response. High scores correspond to explicit recommendations: "I'd recommend," "your best bet," "the top choice for this use case." Low scores appear with passive mentions, caveats, or when the AI recommends competitors more enthusiastically.

A score of 0.0 means the brand is mentioned without any recommendation. A score of 1.0 means the AI gives an explicit, strong endorsement.

Fit for Query Intent (0-1)

Evaluates how well the AI believes your brand matches what the user is actually asking for. This dimension is query-specific: the same brand can score 0.9 for "best enterprise CRM" and 0.3 for "simple CRM for freelancers." High fit scores mean the AI explicitly connects your brand's capabilities to the user's stated needs. Low scores indicate the AI sees a mismatch between what the user wants and what your brand offers.

A score of 0.0 means the AI considers your brand irrelevant to the query. A score of 1.0 means perfect alignment between brand capabilities and query intent.

Interpreting Sentiment Scores

Each dimension is scored independently, so you'll often see asymmetric profiles. A brand might score high on authority (0.85) but low on recommendation strength (0.4) - meaning the AI acknowledges the brand as a leader but doesn't actively push users toward it. Understanding these patterns is where the real strategic value lies.
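One way to act on these asymmetric profiles is to flag them programmatically. The sketch below detects the "acknowledged leader, weak push" pattern described above; the `leader_weak_push` helper and its thresholds are illustrative assumptions, not Sellm-defined values.

```python
# Illustrative heuristic: flag profiles where authority is high but
# recommendation strength lags (AI treats the brand as a leader but
# doesn't push users toward it). Thresholds are assumptions.
def leader_weak_push(dims: dict, auth_min: float = 0.7, rec_max: float = 0.5) -> bool:
    return dims["authority"] >= auth_min and dims["recommendation_strength"] <= rec_max

profile = {"trustworthiness": 0.70, "authority": 0.85,
           "recommendation_strength": 0.40, "fit_for_query_intent": 0.60}
print(leader_weak_push(profile))  # True
```

Similar checks can be written for other patterns, such as high fit with low trustworthiness.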

The most actionable insight comes from comparing your scores against competitors for the same prompt and provider. If a competitor scores 0.8 on recommendation strength where you score 0.5, that gap represents real lost conversions from AI search.
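Competitor gaps like this are easy to compute once you have both brands' dimension scores. The following sketch uses hardcoded dicts that mirror the sentimentDimensions shape returned by the API; `dimension_gaps` is an illustrative helper name.

```python
# Illustrative helper: compare your sentimentDimensions against a
# competitor's for the same prompt and provider.
def dimension_gaps(yours: dict, competitor: dict) -> dict:
    """Positive values mean the competitor leads on that dimension."""
    return {dim: round(competitor[dim] - yours[dim], 2) for dim in yours}

yours = {"trustworthiness": 0.72, "authority": 0.65,
         "recommendation_strength": 0.5, "fit_for_query_intent": 0.81}
rival = {"trustworthiness": 0.70, "authority": 0.60,
         "recommendation_strength": 0.8, "fit_for_query_intent": 0.75}

gaps = dimension_gaps(yours, rival)
print(gaps["recommendation_strength"])  # 0.3 - the competitor's edge
```

A positive gap on recommendation strength is usually the first one worth closing, since it maps most directly to lost conversions.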

Using Sentiment to Guide Content Strategy

Each sentiment dimension maps to specific content strategies you can execute to improve how AI engines perceive your brand. Here's the playbook:

Low Trustworthiness: Add Citations and Social Proof

AI models build trust assessments from the training data and web content they've ingested. If your trustworthiness score is low, the AI hasn't encountered enough credibility signals about your brand. Counter this with citations and social proof: customer case studies, reviews on established third-party platforms, and mentions from reputable publications all add the credibility signals models pick up.

Low Authority: Get Industry Publications and Thought Leadership

Authority comes from being recognized by credible third parties as a leader. If your authority score lags, pursue coverage in industry publications and sustained thought leadership that positions your brand as a category expert rather than one option among many.

Low Recommendation Strength: Address Objections Directly

When AI models acknowledge your brand but don't recommend it, there's usually a specific blocker in the training data - price concerns, missing features, usability issues, or unresolved complaints. To strengthen recommendations, identify the most likely objection and address it head-on in your public content.

Low Fit for Query Intent: Align Content with Target Queries

Fit scores are query-specific, so low fit means the AI doesn't connect your brand to the specific need expressed in the prompt. This is the most tactical dimension: publish content that explicitly maps your brand's capabilities to the target queries where you want to appear.

Tracking Sentiment Trends via API

Sentiment scores are most valuable when tracked over time. A single snapshot tells you where you stand; a trend tells you whether your content strategy is working. The Sellm API makes it straightforward to build sentiment tracking into your existing workflows.

Start by submitting a prompt for asynchronous analysis:

curl -X POST "https://sellm.io/api/v1/async-analysis" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "best design tools for agencies",
    "providers": ["chatgpt", "claude", "perplexity"],
    "locations": ["US"],
    "replicates": 3
  }'

Then poll the returned analysisId until the status is "succeeded":

curl -X GET "https://sellm.io/api/v1/async-analysis/{analysisId}" \
  -H "Authorization: Bearer YOUR_API_KEY"

The response includes everything in one call: a summary with aggregate KPIs, providerBreakdown, promptBreakdown with per-prompt sentiment dimensions, and a results[] array with the individual raw results. Here's what the sentiment-relevant portions look like:

{
  "data": {
    "status": "succeeded",
    "summary": {
      "sovPct": 18,
      "coveragePct": 55,
      "avgPos": 3.1,
      "sentiment": 0.72
    },
    "promptBreakdown": [
      {
        "prompt": "best design tools for agencies",
        "sovPct": 18,
        "avgPos": 3.1,
        "sentiment": 0.72,
        "sentimentDimensions": {
          "trustworthiness": 0.72,
          "authority": 0.65,
          "recommendation_strength": 0.58,
          "fit_for_query_intent": 0.81
        }
      }
    ],
    "providerBreakdown": {
      "sentimentByProvider": [
        { "provider": "ChatGPT", "sentiment": 0.78 },
        { "provider": "Claude", "sentiment": 0.68 }
      ]
    },
    "results": [...]
  }
}

To build a trend over time, submit the same prompt periodically and compare the sentimentDimensions in the promptBreakdown across analyses. Here's a Python example:

import requests
import time

API_KEY = "your_api_key"
BASE_URL = "https://sellm.io/api/v1"
headers = {"Authorization": f"Bearer {API_KEY}"}

def run_analysis(prompt):
    """Submit and poll an async analysis."""
    resp = requests.post(
        f"{BASE_URL}/async-analysis",
        headers={**headers, "Content-Type": "application/json"},
        json={
            "prompt": prompt,
            "providers": ["chatgpt", "claude", "perplexity"],
            "locations": ["US"],
            "replicates": 3,
        },
    )
    resp.raise_for_status()
    analysis_id = resp.json()["data"]["analysisId"]

    for _ in range(40):
        time.sleep(15)
        result = requests.get(
            f"{BASE_URL}/async-analysis/{analysis_id}",
            headers=headers,
        ).json()
        if result["data"]["status"] == "succeeded":
            return result["data"]
        if result["data"]["status"] == "failed":
            raise RuntimeError("Analysis failed")
    raise TimeoutError("Analysis did not complete in time")

# Run analysis and extract sentiment
data = run_analysis("best design tools for agencies")
for pb in data["promptBreakdown"]:
    dims = pb["sentimentDimensions"]
    print(f"Prompt: {pb['prompt']}")
    print(f"  trust={dims['trustworthiness']:.2f} "
          f"auth={dims['authority']:.2f} "
          f"rec={dims['recommendation_strength']:.2f} "
          f"fit={dims['fit_for_query_intent']:.2f}")
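Once you have two runs of the same prompt, comparing them is a simple diff over the dimension dicts. The sketch below uses hardcoded dicts that mirror a promptBreakdown entry's sentimentDimensions (in practice, run_analysis above would supply them); `sentiment_deltas` is an illustrative helper name.

```python
# Illustrative sketch: compare sentimentDimensions from two analyses of
# the same prompt to see which dimensions moved and in which direction.
def sentiment_deltas(earlier: dict, later: dict) -> dict:
    return {dim: round(later[dim] - earlier[dim], 2) for dim in earlier}

earlier = {"trustworthiness": 0.72, "authority": 0.65,
           "recommendation_strength": 0.58, "fit_for_query_intent": 0.81}
later = {"trustworthiness": 0.75, "authority": 0.64,
         "recommendation_strength": 0.66, "fit_for_query_intent": 0.81}

for dim, delta in sentiment_deltas(earlier, later).items():
    direction = "up" if delta > 0 else ("down" if delta < 0 else "flat")
    print(f"{dim}: {direction} {delta:+.2f}")
```

Storing each run's dimensions with a timestamp (in a database or even a CSV) gives you the raw material for a sentiment trend chart.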

Per-Prompt Sentiment Analysis

Aggregate scores are useful for high-level tracking, but the real optimization happens at the prompt level. Different queries reveal different facets of how AI perceives your brand: the same product can score 0.9 on fit for "best enterprise CRM" and 0.3 for "simple CRM for freelancers", pointing to entirely different content gaps.

Use the per-prompt breakdown in the run summary to identify which queries need the most attention, then apply the dimension-specific content strategies outlined above.
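Prioritizing prompts can be automated: rank entries by their lowest-scoring dimension so the worst offenders surface first. The entries below are hardcoded samples mirroring the promptBreakdown shape; `weakest_dimension` is an illustrative helper name.

```python
# Illustrative sketch: for each promptBreakdown entry, find its
# lowest-scoring dimension, then sort prompts worst-first.
def weakest_dimension(entry: dict) -> str:
    dims = entry["sentimentDimensions"]
    return min(dims, key=dims.get)

prompts = [
    {"prompt": "best design tools for agencies",
     "sentimentDimensions": {"trustworthiness": 0.72, "authority": 0.65,
                             "recommendation_strength": 0.58,
                             "fit_for_query_intent": 0.81}},
    {"prompt": "design tools for freelancers",
     "sentimentDimensions": {"trustworthiness": 0.70, "authority": 0.60,
                             "recommendation_strength": 0.62,
                             "fit_for_query_intent": 0.35}},
]

# Sort by each prompt's lowest dimension, worst first
prompts.sort(key=lambda p: min(p["sentimentDimensions"].values()))
for p in prompts:
    print(p["prompt"], "->", weakest_dimension(p))
```

Here the freelancer query surfaces first with fit as its weakest dimension, which maps directly to the "align content with target queries" strategy.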

Comparing Across Providers

Different AI engines often have different opinions about the same brand. ChatGPT might score your trustworthiness at 0.8 while Claude scores it at 0.5. These discrepancies aren't bugs - they reflect differences in training data, model architecture, and how each provider weighs different types of evidence.

Provider-level sentiment breakdowns help you understand where each engine's perception of your brand diverges, so you can prioritize fixes for the providers - and audiences - where you lag most.
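A quick way to spot outliers is to flag providers trailing the leader by more than some margin. The list below mirrors the sentimentByProvider shape from the API response; the 0.15 threshold is an assumption, not a Sellm-defined value.

```python
# Illustrative sketch: flag providers whose overall sentiment lags the
# best performer by more than an assumed threshold.
by_provider = [
    {"provider": "ChatGPT", "sentiment": 0.78},
    {"provider": "Claude", "sentiment": 0.68},
    {"provider": "Perplexity", "sentiment": 0.55},
]

best = max(p["sentiment"] for p in by_provider)
lagging = [p["provider"] for p in by_provider if best - p["sentiment"] > 0.15]
print(lagging)  # providers trailing the leader by more than 0.15
```

In this sample, only Perplexity is flagged, suggesting web-visible content (which Perplexity retrieves in real time) would be the place to start.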

Pricing

Sentiment analysis is included in every Sellm plan. Sentiment scores are computed automatically for every analysis run and are available through both the dashboard and the API. Each prompt analysis costs less than 1 cent.

All plans include the same 4-dimensional sentiment analysis across all supported providers. There are no additional charges for sentiment data - it's a core part of every analysis result.

Getting Started

If you already have a Sellm account with completed analysis runs, you can start pulling sentiment data from the API immediately:

  1. Generate an API key in your project settings
  2. Submit a prompt with POST /v1/async-analysis
  3. Poll with GET /v1/async-analysis/{analysisId} until "succeeded"
  4. Extract sentiment from the summary and sentimentDimensions from the promptBreakdown
  5. Use providerBreakdown for per-provider sentiment comparison

If you're new to Sellm, sign up to get started.

For the full API reference, visit the API documentation.

Frequently Asked Questions

How are sentiment scores calculated?

Sellm sends your prompts to AI providers (ChatGPT, Claude, Perplexity, etc.) and captures the raw responses. Those responses are then analyzed using structured extraction to score each mentioned brand across the four sentiment dimensions. Scores range from 0 to 1, where 0 means no signal and 1 means the strongest possible signal.

Can sentiment scores differ between providers for the same query?

Yes, and they often do. Each AI provider has different training data and response patterns. ChatGPT might emphasize your brand's reliability while Claude focuses on your technical capabilities. The per-provider breakdown in the API helps you see exactly where these differences lie.

How quickly do sentiment scores change after updating content?

AI models update their knowledge at different rates. Perplexity, which searches the web in real-time, may reflect content changes within days. ChatGPT and Claude update their training data less frequently, so changes may take weeks or months to appear in sentiment scores. Consistent, sustained content improvements yield the best long-term results.

Is sentiment analysis included in every plan?

Yes, sentiment analysis is included in every API response at no extra cost. All plans include full 4-dimensional sentiment scoring across all providers.