
The LLM capability and pricing mess stops here
It’s 2025, and you still can’t get basic information about LLMs through their providers’ APIs. Want to know the context window? Pricing per token? Whether a model supports function calling? Good luck hunting through documentation that changes constantly and is formatted differently for each provider.
This isn’t just annoying. It’s actively harmful to the ecosystem. Developers waste countless hours maintaining this information by hand or hacking together scrapers to pull it from docs. Library maintainers duplicate the same effort in every language and framework. And ultimately, users suffer from brittle applications when this information becomes outdated.
We’ve been maintaining this in RubyLLM
We’ve included model capabilities and pricing in RubyLLM since the beginning. It’s been essential for our users to programmatically select the right model for their needs and estimate costs.
But as the ecosystem has exploded with new models and providers, this has become increasingly unwieldy to maintain. Every time OpenAI, Anthropic, or Google changes their pricing or releases a new model, we’re back to updating tables of data.
Introducing the LLM Capabilities API
So I’m partnering with Parsera to build something better: a standardized API that provides capabilities and pricing information for all major LLM providers.
The schema looks like this:
```yaml
# This is a YAML file so I can have comments but the API should obviously return an array of models in JSON.
# Legend:
#   Required: this is important to have in v1.
#   Optional: this is still important but can wait for v2.
id: gpt-4.5-preview # Required, will match it with the OpenAI API
display_name: GPT-4.5 Preview # Required
provider: openai # Required
family: gpt45 # Optional, each model page is a family for OpenAI models
context_window: 128000 # Required
max_output_tokens: 16384 # Required
knowledge_cutoff: 20231001 # Optional
modalities:
  text:
    input: true # Required
    output: true # Required
  image:
    input: true # Required
    output: false # Required
  audio:
    input: false # Required
    output: false # Required
  pdf_input: false # Optional - from Anthropic and Google
  embeddings_output: false # Required
  moderation_output: false # Optional
capabilities:
  streaming: true # Optional
  function_calling: true # Required
  structured_output: true # Required
  predicted_outputs: false # Optional
  distillation: false # Optional
  fine_tuning: false # Optional
  batch: true # Required
  realtime: false # Optional
  citations: false # Optional - from Anthropic
  reasoning: false # Optional - called Extended Thinking in Anthropic's lingo
pricing:
  text_tokens:
    standard:
      input_per_million: 75.0 # Required
      cached_input_per_million: 37.5 # Required
      output_per_million: 150.0 # Required
      reasoning_output_per_million: 0.0 # Optional
    batch:
      input_per_million: 37.5 # Required
      output_per_million: 75.0 # Required
  images:
    standard:
      input: 0.0 # Optional
      output: 0.0 # Optional
    batch:
      input: 0.0 # Optional
      output: 0.0 # Optional
  audio_tokens:
    standard:
      input_per_million: 0.0 # Optional
      output_per_million: 0.0 # Optional
    batch:
      input_per_million: 0.0 # Optional
      output_per_million: 0.0 # Optional
  embeddings:
    standard:
      input_per_million: 0.0 # Required
    batch:
      input_per_million: 0.0 # Required
```
This API will track:
- Context windows and token limits
- Knowledge cutoff dates
- Supported modalities (text, image, audio)
- Available capabilities (function calling, streaming, etc.)
- Detailed pricing for all operations
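To make the pricing fields concrete, here’s a rough sketch of how a client could estimate the cost of a single request from them. The `estimate_cost` helper and the token counts are made up for illustration; the pricing values are hard-coded from the example schema above.

```ruby
# Pricing fields copied from the gpt-4.5-preview example above,
# expressed as dollars per million tokens.
pricing = {
  input_per_million: 75.0,
  cached_input_per_million: 37.5,
  output_per_million: 150.0
}

# Estimate the dollar cost of one request from token counts.
def estimate_cost(pricing, input_tokens:, output_tokens:, cached_input_tokens: 0)
  (input_tokens * pricing[:input_per_million] +
   cached_input_tokens * pricing[:cached_input_per_million] +
   output_tokens * pricing[:output_per_million]) / 1_000_000.0
end

# 1,200 fresh input tokens and 300 output tokens:
# (1200 * 75.0 + 300 * 150.0) / 1_000_000 = $0.135
puts estimate_cost(pricing, input_tokens: 1200, output_tokens: 300)
```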
Parsera will handle keeping this data fresh through their specialized scraping infrastructure, and they’ll expose a public API endpoint that anyone can access via a simple GET request. This endpoint will return the complete model registry in the standardized format. RubyLLM will integrate with this API immediately upon release.
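For a sense of what consuming that endpoint could look like, here’s a minimal Ruby sketch. The URL is a placeholder (the real endpoint hasn’t launched yet), but the field access follows the draft schema above.

```ruby
require "net/http"
require "json"

# Placeholder URL - the real endpoint isn't live yet.
uri = URI("https://api.example.com/v1/models")
models = JSON.parse(Net::HTTP.get(uri))

# Find models that support function calling with a large context window.
candidates = models.select do |m|
  m.dig("capabilities", "function_calling") && m["context_window"] >= 100_000
end

candidates.each do |m|
  price = m.dig("pricing", "text_tokens", "standard", "input_per_million")
  puts "#{m['provider']}/#{m['id']}: $#{price}/M input tokens"
end
```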
This is for everyone
This isn’t just for RubyLLM. We want this to become a standard that benefits the entire LLM ecosystem. The API will be accessible to developers using any language or framework.
No more duplicated effort across libraries. No more scrambling when pricing changes. Just a single source of truth that everyone can rely on.
LLM library maintainers can simply query this API to get up-to-date information about all models across providers, rather than each implementing their own scraping and maintenance solutions.
What’s next
We’re finalizing the schema now and would love your feedback: check out the draft in the Gist.
Expect the first version of the API to launch in the next few weeks. We’ll start with the major providers (OpenAI, Anthropic, Gemini, DeepSeek) and expand from there.
In the longer term, we hope to work directly with providers to ensure this data is always accurate and up-to-date. This shouldn’t be something the community needs to scrape - it should be a standard part of how LLM providers communicate with developers.
What do you think? Would this solve a pain point for you? Anything missing from the schema that would be essential for your use cases? Let me know in the Gist’s comments or on GitHub Discussions.