v1.80.7-stable - RAG API, Skills API, and Organization Usage
Deploy this version​
**Docker**

```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.80.7
```

**Pip**

```shell
pip install litellm==1.80.7
```
Key Highlights​
- New RAG API - Unified RAG API with support for Vertex AI RAG engine and OpenAI Vector Stores
- Claude Skills API - Support for Anthropic's new Skills API with extended context and tool calling
- Organization Usage - Filter and track usage analytics at the organization level
- Claude Opus 4.5 - Support for Anthropic's Claude Opus 4.5 via the Anthropic, Bedrock, and Vertex AI providers
- Guardrails for Passthrough - Guardrails support for pass-through endpoints
- Public AI Provider - Support for publicai.co provider
Organization Usage​
Users can now filter usage statistics by organization, providing the same granular filtering capabilities available for teams.
Details:
- Filter usage analytics, spend logs, and activity metrics by organization ID
- View organization-level breakdowns alongside existing team and user-level filters
- Consistent filtering experience across all usage and analytics views
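The filters above map onto the proxy's usage query endpoints. A minimal sketch of building an organization-scoped query follows; the `/spend/logs` path and the `organization_id` parameter name are assumptions based on the team-level filters this release extends, so check your proxy's API docs for the exact schema:

```python
# Sketch: building an organization-scoped usage query for the LiteLLM proxy.
# The "organization_id" parameter name is an assumption based on the
# team-level filters described above.

def build_usage_query(organization_id, start_date=None, end_date=None):
    """Return query parameters for an organization-filtered usage request."""
    params = {"organization_id": organization_id}
    if start_date is not None:
        params["start_date"] = start_date  # e.g. "2025-11-17"
    if end_date is not None:
        params["end_date"] = end_date
    return params

# A real call would attach these params to a GET against the proxy, e.g.:
#   requests.get(f"{PROXY_BASE}/spend/logs",
#                params=build_usage_query("org-123"),
#                headers={"Authorization": f"Bearer {LITELLM_KEY}"})
```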
New Providers and Endpoints​
New Providers​
| Provider | Supported Endpoints | Description |
|---|---|---|
| Public AI | Chat completions | Support for publicai.co provider |
New LLM API Endpoints​
| Endpoint | Method | Description | Documentation |
|---|---|---|---|
| /v1/skills | POST | Anthropic Skills API for extended context tool calling | Skills API |
| /rag/ingest | POST | Unified RAG API with Vertex AI RAG and Vector Stores | RAG API |
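As an illustration, a request to the new RAG ingestion endpoint might be assembled like this. The request-body fields (`content`, `vector_store_id`) are illustrative assumptions; the actual schema is defined in the linked RAG API documentation:

```python
# Sketch: assembling a POST to the proxy's new /rag/ingest endpoint.
# The body fields here are assumptions; consult the RAG API docs for
# the real request schema.
import json
import urllib.request

def rag_ingest_request(base_url, api_key, content, vector_store_id):
    body = json.dumps({
        "content": content,
        "vector_store_id": vector_store_id,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/rag/ingest",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = rag_ingest_request("http://localhost:4000", "sk-1234",
                         "LiteLLM supports 100+ providers.", "vs_abc")
# urllib.request.urlopen(req) would send it to a running proxy.
```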
New Models / Updated Models​
New Model Support​
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Anthropic | claude-opus-4-5-20251101 | 200K | $5.00 | $25.00 | Chat, reasoning, vision, function calling, prompt caching |
| Bedrock | anthropic.claude-opus-4-5-20251101-v1:0 | 200K | $5.00 | $25.00 | Chat, reasoning, vision, function calling, prompt caching |
| Bedrock | us.anthropic.claude-opus-4-5-20251101-v1:0 | 200K | $5.00 | $25.00 | Chat, reasoning, vision, function calling, prompt caching |
| Bedrock | amazon.nova-canvas-v1:0 | - | - | $0.06/image | Image generation |
| OpenRouter | openrouter/anthropic/claude-opus-4.5 | 200K | $5.00 | $25.00 | Chat, reasoning, vision, function calling, prompt caching |
| Vertex AI | vertex_ai/claude-opus-4-5 | 200K | $5.00 | $25.00 | Chat, reasoning, vision, function calling, prompt caching |
| Vertex AI | vertex_ai/claude-opus-4-5@20251101 | 200K | $5.00 | $25.00 | Chat, reasoning, vision, function calling, prompt caching |
| Azure | azure_ai/claude-opus-4-1 | 200K | $15.00 | $75.00 | Chat, reasoning, vision, function calling, prompt caching |
| Azure | azure_ai/claude-sonnet-4-5 | 200K | $3.00 | $15.00 | Chat, reasoning, vision, function calling, prompt caching |
| Azure | azure_ai/claude-haiku-4-5 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching |
| Fireworks AI | fireworks_ai/accounts/fireworks/models/glm-4p6 | 202K | $0.55 | $2.19 | Chat, function calling |
| Public AI | publicai/swiss-ai/apertus-8b-instruct | 8K | Free | Free | Chat, function calling |
| Public AI | publicai/swiss-ai/apertus-70b-instruct | 8K | Free | Free | Chat, function calling |
| Public AI | publicai/aisingapore/Gemma-SEA-LION-v4-27B-IT | 8K | Free | Free | Chat, function calling |
| Public AI | publicai/BSC-LT/salamandra-7b-instruct-tools-16k | 16K | Free | Free | Chat, function calling |
| Public AI | publicai/BSC-LT/ALIA-40b-instruct_Q8_0 | 8K | Free | Free | Chat, function calling |
| Public AI | publicai/allenai/Olmo-3-7B-Instruct | 32K | Free | Free | Chat, function calling |
| Public AI | publicai/aisingapore/Qwen-SEA-LION-v4-32B-IT | 32K | Free | Free | Chat, function calling |
| Public AI | publicai/allenai/Olmo-3-7B-Think | 32K | Free | Free | Chat, function calling, reasoning |
| Public AI | publicai/allenai/Olmo-3-32B-Think | 32K | Free | Free | Chat, function calling, reasoning |
| Cohere | embed-multilingual-light-v3.0 | 1K | $0.10 | - | Embeddings, supports images |
| WatsonX | watsonx/whisper-large-v3-turbo | - | $0.0001/sec | - | Audio transcription |
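The per-token prices in the table translate to request costs as follows; a quick sketch using the Claude Opus 4.5 rates ($5.00/1M input, $25.00/1M output). LiteLLM tracks this automatically; this only shows the arithmetic:

```python
# Sketch: estimating the cost of a Claude Opus 4.5 call from the table's
# published rates.
INPUT_PER_M = 5.00    # $ per 1M input tokens
OUTPUT_PER_M = 25.00  # $ per 1M output tokens

def opus_45_cost(input_tokens, output_tokens):
    """Dollar cost of one request at the table's Opus 4.5 rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# 10,000 input tokens + 2,000 output tokens:
cost = opus_45_cost(10_000, 2_000)  # 0.05 + 0.05 = $0.10
```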
Features​
- **OpenRouter**
  - Add OpenRouter Opus 4.5 - PR #17144
- **Fireworks AI**
  - Add fireworks_ai/accounts/fireworks/models/glm-4p6 - PR #17154
- **Vertex AI**
  - Add Vertex AI image generation support for both Gemini and Imagen models - PR #17070
  - Handle global location in context caching - PR #16997
  - Fix CreateCachedContentRequest enum error - PR #16965
  - Use the correct domain for the global location when counting tokens - PR #17116
  - Support Vertex AI batch listing in LiteLLM proxy - PR #17079
  - Fix default sample count for image generation - PR #16403
- **WatsonX**
  - Add audio transcriptions for WatsonX - PR #17160
- **OpenAI**
  - Fix gpt-5.1 temperature support when reasoning_effort is "none" or not specified - PR #17011
- **Public AI**
  - Add provider publicai.co - PR #17230
- **Cohere**
  - Add cost tracking for the Cohere embed passthrough endpoint - PR #17029
Bug Fixes​
- Fix pydantic validation errors during tool calls with streaming - PR #16899
- Integrate ElevenLabs text-to-speech - PR #16573
LLM API Endpoints​
Features​
- New API: Claude Skills API with extended context and tool calling - PR #17042
- Add Search API logging and cost tracking in LiteLLM Proxy - PR #17078
- Prevent duplicate spend logs in the Responses API for non-OpenAI providers - PR #16992
- Support the response_format parameter in the completion -> responses bridge - PR #16844
- Fix MCP tool call response logging and remove the unmapped-param error mid-stream, allowing gpt-5 web search to work via the Responses API - PR #16946
- Add header passing support for MCP tools in the Responses API - PR #16877
- Fix the image edit endpoint - PR #17046
- Add header forwarding in embeddings - PR #16869
- Add a method for extracting vector store IDs from path params - PR #16566
Management Endpoints / UI​
Features​
- **Proxy CLI Auth**
  - Add enforce user param functionality - PR #17088
- **Virtual Keys**
  - Fix create-key duration - PR #17170
- **Models + Endpoints**
  - Allow adding a Bedrock API key when adding models - PR #17153
  - Add aws_bedrock_runtime_endpoint to credential types - PR #17053
  - Change provider create fields to JSON - PR #16985
  - Change model_hub_table to call getUiConfig before fetching public data - PR #17166
  - Improve wording for config models in the model table - PR #17100
- **Teams & Users**
  - Deleting a user from a team now deletes the key that user created for the team - PR #17057
  - Hide default team settings from proxy admin viewers - PR #16900
  - Add "No Default Models" option for team and user settings - PR #17037
  - Allow sorting the user table by all columns - PR #17108
  - Fix org admin team permissions - PR #17110
  - Better loading state for the internal user page - PR #17168
- **General UI Improvements**
  - Ensure unique keys in navbar menu items - PR #16987
  - Minor cosmetic changes for buttons; add notification on team deletion - PR #16984
  - Change delete modals to a common component - PR #17068
  - Disable edit, delete, and info for dynamically generated spend tags - PR #17098
  - Migrate modelInfoCall to React Query - PR #17123
  - Migrate provider fields to React Query - PR #17177
  - Fix flaky test - PR #17161
  - Change the add-fallback modal to use the Antd Select component - PR #17223
Bugs​
- **Database**
  - Distinguish permission errors from idempotent errors in Prisma migrations - PR #17064
- **MCP Gateway**
  - Fix missing await - PR #17103
- **Infrastructure**
  - Enhancement (helm): ServiceMonitor template rendering - PR #17038
AI Integrations​
Guardrails​
- Add Presidio PII masking tutorial with LiteLLM - PR #16969
Prompt Management​
- AI gateway prompt management documentation - PR #16990
Performance / Loadbalancing / Reliability improvements​
- **Memory Optimization**
  - Lazy-load cost_calculator & logging to reduce memory and import time - PR #17089
- **Dependency Management**
  - Downgrade grpcio to < 1.68.0 - PR #17090
- **Database Performance**
  - Optimize date filtering for spend logs queries - PR #17073
- **Request Handling**
  - Add automatic LiteLLM context headers (Pillar integration) - PR #17076
- **Generic API Support**
  - Make the generic API provider OSS and support multiple generic APIs - PR #17152
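The lazy-loading change for cost_calculator & logging follows the standard deferred-import pattern; a minimal, generic sketch of that pattern (not LiteLLM's actual code, and with `json` standing in for a heavy module):

```python
# Sketch: deferring a heavy import until first use so module import stays fast.
# "json" is a stand-in for an expensive dependency such as a cost calculator.
import importlib

_heavy = None

def get_heavy_module():
    """Import the expensive dependency only when it is first needed."""
    global _heavy
    if _heavy is None:
        _heavy = importlib.import_module("json")  # hypothetical heavy module
    return _heavy

# Subsequent calls reuse the cached module object instead of re-importing.
```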
Documentation Updates​
- **General Documentation**
  - AI gateway prompt management - PR #16990
  - Clean up README and improve agent guides - PR #17003
  - Update broken documentation links in README - PR #17002
  - Update version and add preview tag - PR #17032
  - Document model pricing contribution process - PR #17031
  - Document event hook usage - PR #17035
  - Link to logging spec in callback docs - PR #17049
  - Add OpenAI Agents SDK to projects - PR #17203
  - Fix unspecified issue - PR #17034
New Contributors​
- @prawaan made their first contribution in PR #16997
- @lior-ps made their first contribution in PR #16365
- @HaiyiMei made their first contribution in PR #17020
- @yuya2017 made their first contribution in PR #17064
- @saar-win made their first contribution in PR #17038
- @sdip15fa made their first contribution in PR #16965
- @KeremTurgutlu made their first contribution in PR #16826
- @choigawoon made their first contribution in PR #17019
- @SamAcctX made their first contribution in PR #17144
- @naaa760 made their first contribution in PR #17079
- @abi-jey made their first contribution in PR #17096
- @hxyannay made their first contribution in PR #16734

