# Models

## Language Model Providers

### Ollama
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
DeepSeek V3 | deepseek-v3 | DeepSeek V3 model | 163840 | ❌ | ❌ | ❌ |
DeepSeek R1 1.5B | deepseek-r1:1.5b | DeepSeek R1 1.5B Qwen model | 131072 | ❌ | ❌ | ❌ |
DeepSeek R1 7B | deepseek-r1:7b | DeepSeek R1 7B Qwen model | 131072 | ❌ | ❌ | ❌ |
DeepSeek R1 8B | deepseek-r1:8b | DeepSeek R1 8B Llama model | 131072 | ❌ | ❌ | ❌ |
DeepSeek R1 14B | deepseek-r1:14b | DeepSeek R1 14B Qwen model | 131072 | ❌ | ❌ | ❌ |
DeepSeek R1 32B | deepseek-r1:32b | DeepSeek R1 32B Qwen model | 131072 | ❌ | ❌ | ❌ |
DeepSeek R1 70B | deepseek-r1:70b | DeepSeek R1 70B Llama model | 131072 | ❌ | ❌ | ❌ |
DeepSeek R1 671B | deepseek-r1:671b | DeepSeek R1 671B model | 131072 | ❌ | ❌ | ❌ |
Llama 3 8B | llama3:latest | Llama 3 8B | 8192 | ❌ | ❌ | ❌ |
Llama 2 7B | llama2:latest | Llama 2 7B | 8192 | ❌ | ❌ | ❌ |
Mistral | mistral:latest | Mistral | 8192 | ❌ | ❌ | ❌ |
Code Llama | codellama:7b-code | Code Llama | 8192 | ❌ | ❌ | ❌ |
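The models above are served by a local Ollama instance. As a minimal sketch (assuming Ollama's default local endpoint at `http://localhost:11434/api/chat`), a chat request body can be built from any ID in the "Model ID" column:

```python
import json

# Sketch: build the request body for Ollama's local chat endpoint.
# The model ID is any value from the "Model ID" column above.
def ollama_chat_payload(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one complete response instead of a token stream
    }

payload = ollama_chat_payload("deepseek-r1:8b", "Why is the sky blue?")
print(json.dumps(payload))

# To actually send it (requires a running Ollama server):
# import urllib.request
# req = urllib.request.Request("http://localhost:11434/api/chat",
#                              data=json.dumps(payload).encode(),
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```

Note that none of the Ollama models in the table advertise JSON Schema or function-call support, so requests should stay plain-text.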
### Replicate
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Mistral 7b instruct v0.2 | mistralai/mistral-7b-instruct-v0.2 | The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1. | 128000 | ❌ | ❌ | ❌ |
Mistral 7b instruct v0.1 | mistral-7b-instruct-v0.1 | An instruction-tuned 7 billion parameter language model from Mistral | 128000 | ❌ | ❌ | ❌ |
Mixtral 8x7b instruct v0.1 | mistralai/mixtral-8x7b-instruct-v0.1 | The Mixtral-8x7B-instruct-v0.1 Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts tuned to be a helpful assistant. | 128000 | ❌ | ❌ | ❌ |
Llama 2 13b chat | meta/llama-2-13b-chat | A 13 billion parameter language model from Meta, fine-tuned for chat completions | 128000 | ❌ | ❌ | ❌ |
Llama 2 70b chat | meta/llama-2-70b-chat | A 70 billion parameter language model from Meta, fine-tuned for chat completions | 128000 | ❌ | ❌ | ❌ |
### OpenAI
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
4.1 | gpt-4.1 | OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains. | 1047576 | ✅ | ✅ | ✅ |
4.1 mini | gpt-4.1-mini | GPT 4.1 mini provides a balance between intelligence, speed, and cost that makes it an attractive model for many use cases. | 1047576 | ✅ | ✅ | ✅ |
4.1 nano | gpt-4.1-nano | GPT-4.1 nano is the fastest, most cost-effective GPT 4.1 model. | 1047576 | ✅ | ✅ | ✅ |
4o | gpt-4o | Advanced, multimodal flagship model that's cheaper and faster than GPT-4 Turbo | 128000 | ✅ | ✅ | ✅ |
4o-mini | gpt-4o-mini | Affordable and intelligent small model for fast, lightweight tasks. GPT-4o mini is cheaper and more capable than GPT-3.5 Turbo. Currently points to gpt-4o-mini-2024-07-18. | 128000 | ✅ | ✅ | ✅ |
o3-mini (Low Reasoning) | o3-mini-low | Fast and efficient o3-mini model with low reasoning effort. Optimized for quick responses with basic reasoning. | 200000 | ✅ | ✅ | ✅ |
o3-mini (Medium Reasoning) | o3-mini-medium | Balanced o3-mini model with medium reasoning effort. Good for general-purpose tasks requiring moderate analysis. | 200000 | ✅ | ✅ | ✅ |
o3-mini (High Reasoning) | o3-mini-high | Thorough o3-mini model with high reasoning effort. Best for complex tasks requiring deep analysis. | 200000 | ✅ | ✅ | ✅ |
o3 | o3 | o3 is a powerful reasoning model designed for complex problem-solving across domains. It combines advanced reasoning capabilities with high performance for demanding tasks. | 200000 | ✅ | ✅ | ✅ |
o4-mini (Low Reasoning) | o4-mini-low | Fast and efficient o4-mini model with low reasoning effort. Optimized for quick responses with basic reasoning. | 200000 | ✅ | ✅ | ✅ |
o4-mini (Medium Reasoning) | o4-mini-medium | Balanced o4-mini model with medium reasoning effort. Good for general-purpose tasks requiring moderate analysis. | 200000 | ✅ | ✅ | ✅ |
o4-mini (High Reasoning) | o4-mini-high | Thorough o4-mini model with high reasoning effort. Best for complex tasks requiring deep analysis. | 200000 | ✅ | ✅ | ✅ |
o4-mini | o4-mini | o4-mini is a compact and efficient model that delivers strong performance for a wide range of tasks. It offers a good balance of capabilities and resource efficiency. | 200000 | ✅ | ✅ | ✅ |
o1 | o1 | o1 is a reasoning model designed to solve hard problems across domains. The o1 series of models are trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. | 200000 | ✅ | ✅ | ✅ |
o1-mini | o1-mini | o1-mini is a fast and affordable reasoning model for specialized tasks. The o1-mini series of models are trained with reinforcement learning to perform complex reasoning. o1-mini models think before they answer, producing a long internal chain of thought before responding to the user. | 128000 | ❌ | ✅ | ✅ |
4.5 | gpt-4.5-preview | This is a research preview of GPT-4.5, OpenAI's largest and most capable GPT model yet. Its deep world knowledge and better understanding of user intent makes it good at creative tasks and agentic planning. | 128000 | ✅ | ✅ | ✅ |
4.1 2025-04-14 | gpt-4.1-2025-04-14 | OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains. | 1047576 | ✅ | ✅ | ✅ |
4.1 mini 2025-04-14 | gpt-4.1-mini-2025-04-14 | GPT 4.1 mini provides a balance between intelligence, speed, and cost that makes it an attractive model for many use cases. | 1047576 | ✅ | ✅ | ✅ |
4.1 nano 2025-04-14 | gpt-4.1-nano-2025-04-14 | GPT-4.1 nano is the fastest, most cost-effective GPT 4.1 model. | 1047576 | ✅ | ✅ | ✅ |
4o 2024-08-06 | gpt-4o-2024-08-06 | 2024-08-06 version of gpt-4o | 128000 | ✅ | ✅ | ✅ |
4o-mini 2024-07-18 | gpt-4o-mini-2024-07-18 | 2024-07-18 version of gpt-4o-mini | 128000 | ✅ | ✅ | ✅ |
o1 2024-12-17 | o1-2024-12-17 | 2024-12-17 version of o1 | 200000 | ✅ | ✅ | ✅ |
o1-mini 2024-09-12 | o1-mini-2024-09-12 | 2024-09-12 version of o1-mini | 128000 | ❌ | ✅ | ✅ |
4 Turbo | gpt-4-turbo | The latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. | 128000 | ✅ | ✅ | ✅ |
4 Turbo Preview | gpt-4-turbo-preview | The latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Returns a maximum of 4,096 output tokens. This preview model is not yet suited for production traffic. | 128000 | ✅ | ✅ | ✅ |
4 Vision | gpt-4-vision-preview | GPT-4 with the ability to understand images, in addition to all other GPT-4 Turbo capabilities. | 128000 | ✅ | ✅ | ✅ |
4 | gpt-4 | More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. Will be updated with our latest model iteration. | 8192 | ❌ | ✅ | ✅ |
4 32K | gpt-4-32k | Same capabilities as the base gpt-4 model but with 4x the context length. Will be updated with our latest model iteration. | 32768 | ❌ | ✅ | ✅ |
4 Turbo 2024-04-09 | gpt-4-turbo-2024-04-09 | 2024-04-09 version of gpt-4-turbo | 128000 | ✅ | ✅ | ✅ |
3.5 Turbo | gpt-3.5-turbo | Most capable GPT-3.5 model and optimized for chat at 1/10th the cost of text-davinci-003. Will be updated with our latest model iteration. | 4096 | ❌ | ✅ | ✅ |
3.5 Turbo 16K | gpt-3.5-turbo-16k | Same capabilities as the base gpt-3.5-turbo model but with 4x the context length. Will be updated with our latest model iteration. | 16384 | ❌ | ✅ | ✅ |
4 0613 | gpt-4-0613 | More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. Will be updated with our latest model iteration. | 8192 | ❌ | ✅ | ✅ |
3.5 Turbo 0613 | gpt-3.5-turbo-0613 | Most capable GPT-3.5 model and optimized for chat at 1/10th the cost of text-davinci-003. Will be updated with our latest model iteration. | 4096 | ❌ | ✅ | ✅ |
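Models with a ✅ under "Supports JSON Schema" can be asked for structured output. As a hedged sketch of the Chat Completions request shape (the `location` schema below is a made-up example, not part of any API):

```python
import json

# Sketch: a Chat Completions request body using the structured-output
# ("JSON Schema") capability flagged in the table. Only models with a ✅
# in "Supports JSON Schema" accept the response_format field shown here.
def structured_request(model: str, prompt: str, schema_name: str, schema: dict) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": schema_name, "strict": True, "schema": schema},
        },
    }

body = structured_request(
    "gpt-4.1",
    "Extract the city and country from: 'I live in Lyon, France.'",
    "location",  # hypothetical schema name for this example
    {
        "type": "object",
        "properties": {"city": {"type": "string"}, "country": {"type": "string"}},
        "required": ["city", "country"],
        "additionalProperties": False,
    },
)
print(json.dumps(body, indent=2))
```

With `strict: true`, the model's reply is constrained to match the schema exactly.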
### Groq
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Llama 4 Maverick | meta-llama/llama-4-maverick-17b-128e-instruct | Llama 4 Maverick | 131072 | ❌ | ✅ | ✅ |
Llama 4 Scout | meta-llama/llama-4-scout-17b-16e-instruct | Llama 4 Scout | 131072 | ❌ | ✅ | ✅ |
DeepSeek R1 Distilled Llama 70B | deepseek-r1-distill-llama-70b | DeepSeek R1 Distilled Llama 70B | 128000 | ❌ | ✅ | ✅ |
DeepSeek R1 Distilled Llama 70B SpecDec | deepseek-r1-distill-llama-70b-specdec | DeepSeek R1 Distilled Llama 70B SpecDec | 128000 | ❌ | ✅ | ✅ |
Llama 3.1 405B Reasoning | llama-3.1-405b-reasoning | Llama 3.1 405B Reasoning | 131072 | ❌ | ✅ | ❌ |
Llama 3.3 70B Versatile | llama-3.3-70b-versatile | Llama 3.3 70B Versatile | 32768 | ❌ | ✅ | ✅ |
Llama 3.3 70B SpecDec | llama-3.3-70b-specdec | Llama 3.3 70B SpecDec | 8192 | ❌ | ✅ | ✅ |
Llama 3 Groq 70B Tool Use (Preview) | llama3-groq-70b-8192-tool-use-preview | Llama 3 Groq 70B Tool Use (Preview) | 8192 | ❌ | ✅ | ✅ |
Llama 3.1 70B Versatile | llama-3.1-70b-versatile | Llama 3.1 70B Versatile | 131072 | ❌ | ✅ | ❌ |
Llama 3 Groq 8B Tool Use (Preview) | llama3-groq-8b-8192-tool-use-preview | Llama 3 Groq 8B Tool Use (Preview) | 8192 | ❌ | ✅ | ✅ |
Llama 3.1 8B Instant | llama-3.1-8b-instant | Llama 3.1 8B Instant | 131072 | ❌ | ✅ | ✅ |
LLaMA3-70b | llama3-70b-8192 | LLaMA3-70b | 8192 | ❌ | ✅ | ✅ |
LLaMA3-8b | llama3-8b-8192 | LLaMA3-8b | 8192 | ❌ | ✅ | ✅ |
LLaMA2-70b | llama2-70b-4096 | LLaMA2-70b | 4096 | ❌ | ❌ | ❌ |
Mixtral-8x7b | mixtral-8x7b-32768 | Mixtral-8x7b | 32768 | ❌ | ✅ | ✅ |
Gemma-7b-it | gemma-7b-it | Gemma-7b-it | 8192 | ❌ | ✅ | ✅ |
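The Groq models vary widely in context window and tool support, so model selection is worth doing programmatically. A minimal sketch, using a few rows from the table above as data (the `pick_models` helper is illustrative, not part of any SDK):

```python
# A few Groq rows from the table above, keyed by model ID.
GROQ_MODELS = {
    "llama-3.3-70b-versatile": {"max_tokens": 32768, "function_calls": True},
    "llama-3.1-8b-instant":    {"max_tokens": 131072, "function_calls": True},
    "llama-3.1-70b-versatile": {"max_tokens": 131072, "function_calls": False},
    "mixtral-8x7b-32768":      {"max_tokens": 32768, "function_calls": True},
}

def pick_models(min_context: int, need_function_calls: bool = False) -> list[str]:
    """Return model IDs whose context window and tool support meet the request."""
    return sorted(
        model_id for model_id, caps in GROQ_MODELS.items()
        if caps["max_tokens"] >= min_context
        and (caps["function_calls"] or not need_function_calls)
    )

# Only llama-3.1-8b-instant has both a >=100k context and function-call support.
print(pick_models(100_000, need_function_calls=True))  # ['llama-3.1-8b-instant']
```

The same pattern extends to any of the provider tables on this page.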
### Google Generative AI
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Gemini 2.5 Pro Preview 03-25 | gemini-2.5-pro-preview-03-25 | Gemini 2.5 Pro Preview 03-25 | 1000000 | ✅ | ✅ | ✅ |
Gemini 2.5 Flash Preview 04-17 | gemini-2.5-flash-preview-04-17 | Gemini 2.5 Flash Preview 04-17 | 1000000 | ✅ | ✅ | ✅ |
Gemini 2.0 Flash | gemini-2.0-flash-001 | Gemini 2.0 Flash | 1000000 | ✅ | ✅ | ✅ |
Gemini 2.0 Flash Experimental | gemini-2.0-flash-exp | Gemini 2.0 Flash Experimental | 1000000 | ✅ | ✅ | ✅ |
Gemini 1.0 Pro | gemini-pro | Gemini 1.0 Pro | 32000 | ❌ | ❌ | ❌ |
### Anthropic Claude
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Claude 3.7 Sonnet | claude-3-7-sonnet-20250219 | Anthropic's most intelligent model. Highest level of intelligence and capability with toggleable extended thinking. This is the latest version of the model. | 200000 | ✅ | ✅ | ✅ |
Claude 3.5 Sonnet (V2) | claude-3-5-sonnet-20241022 | Anthropic's previous most intelligent model. High level of intelligence and capability. | 200000 | ✅ | ✅ | ✅ |
Claude 3.5 Sonnet (V1) | claude-3-5-sonnet-20240620 | Anthropic's previous most intelligent model. High level of intelligence and capability. | 200000 | ✅ | ✅ | ✅ |
Claude 3.5 Haiku | claude-3-5-haiku-20241022 | Anthropic's fastest model that can execute lightweight actions, with industry-leading speed. | 200000 | ✅ | ✅ | ✅ |
Claude 3 Opus | claude-3-opus-20240229 | Most powerful model for highly complex tasks, offering top-level performance with multilingual and vision capabilities. | 200000 | ✅ | ✅ | ✅ |
Claude 3 Sonnet | claude-3-sonnet-20240229 | Ideal balance of intelligence and speed for enterprise workloads, with multilingual and vision support. | 200000 | ✅ | ✅ | ✅ |
Claude 3 Haiku | claude-3-haiku-20240307 | Fastest and most compact model for near-instant responsiveness, includes multilingual and vision capabilities. | 200000 | ✅ | ✅ | ✅ |
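One point of frequent confusion with these models: the 200000 figure in the table is the *context window*, while the Anthropic Messages API separately requires a `max_tokens` field that caps *output* length. A hedged sketch of the request body:

```python
import json

# Sketch: an Anthropic Messages API request body. "max_tokens" caps the
# output length and is required; it is distinct from the 200000-token
# context window listed in the table.
def claude_request(model: str, prompt: str, max_output_tokens: int = 1024) -> dict:
    return {
        "model": model,
        "max_tokens": max_output_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

body = claude_request("claude-3-5-haiku-20241022", "Summarize RFC 2119 in one line.")
print(json.dumps(body))
```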
### OctoAI
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Llama-3.1-Instruct (8B) | meta-llama-3.1-8b-instruct | Meta's Llama-3.1-Instruct model with 8 billion parameters for chat use cases. | 131072 | ❌ | ✅ | ✅ |
Llama-3.1-Instruct (70B) | meta-llama-3.1-70b-instruct | Meta's Llama-3.1-Instruct model with 70 billion parameters for chat use cases. | 131072 | ❌ | ✅ | ✅ |
Llama3-Instruct (8B) | meta-llama-3-8b-instruct | Meta's Llama3-Instruct model with 8 billion parameters for chat use cases. | 8192 | ❌ | ✅ | ✅ |
Llama3-Instruct (70B) | meta-llama-3-70b-instruct | Meta's Llama3-Instruct model with 70 billion parameters for chat use cases. | 8192 | ❌ | ✅ | ✅ |
Mistral Instruct v0.3 (7B) | mistral-7b-instruct | Mistral's Instruct v0.3 model with 7 billion parameters for chat and coding use cases. | 32768 | ❌ | ❌ | ❌ |
Mixtral Instruct (8x7B) | mixtral-8x7b-instruct | Mistral's Mixtral Instruct model with 8x7 billion parameters for chat and coding use cases. | 32768 | ❌ | ❌ | ❌ |
Nous Hermes 2 Mixtral DPO (8x7B) | nous-hermes-2-mixtral-8x7b-dpo | Nous Research's Hermes 2 Mixtral DPO model with 8x7 billion parameters for chat use cases. | 32768 | ❌ | ❌ | ❌ |
Mixtral Instruct (8x22B) | mixtral-8x22b-instruct | Mistral's Mixtral Instruct model with 8x22 billion parameters for chat and coding use cases. | 65536 | ❌ | ❌ | ❌ |
WizardLM-2 (8x22B) | wizardlm-2-8x22b | Microsoft's WizardLM-2 model with 8x22 billion parameters for chat and coding use cases. | 65536 | ❌ | ❌ | ❌ |
Llama Guard 2 | llamaguard-2-7b | Meta's Llama Guard 2 model with 7 billion parameters for content moderation. | 4096 | ❌ | ❌ | ❌ |
### Perplexity AI
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Llama-3.1-Sonar-Small (8B) | llama-3.1-sonar-small-128k-online | Perplexity's Sonar Small online model, built on Llama 3.1 with 8 billion parameters, for chat use cases. | 127072 | ❌ | ❌ | ❌ |
Llama-3.1-Sonar-Large (70B) | llama-3.1-sonar-large-128k-online | Perplexity's Sonar Large online model, built on Llama 3.1 with 70 billion parameters, for chat use cases. | 127072 | ❌ | ❌ | ❌ |
Llama-3.1-Sonar-Huge (405B) | llama-3.1-sonar-huge-128k-online | Perplexity's Sonar Huge online model, built on Llama 3.1 with 405 billion parameters, for chat use cases. | 127072 | ❌ | ❌ | ❌ |
### Amazon Bedrock
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 | Anthropic's Claude 3.7 Sonnet model on Amazon Bedrock | 200000 | ✅ | ✅ | ✅ |
Claude 3.5 Sonnet (V2) | anthropic.claude-3-5-sonnet-20241022-v2:0 | Anthropic's Claude 3.5 Sonnet model on Amazon Bedrock | 200000 | ✅ | ✅ | ✅ |
Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0 | Anthropic's Claude 3.5 Sonnet model on Amazon Bedrock | 200000 | ✅ | ✅ | ✅ |
Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0 | Anthropic's Claude 3 Sonnet model on Amazon Bedrock | 200000 | ✅ | ✅ | ❌ |
Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 | Anthropic's Claude 3.5 Haiku model on Amazon Bedrock | 200000 | ✅ | ✅ | ✅ |
Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0 | Anthropic's Claude 3 Haiku model on Amazon Bedrock | 200000 | ✅ | ✅ | ✅ |
Claude 3 Opus | anthropic.claude-3-opus-20240229-v1:0 | Anthropic's Claude 3 Opus model on Amazon Bedrock | 200000 | ✅ | ✅ | ❌ |
Llama 3 8B Instruct | meta.llama3-8b-instruct-v1:0 | Meta's Llama 3 8B Instruct model on Amazon Bedrock | 4096 | ❌ | ❌ | ❌ |
Llama 3 70B Instruct | meta.llama3-70b-instruct-v1:0 | Meta's Llama 3 70B Instruct model on Amazon Bedrock | 4096 | ❌ | ❌ | ❌ |
Llama 3.1 8B Instruct | meta.llama3-1-8b-instruct-v1:0 | Meta's Llama 3.1 8B Instruct model on Amazon Bedrock | 128000 | ❌ | ❌ | ❌ |
Llama 3.1 70B Instruct | meta.llama3-1-70b-instruct-v1:0 | Meta's Llama 3.1 70B Instruct model on Amazon Bedrock | 128000 | ❌ | ❌ | ❌ |
Llama 3.1 405B Instruct | meta.llama3-1-405b-instruct-v1:0 | Meta's Llama 3.1 405B Instruct model on Amazon Bedrock | 128000 | ❌ | ❌ | ❌ |
Llama 3.2 1B Instruct | us.meta.llama3-2-1b-instruct-v1:0 | Meta's Llama 3.2 1B Instruct model on Amazon Bedrock | 128000 | ❌ | ❌ | ❌ |
Llama 3.2 3B Instruct | us.meta.llama3-2-3b-instruct-v1:0 | Meta's Llama 3.2 3B Instruct model on Amazon Bedrock | 128000 | ❌ | ❌ | ❌ |
Llama 3.2 11B Instruct | us.meta.llama3-2-11b-instruct-v1:0 | Meta's Llama 3.2 11B Instruct model on Amazon Bedrock | 128000 | ❌ | ❌ | ❌ |
Llama 3.2 90B Instruct | us.meta.llama3-2-90b-instruct-v1:0 | Meta's Llama 3.2 90B Instruct model on Amazon Bedrock | 128000 | ❌ | ✅ | ✅ |
### Azure OpenAI
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
GPT-4.1 | gpt-4.1 | Most capable GPT-4.1 model for tasks requiring deep understanding and advanced reasoning. | 1047576 | ✅ | ✅ | ✅ |
GPT-4.1 Mini | gpt-4.1-mini | Smaller, faster version of GPT-4.1 optimized for efficiency. | 1047576 | ✅ | ✅ | ✅ |
GPT-4.1 Nano | gpt-4.1-nano | Smallest version of GPT-4.1 optimized for speed and cost efficiency. | 1047576 | ✅ | ✅ | ✅ |
GPT-4o | gpt-4o | Latest large GA model with structured outputs, text/image processing, enhanced accuracy and superior performance in non-English languages and vision tasks. | 128000 | ✅ | ✅ | ✅ |
GPT-4o mini | gpt-4o-mini | Latest small GA model optimized for fast, inexpensive tasks. Supports text and image processing, JSON Mode, and parallel function calling. | 128000 | ✅ | ✅ | ✅ |
o1 | o1 | o1 is a reasoning model designed to solve hard problems across domains. The o1 series of models are trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. | 200000 | ✅ | ✅ | ✅ |
o1-mini | o1-mini | o1-mini is a fast and affordable reasoning model for specialized tasks. The o1-mini series of models are trained with reinforcement learning to perform complex reasoning. o1-mini models think before they answer, producing a long internal chain of thought before responding to the user. | 128000 | ❌ | ✅ | ✅ |
GPT-4 | gpt-4 | Most capable GPT-4 model for tasks requiring deep understanding and advanced reasoning. | 8192 | ❌ | ✅ | ✅ |
GPT-3.5 Turbo | gpt-35-turbo | Most capable GPT-3.5 model, optimized for chat at 1/10th the cost of GPT-4. | 16385 | ❌ | ✅ | ✅ |
### xAI
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Grok 3 | grok-3 | Grok 3 model with high performance capabilities. Choose this for reduced cost compared to grok-3-fast. | 131072 | ❌ | ✅ | ✅ |
Grok 3 Latest | grok-3-latest | Latest version of Grok 3 model with high performance capabilities. | 131072 | ❌ | ✅ | ✅ |
Grok 3 Fast | grok-3-fast | Same as Grok 3 model but optimized for latency-sensitive applications. Choose this for better response time at higher cost. | 131072 | ❌ | ✅ | ✅ |
Grok 3 Fast Latest | grok-3-fast-latest | Latest faster version of Grok 3 model with optimized response time. | 131072 | ❌ | ✅ | ✅ |
Grok 3 Mini | grok-3-mini | Lightweight version of Grok 3 model with lower cost and good performance. | 131072 | ❌ | ✅ | ✅ |
Grok 3 Mini Latest | grok-3-mini-latest | Latest lightweight version of Grok 3 model with lower cost and good performance. | 131072 | ❌ | ✅ | ✅ |
Grok 3 Mini Fast | grok-3-mini-fast | Faster lightweight version of Grok 3 model with balanced cost and performance. | 131072 | ❌ | ✅ | ✅ |
Grok 3 Mini Fast Latest | grok-3-mini-fast-latest | Latest faster lightweight version of Grok 3 model with balanced cost and performance. | 131072 | ❌ | ✅ | ✅ |
Grok Beta | grok-beta | Comparable performance to Grok 2 but with improved efficiency, speed and capabilities. | 131072 | ❌ | ✅ | ✅ |
Grok Vision Beta | grok-vision-beta | Comparable performance to Grok 2 but with improved efficiency, speed and capabilities and with ability to process images. | 8192 | ✅ | ✅ | ❌ |
### Fireworks
Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
---|---|---|---|---|---|---|
Llama 4 Maverick Instruct (Basic) | accounts/fireworks/models/llama4-maverick-instruct-basic | The Llama 4 models are natively multimodal, enabling text and multimodal experiences. They leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. | 1000000 | ✅ | ✅ | ✅ |
Llama 4 Scout Instruct (Basic) | accounts/fireworks/models/llama4-scout-instruct-basic | The Llama 4 models are natively multimodal, enabling text and multimodal experiences. They leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. | 128000 | ✅ | ✅ | ✅ |
Qwen3 235B A22B | accounts/fireworks/models/qwen3-235b-a22b | Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. | 32768 | ❌ | ✅ | ✅ |
DeepSeek R1 | accounts/fireworks/models/deepseek-r1 | DeepSeek R1 is a large language model optimized for instruction following and coding tasks. | 160000 | ❌ | ❌ | ❌ |
DeepSeek V3 03-24 | accounts/fireworks/models/deepseek-v3-0324 | DeepSeek V3 is a large language model optimized for instruction following. This is the March 24, 2025 snapshot of the model. | 128000 | ❌ | ❌ | ❌ |
DeepSeek V3 | accounts/fireworks/models/deepseek-v3 | DeepSeek V3 is a large language model optimized for instruction following. | 128000 | ❌ | ❌ | ❌ |
Llama 3.3 70B Instruct | accounts/fireworks/models/llama-v3p3-70b-instruct | Llama 3.3 70B Instruct is a large language model that is optimized for instruction following. | 128000 | ❌ | ✅ | ✅ |
Llama 3.1 405B Instruct | accounts/fireworks/models/llama-v3p1-405b-instruct | Llama 3.1 405B Instruct is a large language model that is optimized for instruction following. | 128000 | ❌ | ✅ | ✅ |
Llama 3.1 70B Instruct | accounts/fireworks/models/llama-v3p1-70b-instruct | Llama 3.1 70B Instruct is a large language model that is optimized for instruction following. | 128000 | ❌ | ✅ | ✅ |
## Embedding Models

### OpenAI
Model Name | Model ID | Description | Max Tokens | Max Output Dimensions | Supports Reduced Dimensions |
---|---|---|---|---|---|
Text Embedding Ada 002 | text-embedding-ada-002 | Text Embedding Ada 002 | 8191 | 1536 | ❌ |
Text Embedding 3 Small | text-embedding-3-small | Increased performance over 2nd generation ada embedding model | 8191 | 1536 | ✅ |
Text Embedding 3 Large | text-embedding-3-large | Most capable embedding model for both English and non-English tasks | 8191 | 3072 | ✅ |
### Cohere
Model Name | Model ID | Description | Max Tokens | Max Output Dimensions | Supports Reduced Dimensions |
---|---|---|---|---|---|
Embed English v3.0 | embed-english-v3.0 | A model that allows for text to be classified or turned into embeddings. English only. | 512 | 1024 | ❌ |
Embed English Light v3.0 | embed-english-light-v3.0 | A smaller, faster version of embed-english-v3.0. Almost as capable, but a lot faster. English only. | 512 | 384 | ❌ |
Embed English v2.0 | embed-english-v2.0 | Our older embeddings model that allows for text to be classified or turned into embeddings. English only | 512 | 4096 | ❌ |
Embed English Light v2.0 | embed-english-light-v2.0 | A smaller, faster version of embed-english-v2.0. Almost as capable, but a lot faster. English only. | 512 | 1024 | ❌ |
Embed Multilingual v3.0 | embed-multilingual-v3.0 | Provides multilingual classification and embedding support. | 512 | 1024 | ❌ |
Embed Multilingual Light v3.0 | embed-multilingual-light-v3.0 | A smaller, faster version of embed-multilingual-v3.0. Almost as capable, but a lot faster. Supports multiple languages. | 512 | 384 | ❌ |
Embed Multilingual v2.0 | embed-multilingual-v2.0 | Provides multilingual classification and embedding support. | 256 | 768 | ❌ |
### Amazon Bedrock
Model Name | Model ID | Description | Max Tokens | Max Output Dimensions | Supports Reduced Dimensions |
---|---|---|---|---|---|
Cohere Embed English | cohere.embed-english-v3 | Cohere English Embedding Model hosted on AWS Bedrock | 512 | 1024 | ❌ |
Cohere Embed Multilingual | cohere.embed-multilingual-v3 | Cohere Multilingual Embedding Model hosted on AWS Bedrock | 512 | 1024 | ❌ |
Amazon Titan Embeddings G1 - Text | amazon.titan-embed-text-v1 | Amazon's Titan G1 Text Embedding Model hosted on AWS Bedrock | 8192 | 1024 | ❌ |
Amazon Titan Embeddings V2 - Text | amazon.titan-embed-text-v2:0 | Amazon's Titan V2 Text Embedding Model hosted on AWS Bedrock | 8192 | 1024 | ❌ |
### Azure OpenAI
Model Name | Model ID | Description | Max Tokens | Max Output Dimensions | Supports Reduced Dimensions |
---|---|---|---|---|---|
OpenAI embedding Large | text-embedding-3-large | OpenAI's Large Text Embedding Model hosted on Microsoft Azure | 8192 | 3072 | ✅ |
OpenAI embedding Small | text-embedding-3-small | OpenAI's Small Text Embedding Model hosted on Microsoft Azure | 8192 | 1536 | ✅ |