Models

Language Model Providers

Ollama

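| Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
| --- | --- | --- | --- | --- | --- | --- |
| Llama3 7b | llama3:latest | Llama 3 | 8192 | | | |
| Llama 2-7b | llama2:latest | Llama 2 | 8192 | | | |
| Mistral | mistral:latest | Mistral | 8192 | | | |
| Code Llama | codellama:7b-code | Code Llama | 8192 | | | |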

Replicate

| Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
| --- | --- | --- | --- | --- | --- | --- |
| Mixtral 8x7b instruct | mistralai/mixtral-8x7b-instruct-v0.1 | Mixtral 8x7b instruct | 128000 | | | |
| Mistral 7b instruct v0.2 | mistralai/mistral-7b-instruct-v0.2 | The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1. | 128000 | | | |
| Mistral 7b instruct v0.1 | mistral-7b-instruct-v0.1 | An instruction-tuned 7 billion parameter language model from Mistral | 128000 | | | |
| Mixtral 8x7b instruct v0.1 | mistralai/mixtral-8x7b-instruct-v0.1 | The Mixtral-8x7B-instruct-v0.1 Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts tuned to be a helpful assistant. | 128000 | | | |
| Llama 2 13b chat | meta/llama-2-13b-chat | A 13 billion parameter language model from Meta, fine tuned for chat completions | 128000 | | | |
| Llama 2 70b chat | meta/llama-2-70b-chat | A 70 billion parameter language model from Meta, fine tuned for chat completions | 128000 | | | |

OpenAI

| Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
| --- | --- | --- | --- | --- | --- | --- |
| GPT-4o | gpt-4o | Advanced, multimodal flagship model that's cheaper and faster than GPT-4 Turbo | 128000 | | | |
| GPT-4o 2024-04-09 | gpt-4-turbo-2024-04-09 | Advanced, multimodal flagship model that's cheaper and faster than GPT-4 Turbo | 128000 | | | |
| GPT-4o-mini | gpt-4o-mini | Affordable and intelligent small model for fast, lightweight tasks. GPT-4o mini is cheaper and more capable than GPT-3.5 Turbo. Currently points to gpt-4o-mini-2024-07-18. | 128000 | | | |
| GPT-4 Turbo | gpt-4-turbo | The latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. | 128000 | | | |
| GPT-4 Turbo Preview | gpt-4-turbo-preview | The latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Returns a maximum of 4,096 output tokens. This preview model is not yet suited for production traffic. | 128000 | | | |
| GPT-4 Vision | gpt-4-vision-preview | GPT-4 with the ability to understand images, in addition to all other GPT-4 Turbo capabilities. | 128000 | | | |
| GPT-4 | gpt-4 | More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. Will be updated with our latest model iteration. | 8192 | | | |
| GPT-4 0613 | gpt-4-0613 | More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. Will be updated with our latest model iteration. | 8192 | | | |
| GPT-4 32K | gpt-4-32k | Same capabilities as the base gpt-4 model but with 4x the context length. Will be updated with our latest model iteration. | 32768 | | | |
| GPT-3.5 Turbo 0613 | gpt-3.5-turbo-0613 | Most capable GPT-3.5 model and optimized for chat at 1/10th the cost of text-davinci-003. Will be updated with our latest model iteration. | 4096 | | | |
| GPT-3.5 Turbo | gpt-3.5-turbo | Most capable GPT-3.5 model and optimized for chat at 1/10th the cost of text-davinci-003. Will be updated with our latest model iteration. | 4096 | | | |
| GPT-3.5 Turbo 16K | gpt-3.5-turbo-16k | Same capabilities as the base gpt-3.5-turbo model but with 4x the context length. Will be updated with our latest model iteration. | 16384 | | | |

Groq

| Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
| --- | --- | --- | --- | --- | --- | --- |
| Llama 3.1 405B Reasoning | llama-3.1-405b-reasoning | Llama 3.1 405B Reasoning | 131072 | | | |
| Llama 3.1 70B Versatile (Tool Use Preview) | llama3-groq-70b-8192-tool-use-preview | Llama 3.1 70B Versatile (Tool Use Preview) | 8192 | | | |
| Llama 3.1 70B Versatile | llama-3.1-70b-versatile | Llama 3.1 70B Versatile | 131072 | | | |
| Llama 3.1 8B Instant (Tool Use Preview) | llama3-groq-8b-8192-tool-use-preview | Llama 3.1 8B Instant (Tool Use Preview) | 8192 | | | |
| Llama 3.1 8B Instant | llama-3.1-8b-instant | Llama 3.1 8B Instant | 131072 | | | |
| LLaMA3-70b | llama3-70b-8192 | LLaMA3-70b | 8192 | | | |
| LLaMA3-8b | llama3-8b-8192 | LLaMA3-8b | 8192 | | | |
| LLaMA2-70b | llama2-70b-4096 | LLaMA2-70b | 4096 | | | |
| Mixtral-8x7b | mixtral-8x7b-32768 | Mixtral-8x7b | 32768 | | | |
| Gemma-7b-it | gemma-7b-it | Gemma-7b-it | 8192 | | | |

Google Generative AI

| Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
| --- | --- | --- | --- | --- | --- | --- |
| gemini-pro | gemini-pro | gemini-pro | 32000 | | | |

Anthropic Claude

| Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
| --- | --- | --- | --- | --- | --- | --- |
| Claude 3 Opus | claude-3-opus-20240229 | Most powerful model for highly complex tasks, offering top-level performance with multilingual and vision capabilities. | 200000 | | | |
| Claude 3 Sonnet | claude-3-sonnet-20240229 | Ideal balance of intelligence and speed for enterprise workloads, with multilingual and vision support. | 200000 | | | |
| Claude 3 Haiku | claude-3-haiku-20240307 | Fastest and most compact model for near-instant responsiveness, includes multilingual and vision capabilities. | 200000 | | | |
| Claude 3.5 Sonnet | claude-3-5-sonnet-20240620 | Most intelligent model, includes multilingual and vision capabilities. | 200000 | | | |

OctoAI

| Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
| --- | --- | --- | --- | --- | --- | --- |
| Llama-3.1-Instruct (8B) | meta-llama-3.1-8b-instruct | Meta's Llama-3.1-Instruct model with 8 billion parameters for chat use cases. | 131072 | | | |
| Llama-3.1-Instruct (70B) | meta-llama-3.1-70b-instruct | Meta's Llama-3.1-Instruct model with 70 billion parameters for chat use cases. | 131072 | | | |
| Llama3-Instruct (8B) | meta-llama-3-8b-instruct | Meta's Llama3-Instruct model with 8 billion parameters for chat use cases. | 8192 | | | |
| Llama3-Instruct (70B) | meta-llama-3-70b-instruct | Meta's Llama3-Instruct model with 70 billion parameters for chat use cases. | 8192 | | | |
| Mistral Instruct v0.3 (7B) | mistral-7b-instruct | Mistral's Instruct v0.3 model with 7 billion parameters for chat and coding use cases. | 32768 | | | |
| Mixtral Instruct (8x7B) | mixtral-8x7b-instruct | Mistral's Mixtral Instruct model with 8x7 billion parameters for chat and coding use cases. | 32768 | | | |
| Nous Hermes 2 Mixtral DPO (8x7B) | nous-hermes-2-mixtral-8x7b-dpo | Nous Research's Hermes 2 Mixtral DPO model with 8x7 billion parameters for content moderation. | 32768 | | | |
| Mixtral Instruct (8x22B) | mixtral-8x22b-instruct | Mistral's Mixtral Instruct model with 8x22 billion parameters for chat and coding use cases. | 65536 | | | |
| WizardLM-2 (8x22B) | wizardlm-2-8x22b | Microsoft's WizardLM-2 model with 8x22 billion parameters for chat and coding use cases. | 65536 | | | |
| Llama Guard 2 | llamaguard-2-7b | Meta's Llama Guard 2 model with 7 billion parameters for content moderation. | 4096 | | | |

Perplexity AI

| Model Name | Model ID | Description | Max Tokens | Supports Images | Supports JSON Schema | Supports Function Calls |
| --- | --- | --- | --- | --- | --- | --- |
| Llama-3.1-Sonar-Small (8B) | llama-3.1-sonar-small-128k-online | Meta's Llama-3.1-Sonar-Small model with 8 billion parameters for chat use cases. | 127072 | | | |
| Llama-3.1-Sonar-Large (70B) | llama-3.1-sonar-large-128k-online | Meta's Llama-3.1-Sonar-Large model with 70 billion parameters for chat use cases. | 127072 | | | |
| Llama-3.1-Sonar-Huge (405B) | llama-3.1-sonar-huge-128k-online | Meta's Llama-3.1-Sonar-Huge model with 405 billion parameters for chat use cases. | 127072 | | | |
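
One way to use the Max Tokens column from the language model tables is as a quick pre-flight check before sending a request. The sketch below is only illustrative: the `MAX_TOKENS` dict and both helper functions are hypothetical names, the values are copied from a few rows above, and it assumes Max Tokens is a combined budget for prompt and output, which this page does not state explicitly.

```python
# Hypothetical lookup table; limits copied from the Max Tokens column above.
MAX_TOKENS = {
    "gpt-4o": 128000,
    "gpt-4": 8192,
    "claude-3-5-sonnet-20240620": 200000,
    "llama3-70b-8192": 8192,
    "mixtral-8x7b-32768": 32768,
}


def fits(model_id: str, prompt_tokens: int, reserved_for_output: int = 0) -> bool:
    """Return True if prompt plus reserved output tokens stay within the limit.

    Assumption: Max Tokens covers prompt and output together.
    """
    return prompt_tokens + reserved_for_output <= MAX_TOKENS[model_id]


def models_that_fit(prompt_tokens: int, reserved_for_output: int = 0) -> list[str]:
    """List the known model IDs (sorted) whose limit accommodates the request."""
    return sorted(
        model_id
        for model_id in MAX_TOKENS
        if fits(model_id, prompt_tokens, reserved_for_output)
    )


if __name__ == "__main__":
    print(models_that_fit(10000, reserved_for_output=1000))
```

Token counts depend on each provider's tokenizer, so `prompt_tokens` would need to come from whatever counting method matches the chosen model.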

Embedding Models

OpenAI

| Model Name | Model ID | Description | Max Tokens | Max Output Dimensions | Supports Reduced Dimensions |
| --- | --- | --- | --- | --- | --- |
| Text Embedding Ada 002 | text-embedding-ada-002 | Text Embedding Ada 002 | 8191 | 1536 | |
| Text Embedding 3 Small | text-embedding-3-small | Increased performance over 2nd generation ada embedding model | 8191 | 1536 | |
| Text Embedding 3 Large | text-embedding-3-large | Most capable embedding model for both English and non-English tasks | 8191 | 3072 | |

Cohere

| Model Name | Model ID | Description | Max Tokens | Max Output Dimensions | Supports Reduced Dimensions |
| --- | --- | --- | --- | --- | --- |
| Embed English v3.0 | embed-english-v3.0 | A model that allows for text to be classified or turned into embeddings. English only. | 512 | 1024 | |
| Embed English Light v3.0 | embed-english-light-v3.0 | A smaller, faster version of embed-english-v3.0. Almost as capable, but a lot faster. English only. | 512 | 384 | |
| Embed English v2.0 | embed-english-v2.0 | Our older embeddings model that allows for text to be classified or turned into embeddings. English only. | 512 | 4096 | |
| Embed English Light v2.0 | embed-english-light-v2.0 | A smaller, faster version of embed-english-v2.0. Almost as capable, but a lot faster. English only. | 512 | 1024 | |
| Embed Multilingual v3.0 | embed-multilingual-v3.0 | Provides multilingual classification and embedding support. See supported languages here. | 512 | 1024 | |
| Embed Multilingual Light v3.0 | embed-multilingual-light-v3.0 | A smaller, faster version of embed-multilingual-v3.0. Almost as capable, but a lot faster. Supports multiple languages. | 512 | 384 | |
| Embed Multilingual v2.0 | embed-multilingual-v2.0 | Provides multilingual classification and embedding support. See supported languages here. | 256 | 768 | |
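
When a vector store has a fixed vector size, the Max Tokens and Max Output Dimensions columns narrow down which embedding models are usable. The following is a rough sketch, not part of any provider SDK: `EmbeddingModel`, `EMBEDDING_MODELS`, and `candidates` are hypothetical names, and the rows are copied from the embedding tables on this page. Because the Supports Reduced Dimensions values are not available here, the sketch assumes each model always emits its maximum dimension.

```python
from typing import NamedTuple


class EmbeddingModel(NamedTuple):
    model_id: str
    max_tokens: int
    max_dimensions: int


# A few rows copied from the embedding tables (hypothetical structure).
EMBEDDING_MODELS = [
    EmbeddingModel("text-embedding-3-small", 8191, 1536),
    EmbeddingModel("text-embedding-3-large", 8191, 3072),
    EmbeddingModel("embed-english-v3.0", 512, 1024),
    EmbeddingModel("embed-english-light-v3.0", 512, 384),
    EmbeddingModel("embed-multilingual-v2.0", 256, 768),
]


def candidates(input_tokens: int, vector_size: int) -> list[str]:
    """Return IDs of models that accept the input length and emit vector_size.

    Assumption: no dimension reduction, so the output size must match exactly.
    """
    return [
        m.model_id
        for m in EMBEDDING_MODELS
        if m.max_tokens >= input_tokens and m.max_dimensions == vector_size
    ]


if __name__ == "__main__":
    print(candidates(input_tokens=400, vector_size=1024))
```

Inputs longer than a model's Max Tokens would have to be chunked or truncated before embedding; that step is outside this sketch.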