Model Guide
One of our core values is to build a model-agnostic tool. This means we do not restrict you to models from a single provider; instead, you can choose which model from which provider to use. Each model has different strengths, and we encourage you to test several to find the best fit for your specific needs.
Selecting a model
- Whenever you start a new chat, you can select the model you want to work with at the top left.
- You can also change the model at the top left after a chat has started. For example, you can start with GPT-4.1 and, after three messages, switch to Claude Sonnet 4.
- You can set your personal default model in your account settings. The default for new users is GPT-4.1.
Selecting the right model
Below is an overview of which models do exceptionally well at which use cases, followed by our personal recommendations based on current user feedback.
Model | Provider | Strengths | Token Limits (Context Window and Output Limit) | Notes |
---|---|---|---|---|
GPT-4.1 | OpenAI | Content Generation & Creative Writing, Strong Analytical Skills, high proficiency in logical reasoning and problem-solving | Input: 1,047,576 Output: 32,768 | OpenAI’s refined flagship, GPT-4.1, features a much larger context window, improved accuracy, and multimodal abilities. It excels in creative, analytical, and reasoning tasks, making it a versatile all-rounder. |
GPT-4.1 mini | OpenAI | Smaller, faster version of GPT-4.1. Great for everyday tasks with significantly faster responses. | Input: 1,047,576 Output: 32,768 | GPT-4.1 mini is a streamlined variant of OpenAI’s GPT-4.1 model, delivering fast, efficient responses while maintaining robust language understanding. It is ideal for users who value the advantages of the latest architecture but prefer lower latency and reduced resource usage. |
GPT-4.1 nano | OpenAI | Fast and efficient content generation, quick interactive responses, and solid performance on everyday tasks | Input: 1,047,576 Output: 32,768 | GPT-4.1 nano is a lightweight and fast variant ideal for real-time assistants and high-volume use. Optimised for speed and lower resource use, it trades some reasoning depth for quick, reliable everyday performance. |
o4 mini | OpenAI | Excels at visual tasks, optimized for fast, effective reasoning. | Input: 200,000 Output: 100,000 | OpenAI’s latest o-series small model, o4-mini, excels at fast, effective reasoning and delivers highly efficient coding and visual performance. It’s optimised for speed and capability in a compact form. |
o3 | OpenAI | Advanced technical writing & instruction-following, outstanding math, science, and coding skills, excels at multi-step reasoning across text, code. | Input: 200,000 Output: 100,000 | o3 is a well-rounded and capable model that performs strongly across a range of domains. It sets a new standard for mathematics, science, coding, and visual reasoning, and also excels in technical writing and instruction-following. |
Claude Sonnet 4 | Anthropic | Excels at complex coding, creative writing, image analysis, and translation, with a dual-mode system for both quick answers and in-depth reasoning | Input: 200,000 Output: 64,000 | Anthropic’s top model, Claude Sonnet 4, improves on Claude 3.7 with a larger context window, smoother mode switching, better coding, and deeper reasoning. Maintains strong safety and alignment. |
Gemini 2.5 Flash | Google | Excels at rapid, real-time content generation and robust image analysis, handling long documents and datasets with ease. | Input: 1,048,576 Output: 65,535 | Google’s fastest Gemini Flash model, with a large context window and double the output length. Excels at long-document tasks and multi-modal analysis, delivering results nearly twice as fast as its predecessor. |
Gemini 2.5 Pro | Google | Excels at fast, real-time content and image analysis with ultra-long context for handling large documents. | Input: 1,048,576 Output: 65,535 | Google’s flagship Gemini 2.5 Pro model offers large context support and excels at complex tasks and code. It prioritises nuanced reasoning and depth over speed, outperforming previous Gemini models in accuracy and analysis. |
Mistral Large 2411 | Mistral | Software Development & Coding, Multilingual Capabilities | Input: 131,000 Output: 4,096 | Mistral’s top model with excellent reasoning capabilities. Strong performance on sophisticated conversations and complex problem-solving. More refined than previous versions with better instruction following. |
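To make the “Token Limits” column concrete, here is a minimal sketch of how you might check whether a prompt is likely to fit a model’s context window. The limits come from the table above; the model identifiers and the 4-characters-per-token heuristic are illustrative assumptions, not exact tokenizer behaviour (real token counts vary by model and language).

```python
# Context-window limits from the table above (illustrative identifiers,
# not official API model strings).
MODEL_LIMITS = {
    "gpt-4.1": {"input": 1_047_576, "output": 32_768},
    "o3": {"input": 200_000, "output": 100_000},
    "claude-sonnet-4": {"input": 200_000, "output": 64_000},
    "gemini-2.5-pro": {"input": 1_048_576, "output": 65_535},
    "mistral-large-2411": {"input": 131_000, "output": 4_096},
}

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_context(model: str, prompt: str) -> bool:
    """Return True if the estimated prompt size fits the model's input limit."""
    return estimate_tokens(prompt) <= MODEL_LIMITS[model]["input"]
```

For example, a prompt of around 600,000 estimated tokens would fit GPT-4.1 or Gemini 2.5 Pro but not o3 or Mistral Large 2411, which is why the long-context models are the better choice for very large documents.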
Our Recommendations
Our Default for Everyday Tasks: GPT-4.1 (OpenAI)
GPT-4.1 from OpenAI is our top recommendation for a versatile standard model, excelling in a wide range of tasks with its exceptional multimodal capabilities. This flagship model is ideal for users who need a powerful all-rounder, offering excellent performance in content generation, creative writing, image analysis, and multilingual support. GPT-4.1 stands out for its ability to balance performance and speed while maintaining strong creative writing and reasoning skills.
A Hybrid Model for Coding and Writing: Claude Sonnet 4 (Anthropic)
Claude Sonnet 4 from Anthropic is our top recommendation for coding or text generation. Many of our users prefer Claude over GPT-4.1 for writing their software code, emails, texts, translations etc. This hybrid model is ideal if you need both quick answers and in-depth analysis, offering a standard mode for efficient responses and an extended thinking mode for complex tasks. Claude Sonnet 4 offers amazing instruction-following abilities, minimal hallucination, and improved coding capabilities. We recommend it for its consistently high user satisfaction in language-related and software development tasks.
The Specialized Model for Complex Reasoning: o4 mini (OpenAI)
o4 mini from OpenAI is our top recommendation for complex analytical tasks requiring maximum precision. This enhanced variant excels at mathematical, scientific, and programming challenges through its elevated “reasoning effort” architecture, which breaks down intricate problems with exceptional clarity.
The key technical advantage is its ability to process up to 200,000 tokens with substantial output capacity, enabling comprehensive analyses without losing coherence. While this deeper reasoning adds modest latency, the accuracy gains make it ideal for professional and enterprise applications.
o4 mini also incorporates robust safety mechanisms to ensure factual and ethical outputs, all while maintaining competitive costs compared to legacy large language models. For teams needing technical sophistication with operational efficiency, it’s the premier choice in OpenAI’s specialized reasoning portfolio.
Legacy models
The following models are still available. However, we recommend using their newer versions.
For context: the older model versions have limitations in performance and accuracy compared to their newer counterparts. The newer versions include significant improvements in response quality, speed, and safety features that we’ve developed based on user feedback and technical advances.
You can continue using the older versions if needed for compatibility reasons, but we’d recommend migrating to the newer versions when possible for the best experience.
Model | Provider | Strengths | Token Limits (Context Window and Output Limit) | Notes |
---|---|---|---|---|
GPT-4o | OpenAI | Content Generation & Creative Writing, Data Analysis (Excel & CSV files), Image Analysis, Translation and Multilingual Support | Input: 128,000 Output: 16,384 | OpenAI’s previous flagship model with good multimodal capabilities. Balanced performance and speed with creative writing and reasoning skills. |
GPT-4o Mini | OpenAI | Speed and Efficiency, Image Analysis | Input: 128,000 Output: 16,384 | Faster and more cost-effective than GPT-4o but with reduced capabilities for complex tasks. Ideal for quick responses and lightweight applications where speed matters more than depth. |
o1 | OpenAI | Excels at real-time reasoning, planning, content creation, coding, and coherence in extended conversations. | Input: 200,000 Output: 100,000 | o1, an optimised OpenAI reasoning model, delivers good dialogue management and balances efficiency with advanced reasoning. |
o3 Mini High | OpenAI | Enhanced accuracy and depth of response while still benefiting from an optimized, efficient architecture | Input: 200,000 Output: 100,000 | Upgraded o3 Mini for deeper, more powerful outputs. Handles longer, more detailed tasks while staying efficient. |
LLaMA 3.3 70B | Meta | Speed and Efficiency, Multilingual Capabilities | Input: 128,000 Output: 2,048 | Meta’s open-source model with competitive performance against proprietary alternatives. Significant improvement over previous Llama models with better reasoning and instruction following. |
DeepSeek R1 32B | DeepSeek | Software Development & Coding, Hallucination Resistance | Input: 128,000 Output: 8,000 | As the first competitive model from China, DeepSeek R1 is particularly impressive, considering it was built with significantly fewer resources than competitors like OpenAI. Optimized for quality of reasoning rather than speed, it demonstrates exceptional performance on benchmarks like math and coding challenges. |
Claude 3.7 Sonnet | Anthropic | Software Development & Coding, Content Generation & Creative Writing, Image Analysis, Translation and Multilingual Support | Input: 200,000 Output: 8,192 | Anthropic’s previous flagship model with unique dual-mode capability. Features standard mode for quick answers and extended thinking mode for complex reasoning tasks. Significantly improved coding capabilities over previous Claude models. |
Gemini 2.0 Flash | Google | Long-Context Analysis & Document Processing, Image Analysis, Speed and Efficiency | Input: 1,048,576 Output: 8,192 | Google’s previous flagship high-speed model with a large context window. Outperforms the previous Gemini 1.5 Pro model on key benchmarks while being twice as fast. |
Gemini 1.5 Pro | Google | Long-Context Analysis & Document Processing, Image Analysis | Input: 2,097,152 Output: 8,192 | Google’s older model with a 2M-token context window. Strong multimodal capabilities and good for long-context applications. |