Selecting a model

  • Whenever you start a new chat, you can select the model you want to work with at the top left.
  • You can also change the model at the top left after a chat has started. For example, you can start with GPT-4.1 and, after three messages, switch to Claude Sonnet 4.
  • You can set your personal default model in your account settings. The default for new users is GPT-4.1.

Selecting the right model

Below is an overview of which models do exceptionally well at which use cases you might encounter, followed by our recommendations based on current user feedback.

| Model | Provider | Strengths | Token Limits (Context Window / Output Limit) | Notes |
| --- | --- | --- | --- | --- |
| GPT-4.1 | OpenAI | Content generation & creative writing; strong analytical skills; high proficiency in logical reasoning and problem-solving | Input: 1,047,576 / Output: 32,768 | OpenAI's refined flagship, GPT-4.1, features a much larger context window, improved accuracy, and multimodal abilities. It excels in creative, analytical, and reasoning tasks, making it a versatile all-rounder. |
| GPT-4.1 mini | OpenAI | Smaller, faster version of GPT-4.1; great for everyday tasks with significantly faster responses | Input: 1,047,576 / Output: 32,768 | GPT-4.1 mini is a streamlined variant of OpenAI's GPT-4.1 model, delivering fast, efficient responses while maintaining robust language understanding. It is ideal for users who value the advantages of the latest architecture but prefer lower latency and reduced resource usage. |
| GPT-4.1 nano | OpenAI | Fast and efficient content generation; quick interactive responses; solid performance on everyday tasks | Input: 1,047,576 / Output: 32,768 | GPT-4.1 nano is a lightweight, fast variant ideal for real-time assistants and high-volume use. Optimised for speed and lower resource use, it trades some reasoning depth for quick, reliable everyday performance. |
| o4 mini | OpenAI | Excels at visual tasks; optimised for fast, effective reasoning | Input: 200,000 / Output: 100,000 | OpenAI's latest o-series small model, o4-mini, excels at fast, effective reasoning and delivers highly efficient coding and visual performance. It is optimised for speed and capability in a compact form. |
| o3 | OpenAI | Advanced technical writing & instruction-following; outstanding math, science, and coding skills; excels at multi-step reasoning across text and code | Input: 200,000 / Output: 100,000 | o3 is a well-rounded, capable model that performs strongly across a range of domains. It sets a new standard for mathematics, science, coding, and visual reasoning, and also excels in technical writing and instruction-following. |
| Claude Sonnet 4 | Anthropic | Excels at complex coding, creative writing, image analysis, and translation, with a dual-mode system for both quick answers and in-depth reasoning | Input: 200,000 / Output: 64,000 | Anthropic's top model, Claude Sonnet 4, improves on Claude 3.7 with a larger context window, smoother mode switching, better coding, and deeper reasoning. Maintains strong safety and alignment. |
| Gemini 2.5 Flash | Google | Excels at rapid, real-time content generation and robust image analysis; handles long documents and datasets with ease | Input: 1,048,576 / Output: 65,535 | Google's fastest Gemini Flash model, with a large context window and double the output length of its predecessor. Excels at long-document tasks and multimodal analysis, delivering results nearly twice as fast as the previous version. |
| Gemini 2.5 Pro | Google | Excels at complex reasoning, coding, and image analysis, with ultra-long context for handling large documents | Input: 1,048,576 / Output: 65,535 | Google's flagship Gemini 2.5 Pro offers large context support and excels at complex tasks and code. It prioritises nuanced reasoning and depth over speed, outperforming previous Gemini models in accuracy and analysis. |
| Mistral Large 2411 | Mistral | Software development & coding; multilingual capabilities | Input: 131,000 / Output: 4,096 | Mistral's top model with excellent reasoning capabilities. Strong performance on sophisticated conversations and complex problem-solving. More refined than previous versions, with better instruction following. |
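
The token limits above define how much text a model can read in a single request (the context window) and how much it can generate in reply (the output limit). If you want to check whether a long document will fit before pasting it into a chat, a rough token count is usually enough. The following is a minimal sketch, assuming the open-source tiktoken tokenizer as a rough proxy for all of the models above; the CONTEXT_LIMITS dictionary and the fits_in_context helper are illustrative names, not part of our product.

```python
# Minimal sketch: estimate whether a document fits a model's context window.
# Assumptions: the open-source `tiktoken` package is installed (pip install tiktoken),
# the "cl100k_base" encoding is used as a rough proxy for every model in the table,
# and the limits below are copied from the table (illustrative only, not an API).
import tiktoken

CONTEXT_LIMITS = {
    "GPT-4.1": 1_047_576,
    "o4 mini": 200_000,
    "Claude Sonnet 4": 200_000,
    "Gemini 2.5 Pro": 1_048_576,
}

def fits_in_context(text: str, model: str, reserved_for_output: int = 4_000) -> bool:
    """Return True if `text` plus a reserved output budget fits the model's window."""
    enc = tiktoken.get_encoding("cl100k_base")  # rough proxy; real tokenizers differ per model
    n_tokens = len(enc.encode(text))
    return n_tokens + reserved_for_output <= CONTEXT_LIMITS[model]

if __name__ == "__main__":
    sample_document = "Quarterly report ... " * 2_000
    print(fits_in_context(sample_document, "o4 mini"))
```

As a rule of thumb, one token corresponds to roughly three-quarters of an English word, so a 200,000-token window is on the order of 150,000 words of input.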

Our Recommendations

Our Default for Everyday Tasks: GPT-4.1 (OpenAI)

GPT-4.1 from OpenAI is our top recommendation as a versatile standard model, excelling at a wide range of tasks thanks to its exceptional multimodal capabilities. This flagship model is ideal for users who need a powerful all-rounder, with excellent performance in content generation, creative writing, image analysis, and multilingual support. It stands out for balancing performance and speed while maintaining strong creative writing and reasoning skills.

A Hybrid Model for Coding and Writing: Claude Sonnet 4 (Anthropic)

Claude Sonnet 4 from Anthropic is our top recommendation for coding and text generation. Many of our users prefer Claude over GPT-4.1 for writing software code, emails, texts, translations, and similar content. This hybrid model is ideal if you need both quick answers and in-depth analysis, offering a standard mode for efficient responses and an extended thinking mode for complex tasks. Claude Sonnet 4 delivers excellent instruction following, minimal hallucination, and improved coding capabilities. We recommend it for its consistently high user satisfaction in language-related and software development tasks.

The Specialized Model for Complex Reasoning: o4 mini (OpenAI)

o4 mini from OpenAI is our top recommendation for complex analytical tasks requiring maximum precision. This reasoning-focused model excels at mathematical, scientific, and programming challenges through its elevated "reasoning effort", breaking down intricate problems with exceptional clarity.

The key technical advantage is capacity: o4 mini can take in up to 200,000 tokens of input and produce up to 100,000 tokens of output, enabling comprehensive analyses without losing coherence. While its deeper reasoning adds modest latency, the accuracy gains make it ideal for professional and enterprise applications.

o4 mini also incorporates robust safety mechanisms aimed at factual, ethical outputs, while remaining cost-competitive with legacy large language models. For teams that need technical sophistication with operational efficiency, it is the premier choice in OpenAI's specialised reasoning portfolio.
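
If you ever call o4 mini directly through OpenAI's API rather than through the chat interface, the reasoning effort mentioned above is exposed as a request parameter. The snippet below is a minimal sketch, assuming the official openai Python SDK, an OPENAI_API_KEY environment variable, and that the reasoning_effort parameter is available for this model; it is illustrative only and not a feature of our chat product.

```python
# Minimal sketch: requesting a higher "reasoning effort" level from o4-mini
# when calling OpenAI's API directly (not needed inside the chat interface).
# Assumptions: the official `openai` Python SDK is installed, OPENAI_API_KEY is set,
# and `reasoning_effort` is accepted for this model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o4-mini",
    reasoning_effort="high",  # "low" favours speed; "high" favours thoroughness
    messages=[
        {"role": "user", "content": "Prove that the sum of two even integers is even."},
    ],
)

print(response.choices[0].message.content)
```

Higher effort generally means the model spends more internal reasoning tokens, so responses are slower but more thorough; lower effort favours speed.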

Legacy models

The following models are still available and can be used; however, we recommend switching to their newer counterparts.

For context: the older model versions have limitations in performance and accuracy compared to their newer counterparts. The newer versions include significant improvements in response quality, speed, and safety features that we’ve developed based on user feedback and technical advances.

You can continue using the older versions if needed for compatibility reasons, but we’d recommend migrating to the newer versions when possible for the best experience.

| Model | Provider | Strengths | Token Limits (Context Window / Output Limit) | Notes |
| --- | --- | --- | --- | --- |
| GPT-4o | OpenAI | Content Generation & Creative Writing, Data Analysis (Excel & CSV files), Image Analysis, Translation and Multilingual Support | Input: 128,000 / Output: 16,384 | OpenAI's previous flagship model with good multimodal capabilities. Balanced performance and speed with creative writing and reasoning skills. |
| GPT-4o mini | OpenAI | Speed and Efficiency, Image Analysis | Input: 128,000 / Output: 16,384 | Faster and more cost-effective than GPT-4o, but with reduced capabilities for complex tasks. Ideal for quick responses and lightweight applications where speed matters more than depth. |
| o1 | OpenAI | Excels at real-time reasoning, planning, content creation, coding, and coherence in extended conversations | Input: 200,000 / Output: 100,000 | o1, an optimised OpenAI model bridging GPT-4 and later generations, delivers good dialogue management and balances efficiency with advanced reasoning. |
| o3 Mini High | OpenAI | Enhanced accuracy and depth of response while still benefiting from an optimised, efficient architecture | Input: 200,000 / Output: 100,000 | Upgraded o3 Mini for deeper, more powerful outputs. Handles longer, more detailed tasks while staying efficient. |
| Llama 3.3 70B | Meta | Speed and Efficiency, Multilingual Capabilities | Input: 128,000 / Output: 2,048 | Meta's open-source model with competitive performance against proprietary alternatives. A significant improvement over previous Llama models, with better reasoning and instruction following. |
| DeepSeek R1 32B | DeepSeek | Software Development & Coding, Hallucination Resistance | Input: 128,000 / Output: 8,000 | As the first competitive model from China, DeepSeek R1 is particularly impressive, considering it was built with significantly fewer resources than competitors such as OpenAI. Optimised for quality of reasoning rather than speed, it demonstrates exceptional performance on benchmarks such as math and coding challenges. |
| Claude 3.7 Sonnet | Anthropic | Software Development & Coding, Content Generation & Creative Writing, Image Analysis, Translation and Multilingual Support | Input: 200,000 / Output: 8,192 | Anthropic's previous flagship model with a unique dual-mode capability: a standard mode for quick answers and an extended thinking mode for complex reasoning tasks. Significantly improved coding capabilities over earlier Claude models. |
| Gemini 2.0 Flash | Google | Long-Context Analysis & Document Processing, Image Analysis, Speed and Efficiency | Input: 1,048,576 / Output: 8,192 | Google's previous high-speed flagship with a large context window. Outperforms the earlier Gemini 1.5 Pro model on key benchmarks while being twice as fast. |
| Gemini 1.5 Pro | Google | Long-Context Analysis & Document Processing, Image Analysis | Input: 2,097,152 / Output: 8,192 | Google's older model with a 2M-token context window. Strong multimodal capabilities and well suited to long-context applications. |