GPT 4.1


By OpenAI
Knowledge Cutoff: June 1, 2024

GPT 4.1 overview

Context Window: 1,047,576 tokens
Maximum Output Tokens: 32,768 tokens
Batch Responses: Yes
Is Open Sourced: No
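The two limits above interact when you plan a long-context request. The following is a minimal sketch of a budget check, assuming that prompt tokens and generated tokens share the same context window; the function name `check_budget` is illustrative and not part of any SDK.

```python
CONTEXT_WINDOW = 1_047_576    # tokens, from the overview above
MAX_OUTPUT_TOKENS = 32_768    # tokens, from the overview above


def check_budget(prompt_tokens: int, max_output_tokens: int = MAX_OUTPUT_TOKENS) -> int:
    """Return the unused token headroom, or raise ValueError if the request cannot fit.

    Assumption: the prompt and the requested output both count against
    the context window.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("requested output exceeds the maximum output tokens")
    headroom = CONTEXT_WINDOW - prompt_tokens - max_output_tokens
    if headroom < 0:
        raise ValueError("prompt plus requested output exceeds the context window")
    return headroom


# A 1,000,000-token prompt still leaves room for a full-length response.
print(check_budget(1_000_000))  # 14808
```

Token counts here are inputs you supply yourself; how you estimate them (for example, with a tokenizer library) is outside this sketch.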

GPT 4.1

The main base model: the full model from which the mini and nano versions are derived.

It is the most capable version, intended for general use and for complex tasks where accuracy and reasoning are critical.
It typically has higher computational requirements and cost than the mini and nano variants.

GPT 4.1 Mini

Designed as a high-performance small model that matches or exceeds GPT-4o on many benchmarks.

It cuts latency roughly in half and cost by 83% relative to GPT-4o.
It suits applications that need strong performance under tighter resource or budget constraints.

GPT 4.1 Nano

The smallest, fastest, and cheapest model in the 4.1 lineup.

It is ideal for ultra-low-latency applications such as text classification, autocompletion, or edge deployments.
It delivers impressive performance for its size, though it is weaker than mini or the standard model at complex reasoning.

What is GPT 4.1?

Text Input and Output
Image Input Only
Streaming
Function Calling
Fine Tuning
Distillation
Predicted Outputs
Context Caching

You can now access three new models through the API: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano.

These models surpass the performance of GPT-4o and GPT-4o mini in coding, instruction following, and long-context processing. Each model supports a context window of up to 1 million tokens and includes updated knowledge through June 2024.

Performance Across Key Areas

Coding

GPT-4.1 achieves a 54.6% score on the SWE-bench Verified benchmark, an improvement of 21.4 percentage points over GPT-4o and 26.6 percentage points over GPT-4.5. You can rely on it for generating accurate code patches and diffs across programming languages. It doubles GPT-4o’s score on Aider’s polyglot diff benchmark, indicating more precise edits in diff format.

Windsurf reports a 60% improvement on their internal coding benchmarks, with 30% more efficient tool calling and 50% fewer unnecessary edits. Qodo confirms GPT-4.1 outperforms competitors in 55% of GitHub pull request reviews, excelling in precision and analysis.

For frontend development, GPT-4.1 produces functional and visually appealing web applications. Human graders prefer its output over GPT-4o’s in 80% of comparisons, as demonstrated in a flashcard web app with dynamic search and smooth animations.

Instruction Following

GPT-4.1 scores 87.4% on IFEval, compared to 81.0% for GPT-4o, and 38.3% on the MultiChallenge benchmark, an improvement of 10.5 percentage points over GPT-4o. You can expect reliable adherence to formats, negative instructions, and multi-turn conversation coherence.

Blue J notes a 53% accuracy increase on complex tax scenarios. Hex reports a 2x improvement in challenging SQL evaluations, highlighting GPT-4.1’s ability to handle ambiguous schemas and nuanced instructions.

Long-Context Processing

With a 1 million-token context window, GPT-4.1 processes large codebases or documents efficiently. It scores 72.0% on Video-MME’s long, no-subtitles category, an improvement of 6.7 percentage points over GPT-4o. Its needle-in-a-haystack accuracy remains consistent across all context lengths, and it excels in the OpenAI-MRCR evaluation for multi-round coreference tasks.

Carlyle achieves 50% better retrieval performance from complex, lengthy documents, overcoming limitations like lost-in-the-middle errors.

Vision

GPT-4.1 mini outperforms GPT-4o on several vision benchmarks, enabling you to process images effectively for tasks like chart analysis or visual data interpretation.

Cost and Accessibility

You benefit from lower costs with GPT-4.1, which is 26% less expensive than GPT-4o for median queries. GPT-4.1 nano offers the lowest cost at $0.10 per 1M input tokens. Prompt caching provides a 75% discount, and long-context requests incur no additional fees beyond standard per-token rates.

Model          Input ($/1M tokens)   Cached Input ($/1M tokens)   Output ($/1M tokens)   Blended Pricing ($/1M tokens)
GPT-4.1        $2.00                 $0.50                        $8.00                  $1.84
GPT-4.1 mini   $0.40                 $0.10                        $1.60                  $0.42
GPT-4.1 nano   $0.10                 $0.025                       $0.40                  $0.12

You can access these models through the Batch API at a 50% discount for large-scale applications.
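To see how these rates combine for a given workload, you can work through the arithmetic in a few lines. The sketch below uses the per-token prices from the pricing table above. The helper `estimate_cost` and the dictionary keys are illustrative names, and it assumes the Batch API discount applies as a flat 50% on the total, which you should confirm against the current pricing page.

```python
# Prices in USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "gpt-4.1":      {"input": 2.00, "cached_input": 0.50,  "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "cached_input": 0.10,  "output": 1.60},
    "gpt-4.1-nano": {"input": 0.10, "cached_input": 0.025, "output": 0.40},
}


def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0, batch=False):
    """Estimate the USD cost of one request.

    cached_tokens is the portion of input_tokens served from the prompt cache.
    Assumption: the batch discount is a flat 50% on the whole request.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[model]
    uncached_tokens = input_tokens - cached_tokens
    cost = (
        uncached_tokens * price["input"]
        + cached_tokens * price["cached_input"]
        + output_tokens * price["output"]
    ) / 1_000_000
    if batch:
        cost *= 0.5
    return cost


# 200k input tokens (150k of them cached) and 5k output tokens on GPT-4.1 mini.
print(f"${estimate_cost('gpt-4.1-mini', 200_000, 5_000, cached_tokens=150_000):.4f}")  # $0.0430
```

The blended prices in the table are not reproduced here, because they depend on an input/output/cache mix that the table does not state.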

Building AI Agents

GPT-4.1’s enhanced instruction following and long-context processing enable you to develop reliable AI agents. These agents handle complex tasks, such as software engineering, document analysis, and customer support, with minimal oversight when paired with tools like the Responses API.
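A core piece of such an agent is the step that runs the tools the model asks for and hands the results back. The sketch below shows only that dispatch step, offline and without any API call. The item shapes (`function_call` with `name`, `arguments`, and `call_id`, answered by `function_call_output`) reflect an assumed reading of the Responses API function-calling format and should be checked against the API reference. The `lookup_order` tool is hypothetical.

```python
import json


def lookup_order(order_id: str) -> dict:
    """Hypothetical local tool; a real agent would query an order system here."""
    return {"order_id": order_id, "status": "shipped"}


# Registry mapping tool names (as the model would see them) to local callables.
TOOLS = {"lookup_order": lookup_order}


def run_tool_calls(output_items):
    """Execute every function_call item and build the matching output items.

    Assumed item shape: {"type": "function_call", "name": ..., "arguments":
    "<JSON string>", "call_id": ...}. Other item types are ignored.
    """
    results = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue  # skip plain messages and any other item types
        tool = TOOLS.get(item["name"])
        if tool is None:
            payload = {"error": f"unknown tool: {item['name']}"}
        else:
            arguments = json.loads(item["arguments"])  # arguments arrive as a JSON string
            payload = tool(**arguments)
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],
            "output": json.dumps(payload),
        })
    return results
```

In a full agent loop, you would send the returned items back to the model as the next input and repeat until it stops requesting tools.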

Transition from GPT-4.5 Preview

GPT-4.1 replaces GPT-4.5 Preview, offering superior performance and cost-efficiency. You have until July 14, 2025, to transition, as GPT-4.5 Preview will be deprecated. GPT-4.1 retains the creativity and nuance of GPT-4.5 while improving scalability.

Conclusion

You can leverage GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano to build advanced applications with improved coding, instruction following, and long-context capabilities. These models provide cost-effective solutions for diverse use cases. Visit the API documentation to start integrating GPT-4.1 into your projects.
