OpenAI has just introduced GPT-4.1, a new family of AI models built to be especially good at coding and following instructions. The release includes three tiers: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, all available through OpenAI’s API but not yet part of ChatGPT.
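For developers who want to try it, access goes through the usual API surface. Here’s a minimal sketch using OpenAI’s official Python SDK; the model IDs (gpt-4.1, gpt-4.1-mini, gpt-4.1-nano) match the announcement, but check the docs before relying on them.

```python
# Minimal sketch: calling GPT-4.1 through OpenAI's official Python SDK.
# Assumes the openai package is installed (pip install openai) and that
# OPENAI_API_KEY is set in your environment. Model IDs are as announced,
# but verify them against OpenAI's current documentation.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1-mini",  # swap in "gpt-4.1" or "gpt-4.1-nano" as needed
    messages=[
        {"role": "system", "content": "You are a careful coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
)
print(response.choices[0].message.content)
```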
A major part of this upgrade is the models’ expanded context window: they can now accept up to 1 million tokens, roughly 750,000 words, in a single input. That’s a big jump from earlier models and means they can work with far more complex and lengthy material. It isn’t a guarantee that the model will remember every detail; it just means it can take all of that information in.
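To get a feel for what a million tokens actually holds, you can count tokens yourself with OpenAI’s open-source tiktoken library. A rough sketch follows; it assumes the o200k_base encoding (the one GPT-4o uses) as a stand-in, since the announcement doesn’t name GPT-4.1’s tokenizer.

```python
# Rough sketch: estimating how much text fits in a 1M-token window.
# Uses OpenAI's tiktoken library (pip install tiktoken). Treating
# o200k_base (GPT-4o's encoding) as representative of GPT-4.1 is an
# assumption, not something the announcement confirms.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

text = "Long inputs like entire codebases or book-length documents."
tokens = enc.encode(text)
print(f"{len(tokens)} tokens for {len(text.split())} words")

# At roughly 0.75 words per token, 1,000,000 tokens is about 750,000 words.
print(f"~{int(1_000_000 * 0.75):,} words fit in a 1M-token window")
```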
The main goal of GPT-4.1 is to be better at coding. OpenAI fine-tuned these models based on developer feedback, focusing on things like front-end programming, reducing unnecessary changes, sticking to the requested formats, and using tools correctly. OpenAI’s own tests show that GPT-4.1 does better than older models like GPT-4o and GPT-4o mini on coding benchmarks such as SWE-bench.
On SWE-bench Verified, a human-validated subset of the benchmark, GPT-4.1 scored between 52% and 54.6%. That’s an improvement, though still a bit behind Google’s Gemini 2.5 Pro (63.8%) and Anthropic’s Claude 3.7 Sonnet (62.3%). The smaller models, GPT-4.1 mini and nano, are faster and cheaper but slightly less accurate, with GPT-4.1 nano being OpenAI’s quickest and most budget-friendly option yet.
I’ve tried to use ChatGPT for coding, and it never ended up producing what I needed. The model forgot things, and it generally wasn’t a fun experience. Speaking purely from my own experience, as someone with only a little programming background, I’d say an upgrade was badly needed.
Besides coding, GPT-4.1 is also better at understanding videos and images. In OpenAI’s own tests for video comprehension (called Video-MME), GPT-4.1 scored 72% accuracy in the “long, no subtitles” category, showing it can grasp complex visual information well.
Pricing varies across the three tiers based on performance, with nano the cheapest and the full model the most expensive. OpenAI also says GPT-4.1 is more cost-effective than its predecessor, GPT-4o.
It’s worth noting that OpenAI is phasing GPT-4.5 out of its API starting July 14, 2025. GPT-4.5 was reportedly OpenAI’s biggest model and did well in writing and persuasion, but it was very expensive to run, so OpenAI is dropping it in favor of GPT-4.1 as a more affordable alternative. However, GPT-4.5 will still be available in ChatGPT’s research preview for paying users.
The launch of GPT-4.1 and the retirement of GPT-4.5 are a good example of bigger not always being better. To my mind, the sensible approach is to have separate LLMs that handle different user needs; otherwise, a lot of time is wasted coaxing the correct output out of one giant model.
GPT-4.1’s better coding skills, lower price, and huge input capacity make it more practical for developers. The ability to process much longer texts matters especially here, since code eats up a lot of tokens; it isn’t something the average user would need.
That said, there are still challenges. OpenAI itself has admitted that GPT-4.1 becomes less reliable as inputs get longer: models have difficulty keeping track of large amounts of information and judging what is relevant and what isn’t. And while it’s good at generating code, AI-written code can still carry security risks, bugs, and other flaws, so careful testing and review are necessary.
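In practice, that means treating model output like any untrusted patch and running it through tests before shipping it. A tiny illustrative sketch, where reverse_string is a hypothetical stand-in for whatever function the model actually produced:

```python
# Illustrative only: treat model-generated code like any untrusted patch
# and run it through tests before use. reverse_string is a hypothetical
# stand-in for code that came back from GPT-4.1.
def reverse_string(s: str) -> str:
    return s[::-1]

def test_reverse_string():
    # Cover the ordinary case plus the edge cases models often miss.
    assert reverse_string("abc") == "cba"
    assert reverse_string("") == ""
    assert reverse_string("a") == "a"

if __name__ == "__main__":
    test_reverse_string()
    print("all tests passed")
```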
The GPT-4.1 models are a big improvement, especially for coding and handling video. They may not beat every competitor on every test, but the point is fit for specific uses rather than being best at everything. You can try GPT-4.1 through OpenAI’s API if you have an account.
Sources: TechCrunch, OpenAI