Gemini 2.5 excels at refactoring code and higher-level projects
On Tuesday, Google announced Gemini 2.5, a reasoning model that is optimized for coding and has a 1 million token context window.
Cursor made it available today, so we decided to give it a shot.
We saw that for another Cursor user, Gemini 2.5 was able to solve a problem that Sonnet had gotten stuck on:
Gemini 2.5 just solved a problem that Sonnet got stuck on. It's really good at understanding the bigger picture and making architectural decisions. pic.twitter.com/WcbbBnGFxg
— Ian Nuttall (@iannuttall) March 28, 2025
Stringme has had this issue where the local API was essentially recreated to be compatible with Vercel functions, which have certain idiosyncrasies. Sonnet kept struggling to consolidate these files and clean up the codebase overall.
So I selected Gemini 2.5 (Pro Max) and gave it a basic prompt to this effect.
Side note: I used stringme.dev to get pastable plain text of the Vercel function docs for the Cursor agent prompt.
Gemini 2.5 responded with follow-up questions, asking me to point it to where to look in the codebase. Once directed, it asked a few more clarifying and orienting questions before calling any tools.
Once it got going, it reached several "Decision Points," where it presented me with several courses of action.
I suspect Gemini 2.5's emphasis on table-setting and greater deference to the user reflects Google's general ethic of caution.
This specific project involved consolidating a lot of the API functionality to use the more Vercel-specific implementation, which included deleting a lot of files. Based on my experience of Gemini 2.5 being quite cautious and deliberative before taking action, I suspect it's the best model for this kind of refactoring / cleanup project.
Gemini 2.5's text generation felt slower than Sonnet (3.7) has in the past (though Gemini may be getting a lot of traffic right now). When Cursor isn't experiencing peak traffic, Sonnet's responses flow super quickly.
It's like you want Sonnet (3.7) to be your writer, generating files quickly, and then Gemini (2.5) is your editor or technical architect.
It looks like Cline is building in this kind of workflow:
Introducing model switching in Cline. Use different models for different tasks - Sonnet for quick generation, Gemini for architecture decisions. pic.twitter.com/GgT5AYhSIq
— Cline (@cline) March 28, 2025
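To make the writer/editor split concrete, here's a rough sketch of routing tasks to models by task type. This is not how Cline or Cursor actually implement model switching. The task categories, the `pick_model` helper, and the model labels are all made up for illustration, and the labels are not real API model identifiers.

```python
from enum import Enum


class TaskKind(Enum):
    GENERATE = "generate"    # drafting new files quickly
    REFACTOR = "refactor"    # consolidation, cleanup, deleting files
    ARCHITECT = "architect"  # higher-level design decisions


# Illustrative labels only (assumption): not real API model identifiers.
FAST_WRITER = "sonnet-3.7"
CAREFUL_EDITOR = "gemini-2.5-pro"

# Hypothetical routing table: fast model writes, cautious model edits.
ROUTES = {
    TaskKind.GENERATE: FAST_WRITER,
    TaskKind.REFACTOR: CAREFUL_EDITOR,
    TaskKind.ARCHITECT: CAREFUL_EDITOR,
}


def pick_model(kind: TaskKind) -> str:
    """Return the model label to use for a given kind of task."""
    return ROUTES[kind]
```

With this mapping, `pick_model(TaskKind.REFACTOR)` returns the Gemini label, so a cleanup job like the one above would go to the more deliberative model, while plain file generation stays with Sonnet.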
Finally, while Cursor always prompts you to start a new chat after a certain number of exchanges, I found that in a very long chat with Gemini 2.5 (Pro Max), it kept going no problem. This may have something to do with its stated 1-million token context window.