sunilkumardash9

joined 4 months ago
[–] sunilkumardash9@lemmy.world -1 points 1 week ago (1 children)

You need to try it to see.

[–] sunilkumardash9@lemmy.world 0 points 1 week ago

TL;DR

  • Claude Opus 4 leads in raw performance and prompt adherence.
  • It understands user intentions better, reminiscent of 3.6 Sonnet.
  • High taste: the generated outputs are tasteful, and it retains the Opus 3 personality to an extent.
  • Though unrelated to code, the model is pleasant to talk to; I never enjoyed conversing with Gemini or o3.
  • Gemini 2.5 is cheaper and consumes fewer API credits than Opus.
  • Its one-million-token context is unmatched for large-codebase understanding.
  • Opus is the slowest in time to first token. You have to be patient with the thinking mode.
[–] sunilkumardash9@lemmy.world 0 points 2 months ago* (last edited 2 months ago) (1 children)

Have you tried doing the same?

 

Google has finally arrived

Some observations on the model

  • Gemini 2.5 Pro is absolutely a beast at coding, perhaps the best model right now
  • They spent all their compute on training it with coding data and forgot to give it a distinct personality
  • Doesn't reason as well as Grok 3 (Think) and Claude 3.7 Sonnet (Thinking)
  • On par with o3-mini-high in general mathematics

If you're a coder, you'll absolutely love it; otherwise, you'll be fine with other frontier reasoning models (DeepSeek R1, if you ask me)

 

DeepSeek V3 0324 is the first open-source model to match SOTA coding performance

  • Understands user intention better than before; I'd say it's better than Claude 3.7 Sonnet base and thinking. 3.5 is still better at this (perhaps the best)
  • Again, in raw code-generation quality, it beats 3.7 and is on par with 3.5, sometimes better.
  • Great at reasoning, much better than any non-reasoning model available right now.
  • Better at instruction following than 3.7 Sonnet but below 3.5 Sonnet.