DeepSeek-V3.2 released

cm0002@lemmy.world · 16 days ago

DeepSeek-V3.2 released

brucethemoose@lemmy.world · 16 days ago

With sparse attention, very interesting. It seems GQA is a thing of the past.

I especially love Deepseek’s ‘public research’ aspect: they trained this and Terminus the same way, so the attention schemes are (more-or-less) directly comparable. That’s awesome.

GLM 4.6 is reportedly about to drop too. Which is great, as 4.5 is without a doubt my daily driver now.

BroBot9000@lemmy.world · 16 days ago

New version of the propaganda machine dropped 🤦‍♂️

brucethemoose@lemmy.world · 16 days ago

Deepseek is only bad via the chat app, and whatever prefilter (or finetune?) they censor it with.

The model itself (via API or run locally) isn’t too bad, especially with a system prompt or completion syntax to squash refusals. Obviously there are CCP mandated gaps (which you can just add in via context), but it’s not as tankie as you’d think.

cm0002@lemmy.world · 16 days ago

Just ignore them on anything AI related, they are the polar opposite of the AI Tech Bros. Shitting on anything and everyone using AI in any form for anything

brucethemoose@lemmy.world · 16 days ago

…Or are they an LLM? I mean, the handle is BroBot, and the emojii makes me suspicious, lol.

BroBot9000@lemmy.world · 16 days ago

Yes now worship me and don’t forget to put glue in your pizza sauce.

BroBot9000@lemmy.world · edit-2 16 days ago

That’s the biggest compliment any Ai simp could give me. Thank you 😘

DeepSeek-V3.2 released

DeepSeek-V3.2 released

DeepSeek-V3.2 - a deepseek-ai Collection