DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

victorbuilds • 2 months ago

Notable: they open-sourced the weights under Apache 2.0, unlike OpenAI and DeepMind whose IMO gold models are still proprietary.

PunchyHamster • 2 months ago

I think we should treat copyright for the weights the same way the AI companies treat source material ;)

littlestymaar • 2 months ago

We don't even have to do that: weights being entirely machine generated without human intervention, they are likely not copyrightable in the first place.

In fact, we should collectively refuse to abide by these fantasy licenses before weight copyrightability gets created out of thin air simply because it's been commonplace for long enough.

mitthrowaway2 • 2 months ago

There's an argument by which machine-learned neural network weights are a lossy compression of (as well as a smooth interpolator over) the training set.

An mp3 file is also a machine-generated lossy compression of a cd-quality .wav file, but it's clearly copyrightable.

To that extent, the main difference between a neural network and an .mp3 is that the mp3 compression cannot be used to interpolate between two copyrighted works to output something in the middle. This is, on the other hand, perhaps the most common use case for genAI, and it's actually tricky to get it to not output something "in the middle" (but also not impossible).

I think the copyright argument could really go either way here.

larodi • 2 months ago

Of course we should! And everyone who says otherwise must be delusional or some sort of gaslighter, as this whole "innovation" (or remix (or compression)) is enabled by the creative value of the source product. Given AI companies never ever respected this copyright, we should give them similar treatment.

SilverElfin • 2 months ago

If they open source just weights and not the training code and data, then it’s still proprietary.

ekianjo • 2 months ago

It's just open weights, the source has no place in this expression

mips_avatar • 2 months ago

Yeah but you can distill

littlestymaar • 2 months ago

You can distill closed weights models as well. (Just not logit-distillation)
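For readers unfamiliar with the distinction, here is a minimal sketch of what logit distillation involves, assuming the common formulation where the student is trained to match the teacher's temperature-softened output distribution via KL divergence. The function names and the temperature value are illustrative, not taken from any particular library. The point is that the loss needs the teacher's full logits, which a closed API typically does not expose; with only sampled text you can still train on the teacher's outputs, but not on its distribution.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution (numerically stable)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def logit_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Requires the teacher's logits for every vocabulary entry, which is
    why this style of distillation needs access to the weights (or to a
    full logprob output), not just generated text.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy three-token vocabulary (hypothetical numbers).
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
loss = logit_distillation_loss(teacher, student)
```

The loss is zero when the student's logits match the teacher's and positive otherwise, so minimizing it pulls the student's distribution toward the teacher's.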

mips_avatar • 2 months ago

Though it violates their terms of service

amelius • 2 months ago

Is that the equivalent of decompile?

c0balt • 2 months ago

No, that is the equivalent of lossy compression.

falcor84 • 2 months ago

Isn't that a bit like saying that if I open source a tool, but not a full compendium of all the code that I had read, which led me to develop it, then it's not really open source?

KaiserPro • 2 months ago

No, it's like releasing a binary. I can hook into it and its API and make it do other things. But I can't rebuild it from scratch.

exe34 • 2 months ago

"open source" as a verb is doing too much work here. are you proposing to release the human readable code or the object/machine code?

if it's the latter, it's not the source. it's free as in beer. not freedom.

falcor84 • 2 months ago

Yes, I 100% agree. Open Source is a lot more about not paying than about liberty.

This is exactly the tradeoff that we had made in the industry a couple of decades ago. We could have pushed all-in on Stallman's vision and the FSF's definition of Free Software, but we (collectively) decided that it's more important to get the practical benefits of having all these repos up there on GitHub and us not suing each other over copyright infringement. It's absolutely legitimate to say that we made the wrong choice, and I might agree, but a choice was made, and Open Source != Free Software.

https://www.gnu.org/philosophy/open-source-misses-the-point....

fragmede • 2 months ago

No. In that case, you're providing two things, a binary version of your tool, and the tool's source. That tool's source is available to inspect and build their own copy. However, given just the weights, we don't have the source, and can't inspect what alignment went into it. In the case of DeepSeek, we know they had to purposefully cause their model to consider Tiananmen Square something it shouldn't discuss. But without the source used to create the model, we don't know what else is lurking around inside the model.

nextaccountic • 2 months ago

No, it's like saying that if you release under Apache license, it's not open source even though it's under an open source license

For something to be open source it needs to have sources released. Sources are the things in the preferred format to be edited. So the code used for training is obviously source (people can edit the training code to change something about the released weights). Also the training data, under the same rationale: people can select which data is used for training to change the weights

nurettin • 2 months ago

Is this a troll? They don't want to reproduce your open source code, they want to reproduce the weights.

amelius • 2 months ago

True. But the headline says open weights.

very_illiterate • 2 months ago

[flagged]

jimmydoe • 2 months ago

you are absolutely right. I'd rather use true closed models, not fake open source ones from China.

yorwba • 2 months ago

Previous discussion: https://news.ycombinator.com/item?id=46072786 (218 points, 3 days ago, 48 comments)

victorbuilds • 2 months ago

Ah, missed that one. Thanks for the link.

ilmj8426 • 2 months ago

It's impressive to see how fast open-weights models are catching up in specialized domains like math and reasoning. I'm curious if anyone has tested this model for complex logic tasks in coding? Sometimes strong math performance correlates well with debugging or algorithm generation.

alansaber • 2 months ago

It makes complete sense to me: highly specific models don't have much commercial value, and at-scale LLM training favours generalism.

stingraycharles • 2 months ago

kimi-k2 is pretty decent at coding but it’s nowhere near the SOTA models of Anthropic/OpenAI/Google.

tripplyons • 2 months ago

Are you referring to the new reasoning version of Kimi K2?

stingraycharles • 2 months ago

This one => https://openrouter.ai/moonshotai/kimi-k2-thinking

WhitneyLand • 2 months ago

Shouldn’t there be a lot of skepticism here?

All the problems they claim to have solved are on the Internet, and they explicitly say they crawled them. They do not mention doing any benchmark decontamination or excluding 2024/2025 competition problems from training.

IIRC, OpenAI/Google did not have access to the 2025 problems before testing their experimental math models.

terespuwash • 2 months ago

Why isn’t OpenAI’s gold medal-winning model available to the public yet?

esafak • 2 months ago

'coz it was for advertisement. They'll roll their lessons into the next general purpose model.

letmetweakit • 2 months ago

Does anyone know if this will become available on OpenRouter?

simianwords • 2 months ago

It's worth noting that this model is not general purpose, whereas the ones Google and OpenAI used were.

yorwba • 2 months ago

Both OpenAI and Google used models made specifically for the task, not their general-purpose products.

OpenAI: https://xcancel.com/alexwei_/status/1946477756738629827#m "we are releasing GPT-5 soon, and we’re excited for you to try it. But just to be clear: the IMO gold LLM is an experimental research model. We don’t plan to release anything with this level of math capability for several months."

DeepMind: https://deepmind.google/blog/advanced-version-of-gemini-with... "we additionally trained this version of Gemini on novel reinforcement learning techniques that can leverage more multi-step reasoning, problem-solving and theorem-proving data. We also provided Gemini with access to a curated corpus of high-quality solutions to mathematics problems, and added some general hints and tips on how to approach IMO problems to its instructions."

simianwords • 2 months ago

https://x.com/sama/status/1946569252296929727

>we achieved gold medal level performance on the 2025 IMO competition with a general-purpose reasoning system! to emphasize, this is an LLM doing math and not a specific formal math system; it is part of our main push towards general intelligence.

asterisks mine

yorwba • 2 months ago

DeepSeekMath-V2 is also an LLM doing math and not a specific formal math system. What interpretation of "general purpose" were you using where one of them is "general purpose" and the other isn't?

simianwords • 2 months ago

Not true

mangolie • 2 months ago

https://x.com/deepseek_ai/status/1995452646459858977

Boom

andy12_ • 2 months ago

Do note that that is a different model. The one we are talking about here, DeepSeekMath-V2, is indeed overcooked with math RL. It's so eager to solve math problems that it even comes up with random ones if you prompt it with "Hello".

https://x.com/AlpinDale/status/1994324943559852326?s=20

yorwba • 2 months ago

That's a different model: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale

simianwords • 2 months ago

Oh you may be correct. Are these models general purpose or fine tuned for mathematics?

H8crilA • 2 months ago

How do you run this kind of a model at home? On a CPU on a machine that has about 1TB of RAM?

pixelpoet • 2 months ago

Wow, it's 690GB of downloaded data, so yeah, 1TB sounds about right. Not even my two Strix Halo machines paired can do this, damn.
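A back-of-the-envelope sketch of where a number like that comes from, assuming a parameter count of roughly 685 billion (an assumption for illustration, not a figure stated in this thread) and counting weights only. KV cache, activations, and runtime overhead come on top, so treat these as lower bounds.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory needed to hold the weights alone, in decimal GB."""
    return num_params * bytes_per_param / 1e9

# Assumed parameter count (hypothetical; check the model card for the real value).
ASSUMED_PARAMS = 685e9

# Bytes per parameter for a few common storage formats.
FORMATS = {"BF16": 2.0, "FP8": 1.0, "4-bit": 0.5}

estimates = {
    name: weight_memory_gb(ASSUMED_PARAMS, size) for name, size in FORMATS.items()
}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.0f} GB for weights alone")
```

Under that assumption, a download of about 690GB is consistent with roughly one byte per parameter, which is why 1TB of RAM is the natural target and why a 4-bit quantization is what makes smaller setups plausible.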

Gracana • 2 months ago

You can do it slowly with ik_llama.cpp, lots of RAM, and one good GPU. Also regular llama.cpp, but the ik fork has some enhancements that make this sort of thing more tolerable.

bertili • 2 months ago

Two 512GB Mac Studios connected with Thunderbolt 5.

sschueller • 2 months ago

How is OpenAI going to be able to serve ads in ChatGPT without everyone immediately jumping ship to another model?

Coffeewine • 2 months ago

I suppose the hope is that they don’t, and we wind up with commodity frontier models from multiple providers at market rates.

miroljub • 2 months ago

I don't care about OpenAI even if they don't serve ads.

I can't trust any of their output until they become honest enough to change their name to CloseAI.

astrange • 2 months ago

ChatGPT is a website. There's nothing unusual about ads on a website.

People use Instagram too.

dist-epoch • 2 months ago

The same way people stayed on Google despite DuckDuckGo existing.

PunchyHamster • 2 months ago

by having datacenters with GPUs and API everyone uses.

So they are either earning money directly or on the API calls.

Now, competition can come and compete on that, but they will probably still be the first choice for the foreseeable future.

KeplerBoy • 2 months ago

Google served ads for decades and no one ever jumped ship to another search engine.

sschueller • 2 months ago

Because Google gave the best results for a long time.

PunchyHamster • 2 months ago

and now, when they are not, everyone else's results are also pretty terrible...

bootsmann • 2 months ago

They pay $30bn (more than OpenAI's lifetime revenue) each year to make sure no one does.

KeplerBoy • 2 months ago

What are you referring to?

LZ_Khan • 2 months ago

Don't they distill directly off OpenAI/Google outputs?
