
How AI assistance impacts the formation of coding skills

481 points | 8 days ago | anthropic.com
siliconc0w8 days ago

Good for them to design and publish this - I doubt you'd see anything like this from the other labs.

The loss of competency seems pretty obvious, but it's good to have data. What is also interesting to me is that the AI-assisted group accomplished the task a bit faster, but the difference wasn't statistically significant. That seems to align with other findings that AI can make you 'feel' like you're working faster while that perception isn't always matched by reality. So you're trading learning and eroding competency for a productivity boost which isn't always there.

shimman8 days ago

It's research from a company that stands to gain from selling the very tools it researched. Why does it have to be repeated that this is a massive conflict of interest? Until this "research" has been verified multiple times by parties with zero conflict of interest, it's best to be highly skeptical of anything it claims.

This is up there with believing tobacco companies health "research" from the 30s, 40s, 50s, 60s, 70s, 80s, and 90s.

keeda7 days ago

I mean, they're literally pointing out the negative effects of AI-assisted coding?

> We found that using AI assistance led to a statistically significant decrease in mastery. On a quiz that covered concepts they’d used just a few minutes before, participants in the AI group scored 17% lower than those who coded by hand, or the equivalent of nearly two letter grades. Using AI sped up the task slightly, but this didn’t reach the threshold of statistical significance.

This also echoes other research from a few years ago that had similar findings: https://news.ycombinator.com/item?id=46822158

shimman7 days ago

Dude, you falling for such obvious corpo-psyops is so sad. Tobacco companies literally published research that said cigarettes were dangerous too; that didn't stop them from lying to Congress and saying cigarettes were totally safe.

Some of you are the reason why there needs to be a new luddite movement. (Fun fact: the luddites were completely correct in their movement. They fought against oppressive factory owners who treated their fellow humans terribly, smashing the very same machines they themselves used. Entrepreneurs were literally ushering in a new hell on Earth, where their factories were killing so many orphans (many people refused to work in such places at first, until forced to by the choice of dying in the streets or dying from their labor there) that they had to ship the bodies of children across towns to avoid drawing suspicion. Until the entrepreneurs started killing them and convinced the prince regent to turn the state against them, they had massive support. Support so high that when suspected luddites were escaping from the "police", you could hear entire towns cheering them on and helping them escape.)

People rightfully hate this stuff and you refuse to see it. The evidence says it's terrible, but hey, let's still sell it anyway; what's the worst that can happen?

keeda6 days ago

Well, this is what Anthropic's CEO told Congress in 2023; the message was not quite "AI is just peachy": https://www.judiciary.senate.gov/imo/media/doc/2023-07-26_-_...

Or here's his more recent statements on the potential disruption from AI: https://www.cnbc.com/2026/01/27/dario-amodei-warns-ai-cause-...

Anthropic is pretty much the only major frontier AI lab that keeps saying "AI is dangerous, we should proceed with caution." It sounds like you're in violent agreement.

If your stance is AI development should not be continued at all, well, the history of Luddites should tell you what happens when an economic force meets labor concerns in a Capitalistic world.

The genie is out of the bottle and there's no putting it back. Our only choices now are to figure out how to tame it, or YOLO it and FAFO.

godelski7 days ago

  > this is a massive conflict of interests
I think everyone is aware of this.

But people like that they aren't shying away from negative results and that builds some trust. Though let's not ignore that they're still suggesting AI + manual coding.

But honestly, this sample size is so small that we need larger studies. The results around what is effective and ineffective AI usage are a complete wash with n<8.

Also anyone else feel the paper is a bit sloppy?

I mean, there's a bunch of minor things, but Figure 17 (the first figure in the appendix) is just kinda wild; there are trivial ways to fix the glaring error. The more carefully you look at even just the figures in the paper, the more you say "who the fuck wrote this?" I mean, how the fuck do you even generate Figure 12? The numbers align with the grids but the boxes are shifted. And Figure 16 has experience levels shuffled for some reason. And there's a hell of a lot more confusing stuff you'll see if you do more than a glance...

brookst8 days ago

I wish they had attempted to measure product management skill.

My hypothesis is that the AI users gained less in coding skill, but improved in spec/requirement writing skills.

But there’s no data, so it’s just my speculation. Intuitively, I think AI is shifting entry level programmers to focus on expressing requirements clearly, which may not be all that bad of a thing.

SJMG8 days ago

> I wish they had attempted to measure product management skill.

We're definitely getting better at writing specs. The issue is the labor bottleneck is competent senior engineers, not juniors, not PMs, not box-and-arrow staff engineers.

> I think AI is shifting entry level programmers to focus on expressing requirements clearly

This is what the TDD advocates were saying years ago.

empath758 days ago

What AI development has done for my team is the following:

Dramatically improved Jira usage -- better, more descriptive tickets with actionable user stories and clearly expressed requirements. Dramatically improved github PRs. Dramatically improved test coverage. Dramatically improved documentation, not just in code but in comments.

Basically all _for free_, while at the same time probably doubling or tripling our pace at closing issues, including some issues in our backlog that had lingered for months because they were annoying and nobody felt like working on them, but were easy for claude to knock out.

WD-428 days ago

I'd be willing to bet that your AI written issues, docs, etc look impressive initially but are extremely low signal to noise. You might be checking some boxes (docstrings, etc) but I do not envy anyone on your team that needs to actually read any of that stuff in the future to solve an actual problem.

thunky6 days ago

Right because developers are famous for their 100% perfect hand-crafted docs.

theshrike795 days ago

I keep describing this as the environmental protection meme, "but what if we make the world a better place - for nothing!"

Even if AI goes away tomorrow, we'll still have better tooling, documentation and processes just because we HAD to implement them to use AIs more efficiently.

Jensson8 days ago

> Dramatically improved Jira usage -- better, more descriptive tickets with actionable user stories and clearly expressed requirements. Dramatically improved github PRs. Dramatically improved test coverage. Dramatically improved documentation, not just in code but in comments.

> Basically all _for free_

Not for free, the cost is that all of those are now written by AI so not really vetted any longer. Or do you really think your team is just using AI for code?

AstroBen8 days ago

Interestingly if you look at the breakdown by years of experience, it shows the 1-3 year junior group being faster, 4+ years no difference

I wonder if we're going to have a future where the juniors never gain the skills and experience to work well by themselves, and instead become entirely reliant on AI, assuming that's the only way

pesus8 days ago

I think we're going to see a small minority of juniors who managed to ignore the hype/peer pressure/easy path and actually learned to code have a huge advantage over the others.

DrewADesign7 days ago

Which isn’t saying much if efficiency gains tank the demand for developers, which will then tank everybody’s salary. The actual efficiency gains are debatable, but even if we’re talking about a 20% gain, that could be a few FTEs for a small team.

cal_dent8 days ago

Anthropic's way into regulatory capture seems to be to pretend they're the benevolent adults in the room. It'll probably work too.

austin-cheney8 days ago

I agree with the Ray Dalio perspective on this. AI is not a creative force; it is only a different form of automation. So the only value of AI is that it gets to know your habits. As an example, have it write test cases in your code style so you don't have to. That is it.

If you sucked before using AI you are going to suck with AI. The compounded problem there is that you won't see just how bad you suck at what you do, because AI will obscure your perspective through its output, like an echo chamber of stupid. You are just going to suck much faster and feel better about it. Think of it as steroids for Dunning-Kruger.

https://www.youtube.com/shorts/0LeJ6xn35gc

https://www.youtube.com/shorts/vXecG_KajLI

psyclobe7 days ago

> If you sucked before using AI you are going to suck with AI

This.

whattheheckheck7 days ago

It's 2026 and people still post this. Instead of an upvote?

epolanski8 days ago

> The loss of competency seems pretty obvious but it's good to have data

That's not what the study says. It says that most users reflect your statement while there is a smaller % that benefits and learns more and faster.

Generalizations are extremely dangerous.

What the article says simply reflects that most people don't care that much and default to the path of least resistance, which is common everyday knowledge, but we know very well this does not apply to everyone.

AstroBen8 days ago

Relevant quote from their conclusion:

> Among participants who use AI, we find a stark divide in skill formation outcomes between high-scoring interaction patterns (65%-86% quiz score) vs low-scoring interaction patterns (24%-39% quiz score). The high scorers only asked AI conceptual questions instead of code generation or asked for explanations to accompany generated code; these usage patterns demonstrate a high level of cognitive engagement.

This is very much my experience. AI is incredibly useful as a personal tutor

rienbdj8 days ago

Yes. I love using AI for the "where do I even start" type questions. Then once I've had a discussion about various approaches, I know which docs to actually look at and I can start thinking about implementation details. I don't find AI very useful for generating code (weird position, I know).

nottorp7 days ago

Why weird? I share this position.

The LLMs have been trained on countless introductory tutorials for most popular topics, so they will provide you with a reasonable one.

Ad and friction free for now.

Enjoy it while it lasts.

pxc7 days ago

This is also how I use LLMs at work. I have some vague worries because I'm told this is outdated, I'm falling behind, etc. I'm doing it this way in part because my employer is a big, old, slow company and experimenting with other kinds of "AI" tools is virtually impossible. But I think it's really more my style.

ambicapter8 days ago

A personal tutor who you remain skeptical of, and constantly try to disprove in order to perfect your understanding.

epolanski8 days ago

I see it more of a replacement for Google and digging GitHub issues. It can also replace chats for 80% of questions.

Not much as a tutor.

SJMG8 days ago

> there is a smaller % that benefits and learns more and faster

That's not what the study says, nor is it capable of credibly making that claim. You are reasoning about individuals in an RCT where subjects did not serve as their own control. The high performers in the treatment group may have done even better had they been in the control, with AI in fact slowing them down.

You don't know which is true, and you can't know, because of the study design. This is why we have statistics.

epolanski7 days ago

So you don't doubt their conclusion that most sucked by using AI, but you doubt that they found that some learned more?

FitchApps8 days ago

This is all wonderful and all but what happens when these tools aren't available - you lose internet connection or the agent is misconfigured or you simply run out of credits? How would someone support their business / software / livelihood? First the agents take our software writing tasks, then they encroach on CI/CD and the release process, and take over from there...

Now, imagine a scenario for a typical SWE in today's or maybe a not-so-distant future: the agents build your software, you're simply a gatekeeper/prompt engineer, all tests pass, and you're doing a production deployment at 12am when something happens but your agents are down. At that point, what do you do if you haven't built or even deployed the system yourself? You're like L1 support at this point, pretty useless and clueless when it comes to fully understanding and supporting the application.

esperent8 days ago

I've had a fairly long career as a web dev. When I started, I used to be finicky about configuring my dev environment so that if the internet went down I could still do some kind of work. But over time, partly as I worked on bigger projects and partly as the industry changed, that became infeasible.

So you know what I do, what I've been doing for about a decade, if the internet goes down? I stop working. And over that time I've worked in many places around the world: developing countries, tropical islands, small huts on remote mountains. And I've lost maybe a day of work because of connectivity issues. I've been deep in a rainforest during a monsoon and still had a 4G connection.

If Anthropic goes down I can switch to Gemini. If I run out of credits (people use credits? I only use a monthly subscription) then I can find enough free credits around to get some basic work done. Increasingly, I could run a local model that would be good enough for some things and that'll become even better in the future. So no, I don't think these are any kind of valid arguments. Everyone relies on online services for their work these days, for banking, messaging, office work, etc. If there's some kind of catastrophe that breaks this, we're all screwed, not just the coders who rely on LLMs.

nzealand8 days ago

> I've worked in many places around the world, developing countries, tropical islands, small huts on remote mountains

I am genuinely curious about your work lifestyle.

The freedom to travel anywhere while working sounds awesome.

The ability to work anywhere while traveling sounds less so.

mikestorrent8 days ago

It does sound like a wonderful life... but if you want to have a family, you'll need to put down roots somewhere. I know a nomad who ended up doing this in Mexico - he'd never have guessed it years prior - and is super happy. So maybe, as a way of finding the country you're "meant" to live in, it's a nice approach. I think it's a younger person's game, though.

esperent7 days ago

Well, we did put down roots after a few years, or at least we have for a while (me and my partner). We'll probably get the travel bug again.

We don't have or want children but I do know people who do this with families. There's an amazing community called world schooling where people travel and arrange a month in some beautiful place around the world with other families. They'll organize teachers and activities for children and make friends with the other parents.

I've met quite a few of them - the immediate assumption people will jump to is that they must be rich. But that's not the case, they're just normal people who love to travel and have jobs that can facilitate that. And the children I've met seem happy and well adjusted.

xeromal8 days ago

There's a whole movement that does this.

https://digitalnomads.world/

LtWorf8 days ago

It means having no friends.

trillic7 days ago

People that stay put are no friends of mine. I have a remote job and travelled 20 weeks last year, all to do my sport with friends. Most of us have remote jobs or are FIRE’d already.

Retric8 days ago

Meanwhile I’ve lost roughly a month to internet issues. My guess is your experience was unusual enough that you felt the need to comment, where most developers who were less lucky, or who just remember more issues, didn’t.

rglullis8 days ago

> Meanwhile I’ve lost roughly a month from internet issues.

If you tell me "I lost internet at home and couldn't work there", that's one thing. But that you simply went about a month without an internet connection, I find hard to believe.

Retric8 days ago

It’s not a single continuous stretch of one month. I’m probably significantly older than you, and I’ve lost access to critical services because data centers have had issues, not just my own connection.

Hell, on Tuesday I lost ~2 hours because Starlink was having some issue. When it came up I was on a different ground station and getting very low speeds. Not such a big deal except you never get that time back.

esperent7 days ago

How much of that was in the last ten years? And do you make any attempt to have a backup system (phone hotspot, for example)?

Retric7 days ago

The last 10 years have been about average. I’ve used a phone hotspot some, but it’s often not an option. My prior company wanted a really locked-down setup on their systems; WFH required a fixed IP address for some god-forsaken reason.

bheadmaster8 days ago

> people use credits? I only use a monthly subscription

Those still have limits, no? Or if there's a subscription that provides limitless access, please tell me which one it is.

embedding-shape8 days ago

I've been on the ChatGPT Pro plan since it was introduced, and have also used codex-rs since it was made public, and never hit a limit. I came close last week; not sure if the limits were recently introduced or if they were always there but got lowered, but I think that's as close to "unlimited" as you can get without running your own inference.

I've tried Anthropic's Max plan before, but hit limits after just a couple of hours, same with Google's stuff, and I wasn't doing anything radically different when I tried those compared with Codex, so it seems others' limits are way lower.

esperent7 days ago

I finally bit the bullet and got a $200 Claude subscription last month. It's been a busy month and I've used it a lot. More than is healthy, more than I sustainably could for more than a few weeks. I've managed to hit a 5 hour limit exactly once (20 minutes before it refreshed) and I've never hit more than 80% of a weekly limit.

But if I did - and I could imagine having some specific highly parallelizable work like writing a bazillion unit tests where I send out 40 subagents at a time - then the solution would be to buy two subscriptions. Not switch to API billing.

Xfx70288 days ago

And here I am thinking that my life depends too much on the internet and the knowledge you can find on it. So if something big/extreme happens, like nuclear war or a major internet outage, I know nothing. No recipes, no basic medical stuff like how to use antibiotics, no electronics knowledge, whatever. I don't have any books with stuff like that, like my parents used to. I have seen some examples of backed-up Wikipedia for offline usage, local LLMs, etc., and am thinking of implementing something as a precaution for these extreme events.

cynicalpeace8 days ago

That's a very different problem than OP

You should keep physical books, food, and medication for a SHTF scenario

"Back to Basics", "Where There Is No Doctor" and the Bible are my SHTF books

You won't be coding in a SHTF scenario.

lmc8 days ago

> And over that time I've worked in many places around the world, developing countries, tropical islands, small huts on remote mountains. And I've lost maybe a day of work because of connectivity issues. I've been deep in a rainforest during a monsoon and still had 4g connection.

cries on a Bavarian train

esperent7 days ago

If it's any consolation, Bavaria is a beautiful part of the world that's up there with any tropical island or rainforest. I hope to visit again sometime.

lmc7 days ago

Ha, true :-)

alt1878 days ago

Now I wonder, how has this become infeasible exactly?

zahlman8 days ago

I consider it more or less immoral to be expected to use the Internet for anything other than retrieving information from others or voluntarily sharing information with others. The idea that a dev environment should even require finicky configuration to allow for productive work sans Internet appalls me. I should only have to connect in order to push to / pull from origin, deploy something or acquire build tools / dependencies, which should be cached locally and rarely require any kind of update.

raw_anon_11118 days ago

Do you know how many times since 1999 I have had my work Internet go down? Definitely not enough to spend time worrying about it. The world didn’t stop.

In 2022, funnily enough, I was at an AWS office (I worked remotely when I worked there) working in ProServe, and us-east-1 was having issues that were affecting everything. Guess what we all did? Stopped working; the world didn’t come to an end.

Even now that I work from home, on the rare occasions that Internet goes down, I just use my phone if I need to take a Zoom call.

b_t_s8 days ago

Same thing you do if AWS goes down. Same thing we used to do back in the desktop days when the power went out. Heck one day before WFH was common we all got the afternoon off 'cause the toilets were busted and they couldn't keep 100 people in an office with no toilets. Stuff happens. And if that's really not acceptable, you invest in solutions with the understanding that you're dumping a lot of cash into inefficient solutions for rare problems.

pixl978 days ago

Ya, I will say the argument isn't much different than "what happens if there is no gas for your tractor".

drunkdora8 days ago

I think it's more like: what if your GPS isn't working but you're just supposed to drive down the block?

jillesvangurp8 days ago

Why wouldn't these tools be available suddenly? Once you answer the question, the challenge then becomes mitigating that situation rather than doing things the old way. Like having backup systems, SLAs from network and other providers, etc.

Actually, the last thing you probably want is somebody reverting back to doing things the way we did them 20 years ago and creating a big mess. Much easier to just declare an outage and deal with it properly according to some emergency plan (you do have one, right?).

CI/CD is relatively new, actually. I remember doing that stuff by hand. I.e., I compiled our system on my desktop machine, created a zip file, and then me and our operations department would use an ISDN line to upload the zip file to the server and "deploy" it by unzipping it and restarting the server. That's only 23 years ago. We had a Hudson server somewhere but it had no access to our customer infrastructure. There was no cloud.

I can still do that stuff if I need to (and I sometimes do ;-) ). But I wouldn't dream of messing with a modern production setup like that. We have CI/CD for a reason. What if CI/CD were to break? I'd fix it rather than adding to the problem by panicking and doing things manually.

reycharles8 days ago

> Why wouldn't these tools be available suddenly?

Take a look at how ridiculously much money is invested in these tools and the companies behind them. Those investments expect a return somehow.

vineyardmike8 days ago

The models are already made. They can just run the very useful models they have indefinitely, and they’d be profitable. Or when they go under someone else can buy the rights to the weights.

Anthropic, a common coding model provider, has said that their models generate enough cash to cover their own training costs before the next one is released. If they stopped getting massive investments, they should be able to coast with the models they have.

jillesvangurp8 days ago

I look at this as cost savings waiting to happen. Nvidia extorts companies to the tune of tens of thousands of dollars for a GPU. Somebody's going to undercut them. At the same time, people are working on optimizations as well: using cheap CPUs for inference instead of expensive GPUs (doesn't work for everything, but if your model is small enough you can get away with it), using lower-bit quantization to make models cheaper to run, using hacks like prompt caching to make subsequent calls more efficient, etc.

Your base assumption is that it is expensive and therefore these companies will eventually fail when they keep on making less money than they are spending. The reality is that they are indeed spending enormously now and making a lot of very non linear progress. At the same time a lot of that stuff is being widely published and quite a lot of it is open source. At some point you might get consolidation and maybe some companies indeed don't make it. But their core tech will survive. Investors might be crying in a corner. But that won't stop people from continuing to use the tech in some form or another.

I already have a laptop that can run some modestly largish models locally. I'm not going to spend 40K or whatever on something that can run a GPT-5 class model. But it's not going to cost that in a few years either. This tech is here to stay. We might pay more or less for it. The current state is the worst it is ever going to be. It's going to be faster, bigger, better, cheaper, more useful, etc. At some point the curves flatten and people might start paying attention to cost more. Maybe don't burn a lot of gas in expensive and inefficient gas generators (as opposed to more efficient gas power plants) and maybe use cheap wind/solar instead. Maybe get some GPUs from a different vendor at a lower price? Maybe take a look at algorithm efficiencies, etc. There is a lot of room for optimization in this market. IMHO surviving companies will be making billions, will be running stuff at scale, and will be highly profitable.

Maybe some investors won't get their money back. Shit happens. That's why it's called venture capital. The web bubble bursting didn't kill the web either.

i_am_proteus8 days ago

I am not convinced of the wonderfulness, because the study implies that AI does not improve task completion time but does reduce a programmer's comprehension when using a new library.

raw_anon_11118 days ago

Yes, instead I am supposed to understand the library I use the most, boto3?

https://boto3.amazonaws.com/v1/documentation/api/latest/inde...

I don’t need to comprehend “the library”. I need to know what I need to do and then look up the API call.
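Something like this is all I ever need from boto3; a minimal sketch (the bucket name is just a placeholder):

```python
import boto3

# I don't memorize the client API; I look up the one call I need
# (here: listing objects in an S3 bucket) and move on.
s3 = boto3.client("s3")

# "example-bucket" is a placeholder; substitute any bucket you can read.
response = s3.list_objects_v2(Bucket="example-bucket", MaxKeys=10)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```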

DesaiAshu8 days ago

On device models (deepseek-coder, etc) are very good // better than the old way of using stack overflow on the internet. I have been quite productive on long haul flights without internet!

You're an engineer, your goal is to figure stuff out using the best tools in front of you

Humans are resilient, they reliably perform (and throw great parties) in all sorts of chaotic conditions. Perhaps the thing that separates us most from AI is our ability to bring out our best selves when baseline conditions worsen

Gallows45748 days ago

I know this gets asked all the time, but what is your preferred workflow when using local models? I was pretty deep into it early on, with Tabby and Continue.dev, but once I started using Claude Code with Opus it was hard to go back. I do the same as you, and still use them on flights and whatnot, but I think my implementation could be improved.

Bnjoroge8 days ago

On-device models are still a tier or two below most frontier models (really, Opus 4.5).

dham8 days ago

The price of these tools is going to ~zero (~5 years). The open-source LLMs are here. No one can put them back or take them down. No internet, no problem. I don't see a long-term future in frontier LLM companies.

Sevii8 days ago

What I don't get is, how are these free LLMs getting funded? Who is paying $20-100 million to create an open weights LLM? Long term why would they keep doing it?

dham8 days ago

I see what you're saying, but it doesn't matter that much in the long run. If everything stopped right now, the state-of-the-art open source models can still solve a lot of problems. They may never solve coding, per se, but they're good enough.

direwolf208 days ago

Billionaires trying to hurt each other. Facebook released LLaMa hoping to hasten OpenAI's bankruptcy.

direwolf208 days ago

Do you mean the open binary LLMs, or did you find the secret training data and the random seed for LLaMa?

light_hue_18 days ago

This is the argument people used for decades to fight against rich customized IDEs like emacs. What if you need to ssh into a machine that only has baseline vi in an emergency?

I'll happily optimize my life for 99.999% of the time.

If the Internet is down for a long time, I've got bigger problems anyway. Like finding food.

17186274408 days ago

> If the Internet is down for a long time, I've got bigger problems anyway.

I don't know about you, but I don't connect to the internet most of the time, and it makes me more productive, not less.

t_mahmood8 days ago

Yeah! I use the JetBrains AI assistant sometimes, and it suddenly started showing only a blank window, nothing else. So I'm not getting anything out of it. But I can see my credits are being spent!

If I were totally dependent on it, I would be in trouble. Fortunately I am not.

raw_anon_11118 days ago

What good would being able to “build my software” without internet access do me, unless I’m building software for a disconnected desktop? Exactly what am I going to do with it? How am I going to get to my servers?

zahlman7 days ago

> unless I’m building software for a disconnected desktop?

... Why wouldn't you build software that works there?

As I understand things, the purpose of computers is to run software.

But more importantly, let's suppose your software does require an Internet connection to function.

Why should that imply a requirement for your development environment to have one?

Why should that imply a requirement for a code generation tool to have one?

raw_anon_11117 days ago

Because, to a first approximation, no one wants desktop software: maintenance is a pain, it’s a pain to distribute across a large organization, people want to use the same app across devices, and no one will pay me for it.

> But more importantly, let's suppose your software does require an Internet connection to function.

Because I have been able to depend on “fast” internet since 2000, both at home and at work, just like I’ve been able to depend on a compiler since 1992? There is nothing so important that it can’t wait on the rare chance that the internet goes out.

> Why should that imply a requirement for a code generation tool to have one

Because I don’t want to spend thousands of dollars to run a frontier model locally when I can spend $20/month and codex is included with my ChatGPT subscription?

akomtu8 days ago

Or your business gets flagged by an automated system for dubious reasons with no way to appeal. It's the old story of big tech: they pretend to be on your side first, but their motives are nefarious.

darkhorse2227 days ago

People used to and still say the same thing about GPS. As these systems mature they stay up and become incorporated into our workflows. The implication in the case of GPS was that navigating on your own is not a very critical task anymore. Correspondingly the implication here is that software design and feature design are more important than coding or technical implementation. Similar to Google, it's more important that you know how and what to ask for rather than be able to generate it yourself.

RA_Fisher8 days ago

That reminds me of when teachers would say: what if you're without a calculator? And yet we all have smartphones in our pockets today with calculators.

palmotea8 days ago

> That reminds me of when teachers would say: what if you're without a calculator? And yet we all have smartphones in our pockets today with calculators.

Your teachers had the right goal, but a bad argument. Learning arithmetic isn't just about being able to do a calculation. It's about getting your brain comfortable with math. If you always have to pull out a goddamn calculator, you'll be extremely limited.

Trust me, elementary-age me was dumb to not listen to those teachers and to become so calculator-dependent.

RA_Fisher5 days ago

I think learning arithmetic is a good idea, but it’s only a part of computation. I don’t think we should get too hung up on a particular method of computation (bc there’s so many ways).

fatherwavelet7 days ago

Certain subjects we treat as if one has to learn woodworking before taking violin lessons.

We just really underestimate sentimentality in our society because it doesn't fit our self conception.

RA_Fisher5 days ago

Very fair. I think even more we underestimate our own sentimentalities. eg- the teacher that believes adding or multiplication has to be done a particular way (like the standard algorithm vs. partial products).

davidmurdoch8 days ago

Having a deep intuition about what the calculator is doing is the skill we were actually being taught. Teachers don't always understand why things are being taught.

17186274408 days ago

> Teachers don't always understand why things are being taught.

Yes, but I don't think that is the actual bottleneck. Even when they do, most children probably don't care about abstract goals, but rather about immediate skills in their everyday life, or just the statement that they will need it.

davidmurdoch8 days ago

I guess I'm just trying to suggest that teachers sometimes might think they know why things are being taught, and make claims like "you won't always have a calculator" as the reason for learning mathematics.

One conclusion might be that it'd be better for some students if teachers understood the why, as they might change their approach on some subjects. An example: knowing that certain equations and patterns EXIST, and which kinds of problems they apply to, is generally much more important than knowing the actual equations by heart.

fatherwavelet7 days ago

"You can't begin to paint until you have learned to stretch canvas by hand like the old masters.

What if one day you couldn't just go to the art supply store and buy a pre-stretched canvas?

It is all beside the point anyway. You are going to learn to stretch canvas by hand first because that is what my teacher made me do!"

RA_Fisher2 days ago

I’m a fan of using a variety of methods to teach, no issue with that. My issue is with teachers that don’t admit how the world is changing. Dinosaurs.

17186274408 days ago

And yet calculating your shopping expenses to prevent getting screwed by buggy vending machines, or quickly making rough estimations at your work, is as useful as ever. Tell me how you can learn calculus and group theory, when you skipped primary school math.

Kiboneu8 days ago

It’s like with most programmers today having forgotten assembly. If their compiler breaks, what are they going to do?!

(I jest a bit, actually agree since turning assembly->compiled code is a tighter problem space than requirements in natural language->code)

ambicapter8 days ago

What a grossly disingenuous comparison.

Kiboneu8 days ago

Read the second line. If you can't generalize then I can't help you. Have good faith (and obtain a sense of humor).

wodenokoto8 days ago

The Stack Overflow era wasn’t that long ago, and none of us could write a library call without consulting online sources.

You are at least a decade late to post fears about developers’ reliance on the internet. It was complete well before the LLM era.

17186274408 days ago

> none of us could write a library call without consulting online sources.

I use SO quite often, but for questions I would otherwise have to ask other people about, because I can't figure it out short of reverse-engineering something. For actual documentation, man pages and info documents are pretty awesome. Honestly I dread leaving the world of libraries shipped by my OS vendor, because the quality of documentation drops fast.

wizzwizz48 days ago

I rely on the internet just as much as the rest of you. When that goes down, I crack out man pages, and the local copy of the documentation I can build from source code comments, and (after a 5-minute delay while I figure out how to do that) I'm back to programming. I'm probably half as quick, but I'm also learning more (speeding me up when the internet does come back on), so overall it's not actually time lost.
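For what it's worth, most toolchains can already do this offline; a minimal Python sketch of documentation-without-internet, using only the standard library:

```python
import inspect
import pydoc
import json  # any locally installed module works as the example

# Render the same text that help(json.dumps) shows, built entirely
# from docstrings already on disk; no network connection involved.
print(pydoc.render_doc(json.dumps))

# Or jump straight to the source the documentation was built from.
print(inspect.getsource(json.dumps))
```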

raw_anon_11118 days ago

Or I can just take a break, go to the gym downstairs, etc …

Before you go on about kids these days, my first time coding was on an Apple //e in assembly.

bigbuppo8 days ago

Well, you're supposed to pay for the Platinum Pro Gold Deluxe package which includes priority support with an SLA so that six months down the road you get a one month credit for the outage that destroyed your business.

seanmcdirmid8 days ago

I invested in a beefy laptop that can run Qwen Coder locally and it works pretty good. I really think local models are the future, you don’t have to worry about credits or internet access so much.

jimmaswell8 days ago

What are the specs, and how does it compare to Copilot or GPT Codex?

seanmcdirmid7 days ago

You can check out https://www.reddit.com/r/LocalLLaMA/comments/1piq11p/mac_wit... for a sentiment of usefulness and the specs of the machines running it. It will be some variation of Max or Ultra level Apple silicon, and around 64GB or more RAM. Oh, and an HN submission from 9 months ago: https://news.ycombinator.com/item?id=43856489

Copilot comparison:

Intelligence: Qwen2.5-Coder-32B is widely considered the first open-source model to reach GPT-4o and Claude 3.5 Sonnet levels of coding proficiency. While Copilot (using GPT-4o) remains highly reliable, Qwen often produces more concise code and can outperform cloud models in specific tasks like code repair.

Latency: Local execution on an M3 Max provides near-zero network latency, resulting in faster "start-to-type" responses than Copilot, which must round-trip to the cloud.

Reliability: Copilot is an all-in-one "vibe" that integrates deeply into VS Code. Qwen requires local tools like Ollama or MLX-LM and a plugin like Continue.dev to achieve the same UX.

GPT-Codex:

Intelligence & Reasoning: In recent 2025–2026 benchmarks, the Qwen3-Coder series has emerged as the strongest open-source performer, matching the "pass@5" resolution rates of flagship models like GPT-5-High. While OpenAI’s latest GPT-5.1-Codex-Max remains the overall leader in complex, project-wide autonomous engineering, Qwen is frequently cited as the better choice for local, file-specific logic.

Architecture & Efficiency: OpenAI models like GPT-OSS-20b (a Mixture-of-Experts model) are optimized for extreme speed and tool-calling. However, the M3 Max with 64GB is powerful enough to run the Qwen3-Coder-30B or 32B models at full fidelity, which provides superior logic to OpenAI's smaller "mini" or "OSS" models.

Context Window: Qwen models offer substantial context (up to 128K–256K tokens), which is comparable to OpenAI’s specialized Codex variants. This allows you to process entire modules locally without the high per-token cost of sending that data to OpenAI's servers.
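For a concrete sense of what "running it locally" looks like, here's a minimal sketch assuming an Ollama server on its default port with a Qwen coder model already pulled; the exact model tag will vary by setup:

```python
import json
import urllib.request

# Assumes a local Ollama server on its default port (11434) and a Qwen
# coder model already pulled; "qwen2.5-coder:32b" is one plausible tag.
payload = {
    "model": "qwen2.5-coder:32b",
    "prompt": "Write a Python function that reverses a singly linked list.",
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```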

cyanydeez8 days ago

I think you laid out why so much money is being pressed into this: it's digital crack, and if they can addict enough businesses, they have subscription moats. Oraclification.

psyclobe7 days ago

It’s kinda scary actually. After getting used to ai doing all the work, doing it yourself again is like using a toilet without a bidet.

empath758 days ago

> - you lose internet connection or the agent is misconfigured or you simply ran out of credits.

What happens when GitHub goes down? You shrug and take a long lunch.

newsoftheday8 days ago

When GitHub goes down? I keep working, that's the point of a distributed version control system.

17186274408 days ago

Yes, and when you do want to share with your colleagues `git push /media/user/usb` takes a few seconds and plugging an Ethernet cable into both computers and disabling ufw takes a few minutes (when you need to find a cable first).

blub8 days ago

Losing connectivity is a non-issue because it will come back soon enough absent some global event. The realistic risks are rather:

* all services are run at a loss and they increase price to the point the corp doesn’t want to pay for everyone any more.

* it turns out that our chats are used for corporate espionage and the corps get spooked and cut access

* some dispute between EU and US happens and they cut our access.

The solution’s having EU and local models.

bathwaterpizza6 days ago

Buy a Mac and run a local model that's likely good enough

giancarlostoro8 days ago

> This is all wonderful and all but what happens when these tools aren't available - you lose internet connection or the agent is misconfigured or you simply ran out of credits. How would someone support their business / software / livelihood?

This is why I suggest developers use the free time they gain back writing documentation for their software (preferably in their own words, not just AI slop), reading official docs, sharpening their sword, and learning design patterns more thoroughly. The more you know about the code / how to code, the more you can guide the model to pick a better route for a solution.

FitchApps8 days ago

I'm seeing things that are seriously alarming though. Claude can now write better documentation and get things 95% documented (we're building a set of MCP tools and API endpoints for a large enterprise). Claude is already either writing code, fixing bugs, or suggesting fixes. We have a PM on our team, with access to both the React and API projects, who saw one of the services return a 500; they used Claude to pinpoint the bug to the exact database call and suggest a fix. So now it's quite common for PMs to not only post bugs but also "suggested fixes" from the agents. In a not-so-distant future, developers here will simply be redundant, since the PM can just use Claude to code and support the entire app. Right now they still rely on us for support and deployments, but that could go away too.

Bnjoroge8 days ago

PMs could have chosen to do this before, though. Sure, LLMs obviously empower them, but the main reason you have developers is to have someone who is accountable, and who thus has to be extra careful and thoughtful about the code they write. The PMs could come up with ad hoc fixes, but unless they're also willing to be on the hook for the code, it's not terribly useful organizationally, imo.

giancarlostoro7 days ago

Sure, Claude can write better docs, but if you don't write the documentation yourself you won't fully know the codebase. I would argue: write the docs and then have Claude critique them. Then adjust.

beepbooptheory8 days ago

This doesn't really seem to be the point? Op is being prescriptive, talking about what we should do, not about what could be done.

Apply this to anything else: you could eat out at restaurants every night, and it would do a great job of feeding you! Think of all the productivity you would gain relying on agential chefs. With restaurants even I can eat like a French chef; they have truly democratized food. And they do a perfect job these days executing dishes, only some mistakes.

giancarlostoro8 days ago

I do love restaurants you're really reading right through me haha

exe348 days ago

these chefs will only pour bleach in your food once in a while!

33718 days ago

Well, if they make the decision to accept the suggestion and it's wrong, that's on them. But if you do, that's on you. LLM? How can your boss blame the LLM? Like yelling at it?

giancarlostoro8 days ago

This is the key factor. Sure you can ask an LLM to take the place of a professional medical doctor, but that's on you if you wind up making yourself worse because you didn't seek a professional. That PM would be fired if the code did not work out.

luxcem8 days ago

At some point it will get treated like infrastructure; the same way a typical SWE handles Cloudflare being broken or AWS being down.

newsoftheday8 days ago

At most places I've worked, we can still get things done when AWS/GCP/Azure/OCI are down. For my own selfhosted work, I'm more self-reliant. But I'm aware there are some companies who do 100% of their work within AWS/GCP/Azure/OCI and are probably 100% down when they go down. That's a consequence of how they decided to architect their apps, services and infrastructure.

direwolf208 days ago

How would you answer the same question about water or electricity?

Your pizza restaurant is all wonderful and all but what happens when the continual supply of power to the freezer breaks? How will you run your restaurant then?

greenie_beans8 days ago

> This is all wonderful and all but what happens when these tools aren't available - you lose internet connection or the agent is misconfigured or you simply ran out of credits.

i would work on the hundreds of non-coding tasks that i need to do. or just not work?

what do you do when github actions goes down?

LtWorf8 days ago

Don't rely solely on github actions?

greenie_beans8 days ago

it's only an example for a rhetorical question

appsoftware8 days ago

I think this is where current senior engineers have an advantage, like I felt when I was a junior that the older guys had an advantage in understanding the low-level stuff like assembly and hardware. But software keeps moving forward; my lack of time coding assembly by hand has never hindered my career. People will learn what they need to learn to be productive. When AI stops working in a given situation, people will learn the low-level detail as they need to.

When I was a junior I learned a couple of languages in depth, but everything since has been top down, learn-as-I-need-to. I don't remember everything I've learned over 20 years of software engineering, and the forgetting started way before my use of AI. It's true that conceptual understanding is necessary, but everyone's acting like all human coders are better than all AIs, and that is not the case. Poorly architected spaghetti code existed way before LLMs.

lelanthran8 days ago

> But software keeps moving forward - my lack of time coding assembly by hand has never hindered my career.

Well, yeah. You were still (presumably) debugging the code you did write in the higher level language.

The linked article makes it very clear that the largest decline was in problem solving (debugging). The juniors starting with AI today are most definitely not going to do that problem-solving on their own.

ekidd8 days ago

I want to compliment Anthropic for doing this research and publishing it.

One of my advantages(?) when it comes to using AI is that I've been the "debugger of last resort" for other people's code for over 20 years now. I've found and fixed compiler code generation bugs that were breaking application code. I'm used to working in teams and to delegating lots of code creation to teammates.

And frankly, I've reached a point where I don't want to be an expert in the JavaScript ORM of the month. It will fall out of fashion in 2 years anyway. And if it suddenly breaks in old code, I'll learn what I need to fix it. In the meantime, I need to know enough to code review it, and to thoroughly understand any potential security issues. That's it. Similarly, I just had Claude convert a bunch of Rust projects from anyhow to miette, and I definitely couldn't pass a quiz on miette. I'm OK with this.

I still develop deep expertise in brand new stuff, but I do so strategically. Does it offer a lot of leverage? Will people still be using it on greenfield projects next year? Then I'm going to learn it.

So at the current state of tech, Claude basically allows me to spend my learning strategically. I know the basics cold, and I learn the new stuff that matters.

beej718 days ago

> my lack of time coding assembly by hand has never hindered my career.

I'd kinda like to see this measured. It's obviously not the assembly that matters for nine-9s of jobs. (I used assembly language exactly one time in my career, and that was three lines of inline in 2003.) But you develop a certain set of problem-solving skills when you code assembly. I speculate, like with most problem-solving skills, it has an impact on your overall ability and performance. Put another way, I assert nobody is worse for having learned it, so the only remaining question is, is it neutral?

> everyone's acting like all human coders are better than all AI's

I feel like the sentiment here on HN is that LLMs are better than all novices. But human coders with actual logical and architectural skills are better than LLMs. Even the super-duper AI enthusiasts talk about controlling hoards of LLMs doing their bidding--not the other way around.

direwolf208 days ago

Being able to read assembly has helped me debug. You don't have to write it, but you have to be able to read it. The same applies to manual transmissions and pocket calculators.

webdevver8 days ago

That's fair enough, but reading assembly is such a pain in the ass... it was exciting for the first 10 minutes of my life, but now, if I ever got to that point, I will 100% copy-paste the listing to ChatGPT with "hey, can you see anything sketchy?"

omnicognate8 days ago

An important aspect of this for professional programmers is that learning is not something that happens as a beginner, student or "junior" and then stops. The job is learning, and after 25 years of doing it I learn more per day than ever.

cyclotron3k8 days ago

I've reached a steady state where the rate of learning matches the rate of forgetting

sph8 days ago

How old are you? At 39 (20 years of professional experience) I've forgotten more things in this field than I'm comfortable with today. I find it a bit sad that I've completely lost my Win32 reverse engineering skills I had in my teens, which have been replaced by nonsense like Kubernetes and aligning content with CSS Grid.

And I must admit my appetite in learning new technologies has lessened dramatically in the past decade; to be fair, it gets to a point that most new ideas are just rehashing of older ones. When you know half a dozen programming languages or web frameworks, the next one takes you a couple hours to get comfortable with.

doix8 days ago

> I've forgotten more things in this field than I'm comfortable with today. I find it a bit sad that I've completely lost my Win32 reverse engineering skills I had in my teens

I'm a bit younger (33) but you'd be surprised how fast it comes back. I hadn't touched x86 assembly for probably 10 years at one point. Then someone asked a question in a modding community for an ancient game and after spending a few hours it mostly came back to me.

I'm sure if you had to reverse engineer some win32 applications, it'd come back quickly.

mickeyp8 days ago

SoftICE gang represent :-)

That's a skill unto itself, and I mean the general stuff does not fade, or at least comes back quickly. But there's a lot of the tail end that's just difficult to recall because it's obscure.

How exactly did I hook Delphi apps' TForm handling system instead of breakpointing GetWindowTextA and friends? I mean... I just cannot remember. It wasn't super easy either.

Agentlien8 days ago

I want to second this. I'm 38 and I used to do some debugging and reverse engineering during my university days (2006-2011). Since then I've mainly avoided looking at assembly since I mostly work in C++ systems or HLSL.

These last few months, however, I've had to spend a lot of time debugging via disassembly for my work. It felt really slow at first, but then it came back to me and now it's really natural again.

nkrisc8 days ago

You can’t keep infinite knowledge in your brain. You forget skills you don’t use. Barring some pathology, if you’re doing something every day you won’t forget it.

If you’ve forgotten your Win32 reverse engineering skills I’m guessing you haven’t done much of that in a long time.

That said, it’s hard to truly forget something once you’ve learned it. If you had to start doing it again today, you’d learn it much faster this time than the first.

thesz8 days ago

  > When you know half a dozen programming languages or web frameworks, the next one takes you a couple hours to get comfortable with.
Learn yourself relational algebra. It will invariably lead you to optimization problems, and these will also invariably lead you to equality saturation, which is most effectively implemented with... generalized join from relational algebra!

Also, relational algebra implements content-addressable storage (CAS), which is essential for the data-flow computing paradigm. Thus, you will have a window into CPU design.
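As a rough illustration of the "generalized join" being referred to, here's a toy natural join over lists of dicts; just a sketch of the relational idea (the example data is made up), not any particular engine:

```python
def natural_join(r, s):
    """Toy natural join: merge rows that agree on all shared attributes."""
    if not r or not s:
        return []
    shared = set(r[0]) & set(s[0])
    return [
        {**a, **b}
        for a in r
        for b in s
        if all(a[k] == b[k] for k in shared)
    ]

# Tiny illustrative relations: expressions grouped into equivalence
# classes, joined with a table of canonical representatives.
exprs = [{"expr": "x*2", "cls": 1}, {"expr": "x<<1", "cls": 1}]
canon = [{"cls": 1, "canonical": "x<<1"}]
print(natural_join(exprs, canon))
# [{'expr': 'x*2', 'cls': 1, 'canonical': 'x<<1'},
#  {'expr': 'x<<1', 'cls': 1, 'canonical': 'x<<1'}]
```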

At 54 (36 years of professional experience) I find these rondos fascinating.

steve_adams_868 days ago

> I must admit my appetite in learning new technologies has lessened dramatically in the past decade;

I felt like that for a while, but I seem to be finding new challenges again. Lately I've been deep-diving on data pipelines and embedded systems. Sometimes I find problems that are easy enough to solve by brute force, but elegant solutions are not obvious at all. It's a lot of fun.

It could be that you're way ahead of me and I'll wind up feeling like that again.

TeMPOraL8 days ago

That's one of several possibilities. I've reached a different steady state - one where the velocity of work exceeds the rate at which I can learn enough to fully understand the task at hand.

everdrive8 days ago

But just think, there's a whole new framework that isn't better but is trendy. You can recycle a lot of your knowledge and "learn new things" that won't matter in five years. Isn't that great?

epolanski8 days ago

I use spaced repetition for stuff I care about.

I use RemNote for that.

I write cards and quizzes for all kinds of stuff, and I tend to retain it for years after having practiced it with the low friction of spaced repetition.
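For the curious, the core of most spaced-repetition schedulers is just a review interval that stretches on success and resets on failure; a minimal sketch loosely in the spirit of SM-2 (not RemNote's actual algorithm):

```python
def next_review(interval_days, ease, remembered):
    """Return (new_interval_days, new_ease) after one review.

    Loosely in the spirit of SM-2: a correct answer stretches the
    interval by the ease factor, a lapse resets it and lowers the ease.
    """
    if not remembered:
        return 1.0, max(1.3, ease - 0.2)  # see the card again tomorrow
    return max(1.0, interval_days) * ease, ease + 0.05

# A card reviewed successfully three times, then forgotten once.
interval, ease = 1.0, 2.5
for outcome in (True, True, True, False):
    interval, ease = next_review(interval, ease, outcome)
    print(f"next review in ~{interval:.0f} days (ease {ease:.2f})")
```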

bryanrasmussen8 days ago

to fix that you basically need to switch specialty or focus. A difficult thing to do if you are employed of course.

emil-lp8 days ago

I worked as an "advisor" for programmers in a large company. Our mantra there was that programming and the development of software is mainly acquiring knowledge (i.e. learning?).

One take-away for us from that viewpoint was that knowledge in fact is more important than the lines of code in the repo. We'd rather lose the source code than the knowledge of our workers, so to speak.

Another point is that when you use consultants, you get lines of code, whereas the consultancy company ends up with the knowledge!

... And so on.

So, I wholeheartedly agree that programming is learning!

mlrtime8 days ago

> One take-away for us from that viewpoint was that knowledge in fact is more important than the lines of code in the repo. We'd rather lose the source code than the knowledge of our workers, so to speak.

Isn't this the opposite of how large tech companies operate? They can churn developers in/out very quickly, hire-to-fire, etc., but the code base lives on. There is little incentive to keep institutional knowledge. The incentives are PRs pushed and value landed.

emil-lp8 days ago

That might be the case in the USA, but this was in a country with practically no firing.

teiferer8 days ago

> We'd rather lose the source code than the knowledge of our workers, so to speak.

Isn't large amounts of required institutional knowledge typically a problem?

emil-lp8 days ago

It was a "high tech domain", so institutional knowledge was required, problem or not.

We had domain specialists with decades of experience and knowledge, and we looked at our developers as the "glue" between domain knowledge and computation (modelling, planning and optimization software).

You can try to make this glue have little knowledge, or lots of knowledge. We chose the latter and it worked well for us.

But I was only in that one company, so I can't really tell.

17186274408 days ago
emil-lp7 days ago

Very cool! Thanks

hnthrow02873458 days ago

It can be, I guess, but I think it's more about solving problems. You can fix a lot of people's problems by shipping different flavors of the same stuff that's been done before. It feels more like a trade.

People naturally try to use what they've learned but sometimes end up making things more complicated than they really needed to be. It's a regular problem even excluding the people intentionally over-complicating things for their resume to get higher paying jobs.

dude2507118 days ago

> The job is learning...

I could have sworn I was meant to be shipping all this time...

rTX5CMRXIfFG8 days ago

Have you been nothing more than a junior contributor all this time? Because as you mature professionally your knowledge of the system should also be growing

MyHonestOpinon8 days ago

It seems to me that nowadays software engineers move a lot more, either within a company or to other companies. Furthermore, companies do not seem to care, and they are always stuck in a learning loop where engineers are competent enough to make modifications and add new code, but without the deep insight needed to improve the fundamental abstractions of the system. Meanwhile, even seniors with 25+ years of experience are noobs when approaching a new system.

postalcoder8 days ago

One of the nice things about the "dumber" models (like GPT-4) was that it was good enough to get you really far, but never enough to complete the loop. It gave you maybe 90%, 20% of which you had to retrace -- so you had to do about 30% of the tough work yourself, which meant manually learning things from scratch.

The models are too good now. One thing I've noticed recently is that I've stopped dreaming about tough problems, be it code or math. The greatest feeling in the world is pounding your head against a problem for a couple of days and waking up the next morning with the solution sketched out in your mind.

I don't think the solution is to be going full natty with things, but to work more alongside the code in an editor, rather than doing things in CLI.

boredemployee8 days ago

The big issue I see coming is that leadership will care less and less about people, and more about shipping features faster and faster. In other words, those that are still learning their craft are fucked.

The amount of context switching in my day-to-day work has become insane. There's this culture of “everyone should be able to do everything” (within reason, sure), but in practice it means a data scientist is expected to touch infra code if needed.

Underneath it all is an unspoken assumption that people will just lean on LLMs to make this work.

iamflimflam18 days ago

I think this is sadly going to be the case.

I also used to get great pleasure from banging my head against a problem and then the sudden revelation.

But that takes time. I was valuable when there was no other option. Now? Why would someone wait when an answer is just a prompt away?

Oras8 days ago

You still have the system design skills, and so far, LLMs are not that good in this field.

They can give plausible architecture but most of the time it’s not usable if you’re starting from scratch.

When you design the system, you're an architect, not a coder, so I see no difference between handing the design to agents or to other developers; you've done the heavy lifting.

From that perspective, I find LLMs quite useful for learning. But instead of coding, I find myself in long back-and-forth sessions asking questions, requesting examples, sequence diagrams, etc., to visualise the final product.

Thanemate8 days ago

I see this argument all the time, and while it sounds great on paper (you're an architect now, not a developer) people forget (or omit?) that a product needs far fewer architects than developers, meaning the workforce gets in fact trimmed down thanks to AI advancements.

iamflimflam18 days ago

I would also point out that a lot of real world problems don’t need a complex architecture. They just need to follow some well established patterns.

It is a pattern matching problem and that seems to me to be something AI is/will be particularly good at.

Maybe it won’t be the perfect architecture, or the most efficient implementation. But that doesn’t seem to have stopped many companies before.

queenkjuul8 days ago

Idk i very much feel like Claude Code only ever gets me really far, but never there. I do use it a fair bit, but i still write a lot myself, and almost never use its output unedited.

For hobby projects though, it's awesome. It just really struggles to do things right in the big codebase at work.

simianwords8 days ago

you can now access similar models for way cheaper prices. grok 4.1 fast is around 10x cheaper but performs slightly better

i_love_retros8 days ago

Grok? You're OK giving money to elon musk?

stray8 days ago

Better than Palantir.

i_love_retros8 days ago

+1

dude2507118 days ago

> The greatest feeling in the world is pounding your head against a problem for a couple of days and waking up the next morning with the solution sketched out in your mind.

And then you find out someone else had already solved it. So might as well use the Google 2.0 aka ChatGPT.

griffzhowl8 days ago

Well, this is exactly the problem. This tactic works until you get to a problem that nobody has solved before, even if it's just a relatively minor one that no one has solved because no one has tried to because it's so specific. If you haven't built up the skills and knowledge to solve problems, then you're stuck.

wesleywt8 days ago

But to understand the solution from someone else, you would have to apply your mind to understand the problem yourself. Transferring the hard work of thinking to GPT will rob you of the attention you will need to understand the subject matter fully. You will be missing insights that would be applicable to your problem. This is the biggest danger of brain rot.

17186274408 days ago

How is that a drawback? You still solved it, you learned a lot, and you can actually discuss approaches with the other person, because you actually understood the problem domain.

dataviz10008 days ago

This is what I am thinking about this morning. I just woke up, made a cup of coffee, read the financial news, and started exploring the code I wrote yesterday.

My first thought was that I can abstract what I wrote yesterday, which was a variation of what I built over the previous week. My second thought was a physiological response of fear that today is going to be a hard hyper focus day full of frustration, and that the coding agents that built this will not be able to build a modular, clean abstraction. That was followed by weighing whether it is better to have multiple one off solutions, or to manually create the abstraction myself.

I agree with you 100 percent that the poor performance of models like GPT-4 introduced some kind of regularization into the human-in-the-loop coding process.

Nonetheless, we live in a world of competition, and the people who develop techniques that give them an edge will succeed. There is a video about the evolution of technique in the high jump, the Western Roll, the Straddle Technique, and finally the Fosbury Flop. Using coding agents will be like this too.

I am working with 150 GB of time series data. There are certain pain points that need to be mitigated. For example, a different LLM model has to be coerced into analyzing or working with the data from a completely different approach in order to validate. That means each iteration is 4x faster, but it needs to be done twice, so the net gain is only 2x. I burned $400 in tokens in January. This cannot be good for the environment.

Timezone handling always has to be validated manually. Every exploration of the data is a train and test split. Here is the thing that hurts the most: the AI coding agents always show the top test results, not the test results of the top train results. Rather than tell me a model has no significant results, they will hide that and only present the winning outliers, which is misleading and, as the OP research suggests, very dangerous.
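
To make that concrete, here is a toy sketch (made-up numbers, nothing to do with my actual data) of the difference between the two ways of reporting:

```python
# Hypothetical illustration: "best test score" vs "test score of the
# configuration that was best on train". The first is cherry-picking.
import random

random.seed(0)

# Pretend we tried 20 model configurations; each has a train score and a
# test score that is only loosely related to it (lots of noise).
candidates = []
for i in range(20):
    train_score = random.uniform(0.5, 0.7)
    test_score = train_score + random.uniform(-0.15, 0.15)
    candidates.append((f"config_{i}", train_score, test_score))

# Misleading: report the configuration with the highest *test* score.
cherry_picked = max(candidates, key=lambda c: c[2])

# Honest: pick the configuration on *train* score, then report its test score.
chosen_on_train = max(candidates, key=lambda c: c[1])

print("cherry-picked test score:", round(cherry_picked[2], 3))
print("test score of best-on-train config:", round(chosen_on_train[2], 3))
```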

A lot of people are going to get burned before the techniques to mitigate this are developed.

Overfitting has always been a problem when working with data. Just because the barrier of entry for time series work is much lower does not mean that people developing the skill, whether using old school tools like ARIMA manually or having AI do the work, escape the problem of overfitting. The models will always show the happy, successful looking results.

Just like calculators are used when teaching higher math at the secondary level so basic arithmetic does not slow the process of learning math skills, AI will be used in teaching too. What we are doing is confusing techniques that have not been developed yet with not being able to acquire skills. I wrack and challenge my brain every day solving these problems. As millions of other software engineers do as well, the patterns will emerge and later become the skills taught in schools.

amelius8 days ago

> We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average.

Ouch.

See also: https://news.ycombinator.com/item?id=46820924

> On average, participants in the AI group finished about two minutes faster, although the difference was not statistically significant. There was, however, a significant difference in test scores: the AI group averaged 50% on the quiz, compared to 67% in the hand-coding group

Ronsenshi8 days ago

It's good that there's some research into this - to confirm what is generally obvious to anyone who studied anything. You have to think about what you are doing, write things by hand, use the skill to improve and retain it.

A common example here is learning a language. Say you learn French or Spanish throughout your school years or on Duolingo. But unless you're lucky enough to be amazing with language skills, if you don't actually use it, you will hit a wall eventually. And similarly, if you stop using a language that you already know, it will slowly degrade over time.

dr_dshiv8 days ago

Go Anthropic for transparency and commitment to science.

Personally, I’ve never been learning software development concepts faster—but that’s because I’ve been offloading actual development to other people for years.

jwr8 days ago

The title of this submission is misleading, that's not what they're saying. They said it doesn't show productivity gains for inexperienced developers still gaining knowledge.

visarga8 days ago

The study measures if participants learn the library, but what they should study is if they learn effective coding agent patterns to use the library well. Learning the library is not going to be what we need in the future.

> "We collect self-reported familiarity with AI coding tools, but we do not actually measure differences in prompting techniques."

Many people drive cars without being able to explain how cars work. Or use devices like that. Or interact with people whose thinking they can't explain. Society works like that: it is functional, and it does not work by full understanding. We need to develop the functional part, not the full understanding part. We can write C without knowing the machine code.

You can often recognize a wrong note without being able to play the piece, spot a logical fallacy without being able to construct the valid argument yourself, catch a translation error with much less fluency than producing the translation would require. We need discriminative competence, not generative.

For years I maintained a library for formatting dates and numbers (prices, ints, ids, phones). It was a pile of regex, but I maintained hundreds of test cases for each type of parsing. And as new edge cases appeared, I added them to my tests and iterated to keep the score high. I don't fully understand my own library; it emerged by scar accumulation. I mean, yes, I can explain any line, but why these regexes in this order is a data-dependent explanation I don't have anymore. All my edits run in a loop with tests, and my PRs are sent only when the score is good.

Correctness was never grounded in understanding the implementation. Correctness was grounded in the test suite.
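
Not the real library, but roughly the shape of it in miniature:

```python
# A made-up sketch, far simpler than the real thing: a pile of regexes whose
# correctness lives in the accumulated test cases, not in anyone's head.
import re

def normalize_phone(raw: str) -> str:
    digits = re.sub(r"[^\d+]", "", raw)               # strip spaces, dashes, dots, parens
    digits = re.sub(r"^\+?1(?=\d{10}$)", "", digits)  # drop a leading US country code
    m = re.fullmatch(r"(\d{3})(\d{3})(\d{4})", digits)
    if not m:
        raise ValueError(f"unparseable phone: {raw!r}")
    return "({}) {}-{}".format(*m.groups())

# The edge cases accumulate here as scars; every edit loops against this list.
CASES = [
    ("555-867-5309", "(555) 867-5309"),
    ("(555) 867 5309", "(555) 867-5309"),
    ("+1 555.867.5309", "(555) 867-5309"),
]

for raw, expected in CASES:
    assert normalize_phone(raw) == expected, (raw, normalize_phone(raw))
print("all cases pass")
```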

2sk218 days ago

You can, most certainly, drive a car without understanding how it works. A pilot of an aircraft on the other hand needs a fairly detailed understanding of the subsystems in order to effectively fly it.

I think being a programmer is closer to being an aircraft pilot than a car driver.

iammjm8 days ago

Sure, if you are a pilot then that makes sense. But what if you are a company that uses planes to deliver goods? Like when the focus shifts from the thing itself to its output

northfield278 days ago

Agreed

discreteevent8 days ago

> Many people drive cars without being able to explain how cars work.

But fundamentally, all cars behave the same way all the time. Imagine running a courier company where sometimes the vehicles take a random left turn.

> Or interact with people whose thinking they can't explain

Sure, but they trust those service providers because they are reliable. And the reason that they are reliable is that the service providers can explain their own thinking to themselves. Otherwise their business would be chaos and nobody would trust them.

How you approached your library was practical given the use case. But can you imagine writing a compiler like this? Or writing an industrial automation system? Not only would it be unreliable but it would be extremely slow. It's much faster to deal with something that has a consistent model that attempts to distill the essence of the problem, rather than patching on hack by hack in response to failed test after failed test.

gjadi8 days ago

Interesting argument.

But isn't it the correction of those errors that is valuable to society and gets us a job?

People can tell they found a bug or give a description of what they want from a piece of software, yet it requires skill to fix the bugs and to build the software. Though LLMs can speed up the process, expert human judgment is still required.

another-dave8 days ago

I think there's different levels to look at it.

If you know that you need O(n) "contains" checks and O(1) retrieval for items, for a given order of magnitude, it feels like you've all the pieces of the puzzle needed to make sure you keep the LLM on the straight and narrow, even if you didn't know off the top of your head that you should choose ArrayList.

Or if you know that string manipulation might be memory intensive so you write automated tests around it for your order of magnitude, it probably doesn't really matter if you didn't know to choose StringBuilder.

That feels different to e.g. not knowing the difference between an array list and linked list (or the concept of time/space complexity) in the first place.
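
To make that concrete, here's a rough Python analogue of the same idea (the Java classes above map to list/set membership and string joining; the exact timings don't matter, it's the complexity reasoning that keeps the LLM honest):

```python
# Rough analogue: knowing *why* one structure is O(1) and the other O(n)
# is the piece of the puzzle you need, even if the class names differ.
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

# Membership: O(n) scan on a list vs O(1) average on a set.
print("list membership:", timeit.timeit(lambda: n - 1 in as_list, number=100))
print("set membership: ", timeit.timeit(lambda: n - 1 in as_set, number=100))

# String building: repeated += may copy repeatedly (implementation-dependent);
# join builds the result once.
def concat_naive(parts):
    out = ""
    for p in parts:
        out += p
    return out

parts = ["x"] * 10_000
print("naive concat:", timeit.timeit(lambda: concat_naive(parts), number=10))
print("join:        ", timeit.timeit(lambda: "".join(parts), number=10))
```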

gjadi8 days ago

My gut feeling is that, without wrestling with data structures at least once (e.g. during a course), then that knowledge about complexity will be cargo cult.

When it comes to fundamentals, I think it's still worth the investment.

To paraphrase, "months of prompting can save weeks of learning".

visarga8 days ago

I think the kind of judgement required here is to design ways to test the code without inspecting it manually line by line -- that would be walking a motorcycle -- otherwise you are only vibe-testing. That is why we have seen the FastRender browser and JustHTML parser: the testing part was solved upfront, so AI could go nuts implementing.

northfield278 days ago

+1

concats8 days ago

I agree. It's very missleading. Here's what the authors actually say:

> AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development of skills required to effectively supervise AI remains unclear. Novice workers who rely heavily on AI to complete unfamiliar tasks may compromise their own skill acquisition in the process. We conduct randomized experiments to study how developers gained mastery of a new asynchronous programming library with and without the assistance of AI. We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average. Participants who fully delegated coding tasks showed some productivity improvements, but at the cost of learning the library. We identify six distinct AI interaction patterns, three of which involve cognitive engagement and preserve learning outcomes even when participants receive AI assistance. Our findings suggest that AI-enhanced productivity is not a shortcut to competence and AI assistance should be carefully adopted into workflows to preserve skill formation -- particularly in safety-critical domains.

danbruc8 days ago

That itself sounds contradictory to me.

AI assistance produces significant productivity gains across professional domains, particularly for novice workers.

We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average.

Are the two sentences talking about non-overlapping domains? Is there an important distinction between productivity and efficiency gains? Does one focus on novice users and one on experienced ones? Admittedly did not read the paper yet, might be clearer than the abstract.

mold_aid8 days ago

Not seeing the contradiction. The two sentences suggest a distinction between novice task completion and supervisory (ie, mastery) work. "The role of workers often shifts from performing the task to supervising the task" is the second sentence in the report.

The research question is: "Although the use of AI tools may improve productivity for these engineers, would they also inhibit skill formation? More specifically, does an AI-assisted task completion workflow prevent engineers from gaining in-depth knowledge about the tools used to complete these tasks?" This hopefully makes the distinction more clear.

So you can say "this product helps novice workers complete tasks more efficiently, regardless of domain" while also saying "unfortunately, they remain stupid." The introductory lit review/context setting cites prior studies to establish "ok coders complete tasks efficiently with this product." But then they say, "our study finds that they can't answer questions." They have to say "earlier studies find that there were productivity gains" in order to say "do these gains extend to other skills? Maybe not!"

danbruc6 days ago

The learning aspect is not the relevant part for the [potential] contradiction, let me shorten the two quotes.

AI assistance produces significant productivity gains [...].

We find that AI use [...] [is not] delivering significant efficiency gains on average.

capnrefsmmat8 days ago

The first sentence is a reference to prior research work that has found those productivity gains, not a summary of the experiment conducted in this paper.

danbruc6 days ago

In that case it should not be stated as a fact, it should then be something like the following.

While prior research found significant productivity gains, we find that AI use is not delivering significant efficiency gains on average while also impairing conceptual understanding, code reading, and debugging abilities.

torginus8 days ago

That doesn't really line up with my experience, I wanted to debug a CMake file recently, having done no such thing before - AI helped me walk through the potential issues, explaining what I got wrong.

I learned a lot more in a short amount of time than I would've stumbling around on my own.

Afaik it's been known for a long time that the most effective way of learning a new skill is to get private tutoring from an expert.

yoz-y8 days ago

+1

hxugufjfjf8 days ago

Has the claim in your third paragraph been backed by research? Not snark, genuinely curious. I have some anecdotal, personal experience backing it up.

omnicognate8 days ago

I agree the title should be changed, but as I commented on the dupe of this submission, learning is not something that happens as a beginner, student, or "junior" programmer and then stops. The job is learning, and after 25 years of doing it I learn more per day than ever.

mold_aid8 days ago

The study doesn't argue that you stopped learning.

omnicognate8 days ago

I didn't say it did. I just pointed out that learning effectively isn't only a concern for "inexperienced developers still gaining knowledge".

emsign8 days ago

> They said it doesn't show productivity gains for inexperienced developers still gaining knowledge.

But that's what "impairs learning" means.

northfield278 days ago

Edit: Changed title

Previous title: "Anthropic: AI Coding shows no productivity gains; impairs skill development"

The previous title oversimplified the claim to "all" developers. I found the previous title meaningful while submitting this post because most of the false AI claims that "software engineering is finished" have mostly affected junior (inexperienced) engineers. But I think "junior/inexperienced" was implicit, which many people didn't pick up on.

The paper makes a more nuanced claim that AI Coding speeds up work for inexperienced developers, leading to some productivity gains at the cost of actual skill development.

suralind8 days ago

No surprise, really. You can use AI to explore new horizons or propose an initial sketch, but for anything larger than small changes - you must do a rewrite. Not just a review. An actual rewrite. AI can do well adding a function, but you can't vibe code an app and get smarter.

I don't necessarily think that writing more code means you become a better coder. I automate nearly all my tests with AI, and a large chunk of bugfixing as well. I will regularly ask AI to propose an architecture or introduce a new pattern if I don't have a goal in my mind. But in these last 2 examples, I will always redesign the entire approach to be what I consider a better, cleaner interface. I don't recall AI ever getting that right, but I must admit I asked AI in the first place cos I didn't know where to start.

If I had to summarize, I would say to let AI do the coding, but not the API design/architecture. But at the same time, you can only get good at those by knowing what doesn't work and trying to find a better solution.

teiferer8 days ago

> I automate nearly all my tests with AI

How exactly? Do you tell the agent "please write a test for this" or do you also feed it some form of spec to describe what the tested thing is expected to do? And do these tests ever fail?

Asking because the first option essentially just sets the bugs in stone.

Wouldn't it make sense to do it the other way around? You write the test, let the AI generate the code? The test essentially represents the spec, and if the AI produces something which passes all your tests but is still not what you want, then you have a test hole.

suralind8 days ago

I'm not saying my approach is correct, keep that in mind.

I care more about the code than the tests. Tests are verification of my work. And yes, there is a risk of AI "navigating around" bugs, but I found that a lot of the time AI will actually spot a bug and suggest a fix. I also review each line to look for improvements.

Edit: to answer your question, I will typically ask it to test a specific test case or few test cases. Very rarely will I ask it to "add tests everywhere". Yes, these tests frequently fail and the agent will fix on 2nd+ iteration after it runs the tests.

One more thing to add is that a lot of the time agent will add a "dummy" test. I don't really accept those for coverage's sake.

teiferer8 days ago

Thanks for your responses!

A follow-up:

> I care more about the code than the tests.

Why is that? Your (product) code has tests. Your test (code) doesn't. So I often find that I need to pay at least as much attention to my tests to ensure quality.

suralind8 days ago

I think you are correct in your assessment. Both are important. If you're gonna have garbage test code, you're gonna have garbage quality.

I find tests easier to write. Your function(s) may be a hundred lines long, but the test is usually setup, run, assert.

I don't have much experience beyond writing unit/integration tests, but individual test cases seem to be simpler than the code they test (linear, no branches).

james_marks8 days ago

This is why the quality of my code has improved since using AI.

I can iterate on entire approaches in the same amount of time it would have taken to explore a single concept before.

But AI is an amplifier of human intent -- I want a code base that's maintainable, scalable, etc., and that's different than YOLO vibe coding. Vibe engineering, maybe.

acedTrex8 days ago

My core uses are 100% racing the model in yolo mode to find a bug. I win most of the time but occasionally it surprises me.

Then also switching arch approaches quickly when i find some code strategies that are not correctly ergonomic. Splitting of behaviors and other refactors are much lower cost now.

mickeyp8 days ago

> No surprise, really. You can use AI to explore new horizons or propose an initial sketch, but for anything larger than small changes - you must do a rewrite. Not just a review. An actual rewrite. AI can do well adding a function, but you can't vibe code an app and get smarter.

Sometimes I wonder if people who make statements like this have ever actually casually browsed Twitter or reddit or even attempted a "large" application themselves with SOTA models.

JustSkyfall8 days ago

You can definitely vibecode an app, but that doesn't mean that you can necessarily "get smarter"!

An example: I vibecoded myself a Toggl Track clone yesterday - it works amazingly but if I had to rewrite e.g. the PDF generation code by myself I wouldn't have a clue!

suralind8 days ago

That's what I meant; it's either/or. Vibe coding definitely has a place for simple utilities or "in-house" tools that solve one problem. You can't vibe code and learn (if you do, then it's not vibe coding as I define it).

suralind8 days ago

Did I say that you can't vibe code an app? I browse reddit and have seen the same apps as you did, I also vibe code myself every now and then and know what happens when you let it loose.

simonw8 days ago

Key snippet from the abstract:

> Novice workers who rely heavily on AI to complete unfamiliar tasks may compromise their own skill acquisition in the process. We conduct randomized experiments to study how developers gained mastery of a new asynchronous programming library with and without the assistance of AI. We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average.

The library in question was Python trio and the model they used was GPT-4o.
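
For anyone who hasn't touched Trio, here's a minimal sketch of the structured-concurrency style it uses (nurseries and checkpoints). This is my own toy example of the kind of concept involved, not one of the study's actual tasks:

```python
# Minimal Trio example: structured concurrency via a nursery.
import trio

async def fetch(name: str, delay: float) -> None:
    await trio.sleep(delay)  # a checkpoint: other tasks run while we wait
    print(f"{name} done after {delay}s")

async def main() -> None:
    async with trio.open_nursery() as nursery:
        nursery.start_soon(fetch, "a", 0.2)
        nursery.start_soon(fetch, "b", 0.1)
    # The nursery block doesn't exit until every child task has finished.
    print("all tasks complete")

trio.run(main)
```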

gergo_b8 days ago

When I use AI to write code, after a week or 2, if I go back to the written code I have a hard time catching up. When I write code by myself I always just look at it and I understand what I did.

jackdoe8 days ago

a program is a function of the programmer; how you code is how you think. that is why it is really difficult, even after 60 years, for multiple people to work on the same codebase. over the years we have made all kinds of rules and processes so that code written by one person can be understood and changed by another.

you can also read human code and empathise with what they were thinking while writing it

AI code is not for humans, it is just a stream of tokens that do something, you need to build skills to empirically verify that it does what you think it does, but it is pointless to "reason" about it.

AstroBen8 days ago

Not only do I have a hard time catching up, but it's like I'm looking at a codebase I've never seen before, even though I absolutely reviewed the code before committing

northfield278 days ago

++Hard Agree.

lelanthran8 days ago

I must say I am quite impressed that Anthropic published this, given that they found that:

1. AI help produced a solution only 2m faster, and

2. AI help reduced retention of skill by 17%

visarga8 days ago

Many say generative AI is like a vending machine. But if your vending machine has not 1 button but a keyboard, and you type anything you want in, and it makes it (Star Trek Replicator) and you use it 10,000 times to refine your recipes, did you learn something or not? How about a 3D printer, do you learn something making designs and printing them?

northfield278 days ago

Instead of "vending machine", I see many people calling generative AI "slot machine", which more aptly describes current genAI tools.

Yes, we can use it 10,000 times to refine our recipes, but "did we learn from it"? I am doubtful about that, given that even after running with the same prompt 10 times, it will give different answers in 8/10 responses.

But I am very confident that I can learn by iterating and printing designs on a 3D printer.

latexr8 days ago

Star Trek replicators were deterministic. They had a library of things they could replicate that you programmed in, and that's the extent of what they could do. They replicated to the molecular level; no matter how many times you asked for something, you got the exact same thing. You'd never ask for a raktajino and get a raw steak. In the rare instances where they misbehaved as a plot point, they were treated as being broken and needing fixing; no one ever suggested "try changing your prompt, or ask it seventeen times until you get what you want".

hahahahhaah8 days ago

3D printer: you learn something if you make CAD designs yourself and print them, yes. It is a skill.

comrade12348 days ago

Often when I use it I know that there is a way to do something and I know that I could figure it out by going through some api documents and maybe finding some examples on the web... IOW I already have something in mind.

For example I wanted to add a rate-limiter to an api call with proper http codes, etc. I asked the ai (in IntelliJ it used to be Claude by default but they've since switched to Gemini as default) to generate one for me. The first version was not good so I asked it to do it again but with some changes.

What would take me a couple of hours or more took less than 10 minutes.

drooby8 days ago

Exactly this.

I’m starting to believe that people who think AI-generated code is garbage actually don’t know how to code.

I hit about 10 years of coding experience right before AI hit the scene, which I guess makes me lucky. I know, with high confidence, what I want my code to look like, and I make the AI do it. And it does it damn well and damn fast.

I think I sit at a unique point for leveraging AI best. Too junior and you create “working monsters.” Meanwhile, Engineering Managers and Directors treat it like humans, but it’s not AGI yet.

hollowturtle8 days ago

> Unsurprisingly, participants in the No AI group encountered more errors. These included errors in syntax and in Trio concepts, the latter of which mapped directly to topics tested on the evaluation

I'm wondering if we could have the best of IDE/editor features like LSP and LLMs working together. With an LSP, syntax errors are a solved problem; if the language is statically typed, I often find myself just checking the type signatures of library methods, which is simpler to me than asking an LLM. But I would love to have LLMs fixing your syntax and, with types available or not, giving suggestions on how to best use the libraries given the current context.

Cursor tab does that to some extent, but it's not foolproof and it still feels too "statistical".

I'd love to have something deeply integrated with LSPs and IDE features. For example, VSCode alone has the ability to suggest imports; Cursor tries to complete them statistically, but it often suggests the wrong import path. I'd like to have the two working together.

Another example is renaming identifiers with F2: it is reliable and predictable, and I can't say the same when asking an agent to do that. On the other hand, if the pattern isn't predictable, e.g. a migration where a 1-to-1 rename isn't enough but a pattern needs to be found, LLMs are just great. So I'd love to have an F2 feature augmented with LLM capabilities.

gorbachev8 days ago

I've found the AI assisted auto-completion to be very valuable. It's definitely sped up my coding and reduced the number of errors I make.

It reduces the context switching between coding and referencing docs quite a bit.

hollowturtle8 days ago

Have you read my comment or are you a bot?

keeda8 days ago

Another study from 2024 with similar findings: https://www.mdpi.com/2076-3417/14/10/4115 -- a bit more preliminary, but conducted with undergrad students still learning to program, so I expect the effect would be even more pronounced.

This similarly indicates that reliance on LLMs correlates with degraded performance in critical problem-solving, coding, and debugging skills. On the bright side, using LLMs as a supplementary learning aid (e.g. clarifying doubts) showed no negative impact on critical skills.

This is why I'm skeptical of people excited about "AI native" junior employees coming in and revamping the workplace. I haven't yet seen any evidence that AI can be effectively harnessed without some domain expertise, and I'm seeing mounting evidence that relying too much on it hinders building that expertise.

I think those who wish to become experts in a domain would willingly eschew using AI in their chosen discipline until they've "built the muscles."

jbellis8 days ago

Good to see that Anthropic is honest and open enough to publish a result with a mostly negative headline.

> Importantly, using AI assistance didn’t guarantee a lower score. How someone used AI influenced how much information they retained. The participants who showed stronger mastery used AI assistance not just to produce code but to build comprehension while doing so—whether by asking follow-up questions, requesting explanations, or posing conceptual questions while coding independently.

This might be cynically taken as cope, but it matches my own experience. A poor analogy until I find a better one: I don't do arithmetic in my head anymore, it's enough for me to know that 12038 x 912 is in the neighborhood of 10M, if the calculator gives me an answer much different from that then I know something went wrong. In the same way, I'm not writing many for loops by hand anymore but I know how the code works at a high level and how I want to change it.

(We're building Brokk to nudge users in this direction and not a magic "Claude take the wheel" button; link in bio.)

cleandreams8 days ago

I'm anxious about code quality in critical infrastructure in 5 years or so.

Also my mastery of code starts with design and implementation that results in deep, intuitive understanding. Then I can do good code reviews and fix bugs fast fast fast.

Now engineers leap from AI-assisted or even AI-dominated implementation to code reviews. Lots of reading code without that deep level of mastery. With this approach I have less confidence in the humans who are in the loop.

Kiboneu8 days ago

When coding agents are unavailable I just continue to code myself or focus on architecture specification / feature descriptions. This really helps me retain my skills, though there is some "skew" (I'm not sure how to describe it, it's a feeling). Making instructions for LLMs is, to me, pretty similar to doing the basic software architecture and specification work that a lot of people tend to skip (now there's no choice, and it's directly useful). When you skip specification for a sufficiently complex project, you likely introduce footguns along the way that slow down development significantly. So what would one expect when they run a bunch of agents based on a single-sentence prompt?!

Like the architecture work and making good quality specs, working on the code has a guiding effect on the coding agents. So in a way, it also helps to clarify items that may be more ambiguous in the spec. If I write some of the code myself, the agent will make fewer assumptions about my intent when it touches that code (especially when I didn't specify them in the architecture or if they are difficult to articulate in natural language).

In small iterations, the agent checks back for each task. Because I spend a lot of time on architecture, I already have a model in my mind of how small code snippets and features will connect.

Maybe my comfort with reviewing AI code comes from spending a large chunk of my life reverse engineering human code, to understand it to the extent that complex bugs and vulnerabilities emerge. I've spent a lot of time with different styles of code writing, from awful to "this programmer must have a permanent line to god to do this so elegantly". The models are trained on that, so I have a little cluster of neurons in my head that's shaped closely enough to follow the model's shape.

devnonymous8 days ago

From the "Discussion" section:

> This suggests that as companies transition to more AI code writing with human supervision, humans may not possess the necessary skills to validate and debug AI-written code if their skill formation was inhibited by using AI in the first place.

I'm reminded of "Kernighan's lever" :

> Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?

AI is writing code in the cleverest way possible, which then introduces cognitive load for anyone who hasn't encountered these patterns previously. Although one might say that AI would also assist in the debugging, you run the risk of adding further complexity in the process of 'fixing' the bugs, and before you know it you have a big stinking ball of mud.

Terretta8 days ago

> AI is writing code in the cleverest way possible …

On the contrary, without mastery guiding, AI writes code in the most boilerplate way possible, even if that means compromising logic or functionality.

> … which then introduces cognitive load for anyone who hasn't encountered these patterns previously

And for those who have. This is the enterprise Java effect. The old trope is Java was designed to make all devs median and all produce the same median code so enterprises don't have to worry about the individual devs, it's all the same bowl of unflavored oatmeal.

When you read code from a vibe-coding novice, it's difficult to grok the intended logic because it's buried within these chunks of enterprise pattern boilerplate, as if the solution was somehow regex'd at random from StackOverflow until some random combination happened to pass a similarly randomized bag of tests.

The cognitive load to reverse this mess into clean clear expression of logic is very high whether a human or machine "coded" this way.

In both cases, the antidote is caring for craft and mastery first, with an almost pseudocode clarity in expressing the desired outcome.

OK, but -- even this doesn't guarantee the result one wants.

Because even if the master writes the code themselves, they may find their intent was flawed. They expressed the intent clearly, but their intention wasn't helpful for the outcome needed.

This is where rapid iteration comes in.

A master of software engineering may be able to iterate on intent faster with the LLM typing the code for them than they can type and iterate on their own. With parallel work sessions, they may be able to explore intention space faster to reach the outcome.

Each seasonal improvement in LLM models' ability to avoid implementation errors while iterating this way makes the software developer with mastery but lack of perfect pre-visualization of intent more productive. Less time cleaning novice coding errors, more cycles per hour iterating the design in their head.

This type of productivity gain has been meaningful for this type of developer.

At the same time, the "chain of thought" or "reasoning" loops being built into the models are reaching into this intention space, covering more of the prompt engineering space for devs with less mastery, who are unable to express, much less iterate on, intent. This lets vibe "coders" imagine their productivity is improving as well.

If the output of the vibe coder (usually product managers, if you look closely) is considered to be something like a living mockup and not a product, then actual software engineers can take that and add the *-ilities (supportability, maintainability, etc.) that the vibe coder has never specified, whether vibing or product managing.

Using a vibed prototype can accelerate the transfer of product conception from the PM to the dev team more effectively than a PM yelling at a dev tech lead that the dev hasn't understood what the PM is saying the product should be. Devs can actually help this process by ensuring the product "idea" person is armed with a claude.md to orient the pattern-medianizer machine toward the below-the-waterline stuff engineering teams know is 80% of the cost-through-time.

There's not a lot of discussion of prototype vibing being a new way for product owners and engineering teams to gain clarity above the waterline, or whether it's productive. Here's a dirty secret: it's more productive in that it's more protective of the rarer skillset's time. The vibe time wasted is paid by the product owner (hallelujah), and the eng team can start with a prototype the product owner iterated on while getting their intent sorted out, so now engineering's iterations shift from intent (PM headspace) to implementation (eng headspace).

Both loops were tightened.

> you run the risk of adding further complexity in the process of 'fixing' the bugs and before you know it you have a big stinking ball of mud.

Iterating where the problem lies, uncoupling these separate intention and iteration loops, addresses this paradox.

vessenes8 days ago

@dang the title here is bait. I’d suggest the paper title: “Anthropic: How AI Impacts Skill Formation”

fragmede8 days ago

This isn't Twitter. email hn@ycombinator.com

vessenes4 days ago

Have you heard of K I B O?

baalimago8 days ago

I've noticed this as well. I delegate to agentic coders on tasks I need to have done efficiently, which I could do myself and lack time to do. Or on tasks which are in areas I simply don't care much for, for languages which I don't like very much etc

Wojtkie8 days ago

This is interesting. I started teaching myself Polars and used Claude to help me muscle through some documentation in order to meet deadlines on a project.

I found that Claude wasn't too great at first at it and returned a lot of hallucinated methods or methods that existed in Pandas but not Polars. I chalk this up to context blurring and that there's probably a lot less Polars code in the training corpus.

I found it most useful for quickly pointing me to the right documentation, where I'd learn the right implementation and then use it. It was terrible for the code, but helpful as a glorified doc search.

i_love_retros8 days ago

I don't understand how so many people can be OK with inflicting brain rot on themselves and basically engineering themselves out of a career.

I use a web UI to chat with AI and do research, and even then I sometimes have to give up and accept that it won't provide the best solution that I know exists and am just too lazy to flesh out on my own. And to the official docs I go.

But the coding tools, I'm sorry but they constantly disappoint me. Especially the agents. In fact the agents fucking scare me. Thank god copilot prompts me before running a terminal command. The other day I asked it about a cypress test function and the agent asked if it could run some completely unrelated gibberish python code in my terminal. That's just one of many weird things it's done.

My colleagues vibe code things because they don't have experience in the tech we use on our project, it gets passed to me to review with "I hope you understand this". Our manager doesn't care because he's all in on AI and just wants the project to meet deadlines because he's scared for his job, and each level up the org chart from him it's the same. If this is what software development is now then I need to find another career because its pathetic, boring, and stressful for anyone with integrity.

shayonj8 days ago

Being able to debug and diagnose difficult problems and distributed systems still remains a key skill, at least until Opus or some other model gets better at it.

I think being intentional about learning while using AI to be productive is where the stitch is, at least for folks earlier in their career. I touch on that in my post here as well: https://www.shayon.dev/post/2026/19/software-engineering-whe...

discreteevent8 days ago

The Learning Loop and LLMs [1] is well worth reading, and the Anthropic blog post above concurs with it in a number of places. It's fine to use LLMs as an assistant to understanding, but your goal as an engineer should always be understanding, and the only real way to get there is to struggle to make things yourself.

[1] https://martinfowler.com/articles/llm-learning-loop.html

epolanski8 days ago

> Importantly, using AI assistance didn’t guarantee a lower score. How someone used AI influenced how much information they retained. The participants who showed stronger mastery used AI assistance not just to produce code but to build comprehension while doing so

This is my experience exactly. I have never been learning as much as with AI.

It's interesting that the numbers show most users' skills degrade, but I dislike the general assumption that it cannot be used properly to learn faster as well.

grahamlee8 days ago

I’ve been making the case (e.g. https://youtu.be/uL8LiUu9M64?si=-XBHFMrz99VZsaAa [1]) that we have to be intentional about using AI to augment our skills, rather than outsourcing understanding: great to see Anthropic confirming that.

[1] plug: this is a video about the Patreon community I founded to do exactly that. Just want to make sure you’re aware that’s the pitch before you do ahead and watch.

crvdgc8 days ago

Among the six patterns identified, it's interesting that "Iterative AI Debugging" takes more time (and possibly tokens) but results in worse scores than letting AI do everything. So this part really should be handed over to agent loops.

The three high-scoring patterns are interesting as well. "Conceptual Inquiry" actually takes less time and yet doesn't improve the score over the other two, which is quite surprising to me.

MzxgckZtNqX5i8 days ago

Duplicate?

Submission about the arXiv pre-print: https://news.ycombinator.com/item?id=46821360

siliconc0w8 days ago

It's pretty insidious to think that these AI labs want you to become so dependent on them that once the VC-gravy-train stops they can hike the token price 10x and you'll still pay because you have no other choice.

(thankfully market dynamics and OSS alternatives will probably stop this but it's not a guarantee, you need like at least six viable firms before you usually see competitive behavior)

Zababa8 days ago

>It's pretty insidious to think that these AI labs want you to become so dependent on them that once the VC-gravy-train stops they can hike the token price 10x and you'll still pay because you have no other choice.

I don't think that's true? From what I understand most labs are making money from subscription users (maybe not if you include training costs, but still, they're not selling at a loss).

>(thankfully market dynamics and OSS alternatives will probably stop this but it's not a guarantee, you need like at least six viable firms before you usually see competitive behavior)

OpenAI is very aggressive with the volume of usage you can get from Codex, Google/DeepMind with Gemini. Anthropic reduced the token price with the latest Opus release (4.5).

Bnjoroge8 days ago

gotta say this is some impressive transparency for something that seems to somewhat intersect with their business objective.

qweiopqweiop8 days ago

It makes sense - juniors are coding faster but not understanding anything. Ironically it'll stop them getting more experienced despite feeling good. What I'm interested in is if the same applies for Senior+ developers. The soft signals are that people are feeling the atrophy but who knows...

renegade-otter8 days ago

It requires discipline. I use LLMs for mind-numbing refactoring and things I don't care about learning. If you want to learn something, you do it yourself. It's like the gym. No pain, no gain.

I am not saying you should be struggling performatively, like a person still proud in 2026 that they are still using Vim for large projects (good for you, eh), but sometimes you need to embrace the discomfort.

bayindirh8 days ago

> like a person still proud in 2026 that they are still using Vim for large projects.

I remember a small competition where people performed a well-defined "share this content with others" routine to showcase how OS A is way more intuitive than OS B. There was also an OS C, which was way slower than A and B. Then someone came along using OS C and topped the chart by a sizeable margin.

The point is, sometimes mastery pays back so much that, while there are theoretically better ways to do something, the time you save from that mastery is reason enough not to leave the tool you're using.

I also have a couple of "odd" tools that I use and love, which would cause confused looks from many people. Yet, I'm fast and happy with them.

skydhash8 days ago

> like a person still proud in 2026 that they are still using Vim for large projects

These large projects are almost always in Java, C#, and co., where the verbosity of the language makes an IDE practically required. Otherwise, it would be a struggle to identify which module to import or what prefix and suffix (Manager, Service, Abstract, Factory, DTO, …) to add to the concept name.

mkehrt8 days ago

Vim has been having a moment for a while. I have several coworkers who just use it and it seems to work fine for them.

empath758 days ago

With AI, what I am doing now is closer to engineering management than software dev. Most technical managers' coding skills atrophy over time, and I don't think that is really a problem.

system28 days ago

I respect Anthropic for writing an article like this. I can't imagine Sam Altman allowing someone to write something like this that is not 100% an advertisement of their own products or mightiness.

gezman78 days ago

They lost me in the abstract when they said "AI increases productivity, especially with novice workers." In my experience, it was the most experienced and fluent people in the engineering world who gained the most value from AI.

buredoranna8 days ago

Revealing: AI is a tool, and like any other tool, it's how you use it.

If you use it with the express intent to learn, it is an amazing tool.

If you use it as a crutch, it results in "learning avoidance".

simonw8 days ago

I wonder why these Anthropic researchers chose GPT-4o for their study.

segh8 days ago

Far far more people use ChatGPT than Claude.ai

simianwords8 days ago

This is really strange and warrants some skepticism

fragmede8 days ago

Anthropic paid a team to do a project, and gave them leeway to do it how they wanted. If anything, it's a good signal that Anthropic didn't lean on the scale to have the results go in their favor.

hxugufjfjf8 days ago

Isn’t it technically in their favor if competition is proven bad, even if it would be equally easy to prove their product likely equally bad or even worse?

irrelevant19157 days ago

Interesting read. Makes me wonder how often we mistake convenience for competence when using AI tools.

rkagerer8 days ago

Is anyone else concerned about the huge, centralized dependency AI introduces into your workflow?

This is one reason I've been resistant to using it. I don't want my work to go to the companies providing the models. I don't trust them. Not only with my data in the first place, but also that they'll keep providing the service over the long term without totally enshittifying the experience.

I'll be so much more excited by this when local models catch up to (or even exceed) frontier-level quality. How close are we to this?

(In my case, I don't even care if it costs a boatload in hardware capital to deploy.)

rkagerer8 days ago

Is GLM 4.7 still leading in terms of local models?

direwolf208 days ago

About as much as they're worried about AWS.

AstroBen8 days ago

I actually think this research points out why that isn't an issue: used properly, AI can help you learn and act as support. I'd also be fine if my LSP disappeared overnight. Kind of annoying but meh I'll be fine

You should be concerned if you're outsourcing your work to it, though. There's also no benefit to doing that outside of laziness (the research shows no statistically significant productivity improvement)

yalogin8 days ago

Is this the equivalent of cigarette companies putting “smoking kills” on their packaging?

generalizations8 days ago

From Plato's Phaedrus, on the invention of writing:

Theuth: "This invention, O king, will make the Egyptians wiser and will improve their memories; for it is an elixir of memory and wisdom that I have discovered."

Thamus replied: "Most ingenious Theuth, one man has the ability to beget arts, but the ability to judge of their usefulness or harmfulness to their users belongs to another; and now you, who are the father of letters, have been led by your affection to ascribe to them a power the opposite of that which they really possess. For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them.

You have discovered an elixir not of memory but of reminding; and you offer your pupils the appearance of wisdom, not true wisdom, for they will read many things without instruction and will therefore seem to know many things, when they are for the most part ignorant and hard to get along with, since they are not wise, but only appear wise."

Which is to say: "All this has happened before, and will happen again."

mriet8 days ago

I'm already having flashbacks to how the tobacco industry fared...

journal7 days ago

AI is OIL for your brain. Imagine being able to do 1000% more.

jmatthews8 days ago

I find this so hard to get my head around. I am wildly more prolific with agentic coding. It's at minimum a 10x for the first several iterations and when you get into the heavy detail part I am still the choke point.

luxuryballs8 days ago

I expect, especially in things like transit or healthcare, that people still need to review the code that is written. Even if we write bots that are good at scanning code for issues, we still can’t risk trusting any code blindly for some industries…

I can start to see the dangers of ai now, whereas before it was more imaginary sci-fi stuff I couldn’t pin down. On the other hand a dystopian sci-fi full of smart everything seems more possible now since code can be whipped up so easily, which means perhaps that the ability for your smart-monocle to find and hack things in every day life is also way more likely now if the world around you is saturated by quick and insecure code.

replwoacause8 days ago

I guess its cool they published this paper, but the cynic in me says this is more a PR/optics move to reinforce the narrative that "we're an AI safety-first company" because "look see, we published a study that undermines our own company's benefit", while knowing full well that at the end of the day a majority of people in AI decision making positions are always going to push harder and harder for X thing to be done as fast as possible. So while the warning is "nice" I suppose, it feels sort of like Sam Altman talking about how OpenAI needs to be more regulated by the government meanwhile authors, artists, and publishers are suing because their work was actively being stolen.

Sure, it sounds good to call for more regulation, or admit that there are downsides to your product, but when you know these things are falling largely on deaf ears and you continue operating business as usual, I wonder how much of it is just theater.

roark_howard8 days ago

Guilt driven attempt to save jobs?

kaelandt8 days ago

Nice to see an AI coding company allow such studies to come out, and it looks decently designed

falloutx8 days ago

Dont give them kudos, they are just trying to seem like a "research" company while submitting bogus papers on arXiv (not peer-reviewed)

oxag3n7 days ago

> For novice workers in software engineering or any other industry, our study can be viewed as a small piece of evidence toward the value of intentional skill development with AI tools.

TL;DR: it's not AI that makes you dumb, it's the wrong "Output style" - just choose the learning style.

HPsquared8 days ago

High-level languages impact assembly coding skills, which are almost extinct.

divbzero8 days ago

In another half century, will this sound like “How compilers impact the formation of assembly coding skills” sounds today?

gordonhart8 days ago

It hinges on whether this new high-level -> low-level transformation becomes reliable enough to build watertight abstractions on top of. If AI-generated code becomes good enough that you don't have to worry about the low-level representation 99.9...% of the time, absolutely. But we're pretty far from that at the moment, and it's impossible to say where things will be in another 50 years.

reedf18 days ago

This is a fancy way of saying that if you invent the calculator, people get worse at sums. I'm not an AI doomer or a boomer - but it's clear to me that some skills will be permanently relegated to AI.

lionkor8 days ago

Yes, except the calculator is right 100% of the time. LLMs are right ??% of the time, where ?? constantly changes, changes with prompts, etc.

ares6238 days ago

For $100/hour I can fill in those gaps for you!

falloutx8 days ago

Can we ban Anthropic research papers from being submitted to HN?

This study is so bad, the sample size is n = 52 and then in some conclusions it goes down to n = 2.

stuxnet798 days ago

It is sad to see how far Anthropic and OpenAI have strayed from their research roots, that a pitiful manuscript like this can pass muster.

raphman8 days ago

This seems to be a totally normal sample size for such kinds of studies where you look at quantitative and qualitative aspects. Is this the only reason why you find the study to be bad?

jerf8 days ago

If AIs were to plateau where they are for an extended period of time, I definitely worry about their net effect on software quality.

One of the things I worry about is people not even learning what they can ask the computer to do properly because they don't understand the underlying system well enough.

One of my little pet peeves, especially since I do a lot of work in the networking space, is code that works with strings instead of streams. For example, it is not that difficult (with proper languages and libraries) to write an HTTP POST handler that will accept a multi-gigabyte file and upload it to an S3 bucket, perhaps gzip'ing it along the way, such that any size file can be uploaded without reference to the RAM on the machine, by streaming it rather than loading the entire file into a string on upload, then uploading that file to S3, requiring massive amounts of RAM in the middle. There's still a lot of people and code out in the world that works that way. AIs are learning from all that code. The mass of not-very-well-written code can overwhelm the good stuff.
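
Roughly the shape I mean, as a quick Go sketch. I'm assuming the AWS SDK for Go v1 s3manager uploader here, and the bucket name, key, and /upload route are just placeholders for illustration, not anything from a real system:

    package main

    import (
        "compress/gzip"
        "io"
        "log"
        "net/http"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/s3/s3manager"
    )

    // uploadHandler streams the request body through gzip into S3 without
    // ever holding the whole file in memory: memory use is bounded by the
    // gzip buffer and the uploader's part buffers, not by the file size.
    func uploadHandler(uploader *s3manager.Uploader) http.HandlerFunc {
        return func(w http.ResponseWriter, r *http.Request) {
            pr, pw := io.Pipe()

            // Compress the incoming body into the write end of the pipe.
            go func() {
                gz := gzip.NewWriter(pw)
                _, err := io.Copy(gz, r.Body)
                if err == nil {
                    err = gz.Close()
                }
                pw.CloseWithError(err) // a nil err closes the pipe cleanly
            }()

            // The uploader reads from the pipe and uploads parts as data arrives.
            _, err := uploader.Upload(&s3manager.UploadInput{
                Bucket: aws.String("example-bucket"),      // placeholder
                Key:    aws.String("uploads/incoming.gz"), // placeholder
                Body:   pr,
            })
            if err != nil {
                http.Error(w, err.Error(), http.StatusBadGateway)
                return
            }
            w.WriteHeader(http.StatusCreated)
        }
    }

    func main() {
        sess := session.Must(session.NewSession())
        http.HandleFunc("/upload", uploadHandler(s3manager.NewUploader(sess)))
        log.Fatal(http.ListenAndServe(":8080", nil))
    }

The point isn't the specific SDK; it's that the request body never exists as one giant string anywhere along the path.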

And that's just one example. A whole bunch of stuff that proliferates across a code base like that and you get yet another layer of sloppiness that chews through hardware and negates yet another few generations of hardware advances.

Another thing is that, at the moment, code that is good for an AI is also good for a human. They may not be quite 100% the same, but right now they're still largely in sync. (And if we are wise, we will work to keep it that way, which is another conversation, and we probably won't because we aren't going to be this wise at scale, which is yet another conversation.) I do a lot of little things like use little types to maintain invariants in my code [1]. This is good for humans, and good for AIs. The advantages of strong typing still work for AIs as well. Yet none of the AIs I've used seem to use this technique, even with a code base in context that uses this technique extensively, nor are they very good at it, at least in my experience. They almost never spontaneously realize they need a new type, and whenever they go to refactor one of these things they utterly annihilate all the utility of the type in the process, completely blind to the concept of invariants. Not only do they tend to code in typeless goo, they'll even turn well-typed code back into goo if you let them. And the AIs are not so amazing that they overcome the problems even so.
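
For anyone who hasn't seen the pattern, here's a tiny made-up sketch of the idea in Go; the Port type is an invented illustration, not something from the linked post:

    package main

    import (
        "fmt"
        "log"
    )

    // Port carries an invariant: once constructed via NewPort, it is always a
    // valid port number, so downstream code never has to re-check it.
    type Port struct{ n int }

    // NewPort is the only way to obtain a Port; the range check can't be bypassed.
    func NewPort(n int) (Port, error) {
        if n < 1 || n > 65535 {
            return Port{}, fmt.Errorf("port out of range: %d", n)
        }
        return Port{n: n}, nil
    }

    func (p Port) Int() int { return p.n }

    // listenAddr can trust its argument; the invariant was established at the boundary.
    func listenAddr(host string, p Port) string {
        return fmt.Sprintf("%s:%d", host, p.Int())
    }

    func main() {
        p, err := NewPort(8080)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(listenAddr("127.0.0.1", p)) // 127.0.0.1:8080
    }

A refactor that replaces Port with a bare int "simplifies" the code while silently throwing away the guarantee, which is exactly what I see the AIs do.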

(The way these vibe coded code bases tend to become typeless formless goo as you scale your vibe coding up is one of the reasons why vibe coding doesn't scale up as well as it initially seems to. It's good goo, it's neat goo, it is no sarcasm really amazing that it can spew this goo at several lines per second, but it's still goo and if you need something stronger than goo you have problems. There are times when this is perfect; I'm just about to go spray some goo myself for doing some benchmarking where I just need some data generated. But not everything can be solved that way.)

And who is going to learn to shepherd them through writing better code, if nobody understands these principles anymore?

I started this post with an "if" statement, which wraps the whole rest of the body. Maybe AIs will advance to the point where they're really good at this, maybe better than humans, and it'll be OK that humans lose understanding of this. However, we remain a ways away from this. And even if we get there, it may yet be more years away than we'd like; 10, 15 years of accreting this sort of goo in our code bases and when the AIs that actually can clean this up get here they may have quite a hard time with what their predecessors left behind.

[1]: https://jerf.org/iri/post/2025/fp_lessons_types_as_assertion...

i_am_proteus8 days ago

TL;DR from the paper (https://arxiv.org/pdf/2601.20245):

>We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average.

lifetimerubyist8 days ago

[flagged]

dang8 days ago

Edit: actually, this account has been breaking the site guidelines so frequently that I've banned it. If you don't want to be banned, you're welcome to email hn@ycombinator.com and give us reason to believe that you'll follow the rules in the future.

--- original comment: ---

Can you please make your substantive points thoughtfully, rather than being snarky? This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

Also, please don't use quotation marks to make it look like you're quoting someone when you aren't.

wan238 days ago

Is it good or bad when companies research their own products and release the results honestly?

acedTrex8 days ago

Anthropic is kinda odd in that it seems to be still largely a research company that also has some products they sorta care about.

lifetimerubyist8 days ago

I just think it's hilarious that on one hand Anthropic will do research that basically concludes that using AI assistance makes you worse at your job.

While on the other hand, they want you to buy their AI assistance products at obscene prices and hope you get addicted to them so you can never stop giving them money.

They also loudly brag about how none of their engineers actually write code anymore - while the quality of their products is actually dog.

It's worse than snakeoil that does nothing - it's like they are selling you poison while telling you it'll kill you. We're supposed to applaud them for being honest? It's a joke. They are basically drug dealers getting high on their own supply.

cowboylowrez7 days ago

Yeah, all this seems to be the wrong tech at the wrong time. The actual technology is amazing, but it's getting mixed into all this greed and stupidity that is innate in our humanity, especially with the sort of society we have in the US. Heck, I even mooch off of Google's freebie Gemini, so it's not like I've got any room to talk, but I'm human too lol