AI finds thousands of zero-day exploits... including in FreeBSD.

Running a "bug-find" on each file of a large project has little chance of finding useful issues.

It turned out useful for me.

But this touches on the biggest problem I see with this style of code review: the context size of the LLM in use. I find that even moderately sized files overrun the context window with too many input tokens. A whole-codebase review is out of the question, so let's leave that aside. But even grouping related files, to catch bugs that are only detectable when looking at the group, will be challenging.
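One crude way around this is to batch related files under a token budget. A minimal sketch, assuming a rough ~4 characters/token heuristic and a made-up 180k-token budget; the ratio, budget, and function names are all illustrative, not any real tool:

```python
# Sketch: greedily pack files into review batches that fit a context budget.
# The ~4 chars/token ratio and the default budget are illustrative assumptions.

def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for typical source code."""
    return max(1, len(text) // 4)

def batch_files(files: dict[str, str], budget: int = 180_000) -> list[list[str]]:
    """Greedily pack files into batches whose estimated tokens fit the budget."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for name, text in files.items():
        t = estimate_tokens(text)
        if t > budget:
            # A single file that overruns the window on its own: the problem
            # described above, which batching cannot solve.
            raise ValueError(f"{name} alone exceeds the context budget")
        if used + t > budget:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += t
    if current:
        batches.append(current)
    return batches
```

Greedy packing like this keeps related neighbors together only if the input ordering already reflects relatedness; a real tool would group by import/include graph first.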

Keep in mind that Anthropic charges you double when you use their models with the larger 1-million-token context window. And that is generally out of reach for local LLMs.
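Back-of-envelope arithmetic on what that surcharge means for a big review. The per-token price and the 200k threshold below are illustrative assumptions, not quoted from any price list; the point is the 2x multiplier past the standard context size:

```python
# Illustrative long-context surcharge arithmetic. Prices are assumed, not real.

BASE_INPUT_PER_MTOK = 3.00      # assumed $/million input tokens, standard context
LONG_CONTEXT_MULTIPLIER = 2.0   # the doubled rate mentioned above

def review_cost(input_tokens: int, threshold: int = 200_000) -> float:
    """Cost of one review request, doubling the per-token rate past the threshold."""
    rate = BASE_INPUT_PER_MTOK / 1_000_000
    if input_tokens > threshold:
        rate *= LONG_CONTEXT_MULTIPLIER
    return input_tokens * rate
```

Under these made-up rates, a 500k-token review costs ten times a 100k-token one, not five: the doubled rate applies to the whole request.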
 
The hype is big with this one...
The hype is unavoidable and we gotta learn to separate the wheat from the chaff.

There will come a time, very soon, when it will be considered rude to submit code for review that you didn't first run through an LLM. This is already being automated in CI as we speak, where an AI agent does the pre-review anyway.
 
From what I heard, I'm not sure they ever intend to release it. Instead they will sell it to industry partner companies to identify and fix exploits, but will not release it for general use. That was the gist of the news report I heard earlier: they consider it too dangerous to put out on general release.

I'm sure the opposition is working on the same kind of thing...
What would be nice is if they would allow free-software foundations to add their tools to code-review automation, so that a key committee could get the bug reports, allowing them to quietly patch things or reprioritize work that needs really invasive fixes.
 
The hype is unavoidable and we gotta learn to separate the wheat from the chaff.

There will come a time, very soon, when it will be considered rude to submit code for review that you didn't first run through an LLM. This is already being automated in CI as we speak, where an AI agent does the pre-review anyway.
There's a flaw with that. If finding problems in software with AI is proven effective, why does it still take time? They should find everything at once, considering the available computing capacity. A few days with a very deep search, maybe. The method stays the same; it's just a matter of operations per second.
I think everything that can be automated to improve results has already been done in the security corner anyway.
 
There's a flaw with that. If finding problems in software with AI is proven effective, why does it still take time? They should find everything at once, considering the available computing capacity. A few days with a very deep search, maybe. The method stays the same; it's just a matter of operations per second.
I think everything that can be automated to improve results has already been done in the security corner anyway.
It's political.

What can a state-sponsored bad actor damage? If the cost of finding the vulnerabilities is high but not prohibitively high, it's only a matter of time before a target-rich environment justifies the cost and lots of people get nasty surprises.

The implication is that the cost has fallen from prohibitively high for a relatively limited effect on the target to acceptable for sufficiently large targets, and if that trend continues, weak but well-funded bad actors will be able to threaten lots of people. The big scare is the forward projection of the falling cost of exploit discovery.
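That forward projection can be made concrete with a toy compound-decline model. The starting cost, decline rate, and affordability target below are all invented for illustration, not estimates:

```python
# Toy projection: if exploit-discovery cost falls at a steady annual rate,
# how many years until it drops below some affordability threshold?
# All inputs here are made up for illustration.

def years_until_affordable(cost: float, annual_decline: float, target: float) -> int:
    """Years of compounded decline until cost reaches the target or below."""
    years = 0
    while cost > target:
        cost *= (1.0 - annual_decline)
        years += 1
    return years
```

For example, a hypothetical $20,000 campaign cost halving every year reaches sub-$1,000 territory in five years; the scary part is how insensitive the conclusion is to the exact inputs.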
 
The hype is unavoidable and we gotta learn to separate the wheat from the chaff.

There will come a time, very soon, when it will be considered rude to submit code for review that you didn't first run through an LLM. This is already being automated in CI as we speak, where an AI agent does the pre-review anyway.
Maybe (probably?) "Project Glasswing" will become a permanent CI service that OSS organizations can use before releasing upgrades.
 
Never. Very expensive to run and these companies want ROI.
Update: "Anthropic is committing up to $100M in usage credits for Mythos Preview across these efforts, as well as $4M in direct donations to open-source security organizations." https://www.anthropic.com/glasswing

So Anthropic allows using the model nearly for free, and is also giving real money to the OSS members who had to review and include the security patches. As for the future, I don't know.
 
No AI vendor is currently charging market rates. Their prices are all artificially low, draining corporate budgets, venture funding, or government subsidies.

It is impossible to gauge the potential impact of this technology until it gets to realistic cost.
 
No AI vendor is currently charging market rates. Their prices are all artificially low, draining corporate budgets, venture funding, or government subsidies.

It is impossible to gauge the potential impact of this technology until it gets to realistic cost.
There's lots of hype but there's also doomerism and both sides are wrong.

The most obvious use-case for AI is cheap entertainment like memes and NSFW content that even OpenAI is flirting with.

Those waiting for the AI bubble to pop will be disappointed... We don't need AGI for LLMs to be useful. The tech is here to stay.
 
Those waiting for the AI bubble to pop will be disappointed... We don't need AGI for LLMs to be useful. The tech is here to stay.
Obviously it's here to stay, but whether they'll ever make a positive financial ROI remains to be seen. I think sooner or later one or more 'killer apps' will emerge, perhaps things that haven't been thought of yet. But it's not clear right now how any of these companies are going to turn a real profit.

I just hope it doesn't end up being a mechanical hound prowling the streets outside while you sleep, as Ray Bradbury predicted. Although currently the ones in Atlanta have a remote human operator in India, apparently. Now imagine if this thing were the size of a horse, with a machine gun mounted on top.
View: https://www.youtube.com/shorts/2hZwgmDed9Y
But all I can see right now is "burn, baby, burn"...
 
Obviously it's here to stay, but whether they'll ever make a positive financial ROI remains to be seen. I think sooner or later one or more 'killer apps' will emerge, perhaps things that haven't been thought of yet. But it's not clear right now how any of these companies are going to turn a real profit.
The ones selling the shovels are making insane profits. The hyperscalers have yet to turn a profit, but they have lots of money to spend. They can also reinvent themselves, like Google and Meta, which are no longer strictly IT companies but are in the ad business.
 
Very true. Both sides are more similar than they think.
  • For AI: Bullsh*tting that AI will take over the world.
  • Against AI: Bullsh*tting that AI will take over the world.

You must know the saying from a few years ago that software was eating the world. Now AI is eating software, through commoditization and licence laundering, bringing total disruption.

Now is the worst time to take anything for granted... Whereas previously you'd hire a designer for a logo or a web page, that's no longer the case. Those who have an IT job right now should consider themselves lucky, or even privileged.
 
The ones selling the shovels are making insane profits. The hyperscalers have yet to turn a profit, but they have lots of money to spend. They can also reinvent themselves, like Google and Meta, which are no longer strictly IT companies but are in the ad business.
Well, yes. Nvidia and other semiconductor companies, and the firms that make all the related hardware and infrastructure, are doing extremely well out of it. And power generation, and water. But as for the 'AI' companies themselves... just keep on burnin'
 
Now is the worst time to take anything for granted... Whereas previously you'd hire a designer for a logo or a web page, that's no longer the case. Those who have an IT job right now should consider themselves lucky, or even privileged.
Sounds like a bum job to me, then. Do something that can't be automated away, not "IT".
 
But this touches on the biggest problem I see with this style of code review: the context size of the LLM in use. I find that even moderately sized files overrun the context window with too many input tokens. A whole-codebase review is out of the question, so let's leave that aside. But even grouping related files, to catch bugs that are only detectable when looking at the group, will be challenging.
Some co-workers ran into this very problem today.
 
 
I doubt these would see a use case in backward compatibility. Just build BSD for the long term. Let AI do what AI does. What use case does a token have in C/C++, except sitting atop it as an abstraction layer?
 
AI has been fun. I have learned a lot. I would never report bugs I find, given my own lack of expertise, so I see why people react this way.
I have used Codex with a Pro sub and have it constantly hammering away at stuff day and night... it's fun!
Eventually things will come to a head and society will reach an equilibrium.
I think what will end up happening is that places like GitHub will separate AI-generated bug reports from user-submitted ones. That would keep the sludge away, I think.
 
Very true. Both sides are more similar than they think.
  • For AI: Bullsh*tting that AI will take over the world.
  • Against AI: Bullsh*tting that AI will take over the world.
Big centralized data will take over the world: a few multinationals hijacking all public knowledge. Pretty much phase 2, after ending piracy, which is actually freedom of information.
 
The LWN article, with its "plateau has not been hit" and "we need to prove AI can't find exploits en masse", again spreads BS. Give us your data, or you're missing out, this time not on development agility but on security. What they're not actually mentioning is the power bill to find one exploit.
Anthropic's Mythos Preview writeup from 7 April was pretty upfront in several places about the cost of finding their exploits, at least in terms of API pricing (the true cost may be higher, of course - but also bear in mind the long-term trend of reducing compute costs so these figures are likely to become more affordable eventually). https://red.anthropic.com/2026/mythos-preview/

There's one case where they provide Mythos Preview with an N-day vuln previously unearthed by a fuzzer and get it to create an exploit. "In November 2024, the Syzkaller fuzzer identified a KASAN slab-out-of-bounds read in netfilter's ipset. ... [Claude chains some stuff together] ... And this, finally, grants the user full root permissions and the ability to make arbitrary changes to the machine. Creating this exploit (starting from the syzkaller report) cost under $1000 at API pricing, and took half a day to complete."

And what happened with OpenBSD reveals one of the other complications with fairly putting a cost on these discoveries: "This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview. Across a thousand runs through our scaffold, the total cost was under $20,000, and we found several dozen more findings. While the specific run that found the bug above cost under $50, that number only makes sense with full hindsight. Like any search process, we can't know in advance which run will succeed."
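The point about search can be put in simple terms: with a fixed per-run cost and a per-run success probability that is unknowable in advance, the number of runs to the first success is geometrically distributed, so the expected spend is cost divided by probability. The numbers below are illustrative, loosely matching the roughly $20/run implied by the quote:

```python
# Expected cost of a stochastic search: runs-to-first-success is geometric
# with mean 1/p, so the expected spend is cost_per_run / p.
# The figures used in the test are illustrative, not Anthropic's.

def expected_search_cost(cost_per_run: float, p_success: float) -> float:
    """Expected total spend until the first successful run."""
    return cost_per_run / p_success
```

At an assumed $20 per run and a 1-in-1000 success chance, you should budget around $20,000: the full campaign cost, not the $50 of the one lucky run.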

I don't know which open source projects are going to benefit from Project Glasswing, but purchasing this kind of AI code review is clearly not going to be an affordable option for many non-commercial projects. But I can see it being attractive to big tech firms if you compare these figures to some of their bug bounty programs. This might be a bad time to be a project with little financial firepower but which has just enough real-world usage in important infrastructure to be a target of interest.

Incidentally, it's not just Anthropic who are at it. Six of the eight FreeBSD CVEs this month were found by AI. Aside from the two found by Anthropic, my suspicion is that the other four were the three by https://aisle.com/about-us and the one by https://blog.calif.io/archive?sort=new

See https://nitter.net/cperciva/status/2049591719143059860#m
 