With a concrete bug report like “using codec xyz and input file f3 10 4d 26 f5 0a a1 7e cd 3a 41 6c 36 66 21 d8… ffmpeg crashes with an oob memory error”, it’s pretty simple to confirm that such a crash happens.
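For what it's worth, the mechanical part of that confirmation could look something like the sketch below. It is only a rough illustration, not anything ffmpeg or Big Sleep actually ships: `write_input` and `crashed_with_signal` are hypothetical helper names, the four bytes are just the start of the truncated example above, and the child process is a stand-in that segfaults on purpose rather than a real decoder. It assumes a POSIX system, where a negative return code means the process was killed by a signal.

```python
import signal
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List, Optional


def write_input(hex_bytes: str, directory: Path) -> Path:
    """Write the hex-encoded bytes quoted in a bug report to a file."""
    path = directory / "repro.bin"
    # bytes.fromhex ignores the spaces between byte pairs.
    path.write_bytes(bytes.fromhex(hex_bytes))
    return path


def crashed_with_signal(command: List[str], timeout: float = 30.0) -> Optional[str]:
    """Run the command; return the signal name if a signal killed it, else None."""
    result = subprocess.run(command, capture_output=True, timeout=timeout)
    if result.returncode < 0:
        # On POSIX, a return code of -N means the child died from signal N.
        return signal.Signals(-result.returncode).name
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = write_input("f3 10 4d 26", Path(tmp))
        # Stand-in for the real decoder: a child process that segfaults on purpose.
        # A real check might run something like
        # ["ffmpeg", "-i", str(sample), "-f", "null", "-"] instead.
        stand_in = [
            sys.executable,
            "-c",
            "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
            str(sample),
        ]
        print(crashed_with_signal(stand_in))
```

This only covers the easy case where the report already contains a reproducer. A sanitizer build may report the error on stderr and exit with a nonzero code instead of dying from a signal, and none of this decides whether the crash is actually exploitable.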
Google’s Big Sleep was pretty good: it gave a Python program that generated an invalid file. The report looked plausible, and it was a real issue. The problem is that literally every other generative AI bug report looks equally plausible. As I mentioned before, curl is having a similar issue. And here’s what the lead maintainer of curl has to say:
Stenberg said the amount of time it takes project maintainers to triage each AI-assisted vulnerability report made via HackerOne, only for them to be deemed invalid, is tantamount to a DDoS attack on the project.
So you can claim testing is simple, but it looks like that isn’t the case. I would say one of the problems is that all these people are volunteers, so they probably have a very, very limited amount of time to spend on these projects.
This was the first search hit about ffmpeg CVEs, from June 2024, so not about the current incident. It lists four CVEs: three of them memory errors (buffer overflow, use-after-free) and one off-by-one error. The class of errors in the first three is supposedly completely eliminated by Rust.
FFmpeg is not just C code, but also large portions of handwritten, ultra-optimized assembly code (per architecture, too…). You are free to rewrite it in Rust if you so desire, but I stated it above and will state it again: ffmpeg chose performance over security. Rust currently isn’t as performant as optimized C code, and I highly doubt that even unsafe Rust can beat hand-optimized assembly (C can’t, anyway).
(Google and many big tech companies like ultra-performant projects because performance equals power savings equals cost savings at scale. But this means weaker security when it comes to projects like ffmpeg…)
Rust currently isn’t as performant as optimized C code, and I highly doubt that even unsafe Rust can beat hand-optimized assembly (C can’t, anyway).
A bit tangential, but to address this point: nothing beats the most optimized assembly code. At best, programming languages can only hope to match it.
Rust does have a macro for inlining assembly into your program (asm!), but it has to be used from unsafe code and isn’t super easy to work with.
Rewriting ffmpeg in Rust is not a solution here (as you’re saying).