The author discovered a remote zero-day vulnerability (CVE-2025-37899) in ksmbd, the Linux kernel's in-kernel SMB3 server, using OpenAI's o3 model rather than a fuzzer or conventional static analysis. The flaw is a use-after-free in the handler for the SMB2 LOGOFF command: one connection can free a session's user object while another connection bound to the same session is still dereferencing it, giving a remote attacker a path to kernel memory corruption and potentially code execution. The author supplied the relevant ksmbd source to o3 along with a prompt asking it to look for memory-safety bugs, and the model identified the concurrent-free scenario among its findings. A patch was submitted and accepted upstream, and distributions subsequently released updates addressing the vulnerability.
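The workflow described amounts to prompting the model over raw source rather than running traditional tooling. A minimal sketch of that kind of query via the OpenAI Python SDK is shown below; the file names, prompt wording, and the assumption that an o3-series model is reachable through the chat completions endpoint are all illustrative, not taken from the post.

```python
from pathlib import Path
from openai import OpenAI

# Hypothetical file list; the actual audit focused on ksmbd's session setup and
# logoff handling, so the relevant .c files would be gathered here.
SOURCES = ["smb2pdu.c", "user_session.c", "connection.c"]

code = "\n\n".join(
    f"/* ===== {name} ===== */\n{Path(name).read_text()}" for name in SOURCES
)

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

resp = client.chat.completions.create(
    model="o3",  # assumption: an o3-series model is available to this account
    messages=[{
        "role": "user",
        "content": (
            "Audit the following kernel SMB server code for memory-safety bugs. "
            "Pay particular attention to object lifetimes shared across "
            "connections, e.g. use-after-free of session state.\n\n" + code
        ),
    }],
)
print(resp.choices[0].message.content)
```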
Simon Willison's blog post showcases the unsettling yet fascinating ability of OpenAI's o3 model to identify where a photo was taken. By analyzing seemingly insignificant details within a picture, like the quality of the light, the vegetation, and distant landmarks, o3 can narrow down a photo's location with remarkable accuracy. Willison demonstrates this by feeding o3 one of his own photos with the location metadata stripped and watching the model reason its way from obscure clues to a surprisingly good guess. The result evokes both wonder and unease, highlighting the potential for privacy invasion while showcasing a significant leap in what general-purpose models can do with images.
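Willison ran the experiment through the ChatGPT interface rather than code, but an equivalent request can be scripted against the API. The sketch below is an assumption-laden illustration: the model name, prompt, and the use of a base64 data URL through the chat completions endpoint are not from the post.

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Any photo will do; strip EXIF/GPS metadata first so the model works from pixels alone.
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="o3",  # assumption: a vision-capable o3-series model is available
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Guess where this photo was taken, and explain which visual clues you used."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```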
Hacker News users discussed the implications of Simon Willison's blog post demonstrating o3's ability to guess photo locations from seemingly insignificant details. Several expressed awe at the model's power while also feeling uneasy about privacy implications. Some questioned the long-term societal impact of such readily available location identification, predicting increased surveillance and a chilling effect on photography. Others pointed out potential positive applications, such as verifying image provenance or aiding historical research. A few commenters focused on technical aspects, discussing potential countermeasures like blurring details or introducing noise, while others debated the ethical responsibilities of developers creating such tools. The overall sentiment leaned towards cautious fascination, acknowledging the impressive technical achievement while recognizing its potential for misuse.
The post "Jagged AGI: o3, Gemini 2.5, and everything after" argues that focusing on benchmarks and single metrics of AI progress creates a misleading narrative of smooth, continuous improvement. Instead, AI advancement is "jagged," with models displaying surprising strengths in some areas while remaining deficient in others. The author uses Google's Gemini 2.5 and other models as examples, highlighting how they excel at certain tasks while failing dramatically at seemingly simpler ones. This uneven progress makes it difficult to accurately assess overall capability and predict future breakthroughs. The post emphasizes the importance of recognizing these jagged capabilities and focusing on robust evaluations across diverse tasks to obtain a more realistic view of AI development. It cautions against over-interpreting benchmark results and promotes a more nuanced understanding of current AI capabilities and limitations.
Hacker News users discussed the rapid advancements in AI, expressing both excitement and concern. Several commenters debated the definition and implications of "jagged AGI," questioning whether current models truly exhibit generalized intelligence or simply sophisticated mimicry. Some highlighted the uneven capabilities of these models, excelling in some areas while lagging in others, creating a "jagged" profile. The potential societal impact of these advancements was also a key theme, with discussions around job displacement, misinformation, and the need for responsible development and regulation. Some users pushed back against the hype, arguing that the term "AGI" is premature and that current models are far from true general intelligence. Others focused on the practical applications of these models, like improved code generation and scientific research. The overall sentiment reflected a mixture of awe at the progress, tempered by cautious optimism and concern about the future.
The blog post details how to use Google's Gemini Pro and other large language models (LLMs) for creative writing, specifically focusing on generating poetry. The author demonstrates how to "hallucinate" text with these models by providing evocative prompts, comparing results across Gemini, Anthropic's Claude 3.7 Sonnet, and OpenAI's o1 and o3. The process involves specific prompting techniques, including detailed scene setting and instructing the LLM to adopt the style of a given author or work. The post aims to make these powerful creative tools more accessible by explaining the methods in a straightforward manner and providing code examples for using the Gemini API.
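The post supplies its own Gemini code; as a rough stand-in, a minimal call with the google-generativeai Python package might look like the following, where the model name and the scene-setting prompt are placeholders rather than the author's.

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Placeholder prompt in the spirit of the post: set a scene, then ask for verse
# in the style of a named author or form.
prompt = (
    "You are standing on a cliff above a grey winter sea at dusk.\n"
    "Write a short poem about what you see, in the style of an Elizabethan sonnet."
)

model = genai.GenerativeModel("gemini-1.5-pro")  # model name is an assumption
response = model.generate_content(prompt)
print(response.text)
```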
Hacker News commenters discussed the accessibility of the "hallucination" examples provided in the linked article, appreciating the clear demonstrations of large language model limitations. Some pointed out that these examples, while showcasing flaws, also highlight the potential for manipulation and the need for careful prompting. Others discussed the nature of "hallucination" itself, debating whether it's a misnomer and suggesting alternative terms like "confabulation" might be more appropriate. Several users shared their own experiences with similar unexpected LLM outputs, contributing anecdotes that corroborated the author's findings. The difficulty in accurately defining and measuring these issues was also raised, with commenters acknowledging the ongoing challenge of evaluating and improving LLM reliability.
OpenAI's o3 model achieved a new high score on the public ARC-AGI leaderboard (ARC-AGI-Pub), marking a significant advance on complex reasoning problems. The benchmark tests abstract reasoning, requiring models to solve novel puzzle-like tasks not seen during training. o3 substantially improved upon previous top scores, demonstrating an ability to generalize and adapt to unseen challenges. This accomplishment suggests progress towards more general and robust AI systems.
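For context, ARC-AGI tasks are small grid-transformation puzzles distributed as JSON: a few demonstration input/output pairs plus test inputs, with the rule to be inferred from the demonstrations alone. The toy task below mimics that structure; the specific puzzle and rule are invented for illustration.

```python
# A toy task in the ARC JSON structure: each grid is a list of rows of colour codes 0-9.
# Invented rule for this example: flip each row left-to-right.
task = {
    "train": [
        {"input":  [[1, 0, 0],
                    [0, 2, 0]],
         "output": [[0, 0, 1],
                    [0, 2, 0]]},
    ],
    "test": [
        {"input": [[3, 0, 0],
                   [0, 0, 4]]},
    ],
}

def solve(grid):
    """Apply the rule inferred from the demonstration pair: mirror each row."""
    return [row[::-1] for row in grid]

# Check the inferred rule against the demonstrations, then apply it to the test input.
assert all(solve(p["input"]) == p["output"] for p in task["train"])
print(solve(task["test"][0]["input"]))  # -> [[0, 0, 3], [4, 0, 0]]
```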
HN commenters discuss the significance of OpenAI's o3 model achieving a high score on the ARC-AGI-Pub benchmark. Some express skepticism, pointing out that the benchmark might not truly represent AGI and questioning whether the progress is as substantial as claimed. Others are more optimistic, viewing it as a significant step towards more general AI. The model's reliance on large amounts of test-time compute is highlighted, with some arguing this is a practical approach while others question whether it truly demonstrates understanding. Several comments debate the nature of intelligence and whether these benchmarks are adequate measures. Finally, there's discussion about the closed nature of OpenAI's research and the lack of reproducibility, hindering independent verification of the claimed breakthrough.
Summary of Comments (178)
https://news.ycombinator.com/item?id=44081338
Hacker News users discussed the efficacy of using large language models like OpenAI's o3 for vulnerability discovery, with some praising the potential while acknowledging it's not a silver bullet. Several commenters pointed out the vulnerability seemed relatively simple to spot, questioning the need for o3 in this specific case. The conversation also touched on the disclosure process and the discoverer's decision to publish exploit details before a patch was available, sparking debate about responsible disclosure practices. Some users criticized aspects of the write-up itself, such as claims about the novelty of o3's capabilities. Finally, the prevalence of memory safety issues in C code and the role of memory-safe languages like Rust in mitigating such vulnerabilities were also discussed.
The Hacker News post discussing the blog post about CVE-2025-37899 has generated a substantial number of comments, many of which delve into various technical aspects of the vulnerability and the process used to discover it.
Several commenters commend the author's approach of using OpenAI's o3 model to audit the ksmbd code. They note the ingenuity of leveraging a tool not typically associated with security research for this purpose, and some discuss how a language model, given enough of the surrounding code as context, can surface latent bugs that fuzzers and manual review had missed.

A few comments delve into the specific details of the vulnerability, discussing the memory management mistake that ultimately leads to the exploit: session state freed by one connection while another connection bound to the same session is still using it.

The use of KASAN (Kernel Address Sanitizer) is also highlighted in the comments, with users praising its efficacy in confirming and pinpointing this class of problem. The discussion touches on the importance of robust sanitizers in modern software development, especially for complex systems like the Linux kernel.
Some commenters express concern about the implications of this discovery, pointing out the potential severity of a remote zero-day in such a widely used component. They discuss the potential impact on various systems and the importance of prompt patching.
There's also a discussion around the responsible disclosure process, with commenters expressing appreciation for the author's approach and the timely patching of the vulnerability. The comments highlight the importance of coordinated disclosure to minimize potential harm while ensuring that users have access to necessary updates.
A recurring theme in the comments is the relative simplicity of the vulnerability once it was uncovered. This leads to some speculation about why it wasn't discovered earlier, with suggestions ranging from the complexity of the codebase to the limitations of traditional testing methods.
Finally, some commenters share their own experiences with similar vulnerabilities and discuss the challenges of finding and fixing bugs in complex systems. They offer insights into various debugging techniques and tools, contributing to a broader conversation about software security and best practices.