The blog post describes a method to disable specific kernel functions within a user-space process by intercepting system calls. It leverages the ptrace
system call to attach to a process, modify its system call table entries to point to a custom function, and then detach. The custom function can then choose to emulate the original kernel function, return an error, or perform other actions, effectively blocking or altering the behavior of targeted system calls for the specified process. This technique allows for granular control over kernel interactions within a user-space process, potentially useful for security sandboxing or debugging.
eBPF program portability can be tricky due to differences in kernel versions and configurations. The blog post highlights how seemingly minor variations, such as a missing helper function or a change in struct layout, can cause a program that works perfectly on one kernel to fail on another. It emphasizes the importance of using the bpftool
utility for introspection, allowing developers to compare kernel features and identify discrepancies that might be causing compatibility issues. Additionally, building eBPF programs against the oldest supported kernel and strategically employing the LINUX_VERSION_CODE
macro can enhance portability and minimize unexpected behavior across different kernel versions.
The Hacker News comments discuss potential reasons for eBPF program incompatibility across different kernels, focusing primarily on kernel version discrepancies and configuration variations. Some commenters highlight the rapid evolution of the eBPF ecosystem, leading to frequent breaking changes between kernel releases. Others point to the importance of checking for specific kernel features and configurations (like CONFIG_BPF_JIT
) that might be enabled on one system but not another, especially when using newer eBPF functionalities. The use of CO-RE (Compile Once – Run Everywhere) and its limitations are also brought up, with users encountering problems despite its intent to improve portability. Finally, some suggest practical debugging strategies, such as using bpftool
to inspect program behavior and verify kernel support for required features. A few commenters mention the challenge of staying up-to-date with eBPF's rapid development, emphasizing the need for careful testing across target kernel versions.
The Linux Kernel Defence Map provides a comprehensive overview of security hardening mechanisms available within the Linux kernel. It categorizes these techniques into areas like memory management, access control, and exploit mitigation, visually mapping them to specific kernel subsystems and features. The map serves as a resource for understanding how various kernel configurations and security modules contribute to a robust and secure system, aiding in both defensive hardening and vulnerability research by illustrating the relationships between different protection layers. It aims to offer a practical guide for navigating the complex landscape of Linux kernel security.
Hacker News users generally praised the Linux Kernel Defence Map for its comprehensiveness and visual clarity. Several commenters pointed out its value for both learning and as a quick reference for experienced kernel developers. Some suggested improvements, including adding more details on specific mitigations, expanding coverage to areas like user namespaces and eBPF, and potentially creating an interactive version. A few users discussed the project's scope, questioning the inclusion of certain features and debating the effectiveness of some mitigations. There was also a short discussion comparing the map to other security resources.
Landrun is a tool that utilizes the Landlock Linux Security Module (LSM) to sandbox processes without requiring root privileges or containers. It allows users to define fine-grained access control rules for a target process, restricting its access to the filesystem, network, and other resources. By leveraging Landlock's unprivileged mode and a clever bootstrapping process involving temporary filesystems, Landrun simplifies sandbox setup and makes robust sandboxing accessible to regular users. This enables easier and more secure execution of potentially untrusted code, contributing to a more secure desktop environment.
HN commenters generally praise Landrun for its innovative approach to sandboxing, making it easier than traditional methods like containers or VMs. Several highlight the significance of using Landlock LSM for security, noting its kernel-level enforcement as a robust mechanism. Some discuss potential use cases, including sandboxing web browsers and other potentially risky applications. A few express concerns about complexity and debugging challenges, while others point out the project's early stage and potential for improvement. The user-friendliness compared to other sandboxing techniques is a recurring theme, with commenters appreciating the streamlined process. Some also discuss potential integrations and extensions, such as combining Landrun with Firejail.
This blog post presents a revised and more robust method for invoking raw OpenBSD system calls directly from C code, bypassing the standard C library. It improves upon a previous example by handling variable-length argument lists and demonstrating how to package those arguments correctly for system calls. The core improvement involves using assembly code to dynamically construct the system call arguments on the stack and then execute the syscall
instruction. This allows for a more general and flexible approach compared to hardcoding argument handling for each specific system call. The provided code example demonstrates this technique with the getpid()
system call.
Several Hacker News commenters discuss the impracticality of the raw syscall demo, questioning its real-world usefulness and emphasizing that libraries like libc exist for a reason. Some appreciated the technical depth and the exploration of low-level system interaction, viewing it as an interesting educational exercise. One commenter suggested the demo could be useful for specialized scenarios like writing a dynamic linker or a microkernel. There was also a brief discussion about the performance implications and the idea that bypassing libc wouldn't necessarily result in significant speed improvements, and might even be slower in some cases. Some users also debated the portability of the code and suggested alternative methods for achieving similar results.
Testtrim, a tool designed to reduce the size of test suites while maintaining coverage, ironically struggled to effectively test itself due to its reliance on ptrace for syscall tracing. This limitation prevented Testtrim from analyzing nested calls, leading to incomplete coverage data and hindering its ability to confidently trim its own test suite. A recent update introduces a novel approach using eBPF, enabling Testtrim to accurately trace nested syscalls. This breakthrough allows Testtrim to thoroughly analyze its own behavior and finally optimize its test suite, demonstrating its newfound self-testing capability and reinforcing its effectiveness as a test suite reduction tool.
The Hacker News comments discuss the complexity of testing tools like Testtrim, which aim to provide comprehensive syscall tracing. Several commenters appreciate the author's deep dive into the technical challenges and the clever solution involving a VM and intercepting the vmexit
instruction. Some highlight the inherent difficulties in testing tools that operate at such a low level, where the very act of observation can alter the behavior of the system. One commenter questions the practical applications, suggesting that existing tools like strace
and ptrace
might be sufficient in most scenarios. Others point out that Testtrim's targeted approach, specifically focusing on nested virtualization, addresses a niche but important use case not covered by traditional tools. The discussion also touches on the value of learning obscure assembly instructions and the excitement of low-level debugging.
The article explores a new method for process creation using io_uring, aiming to improve efficiency and reduce overhead compared to traditional fork()
and execve()
. This new approach uses a "registered executable" within io_uring, allowing asynchronous process launching without the performance penalties of copying memory pages between parent and child processes. The proposed solution involves two new system calls: pidfd_spawn()
and pidfd_wait()
. pidfd_spawn()
creates a new process from the registered executable and returns a process file descriptor, while pidfd_wait()
provides an asynchronous wait mechanism using io_uring. This approach offers a streamlined process-creation pathway within the io_uring framework, potentially boosting performance for applications that frequently spawn processes, like containers or web servers.
Hacker News users discuss the implications of io_uring's new process creation capabilities. Several express excitement about the potential performance improvements, particularly for applications that frequently spawn processes, like web servers. Some highlight the security benefits of avoiding execve, while others raise concerns about the complexity introduced by this new feature and the potential for misuse. A few commenters delve into the technical details, comparing the approach to other process creation methods and discussing the trade-offs involved. Several anticipate interesting use cases, including containerization and sandboxing. One user questions if io_uring is becoming overly complex and straying from its original purpose.
Summary of Comments ( 6 )
https://news.ycombinator.com/item?id=44047741
HN commenters discuss the blog post's method of disabling kernel functions by overwriting the system call table entries with
int3
instructions. Several express concerns about the fragility and unsafety of this approach, particularly in multi-threaded environments and due to potential conflicts with security mitigations like SELinux. Some suggest alternatives like usingLD_PRELOAD
to intercept and redirect function calls or employing seccomp-bpf for finer-grained control. Others question the practical use cases for this technique, acknowledging its potential for debugging or specialized security applications but cautioning against its general use. A few commenters share anecdotal experiences or related techniques, like disablingptrace
to hinder debuggers. The overall sentiment is one of cautious curiosity mixed with skepticism regarding the robustness and practicality of the described method.The Hacker News post discussing Chad Austin's article on disabling kernel functions has a moderate number of comments, mostly focusing on the practicality and security implications of the technique described.
Several commenters express skepticism about the usefulness of this approach in real-world scenarios. One commenter highlights the limited scope of the technique, pointing out that it only affects the calling process and not the entire system. They argue that if a serious security vulnerability exists that requires disabling a kernel function, a system-wide solution would be necessary. Another commenter questions the practicality of preemptively disabling functions, suggesting it's difficult to predict which functions might be exploited in the future. They propose that a more reactive approach, focusing on patching vulnerabilities as they are discovered, is likely more effective.
Some comments discuss the potential security risks associated with disabling kernel functions. One commenter notes that disabling certain critical functions could destabilize the system, leading to crashes or unexpected behavior. Another expresses concern that attackers could potentially exploit this mechanism itself, disabling essential security functions to gain further access to the system.
A few commenters delve into the technical details of the implementation. One discusses the challenges of determining which functions are safe to disable without causing system instability. Another mentions the possibility of using this technique for performance optimization, by disabling unused or unnecessary kernel functions. However, they acknowledge that the potential performance gains are likely to be minimal.
One commenter provides an alternative perspective, suggesting that the technique could be valuable in highly specialized environments, such as embedded systems or security-critical applications. They argue that in these contexts, the limited scope and potential risks might be acceptable trade-offs for the added security benefits.
There's a thread discussing the difference between disabling a function and simply not calling it. Commenters clarify that disabling prevents the function from being called by any process, including libraries or other system components, while simply not calling it in your own code only affects your process's behavior.
Finally, some commenters express appreciation for the ingenuity of the approach, even if they acknowledge its limited practical application. They see it as an interesting exploration of the Linux kernel's capabilities and a potential starting point for further research in system security.