This post explores the challenges of generating deterministic random numbers and using cosine within Nix expressions. It highlights that Nix's purity, while beneficial for reproducibility, makes tasks like generating unique identifiers difficult without resorting to external dependencies or impure functions. The author demonstrates various approaches, including using the derivation name as a seed for a pseudo-random number generator (PRNG) and leveraging builtins.currentTime as a less deterministic but readily available alternative. The post also delves into the lack of a built-in cosine function in Nix and presents workarounds, like writing a custom implementation or relying on a pre-built library, showcasing the trade-offs between self-sufficiency and convenience.
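As a rough illustration of the two ideas in this summary, the following sketch (in Python rather than Nix, purely for readability) derives a seed from a name, feeds it to a small linear congruential generator, and approximates cosine with a truncated Taylor series. The function names, the use of SHA-256 for seeding, and the choice of LCG constants are assumptions made for this example, not details taken from the post.

```python
import hashlib
import math


def seed_from_name(name: str) -> int:
    """Derive a 32-bit seed from a name (a stand-in for a derivation name)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def lcg(seed: int):
    """Linear congruential generator yielding 32-bit integers.

    The multiplier and increment are the widely used Numerical Recipes
    constants; they are an assumption here, not the post's choice.
    """
    state = seed
    while True:
        state = (1664525 * state + 1013904223) % 2**32
        yield state


def taylor_cos(x: float, terms: int = 12) -> float:
    """Approximate cos(x) with a truncated Taylor series."""
    two_pi = 2 * math.pi
    x = x % two_pi              # reduce to [0, 2*pi)
    if x > math.pi:
        x -= two_pi             # shift to (-pi, pi] so the series converges fast
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Next term: multiply by -x^2 / ((2n+1)(2n+2)).
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total
```

The same name always yields the same sequence, which is the property a pure language needs: `lcg(seed_from_name("some-name"))` can be re-evaluated any number of times without changing the result.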
Fedora is implementing a change to enhance package reproducibility, aiming for a 99% success rate. This involves using "source date epochs" (SDE), which fix build timestamps to a specific point in the past, eliminating variations caused by differing build times. While this approach simplifies reproducibility checks and reduces false positives, it won't address all issues, such as non-deterministic build processes within the software itself. The project is actively seeking community involvement in testing and reporting any remaining non-reproducible packages after the SDE switch.
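The timestamp-fixing idea can be sketched as follows, assuming that "SDE" refers to the `SOURCE_DATE_EPOCH` environment variable convention from the Reproducible Builds project. The helper names are hypothetical, and Fedora's actual tooling is not shown here.

```python
import os
import time


def build_timestamp(environ=os.environ) -> int:
    """Return SOURCE_DATE_EPOCH if it is set, otherwise the current time."""
    value = environ.get("SOURCE_DATE_EPOCH")
    return int(value) if value is not None else int(time.time())


def clamp_mtime(mtime: int, epoch: int) -> int:
    """Clamp a file modification time so it is never newer than the epoch."""
    return min(mtime, epoch)
```

With the variable set, every build reports the same timestamp regardless of when it runs, and files touched during the build are clamped back to that value while genuinely older files keep their original times.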
Hacker News users discuss the implications of Fedora's push for reproducible builds, focusing on the practical challenges. Some express skepticism about achieving true reproducibility given the complexity of build environments and dependencies. Others highlight the security benefits, emphasizing the ability to verify package integrity and prevent malicious tampering. The discussion also touches on the potential trade-offs, like increased build times and the need for stricter control over build processes. A few commenters suggest that while perfect reproducibility might be difficult, even partial reproducibility offers significant value. There's also debate about the scope of the project, with some wondering about the inclusion of non-free firmware and the challenges of reproducing hardware-specific optimizations.
Debian's "bookworm" release now offers officially reproducible live images. This means that rebuilding the images from source code will result in bit-for-bit identical outputs, verifying the integrity and build process. This achievement, a first for official Debian live images, was accomplished by addressing various sources of non-determinism within the build system, including timestamps, random numbers, and build paths. This increased transparency and trustworthiness strengthens Debian's security posture.
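A bit-for-bit check of this kind can be illustrated with a small sketch: build an archive with fixed timestamps, fixed ownership, and a stable file order, then compare digests across two builds. This is a simplified assumption about what such a check involves, not a description of Debian's live-image tooling.

```python
import hashlib
import io
import tarfile


def deterministic_tar(files: dict, mtime: int = 0) -> bytes:
    """Pack {name: bytes} into a tar archive with no varying metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):          # stable ordering
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime              # fixed timestamp
            info.uid = info.gid = 0         # fixed ownership
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def digest(blob: bytes) -> str:
    """SHA-256 digest used to compare two build outputs."""
    return hashlib.sha256(blob).hexdigest()
```

Two builds from the same inputs then produce the same digest even if the inputs are supplied in a different order, while changing only the timestamp is enough to change the digest.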
Hacker News commenters generally expressed approval of Debian's move toward reproducible builds, viewing it as a significant step for security and trust. Some highlighted the practical benefits, like easier verification of image integrity and detection of malicious tampering. Others discussed the technical challenges involved in achieving reproducibility, particularly with factors like timestamps and build environments. A few commenters also touched upon the broader implications for software supply chain security and the potential influence on other distributions. One compelling comment pointed out the difference between "bit-for-bit" reproducibility and the more nuanced "content-addressed" approach Debian is using, clarifying that some variation in non-functional aspects is still acceptable. Another insightful comment mentioned the value of this for embedded systems, where knowing exactly what's running is crucial.
Researchers reliant on animal models, particularly in neuroscience and physiology, face growing career obstacles. Funding is increasingly directed towards human-focused research like clinical trials and 'omics' approaches, seen as more translatable to human health. This shift, termed "animal methods bias," disadvantages scientists trained in animal research, limiting their funding opportunities, hindering career progression, and potentially slowing crucial basic research. While acknowledging the importance of human-focused studies, the article highlights the ongoing need for animal models in understanding fundamental biological processes and developing new treatments, urging funders and institutions to recognize and address this bias to avoid stifling valuable scientific contributions.
HN commenters discuss the systemic biases against research using animal models. Several express concern that the increasing difficulty and expense of such research, coupled with the perceived lower status compared to other biological research, is driving talent away from crucial areas of study like neuroscience. Some note the irony that these biases are occurring despite significant breakthroughs having come from animal research, and the continued need for it in many fields. Others mention the influence of animal rights activism and public perception on funding decisions. One commenter suggests the bias extends beyond careers, impacting publications and grant applications, ultimately hindering scientific progress. A few discuss the ethical implications and the need for alternatives, acknowledging the complex balancing act between animal welfare and scientific advancement.
This project details modifications to a 7500 Fast Real-Time PCR System to enable independent verification of its operation. By replacing the embedded computer with a Raspberry Pi and custom software, the project aims to achieve full control over the thermocycling process and data acquisition, eliminating reliance on proprietary software and potentially increasing experimental transparency and reproducibility. The modifications include custom firmware, a PCB for interfacing with the thermal block and optical system, and open-source software for experiment design, control, and data analysis. The goal is to create a completely open-source real-time PCR platform.
HN commenters discuss the feasibility and implications of a modified PCR machine capable of verifying scientific papers. Several express skepticism about the practicality of distributing such a device widely, citing cost and maintenance as significant hurdles. Others question the scope of verifiability, arguing that many scientific papers rely on more than just PCR and thus wouldn't be fully validated by this machine. Some commenters suggest alternative approaches to improving scientific reproducibility, such as better data sharing and standardized protocols. A few express interest in the project, seeing it as a potential step towards more transparent and trustworthy science, particularly in fields susceptible to fraud or manipulation. There is also discussion on the difficulty of replicating wet lab experiments in general, highlighting the complex, often undocumented nuances that can influence results. The creator's focus on PCR is questioned, with some suggesting other scientific methods might be more impactful starting points for verification.
The blog post details a performance optimization for Nix's evaluation process. By pre-resolving store paths for built-in functions, specifically fetchers, Nix can avoid redundant computations during evaluation, leading to significant speed improvements. This is achieved by introducing a new builtins attribute in the Nix expression language containing pre-computed hashes for commonly used fetchers. This change eliminates the need to repeatedly calculate these hashes during each evaluation, resulting in faster build times, particularly noticeable in projects with many dependencies. The post demonstrates benchmark results showing a substantial reduction in evaluation time with this optimization, highlighting its potential to improve the overall Nix user experience.
Hacker News users generally praised the technique described in the article for improving Nix evaluation performance. Several commenters highlighted the cleverness of pre-computing store paths, noting that it bypasses a significant bottleneck in Nix's evaluation process. Some expressed surprise that this optimization wasn't already implemented, while others discussed potential downsides, like the added complexity to the tooling and the risk of invalidating the cache if the store path changes. A few users also shared their own experiences with Nix performance issues and suggested alternative optimization strategies. One commenter questioned the significance of the improvement in practical scenarios, arguing that derivation evaluation is often not the dominant factor in overall build time.
NixOS aims for reproducibility, but subtle discrepancies can arise. While package builds are generally deterministic thanks to Nix's controlled environment, issues like differing system times during builds, non-deterministic build processes within packages themselves, and reliance on external resources like network-fetched timestamps or random numbers can introduce variability. The author highlights these challenges and explores how they impact reproducibility in practice, demonstrating that while NixOS significantly improves build consistency, achieving perfect reproducibility requires careful attention and sometimes impractical restrictions. Flaky tests and varying build outputs are presented as evidence of these limitations, showcasing scenarios where identical Nix expressions produce different results.
Hacker News users discuss reproducibility issues encountered with NixOS, despite its declarative nature. Several commenters point out that while Nix excels at package reproducibility, issues arise from external factors like hardware differences (particularly GPUs and networking) and reliance on non-reproducible external resources like timestamps and random number generation. One compelling comment highlights the distinction between "build reproducibility" and "runtime reproducibility," arguing NixOS effectively achieves the former but struggles with the latter. Others suggest that focusing solely on bit-for-bit reproducibility is misplaced, and that NixOS's value lies in its robust declarative configuration and ease of rollback, even if perfect reproducibility remains a challenge. The importance of properly caching build dependencies for true reproducibility is also emphasized. Several users share anecdotal experiences with inconsistencies and difficulties reproducing specific configurations, especially when dealing with complex setups or proprietary drivers.
Summary of Comments
https://news.ycombinator.com/item?id=43669057
Hacker News users discussed the blog post about reproducible random number generation in Nix. Several commenters appreciated the clear explanation of the problem and the proposed solution using a cosine function to distribute builds across build machines. Some questioned the practicality and efficiency of the cosine approach, suggesting alternatives like hashing or simpler modulo operations, especially given potential performance implications and the inherent limitations of pseudo-random number generators. Others pointed out the complexities of truly distributed builds in Nix and the need to consider factors like caching and rebuild triggers. A few commenters expressed interest in exploring the cosine method further, acknowledging its novelty and potential benefits in certain scenarios. The discussion also touched upon the broader challenges of achieving determinism in build systems and the trade-offs involved.
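The hashing-plus-modulo alternative that some commenters suggested could look roughly like the sketch below. The function name and the use of SHA-256 are assumptions for illustration, and the cosine-based scheme it is being compared against is not reproduced here.

```python
import hashlib


def assign_machine(build_name: str, machine_count: int) -> int:
    """Map a build name to a machine index in [0, machine_count)."""
    digest = hashlib.sha256(build_name.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then reduce modulo the
    # number of machines. The same name always maps to the same machine.
    return int.from_bytes(digest[:8], "big") % machine_count
```

Because the assignment depends only on the name and the machine count, it is deterministic across evaluations. Changing the machine count does reshuffle most assignments, which matters if caching is tied to a particular machine.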
The Hacker News post titled "RNG and Cosine in Nix" sparked a discussion with several interesting comments.
One commenter questioned the practicality of the approach, pointing out that using a hash function directly would likely be simpler and more efficient than the proposed cosine-based method. They also expressed concern about potential bias introduced by using cosine and suggested that a more rigorous statistical analysis would be necessary to validate the randomness quality.
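The bias concern can be made concrete with a small experiment. The sketch below assumes a hypothetical scheme that maps `cos(i)` into [0, 1] for successive integers `i`; the post's actual construction may differ. It compares the resulting histogram with one built from hash output.

```python
import hashlib
import math


def cosine_value(i: int) -> float:
    """Hypothetical cosine-based value in [0, 1]."""
    return (math.cos(i) + 1) / 2


def hash_value(i: int) -> float:
    """Hash-based value in [0, 1] derived from the first 8 digest bytes."""
    digest = hashlib.sha256(str(i).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def histogram(values, bins: int = 10) -> list:
    """Count how many values fall into each of `bins` equal-width bins."""
    counts = [0] * bins
    for v in values:
        # min() keeps a value of exactly 1.0 inside the last bin.
        counts[min(int(v * bins), bins - 1)] += 1
    return counts
```

For an angle spread evenly over a full period, the cosine-based values follow an arcsine-shaped distribution: with ten bins, each outer bin collects roughly 20% of the samples while each middle bin collects roughly 6%. The hash-based values land close to 10% per bin.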
Another commenter echoed this sentiment, emphasizing the importance of proper statistical testing for random number generation. They recommended using established test suites like TestU01 to thoroughly evaluate the randomness properties of the generated sequence.
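A full suite such as TestU01 runs many tests, but the basic idea can be sketched with a single chi-square check for uniformity. The code below is a minimal sanity check written for this summary, not a substitute for an established test suite; the sample data and function name are assumptions.

```python
import random


def chi_square_uniform(samples, bins: int = 10) -> float:
    """Chi-square statistic of samples in [0, 1] against a uniform histogram."""
    counts = [0] * bins
    total = 0
    for s in samples:
        counts[min(int(s * bins), bins - 1)] += 1
        total += 1
    expected = total / bins
    return sum((c - expected) ** 2 / expected for c in counts)


rng = random.Random(12345)                 # fixed seed for repeatability
uniform = [rng.random() for _ in range(10_000)]
skewed = [u * u for u in uniform]          # deliberately non-uniform

uniform_stat = chi_square_uniform(uniform)
skewed_stat = chi_square_uniform(skewed)
```

With ten bins there are nine degrees of freedom, so a statistic above about 16.92 would reject uniformity at the 5% level. The deliberately skewed sample exceeds that threshold by a wide margin, whereas a well-behaved generator usually stays below it.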
One user focused on the security implications, warning that the proposed method might not be suitable for cryptographic purposes due to potential predictability. They advised against using custom RNG solutions in security-sensitive contexts and recommended relying on well-vetted cryptographic libraries instead.
A further commenter offered a different perspective, suggesting that the approach might be useful for generating deterministic random values based on a seed. They envisioned applications in procedural generation, where consistent results are desirable.
Another individual highlighted the importance of understanding the underlying distribution of the generated random numbers. They noted that different applications may require different distributions (uniform, normal, etc.) and that simply generating seemingly random numbers without considering the distribution could lead to incorrect results.
Several commenters discussed the mathematical properties of the cosine function and its suitability for RNG. Some expressed skepticism, while others defended its potential, albeit with the caveat that careful analysis and testing are crucial.
Finally, some comments touched on the specific use case within Nix, the package manager mentioned in the title. They speculated about the potential benefits and drawbacks of using this method for generating unique identifiers or other random values within the Nix ecosystem. However, no definitive conclusions were drawn regarding its practical application in Nix.