Krep is a fast string search utility written in C, designed for performance-sensitive tasks. It utilizes SIMD instructions and optimized algorithms to achieve speeds significantly faster than grep and other similar tools, especially when searching large files or codebases. Krep supports regular expressions via PCRE2, various output formats including JSON and CSV, and features like ignoring binary files and following symbolic links. The project is open-source and aims to provide a robust and efficient alternative for command-line text searching.
Davide Santangelo has introduced Krep, a new command-line utility written in C for high-performance string searches within files. Designed as a potential alternative to tools like grep and ripgrep, Krep prioritizes speed and efficiency, particularly when dealing with large datasets or frequent search operations.
The project uses several strategies to achieve its performance goals. A core component is its use of SIMD (Single Instruction, Multiple Data) instructions, which compare many bytes of input against the search pattern in a single operation. This accelerates matching compared to a traditional byte-at-a-time loop. Krep specifically relies on AVX2, an x86 SIMD instruction set extension (not an algorithm in itself) whose 256-bit registers hold 32 bytes at a time.
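To make the SIMD idea concrete, here is a minimal sketch of one common way AVX2 is applied to literal search: scan 32 bytes at a time for the pattern's first byte, then verify each candidate with memcmp. This is an illustration under stated assumptions, not Krep's actual code; the function names (find_byte_avx2, find_first_byte, find_literal) are hypothetical, and the sketch assumes GCC or Clang on x86-64, falling back to memchr elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_AVX2_PATH 1

/* Hypothetical helper: find the first occurrence of byte c, 32 bytes per step. */
__attribute__((target("avx2")))
static const char *find_byte_avx2(const char *buf, size_t len, char c) {
    const __m256i needle = _mm256_set1_epi8(c);   /* c copied into all 32 lanes */
    size_t i = 0;
    for (; i + 32 <= len; i += 32) {
        __m256i chunk = _mm256_loadu_si256((const __m256i *)(buf + i));
        __m256i eq = _mm256_cmpeq_epi8(chunk, needle);      /* 0xFF where equal */
        unsigned mask = (unsigned)_mm256_movemask_epi8(eq); /* one bit per byte */
        if (mask != 0)
            return buf + i + (size_t)__builtin_ctz(mask);   /* lowest set bit */
    }
    for (; i < len; i++)                                    /* scalar tail */
        if (buf[i] == c)
            return buf + i;
    return NULL;
}
#endif

/* Uses the AVX2 path when the CPU supports it, plain memchr otherwise. */
static const char *find_first_byte(const char *buf, size_t len, char c) {
#ifdef HAVE_AVX2_PATH
    if (__builtin_cpu_supports("avx2"))
        return find_byte_avx2(buf, len, c);
#endif
    return memchr(buf, c, len);
}

/* Hypothetical literal search: first-byte scan plus memcmp verification. */
static const char *find_literal(const char *hay, size_t hay_len,
                                const char *pat, size_t pat_len) {
    assert(hay != NULL && pat != NULL);
    if (pat_len == 0 || pat_len > hay_len)
        return NULL;
    const char *p = hay;
    const char *last = hay + (hay_len - pat_len);  /* last possible match start */
    while (p <= last) {
        const char *cand = find_first_byte(p, (size_t)(last - p) + 1, pat[0]);
        if (cand == NULL)
            return NULL;
        if (memcmp(cand, pat, pat_len) == 0)
            return cand;
        p = cand + 1;
    }
    return NULL;
}
```

Calling find_literal(text, strlen(text), "lazy", 4) returns a pointer to the first match, or NULL if there is none. The target attribute lets the AVX2 function compile without a global -mavx2 flag, and the runtime check keeps the binary usable on CPUs that lack the extension.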
Furthermore, Krep integrates memory mapping techniques (specifically, mmap) to streamline file access. By mapping the file contents directly into memory, Krep minimizes the overhead associated with traditional read operations, leading to faster search execution. This is especially beneficial when repeatedly searching within the same file.
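The memory-mapping approach can be sketched as follows. This is a minimal POSIX example and not Krep's implementation: the function name count_in_mapped_file is hypothetical, the search loop is a plain memchr/memcmp scan, and matches are counted without overlap.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count non-overlapping occurrences of pat in the file
 * at path by mapping it read-only. Returns -1 on error. */
static long count_in_mapped_file(const char *path, const char *pat) {
    assert(path != NULL && pat != NULL);
    size_t pat_len = strlen(pat);
    if (pat_len == 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap rejects zero-length mappings */
        close(fd);
        return 0;
    }
    size_t len = (size_t)st.st_size;

    /* The whole file becomes one contiguous read-only byte range. */
    char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping remains valid after close */
    if (data == MAP_FAILED)
        return -1;

    long count = 0;
    const char *p = data;
    const char *end = data + len;
    while ((size_t)(end - p) >= pat_len) {
        /* Only scan positions where a full pattern still fits. */
        const char *cand = memchr(p, pat[0], (size_t)(end - p) - pat_len + 1);
        if (cand == NULL)
            break;
        if (memcmp(cand, pat, pat_len) == 0) {
            count++;
            p = cand + pat_len;
        } else {
            p = cand + 1;
        }
    }

    munmap(data, len);
    return count;
}
```

Because the search code sees the file as ordinary memory, there is no read buffer to refill and no copy from kernel to user space; pages are faulted in on demand and are served from the page cache on repeated searches of the same file.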
Beyond raw speed, Krep aims for practical usability. It features support for regular expressions, allowing users to perform more complex pattern matching beyond simple literal strings. The tool also provides options for case-insensitive searches, recursive directory traversal, and displaying line numbers alongside matching results, mirroring the functionality of established search utilities. While prioritizing performance, Krep still strives to offer a comprehensive set of features for versatile string search tasks.
The project is open-source, available on GitHub, and actively maintained by its creator. Davide Santangelo encourages community involvement and contributions to further refine and extend Krep's capabilities. The project page includes documentation outlining usage instructions, available options, and building procedures, along with benchmark results demonstrating its performance advantages compared to other similar tools. While still under active development, Krep presents a promising alternative for users seeking a high-performance string search solution, especially in scenarios involving large datasets and demanding search requirements.
Summary of Comments ( 44 )
https://news.ycombinator.com/item?id=43333946
HN users generally praised Krep for its speed and clean implementation. Several commenters compared it favorably to other popular search tools like ripgrep and grep, with some noting its superior performance in specific scenarios. One user suggested incorporating SIMD instructions for potential further speed improvements. Discussion also touched on the nuances of benchmarking and the importance of real-world test cases, with one commenter sharing their own benchmark results where krep excelled. A few users inquired about specific features, like support for PCRE (Perl Compatible Regular Expressions) or Unicode character classes. Overall, the reception was positive, acknowledging krep as a promising tool for efficient string searching.
The Hacker News post about Krep, a high-performance string search utility, sparked a discussion with several interesting comments.
One user questioned the performance comparison methodology, pointing out that ripgrep defaults to searching hidden files and uses memory mapping, potentially skewing the benchmarks. They suggested that a more accurate comparison would involve disabling these features in ripgrep to match krep's behavior. This comment highlighted the importance of fair and consistent benchmarking practices when comparing tools.
Another commenter noted that krep lacks support for regular expressions, a significant limitation compared to other search utilities. They acknowledged the potential performance benefits of a simpler string search but questioned its practical usefulness without regex functionality. This comment underscored the trade-off between speed and features.
A subsequent reply elaborated on the regex point, stating that the lack of this feature greatly reduces krep's versatility, especially in code searching scenarios. The commenter emphasized that regex support is essential for many real-world use cases.
One commenter praised krep's speed, particularly in simpler search scenarios. They described a situation where they needed to search extensive log files and found krep significantly faster than other tools. This comment highlighted the niche where krep might excel: situations where pure string searching without regex is sufficient.
The creator of krep also participated in the discussion, acknowledging the feedback regarding regex support and explaining the rationale behind its exclusion. They mentioned plans to potentially implement a separate tool for regex searching built upon some of the underlying techniques used in krep. This response demonstrated engagement with the community and a willingness to consider future development based on user feedback.
One comment highlighted the value of specialized tools like krep, even with their limitations. The commenter argued that having a dedicated tool for fast literal string searches can be beneficial, even if it doesn't replace fully featured tools like ripgrep in all scenarios.
Finally, a commenter raised a point about the documentation, suggesting an improvement to clarify the handling of non-UTF-8 encoded files. This comment emphasized the importance of clear and comprehensive documentation for user experience.
In summary, the comments section primarily revolved around krep's performance, its lack of regex support, its potential use cases, and some suggestions for improvements. While some users lauded its speed, others found the absence of regex a significant drawback. The discussion highlighted the importance of benchmarking methodology, the trade-offs between speed and functionality, and the value of specialized tools.