Edsger Dijkstra argues that array indexing should start at zero, not one. He lays out a compelling case based on the elegance and efficiency of expressing slices or subsequences within an array. Using half-open intervals, where the lower bound is inclusive and the upper bound exclusive, simplifies calculations and leads to fewer "off-by-one" errors. Dijkstra demonstrates that representing a subsequence from element 'i' through 'j' becomes significantly more straightforward when using zero-based indexing, as the length of the subsequence is simply j-i. This contrasts with one-based indexing, which necessitates more complex and less intuitive calculations for subsequence lengths and endpoint adjustments. He concludes that zero-based indexing offers a more natural and consistent way to represent array segments, aligning better with mathematical conventions and ultimately leading to cleaner, less error-prone code.
In his 1982 essay, "Why Numbering Should Start at Zero," Edsger W. Dijkstra meticulously argues for the convention of inclusive-exclusive ranges, where a sequence starts at zero and ends at n, excluding the value of n itself. Dijkstra posits that this approach, representing a range as 0 ≤ i < n, is superior to the alternative conventions of inclusive-inclusive (0 ≤ i ≤ n-1) or exclusive-inclusive (1 ≤ i ≤ n) ranges. He bases his argument on the enhanced elegance and efficiency this convention offers in several common programming scenarios.
Dijkstra highlights the awkwardness inherent in describing empty sequences using the other conventions, requiring either asymmetrical bounds or the introduction of an arbitrary negative lower bound. With the 0 ≤ i < n convention, an empty sequence is elegantly represented by n = 0, maintaining the consistent lower bound of zero.
Furthermore, Dijkstra demonstrates the benefits of his preferred convention when describing sub-sequences within a larger sequence. He illustrates how the inclusive-exclusive range simplifies the calculation of sequence lengths and the expression of relationships between adjacent subsequences. For example, concatenating adjacent subsequences becomes straightforward, with the upper bound of the first subsequence directly serving as the lower bound of the second, and the sum of their lengths representing the length of the combined subsequence. This eliminates the need for cumbersome +/- 1 adjustments required by other conventions.
He specifically addresses the case of dividing a sequence into two parts, emphasizing the ease with which the inclusive-exclusive range allows for the expression of the total length (n) as the sum of the lengths of the two sub-sequences (p and q), where n = p + q, with the first subsequence spanning from 0 to p (exclusive) and the second from p to n (exclusive).
Dijkstra concludes by reiterating the overall superiority of the 0 ≤ i < n convention, emphasizing the simplicity, consistency, and elegance it introduces into the handling of sequence boundaries and sub-sequence manipulations, thereby contributing to more comprehensible and less error-prone code. He attributes the prevalence of alternative conventions to historical baggage and habit, encouraging a shift towards the more mathematically sound approach he advocates.
Summary of Comments ( 97 )
https://news.ycombinator.com/item?id=43433599
Hacker News users discuss Dijkstra's famous argument for zero-based indexing. Several commenters agree with Dijkstra's logic, emphasizing the elegance and efficiency of using half-open intervals. Some highlight the benefits in loop constructs and simplifying calculations for array slices. A few point out that one-based indexing can be more intuitive in certain contexts, aligning with how humans naturally count. One commenter notes the historical precedent, mentioning that Fortran used one-based indexing, influencing later languages. The discussion also touches on the trade-offs between conventions and the importance of consistency within a given language or project.
The Hacker News post "Numbering should start at zero (1982)" discussing Edsger Dijkstra's famous note on the topic generated a fair number of comments (48 at the time of this summary). Many commenters expressed agreement with Dijkstra's reasoning, emphasizing the elegance and efficiency of zero-based indexing, particularly in relation to computer science and array manipulation. Several pointed out how zero-based indexing simplifies calculations involving ranges and offsets, aligning with the underlying memory organization in computers.
Some comments delved into specific examples of the practical advantages of zero-based indexing in programming languages and algorithms. The discussion touched upon the trade-offs between human readability and computational efficiency. One commenter highlighted how the inclusive/exclusive range debate is intertwined with the zero/one-based indexing argument.
A few commenters offered contrasting viewpoints, acknowledging the cognitive dissonance zero-based indexing can sometimes introduce, especially for individuals without a programming background. They mentioned how one-based indexing can be more intuitive in everyday scenarios. One comment suggested that Dijkstra's argument leans towards being mathematically rigorous, while human intuition often favors starting counts from one.
A recurring theme was the historical context of Dijkstra's note. Commenters discussed the evolution of programming languages and the prevalence of both zero-based and one-based indexing conventions in different contexts. Some comments linked to relevant resources, including further discussions on the topic and examples of programming languages adopting different approaches.
While a significant portion of the comments focused on the technical aspects of indexing, a few explored broader philosophical implications related to inclusivity and exclusivity in mathematics and computer science. One commenter noted the implications in mathematical notation, comparing intervals like [0, n) to [1, n].
Overall, the comment section reflects a general consensus on the advantages of zero-based indexing from a computer science perspective, while acknowledging the validity of one-based indexing in certain contexts and the potential for cognitive friction when transitioning between these two conventions. The comments offer a mix of technical explanations, historical perspectives, and philosophical reflections on Dijkstra's argument.