hackslash dot org

Putting Andrew Ng's OCR models to the test

Posted: 2025-02-28 02:24:04

The blog post "Putting Andrew Ng's OCR models to the test" evaluates the performance of two optical character recognition (OCR) models presented in Andrew Ng's Deep Learning Specialization course. The author tests the models, a simpler CTC-based model and a more complex attention-based model, on a dataset of synthetically generated license plates. While both models achieve reasonable accuracy, the attention-based model demonstrates superior performance, particularly in handling variations in character spacing and length. The post highlights the practical challenges of deploying these models, including the need for careful data preprocessing and the computational demands of the attention mechanism. It concludes that while Ng's course provides valuable foundational knowledge, real-world OCR applications often require further optimization and adaptation.

This blog post, titled "Putting Andrew Ng's OCR models to the test," details a comprehensive evaluation of the optical character recognition (OCR) models presented in Andrew Ng's deep learning specialization on Coursera. The author meticulously examines the performance of two distinct models: a basic model built using a simple recurrent neural network (RNN) and a more advanced model leveraging connectionist temporal classification (CTC). The primary objective of the evaluation is to assess the real-world applicability and robustness of these models beyond the confines of the structured, idealized dataset used within the course.

The author begins by highlighting the simplified and controlled nature of the training data provided in the course, which consists of synthetically generated, warped images of single words. This characteristic, while beneficial for pedagogical purposes, raises concerns regarding the models' generalization capabilities when confronted with the complexities of real-world images, such as varying fonts, backgrounds, layouts, and noise. To address this, the author curates a diverse set of test images captured from different sources, including books, handwritten notes, and computer screens, thereby introducing a more realistic and challenging evaluation scenario.

The subsequent evaluation process involves rigorously comparing the performance of both the RNN and CTC models on this curated dataset. The author documents the models' outputs for various test images, meticulously analyzing their successes and failures. The analysis reveals that while both models demonstrate reasonable performance on clear, well-formatted text, they struggle considerably when faced with more complex scenarios. Issues encountered include difficulties in recognizing unusual fonts, handling background noise or interference, and accurately interpreting handwritten text.

The author provides a detailed account of the observed limitations, showcasing specific examples where the models misclassify characters or fail to segment words correctly. Furthermore, the post delves into the computational aspects of implementing and running these models, offering insights into the training process and the associated computational demands.

Finally, the blog post concludes with a balanced perspective on the utility of Andrew Ng's OCR models. While acknowledging their educational value in illustrating fundamental deep learning concepts, the author underscores the need for further refinement and adaptation to achieve satisfactory performance in real-world OCR applications. This highlights the inherent gap between academic exercises and the practical challenges of deploying machine learning models in complex, uncontrolled environments. The author implicitly suggests that while the models serve as a valuable starting point, substantial further development and training on more representative datasets are crucial for building robust and reliable OCR systems.

Summary of Comments ( 46 )
https://news.ycombinator.com/item?id=43201001

Several Hacker News commenters questioned the methodology and conclusions of the original blog post. Some pointed out that the author's comparison wasn't fair, as they seemingly didn't fine-tune the models properly, particularly the transformer model, leading to skewed results in favor of the CNN-based approach. Others noted the lack of details on training data and hyperparameters, making it difficult to reproduce the results or draw meaningful conclusions about the models' performance. A few suggested alternative OCR tools and libraries that reportedly offer better accuracy and performance. Finally, some commenters discussed the trade-offs between CNNs and transformers for OCR tasks, acknowledging the potential of transformers but emphasizing the need for careful tuning and sufficient data.

The Hacker News post "Putting Andrew Ng's OCR models to the test" has generated several comments discussing the blog post's findings and the broader context of OCR technology.

Several commenters praise the blog post's author for the thoroughness of their testing and analysis. One commenter appreciates the real-world application focus, contrasted with more theoretical deep learning explorations. They highlight the value of the author's systematic approach to finding the best model for their specific use case.

Another thread discusses the licensing implications of using models trained on specific datasets, and whether those licenses carry over to fine-tuned versions of the model. This discussion touches on the practicalities of using open-source models in commercial settings and the potential complexities involved.

A few comments delve into the technical aspects of the OCR process, including preprocessing steps like image cleaning and binarization. One user mentions their own experiences with these techniques, suggesting that such preprocessing can greatly influence the accuracy of the OCR models.

The choice of the Tesseract OCR engine as a benchmark is also a point of discussion. One commenter notes Tesseract's maturity and wide usage, making it a relevant comparison point, while others mention alternative OCR engines and their potential advantages. Someone also mentions the importance of considering the computational resources required by different models, particularly in production environments.

Finally, some comments touch upon the broader advancements in OCR technology and the ongoing research in the field. One commenter points to the evolution of techniques and the increasing accessibility of powerful models, while another emphasizes the importance of tailoring the chosen OCR solution to the specific task at hand.

In essence, the comments section explores various facets of the blog post's findings, from the technical details of OCR and model selection to the broader implications of licensing and real-world application. The commenters generally appreciate the practical approach taken by the author and offer their own insights and experiences related to OCR technology.

Show HN: Klarity – Open-source tool to analyze uncertainty/entropy in LLM output

permalink

Posted: 2025-02-03 13:53:48

Klarity is an open-source Python library designed to analyze uncertainty and entropy in large language model (LLM) outputs. It provides various metrics and visualization tools to help users understand how confident an LLM is in its generated text. This can be used to identify potential errors, biases, or areas where the model is struggling, ultimately enabling better prompt engineering and more reliable LLM application development. Klarity supports different uncertainty estimation methods and integrates with popular LLM frameworks like Hugging Face Transformers.

A newly developed open-source tool named Klarity aims to address the challenge of assessing the certainty and uncertainty inherent in the output generated by Large Language Models (LLMs). LLMs, while powerful, can sometimes produce outputs that sound confident even when the underlying reasoning is weak or the information is uncertain. This can be problematic, especially in sensitive applications where relying on inaccurate or unreliable information can have significant consequences.

Klarity provides a framework for analyzing and quantifying this uncertainty, offering insights into the reliability of LLM-generated text. It operates by leveraging the concept of entropy, a measure of randomness or disorder in information theory. By examining the probability distribution over possible outputs generated by an LLM, Klarity can calculate the entropy of the distribution. A high entropy suggests greater uncertainty, indicating that the model is less confident in its prediction, as it sees many possibilities as equally likely. Conversely, low entropy implies greater certainty, as the model strongly favors a particular output or a small set of outputs.

The tool is designed to be flexible and adaptable to different LLM architectures and tasks. It is implemented as a Python library, offering a programmatic interface for integrating uncertainty analysis into existing LLM workflows. This allows developers and researchers to easily incorporate Klarity into their projects for real-time uncertainty assessment during LLM inference or for post-hoc analysis of generated text.

Klarity’s open-source nature fosters community involvement and contribution, encouraging further development and refinement of the tool. The project aims to improve transparency and trustworthiness in LLM applications by providing a means to quantify and understand the uncertainty associated with their outputs. This can ultimately lead to more responsible and reliable use of LLMs across various domains, empowering users to make informed decisions based on a more nuanced understanding of the limitations and potential pitfalls of these powerful language models. It helps move beyond simply accepting the output at face value and towards a more critical evaluation of the information provided. By making uncertainty analysis more accessible, Klarity hopes to contribute to the development of more robust and trustworthy AI systems.

Summary of Comments ( 23 )
https://news.ycombinator.com/item?id=42918237

Hacker News users discussed Klarity's potential usefulness, but also expressed skepticism and pointed out limitations. Some questioned the practical applications, wondering if uncertainty analysis is truly valuable for most LLM use cases. Others noted that Klarity focuses primarily on token-level entropy, which may not accurately reflect higher-level semantic uncertainty. The reliance on temperature scaling as the primary uncertainty control mechanism was also criticized. Some commenters suggested alternative approaches to uncertainty quantification, such as Bayesian methods or ensembles, might be more informative. There was interest in seeing Klarity applied to different models and tasks to better understand its capabilities and limitations. Finally, the need for better visualization and integration with existing LLM workflows was highlighted.

The Hacker News post about Klarity, an open-source tool to analyze uncertainty/entropy in LLM output, generated a moderate amount of discussion with several insightful comments.

One commenter expressed skepticism about relying solely on entropy as a measure of uncertainty, pointing out that LLMs can be confidently wrong. They suggested that incorporating calibration into the process would be beneficial, acknowledging that it is a challenging problem. This commenter also highlighted the importance of considering the source of uncertainty, distinguishing between inherent ambiguity in the prompt and the model's own limitations.

Another commenter questioned the practical application of Klarity in scenarios where users are seeking definitive answers rather than probabilities. They posited that in many cases, users simply want the most likely answer, not a breakdown of uncertainties. This raised a discussion about the difference between research and practical application, with some arguing that understanding uncertainty is crucial even when a single answer is desired, especially in critical applications.

Several users expressed interest in how Klarity handles multi-token predictions and whether it considers dependencies between tokens. One commenter specifically inquired about the handling of multi-modal distributions, where multiple distinct answers might be equally likely.

One commenter offered a practical suggestion for incorporating Klarity into a workflow, proposing it as a mechanism to trigger human review when uncertainty is high. This aligns with the idea of using AI as a tool to augment human capabilities rather than replace them entirely.

The discussion also touched upon the limitations of entropy as a sole measure of confidence. One commenter pointed out that a low-entropy prediction can still be completely wrong if the model has a fundamental misunderstanding or bias.

Finally, there were some comments expressing general interest in the project and appreciation for its open-source nature, indicating a desire to explore its capabilities further. A few commenters briefly mentioned alternative approaches to uncertainty estimation, further enriching the discussion.

Evaluating Code Embedding Models

permalink

Posted: 2025-02-01 02:06:08

Voyage's blog post details their evaluation of various code embedding models for code retrieval tasks. They emphasize the importance of using realistic datasets and evaluation metrics like Mean Reciprocal Rank (MRR) tailored for code search scenarios. Their experiments demonstrate that retrieval performance varies significantly across datasets and model architectures, with specialized models like CodeT5 consistently outperforming general-purpose embedding models. They also found that retrieval effectiveness plateaus as embedding dimensionality increases beyond a certain point, suggesting diminishing returns for larger embeddings. Finally, they introduce a novel evaluation dataset derived from Voyage's internal codebase, aimed at providing a more practical benchmark for code retrieval models in real-world settings.

The Voyage AI blog post, "Evaluating Code Embedding Models," delves into the complexities of assessing the effectiveness of code embedding models, particularly for the task of code retrieval. Code embedding models transform code snippets into vector representations, allowing for semantic similarity searches. This is crucial for tasks like finding relevant code examples, identifying duplicated code, or suggesting potential fixes. The post emphasizes the importance of robust evaluation methodologies to accurately gauge the performance of these models.

The authors argue that commonly used metrics like Mean Average Precision (MAP) or Normalized Discounted Cumulative Gain (NDCG), while useful, can be insufficient for capturing the nuances of code retrieval. They highlight the issue of "easy negatives" – code examples that are trivially dissimilar to the query – which can inflate performance metrics. These metrics might indicate high accuracy even if the model isn't truly understanding the semantic meaning of the code.

To address this, Voyage AI introduces a novel evaluation framework centered around two key concepts: "hard negative mining" and "domain adaptation." Hard negative mining involves specifically selecting negative examples that are semantically similar to the query but not the correct answer. This forces the model to distinguish between subtly different code snippets and thus demonstrates a deeper understanding of code semantics. The blog post details how they generate these hard negatives using a combination of techniques, including leveraging abstract syntax trees (ASTs) and identifying code snippets with similar functionalities but different implementations.

Domain adaptation, the second core element of their framework, tackles the challenge of evaluating models on diverse coding styles and conventions found across different codebases or projects. The post explains that a model trained on one type of code might not perform well on another. Therefore, they advocate for evaluating models on multiple datasets representing different domains, providing a more holistic and realistic assessment of performance.

The post further elucidates the practical implications of their evaluation framework by showcasing its application in comparing different code embedding models. They demonstrate how their approach reveals performance disparities that would be obscured by traditional metrics alone. This nuanced evaluation allows for more informed decisions when selecting or developing code embedding models for specific tasks and codebases. Ultimately, the post champions a more rigorous and comprehensive approach to evaluating code embedding models, emphasizing the importance of considering both hard negatives and domain adaptation for a truly insightful understanding of model performance and its real-world applicability.

Summary of Comments ( 0 )
https://news.ycombinator.com/item?id=42894939

Hacker News users discussed the methodology of Voyage's code retrieval evaluation, particularly questioning the reliance on HumanEval and MBPP benchmarks. Some argued these benchmarks don't adequately reflect real-world code retrieval scenarios, suggesting alternatives like retrieving code from a large corpus based on natural language queries. The lack of open-sourcing for Voyage's evaluated models and datasets also drew criticism, hindering reproducibility and broader community engagement. There was a brief discussion on the usefulness of keyword search as a strong baseline and the potential benefits of integrating semantic search techniques. Several commenters expressed interest in seeing evaluations based on more realistic use cases, including bug fixing or adding new features within existing codebases.

Scale AI Unveil Results of Humanity's Last Exam, a Groundbreaking New Benchmark

permalink

Posted: 2025-01-23 17:44:07

Scale AI's "Humanity's Last Exam" benchmark evaluates large language models (LLMs) on complex, multi-step reasoning tasks across various domains like math, coding, and critical thinking, going beyond typical benchmark datasets. The results revealed that while top LLMs like GPT-4 demonstrate impressive abilities, even the best models still struggle with intricate reasoning, logical deduction, and robust coding, highlighting the significant gap between current LLMs and human-level intelligence. The benchmark aims to drive further research and development in more sophisticated and robust AI systems.

In a recent publication entitled "Humanity's Last Exam," Scale AI, a prominent provider of artificial intelligence infrastructure and data services, has divulged the findings of a novel benchmark designed to rigorously assess the evolving capabilities of large language models (LLMs) across a broad spectrum of real-world tasks. This ambitious undertaking, meticulously crafted to transcend the limitations of existing benchmarks often criticized for their narrow focus on academic or synthetic datasets, seeks to provide a more comprehensive and nuanced understanding of how these powerful models perform in scenarios that closely mirror the complexities and ambiguities inherent in human communication and problem-solving.

The methodology employed in "Humanity's Last Exam" distinguishes itself through its emphasis on evaluation across a diverse array of 100 distinct tasks, encompassing areas such as coding, creative writing, mathematics, and sophisticated reasoning. Furthermore, these tasks were explicitly designed to emulate real-world challenges, reflecting the type of problems humans frequently encounter in professional and everyday settings. This stands in contrast to conventional benchmarks that often rely on simplified or artificial datasets, potentially inflating the perceived performance of LLMs and failing to capture their true capabilities when confronted with the multifaceted nature of real-world applications.

The results of this extensive evaluation reveal a complex and nuanced picture of current LLM capabilities. While some models demonstrated impressive proficiency in certain domains, particularly those involving well-defined tasks with clear success criteria, significant performance disparities were observed across the spectrum of evaluated tasks. The findings underscore the ongoing challenges in developing truly general-purpose AI systems capable of consistently matching or exceeding human performance across a broad range of cognitive domains. Specifically, the research highlighted areas where further refinement and development are crucial, such as complex reasoning, nuanced understanding of context, and the ability to adapt to novel or unforeseen scenarios.

Scale AI argues that "Humanity's Last Exam" provides a crucial contribution to the ongoing discourse surrounding the advancement and deployment of artificial intelligence. By offering a more robust and realistic assessment framework, the benchmark aims to facilitate more informed decision-making regarding the appropriate application of LLMs, while simultaneously driving further research and development efforts towards the ultimate goal of creating truly general-purpose AI systems. The implication is that this benchmark not only offers a snapshot of current LLM capabilities but also serves as a roadmap for future advancements in the field, guiding researchers towards areas requiring focused attention and fostering the development of more versatile and robust AI models capable of effectively addressing the multifaceted challenges of the real world. Furthermore, the benchmark's emphasis on real-world tasks suggests a commitment to ensuring that AI development remains grounded in practical applications and contributes meaningfully to solving real-world problems.

Summary of Comments ( 22 )
https://news.ycombinator.com/item?id=42806105

HN commenters largely criticized the "Humanity's Last Exam" framing as hyperbolic and marketing-driven. Several pointed out that the exam's focus on reasoning and logic, while important, doesn't represent the full spectrum of human intelligence and capabilities crucial for navigating complex real-world scenarios. Others questioned the methodology and representativeness of the "exam," expressing skepticism about the chosen tasks and the limited pool of participants. Some commenters also discussed the implications of AI surpassing human performance on such benchmarks, with varying degrees of concern about potential societal impact. A few offered alternative perspectives, suggesting that the exam could be a useful tool for understanding and improving AI systems, even if its framing is overblown.

The Hacker News post about Scale AI's "Humanity's Last Exam" has generated a fair amount of discussion, with several commenters expressing skepticism and raising concerns about the methodology and implications of the benchmark.

One recurring theme is the questioning of whether this benchmark truly represents a final exam for humanity. Commenters argue that framing it as such is hyperbolic and potentially misleading. They point out that the tasks, while complex, don't encompass the full breadth of human intelligence and creativity. The focus on specific problem-solving domains, particularly those relevant to current AI capabilities, is seen as a limitation.

Several commenters critique the methodology used to evaluate human performance. Some question the selection of tasks and the way they were presented to participants. Others express concern about the potential for bias in the human evaluators who judged the responses. The lack of detailed information about the human participants also raises concerns about the representativeness of the sample and the generalizability of the results.

The implications of the benchmark for AI development are also debated. While some acknowledge the value of having a standardized benchmark to measure progress, others worry that focusing solely on these specific tasks could lead to a narrow and potentially misdirected development trajectory for AI. The concern is that optimizing AI for these particular problems might not translate to genuine progress towards more general intelligence or beneficial real-world applications.

Some commenters express skepticism about Scale AI's motivations, suggesting that the framing of the benchmark as "Humanity's Last Exam" is primarily a marketing tactic to generate attention. They point to the lack of open access to the data and the evaluation methodology as potentially reinforcing this suspicion.

A few comments offer alternative perspectives, suggesting that the benchmark, despite its limitations, could still be a valuable tool for understanding the strengths and weaknesses of current AI systems. They emphasize the importance of continued research and development in AI, while cautioning against overinterpreting the results of this particular benchmark.

Overall, the comments on Hacker News reflect a cautious and critical reception of Scale AI's "Humanity's Last Exam." While some acknowledge the potential value of the benchmark, many express reservations about its methodology, framing, and implications. The discussion highlights the ongoing debate surrounding the nature of intelligence, the challenges of evaluating AI systems, and the potential societal impact of advanced AI technologies.

Coping with dumb LLMs using classic ML

permalink

Posted: 2025-01-22 09:25:07

The blog post explores using traditional machine learning (specifically, decision trees) to interpret and refine the output of less capable or "dumb" Large Language Models (LLMs). The author describes a scenario where an LLM is tasked with classifying customer service tickets, but its performance is unreliable. Instead of relying solely on the LLM's classification, a decision tree model is trained on the LLM's output (probabilities for each classification) along with other readily available features of the ticket, like length and sentiment. This hybrid approach leverages the LLM's initial analysis while allowing the decision tree to correct inaccuracies and improve overall classification performance, ultimately demonstrating how simpler models can bolster the effectiveness of flawed LLMs in practical applications.

Doug, the author of the blog post "Coping with dumb LLMs using classic ML," explores the inherent unreliability of Large Language Models (LLMs) and proposes a method to mitigate their shortcomings by leveraging traditional machine learning techniques, specifically decision trees. He illustrates this concept with a practical example: determining whether a piece of text generated by an LLM constitutes a valid legal judgment.

Doug begins by acknowledging the impressive capabilities of LLMs in generating human-like text, yet emphasizes their fundamental flaw: they lack true understanding and reasoning abilities. Consequently, while an LLM might produce text that superficially resembles a legal judgment, it may be nonsensical or contain critical errors upon closer inspection. This unreliability renders LLMs unsuitable for tasks requiring precise and logically sound outputs, such as drafting legal documents.

To address this issue, Doug introduces the idea of employing a "judge" to evaluate the output of the LLM. This judge, rather than being a human expert, is implemented as a decision tree trained on a dataset of genuine and fabricated legal judgments. The decision tree learns to identify patterns and features that distinguish authentic judgments from the LLM-generated imitations. These features could include aspects like the structure of the text, the specific terminology used, the presence of citations, and the overall coherence of the arguments presented.

The blog post details the process of training the decision tree using the scikit-learn library in Python. Doug meticulously explains the steps involved in preparing the dataset, selecting appropriate features, training the model, and evaluating its performance. He highlights the importance of using a balanced dataset containing both real and fake judgments to ensure the model learns to differentiate effectively between them.

Doug further elaborates on the specific features used to train the decision tree. These include metrics like the frequency of certain keywords associated with legal language, the overall length of the document, and the complexity of the sentences used. He demonstrates how these features can be extracted from the text and used as input to the decision tree model.

The results presented in the blog post demonstrate the effectiveness of this approach. The trained decision tree achieves a reasonable level of accuracy in distinguishing between genuine legal judgments and those generated by the LLM. While not perfect, the judge provides a significant improvement over relying solely on the LLM's output.

Doug concludes by suggesting that this method can be generalized to other domains where the output of LLMs needs to be verified for accuracy and reliability. He argues that combining the generative power of LLMs with the discerning capabilities of classical machine learning models like decision trees offers a promising path towards harnessing the potential of LLMs while mitigating their inherent limitations. This hybrid approach allows for a more robust and trustworthy application of LLMs in various fields.

Summary of Comments ( 44 )
https://news.ycombinator.com/item?id=42790820

Hacker News users discuss the practicality and limitations of the proposed decision-tree approach to mitigate LLM "hallucinations." Some express skepticism about its scalability and maintainability, particularly with the rapid advancement of LLMs, suggesting that improving prompt engineering or incorporating retrieval mechanisms might be more effective. Others highlight the potential value of the decision tree for specific, well-defined tasks where accuracy is paramount and the domain is limited. The discussion also touches on the trade-off between complexity and performance, and the importance of understanding the underlying limitations of LLMs rather than relying on patches. A few commenters note the similarity to older expert systems and question if this represents a step back in AI development. Finally, some appreciate the author's honest exploration of alternative solutions, acknowledging that relying solely on improving LLM accuracy might not be the optimal path forward.

The Hacker News post titled "Coping with dumb LLMs using classic ML" (linking to an article about using decision trees to augment LLMs) has generated a modest discussion with several insightful comments.

One commenter points out that the approach described in the article, which involves using a decision tree to guide the LLM's output, isn't fundamentally different from prompt engineering. They argue that crafting a detailed prompt is essentially providing a structured set of rules, much like a decision tree. This comment highlights the blurred lines between different techniques for controlling LLM behavior, suggesting that "prompt engineering" might encompass a wider range of methods than typically assumed.

Another commenter raises the question of maintainability. They acknowledge the potential benefits of using decision trees for specific tasks but express concern about the long-term implications of managing and updating these trees as requirements evolve. They suggest that the complexity of maintaining a decision tree could outweigh its advantages in certain dynamic environments.

A further comment delves into the limitations of relying solely on the LLM's internal representations. The commenter argues that while LLMs can store and access a vast amount of information, they lack a reliable mechanism for consistently applying this knowledge in a structured manner. This comment reinforces the article's premise, suggesting that external structures like decision trees can help bridge this gap and improve the reliability of LLM outputs.

Another commenter draws a parallel with older symbolic AI techniques. They suggest that the approach of using decision trees with LLMs represents a return to these earlier methods, combining the strengths of both symbolic and statistical AI. This comment frames the discussion within a broader historical context of AI research.

Finally, a commenter questions the scalability of the proposed approach. They wonder how well the decision tree method would perform with more complex scenarios and larger datasets, expressing skepticism about its general applicability. This comment introduces an important consideration for practical implementations of the described technique.

Overall, the comments on Hacker News provide a valuable critique and extension of the article's core ideas. They raise important questions about the practicality, maintainability, and broader implications of using decision trees to enhance LLM performance, offering a nuanced perspective on the potential and limitations of this hybrid approach.

Stories with Tag Model Evaluation

Putting Andrew Ng's OCR models to the test

Summary of Comments ( 46 ) https://news.ycombinator.com/item?id=43201001

Show HN: Klarity – Open-source tool to analyze uncertainty/entropy in LLM output

Summary of Comments ( 23 ) https://news.ycombinator.com/item?id=42918237

Evaluating Code Embedding Models

Summary of Comments ( 0 ) https://news.ycombinator.com/item?id=42894939

Scale AI Unveil Results of Humanity's Last Exam, a Groundbreaking New Benchmark

Summary of Comments ( 22 ) https://news.ycombinator.com/item?id=42806105

Coping with dumb LLMs using classic ML

Summary of Comments ( 44 ) https://news.ycombinator.com/item?id=42790820

Summary of Comments ( 46 )
https://news.ycombinator.com/item?id=43201001

Summary of Comments ( 23 )
https://news.ycombinator.com/item?id=42918237

Summary of Comments ( 0 )
https://news.ycombinator.com/item?id=42894939

Summary of Comments ( 22 )
https://news.ycombinator.com/item?id=42806105

Summary of Comments ( 44 )
https://news.ycombinator.com/item?id=42790820