Ghostwriter is a project that turns the reMarkable 2 tablet into an interface for large language models (LLMs). It sends the user's handwritten prompts to an LLM and displays the generated text response directly on the e-ink screen, so the whole exchange stays within the distraction-free environment of the reMarkable 2. The project is open-source and allows for customization, including choosing the LLM and adjusting various settings.
The GitHub repository titled "Ghostwriter" introduces a novel approach to interacting with vision-capable large language models (Vision-LLMs), specifically Google's Gemini, by using the reMarkable 2 tablet as the primary input and output device. The project aims to create a more natural and intuitive writing experience by combining the tactile feel of handwriting on the reMarkable 2 with the generative capabilities of advanced LLMs.
The system works by capturing handwritten text and simple drawings created on the reMarkable 2. This input is transmitted to a server, where it is interpreted and fed as a prompt to a Vision-LLM. The LLM works from the visual information directly, generating a response based on the handwritten input. These responses, which can include generated text, code, or even images in response to sketched diagrams, are then returned to the reMarkable 2 screen for display. This creates a closed loop: the user writes or draws on the tablet, the LLM interprets and responds, and the response is displayed back on the reMarkable 2, making for a dynamic and interactive exchange with the LLM.
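The closed loop described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the names `build_prompt_payload` and `run_turn`, the JSON field names, and the default instruction text are all hypothetical, and the capture, model, and display steps are passed in as plain functions so that no particular tablet or LLM API is assumed.

```python
import base64
import json
from typing import Callable


def build_prompt_payload(image_bytes: bytes, instruction: str) -> str:
    """Package a screen capture as a JSON request body.

    The field names here are hypothetical; a real Vision-LLM API
    defines its own request schema.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"instruction": instruction, "image_base64": encoded})


def run_turn(
    capture: Callable[[], bytes],
    ask_model: Callable[[str], str],
    display: Callable[[str], None],
    instruction: str = "Respond to the handwriting in this image.",
) -> str:
    """Run one write -> interpret -> respond -> display cycle."""
    payload = build_prompt_payload(capture(), instruction)  # user input as an image
    reply = ask_model(payload)                              # model interprets and responds
    display(reply)                                          # response goes back to the screen
    return reply
```

Calling `run_turn` repeatedly gives the continuous back-and-forth the description refers to. Passing the three steps in as functions keeps the sketch independent of any specific device or model provider.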
Ghostwriter employs a multi-stage process to achieve this functionality. Initially, it utilizes the rm2fb utility to establish a framebuffer connection with the reMarkable 2, allowing real-time access to the screen content. Changes in the framebuffer are monitored to detect new handwritten input. This new input is then extracted, processed for clarity and legibility, and converted into a format suitable for the Vision-LLM. The processed input is then sent as a prompt to the LLM via an API call. The LLM's generated output is subsequently received by the server and formatted appropriately for display on the reMarkable 2. Finally, the formatted response is transmitted back to the tablet, updating the display and presenting the LLM's output to the user. This entire cycle repeats, allowing for continuous interaction and a seamless back-and-forth between user input and LLM generation, all mediated through the reMarkable 2 interface. The aim is to provide a more fluid and engaging experience than traditional keyboard-and-mouse interaction with LLMs, mimicking the intuitive nature of working with pen and paper while harnessing the power of advanced AI models.
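The change-detection and extraction steps can be illustrated with a small, self-contained Python sketch. It assumes, purely for simplicity, that a frame is a flat `bytes` buffer with one grayscale byte per pixel; the real reMarkable 2 framebuffer layout may differ, and `changed_region` and `crop` are hypothetical helper names rather than part of rm2fb or Ghostwriter.

```python
from typing import Optional, Tuple

# left, top, right, bottom; right and bottom are exclusive
Box = Tuple[int, int, int, int]


def changed_region(prev: bytes, curr: bytes, width: int, height: int) -> Optional[Box]:
    """Return the bounding box of pixels that differ between two frames.

    Returns None when the frames are identical.
    """
    if len(prev) != width * height or len(curr) != width * height:
        raise ValueError("frame size does not match width * height")
    left, top, right, bottom = width, height, 0, 0
    for y in range(height):
        start = y * width
        row_prev = prev[start:start + width]
        row_curr = curr[start:start + width]
        if row_prev == row_curr:
            continue  # fast path: nothing changed on this row
        for x in range(width):
            if row_prev[x] != row_curr[x]:
                left = min(left, x)
                right = max(right, x + 1)
        top = min(top, y)
        bottom = max(bottom, y + 1)
    if right == 0:
        return None
    return (left, top, right, bottom)


def crop(frame: bytes, width: int, box: Box) -> bytes:
    """Extract the pixels inside the box as a new, smaller flat buffer."""
    left, top, right, bottom = box
    return b"".join(
        frame[y * width + left:y * width + right] for y in range(top, bottom)
    )
```

Comparing each new frame against the previous one and cropping to the changed box means only the fresh handwriting needs to be processed further. Sending the whole screen to the model instead would also be a reasonable design, and the source does not say which approach Ghostwriter takes.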
Summary of Comments (70)
https://news.ycombinator.com/item?id=42979986
HN commenters generally expressed excitement about Ghostwriter, particularly its potential for integrating handwritten input with LLMs. Several users pointed out the limitations of existing tablet-based coding solutions and saw Ghostwriter as a promising alternative. Some questioned the practicality of handwriting code extensively, while others emphasized its usefulness for diagrams, note-taking, and mathematical formulas, especially when combined with LLM capabilities. The discussion touched upon the desire for similar functionality with other tablets like the iPad and speculated on potential applications in education and creative fields. A few commenters expressed interest in the open-source nature of the project and its potential for customization.
The Hacker News thread linked (https://news.ycombinator.com/item?id=42979986) discusses the "Ghostwriter" project, which allows users to leverage their reMarkable 2 tablet as an input device for vision-language models (VLMs). The discussion is relatively brief, consisting of only a few comments, and doesn't delve deeply into the project's merits or drawbacks. It doesn't present any highly compelling arguments or particularly insightful perspectives.
One user questions the practical application of the project, wondering if there's a genuine use case beyond its novelty. They ask what real-world problem it solves and suggest alternative, potentially more efficient ways of interacting with VLMs, such as using a phone's camera. The comment reflects a common reaction to new technologies: questioning their purpose beyond the initial "cool" factor.
Another commenter expresses a desire to see similar functionality for other e-ink devices, specifically mentioning the Onyx Boox. This suggests a potential interest in the broader application of e-ink tablets as interfaces for AI models and highlights a user base looking for expanded compatibility.
A third comment very briefly mentions using the reMarkable tablet for note-taking while coding, indirectly hinting at a possible use case for Ghostwriter. However, the connection isn't explicitly made, and the commenter doesn't elaborate on how Ghostwriter might fit into that workflow.
Overall, the discussion is limited and primarily focuses on initial reactions and potential future applications rather than a detailed analysis of Ghostwriter itself. It doesn't offer a wealth of compelling insights, mainly expressing curiosity, suggestions for broader compatibility, and a questioning of the project's practical utility.