This project reverse-engineered the obfuscated bytecode virtual machine used in the TikTok Android app to understand how it protects intellectual property like algorithms and business logic. By meticulously analyzing the VM's instructions and data structures, the author was able to reconstruct its inner workings, including the opcode format, register usage, and stack manipulation. This allowed them to develop a custom disassembler and deobfuscator, ultimately enabling analysis of the previously hidden bytecode and revealing the underlying application logic executed by the VM. This effort provides insight into TikTok's anti-reversing techniques and sheds light on how the app functions internally.
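To make the idea of a custom disassembler concrete, here is a minimal sketch of table-driven disassembly. The summary does not give the real instruction encoding, so the fixed 3-byte layout (opcode, register, immediate), the opcode numbers, and the mnemonics below are all hypothetical and chosen only for illustration.

```python
# Hypothetical 3-byte instruction format: [opcode][register][immediate].
# This is NOT the real TikTok VM encoding; it only illustrates the technique.
OPCODES = {0x01: "LOAD", 0x02: "ADD", 0x03: "PUSH", 0x04: "POP", 0xFF: "HALT"}


def disassemble(code: bytes) -> list[str]:
    """Turn raw bytecode into one text line per 3-byte instruction."""
    lines = []
    usable = len(code) - len(code) % 3  # ignore a trailing partial instruction
    for offset in range(0, usable, 3):
        opcode, reg, imm = code[offset:offset + 3]
        # Unknown opcodes are kept visible instead of aborting, which helps
        # while the opcode table is still being reconstructed.
        name = OPCODES.get(opcode, f"UNK_{opcode:02X}")
        lines.append(f"{offset:04x}: {name} r{reg}, {imm}")
    return lines


listing = disassemble(bytes([0x01, 0, 7, 0x02, 0, 5, 0x7A, 1, 0, 0xFF, 0, 0]))
for line in listing:
    print(line)
```

Emitting a placeholder for unknown opcodes is a deliberate choice here: when reversing an undocumented VM, the table usually grows incrementally, and a listing with gaps is more useful than none.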
A JavaScript-based Transputer emulator has been developed and is performant enough for practical use. It emulates a T425 Transputer, including its 32-bit processor, on-chip RAM, and link interfaces for connecting multiple virtual Transputers. The emulator aims for accuracy and speed, leveraging WebAssembly and other optimizations. While still under development, it can already run various programs, offering a readily accessible way to explore and experiment with this parallel computing architecture within a web browser. The project's website provides interactive demos and source code.
Hacker News users discussed the surprising speed and cleverness of a JavaScript-based Transputer emulator. Several praised the author's ingenuity in optimizing the emulator, making it performant enough for practical uses like running old Transputer demos. Some commenters reminisced about their past experiences with Transputers, highlighting their unique architecture and the challenges of parallel programming. Others expressed interest in exploring the emulator further, with suggestions for potential applications like running old games or educational purposes. A few users discussed the technical aspects of the emulator, including the use of Web Workers and the limitations of JavaScript for emulating parallel architectures. The overall sentiment was positive, with many impressed by the project's technical achievement and nostalgic value.
Win98-quickinstall is a project that streamlines the installation of Windows 98SE. It provides a pre-configured virtual machine image and a framework for automating the installation process, significantly reducing the time and effort required for setup. The project includes pre-installed drivers, essential utilities, and tweaks for improved performance and stability in a virtualized environment. This allows users to quickly deploy a functional Windows 98SE instance for testing, development, or nostalgia.
Hacker News users discussed the practicality and nostalgia of the Win98-quickinstall project. Some questioned its usefulness in a modern context, while others praised its potential for retro gaming or specific hardware configurations. Several commenters shared their own experiences and challenges with setting up Windows 98, highlighting driver compatibility issues and the tediousness of the original installation process. The project's use of QEMU for virtualized installs was also a point of interest, with some users suggesting alternative approaches. A few comments focused on the technical aspects of the installer, including its scripting and modular design. Overall, the sentiment leaned towards appreciation for the project's ingenuity and its ability to simplify a complex process, even if its real-world applications are limited.
Blue95 is a passion project aiming to recreate the nostalgic experience of a late 90s/early 2000s home computer setup. It's a curated collection of period-accurate software, themes, and wallpapers, designed to evoke the look and feel of Windows 95/98, packaged as a bootable ISO for virtual machines or physical hardware. The project focuses on free and open-source software alternatives to commercial applications of the era, offering a curated selection of games, utilities, and creative tools, all wrapped in a familiar, retro aesthetic. The goal is to capture the essence of that era's computing experience – a blend of discovery, simplicity, and playful experimentation.
Hacker News users generally expressed nostalgia and appreciation for Blue95's aesthetic, recalling the era of Windows 95 and early internet experiences. Several commenters praised the attention to detail and accuracy in recreating the look and feel of the period. Some discussed the practical limitations of older hardware and software, while others reminisced about specific games and applications. A few users questioned the project's purpose beyond nostalgia, but overall the reception was positive, with many expressing interest in trying it out or contributing to its development. The discussion also touched on the broader trend of retro computing and the desire to revisit simpler technological times.
The blog post details the author's process of switching from Linux (Pop!_OS, specifically) to Windows 11. Driven by the desire for a better gaming experience and smoother integration with their workflow involving tools like Adobe Creative Suite and DaVinci Resolve, they opted for a clean Windows installation. The author outlines the steps they took, including backing up essential Linux files, creating a Windows installer USB drive, and installing Windows. They also touch on post-installation tasks like driver installation and setting up their development environment with WSL (Windows Subsystem for Linux) to retain access to Linux tools. Ultimately, the post documents a pragmatic approach to switching operating systems, prioritizing software compatibility and performance for the author's specific needs.
Several commenters on Hacker News express skepticism about the blog post's claim of seamlessly switching from Linux to Windows. Some point out that the author's use case (primarily gaming and web browsing) doesn't necessitate Linux's advantages, making the switch less surprising. Others question the long-term viability of relying on Windows Subsystem for Linux (WSL) for development, citing potential performance issues and compatibility problems. A few commenters share their own experiences switching between operating systems, with some echoing the author's sentiments and others detailing difficulties they encountered. The overall sentiment leans toward cautious curiosity about WSL's capabilities while remaining unconvinced it's a complete replacement for a native Linux environment for serious development work. Several users suggest the author might switch back to Linux in the future as their needs change.
JEP 483 (Ahead-of-Time Class Loading & Linking) aims to improve startup time by making an application's classes available in a loaded and linked state as soon as the HotSpot JVM starts. Instead of loading, verifying, and linking every class on each launch, the JVM observes the application during a training run and stores the results in an AOT cache that later runs reuse. The feature builds on the existing class-data sharing (CDS) mechanism and complements dynamic class loading rather than replacing it: classes found in the cache are available immediately, while everything else is loaded as usual. The trade-offs are the extra disk space taken by the cache and the additional training step in the build or deployment workflow. JEP 483 targets startup, not peak performance. It does not compile methods to native code and does not add a separate compiler tool; the cache is created and used through the java launcher's -XX:AOTMode, -XX:AOTConfiguration, and -XX:AOTCache options.
HN commenters generally express interest in JEP 483's potential performance benefits, particularly faster startup times. Some highlight the complexity of the proposed changes and the potential for subtle bugs. A few commenters question the necessity of AOT given existing JIT compiler advancements, while others point out that AOT can offer advantages beyond raw startup speed, such as reduced memory footprint and improved warmup times. One commenter notes the limited scope of the initial JEP, applying only to platform classes, and wonders about future expansion to application classes. Another expresses concern about the potential security implications of pre-compiled code. Several users discuss the interplay between AOT compilation and existing JIT compilation, specifically how the two might be used together effectively.
MilliForth-6502 is a minimalist Forth implementation for the 6502 processor, designed to be incredibly small while remaining a practical programming language. It features a 1 KB dictionary, a 256-byte parameter stack, and implements core Forth words including arithmetic, logic, stack manipulation, and I/O. Despite its size, MilliForth allows for defining new words and includes a simple interactive interpreter. Its compactness makes it suitable for resource-constrained 6502 systems, and the project provides source code and documentation for building and using it.
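The core Forth mechanics mentioned above (a parameter stack, a dictionary of words, and colon definitions that add new words) can be sketched in a few lines. This is an illustrative Python model, not MilliForth's 6502 code, and the word set shown here is an assumption rather than MilliForth's actual primitive list.

```python
def make_forth():
    """Return a `run` function backed by one shared stack and dictionary."""
    stack = []
    words = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
        "swap": lambda: stack.extend([stack.pop(), stack.pop()]),
        "drop": lambda: stack.pop(),
    }

    def run(source):
        tokens = source.split()
        i = 0
        while i < len(tokens):
            tok = tokens[i]
            if tok == ":":
                # Colon definition: ": name body... ;" adds a dictionary entry.
                end = tokens.index(";", i)
                name, body = tokens[i + 1], tokens[i + 2:end]
                words[name] = lambda body=body: run(" ".join(body))
                i = end
            elif tok in words:
                words[tok]()            # execute a known word
            else:
                stack.append(int(tok))  # anything else must be a number
            i += 1
        return stack

    return run


forth = make_forth()
forth(": square dup * ;")
result = forth("3 square 4 +")
print(result)
```

The dictionary is just a name-to-action mapping, which is why user-defined words behave exactly like built-in ones once they are defined.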
Hacker News users discussed the practicality and minimalism of MilliForth, a Forth implementation for the 6502 processor. Some questioned its usefulness beyond educational purposes, citing limited memory and awkward programming style compared to assembly language. Others appreciated its cleverness and the challenge of creating such a compact system, viewing it as a testament to Forth's flexibility. Several comments highlighted the historical context of Forth on resource-constrained systems and drew parallels to other small language implementations. The maintainability of generated code and the debugging experience were also mentioned as potential drawbacks. A few commenters expressed interest in exploring MilliForth further and potentially using it for small embedded projects.
This 1987 dissertation by Dybvig presents three implementation models for Scheme: a heap-based model, a stack-based model, and a string-based model. The heap-based model, used in some form by most earlier Scheme implementations, allocates argument lists, binding environments, and call frames in a heap. The stack-based model allocates the same structures on a stack whenever possible, which reduces heap allocation, memory references, and garbage collection and makes most programs run considerably faster. The string-based model keeps versions of these structures directly in the program text and is intended for a multiprocessor implementation in which Scheme programs are translated to an FFP language and executed by a string-reduction machine. The work details the trade-offs between the models and shows how features such as first-class closures and continuations can be supported in each.
HN commenters discuss the historical significance of the paper in establishing Scheme's minimalist design and portability. They highlight the cleverness of the three implementations, particularly the threaded code interpreter, and its influence on later languages like Lua. Some note the paper's accessibility and clarity, even for those unfamiliar with Scheme, while others reminisce about using the techniques described. A few comments delve into technical details like register allocation and garbage collection, comparing the approaches to modern techniques. The overall sentiment is one of appreciation for the paper's contribution to computer science and programming language design.
MichiganTypeScript is a proof-of-concept project demonstrating a WebAssembly runtime implemented entirely within TypeScript's type system. It doesn't actually execute WebAssembly code, but instead uses advanced type-level programming techniques to simulate its execution. By representing WebAssembly instructions and memory as types, and leveraging TypeScript's type inference and checking capabilities, the project can statically verify the behavior of a given WebAssembly program. This effectively transforms TypeScript's type checker into an interpreter, showcasing the power and flexibility of its type system, albeit in a non-practical, purely theoretical manner.
Hacker News users discussed the cleverness of using TypeScript's type system for computation, with several expressing fascination and calling it "amazing" or "brilliant." Some debated the practical applications, acknowledging its limitations while appreciating it as a demonstration of the type system's power. Concerns were raised about debugging complexity and the impracticality for larger programs. Others drew parallels to other Turing-complete type systems and pondered the potential for generating optimized WASM code from such TypeScript code. A few commenters pointed out the project's connection to the "ts-sql" project and speculated about leveraging similar techniques for compile-time query validation and optimization. Several users also highlighted the educational value of the project, showcasing the unexpected capabilities of TypeScript's type system.
Neut is a statically-typed, compiled programming language designed for building reliable and maintainable systems software. It emphasizes simplicity and explicitness through its C-like syntax, minimal built-in features, and focus on compile-time evaluation. Key features include a powerful macro system enabling metaprogramming and code generation, algebraic data types for representing data structures, and built-in support for pattern matching. Neut aims to empower developers to write efficient and predictable code by offering fine-grained control over memory management and avoiding hidden runtime behavior. Its explicit design choices and limited standard library encourage developers to build reusable components tailored to their specific needs, promoting code clarity and long-term maintainability.
HN commenters generally express interest in Neut, praising its focus on simplicity, safety, and explicitness. Several highlight the appealing aspects of linear types and the borrow checker, noting similarities to Rust but with a seemingly gentler learning curve. Some question the practical applicability of linear types for larger projects, while others anticipate its usefulness in specific domains like game development or embedded systems. A few commenters express skepticism about the limited standard library and the overall maturity of the project, but the overall tone is positive and curious about the language's potential. Performance, particularly relating to garbage collection or its lack thereof, is a recurring point of discussion, with some wondering about the potential for optimizations given the linear type system.
This blog post chronicles the author's weekend project of building a compiler for a simplified C-like language. It walks through the implementation of a lexical analyzer, parser (using recursive descent), and code generator targeting x86-64 assembly. The compiler handles basic arithmetic operations, variable declarations and assignments, if/else statements, and while loops. The post emphasizes simplicity and educational value over performance or completeness, providing a practical example of compiler construction principles in a digestible format. The code is available on GitHub for readers to explore and experiment with.
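The pipeline described above (lexer, recursive descent parser, x86-64 code generation) can be sketched for arithmetic expressions alone. This is a Python illustration under stated assumptions, not the author's code: the grammar, the class and function names, and the stack-machine style of the emitted assembly are all choices made for this example.

```python
import re


def tokenize(src):
    """Lexer: split the source into integer literals and single-char operators."""
    return re.findall(r"\d+|[-+*()]", src)


class Compiler:
    def __init__(self, tokens):
        self.tokens, self.pos, self.asm = tokens, 0, []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):    # expr := term (('+' | '-') term)*
        self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            self.term()
            self.binop("add" if op == "+" else "sub")

    def term(self):    # term := factor ('*' factor)*
        self.factor()
        while self.peek() == "*":
            self.take()
            self.factor()
            self.binop("imul")

    def factor(self):  # factor := NUMBER | '(' expr ')'
        tok = self.take()
        if tok == "(":
            self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
        elif tok is not None and tok.isdigit():
            self.asm.append(f"push {tok}")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def binop(self, mnemonic):
        # Operands live on the machine stack: right operand in rbx, left in rax.
        self.asm += ["pop rbx", "pop rax", f"{mnemonic} rax, rbx", "push rax"]


def compile_expr(src):
    compiler = Compiler(tokenize(src))
    compiler.expr()
    if compiler.peek() is not None:
        raise SyntaxError("trailing input")
    return compiler.asm + ["pop rax"]  # final result ends up in rax


asm = compile_expr("1 + 2 * 3")
print("\n".join(asm))
```

Each grammar rule maps to one method, so operator precedence falls out of the call structure: `term` binds tighter than `expr` simply because `expr` calls it.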
HN users largely praised the TinyCompiler project for its educational value, highlighting its clear code and approachable structure as beneficial for learning compiler construction. Several commenters discussed extending the compiler's functionality, such as adding support for different architectures or optimizing the generated code. Some pointed out similar projects or resources, like the "Let's Build a Compiler" tutorial and the Crafting Interpreters book. A few users questioned the "weekend" claim in the title, believing the project would take significantly longer for a novice to complete. The post also sparked discussion about the practical applications of such a compiler, with some suggesting its use for educational purposes or embedding in resource-constrained environments. Finally, there was some debate about the complexity of the compiler compared to more sophisticated tools like LLVM.
The F8 is a new 8-bit computer architecture designed for efficiency in both code size and memory usage, especially when programming in C. It aims to achieve performance comparable to 16-bit systems while maintaining the simplicity and resource efficiency of 8-bit designs. This is accomplished through features like a hybrid stack/register-based architecture, variable-width instructions, and dedicated instructions for common C operations like pointer manipulation and function calls. The F8 also emphasizes practical applications with features like a built-in bootloader and support for direct connection to peripherals.
Hacker News users discussed the F8 architecture's unusual design choices. Several commenters questioned the practical applications given the performance tradeoffs for memory efficiency, particularly with modern memory availability. Some debated the value of 8-bit architectures in niche applications like microcontrollers, while others pointed out existing alternatives like AVR. The unusual register structure and lack of hardware stack were also discussed, with some suggesting it might hinder C compiler optimization. A few expressed interest in the unique approach, though skepticism about real-world viability was prevalent. Overall, the comments reflected a cautious curiosity towards F8 but with reservations about its usefulness compared to established architectures.
Zeroperl leverages WebAssembly (Wasm) to create a secure sandbox for executing Perl code. It compiles a subset of Perl 5 to Wasm, allowing scripts to run in a browser or server environment with restricted capabilities. This approach enhances security by limiting access to the host system's resources, preventing malicious code from wreaking havoc. Zeroperl utilizes a custom runtime environment built on Wasmer, a Wasm runtime, and focuses on supporting commonly used Perl modules for tasks like text processing and bioinformatics. While not aiming for full Perl compatibility, Zeroperl offers a secure and efficient way to execute specific Perl workloads in constrained environments.
Hacker News commenters generally expressed interest in Zeroperl, praising its innovative approach to sandboxing Perl using WebAssembly. Some questioned the performance implications of this method, wondering if it would introduce significant overhead. Others discussed alternative sandboxing techniques, like using containers or VMs, comparing their strengths and weaknesses to WebAssembly. Several users highlighted potential use cases, particularly for serverless functions and other cloud-native environments. A few expressed skepticism about the viability of fully securing Perl code within WebAssembly given Perl's dynamic nature and CPAN module dependencies. One commenter offered a detailed technical explanation of why certain system calls remain accessible despite the sandbox, emphasizing the ongoing challenges inherent in securing dynamic languages.
OpenLDK is a project that implements a Java Virtual Machine (JVM) and Just-In-Time (JIT) compiler written entirely in Common Lisp. It aims to be a high-performance JVM alternative, leveraging Lisp's metaprogramming capabilities for dynamic code generation and optimization. The project features a modular design, encompassing a bytecode interpreter, a tiered JIT compiler using a method-based compilation strategy, and a garbage collector. OpenLDK is considered experimental and under active development, focusing on performance enhancements and broader Java compatibility.
Commenters on Hacker News express interest in OpenLDK, primarily focusing on its unusual implementation of a Java Virtual Machine (JVM) in Common Lisp. Several question the practical applications and performance implications of this approach, wondering about its speed and suitability for real-world projects. Some highlight the potential benefits of Lisp's dynamic nature for tasks like debugging and introspection. Others draw parallels to similar projects like Clojure and GraalVM, discussing their respective advantages and disadvantages. A few express skepticism about the long-term viability of the project, while others praise the technical achievement and express curiosity about its potential. The novelty of using Lisp for JVM implementation clearly sparks the most discussion.
The author details their process of creating a WebAssembly (Wasm) virtual machine (VM) written entirely in C. Driven by a desire for a lightweight, embeddable Wasm runtime for resource-constrained environments, they built the VM from scratch, implementing core features like the stack-based execution model, linear memory, and basic WebAssembly System Interface (WASI) support. The project focused on simplicity and understandability over performance, serving primarily as a learning exercise and a platform for experimentation with Wasm. The post walks through key aspects of the VM's design and implementation, including parsing the Wasm binary format, handling function calls, and managing memory. It also highlights the challenges faced and lessons learned during the development process.
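As an illustration of the binary-parsing step, the sketch below checks a module's 8-byte header and walks its sections, whose sizes are encoded as unsigned LEB128 integers. It is written in Python for brevity rather than the post's C, and the function names are assumptions made for this example.

```python
import struct


def read_uleb128(data: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry payload
        if (byte & 0x80) == 0:            # high bit clear marks the last byte
            return result, pos
        shift += 7


def list_sections(module: bytes) -> list[tuple[int, int]]:
    """Return (section id, payload size) pairs without decoding the payloads."""
    if module[:4] != b"\0asm":
        raise ValueError("missing Wasm magic number")
    (version,) = struct.unpack_from("<I", module, 4)
    if version != 1:
        raise ValueError(f"unsupported Wasm version {version}")
    pos, sections = 8, []
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_uleb128(module, pos + 1)
        sections.append((section_id, size))
        pos += size  # skip over the section payload
    return sections


# Smallest interesting module: a type section holding one () -> () signature.
tiny = b"\0asm\x01\0\0\0" + bytes([0x01, 0x04, 0x01, 0x60, 0x00, 0x00])
sections = list_sections(tiny)
print(sections)
```

Because every section carries its own length, a loader can skip sections it does not understand, which is convenient when building up a VM feature by feature.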
Hacker News users generally praised the author's clear writing style and the educational value of the post. Several commenters discussed the project's performance, noting that it's not optimized for speed and suggesting potential improvements like just-in-time compilation. Some shared their own experiences with WASM interpreters and related projects, including comparisons to other implementations and alternative approaches like using a stack machine. Others appreciated the detailed explanation of the parsing and execution process, finding it helpful for understanding WASM internals. A few users pointed out minor corrections or areas for potential enhancement in the code, demonstrating active engagement with the technical details.
Mukul Rathi details his journey of creating a custom programming language, focusing on the compiler construction process. He explains the key stages involved, from lexing (converting source code into tokens) and parsing (creating an Abstract Syntax Tree) to code generation and optimization. Rathi uses his language, which he implements in OCaml, to illustrate these concepts, providing code examples and explanations of how each component works together to transform high-level code into executable machine instructions. He emphasizes the importance of understanding these foundational principles for anyone interested in building their own language or gaining a deeper appreciation for how programming languages function.
Hacker News users generally praised the article for its clarity and accessibility in explaining compiler construction. Several commenters appreciated the author's approach of building a complete, albeit simple, language instead of just a toy example. Some pointed out the project's similarity to the "Let's Build a Compiler" series, while others suggested alternative or supplementary resources like Crafting Interpreters and the LLVM tutorial. A few users discussed the tradeoffs between hand-written lexers/parsers and using parser generator tools, and the challenges of garbage collection implementation. One commenter shared their personal experience of writing a language and the surprising complexity of seemingly simple features.
Summary of Comments ( 82 )
https://news.ycombinator.com/item?id=43747921
HN users discussed the difficulty and complexity of reverse engineering TikTok's obfuscated VM, expressing admiration for the author's work. Some questioned the motivation behind such extensive obfuscation, speculating about anti-competitive practices and data exfiltration. Others debated the ethics and legality of reverse engineering, particularly in the context of closed-source applications. Several comments focused on the technical aspects of the reverse engineering process, including the tools and techniques used, the challenges faced, and the insights gained. A few users also shared their own experiences with reverse engineering similar apps and offered suggestions for further research. The overall sentiment leaned towards cautious curiosity, with many acknowledging the potential security and privacy implications of TikTok's complex architecture.
The Hacker News post "Reverse engineering the obfuscated TikTok VM" (https://news.ycombinator.com/item?id=43747921) has generated a modest number of comments, mostly focusing on the technical challenges and implications of reverse-engineering TikTok's code.
Several commenters discuss the complexity of reverse-engineering TikTok's bytecode, highlighting the "control flow flattening" technique used to obfuscate the code. They explain how this technique makes it difficult to understand the app's logic by obscuring the natural flow of execution. One commenter notes that this is a common tactic used in malware and other software seeking to protect against analysis. This commenter also mentions the challenges of renaming variables and functions during the deobfuscation process, adding to the complexity of understanding the code.
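A small sketch shows what control flow flattening does to readable code. The example below is hypothetical and unrelated to TikTok's actual bytecode: the same loop is written once normally and once as a dispatcher driven by a state variable, with the state numbers deliberately out of order.

```python
def sum_to_n(n):
    """Original, readable control flow."""
    total, i = 0, 1
    while i <= n:
        total += i
        i += 1
    return total


def sum_to_n_flattened(n):
    """Same logic after flattening: every basic block becomes a case in one
    dispatcher loop, and the order of execution is hidden in `state`."""
    state = 0
    total = i = 0
    while True:
        if state == 0:      # entry block
            total, i = 0, 1
            state = 2
        elif state == 2:    # loop condition
            state = 1 if i <= n else 3
        elif state == 1:    # loop body
            total += i
            i += 1
            state = 2
        elif state == 3:    # exit block
            return total


print(sum_to_n(10), sum_to_n_flattened(10))
```

Recovering the original structure means tracing which state each block assigns next, which is straightforward here but becomes laborious once the state values are computed or encrypted.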
Another commenter points out the difficulty in tracing back the disassembled code to specific features or functionalities within the TikTok app. This is particularly relevant in a large and complex application like TikTok, where associating specific code sections with user-facing features can be a daunting task.
Some comments delve into the broader implications of this reverse-engineering effort. One commenter questions the ultimate goal of the project, speculating whether it's for security analysis, understanding TikTok's algorithms, or potentially developing modifications for the app. They also touch upon the legal and ethical considerations of reverse-engineering proprietary software. Another commenter expresses concern over TikTok's extensive data collection practices, suggesting that reverse-engineering efforts could shed light on how this data is collected and used.
A couple of comments discuss the broader trend of app obfuscation and the ongoing "cat and mouse game" between developers who obfuscate their code and security researchers who attempt to reverse-engineer it. They point out the constant evolution of obfuscation techniques and the challenges faced by researchers in keeping up with these advancements.
Finally, a comment mentions the practical challenges of reverse-engineering, including the time and effort required to analyze obfuscated code. This highlights the significant investment needed to unravel the inner workings of complex applications like TikTok. The thread lacks highly upvoted or controversial comments, keeping the discussion relatively focused on the technical aspects of reverse engineering and its implications for TikTok.