Story Details

  • DeepSeek releases Janus Pro, a text-to-image generator [pdf]

    Posted: 2025-01-27 16:57:45

    DeepSeek has released Janus Pro, a text-to-image model for high-resolution image generation that emphasizes photorealism and creative control. It uses a two-stage architecture: a base model generates a low-resolution image, and a dedicated super-resolution model then upscales it. Because the base model works at low resolution, larger images (up to 4K) can be generated faster while maintaining quality and coherence. Janus Pro also supports inpainting, outpainting, and style transfer, giving users more flexibility in their creative process. The model was trained on a large dataset of text-image pairs and uses a proprietary loss function optimized for both perceptual quality and text alignment.
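
    The two-stage flow described above (generate small, then upscale) can be sketched as a toy pipeline. This is only an illustration of the general pattern and not DeepSeek's code: `generate_base_image`, `upscale`, and `two_stage_generate` are hypothetical names, the "base model" is a deterministic stand-in, and nearest-neighbour upscaling is a placeholder for a learned super-resolution model.

```python
from typing import List

# Grayscale pixels in row-major order; a stand-in for real image tensors.
Image = List[List[int]]


def generate_base_image(prompt: str, size: int = 4) -> Image:
    """Hypothetical stage 1: stand-in for a base model producing a low-res image.

    A real model would condition on the prompt; here pixel values are derived
    deterministically from the prompt text so the example is reproducible.
    """
    seed = sum(ord(ch) for ch in prompt)
    return [[(seed + 31 * row + 17 * col) % 256 for col in range(size)]
            for row in range(size)]


def upscale(image: Image, factor: int) -> Image:
    """Hypothetical stage 2: nearest-neighbour upscaling.

    This is a placeholder for a learned super-resolution model; it only
    repeats each pixel `factor` times horizontally and vertically.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    upscaled: Image = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Copy the widened row so the output rows are independent lists.
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled


def two_stage_generate(prompt: str, base_size: int = 4, factor: int = 4) -> Image:
    """Run the cheap low-resolution stage first, then upscale its output."""
    low_res = generate_base_image(prompt, base_size)
    return upscale(low_res, factor)
```

    In this sketch the first stage only ever touches `base_size * base_size` pixels, which is the intuition behind the speed claim: the expensive generative step runs on a small grid, and the second stage is responsible for reaching the final resolution.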

    Summary of Comments (370)
    https://news.ycombinator.com/item?id=42843131

    Several Hacker News commenters are skeptical of the claims in the Janus Pro technical report, particularly its reported superiority over Stable Diffusion XL. They point to the lack of open-source code and public access, which makes independent verification difficult. Some suggest the comparisons may be cherry-picked or omit crucial details about the evaluation methodology. The closed nature of the model also raises questions about reproducibility and potential bias. Others note that the report focuses on specific benchmarks without addressing broader text-to-image capabilities. A few commenters express interest in the technology, but overall sentiment leans toward cautious scrutiny because of the lack of transparency.