Stability AI announces the launch of Stable Diffusion 3 Medium, a lightweight but high-performance open source Text-to-Image model

Last February, Stability AI announced two new text-to-image models, Stable Cascade and Stable Diffusion 3, and opened a waiting list for an early preview of the latter, the latest iteration of its flagship model. The startup has now released Stable Diffusion 3 (SD3) Medium, a 2-billion-parameter open source model that it touts as its most sophisticated image generation model to date.

The SD3 family includes models ranging from 800 million to 8 billion parameters, giving users a range of options to suit their specific creative needs.

Smaller models like SD3 Medium offer a balance of performance, accessibility and efficiency. They are easier to retrain and fine-tune for specific use cases, and they are accessible to a wider range of users because they run on consumer hardware.

SD3 Medium

According to Stability AI, “SD3 medium’s small size makes it perfect for running on consumer PCs and laptops as well as enterprise-level GPUs”.
Running Stable Diffusion 3 Medium requires a minimum of only 5 GB of VRAM (video memory). Stability AI nevertheless recommends 16 GB of VRAM for comfortable use.
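
As a rough sanity check on the 5 GB figure, the sketch below estimates the memory taken by the model weights alone. It assumes 16-bit weights (2 bytes per parameter), which the article does not state, and it counts only the 2 billion parameters quoted for SD3 Medium; the text encoders, the VAE and the activations come on top of that. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Return the memory taken by the weights alone, in GiB.

    bytes_per_param=2 assumes 16-bit weights (an assumption, not a
    figure from Stability AI); use 4 for 32-bit weights.
    """
    return num_params * bytes_per_param / 1024**3


# 2 billion parameters, as quoted for SD3 Medium.
sd3_medium_params = 2_000_000_000

fp16_gib = weight_memory_gib(sd3_medium_params)     # 16-bit weights
fp32_gib = weight_memory_gib(sd3_medium_params, 4)  # 32-bit weights

print(f"fp16: {fp16_gib:.2f} GiB, fp32: {fp32_gib:.2f} GiB")
# prints "fp16: 3.73 GiB, fp32: 7.45 GiB"
```

Under these assumptions the 16-bit weights come to roughly 3.7 GiB, which is at least consistent with a 5 GB minimum. This is a back-of-the-envelope estimate, not a description of how Stability AI arrived at its figure.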

SD3 is a latent diffusion model that combines three text encoders (CLIP L/14, OpenCLIP bigG/14 and T5-v1.1-XXL), a new multimodal diffusion transformer (MMDiT) and a 16-channel variational autoencoder (VAE) similar to the one used for Stable Diffusion XL.
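
To give a feel for what a 16-channel VAE means in practice, the sketch below computes the shape of the latent tensor that the diffusion transformer operates on. The 8x spatial downsampling factor is an assumption carried over from the Stable Diffusion XL VAE mentioned above, which uses 4 latent channels; the function name `latent_shape` is hypothetical.

```python
from typing import Tuple


def latent_shape(
    height: int,
    width: int,
    channels: int = 16,   # 16-channel VAE, as stated for SD3
    downsample: int = 8,  # assumption: same 8x factor as the SDXL VAE
) -> Tuple[int, int, int]:
    """Return (channels, latent_height, latent_width) for an image size."""
    if height % downsample or width % downsample:
        raise ValueError("image size must be a multiple of the downsampling factor")
    return (channels, height // downsample, width // downsample)


sd3_latent = latent_shape(1024, 1024)
sdxl_like_latent = latent_shape(1024, 1024, channels=4)

print(sd3_latent, sdxl_like_latent)
# prints "(16, 128, 128) (4, 128, 128)"
```

With the same spatial resolution, a 16-channel latent carries four times as many values per position as a 4-channel one, which leaves more room to encode fine detail.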

Model performance

According to Stability AI, SD3 Medium stands out for its photorealism, its prompt adherence, its ability to render text and its suitability for fine-tuning.

It presents several significant improvements:

Overall Quality and Photorealism: Stable Diffusion 3 Medium produces images of exceptional quality, with precise details, vivid colors and realistic lighting. Thanks to its 16-channel VAE, it overcomes difficulties typical of image generation models, such as rendering realistic hands and faces;
Prompt Understanding: SD3 Medium can handle long and complex prompts, and is said to excel at spatial reasoning, compositional elements, actions and styles. Users can trade off performance against efficiency by choosing which of the three built-in text encoders to use;
Typography: The Diffusion Transformer architecture achieves unrivaled text quality, with fewer errors in spelling, kerning, letter formation and spacing;
Resource Efficiency: Stable Diffusion 3 Medium runs on standard consumer GPUs without performance degradation, thanks to its small VRAM footprint;
Fine-Tuning: The model is designed to absorb nuanced detail from small datasets, making it well suited to customization and specific applications.

Here are some images generated by the model and their prompts shared by Stability AI:

Collaboration with NVIDIA and AMD

Stability AI worked with NVIDIA to optimize the performance of its models, including Stable Diffusion 3 Medium, on NVIDIA® RTX™ GPUs using TensorRT™. According to Stability AI, the TensorRT-optimized versions deliver a 50% performance increase.
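
The article does not say how the 50% figure is measured. If it is read as 50% higher throughput (an assumption), the time per image falls by about a third rather than by half, as the short sketch below shows. The 6-second baseline is a hypothetical number chosen only for illustration, and the function name is hypothetical too.

```python
def time_after_speedup(baseline_seconds: float, throughput_gain: float) -> float:
    """Time per image after a relative throughput gain (0.50 means +50%)."""
    return baseline_seconds / (1.0 + throughput_gain)


baseline = 6.0  # hypothetical seconds per image before optimization
optimized = time_after_speedup(baseline, 0.50)
saving = 1 - optimized / baseline

print(f"{optimized:.1f} s per image, {saving:.0%} less time")
# prints "4.0 s per image, 33% less time"
```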

Additionally, AMD has optimized inference for Stable Diffusion 3 Medium across various devices, including AMD’s latest APUs, consumer GPUs and MI-300X enterprise GPUs, with the aim of ensuring compatibility and performance across a wide range of hardware.

Accessibility and Licenses

Stable Diffusion 3 Medium is an open source model released under the Stability Non-Commercial Research Community license, reaffirming Stability AI’s commitment to open generative AI. For commercial use, artists, designers and developers can upgrade to a new Creator License for $20 per month. For companies that need large-scale commercial use, Stability AI offers dedicated licenses and invites them to get in touch for details.

Try Stable Diffusion 3

Stable Diffusion 3 Medium is now available via an API powered by Fireworks AI. Users can also try other models in the Stable Diffusion 3 series, such as SD3 Large and SD3 Ultra, with a three-day free trial on the Stable Assistant chatbot and on Discord via Stable Artisan.