How TensorRT Accelerates AI on RTX PCs

Unlocking Peak Generations: TensorRT Accelerates AI on RTX PCs and Workstations

As generative AI advances and spreads across industries, running generative AI applications on local PCs and workstations becomes increasingly important. Local inference gives consumers lower latency, removes the dependency on a network connection, and keeps them in control of their data.

NVIDIA GeForce and NVIDIA RTX GPUs feature Tensor Cores, dedicated AI hardware accelerators that provide the horsepower to run generative AI locally. Let’s dive into how TensorRT, NVIDIA’s software development kit, optimizes generative AI performance on these GPUs.

TensorRT Extension for Stable Diffusion

Stable Video Diffusion, an image-to-video generative AI model, is now optimized for the NVIDIA TensorRT SDK. This optimization unlocks the highest-performance generative AI on the more than 100 million Windows PCs and workstations powered by RTX GPUs. Additionally, the TensorRT extension for the popular Stable Diffusion WebUI by Automatic1111 adds support for ControlNets, tools that give users more control to refine generative outputs by adding other images as guidance.
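
Under the hood, TensorRT acceleration comes from compiling a trained network into a GPU-specific engine ahead of time, typically starting from an ONNX export of the model. The following minimal sketch uses the TensorRT Python API and assumes a diffusion UNet has already been exported to a hypothetical unet.onnx file; it illustrates the general workflow, not the extension's actual build script.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path: str, engine_path: str) -> None:
    """Parse an ONNX model and compile it into a serialized TensorRT engine."""
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # use FP16 Tensor Core kernels where possible

    engine = builder.build_serialized_network(network, config)
    if engine is None:
        raise RuntimeError("Engine build failed")
    with open(engine_path, "wb") as f:
        f.write(engine)

# Hypothetical file names, for illustration only.
build_engine("unet.onnx", "unet.plan")
```

The resulting engine is specific to the GPU and TensorRT version it was built with, which is why engines are typically built on the end user's machine rather than shipped prebuilt.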

UL Procyon AI Image Generation Benchmark

TensorRT acceleration can be put to the test in the new UL Procyon AI Image Generation benchmark, which internal tests have shown accurately replicates real-world performance. On a GeForce RTX 4080 SUPER GPU, for example, TensorRT delivered a 50% speedup compared with the fastest non-TensorRT implementation.

More Efficient and Precise AI

TensorRT gives developers access to the dedicated AI hardware for fully optimized AI experiences; AI performance typically doubles compared with running the same application on other frameworks. It also accelerates popular generative AI models such as Stable Diffusion and SDXL, and Stable Video Diffusion in particular sees a 40% speedup with TensorRT. The optimized Stable Video Diffusion 1.1 Image-to-Video model can be downloaded from Hugging Face.
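
As a rough sketch, the model files can be fetched with the huggingface_hub client. The repository ID below is an assumption for illustration; use the repository actually linked from the announcement, and note that Stability AI models are gated, so you may need to accept the license on Hugging Face and supply an access token.

```python
from huggingface_hub import snapshot_download

# Assumed repository ID -- substitute the repo linked from the announcement.
local_dir = snapshot_download(
    repo_id="stabilityai/stable-video-diffusion-img2vid-xt-1-1",
    # token="hf_...",  # needed if the repository is gated for your account
)
print("Model downloaded to:", local_dir)
```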

ControlNets for Improved Customization

With the latest update to the TensorRT extension for Stable Diffusion WebUI, TensorRT optimizations now extend to ControlNets: AI models that guide a diffusion model's output by adding extra conditions, letting users match aspects of the output to an input image for finer control over the final result. The extra condition can be a depth map, an edge map, a normal map, or keypoint detections, among others, and multiple ControlNets can be used together for even greater customization. With TensorRT, ControlNets run 40% faster.
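
For a sense of how ControlNet conditioning works in code, here is a hedged sketch using the Hugging Face diffusers library rather than the WebUI/TensorRT path: a precomputed Canny edge map steers the diffusion model so the output follows the guide image's structure. The model IDs and file names are illustrative.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Load a ControlNet trained on Canny edge maps and attach it to a base
# Stable Diffusion pipeline (model IDs are examples, not the extension's defaults).
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

edge_map = load_image("canny_edges.png")  # precomputed edge map of the guide image
result = pipe(
    "a futuristic city at sunset",
    image=edge_map,              # the extra condition the ControlNet enforces
    num_inference_steps=30,
).images[0]
result.save("output.png")
```

Swapping the ControlNet checkpoint (depth, normal, pose, and so on) changes which aspect of the guide image is enforced, and several ControlNets can be passed as a list to combine conditions.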

Download the TensorRT extension for Stable Diffusion WebUI on GitHub today to explore these capabilities further.