Announces RTX AI Toolkit, AIM SDK, ACE With NIMs, Copilot Runtime With RTX GPU Support


NVIDIA is pushing the AI PC platform further with the latest RTX technologies it announced today.

NVIDIA Pushes The AI PC Platform Forward With Several Key Technologies: RTX AI Toolkit, RTX Acceleration For Copilot, AI Inference Manager SDK & More

The difference between NVIDIA and others who have just started their journey in the AI PC realm is evident from the get-go. While others mostly talk about how their NPUs are faster than their rivals’, NVIDIA is making the AI PC platform vibrant by introducing several new features. The company already has a list of technologies available to AI PC consumers on its RTX platform, the most prominent being DLSS (Deep Learning Super Sampling), which has received countless updates to its neural network to make games run and look better.

The company also offers RTX Chat, a chatbot that runs locally on your PC and acts as your assistant. TensorRT & TensorRT-LLM support has also been added to Windows, accelerating GenAI & LLM models on client platforms without needing to go to the cloud. Several upcoming gaming technologies will also use AI enhancements, such as ACE (Avatar Cloud Engine), which gets a new update today.

NVIDIA also lays out the current landscape of AI computational power and shows how its GeForce RTX 40 Desktop GPUs scale from 242 TOPS at the entry level all the way up to 1321 TOPS at the high end. That’s a 4.84x increase at the lowest end and a 26.42x increase at the very top compared to the latest 45-50 TOPS AI NPUs that we will be seeing on SoCs this year.

Chart comparison: RTX 4070 Ti SUPER (Desktop) vs. AMD Strix (NPU, expected) vs. Intel Lunar Lake (NPU, expected)

Even NVIDIA’s GeForce RTX 40 laptop options start at 194 TOPS with the RTX 4050, a 3.88x increase over the fastest upcoming NPU, while the RTX 4090 Laptop chip offers a 13.72x speedup with its 686 TOPS.
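The multipliers quoted above can be reproduced with a few lines of Python. This is only a sanity check of the arithmetic, assuming the baseline is the upper end (50 TOPS) of the quoted 45-50 TOPS NPU range; the dictionary labels and the `speedup` helper are illustrative names, not anything published by NVIDIA.

```python
# Assumption: the comparison baseline is 50 TOPS, the top of the
# 45-50 TOPS NPU range mentioned in the article.
NPU_BASELINE_TOPS = 50

# TOPS figures as quoted in the article; the labels are illustrative.
rtx_tops = {
    "RTX 40 desktop (entry level)": 242,
    "RTX 40 desktop (high end)": 1321,
    "RTX 4050 Laptop": 194,
    "RTX 4090 Laptop": 686,
}


def speedup(tops, baseline=NPU_BASELINE_TOPS):
    """Return how many times faster `tops` is than the NPU baseline."""
    return round(tops / baseline, 2)


ratios = {name: speedup(tops) for name, tops in rtx_tops.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x over a {NPU_BASELINE_TOPS} TOPS NPU")
```

Against a 45 TOPS NPU the gaps would be somewhat larger, so the quoted figures are the conservative end of the comparison.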


Microsoft Copilot Runtime Adds RTX Acceleration

Starting with today’s announcements, the Windows Copilot Runtime is getting RTX acceleration for local PC SLMs (Small Language Models). Copilot is seen as the next big thing from Microsoft in the AI PC landscape, and virtually everyone is trying to hop on the bandwagon. Microsoft & NVIDIA are working together to let developers bring new GenAI capabilities to Windows and web applications by providing easy API access to GPU-accelerated SLMs and RAG (retrieval-augmented generation).

NVIDIA states that RTX GPUs will accelerate these new AI capabilities, providing fast and responsive AI experiences on Windows-powered devices.

NVIDIA RTX AI Toolkit & NVIDIA AIM SDK Help Devs Make AI Experiences Faster & Better

The second update is the NVIDIA RTX AI Toolkit, which helps developers build application-specific AI models that can run on the PC. The RTX AI Toolkit will include a suite of tools and SDKs for model customization (QLoRA), optimization (TensorRT Model Optimizer), and deployment (TensorRT Cloud) on RTX AI PCs, and will be available in June.

With the new RTX AI Toolkit, developers will be able to deploy their models 4x faster and in 3x smaller packages, accelerating the roll-out process and bringing new experiences to users sooner. NVIDIA also showcases a comparison between a standard “General-Purpose” model and an RTX AI Toolkit optimized model. The GP model runs on an RTX 4090 and outputs 48 tokens/second while requiring 17 GB of VRAM. Meanwhile, the RTX AI Toolkit optimized model running on an RTX 4050 GPU outputs 187 tokens/second, an uplift of almost 4x, while requiring just 5 GB of VRAM.
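As a quick check of those figures, the short Python sketch below recomputes the throughput and memory ratios from the numbers quoted in the comparison. The variable names are illustrative only, and the two models run on different GPUs, so the ratios describe the showcased configurations rather than a like-for-like benchmark.

```python
# Figures as quoted in NVIDIA's comparison; the dict layout is illustrative.
general_purpose = {"gpu": "RTX 4090", "tokens_per_s": 48, "vram_gb": 17}
toolkit_optimized = {"gpu": "RTX 4050", "tokens_per_s": 187, "vram_gb": 5}

# How much faster the optimized model generates tokens.
throughput_gain = toolkit_optimized["tokens_per_s"] / general_purpose["tokens_per_s"]

# How much less VRAM the optimized model needs.
vram_reduction = general_purpose["vram_gb"] / toolkit_optimized["vram_gb"]

print(f"Throughput gain: {throughput_gain:.2f}x")
print(f"VRAM reduction: {vram_reduction:.2f}x")
```

This works out to roughly a 3.9x throughput gain and a 3.4x cut in VRAM, in line with the "almost 4x" and "3x smaller" claims.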

The RTX AI Toolkit is also being leveraged by software partners such as Adobe, Blackmagic Design, and Topaz who are integrating its components into some of the most popular creative apps.

There’s also the new NVIDIA AI Inference Manager (AIM) SDK being rolled out, which is a streamlined AI deployment toolkit for PC developers. AIM offers developers:

  • Unified Inference API for all backends (NIM, DML, TRT, etc.) and hardware (cloud, local GPU, etc.)
  • Hybrid orchestration across PC and cloud inference with PC capability check
  • Download and configure models and runtime environment on PC
  • Low-Latency integration into game pipeline
  • Simultaneous CUDA and graphics execution
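To illustrate the "hybrid orchestration with PC capability check" idea from the list above, here is a minimal Python sketch. It is not the AIM SDK API, which the article does not describe; every name here (`PCCapabilities`, `ModelRequirements`, `choose_target`) is hypothetical, and the check is reduced to a simple GPU and VRAM test purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PCCapabilities:
    """Hypothetical description of the local machine."""
    has_rtx_gpu: bool
    free_vram_gb: float


@dataclass
class ModelRequirements:
    """Hypothetical description of what a model needs to run locally."""
    name: str
    min_vram_gb: float


def choose_target(pc: PCCapabilities, model: ModelRequirements) -> str:
    """Return 'local' if the PC passes the capability check, else 'cloud'."""
    if pc.has_rtx_gpu and pc.free_vram_gb >= model.min_vram_gb:
        return "local"
    return "cloud"


# Example: the 5 GB optimized model from the comparison above fits on a
# PC with 6 GB of free VRAM, while the 17 GB general-purpose model does not.
laptop = PCCapabilities(has_rtx_gpu=True, free_vram_gb=6.0)
small_model = ModelRequirements(name="optimized", min_vram_gb=5.0)
large_model = ModelRequirements(name="general-purpose", min_vram_gb=17.0)

print(choose_target(laptop, small_model))
print(choose_target(laptop, large_model))
```

A real orchestrator would presumably weigh more than VRAM, such as latency targets, installed runtimes, and network availability, but the basic idea of checking the PC first and falling back to the cloud is the same.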

The NVIDIA AIM SDK is available in early access now and supports all major inference backends such as TensorRT, DirectML, Llama.cpp, and PyTorch CUDA across GPUs, CPUs, and NPUs.

NVIDIA ACE NIMs Are on Full Display At Computex, GenAI Digital Avatar Microservices Now Available For RTX AI PCs

Lastly, we have NVIDIA’s ACE NIMs which are debuting today. These new ACE Inference microservices reduce the deployment time for ACE models from weeks to minutes and run locally on PC devices for natural language understanding, speech synthesis, facial animation, and more.

NVIDIA will be showcasing the Covert Protocol tech demo, developed by Inworld AI, at Computex. Developers will also be showcasing their own ACE models at the event, such as Aww Inc’s Digital brand ambassador (Audio2Face), OutPalm’s Code Z (Audio2Face), Perfect World’s Multi-Lingual Demo (Audio2Face), Soulshell’s Social Engineering Demo (Audio2Face) & UneeQ’s Sophie (Audio2Face).


And it doesn’t stop there. NVIDIA has also announced that ACE (Avatar Cloud Engine) is now generally available for cloud, paving the way for the future of GenAI avatars. With these digital human microservices, you are getting the following suite of technologies:

  • NVIDIA Riva ASR, TTS and NMT — for automatic speech recognition,
    text-to-speech conversion and translation
  • NVIDIA Nemotron LLM — for language understanding and contextual
    response generation
  • NVIDIA Audio2Face — for realistic facial animation based on audio tracks
  • NVIDIA Omniverse RTX — for real-time, path-traced realistic skin and hair
  • NVIDIA Audio2Gesture — for generating body gestures based on audio
    tracks, available soon
  • NVIDIA Nemotron-3 4.5B — a new small language model (SLM)
    purpose-built for low-latency, on-device RTX AI PC inference

So as you can tell, NVIDIA unpacked a lot of interesting technologies and innovations within the AI PC segment, powered by its RTX GPUs and RTX platform. This shows NVIDIA’s leadership status in the AI industry and why it remains unrivaled.
