Search: [nvidia] - Biapy Web Directory

NVIDIA Dynamo https://developer.nvidia.com/dynamo

Wed Mar 19 14:08:23 2025

A Datacenter Scale Distributed Inference Serving Framework.

NVIDIA Dynamo is a high-throughput, low-latency inference framework designed for serving generative AI and reasoning models in multi-node distributed environments. Dynamo is designed to be inference-engine agnostic (it supports TRT-LLM, vLLM, SGLang, and others) and captures LLM-specific capabilities.

Dynamo @ GitHub.

Related contents:

A closer look at Dynamo, Nvidia's 'operating system' for AI inference @ The Register.

GPU Glossary https://modal.com/gpu-glossary/readme

Wed Jan 15 13:29:29 2025

We wrote this glossary to solve a problem we ran into working with GPUs here at Modal: the documentation is fragmented, making it difficult to connect concepts at different levels of the stack, like Streaming Multiprocessor Architecture, Compute Capability, and nvcc compiler flags.

exo https://github.com/exo-explore/exo

Wed Oct 16 14:06:00 2024

Run your own AI cluster at home with everyday devices

Forget expensive NVIDIA GPUs, unify your existing devices into one powerful GPU: iPhone, iPad, Android, Mac, Linux, pretty much any device!

TensorRT SDK https://developer.nvidia.com/tensorrt

Wed Sep 25 08:33:40 2024

NVIDIA TensorRT is an ecosystem of APIs for high-performance deep learning inference. TensorRT includes an inference runtime and model optimizations that deliver low latency and high throughput for production applications. The TensorRT ecosystem includes TensorRT, TensorRT-LLM, TensorRT Model Optimizer, and TensorRT Cloud.

TensorRT Open Source Software @ GitHub.

nvitop https://github.com/XuehaiPan/nvitop

Fri Mar 17 16:22:27 2023

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Headless Steam Service https://github.com/Steam-Headless/docker-steam-headless

Wed Mar 15 08:42:55 2023

A Headless Steam Docker image supporting NVIDIA GPU and accessible via Web UI.
Play your games in the browser with audio. Connect another device and use it with Steam Remote Play. Easily deploy a Steam Docker instance in seconds.

vramfs https://github.com/Overv/vramfs

Sun Dec 14 13:33:43 2014

Unused RAM is wasted RAM, so why not put some of that VRAM in your graphics card to work?

vramfs is a utility that uses the FUSE library to create a file system in VRAM. The idea is pretty much the same as a ramdisk, except that it uses the video RAM of a discrete graphics card to store files. It is not intended for serious use, but it does actually work fairly well, especially since consumer GPUs with 4GB or more VRAM are now available.

On the developer's system, the continuous read performance is ~2.4 GB/s and write performance 2.0 GB/s, which is about 1/3 of what is achievable with a ramdisk. That is already decent enough for a device not designed for large data transfers to the host, but future development should aim to get closer to the PCI-e bandwidth limits. See the benchmarks section for more info.
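
For readers who want to compare their own numbers with the figures quoted above, here is a minimal sketch of a sequential read/write throughput check. It is not the vramfs project's benchmark; the function name `measure_throughput` and its parameters are hypothetical, and it only uses the Python standard library. Note that on an ordinary disk-backed directory the read pass may be served from the operating system's page cache, so the read figure is only meaningful for the mount you actually want to test.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=64, chunk_mb=4):
    """Return (write_gb_per_s, read_gb_per_s) for sequential I/O in `directory`.

    Hypothetical helper for illustration; not part of vramfs.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # incompressible test data
    n_chunks = max(1, size_mb // chunk_mb)
    total_bytes = n_chunks * len(chunk)

    # Create a scratch file inside the directory under test.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Sequential write, flushed to the file system before the timer stops.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        write_s = time.perf_counter() - start

        # Sequential read of the same file, one chunk at a time.
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(len(chunk)):
                pass
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)

    gb = total_bytes / 1e9
    return gb / write_s, gb / read_s


if __name__ == "__main__":
    # Replace the temp directory with the mount point you want to measure.
    write_rate, read_rate = measure_throughput(tempfile.gettempdir())
    print(f"write: {write_rate:.2f} GB/s, read: {read_rate:.2f} GB/s")
```

Running the same function against a ramdisk and against a VRAM-backed mount gives a rough way to check the "about 1/3 of a ramdisk" ratio on your own hardware.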
