A Systems View of LLMs on TPUs.
This book aims to demystify the art of scaling LLMs on TPUs. We try to explain how TPUs work, how LLMs actually run at scale, and how to pick parallelism schemes during training and inference that avoid communication bottlenecks.
Open Universal Machine Intelligence.
E2E Foundation Model Research Platform.
Everything you need to build state-of-the-art foundation models, end-to-end.
Oumi is a fully open-source platform that streamlines the entire lifecycle of foundation models, from data preparation and training to evaluation and deployment. Whether you're developing on a laptop, launching large-scale experiments on a cluster, or deploying models in production, Oumi provides the tools and workflows you need.
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models. Using pretrained models can reduce your compute costs and carbon footprint, and save you the time and resources required to train a model from scratch.
Related content:
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator.
ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc.
Related content:
Easy, fast, and cheap LLM serving for everyone.
vLLM is a fast and easy-to-use library for LLM inference and serving.
Originally developed in the Sky Computing Lab at UC Berkeley, vLLM has evolved into a community-driven project with contributions from both academia and industry.
Related content:
Open Repository of Web Crawl Data.
Common Crawl maintains a free, open repository of web crawl data that can be used by anyone.
Related content:
15 trillion tokens of the finest data the web has to offer.
The FineWeb dataset consists of more than 15T tokens of cleaned and deduplicated English web data from CommonCrawl. The data processing pipeline is optimized for LLM performance and ran on datatrove, our large-scale data processing library.
FineWeb was originally meant to be a fully open replication of RefinedWeb, with a release of the full dataset under the ODC-By 1.0 license. However, by carefully adding additional filtering steps, we managed to push the performance of FineWeb well above that of the original RefinedWeb, and models trained on our dataset also outperform models trained on other commonly used high-quality web datasets (like C4, Dolma-v1.6, The Pile, SlimPajama, RedPajama2) on our aggregate group of benchmark tasks.
Related content:
Generative AI platform for intelligent accounting. The preferred partner of accounting leaders.
Related content:
The Annual Conference on Neural Information Processing Systems.
Related content:
Run AI with an API.
Run and fine-tune open-source models. Deploy custom models at scale. All with one line of code.
Thousands of models contributed by our community.
All the latest open-source models are on Replicate.
They’re not just demos — they all actually work and have production-ready APIs.
AI shouldn’t be locked up inside academic papers and demos. Make it real by pushing it to Replicate.
Related content:
Groq is Fast AI Inference.
Related content:
Discover, download, and run local LLMs.
Related content:
Explore Agent Recipes
Explore common agent recipes with ready-to-copy code to improve your LLM applications.
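To give a flavor of what such a recipe looks like, here is a minimal tool-calling agent loop: the model picks a tool, the tool runs, and the result is fed back until the model answers. This is a hedged sketch, not code from Agent Recipes itself; the model is a stub standing in for a real LLM API call, and all names (`stub_model`, `run_agent`, the `calculator` tool) are illustrative.

```python
# Minimal agent loop sketch: model -> tool -> observation -> model.
# The "model" here is a stub; a real recipe would call an LLM API.

def calculator(expr: str) -> str:
    # Toy tool: evaluate a basic arithmetic expression safely-ish.
    return str(eval(expr, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def stub_model(history):
    # Stand-in for an LLM: requests the calculator once, then answers
    # with the tool's result. Returns (kind, tool_name, payload).
    if not any(role == "tool" for role, _ in history):
        return ("tool", "calculator", "2 + 3")
    return ("answer", None, history[-1][1])

def run_agent(question: str) -> str:
    history = [("user", question)]
    for _ in range(5):  # cap iterations to avoid infinite loops
        kind, tool, payload = stub_model(history)
        if kind == "answer":
            return payload
        result = TOOLS[tool](payload)      # run the requested tool
        history.append(("tool", result))   # feed observation back
    return "no answer"

print(run_agent("What is 2 + 3?"))  # prints 5
```

Real recipes vary mainly in what `stub_model` is replaced with (a chat-completions call with tool schemas) and how many tools are registered; the loop structure stays the same.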
Related content:
Stable Point Aware 3D (SPAR3D) can make real-time edits and create the complete structure of a 3D object from a single image in a few seconds. SPAR3D combines the strengths of point-cloud diffusion (probabilistic) and mesh regression (deterministic) to improve detail in the back regions of the object that are not visible in the input image.
Related content:
structured-logprobs is an open-source Python library that enhances OpenAI's structured outputs by providing detailed information about token log probabilities.
This library is designed to offer valuable insights into the reliability of an LLM's structured outputs. It works with OpenAI's Structured Outputs, a feature that ensures the model consistently generates responses adhering to a supplied JSON Schema. This eliminates concerns about missing required keys or hallucinating invalid values.
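The core idea can be sketched with the standard library alone: the joint probability of the tokens that make up a field is the exponential of the sum of their log probabilities, which gives a per-field confidence score. This is a hedged illustration of the underlying math, not the structured-logprobs API; the function name and the example logprob values are hypothetical.

```python
import math

def field_confidence(token_logprobs):
    """Joint probability of a field's tokens, given per-token logprobs.

    Summing logprobs multiplies the underlying probabilities; exp()
    maps the joint log probability back to a value in (0, 1].
    """
    return math.exp(sum(token_logprobs))

# Hypothetical logprobs for the tokens spelling one JSON field's value.
city_logprobs = [-0.01, -0.05, -0.002]
conf = field_confidence(city_logprobs)
assert 0.0 < conf <= 1.0  # a valid probability
```

A library like structured-logprobs does the harder part on top of this: mapping each returned token back to the JSON key it belongs to, so the aggregation happens per field rather than over the whole response.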
Open weights LLM for French, English, German, Spanish and Italian.
Related content:
Open, high-performing generative LLMs.
The OpenLLM France Consortium brings together 17 members, formed in the wake of the creation of the OpenLLM France community, which today federates an ecosystem of nearly 200 entities (public research laboratories, potential data providers, specialized technology players, use-case providers...). These members have been exchanging publicly and transparently since the beginning of summer 2023 on the community's Discord server.