Biapy's Bookmarks

ReactBench

https://www.reactbench.com/

ReactBench is an evaluation for coding agents on realistic React work. Models can pass every test in today’s benchmarks and still write React that fails in production. Tests verify behavior, but they miss React performance, accessibility, and quality issues.

ReactBench @ GitHub.

ai benchmark llm react web-service

Added 1 week ago

ClickBench

https://benchmark.clickhouse.com/

A Benchmark For Analytical DBMS.

This benchmark represents a typical workload in the following areas: clickstream and traffic analysis, web analytics, machine-generated data, structured logs, and events data. It covers the typical queries in ad-hoc analytics and real-time dashboards.

ClickBench @ GitHub.

Related contents:

Lies, Damn Lies and Database Benchmarks @ QuestDB.

benchmark cc-licensed database source-available

Added 1 month ago

BIRD-bench

https://bird-bench.github.io/

A BIg Bench for Large-Scale Relational Database Grounded Text-to-SQLs.

BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) represents a pioneering, cross-domain dataset that examines the impact of extensive database contents on text-to-SQL parsing. BIRD contains over 12,751 unique question-SQL pairs and 95 big databases with a total size of 33.4 GB. It also covers more than 37 professional domains, such as blockchain, hockey, healthcare, and education.

BIRD-SQL @ GitHub.

Related contents:

SQL Is Solved. Here's Where Chat-BI Still Breaks @ Ju Data Engineering Newsletter.

benchmark llm source-available sql text-to-sql

Added 4 months ago

GSO

https://gso-bench.github.io/index.html

Challenging Software Optimization Tasks for Evaluating SWE-Agents.

A benchmark for evaluating language models' capabilities in developing high-performance software.

GSO (Global Software Optimization) is a benchmark for evaluating language models' capabilities in developing high-performance software. We present 100+ challenging optimization tasks across 10 codebases spanning diverse domains and programming languages. Each task provides a codebase and a performance test as a precise specification; agents must optimize the codebase and are measured against expert developer commits.

GSO @ GitHub.

benchmark foss llm mit-licensed open-source

Added 10 months ago

minification benchmarks

https://github.com/privatenumber/minification-benchmarks

What's the best JavaScript minifier?

🏃‍♂️🏃‍♀️🏃 JS minification benchmarks: babel-minify, esbuild, terser, uglify-js, swc, google closure compiler, tdewolff/minify, oxc-minify

benchmark development javascript minify web-design

Added 10 months ago

BenchBase

https://db.cs.cmu.edu/projects/benchbase/

Multi-DBMS SQL Benchmarking Framework via JDBC.

BenchBase (formerly OLTPBench) is a Multi-DBMS SQL Benchmarking Framework via JDBC.

BenchBase @ GitHub.

Related contents:

Making Postgres 42,000x slower because I am unemployed @ ByteofDev.

apache2-licensed benchmark database foss open-source optimization performance postgresql

Added 1 year ago

BenchJS

https://benchjs.com/

JavaScript Benchmarking. Browser-based JavaScript benchmarking tool.

Run, compare, and share JavaScript benchmarks in your browser.

BenchJS @ GitHub.

benchmark development foss javascript open-source web

Added 1 year ago

Stabilizer

https://github.com/ccurtsinger/stabilizer

Statistically Sound Performance Evaluation.

Stabilizer is a system that enables the use of the powerful statistical techniques required for sound performance evaluation on modern architectures. Stabilizer forces executions to sample the space of memory configurations by repeatedly rerandomizing layouts of code, stack, and heap objects at runtime.

benchmark open-source optimization statistics

Added 1 year ago

tachometer

https://github.com/google/tachometer

Statistically rigorous benchmark runner for the web.

tachometer is a tool for running benchmarks in web browsers. It uses repeated sampling and statistics to reliably identify even tiny differences in runtime.
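As a rough illustration of the "repeated sampling and statistics" idea, the sketch below compares two sets of timing samples and reports the difference of their means with an approximate 95% confidence interval. It is not tachometer's actual implementation: the function names `mean_diff_ci` and `is_significant`, the normal approximation (z = 1.96), and the sample values are all assumptions made for this example.

```python
import math
import statistics


def mean_diff_ci(a, b, z=1.96):
    """Return (diff, low, high) for mean(b) - mean(a).

    Uses a normal approximation for the interval, which is an
    assumption of this sketch; small samples would call for a
    t-distribution instead.
    """
    mean_a = statistics.mean(a)
    mean_b = statistics.mean(b)
    # Standard error of the difference between two independent sample means.
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    diff = mean_b - mean_a
    return diff, diff - z * se, diff + z * se


def is_significant(low, high):
    """Treat a difference as real only if the interval excludes zero."""
    return low > 0 or high < 0


# Hypothetical timings in milliseconds for two variants of a page.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9]
candidate = [10.5, 10.7, 10.3, 10.6, 10.4]

diff, low, high = mean_diff_ci(baseline, candidate)
print(f"difference: {diff:.3f} ms, interval: [{low:.3f}, {high:.3f}]")
print("significant" if is_significant(low, high) else "inconclusive")
```

When the interval straddles zero, the usual response is to collect more samples rather than to declare the two variants equal.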

Improving rendering performance with CSS content-visibility @ Read the Tea Leaves.

benchmark development frontend javascript optimization web

Added 1 year ago

mitata

https://github.com/evanwashere/mitata

benchmark tooling that loves you ❤️

Mitata is a benchmark tooling library for JavaScript and C++ that offers accurate timing down to picoseconds, helpful visualizations, and features like automatic garbage collection and argument handling for benchmarks.

benchmark c++ command-line foss javascript nodejs open-source

Added 1 year ago

LLM Benchmark

https://llm.aidatatools.com/

Benchmark the throughput of local large language models (LLMs) running via Ollama.

llm-benchmark (ollama-benchmark) @ GitHub.

benchmark llm ollama open-source

Added 2 years ago

BrowserBench.org

https://browserbench.org/

Browser Benchmarks

Speedometer is a browser benchmark that measures the responsiveness of Web applications. It uses demo web applications to simulate user actions such as adding to-do items.

Speedometer 3.0: The Best Way Yet to Measure Browser Performance @ WebKit blog.

benchmark web web-browser web-service

Added 2 years ago

hyperfine

https://github.com/sharkdp/hyperfine

A command-line benchmarking tool.

Related contents:

Episode 636: Engineering the Future @ Linux Unplugged.

apache2-licensed benchmark command-line foss mit-licensed open-source optimization shell terminal

Added 3 years ago

HASTY

https://hasty.dev/

JS performance - Dev tool. Benchmark your JS snippets for an optimized performance.

benchmark development javascript optimization performance

Added 3 years ago

UserBenchmark

https://www.userbenchmark.com/

UserBenchmark: speed test your PC in less than a minute.

benchmark comparison hardware web-service

Added 3 years ago

OpenBenchmarking.org

http://openbenchmarking.org/

An Open, Collaborative Testing Platform For Benchmarking & Performance Analysis

benchmark foss performance

Added 10 years ago

Human Benchmark

http://www.humanbenchmark.com/dashboard

Test your human brain processing capacities.

benchmark web-service

Added 10 years ago

Tsung

http://tsung.erlang-projects.org/

Tsung is a high-performance benchmark framework for various protocols including HTTP, XMPP, LDAP, etc.

Tsung @ GitHub.

Related contents:

Running performance tests on your website with Tsung (Réaliser des tests de performances de son site web avec Tsung) @ L'admin sous GNU / Linux :fr:.

benchmark erlang foss load-testing open-source

Added 15 years ago