Quick start

Run the Benchmark

From download to first result in under 10 minutes.

01
Download and extract the kit

Download the benchmark kit from your unlock page and transfer it to your Linux host.

```shell
scp dataforge-benchmark-kit.tar.gz user@host:~/
ssh user@host
tar -xzf dataforge-benchmark-kit.tar.gz
cd dataforge-benchmark-kit/
```
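
Before extracting, you can verify the archive survived the transfer; a truncated upload makes `tar` fail here instead of half-extracting:

```shell
# List the archive's contents without extracting; a corrupt or truncated
# transfer produces a tar error at this step.
kit=dataforge-benchmark-kit.tar.gz
if [ -f "$kit" ]; then
  tar -tzf "$kit" > /dev/null && echo "archive OK"
fi
```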
02
Set permissions

```shell
chmod +x bench bench-report scripts/bench_setup.sh scripts/bench_suite.sh
```
03
Run the environment setup

The setup script fingerprints your hardware and database environment. It saves a JSON snapshot used later to contextualize results.

```shell
./scripts/bench_setup.sh \
  --conn "postgres://user:pass@localhost/benchdb" \
  --device nvme0n1 \
  --netif eth0
```

Replace the `--device` value with your block device name (e.g. `sda`, `dm-3`, or `nvme0n1`); run `lsblk` to identify it.
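
To pick the right device, list block devices with their mountpoints; the Postgres data path below is an assumption — substitute wherever your database actually stores its files:

```shell
# Show disks with size, type, and mountpoints; pick the device backing your DB.
lsblk -d -o NAME,SIZE,TYPE,MOUNTPOINT 2>/dev/null || true

# Resolve the filesystem source backing a specific path (illustrative path).
df --output=source /var/lib/postgresql 2>/dev/null | tail -n 1
```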

04
Run the full benchmark suite

The suite runs DataForge at increasing concurrency levels (1, 2, 5, 10, 20, 50). Each level ingests the full dataset and records metrics. Allow 15–30 minutes for the full run.

```shell
./scripts/bench_suite.sh \
  --url http://localhost:8080 \
  --key YOUR_API_KEY \
  --conn "postgres://user:pass@localhost/benchdb" \
  --dir /path/to/csv/files \
  --prefix citation-map \
  --device nvme0n1
```

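The ladder the suite walks can be pictured as a simple loop. This is an illustrative sketch, not the real bench_suite.sh; `run_level` stands in for one full ingest at a given concurrency:

```shell
#!/bin/sh
# Illustrative concurrency ladder: one ingest per level, with timing recorded.
run_level() {
  level="$1"
  start=$(date +%s)
  :   # placeholder for the real ingest at this concurrency level
  end=$(date +%s)
  printf 'level=%s elapsed=%ss\n' "$level" "$((end - start))"
}

for level in 1 2 5 10 20 50; do
  run_level "$level"
done
```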
05
Generate the report

bench-report reads the JSON output from the suite, runs live Stage 3 database validation, and generates a self-contained HTML report.

```shell
./bench-report \
  --results-dir bench_results_YYYYMMDD_HHMMSS \
  --conn "postgres://user:pass@localhost/benchdb" \
  --prefix citation-map \
  --count 50 \
  --expected-rows 1516282 \
  --out my-bench-report.html \
  --tag "My Infrastructure — 2026-03-26"
```

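If you are unsure what to pass for `--expected-rows`, you can derive it from the source CSVs. This sketch assumes each file carries exactly one header line; adjust if yours are headerless:

```shell
# Sum data rows across all CSVs, subtracting one header line per file.
total=0
for f in /path/to/csv/files/*.csv; do
  [ -f "$f" ] || continue
  rows=$(($(wc -l < "$f") - 1))
  total=$((total + rows))
done
echo "expected rows: $total"
```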
06
Open the report

The generated HTML file is self-contained. Open it in any browser. It includes your environment fingerprint, concurrency ladder results, hardware pressure metrics, and Stage 3 data validation.

```shell
xdg-open my-bench-report.html   # on macOS, use: open my-bench-report.html
```

Baseline runs on properly configured systems consistently show 800K–2.5M rows/sec depending on hardware class, and results are reproducible across runs.

Submit your results

Results can be submitted via `POST /api/benchmark-runs` for inclusion in the public benchmark record. The `bench-report` tool includes a submit flag that handles this automatically; see the JSON format reference for the payload structure.
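
A submission could also be made by hand with curl. The endpoint path comes from the docs above, but the header names and payload file below are assumptions — consult the JSON format reference for the real structure:

```shell
# Hypothetical manual submission helper; adjust headers and payload to the
# documented format before using against a real server.
submit_results() {
  base_url="$1" api_key="$2" payload="$3"
  curl -sS -X POST "$base_url/api/benchmark-runs" \
    -H "Authorization: Bearer $api_key" \
    -H "Content-Type: application/json" \
    --data "@$payload"
}
```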