Timber

ML inference compiler with hardware acceleration. Compile XGBoost, LightGBM, sklearn, CatBoost, and ONNX to C99, SIMD, GPU, or FPGA — with WCET analysis, safety certification, and supply-chain security built in.

Timber is an ahead-of-time (AOT) compiler that transforms trained machine learning models into optimized, self-contained inference code. No Python runtime at inference time. No dynamic allocation. No recursion. Just fast, auditable, portable code that runs on servers, edge devices, microcontrollers, and safety-critical systems.

The Problem

The typical production path for an XGBoost or scikit-learn model today:

Train model in Python, serialize to .json or .pkl
Wrap in a Flask or FastAPI endpoint
Deploy with the full Python runtime
Pay the overhead of CPython on every inference call

This means a 2 MB model needs a 200 MB Python process to serve it, with P99 latencies measured in milliseconds rather than microseconds.

The Solution

Timber treats your trained model as a program specification and compiles it:

Model Artifact → Parse → IR → Optimize (6 passes) → Emit → C99 → .so → Serve

The result:

Metric	Python Inference	Timber
P50 Latency	672 µs	2 µs (336× faster)
P99 Latency	13,468 µs	3 µs (4,489× faster)
Throughput	840/s	447,868/s (533× faster)
Artifact Size	~50 MB	48 KB (1,000× smaller)
Dependencies	Python + NumPy + XGBoost	None (C99 only)

Ollama-Style Workflow

Inspired by Ollama's developer experience for LLMs, Timber provides the same simplicity for classical ML:

pip install timber-compiler
timber load model.json --name fraud-detector
timber serve fraud-detector

curl http://localhost:11434/api/predict \
  -d '{"model": "fraud-detector", "inputs": [[1.0, 2.0, ...]]}'

What Timber Supports

Input Frameworks

XGBoost — JSON model dumps (v2.0+), XGBoost 3.1+ per-class base_score
LightGBM — Text model files
scikit-learn — Pickle files with 9 model types: GradientBoosting, RandomForest, IsolationForest, ExtraTrees, SVM (SVC/SVR/OneClass), Naive Bayes (Gaussian/Multinomial), GaussianProcessRegressor, KNN (Classifier/Regressor), Linear models, Pipelines
CatBoost — JSON exports (oblivious trees)
ONNX — TreeEnsemble, LinearClassifier/Regressor, SVMClassifier/Regressor, Normalizer, Scaler
URDF — Robot description files → forward kinematics; outputs 4×4 homogeneous transform; inputs are joint angles

Optimization Passes

Dead leaf elimination
Constant feature detection
Threshold quantization
Frequency-ordered branch sorting
Pipeline fusion (scaler → tree)
Vectorization analysis

Output Targets

Target	Description	Use Case
C99	Shared (`.so`) or static (`.a`) libraries	General deployment
LLVM IR	`.ll` text IR with configurable target triple	Hardware-specific optimization
WebAssembly	`.wat` + JS bindings	Browser and edge deployment
MISRA-C:2012	Safety-critical C with 8-rule compliance checker	Certified systems
SIMD	AVX2, AVX-512, ARM NEON/SVE, RISC-V V	Vectorized inference (2-8× speedup)
GPU	CUDA (sm75/sm86), Metal (Apple Silicon), OpenCL	Batch inference, edge GPUs
FPGA HLS	Xilinx Vitis, Intel SDK	Deterministic hardware acceleration
Embedded	Cortex-M4/M7, ESP32, STM32H7	Microcontrollers, no-heap, static buffers

Production Features

Deterministic JSON audit trails for regulatory compliance
Differential privacy inference — Laplace and Gaussian noise mechanisms
Differential compilation for incremental model updates
Ensemble composition (voting, stacking)
Multi-worker FastAPI server with GET /api/metrics (P50/P95/P99/P999 rolling window)
Thread-safe, zero-allocation generated code
WCET analysis — Worst-case execution time for Cortex-M4/M7, x86_64, AArch64, RISC-V
Safety certification — DO-178C (aviation), ISO 26262 (automotive), IEC 62304 (medical) reports
Supply chain security — Ed25519 signing, AES-256-GCM encryption, TPM 2.0 hooks
Deployment bundles — Air-gapped tar.gz with manifests and signatures
Native servers — C++ gRPC/HTTP, ROS 2 nodes, PX4 module skeletons
650+ test nuclear-grade test suite (core + acceleration + safety)

Next Steps

→ Getting Started — load and serve your first model in 60 seconds

The Problem​

The Solution​

Ollama-Style Workflow​

What Timber Supports​

Input Frameworks​

Optimization Passes​

Output Targets​

Production Features​

Next Steps​