Intermediate Representation (IR)

The Timber IR is the central data structure connecting front-ends to back-ends. It provides a framework-agnostic representation of tree-based models.

Structure

A TimberIR is a pipeline of stages, where each stage represents a transformation:

@dataclass
class TimberIR:
    stages: list[Stage]
    metadata: dict

Stage Types

`TreeEnsembleStage`

The core stage — an ensemble of decision trees.

@dataclass
class TreeEnsembleStage:
    trees: list[Tree]
    n_features: int
    n_outputs: int
    objective: str           # "binary:logistic", "regression", "multiclass"
    base_score: float        # Bias term (in logit space for classification)
    feature_names: list[str]

`ScalerStage`

A feature scaler (from sklearn Pipeline).

@dataclass
class ScalerStage:
    means: list[float]       # Per-feature means
    scales: list[float]      # Per-feature standard deviations

The Pipeline Fusion optimizer pass absorbs this into tree thresholds.

`VotingEnsembleStage`

Weighted average of multiple sub-models.

@dataclass
class VotingEnsembleStage:
    sub_models: list[TimberIR]
    weights: list[float]
    voting: str              # "soft" or "hard"

`StackingEnsembleStage`

Stacking with a meta-learner.

@dataclass
class StackingEnsembleStage:
    base_models: list[TimberIR]
    meta_model: TimberIR
    passthrough: bool        # Also pass original features to meta-learner

Tree Representation

Each tree is a list of nodes:

@dataclass
class TreeNode:
    feature_index: int       # -1 for leaf nodes
    threshold: float         # Split threshold
    left_child: int          # Index of left child
    right_child: int         # Index of right child
    leaf_value: float        # Only valid for leaf nodes
    default_left: bool       # Missing value direction

Trees are stored in depth-first pre-order as flat arrays, enabling iterative traversal without recursion.

Design Decisions

Why a Pipeline of Stages?

Many real-world deployments use sklearn Pipeline(StandardScaler, GradientBoosting). By representing the scaler as an explicit stage, the optimizer can fuse it into the tree thresholds, eliminating preprocessing overhead.

Why Flat Arrays?

The flat array representation maps directly to C's static const arrays, enabling zero-copy code generation. No pointer chasing, no dynamic allocation, cache-friendly layout.

Why Double-Precision Accumulation?

Tree outputs are summed in double precision before final float cast. This prevents catastrophic cancellation when many small leaf values are summed, especially in models with hundreds of trees. The generated C code uses double sum = 0.0; for accumulation.

Structure​

Stage Types​

TreeEnsembleStage​

ScalerStage​

VotingEnsembleStage​

StackingEnsembleStage​

Tree Representation​

Design Decisions​

Why a Pipeline of Stages?​

Why Flat Arrays?​

Why Double-Precision Accumulation?​