Priori

A modular Rust workspace for Bayesian inference with symbolic models, autodiff, MCMC, MAP optimization, diagnostics, simulation-based calibration, and trace/data I/O.

Priori was written as a modern Rust alternative to Rainier from Stripe for the same broad class of Bayesian workflows.


Priori is pre-1.0 and currently organized as a focused workspace of crates rather than a single monolith.

Why Priori

  • Immutable symbolic DAG for real-valued model expressions
  • Reverse-mode symbolic autodiff over model graphs
  • HMC, eHMC, and NUTS sampling backends
  • L-BFGS-based MAP optimization
  • Diagnostics including R-hat, ESS, and BFMI summaries
  • Simulation-based calibration tooling
  • CSV, TSV, JSON, JSONL, Parquet, and Arrow/IPC data support
  • CLI workflows for fit, diagnose, predict, and sbc

From Rainier to Priori

  • Rainier is Stripe's Scala/JVM library for Bayesian inference over fixed-structure, continuous-parameter generative models, with a static computation graph and gradient-based inference. Its overview and paper are published at rainier.fit/docs/intro and rainier.fit/docs/probprog.
  • Priori follows that design lineage in Rust: symbolic graphs, reverse-mode autodiff, and HMC-family samplers for the same general class of models.
  • Priori expands the surface area with a modular crate layout, eHMC and NUTS backends, MAP optimization, diagnostics, simulation-based calibration, and modern tabular data/trace I/O.
  • The workspace includes Rainier parity checks and Rainier-based SBC regression tests to keep the comparison concrete.
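For readers new to the HMC family mentioned above, the shared building block is the leapfrog integrator, which simulates Hamiltonian dynamics using the gradient of the negative log-density. The following is a minimal one-dimensional sketch under simplifying assumptions (unit mass, a standard normal target); `leapfrog` and `hamiltonian` are hypothetical helpers, not Priori's sampler API.

```rust
/// Leapfrog integration of Hamiltonian dynamics for a 1-D position `q` and
/// momentum `p` with unit mass. `grad_u` is the gradient of the potential
/// energy U(q) = -log density(q).
fn leapfrog(
    mut q: f64,
    mut p: f64,
    grad_u: impl Fn(f64) -> f64,
    eps: f64,
    steps: usize,
) -> (f64, f64) {
    if steps == 0 {
        return (q, p);
    }
    // Initial half step for the momentum.
    p -= 0.5 * eps * grad_u(q);
    for i in 0..steps {
        // Full step for the position.
        q += eps * p;
        // Full momentum step, except a half step at the very end.
        let scale = if i + 1 == steps { 0.5 } else { 1.0 };
        p -= scale * eps * grad_u(q);
    }
    (q, p)
}

/// Total energy for a standard normal target: U(q) = q^2 / 2, K(p) = p^2 / 2.
fn hamiltonian(q: f64, p: f64) -> f64 {
    0.5 * q * q + 0.5 * p * p
}

fn main() {
    let grad_u = |q: f64| q; // gradient of q^2 / 2
    let (q0, p0) = (1.0, 0.0);
    let (q1, p1) = leapfrog(q0, p0, grad_u, 0.1, 20);
    // The energy error stays small, which keeps acceptance rates high.
    println!(
        "energy before = {:.6}, after = {:.6}",
        hamiltonian(q0, p0),
        hamiltonian(q1, p1)
    );
}
```

Two properties matter for samplers built on this step: the energy error stays bounded for a reasonable step size, and the map is reversible (integrating forward, flipping the momentum, and integrating again returns to the start).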

Sample Use Cases

1. Estimate a latent mean from noisy measurements

Use Priori when you have repeated observations of the same quantity and want a posterior over the unknown mean instead of a single point estimate.

cargo run -p priori-cli -- fit \
  --data /path/to/train.csv \
  --response y \
  --model "normal(mu, 1)" \
  --prior "mu ~ normal(0, 10)" \
  --chains 4 \
  --iterations 500 \
  --warmup 200 \
  --output /path/to/trace.csv \
  --summary

Good fits:

  • Sensor or measurement calibration
  • Baseline-rate estimation
  • Small Bayesian building blocks inside a larger workflow
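This model is conjugate, so the sampler output can be sanity-checked against a closed-form answer. The sketch below computes the exact posterior for `mu` under `normal(mu, 1)` observations and a `normal(0, 10)` prior, treating the second argument as a standard deviation. The function name and the four data points are illustrative placeholders, not part of Priori.

```rust
/// Closed-form posterior for `mu` when y_i ~ Normal(mu, sigma) with known
/// `sigma` and prior mu ~ Normal(m0, s0). Returns (posterior mean, posterior sd).
fn normal_mean_posterior(data: &[f64], sigma: f64, m0: f64, s0: f64) -> (f64, f64) {
    let n = data.len() as f64;
    let sum: f64 = data.iter().sum();
    // Precisions (inverse variances) add under conjugacy.
    let prior_precision = 1.0 / (s0 * s0);
    let data_precision = n / (sigma * sigma);
    let post_precision = prior_precision + data_precision;
    // Precision-weighted average of the prior mean and the data.
    let post_mean = (prior_precision * m0 + sum / (sigma * sigma)) / post_precision;
    (post_mean, post_precision.sqrt().recip())
}

fn main() {
    let data = [1.0, 2.0, 3.0, 2.5];
    let (mean, sd) = normal_mean_posterior(&data, 1.0, 0.0, 10.0);
    // prints: posterior mean = 2.1197, sd = 0.4994
    println!("posterior mean = {mean:.4}, sd = {sd:.4}");
}
```

A fitted trace for the same data should report a mean and standard deviation close to these values, up to Monte Carlo error.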

2. Fit Bayesian linear models on tabular data

Use Priori for small-to-medium regression problems where you want uncertainty over coefficients and predictions, not just least-squares estimates.

cargo run -p priori-cli -- fit \
  --data /path/to/linreg.csv \
  --response y \
  --model "normal(alpha + beta * X, 1)" \
  --prior "alpha ~ normal(0, 10)" \
  --prior "beta ~ normal(0, 10)" \
  --chains 2 \
  --iterations 300 \
  --warmup 150 \
  --output /path/to/trace.parquet

Good fits:

  • Demand or pricing models
  • Risk-score coefficient estimation
  • Product or marketing lift analysis with uncertainty intervals

3. Standardize trace diagnostics in a batch pipeline

Use Priori when your workflow needs machine-readable trace files plus a fast convergence check step.

cargo run -p priori-cli -- diagnose --trace /path/to/trace.parquet

This is useful when you want to gate downstream analysis on diagnostics like R-hat, ESS, and BFMI instead of assuming every sampler run converged.
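As an illustration of what such a gate checks, here is a sketch of the classic Gelman-Rubin R-hat for one parameter across several equal-length chains. It assumes the basic (non-split, non-rank-normalized) formula; Priori's own diagnostics may use a different variant, and both `r_hat` and `passes_gate` are hypothetical helpers. The 1.01 threshold is a common convention rather than anything taken from Priori.

```rust
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Unbiased sample variance (divides by n - 1).
fn sample_variance(xs: &[f64]) -> f64 {
    let m = mean(xs);
    xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() as f64 - 1.0)
}

/// Classic Gelman-Rubin potential scale reduction factor.
/// Expects at least two chains of equal length, each with at least two draws.
fn r_hat(chains: &[Vec<f64>]) -> f64 {
    let n = chains[0].len() as f64;
    let chain_means: Vec<f64> = chains.iter().map(|c| mean(c)).collect();
    let chain_vars: Vec<f64> = chains.iter().map(|c| sample_variance(c)).collect();
    // W: average within-chain variance. B: between-chain variance.
    let w = mean(&chain_vars);
    let b = n * sample_variance(&chain_means);
    // Pooled estimate of the posterior variance.
    let var_plus = (n - 1.0) / n * w + b / n;
    (var_plus / w).sqrt()
}

/// Gate a pipeline step on convergence.
fn passes_gate(chains: &[Vec<f64>], threshold: f64) -> bool {
    r_hat(chains) < threshold
}

fn main() {
    let agreeing = vec![vec![1.0, 2.0, 3.0, 4.0], vec![1.5, 2.5, 3.5, 2.0]];
    let disagreeing = vec![vec![1.0, 2.0, 3.0, 4.0], vec![11.0, 12.0, 13.0, 14.0]];
    println!("agreeing chains:    r_hat = {:.3}", r_hat(&agreeing));
    println!("disagreeing chains: r_hat = {:.3}", r_hat(&disagreeing));
    println!("gate passes for disagreeing chains: {}", passes_gate(&disagreeing, 1.01));
}
```

Chains that explore different regions inflate the between-chain term, pushing R-hat well above 1 and failing the gate.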

4. Validate new models or sampler changes with simulation-based calibration

Use Priori's SBC tooling when you are developing a model family, checking inference correctness, or regression-testing sampler changes.

cargo run -p priori-cli -- sbc \
  --model "normal(mu, 1)" \
  --prior "mu ~ normal(0, 1)" \
  --repetitions 100 \
  --synthetic-samples 100 \
  --iterations 200 \
  --warmup 100 \
  --chains 2 \
  --bins 10 \
  --output /path/to/sbc.json

Good fits:

  • Model validation before production use
  • Sampler regression tests
  • CI checks for probabilistic code changes
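The idea behind SBC can be sketched without any sampler at all. The example below runs the rank procedure on the same `normal(mu, 1)` model with a `normal(0, 1)` prior, but substitutes exact conjugate posterior draws for MCMC so that it stays dependency-free. Everything here is an assumption for illustration: `Rng` is a small SplitMix64 generator, and `sbc_rank_histogram` is a hypothetical helper, not Priori's SBC implementation.

```rust
/// Tiny deterministic generator (SplitMix64) so the sketch needs no crates.
struct Rng(u64);

impl Rng {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }

    /// Strictly positive uniform draw, at most 1.
    fn uniform(&mut self) -> f64 {
        ((self.next_u64() >> 11) as f64 + 0.5) / (1u64 << 53) as f64
    }

    /// Normal draw via the Box-Muller transform.
    fn normal(&mut self, mean: f64, sd: f64) -> f64 {
        let (u1, u2) = (self.uniform(), self.uniform());
        mean + sd * (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// For each repetition: draw mu from the prior, simulate data, draw from the
/// posterior, and record the rank of the true mu among the posterior draws.
fn sbc_rank_histogram(
    reps: usize,
    n_obs: usize,
    draws: usize,
    bins: usize,
    seed: u64,
) -> Vec<usize> {
    let mut rng = Rng(seed);
    let mut hist = vec![0usize; bins];
    for _ in 0..reps {
        let mu_true = rng.normal(0.0, 1.0);
        let sum: f64 = (0..n_obs).map(|_| rng.normal(mu_true, 1.0)).sum();
        // Exact posterior for a Normal(0, 1) prior and unit-variance likelihood.
        let post_var = 1.0 / (n_obs as f64 + 1.0);
        let post_mean = sum * post_var;
        let post_sd = post_var.sqrt();
        let rank = (0..draws)
            .filter(|_| rng.normal(post_mean, post_sd) < mu_true)
            .count();
        // Ranks lie in 0..=draws; map them onto `bins` equal-width buckets.
        hist[rank * bins / (draws + 1)] += 1;
    }
    hist
}

fn main() {
    let hist = sbc_rank_histogram(1000, 100, 99, 10, 42);
    // With correct inference the bin counts should look roughly uniform.
    println!("{hist:?}");
}
```

When inference is correct, the ranks are uniformly distributed, so a visibly skewed or U-shaped histogram points to a problem in the model or the sampler.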

5. Embed Bayesian inference directly in a Rust application

Use the library API when you want inference inside a Rust service, CLI, or offline job instead of shelling out to an external tool.

use priori::core::{ParameterId, Real};
use priori::dist::{Distribution, Normal};
use priori::model::{Generator, Model};
use priori::sampler::SamplerConfig;

let data = vec![1.0, 2.0, 3.0, 2.5];
let mu = Real::parameter(ParameterId(0));
let prior = Normal::new(Real::scalar(0.0), Real::scalar(10.0)).log_density_at(mu.clone());
let likelihood = Normal::new(mu.clone(), Real::scalar(1.0));

let model = Model::new()
    .observe(&data, &likelihood)
    .observe_with(&[0.0], |_| prior.clone());

let trace = model.sample(&SamplerConfig::default(), 4);
let mu_draws = trace.predict(&Generator::real(mu));

Quick Start

Requirements:

  • Rust 1.85+
  • Cargo

Build and verify from the repository root:

cargo fmt --all --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace

Inspect the CLI:

cargo run -p priori-cli -- --help

Consume the facade crate directly from GitHub:

[dependencies]
priori = { git = "https://github.com/winfunc/priori" }

Workspace Layout

Crate                    Purpose
crates/priori-core       Symbolic real expressions, bounds, and evaluation
crates/priori-autodiff   Reverse-mode gradient derivation over symbolic graphs
crates/priori-dist       Continuous and discrete distributions plus support transforms
crates/priori-sampler    Samplers, warmup adaptation, and chain orchestration
crates/priori-optim      MAP estimation and L-BFGS utilities
crates/priori-model      Model assembly, compiled densities, traces, and SBC helpers
crates/priori-io         DataFrame loading and prediction/trace serialization
crates/priori-diag       Diagnostics summaries and simple visual outputs
crates/priori-cli        CLI executable
crates/priori            Unified public facade crate

CLI Quick Start

Fit a model

cargo run -p priori-cli -- fit \
  --data /path/to/train.csv \
  --response y \
  --model "normal(mu, 1)" \
  --prior "mu ~ normal(0, 10)" \
  --chains 4 \
  --iterations 500 \
  --warmup 200 \
  --seed 2026 \
  --output /path/to/trace.csv \
  --summary

Example summary output:

name      mean      std       hdi_low   hdi_high
mu          2.1700   0.4100    1.4400    2.9900

Fit a simple linear regression

cargo run -p priori-cli -- fit \
  --data /path/to/linreg.csv \
  --response y \
  --model "normal(alpha + beta * X, 1)" \
  --prior "alpha ~ normal(0, 10)" \
  --prior "beta ~ normal(0, 10)" \
  --chains 2 \
  --iterations 300 \
  --warmup 150 \
  --seed 7 \
  --output /path/to/trace.parquet

Diagnose a saved trace

cargo run -p priori-cli -- diagnose --trace /path/to/trace.csv

Example output:

param r_hat ess
    0 1.001   245.3
    1 1.005   198.8

Re-export a saved trace

The predict command currently normalizes a saved trace into Priori's prediction payload and writes it in another format. It does not yet generate posterior predictive quantities from a separate model specification.

cargo run -p priori-cli -- predict \
  --trace /path/to/trace.csv \
  --output /path/to/predictions.json

Run simulation-based calibration

cargo run -p priori-cli -- sbc \
  --model "normal(mu, 1)" \
  --prior "mu ~ normal(0, 1)" \
  --repetitions 100 \
  --synthetic-samples 100 \
  --iterations 200 \
  --warmup 100 \
  --chains 2 \
  --bins 10 \
  --seed 42 \
  --output /path/to/sbc.json

CLI Model Syntax

Model form

--model must use the form:

distribution(arg1, arg2, ...)

Supported CLI model distributions:

  • normal(mu, sigma)
  • poisson(lambda)
  • bernoulli(p)
  • binomial(p, n) where n must be a non-negative integer literal
  • gamma(alpha, beta)
  • exponential(rate)
  • lognormal(mu, sigma)
  • laplace(mu, sigma)
  • cauchy(mu, sigma)
  • studentt(nu, mu, sigma) or student_t(nu, mu, sigma)

Supported expression features inside model arguments:

  • Operators: +, -, *, /
  • Functions: exp(...), log(...)
  • Identifiers: prior-defined parameters and input-data column names

Prior form

Each prior must use the form:

parameter ~ distribution(numeric_args...)

Repeat --prior for multi-parameter models.

Supported CLI prior distributions:

  • normal
  • cauchy
  • laplace
  • studentt or student_t
  • gamma
  • exponential
  • beta
  • lognormal or log_normal
  • uniform
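To illustrate the prior grammar, the following sketch splits a `parameter ~ distribution(numeric_args...)` string into its parts and rejects non-numeric arguments. It is not Priori's parser: `PriorSpec` and `parse_prior` are hypothetical names, and the sketch does not check the distribution name against the supported list.

```rust
#[derive(Debug, PartialEq)]
struct PriorSpec {
    parameter: String,
    distribution: String,
    args: Vec<f64>,
}

/// Parse a string such as "mu ~ normal(0, 10)".
fn parse_prior(spec: &str) -> Result<PriorSpec, String> {
    let (name, rhs) = spec
        .split_once('~')
        .ok_or("expected `parameter ~ distribution(args)`")?;
    let name = name.trim();
    if name.is_empty() {
        return Err("empty parameter name".to_string());
    }
    let (dist, rest) = rhs.trim().split_once('(').ok_or("missing `(`")?;
    let inner = rest.strip_suffix(')').ok_or("missing closing `)`")?;
    // Arguments must be numeric literals, so anything else is an error.
    let args = inner
        .split(',')
        .map(|a| {
            a.trim()
                .parse::<f64>()
                .map_err(|e| format!("bad argument `{}`: {e}", a.trim()))
        })
        .collect::<Result<Vec<_>, _>>()?;
    Ok(PriorSpec {
        parameter: name.to_string(),
        distribution: dist.trim().to_string(),
        args,
    })
}

fn main() {
    println!("{:?}", parse_prior("mu ~ normal(0, 10)"));
    // A symbolic argument is rejected, mirroring the numeric-literal rule.
    println!("{:?}", parse_prior("beta ~ normal(0, sigma)"));
}
```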

Input and Output Formats

Input

--data and --trace accept:

  • .csv
  • .tsv
  • .json arrays of records
  • .jsonl or .ndjson
  • .parquet
  • .arrow or .ipc
  • - for stdin with format auto-detection

Output

fit --output and predict --output support:

  • .csv
  • .json
  • .parquet
  • .arrow or .ipc

Output semantics:

  • .csv, .parquet, .arrow, and .ipc contain tabular samples
  • .json contains a nested payload with both samples and summaries
  • fit and predict include __chain and __draw columns in tabular outputs

Trace-aware commands recognize these metadata aliases:

  • Chain: __chain, chain, chain_id
  • Draw: __draw, draw, sample, iteration
  • Warmup marker: __warmup, warmup, is_warmup
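If you produce trace files with another tool, a lookup along these lines shows how the aliases can be resolved against a file header. This is a sketch under assumptions, not Priori's loader: matching here is exact and case-sensitive, earlier aliases take priority, and `find_column` and `parameter_columns` are hypothetical helpers.

```rust
const CHAIN_ALIASES: [&str; 3] = ["__chain", "chain", "chain_id"];
const DRAW_ALIASES: [&str; 4] = ["__draw", "draw", "sample", "iteration"];
const WARMUP_ALIASES: [&str; 3] = ["__warmup", "warmup", "is_warmup"];

/// Index of the first header column matching any alias, trying aliases in order.
fn find_column(header: &[&str], aliases: &[&str]) -> Option<usize> {
    aliases
        .iter()
        .find_map(|alias| header.iter().position(|col| col == alias))
}

/// True if the column name is one of the metadata aliases.
fn is_metadata(col: &str) -> bool {
    CHAIN_ALIASES
        .iter()
        .chain(&DRAW_ALIASES)
        .chain(&WARMUP_ALIASES)
        .any(|a| *a == col)
}

/// Columns that are not metadata are treated as sampled parameters.
fn parameter_columns<'a>(header: &[&'a str]) -> Vec<&'a str> {
    header.iter().copied().filter(|col| !is_metadata(col)).collect()
}

fn main() {
    let header = ["chain_id", "iteration", "mu", "is_warmup", "beta"];
    println!("chain column:  {:?}", find_column(&header, &CHAIN_ALIASES));
    println!("draw column:   {:?}", find_column(&header, &DRAW_ALIASES));
    println!("warmup column: {:?}", find_column(&header, &WARMUP_ALIASES));
    println!("parameters:    {:?}", parameter_columns(&header));
}
```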

Interoperability note:

  • diagnose and predict expect a tabular trace file
  • fit --output *.csv|*.parquet|*.arrow|*.ipc is directly consumable by diagnose and predict
  • fit --output *.json is not directly consumable by diagnose or predict
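A pipeline that chains fit into diagnose or predict can guard against the JSON case before invoking the next step. The check below is a small sketch of that guard; `is_tabular_output` is a hypothetical helper, and treating extensions case-insensitively is an assumption rather than documented CLI behavior.

```rust
use std::path::Path;

/// True if a `fit --output` path has one of the tabular extensions that
/// diagnose and predict can read back directly.
fn is_tabular_output(path: &str) -> bool {
    match Path::new(path).extension().and_then(|e| e.to_str()) {
        Some(ext) => matches!(
            ext.to_ascii_lowercase().as_str(),
            "csv" | "parquet" | "arrow" | "ipc"
        ),
        None => false,
    }
}

fn main() {
    for path in ["/path/to/trace.parquet", "/path/to/trace.json"] {
        println!("{path}: tabular = {}", is_tabular_output(path));
    }
}
```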

Library Quick Start

Consume the facade crate directly from GitHub, or use a local path if you are developing inside a checked-out workspace:

[dependencies]
priori = { git = "https://github.com/winfunc/priori" }
# priori = { path = "crates/priori" }

Example:

use priori::core::{ParameterId, Real};
use priori::dist::{Distribution, Normal};
use priori::model::{Generator, Model};
use priori::sampler::SamplerConfig;

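fn main() {
    // Observed measurements of the quantity being estimated.
    let data = vec![1.0, 2.0, 3.0, 2.5];

    // Latent mean `mu`, identified by ParameterId(0).
    let mu = Real::parameter(ParameterId(0));
    // Log-density of the Normal(0, 10) prior evaluated at `mu`.
    let prior = Normal::new(Real::scalar(0.0), Real::scalar(10.0)).log_density_at(mu.clone());
    // Likelihood: each observation is Normal(mu, 1).
    let likelihood = Normal::new(mu.clone(), Real::scalar(1.0));

    // Attach the data likelihood, then add the prior term through `observe_with`.
    let model = Model::new()
        .observe(&data, &likelihood)
        .observe_with(&[0.0], |_| prior.clone());

    // Sample with the default configuration and extract the draws of `mu`.
    let trace = model.sample(&SamplerConfig::default(), 4);
    let mu_draws = trace.predict(&Generator::real(mu));

    println!("draws={}", mu_draws.len());
}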

Diagnostics, Reproducibility, and Logging

  • Use --seed to make CLI runs deterministic
  • If --seed is omitted, the CLI generates a fresh random seed and logs it
  • Use --verbose for debug logging
  • Use --quiet to reduce output to warnings and errors
  • RUST_LOG can override the default log filter

Validation and Benchmarks

Standard workspace checks:

cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings

Additional parity and heavier validation suites:

cargo test -p priori --test rainier_parity -- --ignored
cargo test -p priori --test sbc_rainier_models -- --ignored
cargo test -p priori --test integration_strict -- --ignored

Benchmarks:

cargo bench -p priori

Current Limitations

  • The CLI sbc command currently requires exactly one prior specification
  • The CLI sbc command currently supports scalar models without covariate columns
  • CLI prior arguments must be numeric literals rather than symbolic expressions
  • Legacy dense mass matrix construction fails closed on invalid inputs; MassMatrix::try_dense is the explicit validation path
  • The public API is still evolving and should be treated as pre-1.0

License

MIT. See LICENSE.
