Bayesian Experimentation API
This document describes the platform's Bayesian experimentation support. When Bayesian analysis is enabled on an experiment, the results endpoint returns a `bayesian_results` block alongside the existing frequentist statistics.
Overview
The platform implements a Beta-Binomial model for conversion-rate experiments and a Normal-Normal conjugate model for continuous metrics. Posterior credible intervals, Bayes factors (Savage-Dickey density ratio), and a Bayesian stopping rule are provided out of the box.
Key capabilities:
- Posterior distribution parameters (`alpha`, `beta`) updated from observed data.
- Credible intervals at configurable width (e.g., 95%).
- Bayes factor (BF10) via the Savage-Dickey density ratio for hypothesis testing.
- Probability of superiority — P(treatment > control).
- ROPE (Region of Practical Equivalence) acceptance for declaring practical equivalence.
- Bayesian stopping rule: the experiment can be flagged `stopped_early` when the posterior is conclusive and the minimum sample size has been met.
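The Beta-Binomial update behind the posterior parameters can be sketched as follows. This is a minimal illustration of the standard conjugate update, not the platform's implementation; the names `BetaPosterior` and `update_beta` and the conversion counts are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPosterior:
    """Parameters of a Beta distribution over a conversion rate."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Mean of Beta(alpha, beta) is alpha / (alpha + beta).
        return self.alpha / (self.alpha + self.beta)


def update_beta(prior_alpha: float, prior_beta: float,
                conversions: int, trials: int) -> BetaPosterior:
    """Conjugate update: add successes to alpha and failures to beta."""
    if prior_alpha <= 0 or prior_beta <= 0:
        raise ValueError("prior parameters must be greater than 0")
    if not 0 <= conversions <= trials:
        raise ValueError("conversions must be between 0 and trials")
    return BetaPosterior(prior_alpha + conversions,
                         prior_beta + (trials - conversions))


# Illustrative counts (hypothetical): 670 conversions in 4820 trials, uniform prior.
posterior = update_beta(1.0, 1.0, conversions=670, trials=4820)
print(posterior.alpha, posterior.beta, round(posterior.mean, 4))  # 671.0 4151.0 0.1392
```

With a uniform prior the posterior parameters are the observed successes and failures plus one each, so the data dominate once the sample is large.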
Enabling Bayesian Analysis on an Experiment
Experiment Create / Update Fields
Add the following fields to the request body of POST /api/v1/experiments/ or PUT /api/v1/experiments/{experiment_id}:
| Field | Type | Default | Description |
|---|---|---|---|
| `bayesian_enabled` | boolean | `false` | Activates Bayesian analysis for this experiment |
| `bayesian_config` | object | `null` | Configuration object (required when `bayesian_enabled` is `true`) |
bayesian_config Object
| Field | Type | Default | Description |
|---|---|---|---|
| `prior_alpha` | float | 1.0 | Alpha parameter of the Beta prior (acts as a pseudo-count of prior successes); must be > 0 |
| `prior_beta` | float | 1.0 | Beta parameter of the Beta prior (acts as a pseudo-count of prior failures); must be > 0 |
| `rope_low` | float | -0.01 | Lower bound of the Region of Practical Equivalence (relative lift) |
| `rope_high` | float | 0.01 | Upper bound of the Region of Practical Equivalence (relative lift) |
| `minimum_bayes_factor` | float | 3.0 | BF10 threshold for the stopping rule; the experiment stops when `BF10 >= minimum_bayes_factor` or `BF10 <= 1/minimum_bayes_factor` |
| `credible_interval_width` | float | 0.95 | Probability mass covered by the posterior credible interval (e.g., 0.95 for a 95% HDI) |
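A client can check a configuration before sending it. The sketch below mirrors the defaults from the table and the documented constraints (`prior_alpha > 0`, `prior_beta > 0`, `rope_low < rope_high`). The `BayesianConfig` class is hypothetical, and the checks on `minimum_bayes_factor` and `credible_interval_width` are assumptions that the platform may or may not enforce.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BayesianConfig:
    """Client-side mirror of the bayesian_config object (hypothetical helper)."""
    prior_alpha: float = 1.0
    prior_beta: float = 1.0
    rope_low: float = -0.01
    rope_high: float = 0.01
    minimum_bayes_factor: float = 3.0
    credible_interval_width: float = 0.95

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        errors = []
        if self.prior_alpha <= 0:
            errors.append("prior_alpha must be greater than 0")
        if self.prior_beta <= 0:
            errors.append("prior_beta must be greater than 0")
        if self.rope_low >= self.rope_high:
            errors.append("rope_low must be less than rope_high")
        # Assumption: a threshold of 1 or less would make every BF10 conclusive.
        if self.minimum_bayes_factor <= 1:
            errors.append("minimum_bayes_factor must be greater than 1")
        # Assumption: the interval mass is a probability strictly between 0 and 1.
        if not 0 < self.credible_interval_width < 1:
            errors.append("credible_interval_width must be between 0 and 1")
        return errors


print(BayesianConfig(prior_alpha=0, rope_low=0.02).validate())
```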
Example: Create Experiment with Bayesian Config
curl -X POST "https://your-platform.example.com/api/v1/experiments/" \
  -H "Authorization: Bearer your_access_token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Checkout Button Bayesian Test",
    "hypothesis": "Green button increases conversion rate",
    "experiment_type": "AB_TEST",
    "bayesian_enabled": true,
    "bayesian_config": {
      "prior_alpha": 1.0,
      "prior_beta": 1.0,
      "rope_low": -0.005,
      "rope_high": 0.005,
      "minimum_bayes_factor": 10.0,
      "credible_interval_width": 0.95
    },
    "variants": [
      { "key": "control", "name": "Control", "weight": 0.5 },
      { "key": "treatment", "name": "Treatment", "weight": 0.5 }
    ]
  }'
Results Endpoint
GET /api/v1/results/{experiment_id}
When bayesian_enabled is true, the standard results response is augmented with a top-level bayesian_results block.
Example Request
curl -X GET "https://your-platform.example.com/api/v1/results/exp-uuid-here" \
-H "Authorization: Bearer your_access_token"
Response: 200 OK
{
  "experiment_id": "exp-uuid-here",
  "experiment_name": "Checkout Button Bayesian Test",
  "status": "RUNNING",
  "sample_size": {
    "control": 4820,
    "treatment": 4795
  },
  "metrics": [
    {
      "name": "conversion_rate",
      "control_value": 0.1240,
      "treatment_value": 0.1387,
      "difference": 0.0147,
      "p_value": 0.0312,
      "is_significant": true
    }
  ],
  "bayesian_results": {
    "metric_name": "conversion_rate",
    "posterior_alpha": 671.0,
    "posterior_beta": 4151.0,
    "posterior_mean": 0.1392,
    "credible_interval": [0.1292, 0.1490],
    "credible_interval_width": 0.95,
    "bayes_factor": 18.42,
    "probability_of_superiority": 0.9731,
    "rope_low": -0.005,
    "rope_high": 0.005,
    "rope_probability": 0.0018,
    "decision": "ACCEPT_ALTERNATIVE",
    "stopped_early": false,
    "stopping_rule_met_at": null,
    "iterations": 100000
  }
}
BayesianDecision Enum
| Value | Description |
|---|---|
| `ACCEPT_NULL` | Strong evidence in favour of the null hypothesis (`BF10 <= 1/minimum_bayes_factor`). Recommend stopping; control wins. |
| `ACCEPT_ALTERNATIVE` | Strong evidence in favour of the alternative hypothesis (`BF10 >= minimum_bayes_factor`). Recommend stopping; treatment wins. |
| `INCONCLUSIVE` | Insufficient evidence to make a decision. Experiment should continue collecting data. |
| `ROPE_ACCEPT` | The posterior credible interval falls entirely within the ROPE. The effect is practically equivalent to zero; either variant is acceptable. |
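The mapping from evidence to a decision value can be sketched as below. This is an illustration under stated assumptions, not the platform's code: the function `decide` is hypothetical, the interval passed in is assumed to be a credible interval on the relative lift (the same scale as the ROPE), and checking ROPE acceptance before the Bayes factor thresholds is an assumed precedence that the table does not specify.

```python
from enum import Enum
from typing import Tuple


class BayesianDecision(str, Enum):
    ACCEPT_NULL = "ACCEPT_NULL"
    ACCEPT_ALTERNATIVE = "ACCEPT_ALTERNATIVE"
    INCONCLUSIVE = "INCONCLUSIVE"
    ROPE_ACCEPT = "ROPE_ACCEPT"


def decide(bf10: float, lift_interval: Tuple[float, float],
           rope: Tuple[float, float],
           minimum_bayes_factor: float = 3.0) -> BayesianDecision:
    """Map a Bayes factor and a lift interval to a decision value."""
    lower, upper = lift_interval
    rope_low, rope_high = rope
    # Assumed precedence: practical equivalence is checked first.
    if rope_low <= lower and upper <= rope_high:
        return BayesianDecision.ROPE_ACCEPT
    if bf10 >= minimum_bayes_factor:
        return BayesianDecision.ACCEPT_ALTERNATIVE
    if bf10 <= 1.0 / minimum_bayes_factor:
        return BayesianDecision.ACCEPT_NULL
    return BayesianDecision.INCONCLUSIVE


# Hypothetical inputs: BF10 of 18.42 against a threshold of 10.
print(decide(18.42, (0.02, 0.21), (-0.005, 0.005), 10.0).value)  # ACCEPT_ALTERNATIVE
```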
BayesianResultsResponse Schema
| Field | Type | Description |
|---|---|---|
| `metric_name` | string | The primary metric used for Bayesian inference |
| `posterior_alpha` | float | Alpha parameter of the posterior Beta distribution (Beta-Binomial model) |
| `posterior_beta` | float | Beta parameter of the posterior Beta distribution |
| `posterior_mean` | float | Mean of the posterior distribution (`alpha / (alpha + beta)`) |
| `credible_interval` | [float, float] | Highest Density Interval (HDI) at `credible_interval_width` |
| `credible_interval_width` | float | Probability mass covered by the HDI (e.g., 0.95) |
| `bayes_factor` | float | BF10 computed via the Savage-Dickey density ratio |
| `probability_of_superiority` | float | Posterior probability that the treatment conversion rate exceeds the control rate (estimated by Monte Carlo with `iterations` samples) |
| `rope_low` | float | Lower bound of the configured ROPE |
| `rope_high` | float | Upper bound of the configured ROPE |
| `rope_probability` | float | Posterior probability that the true effect size lies within the ROPE |
| `decision` | BayesianDecision | The automated decision based on the stopping rule |
| `stopped_early` | boolean | `true` if the stopping rule was triggered before the planned end date |
| `stopping_rule_met_at` | datetime or null | UTC timestamp when the stopping criterion was first satisfied |
| `iterations` | int | Number of Monte Carlo samples used for the probability of superiority estimate |
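The Monte Carlo fields can be approximated by drawing from each variant's Beta posterior. The sketch below is one plausible way to do it with the standard library and is not taken from the platform: the function name, the per-variant posterior parameters, and the definition of the effect as relative lift `(t - c) / c` are assumptions.

```python
import random
from typing import Tuple


def superiority_and_rope(control: Tuple[float, float],
                         treatment: Tuple[float, float],
                         rope: Tuple[float, float],
                         iterations: int = 100_000,
                         seed: int = 0) -> Tuple[float, float]:
    """Return (P(treatment > control), P(relative lift inside the ROPE))."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    rope_low, rope_high = rope
    wins = 0
    in_rope = 0
    for _ in range(iterations):
        c = rng.betavariate(*control)    # draw a control conversion rate
        t = rng.betavariate(*treatment)  # draw a treatment conversion rate
        if t > c:
            wins += 1
        # Relative lift is undefined when the control draw is zero.
        if c > 0 and rope_low <= (t - c) / c <= rope_high:
            in_rope += 1
    return wins / iterations, in_rope / iterations


# Hypothetical posteriors: 598/4820 control and 665/4795 treatment conversions
# under a uniform Beta(1, 1) prior.
p_superior, p_rope = superiority_and_rope((599.0, 4223.0), (666.0, 4131.0),
                                          rope=(-0.005, 0.005))
print(round(p_superior, 3), round(p_rope, 3))
```

The estimate's standard error shrinks with the square root of `iterations`, so 100,000 draws resolve the probability to roughly three decimal places.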
Bayesian Stopping Rule Behaviour
The platform evaluates the stopping rule after each scheduled metrics collection cycle (every 15 minutes by default). The rule fires when the minimum sample size has been reached and at least one of the two evidence conditions holds:
- Minimum sample size reached (required): both the control and treatment groups have at least the configured `min_sample_size` observations (defaults to 1,000 per variant when unset).
- Conclusive Bayes factor: `BF10 >= minimum_bayes_factor` (evidence for the alternative) or `BF10 <= 1 / minimum_bayes_factor` (evidence for the null).
- ROPE acceptance: the full credible interval lies within `[rope_low, rope_high]`.
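The conditions above can be sketched as a single predicate. This reads the rule as "minimum sample size, and then either a conclusive Bayes factor or ROPE acceptance", which is an interpretation; the function name and the assumption that the interval is expressed as relative lift are not from the platform.

```python
from typing import Tuple


def stopping_rule_met(n_control: int, n_treatment: int, bf10: float,
                      lift_interval: Tuple[float, float],
                      rope: Tuple[float, float],
                      minimum_bayes_factor: float = 3.0,
                      min_sample_size: int = 1000) -> bool:
    """Return True when the experiment qualifies for early stopping."""
    # Both variants must reach the minimum sample size.
    enough_data = min(n_control, n_treatment) >= min_sample_size
    # Evidence for the alternative or for the null.
    conclusive_bf = (bf10 >= minimum_bayes_factor
                     or bf10 <= 1.0 / minimum_bayes_factor)
    # The whole interval must sit inside the ROPE.
    lower, upper = lift_interval
    rope_accept = rope[0] <= lower and upper <= rope[1]
    return enough_data and (conclusive_bf or rope_accept)


# Hypothetical values: strong Bayes factor, but the control group is too small.
print(stopping_rule_met(800, 4795, 18.42, (0.02, 0.21), (-0.005, 0.005), 10.0))  # False
```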
When the rule fires:
- `stopped_early` is set to `true`.
- `stopping_rule_met_at` is recorded.
- The experiment status transitions to `STOPPED_EARLY`.
- A notification is dispatched to configured channels (Slack / email) if alerting is enabled.
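Note: Posterior probabilities and Bayes factors keep their interpretation regardless of how often the data are inspected, so the rule can be evaluated continuously without the alpha-spending corrections that frequentist sequential tests require. This is not a guarantee about the frequentist false-positive rate: with frequent checks and a low `minimum_bayes_factor`, the long-run rate of false alarms can still exceed a nominal level. A sufficient minimum sample size and a conservative threshold help limit this.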
Prior Selection Guide
| Scenario | Recommended Prior | Rationale |
|---|---|---|
| No prior knowledge | alpha=1, beta=1 (uniform) | Flat over [0, 1]; equivalent to two pseudo-observations, so the posterior is dominated by the data once samples accumulate |
| Historical baseline known | alpha = baseline_rate * N, beta = (1 - baseline_rate) * N | Encodes prior experiments as an equivalent number of observations N |
| Minimal prior influence (reference prior) | alpha=0.5, beta=0.5 (Jeffreys) | Invariant under reparameterization; equivalent to one pseudo-observation, so it influences the posterior less than the uniform prior and does not shrink estimates toward zero |
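The historical-baseline row translates directly into code. A minimal sketch follows; the function name and the example baseline are hypothetical.

```python
from typing import Tuple


def prior_from_baseline(baseline_rate: float,
                        equivalent_n: float) -> Tuple[float, float]:
    """Return (prior_alpha, prior_beta) for a baseline worth equivalent_n observations."""
    if not 0 < baseline_rate < 1:
        raise ValueError("baseline_rate must be strictly between 0 and 1")
    if equivalent_n <= 0:
        raise ValueError("equivalent_n must be greater than 0")
    # alpha = baseline_rate * N, beta = (1 - baseline_rate) * N
    return baseline_rate * equivalent_n, (1 - baseline_rate) * equivalent_n


# Hypothetical baseline: 12% conversion, weighted like 200 prior observations.
prior_alpha, prior_beta = prior_from_baseline(0.12, 200)
print(round(prior_alpha, 6), round(prior_beta, 6))
```

A larger N makes the prior harder for new data to move, so N is best kept well below the planned sample size per variant unless the historical data are known to be comparable.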
Error Responses
| Status | Meaning |
|---|---|
| 400 Bad Request | `bayesian_config` provided without `bayesian_enabled: true`, or invalid parameter values |
| 401 Unauthorized | Missing or invalid Bearer token |
| 403 Forbidden | Insufficient role |
| 404 Not Found | Experiment ID does not exist |
| 422 Unprocessable Entity | Validation error (e.g., `prior_alpha <= 0`, `rope_low >= rope_high`) |
{
  "detail": [
    {
      "loc": ["body", "bayesian_config", "prior_alpha"],
      "msg": "prior_alpha must be greater than 0",
      "type": "value_error"
    }
  ]
}
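A client can flatten a validation error body like the one above into readable messages. The sketch assumes `detail` is a list of objects with `loc` and `msg` keys, as in the example; the helper name is hypothetical, and other status codes may use a different body shape.

```python
import json
from typing import List


def summarize_validation_errors(body: str) -> List[str]:
    """Turn a validation error body into 'field.path: message' strings."""
    payload = json.loads(body)
    messages = []
    for item in payload.get("detail", []):
        # Drop the leading "body" segment so the path matches the request field.
        path = ".".join(str(part) for part in item.get("loc", []) if part != "body")
        messages.append(f"{path}: {item.get('msg', '')}")
    return messages


example_body = """
{
  "detail": [
    {
      "loc": ["body", "bayesian_config", "prior_alpha"],
      "msg": "prior_alpha must be greater than 0",
      "type": "value_error"
    }
  ]
}
"""
print(summarize_validation_errors(example_body))
# ['bayesian_config.prior_alpha: prior_alpha must be greater than 0']
```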