ℹ️ What is v2?

v1 outputs (under results/<cohort>/) are frozen. v2 adds a read-only statistical layer (improvements #4, #5, #14, #15 from the hardening spec) without retraining. Heavy modules (nested-CV, SHAP, LOCO, permutation tests, GSEA, MC-Dropout, GEO external validation) ship as Python APIs but are not invoked here.

🔬 Stage 8 — Reliability Hardening (signature-only metrics)

Re-evaluates every cohort on its biologically meaningful candidate gene set using logistic regression, out-of-fold (OOF) predictions, Youden-threshold tuning, and 1 000-iteration bootstrap 95 % CIs. PRAD specificity rises from 73.5 % to 96.2 % (+22.7 pp); BRCA AUC falls from 0.999 to 0.890. Source: /api/v2/reliability.
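As a rough illustration of Youden-threshold tuning, the sketch below picks the score cut-off that maximises J = sensitivity + specificity − 1 on out-of-fold predictions. The function name youden_threshold and the toy scores are hypothetical, not taken from the pipeline, and the sketch assumes both classes are present.

```python
import numpy as np

def youden_threshold(y_true, y_score):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    Hypothetical helper; assumes y_true contains both classes.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    best_j, best_t = -1.0, None
    # Every distinct score is a candidate cut-off; predict positive if score >= t.
    for t in np.unique(y_score):
        pred = y_score >= t
        sensitivity = np.mean(pred[y_true])
        specificity = np.mean(~pred[~y_true])
        j = sensitivity + specificity - 1.0
        if j > best_j:  # strict '>' keeps the lowest threshold on ties
            best_j, best_t = j, t
    return float(best_t), float(best_j)

# Toy out-of-fold scores (made-up values, not from any cohort).
y = [0, 0, 0, 0, 1, 1, 1, 1]
p = [0.10, 0.20, 0.35, 0.60, 0.40, 0.70, 0.80, 0.90]
threshold, j = youden_threshold(y, p)
print(threshold, j)
```

On ties this sketch keeps the lowest threshold, which favours sensitivity; a pipeline that prioritises specificity might break ties the other way.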


Specificity Improvement (pp) — Improvement #14

Baseline (LR, no FocalLoss/SMOTE) vs. optimised MLP (FocalLoss α=0.25, γ=2.0). Only absolute percentage-point deltas are reported; the misleading relative percentage-change metrics have been removed.
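To see why percentage points are the safer summary, the sketch below contrasts an absolute delta with a relative change, using the PRAD specificity figures quoted above. The helper names pp_delta and relative_change are illustrative only.

```python
def pp_delta(before_pct, after_pct):
    """Absolute change in percentage points."""
    return after_pct - before_pct

def relative_change(before_pct, after_pct):
    """Relative change in percent of the baseline (depends heavily on the baseline)."""
    return 100.0 * (after_pct - before_pct) / before_pct

# PRAD specificity from the Stage 8 summary: 73.5 % -> 96.2 %.
print(round(pp_delta(73.5, 96.2), 1))         # 22.7 pp
print(round(relative_change(73.5, 96.2), 1))  # 30.9 % relative

# A low baseline inflates the relative figure: 1 % -> 2 % is only +1 pp
# but a +100 % relative change.
print(pp_delta(1.0, 2.0), relative_change(1.0, 2.0))
```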


Metrics with 95 % CIs — Improvement #4

Stratified bootstrap (N = 1000) when raw predictions are available; otherwise percentile CIs over the 5 CV folds. The Source column indicates which method was used for each row.
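A minimal sketch of a stratified percentile bootstrap follows, assuming binary labels and hard predictions. The function names, the seed, and the toy data are hypothetical; the actual v2 implementation may differ.

```python
import numpy as np

def stratified_bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI for `metric`, resampling within each class separately."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    rng = np.random.default_rng(seed)
    # One index array per class, so every resample keeps the class balance.
    strata = [np.flatnonzero(y_true == c) for c in np.unique(y_true)]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = np.concatenate(
            [rng.choice(s, size=s.size, replace=True) for s in strata]
        )
        stats[b] = metric(y_true[idx], y_pred[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

def specificity(y_true, y_pred):
    """True-negative rate for labels coded 0 (negative) / 1 (positive)."""
    negatives = y_true == 0
    return float(np.mean(y_pred[negatives] == 0))

# Toy data (made up): 40 negatives with 36 predicted correctly, 20 positives.
y_true = np.array([0] * 40 + [1] * 20)
y_pred = np.array([0] * 36 + [1] * 4 + [1] * 20)
lo, hi = stratified_bootstrap_ci(y_true, y_pred, specificity)
print(f"specificity = {specificity(y_true, y_pred):.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Resampling within each class guarantees that no bootstrap replicate ends up without negatives (or positives), which would leave specificity or sensitivity undefined.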


Somatic dN/dS with Wilson CIs — Improvement #15

Wilson score CI on the binomial proportion of non-synonymous mutations among all mutations per gene. Genes with fewer than 3 synonymous observations are filtered out, and a gene is marked selected only if dnds_ci_lo > 1 (see the selected column). Infinite dN/dS values are replaced with a pseudocount-adjusted estimate or marked undefined.
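The sketch below shows one way such an interval could be computed. It makes a strong simplifying assumption that is not confirmed by the source: dN/dS is treated as the raw count ratio N/S with no per-site normalisation, so a Wilson bound p on the proportion N/(N+S) maps to a ratio bound p/(1−p). The function names and the returned dictionary layout are hypothetical.

```python
import math

def wilson_ci(k, n, z=1.959964):
    """Wilson score interval for a binomial proportion k/n (default z gives ~95 %)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def _odds(p):
    """Map a proportion N/(N+S) to the ratio N/S."""
    return math.inf if p >= 1.0 else p / (1.0 - p)

def dnds_with_ci(n_nonsyn, n_syn, min_syn=3):
    """ASSUMPTION: dN/dS is approximated by the raw count ratio N/S."""
    if n_syn < min_syn:
        # Too few synonymous observations: filtered, never selected.
        return {"dnds": None, "dnds_ci_lo": None, "dnds_ci_hi": None, "selected": False}
    lo, hi = wilson_ci(n_nonsyn, n_nonsyn + n_syn)
    ci_lo, ci_hi = _odds(lo), _odds(hi)
    return {
        "dnds": n_nonsyn / n_syn,
        "dnds_ci_lo": ci_lo,
        "dnds_ci_hi": ci_hi,
        "selected": ci_lo > 1.0,  # the whole interval lies above neutrality
    }

# Made-up counts for illustration only.
print(dnds_with_ci(40, 5))  # strong excess of non-synonymous mutations
print(dnds_with_ci(6, 5))   # ratio > 1, but the interval includes 1
print(dnds_with_ci(10, 2))  # filtered: fewer than 3 synonymous observations
```

Requiring the lower bound, rather than the point estimate, to exceed 1 keeps genes with few mutations from being flagged on noise alone.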


Weighted Consensus Ranking — Improvement #5

LR, RF, and MLP feature importances are min-max normalised, ranked, and combined with weights 0.3 / 0.3 / 0.4, respectively. A stability filter keeps only genes that reach the top-k in ≥ 4 of 5 folds; when only single-fold importances are persisted, it falls back to the plain top-k.
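One possible reading of this scheme is sketched below: each model's importances are min-max scaled to [0, 1], combined as a weighted sum, and ranked by the combined score, with a separate top-k stability filter across folds. The exact order of normalising, ranking, and combining is an assumption, and the gene names, importance values, and function names are made up.

```python
from collections import Counter
import numpy as np

WEIGHTS = {"LR": 0.3, "RF": 0.3, "MLP": 0.4}  # weights quoted in the text

def min_max(x):
    """Scale values to [0, 1]; a constant vector maps to all zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def consensus_rank(genes, importances, weights=WEIGHTS):
    """Return (gene, score) pairs sorted by weighted, normalised importance."""
    total = np.zeros(len(genes))
    for model, w in weights.items():
        total += w * min_max(importances[model])
    order = np.argsort(-total, kind="stable")
    return [(genes[i], float(total[i])) for i in order]

def stable_genes(fold_rankings, k, min_folds=4):
    """Keep genes appearing in the top-k of at least `min_folds` fold rankings."""
    counts = Counter(g for ranking in fold_rankings for g in ranking[:k])
    return {g for g, c in counts.items() if c >= min_folds}

# Hypothetical genes and importances, for illustration only.
genes = ["G1", "G2", "G3"]
importances = {
    "LR": [0.9, 0.1, 0.5],
    "RF": [0.2, 0.8, 0.5],
    "MLP": [1.0, 0.0, 0.4],
}
ranked = consensus_rank(genes, importances)
print(ranked)

# Five made-up per-fold rankings; keep genes in the top 2 of >= 4 folds.
folds = [
    ["G1", "G3", "G2"],
    ["G1", "G3", "G2"],
    ["G1", "G2", "G3"],
    ["G3", "G1", "G2"],
    ["G1", "G3", "G2"],
]
stable = stable_genes(folds, k=2)
print(stable)
```

Min-max scaling puts coefficients, impurity-based importances, and gradient-based scores on a common scale before weighting, so no single model dominates simply because of its units.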
