β€”
Best Accuracy
bal. accuracy
β€”
Best AUC
across cancers
β€”
Avg Specificity Gain
percentage points
5
Cancer Types
TCGA cohorts
3
Models
LR Β· RF Β· MLP

Specificity Improvements by Cancer Type

MLP Performance Dashboard

Cancer Bal. Accuracy Specificity Sensitivity AUC MCC Architecture Samples (T/N)
Loading…

Task Γ— Model Results

Task Model Accuracy Precision Recall ROC AUC
Loading…
⚠️
Limitations
  • Near-perfect v1 AUC reflects the intrinsic separability of tumor vs. normal transcriptomes on the full DESeq2-filtered feature set (~5,000 genes), not signature-specific discriminatory power. The v2 Reliability Hardening table below retrains each cohort on its final candidate gene set and reports honest bootstrap-CI metrics.
  • PRAD v1 specificity (73.5%) is the lowest across cancers due to adjacent-normal tumor contamination; v2 Youden-optimal specificity is β€”.
  • UCEC has only 201 samples (smallest dataset) and reaches v1 AUC β‰ˆ 1.000 on the full feature set; v2 signature-only AUC = β€”.
  • SMOTE oversampling is applied for PRAD and BLCA within CV folds. Class weighting may be more appropriate for high-dimensional data.
ℹ️
All metrics are averaged over 5-fold stratified cross-validation. Architecture is selected dynamically: 512β†’256β†’128 for datasets with n > 600 samples, 256β†’128 for smaller datasets.

πŸ”¬ v2 β€” Honest signature-only metrics (Stage 8)

The headline metrics above are computed on the full DESeq2-filtered feature set (β‰ˆ 5 000 genes), which lets the MLP memorise tumor-vs-normal patterns and produces AUCs β‰₯ 0.99 in four out of five cohorts. The v2 hardening stage re-evaluates every cohort on its final candidate gene set (or the top-50 signature genes when too few candidates pass) with logistic regression, OOF predictions, Youden-optimal thresholding, and 1 000-iteration bootstrap 95 % CIs.

Loading…

Default = probability β‰₯ 0.5. Youden = threshold maximising sensitivity + specificity βˆ’ 1 on held-out OOF predictions. Source: results/v2/<cohort>/reliability_hardened.json.

πŸ†• v2 β€” 95 % Confidence Intervals on every metric

Metrics below are point estimates. The v2 layer computes bootstrap (or percentile-over-5-folds) 95 % CIs for AUC, balanced accuracy, MCC, sensitivity, and specificity per cohort.

Loading…

Open full v2 dashboard β†’