Receiver operating characteristic (ROC) analysis is one of the most useful tools for evaluating a diagnostic marker or binary classifier. In this tutorial, we show how to run ROC analysis in Excel with BESH Stat NG using both the GUI workflow and the UDF workflow.
The example uses a real diagnostic dataset based on the Breast Cancer Wisconsin (Diagnostic) data. We treat malignant cases as the positive class and use worst concave points as the default marker.
This tutorial is designed to mirror the existing BESH Stat NG tutorial style:
- a real-world dataset
- a GUI workflow
- a worksheet/UDF workflow
- a short ROC background section
- practical interpretation of the output
Download the example workbook
The workbook contains these sheets:
- README — short overview and expected results
- Data — the real-world dataset used in the tutorial
- UDF_Workflow — worksheet formulas using BESH Stat NG UDFs
- Reference_ROC — workbook-native ROC preview and threshold interpretation
- GUI_output — example GUI output
- Sources — dataset and documentation links
Example data
The dataset contains 569 fine-needle aspirate breast mass cases. For the UDF workflow, the outcome is recoded as:
- 1 = malignant
- 0 = benign
The default ROC marker in this tutorial is:
- Worst concave points
This is a good teaching example because it has strong discrimination between malignant and benign cases, giving a clear ROC curve and a high area under the curve (AUC).
What ROC analysis tells us
ROC analysis studies how well a continuous marker separates two groups across all possible decision thresholds.
For any cut-off value, we can compute:
- Sensitivity = proportion of positives correctly identified
- Specificity = proportion of negatives correctly identified
The ROC curve plots:
- X-axis: 1 − Specificity
- Y-axis: Sensitivity
A marker with no discrimination gives a curve close to the diagonal line. A strong marker rises quickly toward the upper-left corner.
The area under the ROC curve (AUC) summarizes overall discrimination:
- AUC = 0.5 means no discrimination
- AUC = 1.0 means perfect discrimination
In BESH Stat NG, ROC output reports:
- Wilcoxon AUC
- DeLong standard error and confidence interval
- Hanley–McNeil standard error and confidence interval
- two-sided p-value for testing AUC = 0.5
- a full cut-off table with sensitivity and specificity
GUI workflow
Step 1. Open the ROC tool
Go to: BESH Stat NG → Analyse → Graphics → ROC Curve

Step 2. Select the input data
In this example, the GUI workflow uses:
- Group by ID
- Group ID:
Data!$C$5:$C$574 - Data:
Data!$D$5:$D$574
Here:
- column C contains the diagnosis labels
- column D contains the marker values

Step 3. Choose the output location
Select New Worksheet and click Compute.
BESH Stat NG returns:
- the numerical ROC summary
- the cut-off table
- the ROC chart


GUI results for this example
Using worst concave points as the marker, BESH Stat NG returns:
- Wilcoxon AUC: 0.966703663
- DeLong 95% CI: 0.9521635 to 0.9812438
- DeLong standard error: 0.007418605
- Hanley–McNeil 95% CI: 0.9521618 to 0.9812455
- Hanley–McNeil standard error: 0.007419458
- Two-sided p-value (AUC different from 0.5): 1.85954E-77
This is an excellent ROC result. The AUC is close to 0.97, which means malignant cases tend to receive much higher marker values than benign cases.
The curve also hugs the upper-left region of the plot, which is exactly what we want to see from a strong diagnostic marker.
UDF workflow in Excel
The GUI is convenient for one-off analysis, but the UDF workflow is better when you want a reproducible workbook, a template, or a fully formula-driven tutorial.
BESH Stat NG provides two ROC worksheet functions:
BESH.PLOT.ROC_STATS(...)— numerical ROC summaryBESH.PLOT.ROC_POINTS(...)— chart-ready ROC points
ROC summary formula
Paste this formula into a blank cell:
=BESH.PLOT.ROC_STATS(Data!$D$6:$D$574,Data!$B$6:$B$574,1,"higher",0.05)This returns a spill range with the main numerical results, including:
- AUC
- standard errors
- confidence intervals
- p-value
ROC points formula
Paste this formula into another blank cell:
=BESH.PLOT.ROC_POINTS(Data!$D$6:$D$574,Data!$B$6:$B$574,1,"higher",0.05)This returns a spill table with ROC coordinates and thresholds, including:
ThresholdSensitivitySpecificityFalsePositiveRateTruePositiveRate
Meaning of the arguments
In both formulas:
- the first argument is the marker
- the second argument is the status
- the third argument is the positive class
- the fourth argument is the direction
- the fifth argument is the alpha level
So in this tutorial:
Data!$D$6:$D$574= marker valuesData!$B$6:$B$574= 1/0 status values1= malignant is the positive class"higher"= higher marker values indicate malignant cases0.05= 95% confidence intervals
Optional text-label version
If you prefer to use the diagnosis labels directly instead of the recoded 1/0 column, you can use:
=BESH.PLOT.ROC_STATS(Data!$D$6:$D$574,Data!$C$6:$C$574,"Malignant","higher",0.05)and
=BESH.PLOT.ROC_POINTS(Data!$D$6:$D$574,Data!$C$6:$C$574,"Malignant","higher",0.05)Building the ROC chart from the spill range
After entering BESH.PLOT.ROC_POINTS(...):
- select the spilled output
- insert an XY Scatter with Straight Lines
- use FalsePositiveRate for the X-axis
- use TruePositiveRate for the Y-axis
This gives a worksheet-native ROC chart that updates automatically when the source data changes.
Validation and reference checking
The ROC workflow shown in this tutorial is designed to be transparent and reproducible. In BESH Stat NG, worksheet-based workflows can be checked directly against the GUI output, the downloadable example workbook, and the project’s broader validation materials. For more detail on how results are checked, including reference datasets, automated tests, and comparison workflows, see the Validation page.
Interpreting the cut-off table
A ROC curve is useful because it shows the trade-off between sensitivity and specificity across all thresholds.
In the example workbook, the reference sheet also identifies the best Youden threshold, which is the cut-off that maximizes:
Sensitivity + Specificity − 1
For this dataset, the workbook gives a Youden-optimal threshold of about:
- Threshold: 0.1358
- Sensitivity: 0.8679
- Specificity: 0.9440
This is a strong practical cut-off for teaching purposes because it gives both high sensitivity and high specificity.
However, in real clinical use, the best threshold may depend on the cost of false positives and false negatives. For example:
- if missing a malignant case is very costly, you may prefer a lower threshold and higher sensitivity
- if unnecessary follow-up procedures are very costly, you may prefer a higher threshold and higher specificity
So the Youden cut-off is a useful summary, but it is not always the final decision threshold.
Practical interpretation of this example
This example shows that worst concave points is a very strong discriminator between malignant and benign cases in this dataset.
The evidence is consistent across the output:
- the ROC curve rises steeply
- the AUC is very high
- both standard error methods give almost identical results
- the p-value is extremely small
- the threshold table shows that high sensitivity can be achieved without sacrificing too much specificity
In plain language, that means the marker does a very good job of ranking malignant cases above benign cases.
Why use both GUI and UDF workflows?
The two workflows complement each other.
GUI workflow advantages
- quick exploratory analysis
- immediate chart output
- easy for first-time users
- useful for ad hoc reporting
UDF workflow advantages
- reproducible formula-based analysis
- easier to audit and maintain
- ideal for templates and tutorials
- easy to connect with other workbook calculations
- dynamic charting from spill ranges
A good Excel tutorial should usually show both.
Notes and tips
- The marker column must be numeric.
- The status column must contain exactly two groups.
- For UDF formulas, it is usually safest to exclude the header row.
- If smaller values indicate the positive class, change
"higher"to"lower". - If a BESH UDF returns
#NAME?, make sure the add-in is loaded and recalculate the workbook. - If you want to compare markers, duplicate the UDF workflow and replace the marker range with another variable.
In the example workbook, you can also try:
- Mean radius
- Worst area
This is a simple way to compare which single marker gives the strongest ROC performance.
Suggested structure for your own ROC workbook templates
A reusable ROC workbook in Excel can follow this pattern:
- Data sheet
Store the raw outcome and marker variables. - UDF workflow sheet
UseBESH.PLOT.ROC_STATS(...)andBESH.PLOT.ROC_POINTS(...). - Reference / interpretation sheet
Add threshold summaries, Youden index, and comments. - Chart sheet or dashboard
Show the ROC curve and key metrics.
This structure makes the workbook easy to teach, reuse, and adapt to other biomarker datasets.
Summary
BESH Stat NG makes ROC analysis in Excel straightforward in two different ways.
The GUI workflow is fast and visual.
The UDF workflow is reproducible and workbook-friendly.
In this tutorial example, the marker worst concave points gives an AUC of 0.9667, which indicates excellent discrimination between malignant and benign cases.
That makes this dataset a strong teaching example for:
- ROC basics
- AUC interpretation
- threshold selection
- GUI workflow in BESH Stat NG
- reproducible UDF-based Excel analysis
Frequently asked questions
What is ROC analysis in Excel used for?
ROC analysis in Excel is used to evaluate how well a continuous marker or prediction score separates two groups, usually a positive and a negative class. It helps assess diagnostic accuracy across all possible cut-off values.
What does AUC mean in ROC analysis?
AUC, or area under the curve, summarizes overall discrimination. An AUC of 0.5 means no discrimination, while an AUC closer to 1.0 indicates excellent separation between the two groups.
What data do I need for ROC analysis in Excel?
You need a binary outcome variable and a numeric marker or prediction score. In this tutorial, the outcome is malignant versus benign, and the marker is worst concave points.
Can BESH Stat NG perform ROC analysis in Excel with formulas?
Yes. BESH Stat NG supports ROC analysis through both the graphical interface and worksheet UDFs. This makes it possible to build reproducible, formula-based Excel workflows for ROC curves, AUC estimation, and threshold analysis.
How do I choose the best ROC cut-off?
A common approach is to use the Youden index, which maximizes sensitivity plus specificity minus one. However, in practice, the best cut-off also depends on the relative cost of false positives and false negatives.