BESH Stat version 0.18 released

I’m happy to announce the release of BESH Stat version 0.18. The main change is the implementation of Zero-Inflated Poisson regression as described by Diane Lambert in [1]. With default settings, the BESH Stat parameter estimates and standard errors match those of the zeroinfl function from the R pscl package to 4 decimal places (decreasing the convergence criterion on the Settings tab increases the precision of the estimates).
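
To illustrate how EM estimation of a ZIP model works, here is a minimal Python sketch. It is not the BESH Stat code (the add-in is written in VBA) and it makes a strong simplifying assumption: there are no covariates, so the logistic and Poisson regression steps collapse to weighted means. The function name and defaults are hypothetical.

```python
import math

def fit_zip_intercept_only(y, tol=1e-10, max_iter=10000):
    """EM for an intercept-only zero-inflated Poisson model.

    Returns (pi, lam): the probability of a structural zero and the
    Poisson mean. Assumption: no covariates, unlike a full ZIP regression.
    """
    n = len(y)
    pi, lam = 0.5, max(sum(y) / n, 1e-8)  # crude starting values
    for _ in range(max_iter):
        # E-step: posterior probability that each observed zero is structural.
        p_zero = pi / (pi + (1.0 - pi) * math.exp(-lam))
        z = [p_zero if yi == 0 else 0.0 for yi in y]
        # M-step, zero part: "logistic regression" of a response z in [0, 1];
        # with an intercept only, this is just the mean of z.
        new_pi = sum(z) / n
        # M-step, count part: Poisson fit with weights (1 - z);
        # with an intercept only, this is a weighted mean of y.
        w = [1.0 - zi for zi in z]
        new_lam = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        converged = abs(new_pi - pi) < tol and abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam
```

The sketch shows why the zero part needs a logistic routine that accepts fractional responses (the z values) and why the count part needs a Poisson routine that accepts weights (1 - z).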

What’s new?

  • Zero-Inflated Poisson regression model, which required a few more internal changes listed below
  • Poisson regression null deviance formula updated to match the result of the R glm function when the response variable y contains zero values. In that case the contribution to the deviance equals 2*(y*ln(1/mu) - (y - mu)), which reduces to 2*mu for y = 0. A similar update was made to the deviance residuals computation.
  • Logistic regression routine updated to accept response variable values anywhere in the [0, 1] interval; the response no longer needs to consist of 0’s and 1’s only, as before. This update was needed to implement zero-inflated Poisson (ZIP) regression model estimation using the expectation-maximization (EM) algorithm.
  • Complete internal redesign of the Poisson and Logistic regression IRLS algorithm to accept externally provided starting parameter values and weights. This update was also needed to implement ZIP regression model estimation using the EM algorithm.
  • If the Cancel button is clicked while a Poisson, Logistic, Multinomial Logistic, Ordinal Logistic, or ZIP regression is running, the calculation is aborted and the user form is closed. In the previous version only the user form was closed and the computation continued in the background. (Note, however, that Cox regression still behaves the “old” way, as it requires more code changes to implement. I have put it on my TODO list.)
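
The zero-value handling in the Poisson deviance can be sketched in Python as follows. This is an illustration only, not the add-in's VBA code, and the function names are hypothetical. It uses the standard unit deviance 2*(y*ln(y/mu) - (y - mu)) and drops the logarithmic term when y = 0, so a zero observation contributes 2*mu.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance for observed counts y and fitted means mu."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y*ln(y/mu) tends to 0 as y -> 0, so the log term is dropped for y == 0
        # (evaluating ln(0) directly would fail).
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (log_term - (yi - mi))
    return total

def poisson_null_deviance(y):
    """Deviance of the intercept-only model, whose fitted mean is mean(y)."""
    mean = sum(y) / len(y)
    return poisson_deviance(y, [mean] * len(y))
```

For example, `poisson_deviance([0], [2.0])` evaluates the single-observation case y = 0, mu = 2.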

References:

  1. Lambert D. Zero-Inflated Poisson Regression, With an Application to Defects in Manufacturing. Technometrics, Feb 1992, 34(1).

BESH Stat version 0.17 released

I’m happy to announce the release of BESH Stat version 0.17. The main change is the addition of exact p-value computation for unordered R x C contingency tables (Fisher-Freeman-Halton Exact test). I have been working on this on and off for several years. The new procedure implements the network algorithm developed by Mehta and Patel [1,2,3] and improved by Clarkson, Fan and Joe [4]. It is a VBA translation of the FORTRAN subroutine FEXACT. The original FORTRAN code can be obtained from [5] (Fortran 77 version) or [6] (Fortran 90 version). To my knowledge, this is the only VBA implementation of this method. The generated p-values match those presented for all example tables in [4], and all are computed in a fraction of a second.

What’s new?

  • Fisher-Freeman-Halton Exact test for RxC unordered contingency table added
  • Skillings-Mack test code re-factoring and slight performance improvement
  • Residual diagnostics added to the Binary logistic regression output (Pearson, Deviance, Leverage, Standardized Pearson, and Standardized Deviance residuals)
  • Hodges-Lehmann Estimate of Shift option added to the Mann-Whitney user form so that it can be unchecked. In the previous version the estimate of shift was always computed. Since it is the most time-consuming part of the computation, you get a considerable performance improvement when only the Mann-Whitney test is requested.
  • Asymmetry, Descriptive statistics, Homogeneity of variance tests, Box and Whiskers plot, and One-way ANOVA user forms re-design and re-factoring
  • User forms for procedures allowing By ID input were redesigned.
  • For By ID input (for example in the Box and Whiskers plot), groups are now presented in the output in the order in which they appear in the ID column (previously they were sorted in descending alphabetical order of the ID categories).
  • Auto update feature added to BESH stat. The check for a new BESH Stat version is performed on a weekly basis at Excel start-up, when BESH stat is loading.
  • Bug fix: Stratified Cox regression caused a subscript out of range error when a single stratum contained a large number of records.
  • Bug fix: In paired data input (e.g. Wilcoxon test), data from the 2nd group/column input range were used even when they lay outside the specified input range, as long as the given row was present in the 1st group/column input range.
  • Bug fix: Fixed an error when loading any of the Regression user forms (Logistic, Cox, Poisson, … ) when the first row in the active sheet contains #N/A or any other error value (e.g. #VALUE!, #DIV/0!, etc.)
  • BESH stat installation by double-clicking the .xlam file was removed as it was causing some issues recently. The add-in can now be installed only by manually adding it to the list of Excel add-ins (see here).
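
To see why the estimate of shift is the expensive part: the standard two-sample Hodges-Lehmann estimator is the median of all pairwise differences between the two samples, so the work grows with the product of the sample sizes. A minimal Python sketch follows; it is not the add-in's VBA code, and the function name and the sign convention (first sample minus second) are assumptions.

```python
from statistics import median

def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences x_i - y_j (len(x) * len(y) values)."""
    differences = [xi - yj for xi in x for yj in y]
    return median(differences)
```

For two samples of 1,000 values each, this already builds and sorts one million differences, while the Mann-Whitney test statistic itself needs only the ranks.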

References:

  1. Mehta CR, Patel NR. A network algorithm for performing Fisher’s exact test in r x c contingency tables. Journal of the American Statistical Association 1983;78:427-34.
  2. Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher’s exact test on unordered r x c contingency tables. ACM Transactions on Mathematical Software 1986;12:154-61.
  3. Mehta CR, Patel NR. A hybrid algorithm for Fisher’s exact test in unordered r x c contingency tables. Communications in Statistics, Series A 1986;15:387-404.
  4. Clarkson, D. B., Fan, Y. and Joe, H. (1993) A Remark on Algorithm 643: FEXACT: An Algorithm for Performing Fisher’s Exact Test in r x c Contingency Tables. ACM Transactions on Mathematical Software, 19, 484-488.
  5. Fortran 77 version
  6. Fortran 90 version 

BESH Stat version 0.16 released

New BESH Stat version 0.16 was released.

This version was released earlier than planned because I identified a bug in the Log-rank test that I introduced in the previous release during refactoring.

If you are missing a statistical method, you can propose an additional feature (just select the Feature request category) and I will try to implement it in one of the future releases.

What’s new

  • Bug Fix: Log-rank test: fixed a type mismatch run-time error introduced in the previous version 0.15 during re-factoring
  • <Copy BESHstat Path> button added to the About add-in form to help when updating BESH stat.
  • Check added to the Ordinal Regression procedure to verify that Y has more than 2 categories; the user is informed if this assumption is violated.
  • Wilcoxon test, Nonparametric correlation, Nonparametric simple regression, and ROC user forms re-design and re-factoring

New BESH Stat version 0.15 was released

New BESH Stat version 0.15 was released.

The main change is the addition of a new Ordinal Logistic Regression procedure. If you are missing a statistical method, you can propose an additional feature (just select the Feature request category) and I will try to implement it in one of the future releases. The complete list of changes follows.

What’s new?

  • Added new Ordinal Logistic regression procedure that fits proportional odds model
  • Bug Fix: Poisson regression error caused by Excel FACT function overflow when dependent variable values are greater than 170
  • Fixed the selected effects listbox issue on the Poisson, Logistic, and Multinomial Logistic regression user forms: listbox items were removed when switching between user form tabs. The issue still persists on the COX and Multiple Linear Regression user forms.
  • Bug Fix: Logistic regression – Exp function overflow issue fixed when computing Cox and Snell Pseudo R2 on large datasets.
  • Slight Mann-Whitney test computation speed improvement
  • Slight Kruskal–Wallis test computation speed improvement
  • Add-in update process updated (internally)
  • Code cleaning and refactoring (refactoring of the user forms output options code, and of forms having the By ID and By Column data entry options)
  • Redesign of Kaplan-Meier plot, Log rank test, Normality, Histogram, and Outliers user forms
  • Removed the limit of a maximum of 50 groups for the Kaplan-Meier plot
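
On the FACT overflow mentioned in the list above: 171! is larger than the biggest double-precision number (about 1.8e308), so any direct factorial of a count above 170 overflows. The release note does not say how the add-in fixed this; one common way to avoid the problem, sketched here in Python as an assumption rather than as the actual fix, is to work with ln(y!) through the log-gamma function. The function names are hypothetical.

```python
import math

def poisson_log_pmf(y, mu):
    """ln P(Y = y) for Y ~ Poisson(mu), using lgamma(y + 1) = ln(y!)."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def factorial_overflows_as_float(y):
    """True if y! cannot be represented as a double-precision float."""
    try:
        float(math.factorial(y))
    except OverflowError:
        return True
    return False
```

Here `factorial_overflows_as_float(171)` reports the overflow, while `poisson_log_pmf(500, 480.0)` stays finite because the factorial is never formed explicitly.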

Missing values in BESH stat

Note1: BESH stat works only with numeric data (unless it is an ID column), so text data are treated as missing. When I talk about missing data, I mean text string values as well.

Note2: If you have a text categorical variable in your data, recode it numerically and use the numeric codes in the analysis.

BESH stat treats missing values differently based on the procedure.

  • For regression methods (e.g. COX regression, Poisson, or Logistic regression), if there is a missing value for any selected variable then the whole record is omitted from the analysis (i.e. BESH stat performs case-wise deletion). The procedure then creates a “Row Number” column in the output containing the row position from the input dataset, so you can see which records were actually used in the analysis.
  • For the analysis of independent samples (e.g. Mann-Whitney test, Kruskal-Wallis test), if a value for a sample is missing then just the given value is omitted.
  • Finally, there are procedures that accept missing values (e.g. Skillings-Mack test); in such a case a record is used even when some of its values are missing (unless all values for the given record are missing).
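
The case-wise deletion used by the regression methods can be sketched in Python as follows. This is only an illustration of the rule described above, not the add-in's VBA code; the function names are hypothetical, and the row number is assumed to be the 1-based position within the input range.

```python
import math

def is_numeric(value):
    """Text, empty cells (None) and NaN all count as missing."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return not math.isnan(value)

def casewise_delete(rows, selected_columns):
    """Keep only records that are numeric in every selected column.

    Returns (row_number, values) pairs so the records actually used
    can be traced back to the input dataset.
    """
    kept = []
    for row_number, row in enumerate(rows, start=1):
        if all(is_numeric(row[c]) for c in selected_columns):
            kept.append((row_number, [row[c] for c in selected_columns]))
    return kept
```

A text value in a column that is not selected for the model does not cause the record to be dropped, because only the selected columns are checked.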

BESH Stat version 0.14 released

New BESH Stat version 0.14 was released.

After quite some time being busy with other work, I managed to extend the BESH stat Excel add-in by adding the Poisson model to its regression capabilities. It now contains the models commonly used in clinical research: ordinary least squares regression; logistic regression, binomial as well as multinomial; COX proportional hazards regression for the analysis of survival data; and now Poisson regression as well.

If you are missing a statistical method, you can propose an additional feature (just select the Feature request category) and I will try to implement it in one of the future releases.

What’s new

  • Poisson regression analysis, including model analysis and residual diagnostics
  • Fixed a few typos

BESH Stat version 0.13 released

New BESH Stat version 0.13 was released.

What’s new

This is mainly a maintenance release.

  • Fixed an issue where the check-for-update feature did not detect the current BESH Stat version available on www.beshstat.eu because the version was checked against a cached version file.
  • Code cleaning
  • Gauss-Jordan elimination option removed from the multiple linear regression options. It was removed because it doesn’t add any value to the BESH Stat add-in. Other matrix inversion methods that are not otherwise used will be removed later during code refactoring.
  • Standard errors are now presented for the measures of ordinal association in RxC contingency tables

BESH Stat version 0.11 released

New BESH Stat version 0.11 was released.

What’s new

  • Multinomial logistic regression added
  • Fixed p-value calculation in the ROC curve
  • Added p-value to the single/two independent/paired proportion output
  • Code cleaning and refactoring
  • Input data row numbers were added to the output of Multiple linear regression and binary logistic regression
  • Export Chart feature was removed as it was rarely used and there are free Excel add-ins that offer much better chart export capabilities, such as Daniel’s XL Toolbox