Criticality safety analyses rely on the availability of relevant benchmark experiments to determine justifiable margins of subcriticality. When a target application lacks neutronically similar benchmark experiments, validation studies must justify to the regulator that the impact of modeling and simulation limitations is well understood for the application, and they often must provide additional subcritical margin to ensure safe operating conditions. This study estimated the computational bias in the critical eigenvalue for several criticality safety applications supported by only a few relevant benchmark experiments. The accuracy of the following three methods for predicting computational biases was evaluated: the Upper Subcritical Limit STATisticS (USLSTATS) trending analysis method; the Whisper nonparametric method; and TSURFER, which is based on the generalized linear least-squares (GLLS) technique. These methods were also applied to estimate computational biases and recommended upper subcritical limits for several critical experiments with known biases and for several cases from a blind benchmark study. The methods were evaluated on both the accuracy of their predicted computational bias and upper subcritical limit estimates and the consistency of those estimates as the model parameters, covariance data libraries, and set of available benchmark experiments were varied. Data assimilation methods have not typically been used for criticality safety licensing activities, and this study explores a methodology to address concerns regarding the reliability of such methods for criticality safety bias prediction.
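For context, a generic sketch of the two bias treatments compared here follows; the notation is assumed for illustration and the exact statistical treatment (e.g., the form of the confidence-band width) varies by implementation. In a USLSTATS-style trending analysis, the upper subcritical limit is commonly constructed as

\[
\mathrm{USL}(x) \;=\; 1.0 \;+\; \beta(x) \;-\; \Delta\beta(x) \;-\; \Delta_{\mathrm{SM}},
\]

where \(x\) is the trending parameter, \(\beta(x)\) is the trended computational bias (conservatively taken as nonpositive), \(\Delta\beta(x)\) is the bias uncertainty, and \(\Delta_{\mathrm{SM}}\) is an additional subcritical margin. In the GLLS setting underlying TSURFER, the application bias is instead projected from benchmark discrepancies; schematically,

\[
\beta_a \;\approx\; \mathbf{S}_a \mathbf{C}_{\alpha\alpha} \mathbf{S}^{\mathsf{T}} \left( \mathbf{S}\,\mathbf{C}_{\alpha\alpha}\,\mathbf{S}^{\mathsf{T}} + \mathbf{C}_{mm} \right)^{-1} \mathbf{d},
\]

where \(\mathbf{S}\) and \(\mathbf{S}_a\) are the benchmark and application eigenvalue sensitivity vectors with respect to the nuclear data \(\alpha\), \(\mathbf{C}_{\alpha\alpha}\) and \(\mathbf{C}_{mm}\) are the nuclear data and benchmark measurement covariance matrices, and \(\mathbf{d}\) is the vector of calculated-minus-measured eigenvalue discrepancies. Sign conventions differ among codes; this form is intended only to indicate how the covariance data libraries and benchmark set enter the bias estimate.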