When measured proximity data are fitted with an empirical OPC (optical proximity correction) model for full-chip layout processing, it is assumed that the data are accurate and that the model parameter space is sufficiently well sampled. It is also assumed that outliers in the measured data are easily identifiable and that using more sample data points in the fit yields a better (more widely applicable) model. This paper addresses several key issues that arise when incorrect or insufficient data are input to such models: (1) How well can models average out random measurement noise? (2) Can a sufficiently good model fit be obtained using fewer data points? (3) How well do models interpolate proximity data? (4) How well can models calibrated to a subset of the data (e.g., only medium-range pitches) extrapolate outside that range? The approach was to start with a representative OPC proximity data set and perform model fits using different subsets of these data and different levels of additive noise. The fit results and the predictive behavior of the resulting models were then compared.
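The subset-and-noise procedure described above can be illustrated with a toy sketch. This is not the OPC model or the data set used in the paper: the pitch grid, the "true" proximity curve `true_cd`, the three-term basis in `basis`, and the helper names `fit` and `run_experiment` are all hypothetical choices made only to show the shape of such an experiment. The synthetic truth is deliberately chosen to lie inside the toy model's own function space, so any error reported comes from noise and sampling alone, not from model mismatch.

```python
import random

# Hypothetical pitch grid in nm (120, 160, ..., 1000); not taken from the paper.
PITCHES = [120 + 40 * i for i in range(23)]
# Hypothetical "true" coefficients of the toy proximity curve.
TRUE_W = [90.0, 12.0, -5.0]

def basis(pitch):
    # Toy model terms 1, x, x^2 with x = 100 / pitch (an assumption, not an OPC kernel set).
    x = 100.0 / pitch
    return [1.0, x, x * x]

def predict(w, pitch):
    return sum(wi * bi for wi, bi in zip(w, basis(pitch)))

def true_cd(pitch):
    # Noise-free synthetic CD; by construction it lies in the toy model space.
    return predict(TRUE_W, pitch)

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit(pitches, cds):
    # Ordinary least squares via the normal equations (A^T A) w = A^T y.
    rows = [basis(p) for p in pitches]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, cds)) for i in range(k)]
    return solve(ata, aty)

def run_experiment(noise_sigma, subset, seed=0):
    # Calibrate on a noisy subset, then score against the noise-free
    # truth over the full pitch range (RMS error in nm).
    rng = random.Random(seed)
    noisy = [true_cd(p) + rng.gauss(0.0, noise_sigma) for p in subset]
    w = fit(subset, noisy)
    sq = [(predict(w, p) - true_cd(p)) ** 2 for p in PITCHES]
    return (sum(sq) / len(sq)) ** 0.5

SUBSETS = {
    "all": PITCHES,                                          # full data set
    "every_other": PITCHES[::2],                             # fewer points, tests interpolation
    "mid_range": [p for p in PITCHES if 300 <= p <= 600],    # tests extrapolation
}

if __name__ == "__main__":
    for sigma in (0.0, 0.5, 2.0):
        for name, subset in SUBSETS.items():
            err = run_experiment(sigma, subset)
            print(f"sigma={sigma:3.1f} nm  subset={name:12s}  rms error={err:.4f} nm")
```

Scoring every fit against the noise-free curve over the full pitch range, rather than against the noisy calibration points, is what separates the four questions: the `all` case probes noise averaging, `every_other` probes reduced sampling and interpolation, and `mid_range` probes extrapolation. A real study would substitute measured data and the actual OPC model for these stand-ins.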