Data Quality Considerations in Using the PRISM Program

The Princeton University PRISM program represents the current state of the art in computerized analysis of residential energy use data. We have found it quite user-friendly, and it normally produces realistic and robust estimates. However, our testing and production experience also shows that its results can be seriously in error if insufficient attention is given to several crucial data quality issues.

Each data set should ideally contain 12 monthly readings of gas or electric use spanning one heating season, though each of these standards can be relaxed somewhat. We have used periods of 10 to 16 months with few problems, but with periods of 8 to 9 months it becomes critical that both summer baseload use and midwinter peak use be represented if results are to be reliable. Following analysis, we further eliminate cases with a regression R² under 95%, a coefficient of variation (CV) of normalized annual consumption (NAC) over 5%, undefined or extreme parameter standard errors, or certain commonly seen parameter deviations (e.g., a reference temperature outside the 50-70°F range, or a baseload near or below zero for a combined-use account that should have a positive value).
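The post-analysis screening described above can be sketched in code. The following is a minimal illustration in Python, not part of PRISM itself: the `FitResult` and `Thresholds` classes, their field names, and the `screen` function are hypothetical, and the cutoffs for an "extreme" standard error and a "near zero" baseload are assumed values, since no specific figures are given for them here.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FitResult:
    """Hypothetical per-account summary of a fit (not PRISM's own output format)."""
    r2: float                      # regression R-squared as a fraction (0 to 1)
    cv_nac: float                  # CV of NAC as a fraction (0 to 1)
    ref_temp_f: float              # estimated reference temperature, deg F
    baseload: float                # estimated baseload use per day
    se_ref_temp: Optional[float]   # standard error; None if undefined
    combined_use: bool             # True if the account should have a positive baseload


@dataclass(frozen=True)
class Thresholds:
    min_r2: float = 0.95                                  # from the text: R2 under 95%
    max_cv_nac: float = 0.05                              # from the text: CV of NAC over 5%
    ref_temp_range_f: Tuple[float, float] = (50.0, 70.0)  # from the text: 50-70 deg F
    max_se_ref_temp: float = 10.0                         # ASSUMED cutoff for "extreme"
    min_baseload: float = 0.01                            # ASSUMED cutoff for "near zero"


def screen(fit: FitResult, t: Optional[Thresholds] = None) -> List[str]:
    """Return the reasons a case would be eliminated; an empty list means it passes."""
    t = t or Thresholds()
    reasons = []
    if fit.r2 < t.min_r2:
        reasons.append("R2 below minimum")
    if fit.cv_nac > t.max_cv_nac:
        reasons.append("CV of NAC above maximum")
    se = fit.se_ref_temp
    if se is None or math.isnan(se) or math.isinf(se):
        reasons.append("standard error undefined")
    elif se > t.max_se_ref_temp:
        reasons.append("standard error extreme")
    low, high = t.ref_temp_range_f
    if not (low <= fit.ref_temp_f <= high):
        reasons.append("reference temperature out of range")
    if fit.combined_use and fit.baseload < t.min_baseload:
        reasons.append("baseload near or below zero")
    return reasons
```

Keeping the cutoffs in a separate `Thresholds` object is a deliberate choice in this sketch, so that a different set of standards can be passed in for a particular analysis without changing the screening logic.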

These standards can be modified for differences in reading frequency, heating fuel, geographic/climatic area, sample size, and intended use of the analysis.
