OMB Meeting Book - January 8, 2015


INTRODUCTION

Proficiency testing (‘PT’) is an economical approach to a multicollaborator study whose principal goal is to measure a participating collaborator's result with respect to the mass of the other collaborators' results. PT studies are generally performed for a nominal (middle) concentration of analyte in a particular matrix. Participants may use nominally the same method, but typically there is no direct control over the exact protocol used. Replication may or may not be present, and may vary among participants, sometimes without disclosure.

Traditionally, ‘robust’ statistical methodology has been used to analyze PT data. In TR322, the use of such statistics for estimating reproducibility was deprecated. Here the issues related to robust statistics are discussed, and indications are given as to when such methodology might actually make sense.

MEASURE OF CENTER (LOCATION)

The original use of robust statistics was with respect to measures of centrality, i.e., the center point of the distribution. The arithmetic mean (first moment) has many good theoretical properties, particularly when a normal distribution is present, but is subject to influence by outliers (with a coefficient of 1/n, where n is the number of data in the sample).

When far or multiple outliers are suspected to be present, there are two general policies in use:

1. Remove the outlier for cause, if investigation and subject-matter expertise show that the data point involved was subject to crude error, contamination or other gross failure of methodology. (Statistical identification of outliers may be helpful, but removal solely upon such identification is deprecated.) After removal of any outliers, the usual statistics (e.g., arithmetic mean and standard deviation) are estimated from the remaining data.

2. Do not remove outliers, but remove their influence. This is done by using ‘robust’ statistics that give less weight to data in the far tails.
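
As a rough numerical illustration of the 1/n influence of a single outlier on the arithmetic mean, the following sketch displaces one result by a large amount and compares the change in the mean with the change in the median, a familiar robust statistic. The data values, the size of the displacement and all variable names are hypothetical and chosen only for illustration; they are not taken from any PT study.

```python
from statistics import mean, median

# Hypothetical PT results from n = 10 collaborators (illustrative values only).
results = [9.8, 9.9, 10.0, 10.0, 10.1, 10.1, 10.2, 10.2, 10.3, 10.4]
n = len(results)

# Displace the largest result by `shift` to mimic a gross outlier.
shift = 50.0
contaminated = results[:-1] + [results[-1] + shift]

# The mean moves by shift / n; the median of this data set does not move.
mean_change = mean(contaminated) - mean(results)
median_change = median(contaminated) - median(results)

print(f"mean change   = {mean_change:.3f} (shift / n = {shift / n:.3f})")
print(f"median change = {median_change:.3f}")
```

In this sketch the mean moves by the displacement divided by n, however far the outlier lies, whereas the median is unchanged because the displaced value stays in the upper tail.
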
Examples of such robust statistics as measures of center are:

2.1. Median.

2.2. α-trimmed mean (where a fraction α of the data are removed from each tail).

The median may be interpreted as a 50%-trimmed mean, in which case both of the above examples are of the same class. Trimming eliminates the influence of far outliers and concentrates estimation on the center points of the distribution. The immunity to outliers increases with α, which typically is 10%, 25% or 50%.

Using data exclusively from the center of the empirical distribution to find a good measure of the location of the center of the distribution is non-controversial. Immunizing this measure against


Recommended to OMB by Committee on Statistics: 07-17-2013 Reviewed and approved by OMB: 07-18-2013
