In Data We Trust?

Quality guru W. Edwards Deming is often quoted as saying, “In God we trust; all others must bring data.”  But how do you know if the data can be trusted?  Read on to learn how you can ensure you have valid data for making sound business decisions…

Data Variation

Data can vary for a myriad of reasons.  Survey data can vary based on your sample location.  Click-through rate data can vary based on ad design.  Dimensional data can vary based on the instrumentation and the person taking the measurement.

Data variation may be quantified and expressed in two categories: accuracy and precision.  Accuracy is a measure of how close the measurements are to a specific true value.  Deviation from the true value is often expressed as bias, which is the difference between the average of the measurements and the true value.  Precision, on the other hand, is a measure of how close the measurements are to one another and can be viewed as the scatter of points around the average.
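To make the two terms concrete, here is a minimal sketch that computes bias and precision from repeated measurements of a single reference item.  The function name, the readings, and the 10.00 mm reference value are all hypothetical, and precision is expressed here as the sample standard deviation, which is one common choice rather than the only one.

```python
from statistics import mean, stdev

def bias_and_precision(measurements, true_value):
    """Return (bias, precision) for repeated measurements of one reference item."""
    average = mean(measurements)
    bias = average - true_value      # accuracy: offset of the average from the true value
    precision = stdev(measurements)  # precision: sample standard deviation around the average
    return bias, precision

# Hypothetical readings of a 10.00 mm reference block
readings = [10.02, 10.03, 10.01, 10.04, 10.02]
bias, precision = bias_and_precision(readings, true_value=10.00)
print(f"bias = {bias:.3f} mm, precision (std dev) = {precision:.3f} mm")
```

In this made-up data set the readings cluster tightly but sit above the reference value, so the method would be fairly precise yet biased.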

So why does this matter?  What can excessive measurement variation lead to?

Bad Decisions

We don’t collect data for data’s sake.  Ultimately, we’re looking to acquire data, analyze it, and draw a conclusion that supports a decision.  However, the quality of the decision depends heavily on the quality of the data.  Better data leads to better decisions.  As measurement variation increases, so does the probability of making an erroneous decision.
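The link between variation and wrong decisions can be illustrated with a small sketch.  It assumes that measurement error is normally distributed and that a pass or fail call is made from a single reading against two specification limits.  The function name, the limits, and the part value are hypothetical.

```python
from statistics import NormalDist

def misclassification_probability(true_value, lower, upper, sigma_measurement, bias=0.0):
    """Probability that one reading puts the item on the wrong side of the spec limits."""
    # Assumption: readings are normally distributed around the true value plus any bias
    reading = NormalDist(mu=true_value + bias, sigma=sigma_measurement)
    p_pass = reading.cdf(upper) - reading.cdf(lower)
    truly_good = lower <= true_value <= upper
    # A good item is misclassified when it fails; a bad item when it passes
    return 1.0 - p_pass if truly_good else p_pass

# Hypothetical good part sitting close to the upper limit of a 9.90 to 10.10 spec
for sigma in (0.01, 0.02, 0.05):
    p = misclassification_probability(10.08, 9.90, 10.10, sigma)
    print(f"sigma = {sigma:.2f}: P(wrong call) = {p:.3f}")
```

Under these assumptions the chance of rejecting this good part grows as the measurement standard deviation grows, and it is largest for items near a limit.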

What’s Your Type?

Rest assured, there are steps you can take to ensure that your data are valid for your intended use.  Selecting the right method relies on the type of data you have.  Data can be readily classified into two primary categories: variable data and attribute data.

Variable data are measured on a continuous scale and can take any value within a range.  They are obtained using some sort of instrumentation.  Temperature, weight, length, and time are all examples of variable data.  While the characteristic being measured varies, variation in variable measurements can also include bias and variability due to the instrument and the person taking the measurement.

Attribute data result from assessing a characteristic that can be classified into categories.  In general, these data are represented as counts.  For our purposes here, we’ll focus on simple assessments used to make a pass or fail decision.  Visual inspections, go / no-go gauges, and some automated test instruments are examples of attribute methods.  Errors in attribute methods are more common among “borderline” conditions, which have a higher probability of being misclassified.

Validation Considerations

Once the data type and the measurement method have been defined, the validation strategy may be established.

For variable methods, accuracy is determined through a process called calibration, where measurements of samples are compared to a known standard.  Precision is assessed by a specially designed experiment commonly known as Gage R&R, where R&R stands for repeatability and reproducibility, respectively.  These sources of measurement variation are typically compared to the specification range.  Low values are preferred, and a precision-to-tolerance (P/T) ratio of 10% or less is generally deemed acceptable.
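As a rough sketch of the P/T calculation, the snippet below combines repeatability and reproducibility standard deviations and compares the resulting spread to the tolerance.  It assumes the two components have already been estimated from a Gage R&R study and that a multiplier of 6 standard deviations is used for the measurement spread (some references use 5.15 instead).  The function name and the input values are hypothetical.

```python
from math import sqrt

def precision_to_tolerance(sigma_repeatability, sigma_reproducibility,
                           lower_spec, upper_spec, k=6.0):
    """P/T ratio: measurement-system spread as a fraction of the specification range."""
    # Independent variance components add, so the standard deviations combine in quadrature
    sigma_grr = sqrt(sigma_repeatability**2 + sigma_reproducibility**2)
    return k * sigma_grr / (upper_spec - lower_spec)

# Hypothetical study results against a 9.90 to 10.10 specification
pt = precision_to_tolerance(0.003, 0.004, lower_spec=9.90, upper_spec=10.10)
print(f"P/T = {pt:.1%}")
```

With these made-up inputs the ratio comes out at about 15%, which would fall short of the 10% guideline mentioned above.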

For attribute methods, a similar designed experiment may be performed.  However, the repeatability and reproducibility are more of a measure of how often the correct assessment is made.  Two common methods used are Attribute Agreement Analysis and Attribute Sampling Plans.
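A simple way to picture an attribute agreement study is to count how often each appraiser agrees with themselves across trials and how often they match a reference call.  The sketch below is only an illustration with hypothetical data and names.  A full Attribute Agreement Analysis usually adds statistics such as kappa and confidence intervals, which are omitted here.

```python
def attribute_agreement(ratings, standard):
    """Summarize pass/fail agreement per appraiser.

    ratings:  {appraiser: [trial_1_calls, trial_2_calls, ...]}, one call per part
    standard: reference call for each part
    """
    n_parts = len(standard)
    results = {}
    for appraiser, trials in ratings.items():
        # Repeatability: parts where the appraiser made the same call on every trial
        consistent = sum(len({trial[i] for trial in trials}) == 1 for i in range(n_parts))
        # Correctness: parts where every trial matched the reference call
        correct = sum(all(trial[i] == standard[i] for trial in trials) for i in range(n_parts))
        results[appraiser] = {
            "within_appraiser": consistent / n_parts,
            "vs_standard": correct / n_parts,
        }
    return results

# Hypothetical study: 5 parts, 2 appraisers, 2 trials each
standard = ["pass", "pass", "fail", "fail", "pass"]
ratings = {
    "A": [["pass", "pass", "fail", "fail", "pass"],
          ["pass", "pass", "fail", "pass", "pass"]],
    "B": [["pass", "fail", "fail", "fail", "pass"],
          ["pass", "fail", "fail", "fail", "pass"]],
}
summary = attribute_agreement(ratings, standard)
for name, scores in summary.items():
    print(name, scores)
```

In this example appraiser B is perfectly repeatable but consistently wrong on one part, which shows why agreement with the standard has to be checked separately from self-consistency.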

Who Do You Trust?

While I always trust in God, I don’t always trust the data that are presented – not unless the validity of the data is backed by a solid validation of the method producing the data.
