Education Tips Blog

All Articles About Education Tips For Learning And Studying Online.

Chi-square Analysis For Attribute Data

Published by admin | Filed under Education

What Is A Chi-Square Test?

- The probability density curve of a chi-square distribution is an asymmetric curve that stretches over the positive side of the axis and has a long right tail.

- The form of the curve depends on the value of the degrees of freedom.
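
To make the shape of the curve concrete, here is a minimal Python sketch of the standard chi-square density. The function name `chi2_pdf` and the sample points are only illustrative choices; the formula itself is the textbook density for `df` degrees of freedom.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square distribution with `df` degrees of freedom."""
    # The curve lives on the positive side of the axis
    # (the x = 0 edge case is ignored in this sketch).
    if x <= 0:
        return 0.0
    half = df / 2.0
    return x ** (half - 1.0) * math.exp(-x / 2.0) / (2.0 ** half * math.gamma(half))

# Evaluate the density at a few points for several degrees of freedom.
# As df grows, the peak moves to the right while the right tail stays long.
for df in (2, 4, 8):
    values = [round(chi2_pdf(x, df), 4) for x in (1, 3, 6, 12)]
    print(df, values)
```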

Types of Chi-Square Analysis:

- Chi-square Test for Association is a test of statistical significance widely used in bivariate tabular association analysis. It is non-parametric, so it can be used for nominal data.

- Typically, the hypothesis tested is whether two populations differ in some characteristic or aspect of their behavior, based on two random samples.

- This test procedure is also known as the Pearson chi-square test.

- Chi-square Goodness-of-fit Test is used to test whether an observed distribution conforms to a particular distribution. The test is calculated by comparing the observed data with the data expected under that distribution.
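
As a rough illustration of the goodness-of-fit idea, the Python sketch below compares observed counts with the counts expected under a chosen distribution. The function name and the die-roll counts are made up for the example.

```python
def goodness_of_fit_statistic(observed, expected_proportions):
    """Return the chi-square statistic and the expected counts."""
    total = sum(observed)
    # Expected count per category under the hypothesized distribution.
    expected = [total * p for p in expected_proportions]
    # Sum of (observed - expected)^2 / expected over all categories.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected

# Hypothetical data: 120 rolls of a die, tested against a fair die.
observed = [18, 22, 16, 25, 19, 20]
statistic, expected = goodness_of_fit_statistic(observed, [1 / 6] * 6)

degrees_of_freedom = len(observed) - 1
print(round(statistic, 3), degrees_of_freedom)  # 2.5 5
```

With 5 degrees of freedom the 5% critical value is about 11.07, so a statistic of 2.5 gives no reason to doubt that the die is fair.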

When to apply a Chi-Squared Test:

- Chi-Squared test is used to determine if there is a statistically significant difference in the proportions for different groups. To accomplish this, it breaks all outcomes into groups.

What the Chi-Squared Test does:

- It starts by determining how many defects, for example, would be “expected” in each group involved.

- It does this by assuming that all groups have the same defect rate (which Minitab approximates from the data provided).

- Minitab then compares the expected counts with what was actually observed.

- If the numbers are different by a large enough amount, Chi-Square determines that the groups do not have the same proportion.
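
The steps above can be sketched in a few lines of Python. This is a minimal sketch of the standard Pearson calculation, not Minitab's implementation; the function name and the defect counts are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square for a table of counts (rows = groups, columns = outcomes)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected counts assume every group shares the same overall rate.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # Compare what was observed with what was expected, cell by cell.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Hypothetical data: two groups, columns are [defects, non-defects].
table = [[12, 88], [30, 70]]
statistic, df, expected = chi_square_statistic(table)
print(round(statistic, 2), df, expected[0])  # 9.76 1 [21.0, 79.0]
```

Both groups are "expected" to have 21 defects out of 100. The observed 12 and 30 are far enough from 21 that the statistic (about 9.76) exceeds the 5% critical value of 3.84 for 1 degree of freedom.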

Chi-Square Requirements:

- Data is typically attribute (discrete). At the very least, every data point must be assignable to one category or another.

- Expected cell counts should not be low (definitely not less than 1 and preferably not less than 5), as this could lead to a false positive: an indication that there is a difference when, in fact, none exists.

Chi-Square Hypotheses:

- Ho: The null hypothesis is that the populations have the same proportions. A P-Value > 0.05 means there is not enough evidence to reject Ho.

- Ha: The alternate hypothesis is that the populations do NOT have the same proportions. A P-Value <= 0.05 means Ho is rejected in favor of Ha.
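
The decision rule can be sketched as follows. The shortcut used for the P-Value is exact only for 1 degree of freedom (a 2 x 2 table); other table sizes need a general chi-square distribution function from a statistics package. The function names and the example statistic are hypothetical.

```python
import math

def p_value_df1(statistic):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # For df = 1 the statistic is the square of a standard normal value,
    # so the tail probability can be written with the complementary error function.
    return math.erfc(math.sqrt(statistic / 2.0))

def reject_null(p_value, alpha=0.05):
    """True when Ho (same proportions) is rejected at the given alpha."""
    return p_value <= alpha

statistic = 9.76  # hypothetical chi-square statistic from a 2 x 2 table
p = p_value_df1(statistic)
if reject_null(p):
    print("Reject Ho: the proportions differ")
else:
    print("Fail to reject Ho: no evidence the proportions differ")
```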

Note: if any expected cell counts are below 5, Minitab will print a warning. The warning is generated because the expected count is in the denominator of each term, so a small expected value can create an artificially large chi-square statistic. This is particularly troublesome if more than 20% of the cells have expected counts less than 5 and their contribution to the overall chi-square statistic is considerable.

Additionally, if any of the expected cell counts are below 1, Minitab will not even produce a p-value since the chi-square statistic is sure to be artificially inflated. In either of these cases, the binomial distribution (Minitab: Stat/ ANOVA/ Analysis of Means) may be a usable alternative.
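
A quick way to screen a table before trusting the result is to apply these rules of thumb to the expected counts. The sketch below only mirrors the thresholds described in this article (1, 5, and 20%); it is not how Minitab itself is implemented, and the function name and sample counts are hypothetical.

```python
def check_expected_counts(expected):
    """Flag expected cell counts that make the chi-square statistic unreliable."""
    cells = [e for row in expected for e in row]
    below_1 = sum(1 for e in cells if e < 1)
    below_5 = sum(1 for e in cells if e < 5)
    share_below_5 = below_5 / len(cells)
    return {
        "any_below_1": below_1 > 0,           # statistic is not trustworthy at all
        "any_below_5": below_5 > 0,           # worth a warning
        "share_below_5": share_below_5,
        "over_20_percent": share_below_5 > 0.20,
    }

# Hypothetical expected counts for a 2 x 2 table.
report = check_expected_counts([[10.5, 4.5], [10.5, 4.5]])
print(report)
```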

Lastly: an Attribute Gage R&R (AR&R) or Kappa Test showing an acceptable level of measurement system error is needed before running a Chi-Square Analysis.

Tips:

- Determine the subgroups and categories to be tested for variation (differences in proportions) as part of your data collection plan.

- Define the operational definitions for success/defect, the stratification layers (subgroups), and the Cause & Effect diagram (fishbone) to pre-determine where the team believes differences in proportions may exist.

- Continuous (Variable) data can usually be converted into Discrete (Attribute) data by using categories.

(Example: cycle time is continuous (1 hr, 1.5 hr, 2 hr) but can be categorized as Cycle Time Met = 1, where success is cycle time <= 8 hrs, or Cycle Time Not Met = 0, where a defect is cycle time > 8 hrs.)
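
The cycle time example can be sketched in Python like this. The 8 hour limit comes from the example above; the function name and the list of cycle times are hypothetical.

```python
def cycle_time_met(hours, limit_hours=8.0):
    """Return 1 (Cycle Time Met) when hours <= limit, else 0 (Cycle Time Not Met)."""
    return 1 if hours <= limit_hours else 0

# Hypothetical continuous measurements, in hours.
cycle_times = [1.0, 1.5, 2.0, 9.25, 8.0, 12.5]

flags = [cycle_time_met(t) for t in cycle_times]
met = sum(flags)
not_met = len(flags) - met
print(flags, met, not_met)  # [1, 1, 1, 0, 1, 0] 4 2
```

The resulting counts of Met and Not Met per subgroup are the attribute data that go into the chi-square table.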

Tricks:

- An MSA study (Attribute R&R / Kappa Analysis for discrete data, or Gage R&R for continuous (variable) data) is run before calculating the Chi-Square Test to ensure that the measurement variation is < 10% Contribution. If the measurement variation is > 10%, the variation you see in the Chi-Square Test is not valid, because too much of it comes from your measurement system (> 10% MSA error) and not from your process variation.
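
For attribute data, one common Kappa statistic is Cohen's kappa, which measures how well two appraisers agree beyond what chance alone would produce. The sketch below is only an illustration of that formula with made-up pass/fail ratings; it does not calculate the % Contribution figure from a Gage R&R study, and a full Attribute R&R in a statistics package covers more than this.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two appraisers rating the same items."""
    n = len(ratings_a)
    # Share of items on which the two appraisers agree.
    observed = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b) / n
    # Agreement expected by chance, from each appraiser's category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    # Undefined when expected == 1 (both appraisers always use one category).
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of the same 10 items by two appraisers.
appraiser_a = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass", "fail", "pass"]
appraiser_b = ["pass", "pass", "fail", "fail", "fail", "fail", "pass", "pass", "pass", "pass"]

kappa = cohens_kappa(appraiser_a, appraiser_b)
print(round(kappa, 3))  # 0.583
```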

Steven Bonacorsi is a Senior Master Black Belt instructor and coach. He has trained hundreds of Master Black Belts, Black Belts, Green Belts, Project Sponsors, and Executive Leaders in Lean Six Sigma DMAIC and Design for Lean Six Sigma process improvement methodologies.

Bonacorsi Consulting, LLC.
Steven Bonacorsi, President
Lean Six Sigma Master Black Belt
14 Clinton Street
Salem NH 03079
sbonacorsi@comcast.net
603-401-7047

March 1st, 2008
