Hypothesis tests are statistical tools used to draw conclusions from data. With a basic understanding of the process and its performance, plus reliable historical data, a Six Sigma team can draw statistically sound conclusions if it (1) ensures that the measurement systems are reliable, (2) uses an adequate sample size, and (3) sets up the correct type of hypothesis test. This article covers the basics of hypothesis tests, including the different types, how to set them up, and how to read the results.
Most of the time, a Six Sigma team will use statistical software such as Minitab or SigmaXL to perform hypothesis tests. However, some tests are also available through the Analysis ToolPak add-in in Microsoft Excel.
Hypothesis tests cover three broad categories:
Testing whether your data fits a data model. Usually, the Six Sigma team refers to the p-value from a Chi-Squared Goodness-of-Fit test or a Normality test to determine the data's distribution type; ultimately, these are hypothesis tests.
Comparing a statistic to a hypothesis about the data or population.
Testing whether something changed within the data, often after a team has modified an input or another part of the process. In most Six Sigma projects, the team wants to determine whether the process or its outcome has improved.
While the type of hypothesis test you use depends on the answers you are seeking and the type of data you have, all of the tests follow essentially the same guidelines:
You begin with a statistic or criterion, usually computed from your sample data
You create a null hypothesis and an alternative hypothesis, in keeping with the type of test you are running
The statistic or criterion is compared against a reference criterion or distribution
How the calculated statistic compares to the reference determines whether you accept the null hypothesis or reject it in favor of the alternative hypothesis
Hypothesis tests are a large part of inferential statistics, in which we draw conclusions about an overall process or population by analyzing sample data and measurements. When stating hypotheses, we are not making statements about the sample; we are making statements about the population or the entire process.
For example, a hypothesis might be "the population mean is 5." We don't need a hypothesis about the sample mean; we can simply calculate it.
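As a minimal sketch of how this plays out in practice, the one-sample t-test below checks the hypothesis "the population mean is 5" against a small sample. The data values are made up for illustration:

```python
# Hypothetical illustration: testing H0 "the population mean is 5"
# against Ha "the population mean is not 5" with a one-sample t-test.
# The sample values below are invented for demonstration only.
from scipy import stats

sample = [5.1, 4.8, 5.3, 4.9, 5.2, 5.0, 4.7, 5.4, 5.1, 4.9]

# ttest_1samp compares the sample mean against the hypothesized population mean
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)

# With the usual alpha = 0.05: if p_value > alpha, we accept (fail to reject) H0
alpha = 0.05
if p_value > alpha:
    print("Accept H0: no evidence the population mean differs from 5")
else:
    print("Reject H0 in favor of Ha: the population mean differs from 5")
```

Note that the sample mean here (about 5.04) is close to 5, so the test does not reject the null hypothesis.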
Null versus Alternative Hypothesis
Hypothesis tests have two main parts: the null hypothesis and the alternative hypothesis.
The null hypothesis is abbreviated H0 and is usually a statement about the data that reflects no effect or no difference. For example, when the Six Sigma practitioner refers to the p-value of a Normality test and the p-value > 0.05, we treat the data as normal. In effect, we are saying, "there is no statistical difference between the distribution of our data and the distribution of data on a normal curve."
The alternative hypothesis is abbreviated Ha or H1 and is usually a statement that is likely to be true if the null hypothesis is not. In a Normality test, if the p-value < 0.05, the alternative hypothesis is "there is a statistical difference between the distribution of our data and the distribution of data on a normal curve." In short, if we reject the null hypothesis, we accept the alternative hypothesis: in this case, that our data is not normal.
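The Normality-test logic above can be sketched with the Shapiro-Wilk test, one common normality test. Both data sets here are simulated, so the example is illustrative rather than a real project result:

```python
# Hypothetical sketch: using the Shapiro-Wilk test to decide between
# H0 "the data is normally distributed" and Ha "the data is not".
# Both samples are randomly generated for demonstration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
normal_data = rng.normal(loc=10, scale=2, size=100)  # drawn from a normal curve
skewed_data = rng.exponential(scale=2, size=100)     # clearly non-normal

stat_n, p_normal = stats.shapiro(normal_data)
stat_s, p_skewed = stats.shapiro(skewed_data)

# p > 0.05: accept H0 and treat the data as normal; p < 0.05: accept Ha
for name, p in [("normal sample", p_normal), ("skewed sample", p_skewed)]:
    verdict = "normal" if p > 0.05 else "not normal"
    print(f"{name}: p = {p:.4f} -> {verdict}")
```

As expected, the exponential (skewed) sample produces a very small p-value, so we reject its null hypothesis of normality.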
Typically, the null hypothesis is an equality statement of some type: the mean of the new process is equal to the mean of the old process, or the distribution of the data is equal to the normal curve.
The alternative hypothesis is typically written as a not-equals, greater-than, or less-than statement: the mean of the new process is greater than the mean of the old process, or the distribution of the data is not equal to the normal curve. How you write the alternative hypothesis depends on your question and the type of hypothesis test you are running.
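To show how the wording of the alternative hypothesis changes the setup, the sketch below runs the same two-sample t-test twice: once with a two-sided (not-equals) alternative and once with a one-sided (greater-than) alternative, via scipy's `alternative` parameter. The process data is made up:

```python
# Hypothetical sketch: the same data, two different alternative hypotheses.
# The `alternative` parameter selects a two-sided, greater-than, or
# less-than Ha for the same test statistic. Data is invented.
from scipy import stats

old_process = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2, 12.3, 11.7]
new_process = [12.9, 13.1, 12.7, 13.3, 12.8, 13.0, 13.2, 12.6]

# Ha: mean of new process != mean of old process (two-sided)
_, p_two_sided = stats.ttest_ind(new_process, old_process, alternative="two-sided")

# Ha: mean of new process > mean of old process (one-sided)
_, p_greater = stats.ttest_ind(new_process, old_process, alternative="greater")

print(f"two-sided p = {p_two_sided:.4g}, one-sided p = {p_greater:.4g}")
```

Because the new-process mean really is higher here, the one-sided p-value is half the two-sided one, and both lead to rejecting the null hypothesis.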
The risk of hypothesis error
Anytime you draw inferences about a population from sample data, there is at least some likelihood of error. With hypothesis testing, errors come in two types.
Type I Error:
The null hypothesis is rejected when it is actually true (False Positive). Also called the producer's risk. The probability of this risk is measured by alpha, where α is a probability between 0 and 1.
Type II Error:
The null hypothesis is accepted when it is actually false (False Negative). Also called the consumer's risk. The probability of this risk is measured by beta, where β is a probability between 0 and 1.
The risk is the trade-off of the accepted confidence level for our hypothesis test. The most common confidence level is 95%, which corresponds to α = 0.05. Typically, the confidence level is set with the Type I error in mind, so the confidence level equals 1 − α. The value of β then feeds into the sample size requirements and the power of the test (1 − β).
Selecting the right hypothesis test
Four main factors determine the type of hypothesis test:
The type of data you have (continuous/variable or discrete/attribute)
The number of levels of interest for the input in question (1, 2, or more than 2)
The distribution of the data (normal or non-normal)
What you are testing (means, medians, variance, counts, or proportions)
Hence, it is crucial to understand the background of the data and the objective of performing the hypothesis test.
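One small slice of that decision logic can be sketched in code: when comparing two groups of continuous data, check normality first, then choose between a parametric test (two-sample t-test on means) and a non-parametric one (Mann-Whitney U on medians). The helper function and its data are hypothetical:

```python
# Hypothetical sketch of test selection for two groups of continuous data:
# check normality first, then pick a parametric or non-parametric test.
from scipy import stats

def compare_two_groups(a, b, alpha=0.05):
    """Return the chosen test name and its p-value."""
    # If either group looks non-normal, fall back to a median-based test
    if stats.shapiro(a).pvalue < alpha or stats.shapiro(b).pvalue < alpha:
        return "Mann-Whitney U", stats.mannwhitneyu(a, b).pvalue
    return "two-sample t-test", stats.ttest_ind(a, b).pvalue

# Made-up before/after process data for demonstration
old = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2, 12.3, 11.7]
new = [12.9, 13.1, 12.7, 13.3, 12.8, 13.0, 13.2, 12.6]
test_name, p = compare_two_groups(new, old)
print(f"Chosen test: {test_name}, p = {p:.4g}")
```

A full selection flow would also branch on discrete data, the number of levels, and whether variances or proportions are of interest; this covers only the two-group continuous case.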
The hypothesis test is a robust tool that helps the Six Sigma practitioner make accurate and cost-effective decisions based on data. However, like other statistical tools, it is prone to error if it is not set up correctly.
Check out our Six Sigma courses to build the skillset of a successful Lean Six Sigma professional. Improve your team's efficiency with the right skillset!