## 11 Jun A town official claims that the average vehicle in their area sells for more than the 40th percentile of your data set

A town official claims that the average vehicle in their area sells for **more than** the 40th percentile of your data set. Using the data you obtained in Week 1, as well as the summary statistics you found for the original data set (excluding the supercar outlier), run a hypothesis test to determine if the claim can be supported. Make sure you state all the important values so your fellow classmates can use them to run a hypothesis test as well. Use the descriptive statistics you found during Week 2, NOT the new SD you found during Week 4, because we are using the original data set of 10 observations, NOT a new smaller sample. Use alpha = .05 to test your claim.

(Note: You will want to use the function =PERCENTILE.INC in Excel to find the 40th percentile of your data set. Hopefully this Excel function looks familiar to you from Week 2.)
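For anyone who wants to cross-check the Excel result outside a spreadsheet, here is a minimal Python sketch of the inclusive percentile rule that =PERCENTILE.INC is understood to use (linear interpolation between the ranked values). The function name `percentile_inc` and the prices are hypothetical, for illustration only; they are not course data.

```python
def percentile_inc(values, k):
    """Inclusive percentile with linear interpolation, for 0 <= k <= 1."""
    if not values or not 0 <= k <= 1:
        raise ValueError("need at least one value and 0 <= k <= 1")
    ordered = sorted(values)
    rank = k * (len(ordered) - 1)           # zero-based fractional position
    lower = int(rank)                       # index at or just below the rank
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

# Hypothetical vehicle prices (assumption, not the Week 1 data).
prices = [18000, 21000, 24000, 27000, 30000]
p40 = percentile_inc(prices, 0.40)
print(round(p40, 2))
```

The value returned here plays the role of the hypothesized mean in the test described above.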

First determine if you are using a *z* or *t*-test and explain why. Then conduct a four-step hypothesis test including a sentence at the end justifying the support or lack of support for the claim and why you made that choice.

I encourage you to review the *Week 6 Hypothesis Testing PDF* at the bottom of the discussion. This will give you a step-by-step example of how to calculate and run a hypothesis test using Excel. I DO NOT recommend doing this by hand. Let Excel do the heavy lifting for you. You can also use this PDF in the Quizzes section.

Five additional PDFs were created to help you with the Homework, Lessons and Tests in the Quizzes section. While they won't be used to answer the questions in the discussion, they are just as useful and beneficial. I encourage you to review these ASAP! These PDFs are also located at the bottom of the discussion.

In this document we will discuss 2-sample Z hypothesis testing and confidence intervals that use means and known population standard deviations (S).

This PDF uses Z critical values because you are working with sample means and known population S's.

There are still 3 different hypothesis scenarios with a 2-sample Z hypothesis test.

Lower Tailed Test (1 tail):

Ho: $\mu_1 - \mu_2 = 0$

Ha: $\mu_1 - \mu_2 < 0$

Upper Tailed Test (1 tail):

Ho: $\mu_1 - \mu_2 = 0$

Ha: $\mu_1 - \mu_2 > 0$

Two Tailed Test:

Ho: $\mu_1 - \mu_2 = 0$

Ha: $\mu_1 - \mu_2 \neq 0$

The hypothesized value is 0, and the same key words from a 1-sample hypothesis test determine which scenario to use. $\mu_1 - \mu_2$ is the difference between the population average for the first group and the population average for the second group.

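The Z-Test Statistic is:

$$Z = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$

Where $S_1$ and $S_2$ are the population standard deviations, $\bar{x}_1$ and $\bar{x}_2$ are the sample averages, and $n_1$ and $n_2$ are the sample sizes.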

We can use =NORM.S.DIST to find the p-values. These should look familiar from

the discussion forum.

Example:

A dietitian has developed a diet that is low in fats, carbs, and cholesterol. The dietitian wishes to examine the effects this diet has on the weights of obese people. Two random samples of 30 obese people each are selected, and one group of 30 people is placed on the low-fat diet. The other 30 people are placed on a diet that contains approximately the same quantity of food but is not low in fats, carbs, and cholesterol. For each person, the amount of weight lost (or gained) in a three-week period is recorded. Is there a difference in the population mean weight losses for the two diets? The population S1 = 4.67 and the population S2 = 4.04. Use alpha = .05. Here we are given the raw data set.

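| WL Low Diet | WL Regular Diet |
|---|---|
| 8 | 6 |
| 21 | 14 |
| 13 | 4 |
| 8 | 6 |
| 11 | 13 |
| 4 | 11 |
| 3 | 11 |
| 6 | 8 |
| 16 | 14 |
| 5 | 8 |
| 10 | 6 |
| 8 | 4 |
| 8 | 12 |
| 12 | 2 |
| 7 | 1 |
| 3 | 2 |
| 12 | 6 |
| 14 | 1 |
| 16 | 0 |
| 11 | 9 |
| 10 | 5 |
| 9 | 10 |
| 10 | 6 |
| 8 | 6 |
| 14 | 9 |
| 3 | 8 |
| 7 | 3 |
| 14 | 1 |
| 11 | 7 |
| 14 | 8 |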
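As a quick check on the raw data, the sketch below reads the values above as (WL Low Diet, WL Regular Diet) pairs in the order they appear and computes each sample mean with Python's standard library. The variable names are my own; the results should reproduce the means of about 9.8667 and 6.7 that appear in the Excel output for this example.

```python
from statistics import mean

# Weight loss values, read as (low diet, regular diet) pairs in order of appearance.
low_diet = [8, 21, 13, 8, 11, 4, 3, 6, 16, 5, 10, 8, 8, 12, 7,
            3, 12, 14, 16, 11, 10, 9, 10, 8, 14, 3, 7, 14, 11, 14]
regular_diet = [6, 14, 4, 6, 13, 11, 11, 8, 14, 8, 6, 4, 12, 2, 1,
                2, 6, 1, 0, 9, 5, 10, 6, 6, 9, 8, 3, 1, 7, 8]

mean_low = mean(low_diet)
mean_regular = mean(regular_diet)
print(len(low_diet), len(regular_diet))
print(round(mean_low, 5), round(mean_regular, 5))
```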

The first step is to state the hypothesis scenario. Because the key word is "difference," this is a two tailed test.

Ho: $\mu_1 - \mu_2 = 0$

Ha: $\mu_1 - \mu_2 \neq 0$

Before we start calculating anything by hand and because we are given the raw

data set, we can actually run this hypothesis test in Excel. And since you installed

the Data Analysis Toolpak it is easy to do. First you need to input this Raw Data

into Excel.

Then go to Data -> Data Analysis, scroll to z-Test: Two Sample for Means, and click OK.

Under Input:

Variable 1 Range: you will highlight the WL Low Diet column and make sure you

include the top row where the Label is located.

Variable 2 Range: you will highlight the WL Regular Diet column and make sure

you include the top row where the Label is located.

Hypothesized Mean Difference: can be left as 0.

Variable 1 Variance (known): Here is where you will put the known variance for the first sample. In the problem we are given the known standard deviation, so to find the variance we square it: 4.67² = 21.8089. Input that value in the box.

Variable 2 Variance (known): Here is where you will put the known variance for the second sample. Again we are given the known standard deviation, so we square it: 4.04² = 16.3216. Input that value in the box.

Check the “Labels” box because we included the first row of labels. For Alpha, put 0.05, but this can be changed depending on what significance level you use. Then make sure the bubble for New Worksheet Ply is selected and click OK. It should look similar to the screenshot below.

Once you click OK in a new Worksheet this should populate.

z-Test: Two Sample for Means

| | WL Low Diet | WL Regular Diet |
|---|---|---|
| Mean | 9.866666667 | 6.7 |
| Known Variance | 21.8089 | 16.3216 |
| Observations | 30 | 30 |
| Hypothesized Mean Difference | 0 | |
| z | 2.808838232 | |
| P(Z<=z) one-tail | 0.002486031 | |
| z Critical one-tail | 1.644853627 | |
| P(Z<=z) two-tail | 0.004972062 | |
| z Critical two-tail | 1.959963985 | |

Here we have all the values we need to state a conclusion.

We see the Z – Test Statistic = 2.8088 and because we ran a two tailed test the

p-value = .00497.

p -value = .00497 < .05. This p-value is less than .05 which means we Reject Ho.

Yes, there is statistical evidence that there is a difference in the population mean

weight losses for the two diets.

If we were running a 1-tailed test, the output gives that p-value as well: .002486. The Z-Test Statistic is the same, and so is the conclusion for a 1-tailed test.

Using Excel to run a hypothesis test when we are given the Raw Data is very

convenient. But if we aren’t given the Raw Data and we are given the averages

and known S’s we will need to compute the Z-Test Stat by hand and then use the

Excel function to find the p-value.

To find the Z-Test Stat we will use this equation and plug in what we know. You should know by now how to calculate the average using Excel, which is what I did here.

$$Z = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$

$$Z = \frac{9.86667 - 6.7 - 0}{\sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}}$$

$$Z = \frac{3.16667}{\sqrt{.72696333 + .54405333}}$$

$$Z = \frac{3.16667}{\sqrt{1.27101666}}$$

$$Z = \frac{3.16667}{1.1273937}$$

$$Z = 2.80884$$

When we calculate the Test Stat by hand using algebra we get the same value.
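The same arithmetic can be sketched in Python when only the averages and known S's are available. This is a minimal illustration; the function name `two_sample_z` and its parameter names are my own, and the inputs are the values from the diet example.

```python
from math import sqrt

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2, hypothesized_diff=0.0):
    """Z statistic for two sample means with known population SDs."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2 - hypothesized_diff) / standard_error

# Values from the diet example: means, population S's, and sample sizes.
z = two_sample_z(9.866666667, 6.7, 4.67, 4.04, 30, 30)
print(round(z, 5))
```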

Next, we need to find the p-value. We will use the =NORM.S.DIST function to find the p-value.

In Excel input =NORM.S.DIST(2.80884,TRUE) and hit Enter. We type TRUE because we want the cumulative probability (the area to the left of the Z-Test Stat).

We see this p-value = .997514 BUT remember when we use this function in Excel,

this function is in the less than form. This means if we were running a Lower

Tailed test, this would be our p-value. If we were running an Upper Tailed Test

we need to take 1 – .997514 to get the p-value for our test.

P-value = 1 – .997514 = .002486. This is the p-value for an upper tailed test.

But since we are running a Two Tailed Test, we take whichever p-value is smaller and multiply it by 2.

p-value = .002486*2 = .004972. This is the p-value for a two tailed test. If we compare these values to the Excel output, they are the same, and we draw the same conclusion.
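For readers working outside Excel, the three p-values above can be sketched with `statistics.NormalDist` from the Python standard library, whose `cdf` method returns the same "less than" area as =NORM.S.DIST(z,TRUE). The variable names are my own.

```python
from statistics import NormalDist

z = 2.80884  # Z-Test Stat from the diet example

lower_tail = NormalDist().cdf(z)             # "less than" area, like =NORM.S.DIST(z,TRUE)
upper_tail = 1 - lower_tail                  # p-value for an upper tailed test
two_tail = 2 * min(lower_tail, upper_tail)   # smaller tail doubled for a two tailed test

print(round(lower_tail, 6), round(upper_tail, 6), round(two_tail, 6))
```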

This is how you would run a 2 – sample Z hypothesis test using averages and

population S’s when we don’t have the raw data and can’t use Excel.

Now that we have run a hypothesis test, let's calculate a confidence interval and draw the same conclusion.

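The equation for a 2-sample Z confidence interval:

$$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \cdot \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$

Where Standard Error (SE) = $\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$

Margin of Error = $Z_{\alpha/2} \cdot \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$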

We have all the values we need, so let's plug them into our equation.

$$9.86667 - 6.7 \pm Z_{\alpha/2} \cdot \sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}$$

The last thing we need to find is a Z-Critical Value. We will use the =NORM.S.INV

in Excel to find the Z-Critical Value.

If we want to find a 95% confidence interval, then alpha = 1 – .95 = .05. But because this is a confidence interval and we need to take into account the plus AND minus on both sides of the bell-shaped curve, we will divide alpha by 2. .05/2 = .025. Then we take 1 – .025 = .975. We will use this value in our Excel function.

=NORM.S.INV(.975)

We see the Z – Critical Value is 1.96. We will plug this into the equation and

solve. But if you compare this Critical Value to the Excel output we got when we

ran the hypothesis test it is the same because we used Alpha = .05 in the output.

But this value will change depending on what you input for Alpha.
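To see how the critical value moves with alpha, here is a small Python sketch that uses `NormalDist().inv_cdf` from the standard library as a stand-in for =NORM.S.INV. The helper name `z_critical_two_sided` is hypothetical.

```python
from statistics import NormalDist

def z_critical_two_sided(confidence):
    """Z critical value for a two-sided interval, e.g. confidence=0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)  # same input as =NORM.S.INV(1 - alpha/2)

z_95 = z_critical_two_sided(0.95)
z_90 = z_critical_two_sided(0.90)
z_99 = z_critical_two_sided(0.99)
print(round(z_90, 4), round(z_95, 4), round(z_99, 4))
```

A higher confidence level gives a larger critical value, and therefore a wider interval.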

z-Test: Two Sample for Means

| | WL Low Diet | WL Regular Diet |
|---|---|---|
| Mean | 9.866666667 | 6.7 |
| Known Variance | 21.8089 | 16.3216 |
| Observations | 30 | 30 |
| Hypothesized Mean Difference | 0 | |
| z | 2.808838232 | |
| P(Z<=z) one-tail | 0.002486031 | |
| z Critical one-tail | 1.644853627 | |
| P(Z<=z) two-tail | 0.004972062 | |
| z Critical two-tail | 1.959963985 | |

$$9.86667 - 6.7 \pm Z_{\alpha/2} \cdot \sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}$$

$$9.86667 - 6.7 \pm 1.96 \cdot \sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}$$

$$3.16667 \pm 1.96 \cdot 1.1273937$$

$$3.16667 \pm 2.209697$$

The confidence interval goes from .95697 to 5.376367. This interval goes from a

positive value to a positive value. This means that 0 is NOT in this interval.

Because 0 is NOT in the interval, Yes, it is Significant, and we Reject Ho. This is the

same conclusion that we got with the hypothesis test.
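The interval above can be reproduced with a short Python sketch. The function name `two_sample_z_interval` is my own, and because it uses the unrounded critical value (about 1.959964) rather than 1.96, the endpoints may differ from the hand calculation in the fourth or fifth decimal place.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_interval(mean1, mean2, sd1, sd2, n1, n2, confidence=0.95):
    """Confidence interval for a difference in means with known population SDs."""
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z_critical * sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    difference = mean1 - mean2
    return difference - margin_of_error, difference + margin_of_error

low, high = two_sample_z_interval(9.86667, 6.7, 4.67, 4.04, 30, 30)
zero_in_interval = low <= 0 <= high
print(round(low, 4), round(high, 4), zero_in_interval)
```

If `zero_in_interval` is False, the interval leads to the same Reject Ho decision as the two tailed test.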


This week we will continue to discuss hypothesis testing and confidence intervals, but now we will discuss 2 samples.

Just like with 1 – sample hypothesis testing there are 4 steps we will follow. To

review those 4 steps please review the Week 6 Hypothesis Testing PDF.

But the conclusion will still be the same.

If the p-value is < alpha, you Reject Ho and state this test is significant.

If the p-value is > alpha, you Do Not Reject Ho and state this test is not significant.
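The decision rule above is simple enough to sketch as a small Python function. The name `decide` is hypothetical, and treating a p-value exactly equal to alpha as Do Not Reject is my own assumption, since the rule above only covers the strictly less than and strictly greater than cases.

```python
def decide(p_value, alpha=0.05):
    """Return the conclusion for a hypothesis test given its p-value."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must be between 0 and 1")
    if p_value < alpha:
        return "Reject Ho (significant)"
    # Assumption: p_value == alpha is treated the same as p_value > alpha.
    return "Do Not Reject Ho (not significant)"

print(decide(0.004972))  # two tailed p-value from the 2-sample Z diet example
print(decide(0.0855))    # a p-value above alpha = .05
```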

In this document we will discuss 2 – sample proportion hypothesis testing and

confidence intervals.

There are still 3 different hypothesis scenarios with a 2 – sample proportion

hypothesis test.

Lower Tail Test (1 tail):

Ho: 𝑝1 − 𝑝2 = 0

Ha: 𝑝1 − 𝑝2 < 0

Upper Tailed Test (1 tail):

Ho: 𝑝1 − 𝑝2 = 0

Ha: 𝑝1 − 𝑝2 > 0

Two Tailed Test:

Ho: 𝑝1 − 𝑝2 = 0

Ha: 𝑝1 − 𝑝2 ≠ 0

The hypothesized value is 0 and the same key words apply from a 1 – sample

hypothesis test to determine which scenario to use.

The Z-Test Statistic is:

$$Z = \frac{p_1 - p_2}{\sqrt{p \cdot q \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Where $p = \frac{x_1 + x_2}{n_1 + n_2}$

$q = 1 - p$

We will then use =NORM.S.DIST function in Excel to find the p-value. This Excel

function should look familiar from Week 4.

Example:

In a developing section of a district 50 people were surveyed and 38 were in favor

of the new proposal. For the rest of the district 100 people were surveyed and

only 65 people were in favor of the new proposal. Is there evidence that the

number of people favoring the new proposal is greater in the developing section

than the rest of the district? Use alpha = .05

First step is to state the hypothesis scenario. Because the key word says greater

this means it is an upper tailed test.

Ho: 𝑝1 − 𝑝2 = 0

Ha: 𝑝1 − 𝑝2 > 0

The first proportion is the share favoring the new proposal in the developing section, and the second proportion is the share favoring the new proposal in the rest of the district.

$$p_1 = \frac{38}{50} = .76$$

$$q_1 = 1 - .76 = .24$$

$$p_2 = \frac{65}{100} = .65$$

$$q_2 = 1 - .65 = .35$$

$$p = \frac{38 + 65}{50 + 100} = .68667$$

$$q = 1 - .68667 = .31333$$

Now that we have these values we can plug them in to find the Test Statistic.

$$Z = \frac{.76 - .65}{\sqrt{.68667 \cdot .31333 \left(\frac{1}{50} + \frac{1}{100}\right)}} = 1.369$$
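As a cross-check on the pooled calculation, here is a minimal Python sketch that works directly from the survey counts. The function name `two_proportion_z` and its parameters are hypothetical names of my own.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for the difference between two sample proportions."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled_p = (x1 + x2) / (n1 + n2)   # p in the formula above
    pooled_q = 1 - pooled_p            # q = 1 - p
    standard_error = sqrt(pooled_p * pooled_q * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# 38 of 50 in favor in the developing section, 65 of 100 in the rest of the district.
z = two_proportion_z(38, 50, 65, 100)
print(round(z, 3))
```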

Now that we have the Z-Test Statistic we can use the =NORM.S.DIST function to find the p-value.

And yes, we can have a negative Z- Test Statistic, if we do that is fine. You DO

NOT have to take the absolute value of anything. Use the Test Stat. as is in the

Excel function.

In Excel input =NORM.S.DIST(1.369,TRUE)

We write out TRUE because we want the cumulative probability (the area to the left of the Z-Test Statistic).

We see this p-value = .9145 BUT remember when we use this function in Excel,

this function is in the less than form. This means if we were running a Lower

Tailed test, this would be our p-value. BUT since we are running an Upper Tailed

Test we need to take 1 – .9145 to get the p-value for our test.

P-value = 1 – .9145 = .0855.

We see the p-value for our upper tailed test is .0855. If we compare this to .05,

we see that:

.0855 > .05. Since the p-value is greater than alpha, We Do Not Reject Ho. This

test is not significant and No, there is no evidence that the proportion of people

favoring the new proposal is greater in the developing section than the rest of the

district at alpha = .05.

What if we were running a two tailed test? To find this p-value we would take whichever p-value is smaller and multiply it by 2.

.0855*2 = .171. The p-value for a two tailed test would be .171.

Now that we have run a hypothesis test, let's calculate a confidence interval and draw the same conclusion.

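The equation for a 2-sample proportion confidence interval is:

$$(p_1 - p_2) \pm Z_{\alpha/2} \cdot \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

Where Standard Error (SE) = $\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$

Margin of Error (ME) = $Z_{\alpha/2} \cdot \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$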

Plugging in what we know:

$$.76 - .65 \pm Z_{\alpha/2} \cdot \sqrt{\frac{.76 \cdot .24}{50} + \frac{.65 \cdot .35}{100}}$$

The last thing we need to find is the Z-Critical Value. We will use the =NORM.S.INV function to find this. This function should look familiar from Week 4.

If we want to find a 95% confidence interval, then alpha = 1 – .95 = .05. But because this is a confidence interval and we need to take into account the plus AND minus on both sides of the bell-shaped curve, we will divide alpha by 2. .05/2 = .025. Then we take 1 – .025 = .975. We will use this value in our Excel function.

=NORM.S.INV(.975)

We see the Z – Critical Value is 1.96. We will plug this into the equation and

solve.

$$.76 - .65 \pm Z_{\alpha/2} \cdot \sqrt{\frac{.76 \cdot .24}{50} + \frac{.65 \cdot .35}{100}}$$

$$.76 - .65 \pm 1.96 \cdot \sqrt{\frac{.76 \cdot .24}{50} + \frac{.65 \cdot .35}{100}}$$

$$.76 - .65 \pm 1.96(.076961)$$

$$.11 \pm .1508$$

$$(-.0408,\ .2608)$$

The confidence interval goes from -4.08% to 26.08%. This interval goes from a

negative value to a positive value. This means that 0 is in fact in this interval.

Because 0 is in the interval it is Not Significant, and we Do Not Reject Ho. This is

the same conclusion that we got with the hypothesis test.
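The proportion interval can be sketched in Python as well. Note that, as in the formula above, the standard error here uses each sample's own p and q rather than the pooled values used in the test statistic. The function name `two_proportion_interval` is hypothetical, and the unrounded critical value may shift the endpoints slightly compared with using 1.96.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1 = x1 / n1
    p2 = x2 / n2
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z_critical * standard_error
    difference = p1 - p2
    return difference - margin_of_error, difference + margin_of_error

low, high = two_proportion_interval(38, 50, 65, 100)
zero_in_interval = low <= 0 <= high
print(round(low, 4), round(high, 4), zero_in_interval)
```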


Hypothesis Testing is a decision-making process called a Test of Significance.

There are 4 unique parts to Hypothesis Testing.

1) The Hypothesis Scenario. This includes the Null and Alternative scenarios.

a. Ho: Null Hypothesis

Ha or H1: Alternative Hypothesis

2) Z-Test Statistic

$$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 \cdot q_0}{n}}}$$

Where $p_0$ is the hypothesized value and $q_0 = 1 - p_0$.

3) P-value. The p-value tells you if something will be significant or not and if you can Accept or Reject the claim. You will use the p-value to draw a conclusion regarding the hypothesis test.

a. We will use the =NORM.S.DIST function to find the p-value. It should look familiar from Week 4.

4) Conclusion:

a. If the p-value is less than alpha (< α) then Reject Ho/Accept Ha.

b. If the p-value is greater than alpha (> α) then We Do Not Reject Ho.

c. The most common alpha value is .05. If no alpha value is given, it will default to .05, but do note that alpha can also be .10, .01, and .005, to name a few. Essentially alpha can be any value the statistician deems fit, but the most common values are .05, .01 and .10.
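The four parts above can be sketched end to end in Python. This is a minimal illustration, not the course's required method: the function name `one_proportion_z_test` and the `tail` argument are my own, and the inputs (sample proportion .70, hypothesized value .50, n = 10) are the values from the car price example in this document.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(p_hat, p0, n, tail="two", alpha=0.05):
    """One-sample proportion Z test; tail is 'lower', 'upper', or 'two'."""
    # Part 2: test statistic, with the hypothesized value in the standard error.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Part 3: p-value; cdf is the "less than" area, like =NORM.S.DIST(z,TRUE).
    lower = NormalDist().cdf(z)
    upper = 1 - lower
    if tail == "lower":
        p_value = lower
    elif tail == "upper":
        p_value = upper
    elif tail == "two":
        p_value = min(1.0, 2 * min(lower, upper))
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    # Part 4: conclusion, True means Reject Ho.
    return z, p_value, p_value < alpha

z, p_value, reject = one_proportion_z_test(0.70, 0.50, 10, tail="two")
print(round(z, 6), round(p_value, 6), reject)
```

Part 1, the hypothesis scenario, is captured by the choice of `p0` and `tail`.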

One last thing before we get to an example. There are 3 different scenarios that

are associated with the Hypothesis Scenario.

1) There is a Lower Tailed (one tailed) Test, also called a Left Tailed Test. If the problem asks if there is a significant decrease, or uses less than, lower than, or fewer than, then the problem is a lower tailed test. The “<” sign corresponds with the Ha. The hypothesis scenario will look like:

a. Ho: $\hat{p} = p_0$

Ha: $\hat{p} < p_0$

(Here we see that $p_0$ is the hypothesized value and the Less Than Sign “<” lines up with the Ha)

2) There is an Upper Tailed (one tailed) Test, also called a Right Tailed Test. If the problem asks if there is a significant increase, or uses more than, greater than, or higher than, then the problem is an upper tailed test. The “>” sign corresponds with the Ha. The hypothesis scenario will look like:

a. Ho: $\hat{p} = p_0$

Ha: $\hat{p} > p_0$

(Here we see that $p_0$ is the hypothesized value and the Greater Than Sign “>” lines up with the Ha)

3) There is a Two Tailed Test. If the problem asks if there is a significant difference or statistical evidence, or asks if it is not the same, then the problem is a two tailed test. The “≠” sign corresponds with the Ha. The hypothesis scenario will look like:

a. Ho: $\hat{p} = p_0$

Ha: $\hat{p} \neq p_0$

(Here we see that $p_0$ is the hypothesized value and the Not Equal Sign “≠” lines up with the Ha)

The hypothesized value is what we think should happen or what has been found

to be true in the past.

Now let’s continue to look at our car price data from Week 3. In Week 3, I asked

you to calculate the average and then find how many data points fell below the

average. We called this value p and then we found q. If we look back at my data

set, we see that p = .70 and q = .30.

We will call these $\hat{p} = .70$ and $\hat{q} = .30$.

We want to run a test to see how close our data set is to a 50/50 spread. In a perfect world, 50% of the data would fall above the mean and 50% of the data would fall below the mean.

In other words, is there a difference between your data set and 50%? We will run a hypothesis test at alpha = .05 (95% confidence) to test this claim.

(Note: YES! I realize that some of you did see in your Week 3 forum that you did

get p = .50 and q = .50. If this is the case, your Test Statistic will be 0 and the p-

value will come out to be 1. That is fine, BUT it is still a good idea to go through

this example and make sure you can run a hypothesis test to get the correct

results. Extra practice never hurt anyone.)

Getting back to our test, this tells us that the hypothesized value is .50. The

hypothesis scenario will look like this:

1) Ho: $\hat{p} = .50$

Ha: $\hat{p} \neq .50$

2) Z-Stat:

$$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 \cdot q_0}{n}}} = \frac{.70 - .50}{\sqrt{\frac{.50 \cdot .50}{10}}} = 1.264911$$

Note: If your Z-Stat is negative that is fine. That does not mean the problem is incorrect. And if your $\hat{p} = .50$, your Z-Stat would be 0 here and that is fine also.

3) To find the p-value we will use the =NORM.S.DIST function. In Excel type in =NORM.S.DIST(1.264911,TRUE) and hit Enter. We type in TRUE because we want the cumulative probability (the area to the left of the Z-Stat).

We see that the p-value = .897048. But remember this is in the Less Than form. If

we were running a Lower Tailed Test this would be our p-value. To find the p-

value for an Upper Tailed Test we would take p-value = 1 – .897048 = .102952.

Since we are running a Two Tailed Test, to get the correct p-value we would

multiply whichever p-value is smaller by 2. It will be different depending on the

test, so you need to make sure you use whichever one is smaller. Remember, p-

values CANNOT be greater than 1. If you get a p-value greater than 1, you did

something wrong.

p-value = .102952*2 = .205904. This is the p-value we will use for our conclusion.

If your Z-Stat is 0 then your p-value in this test will be 1. That is fine. Your p-value

can be 1 but it CANNOT be greater than 1.

4) Lastly, we need t