MATH 105 Lab 10: Two Sample Proportions

Part 1: Tornadoes

Tornadoes:

A tornado is a narrow, violently rotating column of air that extends from a thunderstorm to the ground. Tornadoes occur in many parts of the world. About 1200 tornadoes hit the US yearly. Tornado Alley is a nickname invented by the media to refer to a broad area of relatively high tornado occurrence in the central United States.

The data set ``Tornadoes'' contains a variety of variables that were measured for all tornadoes in the US in 2017. F scale is a qualitative variable that categorizes tornadoes by their wind speed. The table below shows the F scale ratings from the National Oceanic and Atmospheric Administration.

Tornadoes

For this activity we will compare the proportion of F0 tornadoes in Texas versus Kansas. The data in the column F Scale represent the Fujita F Scale of the tornado (on a scale from 0 to 5). An F0 tornado is one where the wind speeds are less than 73 miles per hour. We will treat the tornadoes that struck in Texas and Kansas as a simple random sample of all tornadoes that struck since 1950.

Because we are just interested in whether or not it was an F0 tornado, the variable F0 was created with the responses ``yes'' or ``no''. We will use this variable for our analysis.

We will start by calculating a confidence interval for $p_{TX} - p_{KS}$, where $p_{TX}$ is the proportion of tornadoes in Texas that are category F0 and $p_{KS}$ is the proportion of tornadoes in Kansas that are category F0. It is important to understand that the confidence interval for $p_{TX} - p_{KS}$ is for the difference between the two proportions.

How would you interpret a confidence interval for $p_{TX} - p_{KS}$ that runs from a positive number to a positive number, for example (0.115, 0.233)?

How would you interpret a confidence interval for $p_{TX} - p_{KS}$ that runs from a negative number to a negative number, for example (-0.233, -0.115)?

We will work our way toward a hypothesis test and a confidence interval by first collecting the following information:

  1. Number of tornadoes in Texas.
  2. Number of F0 tornadoes in Texas.
  3. Number of tornadoes in Kansas.
  4. Number of F0 tornadoes in Kansas.

We can get this information by creating a contingency table in StatCrunch. We don't need to look at any numbers from other states, so we can tell StatCrunch to select only the rows where State=TX or State=KS. The following video demonstrates how to create this table.

https://media.csuchico.edu/media/Math+105+Lab+10+Contingency+Table/1_x1ul4ixj
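For readers who want to see what the contingency table is counting, here is a minimal Python sketch of the same idea. It is not how StatCrunch works internally, and the rows below are made-up stand-ins for the Tornadoes data set; the function name `contingency_table` is hypothetical.

```python
from collections import Counter

# Hypothetical rows (State, F0) standing in for the Tornadoes data set.
# These values are invented for illustration only.
rows = [
    ("TX", "yes"), ("TX", "no"), ("TX", "yes"),
    ("KS", "yes"), ("KS", "no"),
    ("OK", "yes"), ("NE", "no"),
]

def contingency_table(rows, states=("TX", "KS")):
    """Count F0 = yes/no within each selected state, ignoring other states."""
    counts = Counter((state, f0) for state, f0 in rows if state in states)
    table = {}
    for state in states:
        yes = counts[(state, "yes")]
        no = counts[(state, "no")]
        table[state] = {"yes": yes, "no": no, "total": yes + no}
    return table

table = contingency_table(rows)
for state, cells in table.items():
    print(state, cells)
```

The "total" and "yes" entries for each state correspond to the four numbers in the list above: the number of tornadoes and the number of F0 tornadoes in Texas and in Kansas.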

Now we have all the required information to perform a hypothesis test and create a confidence interval. Let's first create a confidence interval. We do this in StatCrunch by going to Stat, then Proportion Stats, then Two Sample, then with summary. Note that we could use with data since we have the data in front of us; however, the data are not organized correctly for what we need. The video below demonstrates how to compute a confidence interval for $p_{KS} - p_{TX}$.

https://media.csuchico.edu/media/MLIB027B_Recording_20201105-131522/1_vr5laj0c
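As a rough check on the StatCrunch output, the interval can also be sketched by hand. The Python below assumes the usual large-sample (unpooled) interval for a difference of two proportions; the function name `two_prop_ci` and the counts are hypothetical, so substitute the four numbers from your own contingency table. Sample 1 is taken to be Kansas and sample 2 Texas, so the interval is for $p_{KS} - p_{TX}$.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference, using each sample's own proportion.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Critical value from the standard normal distribution.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff - z_star * se, diff + z_star * se

# Hypothetical counts (not the real data): sample 1 = Kansas, sample 2 = Texas.
lower, upper = two_prop_ci(x1=60, n1=100, x2=90, n2=200)
print(f"95% CI for p_KS - p_TX: ({lower:.4f}, {upper:.4f})")
```

Swapping the two samples flips the sign of both endpoints, which is why it matters whether the interval is reported for $p_{KS} - p_{TX}$ or $p_{TX} - p_{KS}$.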

Suppose now we want to test whether Kansas has a higher rate of F0 tornadoes than Texas.

Now run the hypothesis test and state your conclusion in context of the problem. Use a 0.05 significance level.
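The calculation behind this kind of test can be sketched as follows. This is a minimal Python illustration, assuming the standard pooled two-proportion z-test with a one-sided "greater than" alternative; the function name `two_prop_z_test_greater` and the counts are hypothetical and are not the lab's data.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test_greater(x1, n1, x2, n2):
    """One-sided z-test of H0: p1 = p2 against Ha: p1 > p2 (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 the two proportions are equal, so pool the samples.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability, because the alternative is p1 > p2.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Hypothetical counts (not the real data): sample 1 = Kansas, sample 2 = Texas.
z, p_value = two_prop_z_test_greater(x1=60, n1=100, x2=90, n2=200)
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The decision rule in the last line compares the p-value to the 0.05 significance level; the conclusion still needs to be written in the context of F0 tornadoes in Kansas and Texas.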

Part 2: Tigers

image of Siberian tiger

The Bol'shaya Koshka (Russian for big cat) Reserve is a newly created animal reserve that was uniquely developed to help endangered species prosper. This 10,000-acre wild animal reservation was selected because an abundance of Siberian tigers has been found in the area. The diverse terrain of the reserve provides a wide variety of habitats for many different species of animals.

Since the tigers in this area are much more abundant than in any other area in the world, they are starting to draw a significant number of researchers to the region. Your primary responsibility will be to help these researchers as they study the tigers and then incorporate the results of their research into a system to identify the best management practices for this reserve.

An important component of monitoring endangered species is to understand the age distribution of the population. Shifts in the distribution could indicate potential issues in sustaining the population. Additionally, researchers would like to quantify size differences between male and female tigers.

While the exact age is not known for most of the tigers in the reserve, the ages of some tigers are known. To estimate the age of a tiger that is captured on your reserve, you will need to compare characteristics of the captured tiger to those of the tigers that live in the research zone (whose ages are known). Another

Your mission is to go into the Bol'shaya Koshka reserve and gather sample data on tigers. Then, using your sample data, you are to establish a simple linear regression model to estimate the age of a tiger based on one of the other variables that you collect. Additionally, we will compare body measurements between male and female tigers. The data analysis part will occur over the next 2 weeks. Today we will do the data collection part of the research.

Play the tutorial for the TigerSTAT game briefly so you are familiar with the game controls. The game is found at the website

http://statgames.tietronix.com/tigerstat/tigerstat_webgl/index.html

Enter a PlayerName and the GroupName math105, then sign in. You can choose either the Casual or Hard version; select Continue and Load Tutorial. If you forget the commands at any time during game play, you can hit the "p" key to pause the game and see the game instructions.

image of TigerStat game login screen

Play the game and next week we will analyze the data. Good Luck!!