Statistics (49 cards)

Tagged as: nursing, marketing, music, biology, medical, medicine, pharmacology

1

The science of statistics is

Collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions

2

Anecdotal claims, as opposed to statistics, are

Conclusions based on very little data; stories and rumors

3

Data can be misused when

Data is incorrectly obtained; data is incorrectly analyzed

4

Good statistical practice should

Understand the difference between direct and indirect (lurking variable) relations; understand the impacts of variability

5

Mathematics

Solves problems with 100% certainty; has only one correct answer

6

Statistics, because of variability

Does not solve problems with 100% certainty (95% certainty is much more common); frequently has multiple reasonable answers

7

A population

Is the group to be studied; includes all of the individuals in the group

8

A sample

Is a subset of the population; is often used in analyses because getting access to the entire population is impractical

9

Identify the research objective

What questions are to be answered? What group should be studied?

10

Collect the information needed

Can you access the entire population? How can you collect a good sample? What other methods are available and appropriate?

11

Organize and summarize the information

Descriptive statistics (chapters 2 through 4)
Visual methods such as charts and graphs
Numeric methods such as calculations

12

Draw conclusions from the information

Inferential statistics (chapters 8 through 15)
Various methods that are appropriate for different questions and different types of data sets

13

Characteristics of the individuals under study are called

variables
Examples of qualitative variables: gender, zip code, blood type, states in the United States, brands of televisions

14

Qualitative variables have category values … those values cannot be

added, subtracted, etc.
By contrast, the values of quantitative variables can be added, subtracted, etc.
Examples of quantitative variables: temperature, height and weight, sales of a product, number of children in a family, points achieved playing a video game


16

Discrete variables

Variables that have a finite or a countable number of possibilities
Frequently variables that are counts
The possible values of discrete variables can be listed

17

Continuous variables

Variables that have an infinite, uncountable number of possible values
Frequently variables that are measurements
Sometimes a variable is discrete but has so many closely spaced values that it can be treated as continuous

18

The process of statistics is designed to

collect and analyze data to reach conclusions

19

Variables can be classified by their type of data

Qualitative or categorical variables
Discrete quantitative variables
Continuous quantitative variables

20

There are different ways to collect data

Census, existing sources, survey sampling, designed experiments
These are good methods of data collection, if done correctly

21

A census is

a list of all the individuals in a population that records the characteristics of the individuals
An example is the US Census held every 10 years (this is only an example, though)
Advantages: answers have 100% certainty
Disadvantages: may be difficult or impossible to obtain; costs may be prohibitive

22

An existing source is

an appropriate data set that has already been collected and can be used for this study
Advantages: saves time and money
Disadvantages: there may not be an applicable data set

23

A survey sample is

a study in which only a subset of the population is considered, and in which there is no attempt to influence the value of the variable of interest
Advantages: saves time and money
Disadvantages: choosing an appropriate sample could be difficult

24

A survey sample is an example of an observational study

An observational study is one where there is no attempt to influence the value of the variable
An observational study is also called an ex post facto (after the fact) study
Advantages: it can detect associations between variables
Disadvantages: it cannot isolate causes to determine causation

25

A designed experiment is an experiment

That applies a treatment to individuals
That often compares the treated group to a control (untreated) group
Where the variables can be controlled
Advantages: can analyze individual factors
Disadvantages: cannot be done when the variables cannot be controlled; cannot be applied in some cases for moral or ethical reasons

26

A simple random sample is

when every possible sample of size n out of a population of N has an equally likely chance of occurring
Examples:
For a simple random sample of size n = 1 from a population size of N = 5, each of the 5 possible samples has an equally likely chance of occurring
For a simple random sample of size n = 2 from a population size of N = 4, each of the 6 possible samples has an equally likely chance of occurring
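The n = 2, N = 4 example can be checked with a short Python sketch. The population labels and the seed are hypothetical, chosen only for illustration; the point is that there are 6 possible samples and that a draw without replacement picks one of them with equal probability.

```python
import itertools
import math
import random

population = ["A", "B", "C", "D"]  # hypothetical population, N = 4
n = 2

# Enumerate every possible sample of size n (order does not matter).
all_samples = list(itertools.combinations(population, n))
num_samples = math.comb(len(population), n)  # 6 for N = 4, n = 2

# random.sample draws n distinct individuals, so every possible
# sample of size n is equally likely to be chosen.
rng = random.Random(0)  # seed is an assumption, used only for repeatability
srs = rng.sample(population, n)

print(num_samples)
print(all_samples)
print(sorted(srs))
```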

27

a frame

A list of all the individuals within a population, which simple random sampling requires
If we do not have a frame, then a different sampling method must be used

28

There are other effective ways to collect data

Stratified sampling, systematic sampling, cluster sampling
Each of these is particularly appropriate in certain specific circumstances

29

A stratified sample is

obtained when we choose a simple random sample from each subgroup of a population
This is appropriate when the population is made up of nonoverlapping (distinct) groups called strata
Within each stratum, the individuals are likely to have a common attribute
Between the strata, the individuals are likely to have different common attributes
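A minimal Python sketch of stratified sampling follows. The strata names, their sizes, and the choice of an equal sample size per stratum are all assumptions made for illustration; other allocations (for example, proportional to stratum size) are also common.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of n_per_stratum from each stratum."""
    sample = {}
    for name, members in strata.items():
        # Each stratum is sampled separately and independently.
        sample[name] = rng.sample(members, n_per_stratum)
    return sample

# Hypothetical strata: two nonoverlapping groups of students.
strata = {
    "freshman": [f"F{i}" for i in range(1, 21)],
    "senior": [f"S{i}" for i in range(1, 11)],
}
rng = random.Random(1)  # seed is an assumption, used only for repeatability
picked = stratified_sample(strata, 3, rng)
print(picked)
```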

30

A systematic sample is obtained

when we choose every kth individual in a population
The first individual selected corresponds to a random number between 1 and k
Systematic sampling is appropriate when we do not have a frame, that is, when we do not have a list of all the individuals in a population
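The selection rule can be sketched in Python as below. Because no frame is needed, the sketch takes the individuals as a stream (any iterable) rather than a complete list; the function name, the stream of 100 numbered individuals, and k = 10 are hypothetical.

```python
import random
from itertools import islice

def systematic_sample(stream, k, rng):
    """Select every kth individual, starting at a random position in 1..k."""
    start = rng.randint(1, k)  # inclusive on both ends
    # islice uses 0-based positions, so position `start` is index start - 1.
    return list(islice(stream, start - 1, None, k))

# Hypothetical stream: individuals numbered 1..100 in arrival order.
rng = random.Random(2)  # seed is an assumption, used only for repeatability
chosen = systematic_sample(iter(range(1, 101)), 10, rng)
print(chosen)
```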

31

A cluster sample is obtained

when we choose a random set of groups and then select all individuals within those groups
We can obtain a sample of size 50 by choosing 10 groups of 5
Cluster sampling is appropriate when it is very time consuming or expensive to choose the individuals one at a time
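The "10 groups of 5" example might look like this in Python. The 40 clusters and their member labels are made up for the sketch; the essential step is that groups are chosen at random and every individual in a chosen group is included.

```python
import random

def cluster_sample(clusters, num_clusters, rng):
    """Choose num_clusters groups at random and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), num_clusters)
    sample = [person for name in chosen for person in clusters[name]]
    return chosen, sample

# Hypothetical population: 40 groups of 5 individuals each.
clusters = {
    f"group{g}": [f"g{g}_p{i}" for i in range(1, 6)] for g in range(1, 41)
}
rng = random.Random(3)  # seed is an assumption, used only for repeatability
chosen_groups, sample = cluster_sample(clusters, 10, rng)
print(len(sample))  # 10 groups of 5 individuals
```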

32

A convenience sample is obtained

when we choose individuals in an easy, or convenient, way
Self-selecting samples are examples of convenience sampling, such as individuals who respond to television or radio announcements
“Just asking around” is an example of convenience sampling, such as individuals who are known to the pollster

33

A multistage sample is obtained

using a combination of simple random sampling, stratified sampling, systematic sampling, and cluster sampling
Many large-scale samples (for example, the US census in noncensus years) use multistage sampling

34

Sources of Error In Sampling

Poor design of the sampling frame; poor design of the sample questions
One type of error, sampling error, occurs because we use only part of the population in our study
Samples consist of only part of the total data
Samples are usually more realistic to analyze
Because there are individuals in the population that are not in our sample, sampling erro

35

Types of nonsampling error

Using an incomplete frame
Individuals who respond have different characteristics than individuals who do not respond
Interviewer errors
Misrepresented answers
Data checks
Questionnaire design: wording of questions; order of questions, words, and responses

36

Interviewer errors may occur when

The interviewer has a vested interest in the results
The interviewer is not trained to obtain accurate information
The individuals feel pressure or an obligation to provide an answer that the interviewer desires
For example, if your server watches you when you fill out the restaurant’s service satisfaction questionnaire …

37

A designed experiment is

a controlled study
The purpose of designed experiments is to control as many factors as possible to isolate the effects of a particular factor
Designed experiments must be carefully set up to achieve their purposes

38

explanatory variables

The variables in a designed experiment that are controlled
These variables are also sometimes called factors
Factors are part of a controlled environment, have values that can be changed by the researcher, and are considered as possible causes

39

response variable

The designed experiment analyzes the effects of the factors on the

40

A treatment

is a combination of the values of the factors
Examples of treatments:
Giving one medication to one group of patients and a different medication to another
Using one type of fertilizer on a set of plots of corn and a different type of fertilizer on a different set of plots
Playing country music to one group of mice and rap music to another

41

experimental units

The individuals (people, plants, materials, other objects, …) to which the treatments are applied
When the experimental units are people, we refer to them as subjects
Subjects in an experiment correspond to individuals in a survey

42

double-blind

When neither the subjects nor the researchers know which treatment each subject receives, this is called

43

placebo

Subjects not given any medication are often given a placebo such as a sugar tablet
The subjects will not know which treatment they get

44

Conducting an experiment involves considerable planning

Planning steps: identify the problem; determine the factors; determine the number of experimental units; determine the level of each factor
Implementation steps: conduct the experiment; test the claim

45

Three ways to deal with the factors

Control: fix the levels at a constant level (for factors not of interest)
Manipulate: set the levels at predetermined levels (for factors of interest)
Randomize: randomize the experimental units (for uncontrolled factors not of interest)

46

replication

When a treatment is applied to more than one experimental unit

47

A completely randomized design

is when each experimental unit is assigned to a treatment completely at random
An example:
A farmer wants to test the effects of a fertilizer
We choose a set of plants to receive the treatment
We randomly assign plants to receive different levels of fertilizer
This has similarities to simple random sampling
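The fertilizer example could be sketched in Python as follows. The plant labels, the three fertilizer levels, and the decision to keep group sizes equal are assumptions for illustration; a completely randomized design only requires that the assignment be made at random.

```python
import random

def completely_randomized(units, treatments, rng):
    """Assign each experimental unit to a treatment at random."""
    shuffled = list(units)
    rng.shuffle(shuffled)  # random order removes any systematic pattern
    assignment = {}
    for i, unit in enumerate(shuffled):
        # Cycling through treatments keeps group sizes as equal as possible
        # (a design choice for this sketch).
        assignment[unit] = treatments[i % len(treatments)]
    return assignment

plants = [f"plant{i}" for i in range(1, 13)]  # hypothetical units
levels = ["none", "low", "high"]              # hypothetical fertilizer levels
rng = random.Random(4)  # seed is an assumption, used only for repeatability
assignment = completely_randomized(plants, levels, rng)
print(assignment)
```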

48

A matched-pair design

is when the experimental units are paired up and each of the pair is assigned to a different treatment
A matched-pair design requires:
Units that are paired (twins, the same person before and after the treatment, …)
Only two levels of treatment (one for each of the pair)
An example:
A subject before receiving the medication
The same subject after receiving the medication
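With before-and-after pairs, a common way to summarize the data is to work with the difference inside each pair. The sketch below uses made-up measurements for five hypothetical subjects; the numbers are illustrative only and do not come from any study.

```python
from statistics import mean

# Hypothetical measurements for the same five subjects (illustrative only).
before = [142, 150, 138, 160, 155]
after = [135, 148, 130, 151, 150]

# One difference per pair: after minus before for the same subject.
differences = [a - b for b, a in zip(before, after)]
mean_difference = mean(differences)

print(differences)
print(mean_difference)
```

Pairing means each subject serves as its own comparison, so differences between subjects do not get mixed into the treatment effect.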

49

confounding

When two effects cannot be distinguished, this is called