Statistics (49 cards)

Tagged as: nursing, marketing, music, biology, medical, medicine, pharmacology



The science of statistics is

Collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions


Anecdotal claims, as opposed to statistics, are

Conclusions based on very little data; stories and rumors


Data can be misused when

Data is incorrectly obtained; data is incorrectly analyzed


Good statistics should

Understand the difference between direct and indirect (lurking variable) relations; understand the impacts of variability



Solves problems with 100% certainty; has only one correct answer


Statistics, because of variability

Does not solve problems with 100% certainty (95% certainty is much more common); frequently has multiple reasonable answers


A population

Is the group to be studied; includes all of the individuals in the group


A sample

Is a subset of the population; is often used in analyses because getting access to the entire population is impractical


Identify the research objective

What questions are to be answered? What group should be studied?


Collect the information needed

Can you access the entire population? How can you collect a good sample? What other methods are available and appropriate?


Organize and summarize the information

Descriptive statistics (chapters 2 through 4): visual methods such as charts and graphs; numeric methods such as calculations


Draw conclusions from the information

Inferential statistics (chapters 8 through 15): various methods that are appropriate for different questions and different types of data sets


Characteristics of the individuals under study are called

variables. Examples of qualitative variables: gender, zip code, blood type, states in the United States, brands of televisions


Qualitative variables have category values … those values cannot be

added, subtracted, etc. Examples of quantitative variables: temperature, height and weight, sales of a product, number of children in a family, points achieved playing a video game


Discrete variables

Variables that have a finite or a countable number of possibilities. Frequently variables that are counts. The possible values of discrete variables can be listed.


Continuous variables

Variables that have an infinite but not countable number of possibilities. Frequently variables that are measurements. Sometimes a variable is discrete but has so many closely spaced values that it can be treated as continuous.


The process of statistics is designed to

collect and analyze data to reach conclusions


Variables can be classified by their type of data

Qualitative (categorical) variables; discrete quantitative variables; continuous quantitative variables
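
As a rough illustration of the three variable types, the sketch below summarizes each one in the way that suits it. The records and field names (`blood_type`, `children`, `height_cm`) are hypothetical and invented only for this example.

```python
from collections import Counter
from statistics import mean

# Hypothetical records; every value here is made up for illustration.
people = [
    {"blood_type": "A", "children": 2, "height_cm": 171.3},
    {"blood_type": "O", "children": 0, "height_cm": 158.9},
    {"blood_type": "A", "children": 3, "height_cm": 180.2},
]

# Qualitative: values are categories, so we count them rather than add them.
blood_type_counts = Counter(p["blood_type"] for p in people)

# Discrete quantitative: counts, so arithmetic such as a total is meaningful.
total_children = sum(p["children"] for p in people)

# Continuous quantitative: measurements, so an average is meaningful.
mean_height = mean(p["height_cm"] for p in people)

print(blood_type_counts, total_children, mean_height)
```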


There are different ways to collect data

Census; existing sources; survey sampling; designed experiments. These are good methods of data collection, if done correctly.


A census is

a list of all the individuals in a population that records the characteristics of the individuals. An example is the US Census, held every 10 years (this is only one example). Advantages: answers have 100% certainty. Disadvantages: may be difficult or impossible to obtain; costs may be prohibitive.


An existing source is

an appropriate data set that has already been collected and that can be used for this study. Advantages: saves time and money. Disadvantages: there may not be an applicable data set.


A survey sample is

a study in which only a subset of the population is considered and there is no attempt to influence the value of the variable of interest. Advantages: saves time and money. Disadvantages: choosing an appropriate sample could be difficult.


A survey sample is an example of an observational study

An observational study is one where there is no attempt to influence the value of the variable. An observational study is also called an ex post facto (after the fact) study. Advantages: it can detect associations between variables. Disadvantages: it cannot isolate causes to determine causation.


A designed experiment is an experiment

that applies a treatment to individuals, often compares the treated group to a control (untreated) group, and in which the variables can be controlled. Advantages: can analyze individual factors. Disadvantages: cannot be done when the variables cannot be controlled; cannot be applied in some cases for moral or ethical reasons.


A simple random sample is

when every possible sample of size n out of a population of N is equally likely to occur. Examples: for a simple random sample of size n = 1 from a population of size N = 5, each of the 5 possible samples is equally likely to occur; for a simple random sample of size n = 2 from a population of size N = 4, each of the 6 p
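
The counts in these examples (5 possible samples for n = 1, N = 5 and 6 possible samples for n = 2, N = 4) can be checked by listing the samples. The sketch below is one way to do that with the Python standard library; the function names `all_samples` and `simple_random_sample` are made up for this illustration.

```python
import itertools
import random

def all_samples(population, n):
    """List every possible sample of size n (order ignored, no repeats)."""
    return list(itertools.combinations(population, n))

def simple_random_sample(population, n, seed=None):
    """Draw one sample of size n so that every possible sample is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)

population = ["A", "B", "C", "D"]        # N = 4, hypothetical individuals
print(len(all_samples(population, 2)))   # number of possible samples of size 2
print(simple_random_sample(population, 2, seed=1))
```

In general the number of possible samples is N! / (n! (N - n)!), which is what `itertools.combinations` enumerates.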


a frame

Simple random sampling requires that we have a list of all the individuals within a population. If we do not have a frame, then a different sampling method must be used.


There are other effective ways to collect data

Stratified sampling; systematic sampling; cluster sampling. Each of these is particularly appropriate in certain specific circumstances.


A stratified sample is

obtained when we choose a simple random sample from each subgroup of a population. This is appropriate when the population is made up of nonoverlapping (distinct) groups called strata. Within each stratum, the individuals are likely to have a common attribute. Between the strata, the individuals are likely to have different common attributes.
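
A minimal sketch of stratified sampling, assuming the strata are already known and stored as a dictionary of lists. The function name `stratified_sample`, the strata labels, and the choice of taking the same number from each stratum are assumptions made for this example.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of n_per_stratum individuals from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        sample[name] = rng.sample(members, n_per_stratum)
    return sample

# Hypothetical nonoverlapping strata.
strata = {
    "freshmen": [f"F{i}" for i in range(30)],
    "seniors": [f"S{i}" for i in range(20)],
}
print(stratified_sample(strata, 3, seed=0))
```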


A systematic sample is obtained

when we choose every kth individual in a population. The first individual selected corresponds to a random number between 1 and k. Systematic sampling is appropriate when we do not have a frame, that is, when we do not have a list of all the individuals in a population.
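
A minimal sketch of the selection rule: pick a random start between 1 and k, then take every kth individual after it. The list here only stands in for the order in which individuals are encountered (for example, customers leaving a store); the function name `systematic_sample` is made up for this example.

```python
import random

def systematic_sample(individuals, k, seed=None):
    """Select every kth individual, starting at a random position from 1 to k."""
    rng = random.Random(seed)
    start = rng.randint(1, k)          # 1-based position of the first pick
    return individuals[start - 1::k]   # then every kth individual after that

arrivals = list(range(1, 21))          # hypothetical individuals numbered 1..20
print(systematic_sample(arrivals, 5, seed=3))
```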


A cluster sample is obtained

when we choose a random set of groups and then select all individuals within those groups. We can obtain a sample of size 50 by choosing 10 groups of 5. Cluster sampling is appropriate when it is very time consuming or expensive to choose the individuals one at a time.
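
The "10 groups of 5" example can be sketched as follows. The 40 clusters, their labels, and the function name `cluster_sample` are hypothetical and exist only for this illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose n_clusters groups at random, then keep every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)   # random set of group labels
    return [person for label in chosen for person in clusters[label]]

# Hypothetical population: 40 groups of 5 individuals each.
clusters = {f"group_{i}": [f"g{i}_p{j}" for j in range(5)] for i in range(40)}
sample = cluster_sample(clusters, 10, seed=0)
print(len(sample))   # 10 groups of 5 individuals
```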


A convenience sample is obtained

when we choose individuals in an easy or convenient way. Self-selecting samples are examples of convenience sampling (individuals who respond to television or radio announcements). “Just asking around” is an example of convenience sampling (individuals who are known to the pollster).


A multistage sample is obtained

using a combination of simple random sampling, stratified sampling, systematic sampling, and cluster sampling. Many large-scale samples (the US census in noncensus years) use multistage sampling.


Sources of Error In Sampling

Poor design of the sampling frame; poor design of the sample questions. One type of error, sampling error, occurs because we use only part of the population in our study. Samples consist of only part of the total data. Samples are usually more realistic to analyze. Because there are individuals in the population that are not in our sample, sampling erro
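
To make sampling error concrete, the sketch below compares a sample mean with the mean of the whole population. The population of the numbers 1 to 100 and the function name `sampling_error` are assumptions for this illustration, and the mean is used only as one convenient summary.

```python
import random
from statistics import mean

def sampling_error(population, n, seed=None):
    """Difference between the mean of a random sample of size n and the population mean."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    return mean(sample) - mean(population)

population = list(range(1, 101))                  # hypothetical population values
print(sampling_error(population, 10, seed=0))     # usually nonzero: part of the data is missing
print(sampling_error(population, 100, seed=0))    # a census uses everyone, so the error is 0
```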


Types of nonsampling error

Using an incomplete frame; individuals who respond have different characteristics than individuals who do not respond; interviewer errors; misrepresented answers; data checks; questionnaire design; wording of questions; order of questions, words, and responses


Interviewer errors may occur when

The interviewer has a vested interest in the results. The interviewer is not trained to obtain accurate information. The individuals feel pressure or an obligation to provide an answer that the interviewer desires. For example, if your server watches you when you fill out the restaurant’s service satisfaction questionnaire …


A designed experiment is

a controlled study. The purpose of designed experiments is to control as many factors as possible to isolate the effects of a particular factor. Designed experiments must be carefully set up to achieve their purposes.


explanatory variables

Some variables in a designed experiment are controlled; those are the explanatory variables, also sometimes called factors. Factors are part of a controlled environment, have values that can be changed by the researcher, and are considered as possible causes.


response variable

The designed experiment analyzes the effects of the factors on the response variable


A treatment

is a combination of the values of the factors. Examples of treatments: giving one medication to one group of patients and a different medication to another; using one type of fertilizer on a set of plots of corn and a different type of fertilizer on a different set of plots; playing country music to one group of mice and rap music to another.


experimental units

(people, plants, materials, other objects, …). When the experimental units are people, we refer to them as subjects. Subjects in an experiment correspond to individuals in a survey.



When neither the subjects nor the researchers know which treatment is being given, this is called



Subjects not given any medication are often given a placebo, such as a sugar tablet. The subjects will not know which treatment they get.


Conducting an experiment involves considerable planning

Planning steps: identify the problem; determine the factors; determine the number of experimental units; determine the level of each factor. Implementation steps: conduct the experiment; test the claim.


Three ways to deal with the factors

Control – fix the levels at a constant level (for factors not of interest); Manipulate – set the levels at predetermined levels (for factors of interest); Randomize – randomize the experimental units (for uncontrolled factors not of interest)



When a treatment is applied to more than one experimental unit


A completely randomized design

is when each experimental unit is assigned to a treatment completely at random. An example: a farmer wants to test the effects of a fertilizer. We choose a set of plants to receive the treatment. We randomly assign plants to receive different levels of fertilizer. This has similarities to simple random sampling.
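
The fertilizer example can be sketched as a random assignment of plants to fertilizer levels. The plant labels, the three levels, and the function name `completely_randomized` are hypothetical; dealing the shuffled plants out in turn so that group sizes are equal is a design choice made here, not something the card requires.

```python
import random
from collections import Counter

def completely_randomized(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)              # random order removes any pattern in the units
    assignment = {}
    for i, unit in enumerate(shuffled):
        assignment[unit] = treatments[i % len(treatments)]
    return assignment

plants = [f"plant_{i}" for i in range(12)]    # hypothetical experimental units
levels = ["none", "low", "high"]              # hypothetical fertilizer levels
assignment = completely_randomized(plants, levels, seed=0)
print(Counter(assignment.values()))
```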


A matched-pair design

is when the experimental units are paired up and each of the pair is assigned to a different treatment. A matched-pair design requires units that are paired (twins, the same person before and after the treatment, …) and only two levels of treatment (one for each of the pair). An example: a subject before receiving the medication; the same subject after rec
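
One common way to summarize matched-pair data, offered here only as a sketch, is to look at the difference within each pair. The before and after readings below are invented for illustration, and `paired_differences` is a hypothetical helper name.

```python
from statistics import mean

def paired_differences(before, after):
    """Return after - before for each matched pair."""
    if len(before) != len(after):
        raise ValueError("each 'before' value needs a matching 'after' value")
    return [a - b for b, a in zip(before, after)]

# Hypothetical readings for four subjects, before and after a medication.
before = [140, 152, 138, 147]
after = [132, 150, 135, 139]
diffs = paired_differences(before, after)
print(diffs, mean(diffs))
```

Working with the within-pair differences keeps each subject compared only with themselves, which is the point of pairing.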



When two effects cannot be distinguished, this is called