Biostatistics

The science that applies statistical theory and mathematical principles to research in medicine, biology, environmental science, health and related fields.

A good hypothesis statement should: be based on information in ______________ research.

  • prior

One of the most important measures of dispersion, the _____________ is the difference between the maximum and minimum values of a distribution.

  • range
  • variance
  • IQR
  • standard deviation
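
The four measures listed above can be compared on one small data set. This is a minimal sketch using Python's standard `statistics` module; the `scores` values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical scores, used only to illustrate the four measures.
scores = [2, 4, 4, 5, 7, 9, 11]

data_range = max(scores) - min(scores)          # maximum minus minimum
variance = statistics.variance(scores)          # sample variance (n - 1 denominator)
std_dev = statistics.stdev(scores)              # sample standard deviation
q1, _, q3 = statistics.quantiles(scores, n=4)   # quartile cut points
iqr = q3 - q1                                   # interquartile range

print(data_range, round(variance, 2), round(std_dev, 2), iqr)
```

Only the range is defined as the difference between the maximum and minimum values; the other three summarize spread in different ways.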

28% of all Woodrow students believe Monday will be a snow day. You take a sample of 50 students and find that 15 of them believe Monday will be a snow day. What does 50 represent?

  • I
  • P
  • N (capital)
  • n

The tendency of getting poor output or poor outcome when the inputs or efforts are poor.

  • Garbage-in, garbage-out syndrome

A form of regression analysis where the observations are measured at the same point in time or over the same time period but differ along another dimension.

  • cross-sectional model

A sample proportion is where a random sample of objects n is taken from a population P.

  • True
  • False

A procedure of combining evidence in different reports on the same aspect. If different trials on the same regimen report varying efficacy, they can be combined to come to a unified conclusion, which may command substantially more confidence than the result of any one of the individual trials.

  • Meta-analysis

A linear combination of individual forecasts to assist in obtaining a more accurate forecast.

  • composite regression model

____________ is the Excel function that calculates the kurtosis of a data set’s distribution.

  • KURT
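
For readers without Excel, the calculation can be sketched in Python. This assumes KURT returns the bias-corrected sample excess kurtosis (the formula below, as I understand Excel's documentation); the function name `excel_style_kurt` and the data are hypothetical.

```python
import math

def excel_style_kurt(values):
    """Sample excess kurtosis, assumed to match the formula behind Excel's KURT."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four data points are required")
    mean = sum(values) / n
    # Sample standard deviation (n - 1 denominator).
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Hypothetical data set.
data = [3, 4, 5, 2, 3, 4, 5, 6, 4, 7]
print(round(excel_style_kurt(data), 4))  # -0.1518
```

A negative result indicates a flatter distribution than the normal curve; a positive result indicates a more peaked one.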

When there are more scores toward one end of the distribution than the other, this results in _____________.

  • positive
  • above average
  • negative
  • skew

The _____________ hypothesis is always the accepted fact.

  • null

The mean of the sampling distribution of the mean is the same as the _____.

  • population mean

A population has a mean of 60 and a standard deviation of 5. A random sample of 16 measurements is drawn from this population. Describe the sampling distribution of the sample means by computing its standard deviation. Assume that the population is infinite.

  • 1.96
  • 1.55
  • 1.35
  • 1.25
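
The computation behind this question can be sketched as follows, assuming the usual formula for an infinite population, standard error = σ / √n. The function name is hypothetical.

```python
import math

def standard_error_of_mean(sigma, n):
    """Standard deviation of the sampling distribution of the mean
    (infinite population, so no finite population correction)."""
    return sigma / math.sqrt(n)

# Population standard deviation 5, sample size 16.
print(standard_error_of_mean(sigma=5, n=16))  # 1.25
```

The same function applied to σ = 3 and n = 25 (the college heights question) gives 0.6.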

The ______________ is the score of a distribution residing at the 50th percentile, separating the top and bottom 50 percent of scores.

  • mode
  • median
  • mean

In hypothesis testing, the critical value is the threshold for significance.

  • critical value

The set of words that describes the essential features of a study.

  • Keywords

This condition exists when the errors do not have a constant variance across an entire range of values.

  • heteroscedasticity

The process of going deep into a phenomenon, data set, thought, etc., and looking at its various components.

  • Analysis

28% of all Woodrow students believe Monday will be a snow day. You take a sample of 50 students and find that 15 of them believe Monday will be a snow day. What does 28 represent?

  • E
  • N
  • S
  • P

The probability of occurrence of an event, such as disease, when some a priori information, such as signs and symptoms, is known; denoted by P(A/B), where what follows the slash (/) is what is known a priori.

  • Conditional probability

Using a computer to solve problems without understanding the implications of the underlying procedure. This is known as the "black box approach".

  • True
  • False

This is a statement of what a statistical hypothesis test is set up to establish.

  • alternative hypothesis

A sampling distribution is a graph of a statistic for your sample data.

  • True
  • False

The heights of female college students are normally distributed with a mean of 68 inches and a standard deviation of 3 inches. If 25 students are randomly drawn from the population, what would be the standard error of the resulting sampling distribution of the means?

  • 0.5
  • 0.6
  • 0.1
  • 0.3

A trial with the objective to examine if a new regimen is different from another regimen by more than a prespecified medically unimportant margin.

  • Equivalence trial

This consists of calculations that provide information about levels of variability within a regression model and form a basis for tests of significance.

  • ANOVA

A less scientific but a quick method to arrive at a consensus among experts.

  • Delphi method

An extraneous factor that could be an explanation of the outcome of interest in addition to the factor under study, so that its effect cannot be differentiated from the other; for example, dietary factors when examining the relationship between smoking and cervical cancer. The presence of unaccounted confounders decreases the validity of a study.

  • Confounder

The average speed of 1500 vehicles traveled on a stretch of highway that day is 67 miles per hour with a standard deviation of 3.5 miles per hour. If 100 vehicles are randomly selected as samples, what would be the mean of the resulting sampling distribution of sample means?

  • 67
  • 68
  • 69
  • 70

The heights of children of exceptionally tall (or short) parents “regress” to the mean of the population.

  • regression

A _____ is a subset drawn from a larger population.

  • sample

A plot of the residuals versus a z value (or cumulative normal percentile) derived from the normal probability distribution for the ranking location of the residual.

  • normal probability plot

The Gallup Poll asked a random sample of 1785 adults whether they attended church during the past week. Let p-hat be the proportion of people in the sample who attended church. A newspaper report claims that 40% of all U.S. adults went to church last week. Suppose this claim is true. Calculate the standard deviation of the sampling distribution.

  • 0.2514
  • 0.2585
  • 0.0116
  • 0.5264
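
A sketch of the calculation, assuming the standard formulas for the sampling distribution of a sample proportion: mean = p and standard deviation = √(p(1 − p) / n). The function name is hypothetical.

```python
import math

def proportion_sampling_distribution(p, n):
    """Return (mean, standard deviation) of the sampling distribution of p-hat."""
    mean = p
    sd = math.sqrt(p * (1 - p) / n)
    return mean, sd

# Claimed population proportion 0.40, sample size 1785.
mean, sd = proportion_sampling_distribution(0.40, 1785)
print(mean, round(sd, 4))  # 0.4 0.0116
```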

The simple arithmetic average of a distribution of variable values (or scores), the _____________ provides a single, concise numerical summary of a distribution.

  • mean
  • median
  • mode

Variety of causes of death: some people meet death slowly such as by cancer, and some sudden such as by myocardial infarction.

  • Death spectrum

A ____________ shows the number of observations falling into each of several ranges of values, which are typically portrayed as histograms.

  • distribution

The ______________, also called root mean squared error (RMSE), is the square root of the mean squared residual term from the ANOVA table of the summary output.

  • standard error of the estimate (SEE)

A _____ is a selected individual or group representing the full set of members of a certain group of interest.

  • population

The mean of sampling distribution of the proportion, P, is a special case of the _____.

  • sampling distribution of the mean

The area from the ROC curve to the base: used as an indicator of the efficacy of a test in terms of sensitivity and specificity – can be used to compare performance of various tests.

  • Area under the curve

A person without disease classified as with disease. In place of disease, false positivity can be for any other attribute.

  • False positive

A person with disease classified as without disease. In place of disease, false negativity can be for any other attribute.

  • False negative

The heights of female college students are normally distributed with a mean of 68 inches and a standard deviation of 3 inches. If 25 students are randomly drawn from the population, what would be the expected mean of the resulting sampling distribution of the means?

  • 68
  • 90
  • 45
  • 88

A standardized measure of the association or mutual dependence between two variables, say, X and Y.

  • correlation coefficient

A _____________ is the probability that the observed result, or a result more extreme, could be obtained if the null hypothesis is true.

  • p-value

A hypothesis is an ____________ about something in the world around you.

  • educated guess

Regardless of the shape of the population distribution, this theorem states that the sampling distribution of the mean of n independent sample values will approach the normal distribution as the sample size increases.

  • central limit theorem
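
The theorem can be illustrated with a small simulation. This is only a sketch: the skewed exponential population, the sample size of 50, and the 5000 repetitions are all arbitrary choices made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    # Draw n values from a right-skewed exponential population with mean 1.
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

n = 50
means = [sample_mean(n) for _ in range(5000)]

center = statistics.fmean(means)   # expected to sit near the population mean, 1
spread = statistics.stdev(means)   # expected to sit near 1 / sqrt(50), about 0.141
print(round(center, 2), round(spread, 2))
```

Even though the population is strongly skewed, a histogram of `means` would look roughly bell-shaped, centered near the population mean.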

Bibliography is a list of citations of the related literature.

  • True
  • False

A good hypothesis statement should: an “if” and “________________” statement.

  • then

Identification data of a document containing the authors' names, title, publication name, volume, publication date, page numbers, etc.

  • Citation

If you are using a significance level of .05, a ______________ allots all 5 percent to testing the statistical significance in the one direction of interest.

  • one-tailed test

The Gallup Poll asked a random sample of 1785 adults whether they attended church during the past week. Let p-hat be the proportion of people in the sample who attended church. A newspaper report claims that 40% of all U.S. adults went to church last week. Suppose this claim is true. What is the mean of the sampling distribution of p-hat?

  • 0.55
  • 0.66
  • 0.5
  • 0.4

It is possible to fit a polynomial of any number of terms to a set of data.

  • polynomial fitting

It equals the change in Y for each unit change in X.

  • slope

This is simply the SEE divided by the average of the dependent variable.

  • coefficient of variation

Increasing the sample size of an opinion poll will ___________ the variability of the estimates made from the data collected in the poll.

  • decrease
  • reduce
  • stabilize

A _____ is a value which is generated from a population.

  • parameter

Inverse of the standard error of the estimate or a derivative of this inverse.

  • Efficiency of an estimate

A statistic is unbiased if the mean of the sampling distribution is _____________ to the value of the population mean.

  • greater
  • lesser
  • equal

The _____ theorem tells us that if we have a large number of independent, identically distributed variables, the distribution will approximately follow a normal distribution.

  • central limit

When two or more comparisons or other statistical tests of hypothesis are done on the same set of data, the total probability of alpha error can increase much beyond the prefixed level such as 5%. This is known as "blinding".

  • True
  • False

____________ are used to draw inferences about a population from a sample.

  • Inferential statistics

Census is the survey of the entire population.

  • True
  • False

The first could be called missed diagnosis and the second as misdiagnosis. In place of healthy/diseased this could be any other categorization.

  • Misclassification

The Gallup Poll asked a random sample of 1785 adults whether they attended church during the past week. Let p-hat be the proportion of people in the sample who attended church. A newspaper report claims that 40% of all U.S. adults went to church last week. Suppose this claim is true. Calculate the standard deviation of the sampling distribution of p-hat.

  • 0.25652
  • 0.0045
  • 0.8987
  • 0.0116

_____ refers to the ability to draw conclusions about the characteristics of the population as a whole based on the results of data collected from a sample.

  • Generalizability

A population has a mean of 60 and a standard deviation of 5. A random sample of 16 measurements is drawn from this population. Describe the sampling distribution of the sample means by computing its mean. Assume that the population is infinite.

  • 60
  • 75
  • 70
  • 65

You can have the Y-axis on a logarithmic scale instead of a linear one.

  • log scale

The remaining portion of life at any age that would be spent without any morbidity.

  • Healthy life expectancy

The formula for the mean of the sampling distribution of the mean is μM = μ, where μM is the _____ of the mean.

  • mean of the sampling distribution

A ______________ is a set of ordered observations of a phenomenon at equally spaced time points.

  • time series

The process of reaching a decision after considering probabilities of various outcomes and value judgments regarding the utility of those outcomes.

  • Decision analysis

For data sets having a normal, bell-shaped distribution, approximately 68 percent of the data values are within 1 standard deviation of the mean; approximately 95 percent are within 2 standard deviations of the mean; and approximately 99.7 percent (nearly all) are within 3 standard deviations of the mean.

  • empirical rule
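
The three percentages can be checked against the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def share_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```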

The technique of estimating a smooth trend, usually by taking weighted averages of observations.

  • smoothing

The normality assumption is at the core of a majority of standard statistical procedures, and it is important to be able to test this assumption.

  • Lilliefors test

Similar course of the disease process in the two regimens under comparison: also evaluated in terms of comparable bioavailability of drug products, say, within 80% to 125% with respect to area under the concentration curve and Cmax.

  • Bioequivalence

In a population of five university students with GPAs of 2.5, 2.3, 1.7, 1.4, and 1.1, a sample of three students is considered. What would be the standard deviation of the resulting sampling distribution?

  • 0.88
  • 0.56
  • 0.22
  • 0.32
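
Because the population is so small, the sampling distribution can be built by listing every possible sample. The sketch below assumes sampling without replacement and treats the 10 sample means as a complete population (so it uses the population standard deviation, `pstdev`).

```python
from itertools import combinations
from statistics import fmean, pstdev

gpas = [2.5, 2.3, 1.7, 1.4, 1.1]

# Every possible sample of size 3, drawn without replacement.
sample_means = [fmean(s) for s in combinations(gpas, 3)]

print(len(sample_means))                # 10
print(round(fmean(sample_means), 2))    # 1.8
print(round(pstdev(sample_means), 2))   # 0.22
```

The mean of the sample means equals the population mean (1.8), and their standard deviation is the standard deviation of the sampling distribution.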

The 300 customers who called the call center spend an average of 45 minutes on hold, with a standard deviation of 12 minutes. What is the expected average of the sampling distribution for a sample of 150 randomly selected customers?

  • 55
  • 45
  • 50
  • 60

A difference in time between an observation and a previous observation.

  • lag

If you are going to propose a hypothesis, it’s customary to write a ______________.

  • statement

An international organisation of producers and consumers of medical research that helps to clarify research achievements, particularly health care interventions such as drugs, diet alteration and behavior change.

  • Cochrane Collaboration

A _____ quantity is a quantity without a physical unit and is thus a pure number.

  • dimensionless

A distribution is the arrangement of data by the values of one variable in order, from ______________.

  • left to right
  • low to high
  • right to left
  • high to low

The __________ statistic, named after its creators, tests the hypothesis that there is no autocorrelation of one time lag present in the errors obtained from forecasting.

  • Durbin-Watson (DW)
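
A sketch of how the statistic is computed from a series of forecast errors, assuming the usual definition: the sum of squared differences between successive errors divided by the sum of squared errors. The error values are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for lag-1 autocorrelation in a list of errors."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical forecast errors that drift slowly from positive to negative.
errors = [0.5, 0.7, 0.4, -0.2, -0.6, -0.3, 0.1]
print(round(durbin_watson(errors), 2))
```

The statistic ranges from 0 to 4. Values near 2 suggest no lag-1 autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation.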

A ______________ is calculated by subtracting the forecast value from the actual value to give an error value for each forecast period. In forecasting, this term is commonly used as a synonym for residual.

  • forecast error

The probability of occurrence of one of two or more mutually exclusive events is the sum of the probabilities of their individual occurrence.

  • Addition rule (of probability)

The average speed of 1500 vehicles traveled on a stretch of highway that day is 67 miles per hour with a standard deviation of 3.5 miles per hour. If 100 vehicles are randomly selected as samples, what would be the standard error of the resulting sampling distribution of sample means?

  • 0.39
  • 0.34
  • 0.56
  • 0.78
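
Here the population is finite (1500 vehicles) and the sample is a sizable fraction of it. The sketch below assumes the question intends the finite population correction, √((N − n) / (N − 1)), applied to σ / √n; the function name is hypothetical.

```python
import math

def standard_error_finite(sigma, n, N):
    """Standard error of the mean for a sample of n drawn without
    replacement from a finite population of size N."""
    fpc = math.sqrt((N - n) / (N - 1))   # finite population correction
    return sigma / math.sqrt(n) * fpc

# Highway question: sigma = 3.5, n = 100, N = 1500.
print(round(standard_error_finite(3.5, 100, 1500), 2))  # 0.34

# Call center question: sigma = 12, n = 150, N = 300.
print(round(standard_error_finite(12, 150, 300), 2))    # 0.69
```

Without the correction the highway result would be 3.5 / √100 = 0.35, which is not among the listed choices.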

Hypothesis testing in statistics is a way for you to _______________ of a survey or experiment to see if you have meaningful results.

  • test the results

The technique of multiple regression is an extension of simple regression.

  • multiple regression

The ___________ is calculated by dividing MSR (mean squared regression) by MSE (mean squared error), or explained variance by unexplained variance.

  • F ratio or F statistic

When the sample is small, the sampling distribution of a proportion will have an approximate normal distribution.

  • True
  • False

The statistical procedure to discover a construct out of data that can possibly explain the variation and relationship among different variables.

  • Factor analysis

As the sample size increases, the mean of the _____ of the mean will approach the population mean of μ.

  • sampling distribution

A good hypothesis statement should: Have design _______________.

  • criteria

If we could take many such samples, the collection of possible values of the statistic would follow its _______________.

  • sampling distribution

The statistical procedure to classify units or individuals into groups such that the units are similar within each group but dissimilar across groups: generally used when the number and nature of the groups are not known.

  • Cluster analysis

It’s good science to let people know if your study results are solid, or if they could have happened by chance. The usual way of doing this is to test your results with a _____________.

  • p-value

_____ are a collection of statistical tools which are used to quantitatively describe or summarize a collection of data.

  • Descriptive statistics

What does this symbol represent?

  • sample
  • variable
  • population proportion
  • ratio

If I (do this to an independent variable), then (this will happen to the _________________).

  • dependent variable

When the error terms remaining after application of a forecasting method show autocorrelation, it indicates that the forecasting method has not removed all of the pattern from the data.

  • autocorrelated errors

The result stating that the chance of a summative measure such as sample mean following a Gaussian distribution rapidly increases in almost all practical situations as the number of individuals in a sample increases (i.e., sample size becomes large).

  • Central Limit Theorem

The frequency of desired outcome per unit of resource inputs such as time, money and manpower.

  • Efficiency

If you had to choose a "best statistic" to describe a population, which of the following would be best?

  • low bias, low variability
  • low bias, high variability
  • high bias, high variability
  • high bias, low variability

How well the actual observations fit into a specified pattern.

  • Goodness of fit

If a forecast variable Y is regressed against several explanatory variables X1, X2, ..., Xk, then the estimated Y value is designated Ŷ.

  • multiple correlation coefficient

The set of characteristics such as age, disease and severity, which are necessary in a subject to be considered eligible for inclusion in the study.

  • Inclusion criteria

A nonparametric test for comparing central tendency in three or more groups.

  • Kruskal-Wallis test

The 300 customers who called the call center spend an average of 45 minutes on hold, with a standard deviation of 12 minutes. What is the standard error of the sampling distribution for a sample of 150 randomly selected customers?

  • 0.50
  • 0.58
  • 0.79
  • 0.69

If x objects have a certain characteristic then the sample proportion “p” is: p = x*n.

  • True
  • False

A summary of the death and survival pattern of a group of people—generally for the entire population of an area, but can be used for patients of a particular disease also.

  • Life table

A characteristic that is assessed in only two categories, such as ascites present or absent (or yes/no), or gender as male or female.

  • Binary variable

A prospective study of a cohort for a specified period, generally to observe the occurrence of an outcome of interest, and thereby determine the incidence.

  • Cohort study

______________ are so called because the assumptions underlying their use are fewer and weaker than those associated with parametric tests.

  • Nonparametric or distribution-free tests

In regression analysis, a ______________ is one that takes the values 0 or 1 to indicate the absence or presence of some categorical effect (month, quarter, strike, fire loss) that may be expected to shift the outcome.

  • dummy variable

In the game of Scrabble, each player begins by drawing 7 tiles from a bag containing 100 tiles. There are 42 vowel, 56 consonants, and 2 blank tiles in the bag. Cait chooses an SRS of 7 tiles. Let p-hat be the proportion of vowels in her sample.

  • Yes
  • No

Hypothesis should be testable, either by experiment or ________________.

  • observation

In a population of five university students with GPAs of 2.5, 2.3, 1.7, 1.4, and 1.1, a sample of three students is considered. What would be the mean of the resulting sampling distribution?

  • 1.8
  • 3.6
  • 2.5
  • 4.9

Cause-specific rate is the rate obtained when numerator is restricted to a particular cause (e.g., of morbidity or of mortality).

  • True
  • False

The sampling distribution of a proportion is when you repeat your survey for all possible samples of the population.

  • True
  • False