**1. Any systematic failure of a sampling method to represent its population; common errors are voluntary response, undercoverage, nonresponse ____, and response ____**

**2. An individual about whom or which we have data**

**3. A distribution is this if it's not symmetric and one tail stretches out farther than the other**

**4. When groups of experimental units are similar, it is a good idea to gather them together into these**

**5. Distributions with more than two modes**

**6. To describe this aspect of a distribution, look for single vs. multiple modes and symmetry vs. skewness**

**7. Value found by subtracting the mean and dividing by the standard deviation**
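
As a rough illustration of the calculation in item 7, the sketch below standardizes a small made-up data set. The function name `standardize`, the list `scores`, and the use of the sample standard deviation (dividing by n - 1) are assumptions for this example, not part of the card.

```python
from statistics import mean, stdev

def standardize(values):
    """Subtract the mean from each value, then divide by the standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (divides by n - 1)
    return [(v - m) / s for v in values]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
z = standardize(scores)
print([round(v, 2) for v in z])
```

Values below the mean come out negative and values above it come out positive.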

**8. The sequence of several components representing events that we are pretending will take place**

**9. The specific values that the experimenter chooses for a factor**

**10. The difference between the lowest and highest values in a data set**

**11. Extreme values that don't appear to belong with the rest of the data**

**12. The number of individuals in a sample**

**13. The middle value with half of the data above and half below it**

**14. Systematically recorded information, whether numbers or labels, together with its context**

**15. The sum of squared deviations from the mean, divided by the count minus one**
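
The arithmetic in item 15 can be sketched step by step as follows. The helper name `spread_measure` and the data are hypothetical; the comparison against `statistics.variance` is only a sanity check that the hand calculation matches the standard library.

```python
from statistics import mean, variance

def spread_measure(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    m = mean(values)
    squared_deviations = [(v - m) ** 2 for v in values]
    return sum(squared_deviations) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data; mean is 5
result = spread_measure(data)    # squared deviations sum to 32, so this is 32 / 7
print(result, variance(data))
```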

**16. Variables are said to be this if the conditional distribution of one variable is the same for each category of the other**
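
One way to picture item 16 is to compute the conditional distribution within each row of a two-way table and compare them. This is a minimal sketch: the tables, the function names, and the exact-equality check (within a small tolerance) are assumptions for illustration, and it assumes every row lists the same categories.

```python
from math import isclose

def conditional_distribution(row_counts):
    """Turn one row of counts into proportions that sum to 1."""
    total = sum(row_counts.values())
    return {category: count / total for category, count in row_counts.items()}

def same_conditional_distributions(table, tol=1e-9):
    """True if every row has the same conditional distribution."""
    distributions = [conditional_distribution(row) for row in table.values()]
    first = distributions[0]
    return all(
        isclose(dist[category], first[category], abs_tol=tol)
        for dist in distributions
        for category in first
    )

# Hypothetical counts: both groups split 40% yes / 60% no.
same_pattern = {"group A": {"yes": 20, "no": 30}, "group B": {"yes": 40, "no": 60}}
# Hypothetical counts: group B splits 50% / 50%, unlike group A.
different_pattern = {"group A": {"yes": 20, "no": 30}, "group B": {"yes": 50, "no": 50}}

print(same_conditional_distributions(same_pattern))
print(same_conditional_distributions(different_pattern))
```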

**17. Adding a constant to each data value adds the same constant to the mean, the median, and the quartiles, but does not change the standard deviation or IQR**
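
The claim in item 17 can be checked numerically. The sketch below uses made-up data and a made-up constant; the `iqr` helper and the quartile method (the default of `statistics.quantiles`) are assumptions for this example.

```python
from statistics import mean, median, stdev, quantiles

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

data = [3, 7, 8, 12, 15, 21]          # hypothetical data
constant = 10                          # hypothetical constant
shifted = [v + constant for v in data]

# Measures of position move by the constant...
print(mean(data), mean(shifted))
print(median(data), median(shifted))
# ...while measures of spread stay the same.
print(stdev(data), stdev(shifted))
print(iqr(data), iqr(shifted))
```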

**18. A variable in which the numbers act as numerical values; always has units**

**19. A variable that is not explicitly part of a model but affects the way the variables in the model appear to be related**

**20. The ____ we care about most is straight**

**21. A sampling design in which entire groups are chosen at random**

**22. A variable whose values are compared across different treatments**

**23. In a statistical display, each data value should be represented by the same amount of area**

**24. A study that asks questions of a sample drawn from some population in the hope of learning something about the entire population**

**25. A list of individuals from whom the sample is drawn**

**26. These are hard to generate, but several websites offer an unlimited supply of equally likely random values**

**27. Sampling schemes that combine several sampling methods**

**28. A hump or local high point in the shape of the distribution of a variable; the apparent locations of these can change as the scale of a histogram is changed**

**29. A sampling scheme that biases the sample in a way that gives a part of the population less representation than it has in the population**

**30. A sampling design in which the population is divided into several subpopulations (strata), and random samples are then drawn from each stratum**
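
The design in item 30 can be sketched in a few lines. Everything named here (`sample_within_groups`, the student list, the class-year grouping, and drawing the same number from each group) is a hypothetical setup chosen for illustration.

```python
import random
from collections import defaultdict

def sample_within_groups(units, group_of, n_per_group, rng):
    """Split units into subpopulations, then draw a random sample from each one."""
    groups = defaultdict(list)
    for unit in units:
        groups[group_of(unit)].append(unit)
    return {name: rng.sample(members, n_per_group) for name, members in groups.items()}

# Hypothetical population: 30 freshmen and 20 seniors.
students = [(f"student{i}", "freshman" if i < 30 else "senior") for i in range(50)]
rng = random.Random(0)  # seeded only so the example is repeatable
picked = sample_within_groups(students, lambda s: s[1], 5, rng)

for group, members in picked.items():
    print(group, [name for name, _ in members])
```

Every subpopulation is guaranteed to appear in the result, which is what distinguishes this design from drawing one random sample from the whole list.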

**31. Value calculated from data to summarize aspects of the data**

**32. Tells how many standard deviations a value is from the mean; these values have a mean of zero and a standard deviation of one**

**33. Gives a value in 'y-units per x-unit'; changes of one unit in x are associated with changes of b1 units in predicted values of y**

**34. Data points whose x-values are far from the mean of x are said to exert ____ on a linear model; with high enough ____, residuals can appear to be deceptively small**

**35. The distribution of either variable alone in a contingency table; the counts or percentages are the totals found in the margins (last row or column) of the table**
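
The totals described in item 35 can be computed directly from the cell counts. The table below is invented for illustration, and the variable names are assumptions.

```python
# Hypothetical contingency table: class year (rows) by commute method (columns).
counts = {
    "freshman": {"bus": 10, "car": 5},
    "senior": {"bus": 4, "car": 11},
}

# Totals for the row variable (the last column of the printed table).
row_totals = {row: sum(cells.values()) for row, cells in counts.items()}

# Totals for the column variable (the last row of the printed table).
column_totals = {}
for cells in counts.values():
    for column, n in cells.items():
        column_totals[column] = column_totals.get(column, 0) + n

grand_total = sum(row_totals.values())
row_percentages = {row: 100 * t / grand_total for row, t in row_totals.items()}

print(row_totals, column_totals, row_percentages)
```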

**36. Shows quantitative data values in a way that sketches the distribution of the data**

**37. The experimental units assigned to a baseline treatment level, typically either the default treatment, which is well understood, or a null, placebo treatment**

**38. An equation of the form y-hat = b0 + b1x**
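
To make the equation in item 38 concrete, the sketch below estimates b0 and b1 and then predicts a new value. The card does not say how the coefficients are chosen; least squares is assumed here, and `fit_line`, `predict`, and the data are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Assumed method: least squares estimates of intercept b0 and slope b1."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx             # slope, in y-units per x-unit
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1

xs = [1, 2, 3, 4, 5]       # hypothetical data lying exactly on y = 3 + 2x
ys = [5, 7, 9, 11, 13]
b0, b1 = fit_line(xs, ys)

def predict(x):
    """y-hat = b0 + b1 * x"""
    return b0 + b1 * x

print(b0, b1, predict(6))
```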

**39. Bias introduced to a sample when a large fraction of those sampled fails to respond**

**40. Numerically valued attribute of a model**

**41. A study based on data in which no manipulation of factors has been employed**

**42. Summarized with the standard deviation, interquartile range, and range**

**43. An event is this if we know what outcomes could happen, but not which particular values will happen**

**44. An individual result of a component of a simulation**

**45. This of sample size n is one in which each set of n elements in the population has an equal chance of selection**
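
A sample of the kind described in item 45 can be drawn with Python's `random.sample`, which selects without replacement so that every set of n elements is equally likely. The population of numbered individuals, the sample size, and the seed are assumptions for this sketch.

```python
import random

population = list(range(1, 101))  # hypothetical population: individuals numbered 1 to 100
n = 10                            # hypothetical sample size

rng = random.Random(42)           # seeded only so the example is repeatable
sample = rng.sample(population, k=n)  # draws n distinct individuals

print(sorted(sample))
```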

**46. A variable that names categories (whether with words or numerals)**

**47. When both those who could influence and evaluate the results are blinded**

**48. A value that attempts the impossible by summarizing the entire distribution with a single number - a 'typical' value**

**49. The process, intervention, or other controlled circumstance applied to randomly assigned experimental units**

**50. Lists the categories in a categorical variable and gives the count or percentage of observations for each category**
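
The table described in item 50 can be built from raw observations with `collections.Counter`. The responses below are invented, and the layout of the printed table is just one possible choice.

```python
from collections import Counter

# Hypothetical categorical data: favorite color reported by six people.
responses = ["red", "blue", "red", "green", "blue", "red"]

counts = Counter(responses)
total = sum(counts.values())

# Map each category to (count, percentage of all observations).
table = {category: (n, 100 * n / total) for category, n in counts.items()}

for category, (count, percent) in sorted(table.items()):
    print(f"{category:<6} {count:>3} {percent:5.1f}%")
```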