# AP Statistics Vocab

**1. Manipulates factor levels to create treatments, randomly assigns subjects to these treatment levels, and then compares the responses of the subject groups across treatment levels**

**2. The lower of this is the value with a quarter of the data below it; the upper of this has a quarter of the data above it**

**3. A design in which randomization occurs within blocks**
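
The idea in item 3 can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the subject names, the class-year blocks, the treatment labels `"A"` and `"B"`, and the helper `assign_within_blocks` are all made up for the example.

```python
import random

# Hypothetical subjects, each tagged with a blocking variable (made-up data)
subjects = [
    ("Ana", "freshman"), ("Ben", "freshman"), ("Cal", "freshman"), ("Dee", "freshman"),
    ("Eli", "senior"), ("Fay", "senior"), ("Gus", "senior"), ("Hal", "senior"),
]

def assign_within_blocks(subjects, treatments=("A", "B"), seed=0):
    """Shuffle subjects separately inside each block, then deal out treatments."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    blocks = {}
    for name, block in subjects:
        blocks.setdefault(block, []).append(name)
    assignment = {}
    for block, members in blocks.items():
        rng.shuffle(members)  # randomization happens inside the block
        for i, name in enumerate(members):
            assignment[name] = (block, treatments[i % len(treatments)])
    return assignment

assignment = assign_within_blocks(subjects)
for name, (block, treatment) in sorted(assignment.items()):
    print(name, block, treatment)
```

Because the shuffle is done block by block, every block ends up with the same number of subjects in each treatment.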

**4. A normal model with a mean of 0 and a standard deviation of 1**

**5. The square root of the variance**
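
As a worked check of item 5, the sketch below computes the variance by hand and takes its square root. The data values are made up, and the sample (n - 1) form of the variance is assumed.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

mean = statistics.mean(data)
# Sample variance: sum of squared deviations divided by n - 1
variance = sum((x - mean) ** 2 for x in data) / (len(data) - 1)
sd = math.sqrt(variance)

print(variance, sd)
```

The result should agree with `statistics.stdev(data)`, which uses the same n - 1 convention.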

**6. Values of this record the results of each trial with respect to what we were interested in**

**7. A distribution is this if it's not symmetric and one tail stretches out farther than the other**

**8. A hump or local high point in the shape of the distribution of a variable; the apparent locations of these can change as the scale of a histogram is changed**

**9. The difference between the first and third quartiles**
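
A quick numerical sketch of item 9 follows. The data are made up, and quartile conventions differ between textbooks and software; this example relies on the default `"exclusive"` method of Python's `statistics.quantiles`, so other tools may give slightly different quartiles for the same data.

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]  # made-up values

# n=4 splits the data into four equal parts and returns the three cut points
q1, median, q3 = statistics.quantiles(data, n=4)
spread = q3 - q1  # difference between the third and first quartiles

print(q1, median, q3, spread)
```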

**10. An observational study in which subjects are followed to observe future outcomes**

**11. A study based on data in which no manipulation of factors has been employed**

**12. Shows quantitative data values in a way that sketches the distribution of the data**

**13. The most basic situation in a simulation in which something happens at random**

**14. Graphs a dot for each case against a single axis**

**15. A scatterplot shows an association that is this if there is little scatter around the underlying relationship**

**16. When the levels of one factor are associated with the levels of another factor so their effects cannot be separated**

**17. A variable that is not explicitly part of a model but affects the way the variables in the model appear to be related**

**18. A sampling scheme that biases the sample in a way that gives a part of the population less representation than it has in the population**

**19. Consists of the individuals who are conveniently available**

**20. The natural tendency of randomly drawn samples to differ**
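
The tendency in item 20 is easy to see by simulation. In this sketch the population (the integers 1 to 100), the sample size of 10, and the seed are arbitrary choices made for illustration.

```python
import random
import statistics

rng = random.Random(1)            # fixed seed only so the sketch is repeatable
population = list(range(1, 101))  # made-up population

# Draw five separate random samples of size 10 and record each sample's mean
sample_means = [statistics.mean(rng.sample(population, 10)) for _ in range(5)]

print(sample_means)
```

Each sample comes from the same population, yet the means generally differ from one draw to the next.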

**21. This of sample size n is one in which each set of n elements in the population has an equal chance of selection**
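
One way to draw the kind of sample described in item 21 is with `random.sample`, which selects without replacement so that every set of n elements is equally likely. The population of 50 numbered individuals and the seed below are assumptions for the example.

```python
import random

population = list(range(1, 51))  # made-up population: individuals numbered 1 to 50
n = 5

rng = random.Random(42)          # fixed seed only so the sketch is repeatable
sample = rng.sample(population, n)

print(sample)
```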

**22. Any systematic failure of a sampling method to represent its population; common errors are voluntary response, undercoverage, nonresponse ____, and response ____**

**23. Displays counts and, sometimes, percentages of individuals falling into named categories on two or more variables; categorizes the individuals on all variables at once, to reveal possible patterns in one variable that may be contingent on the cate**

**24. Done to eliminate units; values can be compared and combined even if the original variables had different units and magnitudes**
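
The operation in item 24 can be sketched as subtracting the mean and dividing by the standard deviation. The height and weight values and the helper name `standardize` are made up for illustration.

```python
import statistics

heights_cm = [150, 160, 170, 180, 190]  # made-up values
weights_kg = [50, 60, 65, 80, 95]       # made-up values

def standardize(values):
    """Return (value - mean) / standard deviation for each value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

z_heights = standardize(heights_cm)
z_weights = standardize(weights_kg)

print(z_heights)
print(z_weights)
```

Both lists are now unitless, with mean 0 and standard deviation 1, so a height and a weight can be compared directly.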

**25. The best defense against bias, in which each individual is given a fair, random chance of selection**

**26. The specific values that the experimenter chooses for a factor**

**27. A representative subset of a population, examined in hope of learning about the population**

**28. A variable in which the numbers act as numerical values; always has units**

**29. A numerically valued attribute of a model for a population**

**30. The experimental units assigned to a baseline treatment level, typically either the default treatment, which is well understood, or a null, placebo treatment**

**31. Shows the relationship between two quantitative variables measured on the same cases**

**32. A variable whose values are compared across different treatments**

**33. A quantity or amount adopted as a standard of measurement, such as dollars, hours, or grams**

**34. Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values**

**35. A variable that names categories (whether with words or numerals)**

**36. To describe this aspect of a distribution, look for single vs. multiple modes, and symmetry vs. skewness**

**37. Systematically recorded information, whether numbers or labels, together with its context**

**38. Bias introduced to a sample when a large fraction of those sampled fails to respond**

**39. The square of the correlation between y and x; gives the fraction of the variability of y accounted for by the least squares linear regression on x; an overall measure of how successful the regression is in linearly relating y to x**
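
The two descriptions in item 39 can be checked against each other numerically. In this sketch the data are made up; the first calculation squares the correlation, and the second computes the fraction of the variability of y accounted for by the least squares line.

```python
import math

x = [1, 2, 3, 4, 5]  # made-up values
y = [2, 4, 5, 4, 5]  # made-up values

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Route 1: square the correlation between y and x
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Route 2: 1 - (unexplained variability) / (total variability)
slope = sxy / sxx
intercept = y_bar - slope * x_bar
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
explained_fraction = 1 - sse / syy

print(r_squared, explained_fraction)
```

For these values both routes give 0.6 (up to floating-point rounding), meaning the line accounts for 60% of the variability in y.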

**40. A distribution is this if the two halves on either side of the center look approximately like mirror images of each other**

**41. When both those who could influence and evaluate the results are blinded**

**42. The ith ___ is the number that falls above i% of the data**

**43. The differences between data values and the corresponding values predicted by the regression model; ____ = observed value - predicted value**
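
The formula in item 43 can be applied directly once a least squares line has been fitted. The data here are made up, and the line is fitted by hand with the usual slope and intercept formulas.

```python
hours = [1, 2, 3, 4]   # made-up x values
scores = [3, 5, 4, 8]  # made-up y values

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n

# Least squares slope and intercept
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores)) / sum(
    (x - x_bar) ** 2 for x in hours
)
intercept = y_bar - slope * x_bar

predicted = [intercept + slope * x for x in hours]
# observed value - predicted value, one per case
differences = [y - y_hat for y, y_hat in zip(scores, predicted)]

print(differences)
```

A positive value means the point lies above the line, a negative value means it lies below, and for a least squares fit the values sum to zero.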

**44. Gives the possible values of the variable and the relative frequency of each value**

**45. A sample is this if the statistics computed from it accurately reflect the corresponding population parameters**

**46. Summarized with the mean or the median**

**47. Control, randomize, replicate, block**

**48. Individuals on whom an experiment is performed**

**49. Having one mode; this is a useful term for describing the shape of a histogram when it's generally mound-shaped**

**50. The ____ we care about most is straight**