## Test your basic knowledge

# AP Statistics Vocab

**Instructions:**

- Answer 50 questions in 15 minutes.
- If you are not ready to take this test, you can study here.
- Match each statement with the correct term.
- Don't refresh. All questions and answers are randomly picked and ordered every time you load a test.

This is a study tool. The 3 wrong answers for each question are randomly chosen from the answers to other questions, so the correct answer may sometimes be obvious. Even so, retaking the test reinforces your understanding each time.

**1. Ideally tells who was measured - what was measured - how the data were collected - where the data were collected - and when and why the study was performed**

**2. A variable whose values are compared across different treatments**

**3. Useful family of models for unimodal - symmetric distributions**

**4. If data consist of two or more groups that have been thrown together - it is usually better to fit a separate linear model to each group than to try to fit a single model to all of the data**

**5. Consists of the individuals who are conveniently available**

**6. A distribution that's roughly flat**

**7. A numerical measure of the direction and strength of a linear association**
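
As a study aid, the measure described in statement 7 can be computed straight from its usual textbook definition. The sketch below is only an illustration: the function name `linear_association` and the sample numbers are made up for this example.

```python
import math

def linear_association(xs, ys):
    """Return a value in [-1, 1] for the direction and strength of a
    linear relationship between paired values xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and quiz scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
print(linear_association(hours, scores))
```

A result near +1 or -1 indicates a strong linear pattern (positive or negative), and a result near 0 indicates little linear pattern.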

**8. This corresponding to a z-score gives the percentage of values in a standard normal distribution found at that z-score or below**
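
For statement 8, the percentage at or below a given z-score can be approximated with the standard normal cumulative distribution, which Python's `math.erf` makes easy to write down. This is a minimal sketch; the function name `percent_at_or_below` is an assumption for illustration.

```python
import math

def percent_at_or_below(z):
    """Percentage of a standard normal distribution at or below z."""
    # Standard normal CDF expressed through the error function.
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

for z in (-1.0, 0.0, 1.96):
    print(z, round(percent_at_or_below(z), 2))
```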

**9. Lists the categories in a categorical variable and gives the count or percentage of observations for each category**

**10. The most basic situation in a simulation in which something happens at random**

**11. The number of individuals in a sample**

**12. Shows how a 'whole' divides into categories by showing a wedge of a circle whose area corresponds to the proportion in each category**

**13. Data points whose x-values are far from the mean of x are said to exert ____ on a linear model; with high enough ____ - residuals can appear to be deceptively small**

**14. These are hard to generate - but several websites offer an unlimited supply of equally likely random values**

**15. A hump or local high point in the shape of the distribution of a variable; the apparent locations of these can change as the scale of a histogram is changed**

**16. When an observed difference is too large for us to believe that it is likely to have occurred naturally**

**17. An experimental design in which randomization occurs within blocks**

**18. The lower of this is the value with a quarter of the data below it; the upper of this has a quarter of the data above it**

**19. In a statistical display - each data value should be represented by the same amount of area**

**20. Graphs a dot for each case against a single axis**

**21. Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values**

**22. A sample is this if the statistics computed from it accurately reflect the corresponding population parameters**

**23. Extreme values that don't appear to belong with the rest of the data**

**24. The differences between data values and the corresponding values predicted by the regression model; ____ = observed value - predicted value**
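
The formula in statement 24 is simple enough to try by hand or in a few lines of code. The sketch below assumes a hypothetical fitted line with intercept `b0` and slope `b1`; the function name and the data are invented for illustration.

```python
def observed_minus_predicted(xs, ys, b0, b1):
    """For each point, return observed y minus the y predicted by b0 + b1*x."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Hypothetical model y-hat = 1 + 2x and three observed points.
xs = [1, 2, 3]
ys = [3.5, 4.0, 7.5]
print(observed_minus_predicted(xs, ys, b0=1, b1=2))  # [0.5, -1.0, 0.5]
```

A positive value means the model underestimated the observed value, and a negative value means it overestimated.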

**25. The square of the correlation between y and x; gives the fraction of the variability of y accounted for by the least squares linear regression on x; an overall measure of how successful the regression is in linearly relating y to x**
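
Statement 25 describes the same number in two ways: as a squared correlation and as a fraction of variability accounted for. A short numerical check, assuming a least squares fit computed from the standard formulas, shows that the two agree. All names and data here are illustrative.

```python
def _sums(xs, ys):
    """Return mean_x, mean_y, Sxy, Sxx, Syy for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy

def squared_correlation(xs, ys):
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy ** 2 / (sxx * syy)

def fraction_of_variability_explained(xs, ys):
    mean_x, mean_y, sxy, sxx, syy = _sums(xs, ys)
    b1 = sxy / sxx               # least squares slope
    b0 = mean_y - b1 * mean_x    # least squares intercept
    # Variability left over after fitting the line.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / syy

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(squared_correlation(xs, ys), fraction_of_variability_explained(xs, ys))
```

For this made-up data set both values come out to about 0.6, meaning roughly 60% of the variability in y is accounted for by the line.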

**26. Manipulates factor levels to create treatments - randomly assigns subjects to these treatment levels - and then compares the responses of the subject groups across treatment levels**

**27. The middle value with half of the data above and half below it**

**28. The distribution of a variable when the 'who' is restricted to a smaller group of individuals**

**29. A point that does not fit the overall pattern seen in the scatterplot**

**30. The linear equation y-hat = b0 + b1x that satisfies the least squares criterion**
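
The coefficients b0 and b1 in statement 30 have closed-form solutions: the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the intercept makes the line pass through the point of means. The sketch below applies those formulas; the function name and data are hypothetical.

```python
def least_squares_coefficients(xs, ys):
    """Return (b0, b1) for y-hat = b0 + b1*x under the least squares criterion."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: line passes through (mean_x, mean_y)
    return b0, b1

b0, b1 = least_squares_coefficients([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y-hat = {b0:.2f} + {b1:.2f}x")
```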

**31. To describe this aspect of a distribution - look for single vs. multiple modes - and symmetry vs. skewness**

**32. A variable in which the numbers act as numerical values; always has units**

**33. Bias introduced to a sample when a large fraction of those sampled fails to respond**

**34. A variable that names categories (whether with words or numerals)**

**35. The process - intervention - or other controlled circumstance applied to randomly assigned experimental units**

**36. The natural tendency of randomly drawn samples to differ**

**37. A variable whose levels are controlled by the experimenter**

**38. Gives the possible values of the variable and the relative frequency of each value**

**39. In a normal model - about 68% of values fall within 1 standard deviation of the mean - about 95% fall within 2 standard deviations of the mean - and about 99.7% fall within 3 standard deviations of the mean**
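
The three percentages in statement 39 are rounded values. As a quick check, the share of a normal model within k standard deviations of the mean equals erf(k / sqrt(2)), which the sketch below evaluates. The function name `percent_within` is an assumption for illustration.

```python
import math

def percent_within(k):
    """Percentage of a normal model within k standard deviations of the mean."""
    return 100 * math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(percent_within(k), 2))
```

The computed values are close to, but not exactly, 68%, 95%, and 99.7%, which is why the statement says "about".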

**40. An event is this if we know what outcomes could happen - but not which particular values will happen**

**41. A distribution is this if the two halves on either side of the center look approximately like mirror images of each other**

**42. Individuals on whom an experiment is performed**

**43. The parts of a distribution that typically trail off on either side; they can be characterized as long or short**

**44. A list of individuals from whom the sample is drawn**

**45. A treatment known to have no effect - administered so that all groups experience the same conditions**

**46. An individual result of a component of a simulation**

**47. Any systematic failure of a sampling method to represent its population; common errors are voluntary response - undercoverage - nonresponse ____ - and response ____**

**48. A sampling design in which the population is divided into several subpopulations - and random samples are then drawn from each stratum**
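
The design in statement 48 can be sketched in a few lines: split the population into groups, then draw a separate random sample from each. The example below is a minimal illustration using Python's `random` module; the function name, the grade-level groups, and the student labels are all invented.

```python
import random

def sample_within_groups(groups, n_per_group, seed=None):
    """Draw a simple random sample of size n_per_group from every group.

    groups maps a group name to the list of individuals in that group.
    """
    rng = random.Random(seed)  # seed only to make the example repeatable
    return {name: rng.sample(members, n_per_group)
            for name, members in groups.items()}

# Hypothetical population split by grade level.
population = {
    "freshmen": [f"F{i}" for i in range(100)],
    "seniors": [f"S{i}" for i in range(80)],
}
print(sample_within_groups(population, n_per_group=5, seed=1))
```

Sampling within each group guarantees that every group is represented, which a single simple random sample of the whole population does not.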

**49. A quantity or amount adopted as a standard of measurement - such as dollars - hours - or grams**

**50. A numerically valued attribute of a model for a population**