## Test your basic knowledge

# AP Statistics Vocab

**Instructions:**

- Answer 50 questions in 15 minutes.
- If you are not ready to take this test, you can study here.
- Match each statement with the correct term.
- Don't refresh. All questions and answers are randomly picked and ordered every time you load a test.

This is a study tool. The 3 wrong answers for each question are randomly chosen from the answers to other questions. At times the correct answer may seem obvious, but taking the test repeatedly reinforces your understanding.

**1. Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values**

**2. The differences between data values and the corresponding values predicted by the regression model; ____ = observed value - predicted value**
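The arithmetic in statement 2 can be sketched in a few lines of Python. The data points, the fitted line `y_hat = 1.0 + 2.0 * x`, and the function name are all made up for illustration; the name is kept neutral so the snippet does not give away the term.

```python
def observed_minus_predicted(xs, ys, intercept, slope):
    """Return observed y minus the y predicted by the line, for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data and a hypothetical fitted line y_hat = 1.0 + 2.0 * x
xs = [1.0, 2.0, 3.0]
ys = [3.5, 4.0, 7.5]

differences = observed_minus_predicted(xs, ys, intercept=1.0, slope=2.0)
print(differences)  # [0.5, -1.0, 0.5]
```

A positive difference means the point lies above the line, and a negative one means it lies below.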

**3. A distribution is this if the two halves on either side of the center look approximately like mirror images of each other**

**4. Having one mode; this is a useful term for describing the shape of a histogram when it's generally mound-shaped**

**5. Data points whose x-values are far from the mean of x are said to exert ____ on a linear model; with high enough ____, residuals can appear to be deceptively small**

**6. Holds information about the same characteristic for many cases**

**7. This corresponding to a z-score gives the percentage of values in a standard normal distribution found at that z-score or below**

**8. A numerically valued attribute of a model for a population**

**9. Useful family of models for unimodal, symmetric distributions**

**10. Doing this is equivalent to changing its units**

**11. If data consist of two or more groups that have been thrown together, it is usually better to fit a separate linear model to each group than to try to fit a single model to all of the data**

**12. When omitting a point from the data results in a very different regression model, the point is an ____**

**13. All experimental units have an equal chance of receiving any treatment**

**14. Individuals on whom an experiment is performed**

**15. In a statistical display, each data value should be represented by the same amount of area**

**16. An arrangement of data in which each row represents a case and each column represents a variable**

**17. A sampling scheme that biases the sample in a way that gives a part of the population less representation than it has in the population**

**18. The square of the correlation between y and x; gives the fraction of the variability of y accounted for by the least squares linear regression on x; an overall measure of how successful the regression is in linearly relating y to x**
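Statement 18 describes one quantity in two ways: as the square of the correlation, and as the fraction of the variability of y accounted for by the least squares line. A minimal sketch, using invented data and illustrative function names, shows that the two calculations agree:

```python
def squared_correlation(xs, ys):
    """Square of the correlation between x and y."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

def fraction_of_variability_explained(xs, ys):
    """1 - (squared error left after the least squares fit) / (total variability of y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # least squares slope
    intercept = mean_y - slope * mean_x  # least squares intercept
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / syy

# Hypothetical data, chosen only to keep the arithmetic easy to follow
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(squared_correlation(xs, ys))                # about 0.6
print(fraction_of_variability_explained(xs, ys))  # about 0.6
```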

**19. When averages are taken across different groups, they can appear to contradict the overall averages**

**20. This of sample size n is one in which each set of n elements in the population has an equal chance of selection**

**21. Uses adjacent bars to show the distribution of values in a quantitative variable; each bar represents the frequency (or relative frequency) of values falling in an interval of values**

**22. The natural tendency of randomly drawn samples to differ**

**23. When an observed difference is too large for us to believe that it is likely to have occurred naturally**

**24. Each predicted y-hat tends to be fewer standard deviations from its mean than its corresponding x was from its mean**

**25. Distributions with more than two modes**

**26. The parts of a distribution that typically trail off on either side; they can be characterized as long or short**

**27. In a normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean**
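The percentages quoted in statement 27 can be checked numerically. The sketch below relies on the standard identity that, for a normal model, the share of values within k standard deviations of the mean equals `erf(k / sqrt(2))`; the function name is illustrative only.

```python
from math import erf, sqrt

def share_within(k):
    """Fraction of a normal model lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545, and 0.9973
    print(k, round(share_within(k), 4))
```

The rounded figures 68%, 95%, and 99.7% are approximations of these values.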

**28. Anything in a survey design that influences response**

**29. The entire group of individuals or instances about whom we hope to learn**

**30. The sequence of several components representing events that we are pretending will take place**

**31. The sum of squared deviations from the mean, divided by the count minus one**
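The calculation described in statement 31 can be written out step by step. This is a small sketch with made-up data; the function name is deliberately neutral so that it does not reveal the term being asked for.

```python
def squared_deviation_average(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((v - mean) ** 2 for v in values)
    return sum_of_squares / (n - 1)

# Hypothetical data: the mean is 3 and the squared deviations are 4, 1, 0, 1, 4
data = [1, 2, 3, 4, 5]
print(squared_deviation_average(data))  # 2.5
```

The function needs at least two values, because the divisor is the count minus one.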

**32. A variable that is not explicitly part of a model but affects the way the variables in the model appear to be related**

**33. Found by summing all the data values and dividing by the count**

**34. Value calculated from data to summarize aspects of the data**

**35. A point that does not fit the overall pattern seen in the scatterplot**

**36. Places in order the effects that many re-expressions have on the data**

**37. Consists of the minimum and maximum, the quartiles Q1 and Q3, and the median**

**38. When groups of experimental units are similar, it is a good idea to gather them together into these**

**39. A normal model with a mean of 0 and a standard deviation of 1**

**40. Numerically valued attribute of a model**

**41. An event is this if we know what outcomes could happen, but not which particular values will happen**

**42. The most basic situation in a simulation in which something happens at random**

**43. Done to eliminate units; values can be compared and combined even if the original variables had different units and magnitudes**

**44. Gives the possible values of the variable and the frequency or relative frequency of each value**

**45. A distribution that's roughly flat**

**46. A list of individuals from whom the sample is drawn**

**47. Gives a value in 'y-units per x-unit'; changes of one unit in x are associated with changes of b1 units in predicted values of y**

**48. A sampling design in which entire groups are chosen at random**

**49. The square root of the variance**

**50. A variable whose values are compared across different treatments**