## Test your basic knowledge

# AP Statistics Vocab

**Instructions:**

- Answer 50 questions in 15 minutes.
- If you are not ready to take this test, you can study here.
- Match each statement with the correct term.
- Don't refresh. All questions and answers are randomly picked and ordered every time you load a test.

This is a study tool. The 3 wrong answers for each question are randomly chosen from the answers to other questions, so the correct answer may sometimes be obvious. Even so, each attempt reinforces your understanding.

**1. Graphs a dot for each case against a single axis**

**2. A scatterplot shows an association that is this if there is little scatter around the underlying relationship**

**3. When either those who could influence the results or those who evaluate them are blinded**

**4. Any data point that stands away from the others; can be extraordinary by having a large residual or by having high leverage**

**5. Summarized with the standard deviation - interquartile range - and range**

**6. An observational study in which subjects are followed to observe future outcomes**

**7. Gives the possible values of the variable and the relative frequency of each value**

**8. A treatment known to have no effect - administered so that all groups experience the same conditions**

**9. Distributions with two modes**

**10. A normal model with a mean of 0 and a standard deviation of 1**

**11. Done to eliminate units; values can be compared and combined even if the original variables had different units and magnitudes**
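As a minimal sketch of the operation this statement describes, the Python below subtracts the mean from each value and divides by the standard deviation, so the results carry no units. The function name `to_unitless_scores` and the height data are hypothetical, and using the sample standard deviation (dividing by n - 1) is an assumed choice.

```python
from statistics import mean, stdev


def to_unitless_scores(values):
    """Rescale each value to (value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1); an assumed choice
    return [(v - m) / s for v in values]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # hypothetical data
heights_in = [h / 2.54 for h in heights_cm]       # the same heights in inches

scores_cm = to_unitless_scores(heights_cm)
scores_in = to_unitless_scores(heights_in)
```

Up to floating-point rounding, the two lists of scores agree even though the inputs were measured in different units, which is why such values can be compared and combined.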

**12. An observational study in which subjects are selected and then their previous conditions or behaviors are determined**

**13. Sampling schemes that combine several sampling methods**

**14. A positive ____ or association means that - in general - as one variable increases - so does the other; when increases in one variable generally correspond to decreases in the other - the association is negative**

**15. A variable that is not explicitly part of a model but affects the way the variables in the model appear to be related**

**16. This - b0 - gives a starting value in y-units; it's the y-hat-value when x is 0**

**17. A design in which randomization occurs within blocks**

**18. When omitting a point from the data results in a very different regression model - the point is an ____**

**19. Bias introduced to a sample when individuals can choose on their own whether to participate in the sample**

**20. The lower of this is the value with a quarter of the data below it; the upper of this has a quarter of the data above it**

**21. In a statistical display - each data value should be represented by the same amount of area**

**22. A list of individuals from whom the sample is drawn**

**23. A distribution is this if it's not symmetric and one tail stretches out farther than the other**

**24. The entire group of individuals or instances about whom we hope to learn**

**25. When both those who could influence the results and those who evaluate them are blinded**

**26. A sampling scheme that biases the sample in a way that gives a part of the population less representation than it has in the population**

**27. We do this by taking the logarithm - the square root - the reciprocal - or some other mathematical operation on all values in the data set**

**28. When an observed difference is too large for us to believe that it is likely to have occurred naturally**

**29. A sample is this if the statistics computed from it accurately reflect the corresponding population parameters**

**30. Data points whose x-values are far from the mean of x are said to exert ____ on a linear model; with high enough ____ - residuals can appear to be deceptively small**

**31. All experimental units have an equal chance of receiving any treatment**

**32. Any individual associated with an experiment who is not aware of how subjects have been allocated to treatment groups**

**33. This criterion specifies the unique line that minimizes the variance of the residuals or - equivalently - the sum of the squared residuals**
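The criterion in this statement can be illustrated with a short Python sketch, assuming a line of the form y-hat = b0 + b1 * x. The helper names `fit_line` and `sum_squared_residuals` and the data are hypothetical. The slope and intercept come from the usual closed-form formulas for the line that minimizes the sum of squared residuals.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (b0, b1) for the line y-hat = b0 + b1 * x that
    minimizes the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # intercept: the y-hat value when x is 0
    return b0, b1


def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of (observed y - predicted y) squared for a candidate line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)
```

Nudging either `b0` or `b1` away from the fitted values and recomputing `sum_squared_residuals` gives a larger total, which is what makes the fitted line unique.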

**34. The most basic situation in a simulation in which something happens at random**

**35. Shows the relationship between two quantitative variables measured on the same cases**

**36. The natural tendency of randomly drawn samples to differ**

**37. Although linear models provide an easy way to predict values of y for a given value of x - it is unsafe to predict for values of x far from the ones used to find the linear model equation; predictions should not be trusted**

**38. An arrangement of data in which each row represents a case and each column represents a variable**

**39. The distribution of a variable when the 'who' is restricted to a smaller group of individuals**

**40. Value calculated from data to summarize aspects of the data**

**41. Numerically valued attribute of a model**

**42. Found by substituting the x-value in the regression equation; they're the values on the fitted line**

**43. A sampling design in which entire groups are chosen at random**

**44. Displays counts and - sometimes - percentages of individuals falling into named categories on two or more variables; categorizes the individuals on all variables at once - to reveal possible patterns in one variable that may be contingent on the cate**

**45. A numerical summary of how tightly the values are clustered around the 'center'**

**46. Having one mode; this is a useful term for describing the shape of a histogram when it's generally mound-shaped**

**47. Shows how a 'whole' divides into categories by showing a wedge of a circle whose area corresponds to the proportion in each category**

**48. Holds information about the same characteristic for many cases**

**49. When doing this - consider their shape - center - and spread**

**50. A numerical measure of the direction and strength of a linear association**
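One common way to compute such a measure, sketched here in Python, is to convert both variables to unitless scores, multiply them pairwise, and divide the total by n - 1. The function name `linear_association` and the data are hypothetical, and the sample standard deviation is an assumed choice.

```python
from statistics import mean, stdev


def linear_association(xs, ys):
    """Sum of products of unitless scores, divided by n - 1.
    The result lies between -1 and 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - x_bar) / sx) * ((y - y_bar) / sy)
                for x, y in zip(xs, ys))
    return total / (n - 1)


xs = [1.0, 2.0, 3.0, 4.0]       # hypothetical data
ys_up = [3.0, 5.0, 7.0, 9.0]    # rises exactly linearly with x
ys_down = [9.0, 7.0, 5.0, 3.0]  # falls exactly linearly with x

r_up = linear_association(xs, ys_up)
r_down = linear_association(xs, ys_down)
```

The sign of the result gives the direction of the association, and values closer to -1 or 1 indicate less scatter around a straight line.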