AP Statistics Vocab

1. Graphs a dot for each case against a single axis

2. A scatterplot shows an association that is this if there is little scatter around the underlying relationship

3. When either those who could influence or those who evaluate the results are blinded

4. Any data point that stands away from the others; can be extraordinary by having a large residual or by having high leverage

5. Summarized with the standard deviation, interquartile range, and range

6. An observational study in which subjects are followed to observe future outcomes

7. Gives the possible values of the variable and the relative frequency of each value

8. A treatment known to have no effect, administered so that all groups experience the same conditions

9. Distributions with two modes

10. A normal model with a mean of 0 and a standard deviation of 1

11. Done to eliminate units; values can be compared and combined even if the original variables had different units and magnitudes

12. An observational study in which subjects are selected and then their previous conditions or behaviors are determined

13. Sampling schemes that combine several sampling methods

14. A positive ____ or association means that, in general, as one variable increases, so does the other; when increases in one variable generally correspond to decreases in the other, the association is negative

15. A variable that is not explicitly part of a model but affects the way the variables in the model appear to be related

16. This, b0, gives a starting value in y-units; it's the y-hat value when x is 0

17. Randomization occurring within blocks

18. When omitting a point from the data results in a very different regression model, the point is an ____

19. Bias introduced to a sample when individuals can choose on their own whether to participate in the sample

20. The lower of this is the value with a quarter of the data below it; the upper of this has a quarter of the data above it

21. In a statistical display, each data value should be represented by the same amount of area

22. A list of individuals from whom the sample is drawn

23. A distribution is this if it's not symmetric and one tail stretches out farther than the other

24. The entire group of individuals or instances about whom we hope to learn

25. When both those who could influence and evaluate the results are blinded

26. A sampling scheme that biases the sample in a way that gives a part of the population less representation than it has in the population

27. We do this by taking the logarithm, the square root, the reciprocal, or some other mathematical operation on all values in the data set

28. When an observed difference is too large for us to believe that it is likely to have occurred naturally

29. A sample is this if the statistics computed from it accurately reflect the corresponding population parameters

30. Data points whose x-values are far from the mean of x are said to exert ____ on a linear model; with high enough ____, residuals can appear to be deceptively small

31. All experimental units have an equal chance of receiving any treatment

32. Any individual associated with an experiment who is not aware of how subjects have been allocated to treatment groups

33. This criterion specifies the unique line that minimizes the variance of the residuals or, equivalently, the sum of the squared residuals

34. The most basic situation in a simulation in which something happens at random

35. Shows the relationship between two quantitative variables measured on the same cases

36. The natural tendency of randomly drawn samples to differ

37. Although linear models provide an easy way to predict values of y for a given value of x, it is unsafe to predict for values of x far from the ones used to find the linear model equation; predictions should not be trusted

38. An arrangement of data in which each row represents a case and each column represents a variable

39. The distribution of a variable, restricting the Who to consider only a smaller group of individuals

40. Value calculated from data to summarize aspects of the data

41. Numerically valued attribute of a model

42. Found by substituting the x-value in the regression equation; they're the values on the fitted line

43. A sampling design in which entire groups are chosen at random

44. Displays counts and, sometimes, percentages of individuals falling into named categories on two or more variables; categorizes the individuals on all variables at once, to reveal possible patterns in one variable that may be contingent on the cate

45. A numerical summary of how tightly the values are clustered around the 'center'

46. Having one mode; this is a useful term for describing the shape of a histogram when it's generally mound-shaped

47. Shows how a 'whole' divides into categories by showing a wedge of a circle whose area corresponds to the proportion in each category

48. Holds information about the same characteristic for many cases

49. When doing this, consider their shape, center, and spread

50. A numerical measure of the direction and strength of a linear association
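
Several of the definitions above (items 10, 11, 16, 33, 42, and 50) describe calculations, so a small worked sketch may help when reviewing them. The Python below is only an illustration: the data set (hours studied and test scores) is made up, and the function names are hypothetical helpers, not part of any course material. It removes units by converting values to z-scores, computes the numerical measure of linear association from those z-scores, fits the line that minimizes the sum of squared residuals, and then finds the values on the fitted line.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def standardize(values):
    """Item 11: subtract the mean and divide by the SD to eliminate units."""
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]


def correlation(xs, ys):
    """Item 50: sum of products of z-scores, divided by n - 1."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


def least_squares(xs, ys):
    """Item 33: slope b1 = r * sy / sx; item 16: b0 = ybar - b1 * xbar."""
    b1 = correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)
    b0 = mean(ys) - b1 * mean(xs)
    return b0, b1


# Made-up data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

z_scores = standardize(scores)
r = correlation(hours, scores)
b0, b1 = least_squares(hours, scores)

# Item 42: substitute each x-value into the regression equation.
predicted = [b0 + b1 * x for x in hours]
# A residual is the observed value minus the value on the fitted line.
residuals = [y - y_hat for y, y_hat in zip(scores, predicted)]

print("r =", round(r, 3))
print("y-hat =", round(b0, 2), "+", round(b1, 2), "* x")
```

After standardizing, the z-scores have a mean of 0 and a standard deviation of 1, which is what allows them to be compared against the model in item 10. The residuals from a least squares line always sum to zero (up to rounding), which is a quick check that the fit was computed correctly.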