You’ll work with frequency tables in order to see the range of values in your data. You’ll use graphical tools like histograms, stem and leaf plots, and boxplots to get a visual picture of how the data values are distributed. You’ll learn about descriptive statistics that reduce the contents of your data to a few values, such as the mean and standard deviation. Applying these tools is the first step in the process of evaluating and interpreting the contents of your data set.
Variables and Descriptive Statistics
In this chapter you’ll learn about a branch of statistics called descriptive statistics. In descriptive statistics we use various mathematical tools to summarize the values of a data set. Our goal is to take data that may contain thousands of observations and reduce it to a few calculated values. For example, we might calculate the average salaries of employees at several companies in order to get a general impression about which companies pay the most, or we might calculate the range of salaries at those companies to convey the same idea.
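To make the salary example concrete outside of Excel, here is a minimal Python sketch that reduces each company's salaries to two summary values, the mean and the range. The company names and salary figures are invented for illustration and do not come from any data set used in this chapter.

```python
from statistics import mean

# Hypothetical salary figures (in dollars) for two made-up companies.
salaries = {
    "Company A": [42000, 48000, 51000, 60000, 99000],
    "Company B": [55000, 56000, 58000, 61000, 65000],
}


def summarize(values):
    """Return the mean and the range (maximum minus minimum) of a list of numbers."""
    return {"mean": mean(values), "range": max(values) - min(values)}


# Reduce each company's list of salaries to two calculated values.
summary = {name: summarize(values) for name, values in salaries.items()}

for name, stats in summary.items():
    print(f"{name}: mean = {stats['mean']:.0f}, range = {stats['range']}")
```

In this made-up data the two companies have nearly the same mean salary but very different ranges, which suggests why it can be worth looking at more than one summary value.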
To create a frequency table of home prices:
1. Click Descriptive Statistics from the StatPlus menu on the Add-Ins tab and then click Frequency Tables.
2. Click the Data Values button, click the Use Range Names option button, and click Price. Click the OK button. The Frequency Table command gives you three options for organizing your table. You can use discrete values so that the table is tabulated over individual price values, or you can organize the values into bins (you’ll learn about bins shortly). For now, leave Discrete as the selected option.
3. Click the Output button, click the New Worksheet option button, and type Price Table in the New Worksheet name box. Click the OK button.
4. Click OK to start generating the frequency table. Figure 4-2 displays the completed table.
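If you want to see what a table tabulated over discrete values involves, the following Python sketch builds one by hand. It is only an approximation of the idea, not a description of how StatPlus works internally. The price values are hypothetical stand-ins for the Price range, and the choice of columns (frequency, cumulative frequency, percent, cumulative percent) is an assumption about what a typical frequency table reports.

```python
from collections import Counter

# Hypothetical home prices; the real values live in the workbook's Price range.
prices = [95000, 120000, 95000, 150000, 120000, 95000, 210000]


def frequency_table(values):
    """Tabulate each distinct value with its count and running totals."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    # Walk the distinct values in ascending order so the running totals make sense.
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append({
            "value": value,
            "freq": counts[value],
            "cum_freq": cumulative,
            "pct": 100 * counts[value] / total,
            "cum_pct": 100 * cumulative / total,
        })
    return rows


table = frequency_table(prices)
for row in table:
    print(f"{row['value']:>8} {row['freq']:>4} {row['cum_freq']:>4} "
          f"{row['pct']:6.1f} {row['cum_pct']:6.1f}")
```

Because every distinct price gets its own row, a table like this can grow very long when the data contain many unique values, which is the situation that bins are meant to address.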
A distribution is skewed if most of the values are clustered toward either the left or the right edge of the histogram. If the values are clustered toward the left edge of the histogram, with a long tail stretching to the right, this shows positive skewness; clustering toward the right edge of the histogram, with a long tail stretching to the left, shows negative skewness. Skewed distributions often occur where the variable is constrained to have positive values. I