May 8, 2022

For this project, I used R to analyze the *mtcars* dataset. The dataset information was called first, after which the *string_vector* code was used to select eight of the eleven variables for analysis. The *sapply* function then calculated the mean, standard deviation, and maximum value of each selected variable. The z-score of each maximum value was calculated separately using the formula **(X – mean) / standard deviation** (Sturdivant et al., 2016). A z-score expresses how many standard deviations a value lies from the mean of its variable, which helps analysts identify outliers, or unusual values far from the center of the distribution. The R output for these calculations and the z-score interpretations are included below.
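The steps described above can be sketched in R roughly as follows. This is a minimal illustration, not the exact script behind the figures: the eight variable names in `string_vector` are an assumption, since the text does not list which variables were selected, and the object names `selected`, `means`, `sds`, `maxes`, `z_max`, and `summary_table` are hypothetical.

```r
# mtcars ships with base R (datasets package), so no extra packages are needed.
data(mtcars)

# ASSUMPTION: the report does not list the eight variables it used;
# this selection is for illustration only.
string_vector <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "carb")
selected <- mtcars[, string_vector]

# Column-wise mean, standard deviation, and maximum via sapply.
means <- sapply(selected, mean)
sds   <- sapply(selected, sd)
maxes <- sapply(selected, max)

# z-score of each maximum: (X - mean) / standard deviation.
# The parentheses matter; without them only the mean is divided.
z_max <- (maxes - means) / sds

# Collect everything in one table, rounded for readability.
summary_table <- round(rbind(mean = means, sd = sds, max = maxes, z = z_max), 2)
print(summary_table)
```

Because `sapply` returns a named numeric vector for each statistic, the z-score formula can be applied to all selected variables in a single vectorized step.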

**Figure 1**

*Dataset with Calculated Mean, Standard Deviation, and Max Values*

**Figure 2**

*Z-Scores of Max Values in the* mtcars *Dataset*

## Z-Score Interpretation

The mean, standard deviation, and maximum value of each of the eight selected variables in the *mtcars* dataset are needed to calculate the z-scores of the maximum values. The z-score can help the analyst identify outliers when comparing data from unimodal and symmetric distributions (Sturdivant et al., 2016). The z-scores calculated in R show several unusual values, defined here as z-scores greater than 2 (Sturdivant et al., 2016). The unusual values are in the *mpg*, *hp*, *drat*, *wt*, *qsec*, and *carb* variables, meaning the maximum of each of these variables lies more than 2 standard deviations above its mean.
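The interpretation above amounts to comparing each z-score against a cutoff of 2. A short R sketch of that check is shown here. The variable list `vars` is again an assumed selection, because the report does not name all eight variables, and `cutoff`, `z_max`, and `unusual` are hypothetical names.

```r
data(mtcars)

# ASSUMPTION: illustrative choice of eight variables, not confirmed by the report.
vars <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "carb")

# z-score of the maximum value of each variable.
z_max <- sapply(mtcars[, vars], function(x) (max(x) - mean(x)) / sd(x))

# Flag variables whose maximum lies more than 2 standard deviations above the mean.
cutoff <- 2
unusual <- names(z_max)[z_max > cutoff]

print(round(z_max, 2))
print(unusual)
```

Under this assumed selection, the flagged variables are the same six named in the interpretation, while the maximums of *cyl* and *disp* stay below the cutoff.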

**Reference**

Sturdivant, R., Pardoe, I., Berrier, I., & Watts, K. (2016). *Statistics for Data Analytics*. zyBook [online].