May 15, 2022

A dataset of U.S./Euro foreign exchange rates between 01/01/2010 and 08/01/2018 was analyzed. A graphical summary and a hypothesis test were used to evaluate a business analyst’s claim that the mean exchange rate is $1.25. Two graphs best represented the data in this scenario: the scatter plot and the histogram. The graphical summary included a horizontal histogram; however, a separate vertical histogram with density curves was also used to visualize the distribution better.

## Graphs to Visualize Data

The investigation into the data began with a vertical histogram. The overlaid normal density curve shows the shape a normal distribution fitted to the data would take, while the kernel density estimate curve is a smoothed trace of the observed frequencies (Sturdivant et al., 2016). The histogram is multimodal, with several exchange rates occurring frequently. The most common bin was $1.3200, with a count of 23. This bin lies within one standard deviation of the mean, as seen in the graphical and data summary below.

**Figure 1**

*Histogram with Normal Density and Kernel Density Estimate Lines*

The second and more representative graph was a scatter plot with linear regression and loess lines. A scatter plot displays data given in pairs and helps identify relationships between them visually (Sturdivant et al., 2016). The regression line is the best-fitting straight line through the data (Sturdivant et al., 2016). As seen in Figure 2, exchange rates decreased over time overall. However, the rates fluctuate considerably, so a linear regression line is unlikely to give accurate predictions of future rates.

**Figure 2**

*Scatter Plot with Linear Regression and Loess Lines*

Loess stands for *locally estimated scatterplot smoothing*. The loess line follows the local central tendency of the data and is more flexible than a linear regression line, which makes the relationship between the data pairs easier to see (NIST/SEMATECH, n.d.). The line was useful here given the more than 100 scattered data points for the exchange rates. Figure 2 shows many fluctuations, with pronounced drops between 2011 and mid-2012 and between 2014 and 2017. Rates rose again between 2017 and 2018, then dropped again before the series ends in 2018.
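To make the idea of local smoothing concrete, the following is a minimal sketch of a loess-style fit in plain Python. It is not the procedure used to produce Figure 2: the function name `loess_smooth`, the `frac` parameter, and the sample values are hypothetical, and the sketch omits the robustness iterations that full implementations may apply. At each point it fits a weighted straight line using only the nearest neighbors, with tricube weights that shrink with distance.

```python
def loess_smooth(xs, ys, frac=0.5):
    """Local linear fit with tricube weights at each x (illustrative sketch)."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbors used in each local fit
    fitted = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the local bandwidth.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Value of the weighted least-squares line at x0.
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Made-up values for illustration only; these are NOT the exchange-rate data.
months = [float(m) for m in range(12)]
rates = [1.40, 1.38, 1.33, 1.30, 1.31, 1.35, 1.36, 1.30, 1.22, 1.15, 1.12, 1.14]
smoothed = loess_smooth(months, rates, frac=0.5)
for m, r, s in zip(months, rates, smoothed):
    print(f"month {m:4.0f}  observed {r:.2f}  smoothed {s:.4f}")
```

A smaller `frac` makes the curve follow short-term fluctuations more closely, while a larger value pulls it toward the straight regression line.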

## Graphical and Data Summary

PROC UNIVARIATE produced several statistical measures and plots that were used to assess the normality of the rates. First, the basic statistical measures for the 104 exchange rates between 2010 and 2018 give a mean of $1.2471, a standard deviation of 0.1131, a variance of 0.0128, a range of 0.3915, and an interquartile range of 0.2122. The median, or center, of the data is 1.2742, which is close to the mean. Half of the exchange rates are below this value, and half are above.

**Figure 3**

*Basic Statistical Measures*

As seen in Figure 4 below, the box plot identifies no outliers, although several extreme observations are listed in the data. These extreme values do not appear to distort the shape of the distribution. The maximum and minimum values were 1.4460 and 1.0545, respectively. The variation, or spread, of the data is most noticeable on the histogram in Figure 1 above and is quantified by the standard deviation and range. These measures describe spread only; the shape of the distribution is judged from the histogram and the normal probability plot, which suggest the data are approximately normally distributed.
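The reported measures can be cross-checked against one another with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the variable names are illustrative.

```python
# Values as reported in the text (taken from the summary output).
mean = 1.2471
std_dev = 0.1131
reported_variance = 0.0128
maximum, minimum = 1.4460, 1.0545
reported_range = 0.3915
most_common_bin = 1.3200

# Variance should equal the standard deviation squared.
computed_variance = std_dev ** 2
# Range should equal maximum minus minimum.
computed_range = maximum - minimum
# Interval covering one standard deviation on either side of the mean.
one_sd_band = (mean - std_dev, mean + std_dev)
bin_within_one_sd = one_sd_band[0] <= most_common_bin <= one_sd_band[1]

print(f"variance from std dev: {computed_variance:.4f}")
print(f"range from max - min:  {computed_range:.4f}")
print(f"one-SD band:           {one_sd_band[0]:.4f} to {one_sd_band[1]:.4f}")
print(f"most common bin inside band: {bin_within_one_sd}")
```

The computed variance and range agree with the reported values to four decimal places, and the $1.3200 bin falls inside the one-standard-deviation band.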

The horizontal histogram, box plot, and normal probability plot for this data can be seen in Figure 4 below. The normal probability plot supports approximate normality, with the plotted points grouped closely around the diagonal reference line (Elliott & Woodward, 2016). Together, these graphs provide enough information about the distribution to proceed with a hypothesis test on the mean exchange rate.

**Figure 4**

*Horizontal Histogram, Box Plot, and Normal Probability Plot*

## Hypothesis Test

A business analyst has claimed that the average U.S./Euro exchange rate is $1.25. The null hypothesis means “amounting to nothing” and is assumed to be true unless the data provide sufficient evidence against it (Sturdivant et al., 2016). For this business analyst’s claim, the null hypothesis (H₀) is “the average U.S./Euro exchange rate is $1.25,” and the alternative hypothesis (Hₐ) is “the average U.S./Euro exchange rate is not $1.25.” Because the alternative hypothesis does not specify a direction (greater than or less than), a two-tailed t-test was performed at the 0.05 significance level. Figure 5 shows the results of this test.

**Figure 5**

*The t-test with a p-value for Mean*

Computed from the summary statistics reported above (mean 1.2471, standard deviation 0.1131, n = 104), the test statistic is approximately t = -0.26 with 103 degrees of freedom, which corresponds to a two-tailed p-value of about 0.79, well above 0.05. The p < 0.05 results reported for the normality tests concern the shape of the distribution, not its mean, so they do not decide this hypothesis. We therefore fail to reject the null hypothesis that the mean is $1.25. Failing to reject does not prove the null hypothesis, but the data provide no evidence against the business analyst’s claim that the mean U.S./Euro exchange rate is $1.25.
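As a check on the test, the one-sample t statistic for H₀: mean = 1.25 against Hₐ: mean ≠ 1.25 can be recomputed from the summary statistics reported earlier. The sketch below is not the original analysis output. It assumes the reported mean, standard deviation, and sample size, and it approximates the t distribution with a standard normal distribution, which is a simplification that is reasonable at 103 degrees of freedom.

```python
import math
from statistics import NormalDist

# Summary statistics as reported in the text.
n = 104
sample_mean = 1.2471
sample_sd = 0.1131
mu0 = 1.25      # hypothesized mean under H0
alpha = 0.05    # significance level

standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu0) / standard_error
df = n - 1

# ASSUMPTION: normal approximation to the t distribution (adequate for large df).
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(t_stat)))
reject_null = p_two_tailed < alpha

print(f"standard error = {standard_error:.4f}")
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print(f"approximate two-tailed p = {p_two_tailed:.2f}")
print("reject H0" if reject_null else "fail to reject H0")
```

The sample mean sits only about a quarter of a standard error below $1.25, so the approximate p-value is far above 0.05 and the null hypothesis is not rejected.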

**References**

Elliott, A. C., & Woodward, W. A. (2016). *SAS essentials: Mastering SAS for data analytics* (2nd ed.). John Wiley & Sons, Inc. ISBN 978-1-119-04216-7

NIST/SEMATECH. (n.d.). 4.1.4.4. Loess (aka Lowess). In *e-Handbook of statistical methods*. https://www.itl.nist.gov/div898/handbook/pmd/section1/pmd144.htm

Sturdivant, R., Pardoe, I., Berrier, I., & Watts, K. (2016). *Statistics for data analytics*. zyBook [online].