Week 7 Assignment – 100 points

Data Science And Big Data Analytics

Given the yearly sales in the yearly_sales.csv file, complete the following:

Show all the descriptive statistics of sales_total, including its standard deviation and variance.

Compute the correlation of number_of_order to sales_total.

Plot the scatter graph of number_of_order to sales_total.

Perform linear regression of number_of_order to sales_total.

Draw the line of best fit (abline) over your graph.

Perform a t-test as shown below and state your conclusion.

Perform an ANOVA test as shown below and state your conclusion.
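The first five tasks can be sketched in R roughly as follows. This is a minimal illustration, not a model answer: the data frame below is made up so the snippet runs on its own, the column names are taken from the task wording and may differ in the real file, and the name myData is borrowed from the ANOVA command later in this handout.

```r
# Illustrative data only: a small made-up data frame standing in for
# yearly_sales.csv. Column names follow the assignment text (assumption).
set.seed(1)
number_of_order <- sample(1:20, 50, replace = TRUE)
myData <- data.frame(
  number_of_order = number_of_order,
  sales_total = 100 + 12 * number_of_order + rnorm(50, sd = 15)
)
# With the real file, something like: myData <- read.csv("yearly_sales.csv")

# Descriptive statistics, standard deviation, and variance
print(summary(myData$sales_total))
sales_sd  <- sd(myData$sales_total)
sales_var <- var(myData$sales_total)
cat("sd:", sales_sd, " variance:", sales_var, "\n")

# Pearson correlation between the two columns
r <- cor(myData$number_of_order, myData$sales_total)
cat("correlation:", r, "\n")

# Linear regression of sales_total on number_of_order
fit <- lm(sales_total ~ number_of_order, data = myData)
print(summary(fit))

# Scatter plot with the line of best fit drawn over it
plot(myData$number_of_order, myData$sales_total,
     xlab = "number_of_order", ylab = "sales_total")
abline(fit, col = "red")
```

Note that abline(fit) reads the intercept and slope directly from the fitted lm object, so it must be called after plot() has opened the graph.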

T test

This tests the mean of a single group; here the variable is sales_total.

t.test(sales_total, mu = 249) # R command for t test

H0: mu = 249 # null hypothesis

H1: mu ≠ 249 # alternative hypothesis

Significance level = 0.05 (implies 95% confidence level)

Reject H0 if p-value is <= 0.05

Do not reject H0 if p-value is > 0.05
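The usual decision rule, rejecting H0 when the p-value is at or below the significance level, can be sketched as below. The sales_total values here are made up so the snippet runs on its own, and alpha, result, p_value, and decision are illustrative names, not part of the assignment.

```r
# Made-up values standing in for the real sales_total column (assumption)
set.seed(2)
sales_total <- rnorm(40, mean = 260, sd = 30)

alpha <- 0.05                            # significance level
result <- t.test(sales_total, mu = 249)  # one-sample, two-sided t-test
p_value <- result$p.value

# Compare the p-value with alpha to reach a conclusion
if (p_value <= alpha) {
  decision <- "Reject H0"
} else {
  decision <- "Do not reject H0"
}
cat("p-value:", p_value, "->", decision, "\n")
```

Printing result itself also shows the t statistic, degrees of freedom, and the 95% confidence interval for the mean, which can support the written conclusion.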

ANOVA test

ANOVA is used to test the equality of means across two or more groups; here the groups are Male and Female.

anova(lm(data = myData, sales_total ~ factor(gender))) # R command for ANOVA test.

H0: There is no significant difference between Male and Female sales_total.

H1: There is a significant difference between Male and Female sales_total.

Significance level = 0.05 (implies 95% confidence level)
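A minimal sketch of reading the ANOVA p-value and applying the same 0.05 threshold follows, taking the null hypothesis to be that the group means are equal. The data frame is made up for illustration, the gender codes "F" and "M" are an assumption about how the real file labels the groups, and alpha, tab, p_value, and decision are illustrative names.

```r
# Made-up data standing in for yearly_sales.csv (assumption)
set.seed(3)
myData <- data.frame(
  gender = rep(c("F", "M"), each = 25),
  sales_total = c(rnorm(25, mean = 250, sd = 30),
                  rnorm(25, mean = 255, sd = 30))
)

alpha <- 0.05  # significance level

# One-way ANOVA table for sales_total by gender
tab <- anova(lm(sales_total ~ factor(gender), data = myData))
print(tab)

# The first row of the "Pr(>F)" column holds the p-value for gender
p_value <- tab[["Pr(>F)"]][1]
decision <- if (p_value <= alpha) "Reject H0" else "Do not reject H0"
cat("p-value:", p_value, "->", decision, "\n")
```

With only two groups, this F test gives the same p-value as a two-sample t-test that assumes equal variances.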

