R Programming Tutoring and Solutions: Day 2 Lab Activities - MAT 500: Linear Regression and

2022-11-17 17:45 Author: 拓端tecdat

Full text link: http://tecdat.cn/?p=30396

Directions: Complete the following exercises using the code discussed during computer lab. Save your work in an R script as well as a Word document containing the necessary output and comments. Be sure to use notes in the script to justify any computations. If you have any questions, do not hesitate to ask.

1 Simple Linear Regression

  1. Load the data set pressure from the datasets package in R. Perform a Simple Linear Regression on the two variables. Provide the regression equation, coefficients table, and anova table. Summarize your findings. What is the relationship between the t statistic for temperature and the F statistic in the ANOVA table?


  2. Refer to the previous exercise. Check the assumptions on the regression model and report your results. Be sure to include the scatterplot with regression equation, normal QQ plot, and residual plot. Explain what you see.

  3. Refer to exercise 1. Experiment with different transformations of the data to improve the model. What is the best transformation?

2 Multiple Linear Regression

  1. Load the swiss data set from the ‘datasets’ package in R. Find the correlation matrix and print the pairwise scatterplots. What variables seem to be related?

  2. Run a Multiple Regression on Fertility using all of the other variables as predictors. Print the model and coefficients table. Explain the meaning of the significant coefficients.

  3. Check the assumptions using the diagnostic tests mentioned in this section. Discuss your findings.

  4. Run a stepwise selection method to reduce the dimension of the model using the backward direction. Print the new model and new coefficients table. Check the assumptions and discuss any changes.

  5. Use Mallows' Cp to determine the best model. Does your choice match the model in the previous exercise?

3 Principal Component Analysis

  1. Load the longley data set from the R datasets package. This data set was used to predict a country's GNP based on several variables. Find the correlation matrix of the explanatory variables.

  2. Refer to the previous exercise. Perform a principal component analysis on the explanatory variables using the correlation matrix. Use a scree plot to determine the optimal number of components and report them. Try to explain the meaning behind each component.

  3. Refer to the previous exercise. What proportion of variation does each component explain? What is the total cumulative variance explained by the optimal number of components?

Day 2 Lab Activities - Solutions

Simple Linear Regression

1.

    > pressure.lm <- lm(pressure ~ temperature, data = pressure)
    > summary(pressure.lm)

    Call:
    lm(formula = pressure ~ temperature, data = pressure)

    Residuals:
        Min      1Q  Median      3Q     Max
    -158.08 -117.06  -32.84   72.30  409.43

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) -147.8989    66.5529  -2.222 0.040124 *
    temperature    1.5124     0.3158   4.788 0.000171 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 150.8 on 17 degrees of freedom
    Multiple R-squared:  0.5742,    Adjusted R-squared:  0.5492
    F-statistic: 22.93 on 1 and 17 DF,  p-value: 0.000171

    > anova(pressure.lm)
    Analysis of Variance Table

    Response: pressure
                Df Sum Sq Mean Sq F value   Pr(>F)
    temperature  1 521530  521530   22.93 0.000171 ***
    Residuals   17 386665   22745
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The fitted regression equation is pressure = -147.8989 + 1.5124 × temperature. The temperature coefficient is positive, so the estimated relationship is direct: pressure increases with temperature. Since the p-value (0.000171) is less than 0.05, temperature is significant in the model. The t statistic for temperature and the F statistic in the ANOVA table are related by t^2 = F; here 4.788^2 ≈ 22.93.
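The identity t^2 = F can be checked numerically. The following is a minimal sketch that refits the model from exercise 1; the object names `coef_table`, `t_temp`, and `f_stat` are illustrative and are not part of the lab.

```r
# Refit the simple linear regression from exercise 1.
pressure.lm <- lm(pressure ~ temperature, data = pressure)

# Coefficients table: estimate, standard error, t value, p-value.
coef_table <- coef(summary(pressure.lm))

# t statistic for the slope and F statistic from the ANOVA table.
t_temp <- coef_table["temperature", "t value"]
f_stat <- anova(pressure.lm)["temperature", "F value"]

# With a single predictor, the squared t statistic equals the F statistic.
all.equal(t_temp^2, f_stat)
```

The equality holds only when the model has one predictor; with several predictors the overall F statistic tests all slopes jointly.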


2. Linearity: The scatterplot shows a clear violation of the linearity assumption. The data appear to increase exponentially. The standardized residual plot reinforces this observation.

Equal Variance: The lack of a linear relationship makes it difficult to assess whether the variance of the observations is constant.

Normality: The normal Q-Q plot departs from a straight line in the tails of the data set. A Shapiro-Wilk test confirms, at the 0.05 level, that the standardized residuals do not follow a normal distribution.

Shapiro-Wilk normality test, data: rstandard(pressure.lm), W = 0.8832, **p-value = 0.02438**
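The checks described above can be reproduced along the following lines. This is a sketch, not necessarily the code used in the lab: the plot title and the names `std_res` and `sw` are illustrative, and the `pdf(NULL)` line is only there so the script also runs without a graphics window.

```r
pressure.lm <- lm(pressure ~ temperature, data = pressure)

# When run non-interactively, draw to a null device instead of a file.
if (!interactive()) pdf(NULL)

# Scatterplot with the fitted regression line.
plot(pressure ~ temperature, data = pressure,
     main = "pressure = -147.90 + 1.51 * temperature")
abline(pressure.lm)

# Standardized residuals against fitted values.
std_res <- rstandard(pressure.lm)
plot(fitted(pressure.lm), std_res,
     xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = 0, lty = 2)

# Normal Q-Q plot of the standardized residuals.
qqnorm(std_res)
qqline(std_res)

# Shapiro-Wilk test of normality on the standardized residuals.
sw <- shapiro.test(std_res)
sw
```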


3. Using a Box-Cox transformation, the optimal transformation is either [formula not preserved in the source] or [formula not preserved in the source], where λ = 0.01.
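A Box-Cox search of this kind can be sketched with `boxcox()` from the MASS package, which ships with standard R installations. The grid of λ values and the names `bc`, `lambda_hat`, and `pressure.log.lm` are assumptions made for illustration; the lab's own settings are not shown in the source.

```r
library(MASS)  # provides boxcox()

pressure.lm <- lm(pressure ~ temperature, data = pressure)

# Profile log-likelihood of the Box-Cox parameter over a grid of lambda values.
# The Box-Cox family is (y^lambda - 1) / lambda for lambda != 0 and log(y)
# for lambda = 0.
bc <- boxcox(pressure.lm, lambda = seq(-1, 1, by = 0.01), plotit = FALSE)

# Grid value that maximizes the profile log-likelihood.
lambda_hat <- bc$x[which.max(bc$y)]
lambda_hat

# A lambda close to 0 suggests a log transformation of the response.
pressure.log.lm <- lm(log(pressure) ~ temperature, data = pressure)
summary(pressure.log.lm)$r.squared
```

Comparing the R-squared of the transformed fit with the 0.5742 reported for the untransformed model gives a quick sense of how much the transformation helps, although the diagnostic plots should be rechecked as well.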


Multiple Linear Regression

1.

                     Fertility Agriculture Examination Education Catholic Infant.Mortality
    Fertility            1.000       0.353      -0.646    -0.664    0.464            0.417
    Agriculture          0.353       1.000      -0.687    -0.640    0.401           -0.061
    Examination         -0.646      -0.687       1.000     0.698   -0.573           -0.114
    Education           -0.664      -0.640       0.698     1.000   -0.154           -0.099
    Catholic             0.464       0.401      -0.573    -0.154    1.000            0.175
    Infant.Mortality     0.417      -0.061      -0.114    -0.099    0.175            1.000

*Related Variables:* Fertility and Agriculture; Fertility and Examination; Fertility and Infant Mortality; Agriculture and Examination; Agriculture and Education; Examination and Education
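The correlation matrix and pairwise scatterplots can be produced as sketched below. The 0.6 cutoff used to list strongly correlated pairs is an illustrative choice, not a value from the lab, and the names `swiss.cor`, `cutoff`, and `strong_pairs` are likewise hypothetical.

```r
# Correlation matrix of all six swiss variables, rounded as in the lab output.
swiss.cor <- round(cor(swiss), 3)
swiss.cor

# Pairwise scatterplots (drawn to a null device when run non-interactively).
if (!interactive()) pdf(NULL)
pairs(swiss)

# List variable pairs whose absolute correlation exceeds the cutoff.
# Only the upper triangle is scanned so each pair appears once.
cutoff <- 0.6
idx <- which(abs(swiss.cor) > cutoff & upper.tri(swiss.cor), arr.ind = TRUE)
strong_pairs <- data.frame(
  var1 = rownames(swiss.cor)[idx[, "row"]],
  var2 = colnames(swiss.cor)[idx[, "col"]],
  r    = swiss.cor[idx]
)
strong_pairs
```

A fixed cutoff is only a screening device; the scatterplots remain the better guide to whether a relationship is linear.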


2.

    Call:
    lm(formula = Fertility ~ Agriculture + Examination + Education +
        Catholic + Infant.Mortality, data = swiss)

    Residuals:
         Min       1Q   Median       3Q      Max
    -15.2743  -5.2617   0.5032   4.1198  15.3213

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)      66.91518   10.70604   6.250 1.91e-07 ***
    Agriculture      -0.17211    0.07030  -2.448  0.01873 *
    Examination      -0.25801    0.25388  -1.016  0.31546
    Education        -0.87094    0.18303  -4.758 2.43e-05 ***
    Catholic          0.10412    0.03526   2.953  0.00519 **
    Infant.Mortality  1.07705    0.38172   2.822  0.00734 **
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 7.165 on 41 degrees of freedom
    Multiple R-squared:  0.7067,    Adjusted R-squared:  0.671
    F-statistic: 19.76 on 5 and 41 DF,  p-value: 5.594e-10

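All of the predictors are significant at the 0.05 level except Examination. Holding the other predictors fixed, a one-unit increase in Education or Agriculture is associated with an estimated decrease in Fertility of 0.87 or 0.17, respectively, while a one-unit increase in Catholic or Infant.Mortality is associated with an estimated increase of 0.10 or 1.08.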
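The significant predictors can also be picked out programmatically. This is a small sketch using the same formula as the output above; `coef_table`, `p_values`, and `significant` are illustrative names.

```r
swiss.lm <- lm(Fertility ~ Agriculture + Examination + Education +
                 Catholic + Infant.Mortality, data = swiss)
coef_table <- coef(summary(swiss.lm))

# p-values of the predictors, excluding the intercept row.
p_values <- coef_table[-1, "Pr(>|t|)"]

# Predictors significant at the 0.05 level.
significant <- names(p_values)[p_values < 0.05]
significant

# 95% confidence intervals for all coefficients.
confint(swiss.lm)
```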

3. Residual Plot: There is a random pattern in the residual plot, which raises no concern about the model fit.

Normal Q-Q Plot: The points follow the diagonal line closely, indicating that the residuals probably satisfy the normality assumption.

Scale-Location: The points are randomly scattered, which indicates that the homoscedasticity assumption is probably met.


4.

    > swiss.step.b <- step(swiss.lm, direction = 'backward')
    Start:  AIC=190.69
    Fertility ~ Agriculture + Examination + Education + Catholic +
        Infant.Mortality

                       Df Sum of Sq    RSS    AIC
    - Examination       1     53.03 2158.1 189.86
    <none>                          2105.0 190.69
    - Agriculture       1    307.72 2412.8 195.10
    - Infant.Mortality  1    408.75 2513.8 197.03
    - Catholic          1    447.71 2552.8 197.75
    - Education         1   1162.56 3267.6 209.36

    Step:  AIC=189.86
    Fertility ~ Agriculture + Education + Catholic + Infant.Mortality

                       Df Sum of Sq    RSS    AIC
    <none>                          2158.1 189.86
    - Agriculture       1    264.18 2422.2 193.29
    - Infant.Mortality  1    409.81 2567.9 196.03
    - Catholic          1    956.57 3114.6 205.10
    - Education         1   2249.97 4408.0 221.43

    Call:
    lm(formula = Fertility ~ Agriculture + Education + Catholic +
        Infant.Mortality, data = swiss)

    Residuals:
         Min       1Q   Median       3Q      Max
    -14.6765  -6.0522   0.7514   3.1664  16.1422

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)      62.10131    9.60489   6.466 8.49e-08 ***
    Agriculture      -0.15462    0.06819  -2.267  0.02857 *
    Education        -0.98026    0.14814  -6.617 5.14e-08 ***
    Catholic          0.12467    0.02889   4.315 9.50e-05 ***
    Infant.Mortality  1.07844    0.38187   2.824  0.00722 **
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 7.168 on 42 degrees of freedom
    Multiple R-squared:  0.6993,    Adjusted R-squared:  0.6707
    F-statistic: 24.42 on 4 and 42 DF,  p-value: 1.717e-10

The new model drops the Examination variable. All of the remaining predictors are now significant at the 0.05 level.
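The backward selection can be rerun quietly and the two models compared with a partial F test, as sketched here. The `trace = 0` argument and the `anova()` comparison are additions for illustration and are not part of the lab output.

```r
swiss.lm <- lm(Fertility ~ Agriculture + Examination + Education +
                 Catholic + Infant.Mortality, data = swiss)

# Backward elimination by AIC; trace = 0 suppresses the step-by-step log.
swiss.step.b <- step(swiss.lm, direction = "backward", trace = 0)
formula(swiss.step.b)

# Partial F test of the reduced model against the full model.
# A large p-value means dropping the removed term loses little fit.
model_comparison <- anova(swiss.step.b, swiss.lm)
model_comparison
```

Because only one term is removed, the p-value of this F test matches the p-value of the Examination t test in the full model.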

Residual Plot: There is a random pattern in the residual plot, which raises no concern about the model fit.

Normal Q-Q Plot: The points no longer follow the diagonal line closely, so the normality assumption may now be violated.

Scale-Location: The points are randomly scattered, which indicates that the homoscedasticity assumption is probably met.


5. The two models that best satisfy Mallows' Cp criterion are the model with all 5 variables and the model with the 4 variables Agriculture, Education, Catholic, and Infant.Mortality. We prefer the simpler model, so the best choice is the model with four explanatory variables. This is the same model that backward selection identified.
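Mallows' Cp can be computed for every subset of predictors with base R alone. The sketch below uses Cp = RSS_p / s^2 - n + 2p, where s^2 is the residual mean square of the full model and p counts the coefficients including the intercept. The lab may have used a dedicated package instead; all object names here are illustrative.

```r
predictors <- setdiff(names(swiss), "Fertility")
n <- nrow(swiss)

# Error variance estimate from the full model.
full <- lm(Fertility ~ ., data = swiss)
s2 <- sum(resid(full)^2) / df.residual(full)

# Enumerate every non-empty subset of the predictors.
subsets <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# Mallows' Cp for each subset model.
cp <- sapply(subsets, function(vars) {
  fit <- lm(reformulate(vars, response = "Fertility"), data = swiss)
  p <- length(coef(fit))
  sum(resid(fit)^2) / s2 - n + 2 * p
})

# Models ordered by Cp, smallest first.
cp_table <- data.frame(
  model = sapply(subsets, paste, collapse = " + "),
  size  = lengths(subsets),
  Cp    = round(cp, 2)
)
head(cp_table[order(cp_table$Cp), ], 5)

# Predictors in the model with the smallest Cp.
best_vars <- subsets[[which.min(cp)]]
best_vars
```

By construction the full model always has Cp equal to its number of coefficients, so it is the smaller models with Cp close to p that carry the information.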


Principal Component Analysis

1.

                 Unemployed Armed.Forces Population  Year Employed
    Unemployed        1.000       -0.177      0.687 0.668    0.502
    Armed.Forces     -0.177        1.000      0.364 0.417    0.457
    Population        0.687        0.364      1.000 0.994    0.960
    Year              0.668        0.417      0.994 1.000    0.971
    Employed          0.502        0.457      0.960 0.971    1.000

2.

                 Comp.1  Comp.2
    Unemployed   0.3633  0.5988
    Armed.Forces 0.2269 -0.7911
    Population   0.5261  0.0435
    Year         0.5291 -0.0024
    Employed     0.5097 -0.1171

The first component is a standardized measure of GNP and the second component is difficult to interpret.


3. Component 1: **71.23%** of variance explained

Component 2: **23.67%** of variance explained

Cumulative variance: **94.89%**
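The loadings and variance proportions can be reproduced with `princomp()` on the five explanatory variables shown in the correlation matrix. This is a sketch under the assumption that the lab used `princomp` with `cor = TRUE`; the names `vars`, `longley.pca`, `prop_var`, and `cum_var` are illustrative.

```r
vars <- c("Unemployed", "Armed.Forces", "Population", "Year", "Employed")

# PCA on the correlation matrix of the explanatory variables.
longley.pca <- princomp(longley[, vars], cor = TRUE)

# Loadings of the first two components. Signs may be flipped relative to
# the lab output, since the sign of a principal component is arbitrary.
round(unclass(loadings(longley.pca))[, 1:2], 4)

# Scree plot (drawn to a null device when run non-interactively).
if (!interactive()) pdf(NULL)
screeplot(longley.pca, type = "lines")

# Proportion and cumulative proportion of variance explained, in percent.
eigenvalues <- longley.pca$sdev^2
prop_var <- eigenvalues / sum(eigenvalues)
cum_var <- cumsum(prop_var)
round(100 * rbind(prop_var, cum_var), 2)
```

With `cor = TRUE` the eigenvalues sum to the number of variables, so each proportion is simply the eigenvalue divided by five.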


