Determining the Correlation Coefficient and the Coefficient of Determination

Determine the correlation coefficient and the coefficient of determination for the linear regression equation that you found in Example B. Is the linear regression equation a good fit for the data?

The correlation coefficient and the coefficient of determination for the linear regression equation are found the same way that the linear regression equation is found. In other words, after entering the data into your calculator, press STAT, go to the CALC menu, and choose LinReg(ax + b). After pressing ENTER to choose LinReg(ax + b), press ENTER again, and you should see the regression results screen.

On that screen, r, the correlation coefficient, is equal to 0.9486321738, while r², the coefficient of determination, is equal to 0.8999030012. This means that the linear regression equation is a moderately good fit, but not a great fit, for the data.

Which of the following calculations will create the line of best fit on the TI-83?

The linear regression below was performed on a data set with a TI calculator. Use the information shown on the screen to answer the following questions:

1. What is the linear regression equation?
2. What are the correlation coefficient and the coefficient of determination? Is the linear regression equation a good fit for the data?
3. According to the linear regression equation, what would be the approximate value of y when x = 3?

In some data sets, there are values (observed data points) called outliers. Outliers are observed data points that are far from the least squares line. They have large “errors”, where the “error” or residual is the vertical distance from the line to the point. Sometimes, for some reason or another, they should not be included in the analysis of the data. It is possible that an outlier is a result of erroneous data. Other times, an outlier may hold valuable information about the population under study and should remain included in the data. The key is to examine carefully what causes a data point to be an outlier.

Besides outliers, a sample may contain one or a few points that are called influential points. Influential points are observed data points that are far from the other observed data points in the horizontal direction. These points may have a big effect on the slope of the regression line. To begin to identify an influential point, you can remove it from the data set and see if the slope of the regression line is changed significantly.

Computers and many calculators can be used to identify outliers from the data. Computer output for regression analysis will often identify both outliers and influential points so that you can examine them.

We could guess at outliers by looking at a graph of the scatterplot and best-fit line. However, we would like some guideline as to how far away a point needs to be in order to be considered an outlier. As a rough rule of thumb, we can flag any point that is located farther than two standard deviations above or below the best-fit line as an outlier. The standard deviation used is the standard deviation of the residuals or errors.

We can do this visually in the scatter plot by drawing an extra pair of lines that are two standard deviations above and below the best-fit line. Any data points that are outside this extra pair of lines are flagged as potential outliers. Or we can do this numerically by calculating each residual and comparing it to twice the standard deviation. On the TI-83, 83+, or 84+, the graphical approach is easier. The graphical procedure is shown first, followed by the numerical calculations. You would generally need to use only one of these methods.

How does the outlier affect the best-fit line?
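For readers working away from the calculator, the numerical approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data set is made up (it is not the data behind the calculator screens), the function names `linear_fit` and `flag_outliers` are hypothetical, and the standard deviation of the residuals is taken to be s = sqrt(SSE / (n - 2)), a common textbook convention that the text does not spell out. The sketch computes the same a, b, r, and r² that LinReg(ax + b) reports, then flags any point whose residual is more than 2s away from the line.

```python
import math


def linear_fit(xs, ys):
    """Return (a, b, r, r_squared) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                    # slope
    b = mean_y - a * mean_x          # y-intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return a, b, r, r * r


def flag_outliers(xs, ys):
    """Return the points whose residual is more than 2s from the fitted line."""
    a, b, _, _ = linear_fit(xs, ys)
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in residuals)
    # Assumption: standard deviation of the residuals is sqrt(SSE / (n - 2)).
    s = math.sqrt(sse / (len(xs) - 2))
    return [(x, y) for x, y, e in zip(xs, ys, residuals) if abs(e) > 2 * s]


# Hypothetical data: y = 2x + 1 everywhere except one unusual point at x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [3, 5, 7, 9, 21, 13, 15, 17, 19, 21]

a, b, r, r_squared = linear_fit(xs, ys)
flagged = flag_outliers(xs, ys)

# Refit without the flagged points to see how much the slope changes.
kept = [(x, y) for x, y in zip(xs, ys) if (x, y) not in flagged]
a_kept, b_kept, _, _ = linear_fit([x for x, _ in kept], [y for _, y in kept])

print(f"fit: y = {a:.4f}x + {b:.4f}, r = {r:.4f}, r^2 = {r_squared:.4f}")
print("flagged:", flagged)
print(f"slope without flagged points: {a_kept:.4f}")
```

The refit at the end is the same remove-and-compare check the text suggests for influential points, applied here to the flagged point. A large change in slope after removal suggests the point deserves a closer look before deciding whether to keep it.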