Since mathematical models (regression models) are often used to predict the relationship between paired data elements, it is important to understand how to choose a model that will be a "good fit" for the particular data set. There are several things to keep in mind when attempting to develop a model that will be a "good fit":
Linear based regression models:
Other regressions:
2. Calculate a correlation coefficient, r (for some models). The correlation coefficient measures the strength and the direction of a linear relationship between two variables. A value of |r| near 1 may indicate a "good fit".
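The calculation in step 2 can be sketched in a few lines of Python. This is a minimal illustration, not the method any particular calculator uses: the function name `pearson_r` and the sample data are assumptions invented for this example.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data, invented for illustration only.
x_data = [1, 2, 3, 4, 5]
y_data = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x_data, y_data)
print(f"r = {r:.3f}")  # |r| close to 1 suggests a strong linear relationship
```

The sign of r gives the direction of the relationship (positive for increasing, negative for decreasing), and |r| gives its strength.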
3. Calculate a coefficient of determination, r² (R²). The coefficient of determination represents the proportion of the total variation in y that is explained by the regression model. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained. Do not place too much importance on small differences between r² values, such as r² = 0.987 and r² = 0.984. Also, keep in mind that r, r² and R² values cannot always be compared directly between different types of regression models.
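As a rough sketch of step 3, the coefficient of determination can be computed as 1 minus the ratio of unexplained variation to total variation. The helper names `linear_fit` and `r_squared` and the sample data are assumptions made up for this illustration; the least-squares line is used here only as one example of a model.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, predict):
    """Return 1 - (unexplained variation) / (total variation) for a model."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical paired data, invented for illustration only.
x_data = [1, 2, 3, 4, 5]
y_data = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = linear_fit(x_data, y_data)
r2 = r_squared(x_data, y_data, lambda x: slope * x + intercept)
print(f"r^2 = {r2:.4f}")

# The example from the text: squaring r = 0.922 gives about 0.850.
example_r2 = round(0.922 ** 2, 3)
print(example_r2)
```

For a least-squares line, this value equals the square of the correlation coefficient r, which is why the text can move directly from r = 0.922 to r² = 0.850.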
4. Examine the residuals. Examine the scatter plot of the residuals. Each residual is the signed distance between an actual data value and the output predicted by the model. A good linear model has residuals that are near zero and scattered randomly, with no visible pattern.
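The residual check in step 4 can be illustrated with a small sketch. Here a straight line is fitted to data that actually follow a curve, so the residuals show a pattern instead of random scatter. The function names and the data set are assumptions chosen for this example, and each residual is taken as actual minus predicted.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(xs, ys, predict):
    """Return the signed differences: actual value minus predicted value."""
    return [y - predict(x) for x, y in zip(xs, ys)]

# Hypothetical data that follow y = x^2, invented for illustration only.
x_data = [1, 2, 3, 4, 5]
y_data = [1, 4, 9, 16, 25]

slope, intercept = linear_fit(x_data, y_data)
res = residuals(x_data, y_data, lambda x: slope * x + intercept)

for x, e in zip(x_data, res):
    print(f"x = {x}: residual = {e:+.1f}")
```

The residuals run positive, negative, then positive again as x increases. A curved pattern like this suggests that a linear model is not a good fit for these data, even though the residuals sum to zero.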
5. Think about your answer. Is your choice realistic? Don't use a model that will lead to predicted values that are totally unrealistic.