ISYE 6414 Units 1-3 Review

| By Omsaben

Community Contributor

Quizzes Created: 1 | Total Attempts: 220

If the constant variance assumption in ANOVA does not hold, the inference on the equality of the means will not be reliable.
- True
- False

2.

The estimated regression coefficients are unbiased estimators.
- True
- False
Correct Answer

A. True

Explanation

The estimated regression coefficients being unbiased estimators means that, on average, they provide accurate estimates of the true population regression coefficients. In other words, there is no systematic tendency for the estimated coefficients to consistently overestimate or underestimate the true coefficients. This is an important property in regression analysis, as it allows us to make reliable inferences about the relationships between variables in the population based on our sample data.

3.

Analysis of variance (ANOVA) is a multiple regression model.
- True
- False
Correct Answer

A. True

Explanation

ANOVA can be written as a multiple regression model. The qualitative grouping variable is coded as a set of dummy (indicator) variables, and regressing the response on those dummy variables reproduces the ANOVA group means and the F-test for equal means. ANOVA is therefore a special case of multiple linear regression in which all predictors are qualitative, so the statement is true.

4.

In multiple linear regression, we study the relationship between one response variable and both predicting quantitative and qualitative variables.
- True
- False
Correct Answer

A. True

Explanation

This statement is true because in multiple linear regression, we can have one response variable (the variable we are trying to predict or explain) and multiple predictor variables, which may be quantitative, qualitative, or both. The goal is to study how the response variable relates to the predictor variables.

5.

Assuming that the data are normally distributed, under the simple linear model, the estimated variance has the following sampling distribution:
- Chi-square with n-2 degrees of freedom
- T-distribution with n-2 degrees of freedom
- Chi-square with n degrees of freedom
- T-distribution with n degrees of freedom
Correct Answer

A. Chi-square with n-2 degrees of freedom

Explanation

In the simple linear model, the estimated variance is the sum of squared residuals divided by n-2. Under normality, (n-2) times the estimated variance divided by the true variance follows a chi-square distribution with n-2 degrees of freedom. Two degrees of freedom are lost because the residuals depend on two estimated parameters, the intercept β0 and the slope β1.
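
Written out, the result reads as follows. This is a sketch in standard textbook notation; the symbols for the estimated coefficients and the estimated variance are introduced here and do not come from the quiz itself.

```latex
\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)^2,
\qquad
\frac{(n-2)\,\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2}.
```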

6.

The estimators of the linear regression model are derived by:
- Minimizing the sum of squared differences between observed and expected values of the response variable.
- Maximizing the sum of squared differences between observed and expected values of the response variable.
- Minimizing the sum of absolute differences between observed and expected values of the response variable.
- Maximizing the sum of absolute differences between observed and expected values of the response variable.
Correct Answer

A. Minimizing the sum of squared differences between observed and expected values of the response variable.

Explanation

The estimators of the linear regression model are derived by minimizing the sum of squared differences between observed and expected values of the response variable. This is because the goal of linear regression is to find the line that best fits the data, and the sum of squared differences is a measure of how well the line fits the data. By minimizing this sum, we are finding the line that minimizes the overall error between the observed and expected values, resulting in the best fit line.
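
As a minimal sketch of this idea for simple linear regression, the closed-form least squares estimates can be computed directly and compared against a slightly perturbed line. The function names and the small data set below are made up for illustration.

```python
def fit_simple_ols(x, y):
    """Closed-form least squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_squared_errors(x, y, b0, b1):
    """Sum of squared differences between observed and fitted responses."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Illustrative data (hypothetical, not from the quiz).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
sse_fitted = sum_squared_errors(x, y, b0, b1)
# Any other line, such as one with a shifted intercept, has a larger SSE.
sse_shifted = sum_squared_errors(x, y, b0 + 0.1, b1)
```

Comparing `sse_fitted` with `sse_shifted` shows that moving away from the least squares line increases the sum of squared errors.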

7.

We can assess the assumption of constant-variance in linear regression by plotting the residuals against fitted values.
- True
- False
Correct Answer

A. True

Explanation

In linear regression, it is important to assess the assumption of constant variance, also known as homoscedasticity. This assumption states that the variability of the residuals should be constant across all levels of the predictor variables. By plotting the residuals (the differences between the observed and predicted values) against the fitted values (the predicted values), we can visually examine if the spread of the residuals is consistent across the range of predicted values. If the plot shows a consistent spread with no clear pattern, it suggests that the assumption of constant variance is met. Therefore, the statement is true.
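
The check is normally done by eye on a scatter plot, but a rough numeric version of the same idea can be sketched without a plotting library: compute the fitted values and residuals, then compare the typical residual size for low and high fitted values. The helper names, the half-split heuristic, and the data are assumptions made for this illustration.

```python
def fitted_and_residuals(x, y):
    """Fit y = b0 + b1 * x by least squares; return fitted values and residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, residuals


def mean_abs_residual_by_half(fitted, residuals):
    """Average |residual| for the lower and upper half of the fitted values."""
    pairs = sorted(zip(fitted, residuals))
    half = len(pairs) // 2
    low = [abs(r) for _, r in pairs[:half]]
    high = [abs(r) for _, r in pairs[half:]]
    return sum(low) / len(low), sum(high) / len(high)


# Hypothetical data whose noise grows with x (a funnel shape).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.1, -0.1, 0.2, -0.2, 1.0, -1.2, 1.5, -1.6]
y = [2.0 * xi + e for xi, e in zip(x, noise)]

fitted, residuals = fitted_and_residuals(x, y)
low_spread, high_spread = mean_abs_residual_by_half(fitted, residuals)
```

Here `high_spread` is clearly larger than `low_spread`, which is the numeric counterpart of a funnel-shaped residuals-versus-fitted plot.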

8.

Under the normality assumption, the estimator for β1 is a linear combination of normally distributed random variables.
- True
- False
Correct Answer

A. True

Explanation

Under the normality assumption, the errors, and therefore the responses, are normally distributed. The least squares estimator for β1 is a weighted sum of the observed responses, with weights that depend only on the fixed values of the predictor. A linear combination of normally distributed random variables is itself normally distributed, so the estimator for β1 is normally distributed and the statement is true.
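
In symbols, for simple linear regression, the linear combination can be written as below. The weights c_i and the shorthand S_xx are notation introduced here for illustration.

```latex
\hat{\beta}_1
  = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(Y_i-\bar{Y})}{S_{xx}}
  = \sum_{i=1}^{n} c_i Y_i,
\qquad
c_i = \frac{x_i-\bar{x}}{S_{xx}},
\quad
S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2,
\qquad\text{so}\qquad
\hat{\beta}_1 \sim N\!\left(\beta_1,\ \frac{\sigma^2}{S_{xx}}\right).
```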

9.

β̂1 is an unbiased estimator for β0.
- True
- False
Correct Answer

A. False
10.

The only assumptions for a linear regression model are linearity, constant variance, and normality.
- True
- False
Correct Answer

A. False

Explanation

The statement is false because it omits the assumption of independence. A linear regression model assumes linearity, constant variance, independence of errors, and normality of errors. Independence means that the errors are not correlated with each other.

11.

If one confidence interval in the pairwise comparison includes zero, we conclude that the two means are plausibly equal.
- True
- False
Correct Answer

A. True

Explanation

If one confidence interval in the pairwise comparison includes zero, it means that the difference between the means is not statistically significant. In other words, there is a possibility that the two means are equal. Therefore, we can conclude that the two means are plausibly equal.

12.

The mean sum of square errors in ANOVA measures variability within groups.
- True
- False
Correct Answer

A. True

Explanation

The statement is true because the mean sum of square errors (MSE) in ANOVA is a measure of the variability within groups. It is the sum of squared differences between each observation and the mean of its own group, divided by N-k, where N is the total number of observations and k is the number of groups. It therefore reflects how much the data points within each group deviate from their group mean.
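
A minimal sketch of the computation, assuming a one-way layout with k groups and N total observations. The function name, dictionary keys, and data are made up for illustration; the between-group quantities are included only to show where MSE enters the F statistic.

```python
def one_way_anova(groups):
    """Return within-group and between-group sums of squares and mean squares."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Within-group variability: deviations from each group's own mean.
    sse = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    # Between-group variability: deviations of group means from the grand mean.
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))

    mse = sse / (n_total - k)   # N - k degrees of freedom
    mstr = sstr / (k - 1)       # k - 1 degrees of freedom
    return {"SSE": sse, "MSE": mse, "SSTr": sstr, "MSTr": mstr, "F": mstr / mse}


# Hypothetical data: three groups of three observations each.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
table = one_way_anova(groups)
```

For this data each group contributes 2 to SSE, so SSE is 6 and MSE is 6 / (9 - 3) = 1.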

13.

For assessing the normality assumption of the ANOVA model, we can use the quantile-quantile normal plot and the histogram of the residuals.
- True
- False
Correct Answer

A. True

Explanation

The quantile-quantile normal plot and the histogram of the residuals are both graphical tools used to assess the normality assumption of the ANOVA model. The quantile-quantile normal plot compares the observed quantiles of the residuals to the quantiles of a normal distribution, and if the points fall approximately along a straight line, it suggests that the residuals are normally distributed. The histogram of the residuals provides a visual representation of the distribution of the residuals, and if it resembles a bell-shaped curve, it indicates normality. Therefore, the statement is true.

14.

We can assess the assumption of constant-variance by plotting the residuals against fitted values.
- True
- False
Correct Answer

A. True

Explanation

The statement is true because plotting the residuals against fitted values allows us to visually examine if there is a consistent pattern in the spread of the residuals. If the spread of the residuals appears to be relatively constant across all levels of the fitted values, it suggests that the assumption of constant variance is met. On the other hand, if there is a clear pattern or trend in the spread of the residuals, it indicates that the assumption of constant variance may be violated. Therefore, plotting the residuals against fitted values is a useful tool for assessing the assumption of constant variance.

15.

The only objective of multiple linear regression is prediction.
- True
- False
Correct Answer

A. False

Explanation

The statement "The only objective of multiple linear regression is prediction" is false. While prediction is indeed one of the main objectives of multiple linear regression, it is not the only objective. Another important objective of multiple linear regression is to understand the relationships between the independent variables and the dependent variable. By analyzing the coefficients of the independent variables, we can determine the strength and direction of these relationships, which provides valuable insights for decision-making and understanding the underlying factors influencing the dependent variable.

16.

In order to make statistical inference on the regression coefficients, we need to estimate the variance of the error terms.
- True
- False
Correct Answer

A. True

Explanation

In order to make statistical inference on the regression coefficients, we need to estimate the variance of the error terms. This is because the error terms represent the variability or randomness in the relationship between the dependent and independent variables. By estimating the variance of the error terms, we can assess the precision and significance of the regression coefficients, which allows us to make inferences about the relationship between the variables in the population. Therefore, the statement is true.

17.

The estimated regression coefficient corresponding to a predicting variable will likely be different in the model with only one predicting variable alone versus in a model with multiple predicting variables.
- True
- False
Correct Answer

A. True

Explanation

In a model with only one predicting variable, the estimated regression coefficient represents the relationship between that variable and the outcome variable in isolation. However, in a model with multiple predicting variables, the estimated regression coefficient represents the relationship between the predicting variable and the outcome variable while controlling for the effects of other variables. Therefore, it is likely that the estimated regression coefficient will be different in the two models.

18.

The assumption of normality:
- It is needed for deriving the estimators of the regression coefficients.
- It is not needed for linear regression modeling and inference.
- It is needed for the sampling distribution of the estimators of the regression coefficients and hence for inference.
- It is needed for deriving the expectation and variance of the estimators of the regression coefficients.
Correct Answer

A. It is needed for the sampling distribution of the estimators of the regression coefficients and hence for inference.

Explanation

The assumption of normality is needed for the sampling distribution of the estimators of the regression coefficients, and therefore for inference such as confidence intervals and hypothesis tests. It is not needed to derive the least squares estimators, nor to derive their expectation and variance, so the coefficients can still be estimated without it.

19.

The larger the coefficient of determination or R-squared, the higher the variability explained by the simple linear regression model.
- True
- False
Correct Answer

A. True

Explanation

The coefficient of determination, also known as R-squared, measures the proportion of the total variability in the dependent variable that is explained by the independent variable in a simple linear regression model. A higher R-squared value indicates that a larger percentage of the variability in the dependent variable can be attributed to the independent variable, meaning that the model is better at explaining the relationship between the variables. Therefore, the statement that the larger the coefficient of determination or R-squared, the higher the variability explained by the simple linear regression model is true.
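
A small sketch, assuming simple linear regression, that computes R-squared as one minus the ratio of unexplained to total variability and checks that it matches the squared sample correlation. The function name and the data are illustrative only.

```python
def r_squared_and_correlation(x, y):
    """Return (R-squared, sample correlation) for the line fitted by least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)   # total variability (SST)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    r2 = 1.0 - sse / syy                       # share of variability explained
    corr = sxy / (sxx * syy) ** 0.5
    return r2, corr


# Hypothetical data with a strong linear trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r2, corr = r_squared_and_correlation(x, y)
```

With one predictor, `r2` and `corr ** 2` agree up to floating-point rounding.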

20.

The estimators of the error term variance and of the regression coefficients are random variables.
- True
- False
Correct Answer

A. True

Explanation

The estimators of the error term variance and of the regression coefficients are random variables because they are calculated using sample data, which is subject to variation. The error term variance estimator is based on the residuals, which are the differences between the observed and predicted values, and these residuals can vary from sample to sample. Similarly, the regression coefficient estimators are calculated using the sample data, and different samples can yield different coefficient estimates. Therefore, both the error term variance and the regression coefficients estimators are random variables.

21.

Only the log-transformation of the response variable can be used when the normality assumption does not hold.
- True
- False
Correct Answer

A. False

Explanation

The statement is false because there are other methods that can be used when the normality assumption does not hold. One alternative is to use non-parametric statistical tests, which do not rely on the assumption of normality. Additionally, transformations other than the log-transformation, such as square root or reciprocal transformations, can also be applied to the response variable to achieve normality.

22.

Controlling variables used in multiple linear regression are used to control for bias in the sample.
- True
- False
Correct Answer

A. True

Explanation

Controlling variables in multiple linear regression is indeed used to control for bias in the sample. By including these variables in the regression model, we can account for their potential influence on the dependent variable and isolate the relationship between the independent variables and the dependent variable. This helps to minimize the impact of confounding factors and ensure that the estimated coefficients are more accurate and reliable. Therefore, the statement is true.

23.

The estimators for the regression coefficients are:
- Biased but with small variance
- Unbiased under normality assumptions but biased otherwise.
- Unbiased under normality assumptions but biased otherwise.
- Unbiased regardless of the distribution of the data.
Correct Answer

A. Unbiased regardless of the distribution of the data.

Explanation

The correct answer is "Unbiased regardless of the distribution of the data." The estimators for the regression coefficients are unbiased as long as the model is correctly specified and the errors have mean zero; normality is not required. On average, the estimates are therefore neither systematically too high nor too low, whether or not the data follow a normal distribution.

24.

Which one is correct?
- The prediction intervals need to be corrected for simultaneous inference when multiple predictions are made jointly.
- The prediction intervals are centered at the predicted value.
- The sampling distribution of the prediction of a new response is a t-distribution.
- All of the above.
Correct Answer

A. All of the above.

Explanation

The correct answer is "All of the above." This is because all three statements are true. The prediction intervals do need to be corrected for simultaneous inference when multiple predictions are made jointly. The prediction intervals are indeed centered at the predicted value. Additionally, the sampling distribution of the prediction of a new response follows a t-distribution. Therefore, all three statements are correct.

25.

If the confidence interval for a regression coefficient contains the value zero, we interpret that the regression coefficient is definitely equal to zero.
- True
- False
Correct Answer

A. False

Explanation

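A confidence interval that contains zero means that zero is a plausible value for the regression coefficient, not that the coefficient is definitely equal to zero.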

26.

If one confidence interval in the pairwise comparison includes zero under ANOVA, we conclude that the two corresponding means are plausibly equal.
- True
- False
Correct Answer

A. True

Explanation

If the confidence interval in the pairwise comparison includes zero under ANOVA, it means that there is a possibility that the difference between the two means is zero or very close to zero. This suggests that the two means are plausibly equal, as there is not enough evidence to conclude otherwise.

27.

We do not need to assume normality of the response variable for making inference on the regression coefficients.
- True
- False
Correct Answer

A. False

Explanation

The statement is false because in order to make inference on the regression coefficients, we typically assume that the response variable follows a normal distribution. This assumption is necessary for conducting hypothesis tests and constructing confidence intervals. Without assuming normality, it would be difficult to make accurate inferences about the relationship between the predictor variables and the response variable.

28.

The number of degrees of freedom of the χ2 (chi-square) distribution for the variance estimator is N−k+1 where k is the number of samples.
- True
- False
Correct Answer

A. False

Explanation

The correct answer is False. The variance estimator in ANOVA is the mean squared error, which pools the within-group variability of all k samples. One degree of freedom is lost for each estimated group mean, so the χ2 (chi-square) distribution has N−k degrees of freedom, not N−k+1, where N is the total number of observations.

29.

The fitted values are defined as:
- The difference between observed and expected responses.
- The regression line with parameters replaced with the estimated regression coefficients.
- The regression line.
- The response values.
Correct Answer

A. The regression line with parameters replaced with the estimated regression coefficients.

Explanation

The fitted values are calculated by replacing the parameters in the regression line with the estimated regression coefficients. These coefficients are estimated based on the observed data and represent the best-fit line that minimizes the sum of squared differences between the observed and predicted values. Therefore, the fitted values represent the predicted values of the response variable based on the estimated regression line.

30.

The variability in the prediction comes from:
- The variability due to a new measurement.
- The variability due to estimation
- The variability due to a new measurement and due to estimation.
- None of the above.
Correct Answer

A. The variability due to a new measurement and due to estimation.

Explanation

The correct answer is "The variability due to a new measurement and due to estimation." This means that the prediction can vary because of both the uncertainty in the new measurement taken and the inherent variability in the estimation process. Both factors contribute to the overall variability in the prediction.
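
For simple linear regression, the two sources can be seen in the variance of the prediction error at a new point. The symbols x*, Y*, and S_xx are notation assumed here for illustration: the leading 1 comes from the new measurement, and the remaining terms come from estimating the regression line.

```latex
\operatorname{Var}\!\left(\hat{y}^{*} - Y^{*}\right)
  = \sigma^2\left[\,1 + \frac{1}{n} + \frac{(x^{*}-\bar{x})^2}{S_{xx}}\right],
\qquad
S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2 .
```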

31.

Which one is correct?
- A multiple linear regression model with p predicting variables but no intercept has p model parameters.
- The interpretation of the regression coefficients is the same whether or not interaction terms are included in the model.
- Multiple linear regression is a general model encompassing both ANOVA and simple linear regression.
- None of the above.
Correct Answer

A. Multiple linear regression is a general model encompassing both ANOVA and simple linear regression.

Explanation

The correct answer is that multiple linear regression is a general model encompassing both ANOVA and simple linear regression. Simple linear regression is the special case with a single quantitative predictor, and ANOVA is the special case in which the predictors are qualitative and enter the model as dummy variables. The other two statements are false: a model with p predicting variables and no intercept has p+1 parameters (p coefficients plus the error variance), and including interaction terms changes the interpretation of the regression coefficients.

32.

The one-way ANOVA is a linear regression model with one qualitative predicting variable.
- True
- False
Correct Answer

A. True

Explanation

The one-way ANOVA compares the means of two or more groups defined by a single categorical variable. It is a linear regression model because group membership can be coded with dummy (indicator) variables, so that the response is a linear function of those indicators and the regression coefficients correspond to group means or differences between them. Therefore, the statement that the one-way ANOVA is a linear regression model with one qualitative predicting variable is true.
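
The equivalence can be sketched numerically by regressing the response on an intercept and dummy variables and comparing the coefficients with the group means. Everything below (the tiny solver, the group labels, and the data) is a hypothetical illustration, with group "A" chosen as the baseline.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least squares coefficients from the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical one-way layout with three groups; "A" is the baseline.
groups = {"A": [4.0, 5.0, 6.0], "B": [7.0, 8.0, 9.0], "C": [1.0, 2.0, 3.0]}

X, y = [], []
for label, values in groups.items():
    for v in values:
        # Columns: intercept, dummy for group B, dummy for group C.
        X.append([1.0, 1.0 if label == "B" else 0.0, 1.0 if label == "C" else 0.0])
        y.append(v)

intercept, effect_b, effect_c = ols(X, y)
```

The intercept recovers the mean of group A (5), and the two dummy coefficients recover the differences of the B and C means from it (3 and -3).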

33.

In the regression model, the variable of interest for study is the response variable.
- True
- False
Correct Answer

A. True

Explanation

In a regression model, the response variable is the variable of interest for study. This means that it is the variable that we are trying to understand, predict, or explain using other variables in the model. The response variable is also sometimes referred to as the dependent variable or the outcome variable. It is the variable that we want to analyze and study the relationship with other variables in the regression model. Therefore, the statement "the variable of interest for study is the response variable" is true.

34.

A negative value of β1 is consistent with a direct relationship between x and Y.
- True
- False
Correct Answer

A. False

Explanation

A negative value of β1 is consistent with an *inverse* relationship between x and Y.

35.

If one confidence interval in the pairwise comparison includes only positive values, we conclude that the difference in means is statistically significantly positive.
- True
- False
Correct Answer

A. True

Explanation

If the confidence interval in a pairwise comparison includes only positive values, it means that the lower limit of the interval is greater than zero. This indicates that there is a statistically significant difference between the means, and the difference is positive. Therefore, we can conclude that the difference in means is statistically significantly positive.

36.

The estimator σ̂² is a fixed variable.
- True
- False
Correct Answer

A. False

Explanation

The statement "The estimator σ̂² is a fixed variable" is false. An estimator is a statistic computed from the sample in order to estimate an unknown parameter, so it is a random variable rather than a fixed value. The estimator σ̂² represents the estimated variance and varies with the sample data used to calculate it.

37.

The error term variance estimator has a χ2 (chi-squared) distribution with n−11 degrees of freedom for a multiple regression model with 10 predictors.
- True
- False
Correct Answer

A. True

Explanation

The error term variance estimator in a multiple regression model has a chi-squared distribution with n−p−1 degrees of freedom, where n is the number of observations and p is the number of predictors; the additional degree of freedom is lost to the intercept. With 10 predictors, the degrees of freedom are n−10−1 = n−11. Therefore, the statement is true.
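
In general form, for a model with an intercept and p predictors, the result can be written as below. The notation (p for the number of predictors, SSE for the residual sum of squares) is assumed here for illustration.

```latex
\hat{\sigma}^2 = \frac{\mathrm{SSE}}{n-p-1},
\qquad
\frac{(n-p-1)\,\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{\,n-p-1},
\qquad
p = 10 \;\Rightarrow\; n-p-1 = n-11 .
```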

38.

We detect departure from the assumption of constant variance
- When the residuals vs fitted values are larger in the ends but smaller in the middle.
- When the residuals vs fitted are scattered randomly around the zero line.
- When the histogram does not have a symmetric shape.
- All of the above.
Correct Answer

A. When the residuals vs fitted values are larger in the ends but smaller in the middle.

Explanation

When the residuals vs fitted values are larger in the ends but smaller in the middle, it suggests a departure from the assumption of constant variance. This pattern indicates heteroscedasticity, which means that the variability of the residuals is not constant across all levels of the predictor variable. In other words, the spread of the residuals is not the same throughout the range of the predicted values. This violation of the assumption can affect the reliability and accuracy of the regression model.

39.

In evaluating a simple linear model:
- There is a direct relationship between coefficient of determination and the correlation between the predicting and response variables.
- The coefficient of determination is interpreted as the percentage of variability in the response variable explained by the model.
- Residual analysis is used for goodness of fit assessment.
- All of the above.
Correct Answer

A. All of the above.

Explanation

All three statements are true. In simple linear regression, the coefficient of determination (R-squared) equals the square of the correlation between the predicting and response variables, so the two are directly related. R-squared is interpreted as the percentage of variability in the response variable that is explained by the model. Finally, residual analysis is used to assess the goodness of fit of a linear model.

40.

Which is correct?
- If we reject the test of equal means, we conclude that all treatment means are not equal.
- If we do not reject the test of equal means, we conclude that means are definitely all equal
- If we reject the test of equal means, we conclude that some treatment means are not equal.
- None of the above.
Correct Answer

A. If we reject the test of equal means, we conclude that some treatment means are not equal.

Explanation

If we reject the test of equal means, there is evidence that at least one treatment mean differs from the others. Rejection does not imply that all the means differ from each other, and failing to reject does not prove that the means are all equal. Therefore, the correct answer is that if we reject the test of equal means, we conclude that some treatment means are not equal.

41.

The objective of multiple linear regression is:
- To predict future new responses.
- To model the association of explanatory variables to a response variable accounting for controlling factors.
- To test hypothesis using statistical inference on the model.
- All of the above.
Correct Answer

A. All of the above.

Explanation

The objective of multiple linear regression is to predict future new responses, model the association of explanatory variables to a response variable accounting for controlling factors, and test hypotheses using statistical inference on the model. This means that all of the given options are correct objectives of multiple linear regression.

42.

The residuals in simple linear regression have constant variance.
- True
- False
Correct Answer

A. True

Explanation

In simple linear regression, the residuals are the differences between the observed values and the fitted values, and they serve as estimates of the error terms. The model assumes that the error terms have constant variance, also known as homoscedasticity, meaning that their variability is the same across all levels of the predictor variable. If the variance is not constant, the least squares estimates of the regression coefficients remain unbiased but are inefficient, and the usual standard errors and inference become unreliable. Under the model assumptions, the statement is therefore taken to be true.

43.

In a multiple linear regression model with 6 predicting variables but without intercept, there are 7 parameters to estimate.
- True
- False
Correct Answer

A. True

Explanation

In a multiple linear regression model without an intercept, there is one regression coefficient for each predicting variable, which gives 6 coefficients. The variance of the error terms is also an unknown parameter that must be estimated. Therefore, the total number of parameters to estimate is 6 + 1 = 7, and the statement is true.

44.

We cannot estimate a multiple linear regression model if the predicting variables are linearly dependent.
- True
- False
Correct Answer

A. True

Explanation

In multiple linear regression, we aim to estimate the relationship between a response variable and multiple predicting variables. If the predicting variables are linearly dependent, at least one of them can be expressed exactly as a linear combination of the others. The least squares equations then have no unique solution, so the regression coefficients cannot be estimated. Therefore, it is true that we cannot estimate a multiple linear regression model if the predicting variables are linearly dependent.
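
A small sketch of why estimation breaks down, assuming a model with two predictors and no intercept: the least squares equations involve the 2x2 matrix of sums of squares and cross-products of the predictors, and its determinant is zero when one predictor is an exact multiple of the other. The function name and data are made up for illustration.

```python
def cross_product_determinant(x1, x2):
    """Determinant of the 2x2 matrix [[x1.x1, x1.x2], [x1.x2, x2.x2]]."""
    s11 = sum(v * v for v in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    return s11 * s22 - s12 * s12


x1 = [1.0, 2.0, 3.0, 4.0]
x2_dependent = [2.0 * v for v in x1]   # exactly 2 * x1
x2_independent = [1.0, 0.0, 2.0, 1.0]  # not a multiple of x1

det_dependent = cross_product_determinant(x1, x2_dependent)
det_independent = cross_product_determinant(x1, x2_independent)
```

A zero determinant means the matrix cannot be inverted, so no unique set of coefficients exists; the nonzero determinant in the second case means the coefficients are uniquely determined.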

45.

The constant variance is diagnosed using the quantile-quantile normal plot.
- True
- False
Correct Answer

A. False

Explanation

The constant variance is not diagnosed using the quantile-quantile normal plot. The quantile-quantile normal plot is used to check the normality of the residuals in a statistical model. Constant variance is typically diagnosed using other diagnostic plots such as a plot of residuals against fitted values or a plot of residuals against a predictor variable. Therefore, the given statement is false.

46.

The hypothesis test for whether a subset of regression coefficients are all equal to zero is a partial F-test.
- True
- False
Correct Answer

A. True

Explanation

The explanation for the given correct answer is that a partial F-test is used to test whether a subset of regression coefficients, which represents a specific group of independent variables, are all equal to zero. This test is commonly used in regression analysis to determine the significance of a group of variables in explaining the dependent variable. Therefore, it is correct to say that the hypothesis test for whether a subset of regression coefficients are all equal to zero is a partial F-test.
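
The test statistic can be sketched as follows, assuming a full model with p predictors and an intercept, and a reduced model that drops the q coefficients being tested. The symbols q, SSE_full, and SSE_reduced are notation introduced here for illustration.

```latex
F = \frac{\left(\mathrm{SSE}_{\text{reduced}} - \mathrm{SSE}_{\text{full}}\right)/q}
         {\mathrm{SSE}_{\text{full}}/(n-p-1)}
\;\sim\; F_{\,q,\;n-p-1}
\quad\text{under } H_0:\ \text{the } q \text{ tested coefficients are all zero.}
```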

47.

We need to assume normality of the response variable for making inference on the regression coefficients.
- True
- False
Correct Answer

A. True

Explanation

Inference on the regression coefficients (hypothesis tests and confidence intervals) relies on the sampling distribution of the estimators. That distribution is derived by assuming the response variable, given the predicting variables, is normally distributed, which is equivalent to assuming normally distributed error terms. Without this assumption, the usual tests and intervals are no longer exact, and the validity of the inference may be compromised. Therefore, it is important to assume normality of the response variable when making inferences on the regression coefficients.
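In symbols, and with notation assumed here (X is the design matrix including the intercept column, treated as fixed), the link between the normality assumption and the estimator's distribution is:

```latex
\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)
\;\Longrightarrow\;
\hat{\beta}_j \sim N\!\left(\beta_j,\ \sigma^2 \left[(X^\top X)^{-1}\right]_{jj}\right)
```

This holds because each estimated coefficient is a linear combination of the normally distributed responses.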

48.

We can use the normal test to test whether a regression coefficient is equal to zero.
- True
- False
Correct Answer

A. False

Explanation

The statement is false. Because the error variance is unknown and must be replaced by its estimate, the standardized coefficient estimate follows a t-distribution rather than a normal distribution. To test whether a regression coefficient is equal to zero, we therefore use a t-test rather than the normal (z) test.
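For the simple linear regression case, the t-statistic for the slope can be computed as in the sketch below. The function name `slope_t_statistic` and the data are made up for illustration; the statistic is compared with a t distribution with n − 2 degrees of freedom.

```python
from math import sqrt

def slope_t_statistic(x, y):
    """Return (b1, se_b1, t, df) for testing H0: slope = 0 in y = b0 + b1 * x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # Error variance estimate: SSE divided by n - 2 (two estimated coefficients).
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = sse / (n - 2)
    se_b1 = sqrt(sigma2_hat / sxx)
    return b1, se_b1, b1 / se_b1, n - 2

# Hypothetical data, roughly y = 2x with small noise.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, se_b1, t_stat, df = slope_t_statistic(x, y)
```

The p-value would then come from the t distribution with `df` degrees of freedom, for example from a statistics table or a statistics library.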

49.

The objective of the pairwise comparison is:
- To find which means are equal.
- To identify the statistically significantly different means.
- To find the estimated means which are greater or lower than others.
- None of the above.
Correct Answer

A. To identify the statistically significantly different means.

Explanation

The objective of pairwise comparison is to identify the statistically significantly different means. This means that the purpose of this method is to compare different groups or treatments and determine if there is a significant difference between them. By conducting pairwise comparisons, researchers can determine which means are significantly different from each other, helping to identify any significant effects or differences in the data.
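A pairwise comparison can be sketched as one confidence interval per pair of group means, flagging the pairs whose interval excludes zero. In the sketch below, the function `pairwise_intervals`, the data, and the multiplier `crit` are all assumptions for illustration. In practice the multiplier comes from a multiplicity-adjusted procedure such as Tukey's method, not from a fixed number.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pairwise_intervals(groups, crit):
    """Intervals for each difference in means, using the pooled ANOVA MSE.

    `crit` is a caller-supplied critical multiplier (an assumption here).
    """
    n_total = sum(len(v) for v in groups.values())
    k = len(groups)
    sse = sum((y - mean(v)) ** 2 for v in groups.values() for y in v)
    mse = sse / (n_total - k)  # within-group variability
    out = {}
    for (a, ya), (b, yb) in combinations(groups.items(), 2):
        diff = mean(ya) - mean(yb)
        half_width = crit * sqrt(mse * (1 / len(ya) + 1 / len(yb)))
        out[(a, b)] = (diff - half_width, diff + half_width)
    return out

# Made-up samples for three groups.
groups = {"A": [10, 11, 9, 10], "B": [10, 12, 11, 11], "C": [20, 21, 19, 20]}
intervals = pairwise_intervals(groups, crit=2.8)  # 2.8 is a placeholder multiplier

# Pairs whose interval excludes zero have significantly different means.
significant = [pair for pair, (lo, hi) in intervals.items() if lo > 0 or hi < 0]
```

Here the A and B interval contains zero, so those two means are plausibly equal, while C differs from both.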


Quiz Review Timeline (Updated): Sep 1, 2024

Our quizzes are rigorously reviewed, monitored and continuously updated by our expert board to maintain accuracy, relevance, and timeliness.

Current Version
Sep 01, 2024

Quiz Edited by
ProProfs Editorial Team
Feb 11, 2018

Quiz Created by
Omsaben

