Despite what the article claims, normality is not actually an assumption of linear regression. It is "required" for the F-tests (the F-distribution being derived from the normal distribution), but it is not required for showing that the regression coefficient estimates are consistent.
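A minimal sketch of the point (my own toy example, not from the article or the comment): OLS on data with heavily skewed, non-normal errors still recovers the true coefficients.

```python
# Toy illustration (assumed data): OLS still recovers the true coefficients
# when the errors are strongly non-normal.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(-2, 2, size=n)
errors = rng.exponential(scale=1.0, size=n) - 1.0   # skewed, mean-zero, not Gaussian
y = 3.0 + 2.0 * x + errors                          # true intercept 3, true slope 2

X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)    # ordinary least squares
print(beta_hat)                                     # close to [3.0, 2.0]
```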
> Assumptions of linear regression: There must be a linear relation between independent and dependent variables.

That's not wrong, but it's a strong way to word it. If linear regression were only suitable when the variables were perfectly linearly related, it would get a lot less use. Practically, linear regression can be used when the relationship is linear-ish, at least in the interval of interest. In other words, you can choose to declare linearity as an assumption (and take responsibility for what that choice entails, and for the error it might introduce into your analysis).
A tool that I've found myself reaching for more and more often is Gaussian Process Regression [1] [2]:

* It allows you to model essentially arbitrary functions. The main model assumption is your choice of kernel, which defines the local correlation between nearby points.

* You can draw samples from the distribution of all possible functions that fit your data.

* You can quantify which regions of the function you have more or less certainty about.

* Imagine this situation: you want to discover the functional relationship between the inputs and outputs of a long-running process. You can test any input you want, but it's not practical to exhaustively grid-search the input space. A Gaussian Process model can tell you which inputs to test next so as to gain the most information, which makes it perfect for optimising complex simulations. Used in this way, it's one means of implementing "Bayesian Optimisation" [3].

[1] https://en.wikipedia.org/wiki/Gaussian_process

[2] http://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html#sklearn.gaussian_process.GaussianProcessRegressor

[3] https://en.wikipedia.org/wiki/Bayesian_optimization
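A rough sketch of the scikit-learn estimator linked in [2]; the toy function, kernel choice, and "test the most uncertain input next" rule are my own illustrative assumptions, not a full Bayesian-optimisation acquisition function.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(15, 1))          # sparse, expensive-to-obtain observations
y_train = np.sin(X_train).ravel() + 0.1 * rng.standard_normal(15)

# The kernel encodes the assumed local correlation between nearby points.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)

X_test = np.linspace(0, 10, 200).reshape(-1, 1)
mean, std = gpr.predict(X_test, return_std=True)    # posterior mean and per-point uncertainty
samples = gpr.sample_y(X_test, n_samples=3)         # draws from the posterior over functions
next_x = X_test[np.argmax(std)]                     # naive "which input to test next" heuristic
```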
Now this is a topic I desperately need. Can anyone here explain why one would choose predictors in multilinear regression that are NOT correlated with the target? I am having trouble understanding a paper [1] in which the authors avoid using predictors that are correlated with the target. The target is the ozone concentration reported by a reference instrument, and the predictors are low-cost sensor outputs.

[1] https://www.sciencedirect.com/science/article/pii/S092540051500355X (Section 4.1, about ozone predictors)
This article is obviously a jumping-off-point kind of article. Most people using linear regression have never even heard of things like ridge regression. So I like the article.

However, there are at least two types of regression I'd add to the list, and a suggestion (a rough sketch of 2 and 3 follows below):

1. Multivariate Distance Matrix Regression (MDMR; Anderson, 2001; McArdle & Anderson, 2001).

2. Regression with splines.

3. On polynomial regression, add a mention of orthogonal polynomials.
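A hedged sketch of items 2 and 3 using scikit-learn and numpy (SplineTransformer needs a reasonably recent scikit-learn; the toy data are made up):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + 0.2 * rng.standard_normal(200)

# Item 2: cubic B-spline basis expansion followed by ordinary least squares.
spline_model = make_pipeline(SplineTransformer(degree=3, n_knots=8), LinearRegression())
spline_model.fit(X, y)
print(spline_model.predict([[2.5], [7.5]]))

# Item 3: numpy can fit polynomials in an orthogonal (here Chebyshev) basis,
# which avoids the ill-conditioning of raw powers of x.
cheb = np.polynomial.Chebyshev.fit(X.ravel(), y, deg=5)
print(cheb(np.array([2.5, 7.5])))
```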
Why did the article cover a basic term like "outlier" under "Terminologies related to regression" but omit information about how to evaluate a regression model? I liked that there was some information at the bottom about "How to choose a regression model" that mentioned "you can select the final model based on Adjusted r-square, RMSE, AIC and BIC", but providing a little more context would make this post even better. Perhaps a link to a future blog post on the topic?
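For anyone wanting that missing context, a quick sketch (made-up data) of how those selection metrics fall out of a statsmodels fit:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.0 + X @ np.array([2.0, 0.0, -1.5]) + rng.normal(scale=0.5, size=200)

result = sm.OLS(y, sm.add_constant(X)).fit()
rmse = np.sqrt(np.mean(result.resid ** 2))            # in-sample RMSE
print(result.rsquared_adj, rmse, result.aic, result.bic)
# Lower AIC/BIC and RMSE (and higher adjusted R-squared) favour a candidate model,
# though the criteria can disagree; that's the context the article skips.
```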
Are there any ML APIs or web services that accept a vector and run various regression scenarios to identify the optimal fit?

I suppose vectors for both training and testing would be required.

Would gladly pay $1-$5 per batch for a service to do this.
Logistic regression is doing classification, not regression. That is, it's assigning/predicting categories of data points instead of predicting some continuous value on an interval. Maybe this is splitting hairs, but the way you evaluate a classification model is totally different from the way you evaluate a regression one.
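A small sketch of that last point (toy data, my own assumptions): a fitted logistic regression is scored with classification metrics such as accuracy and log loss, not with RMSE-style regression metrics.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, log_loss

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

clf = LogisticRegression().fit(X, y)
print(accuracy_score(y, clf.predict(X)))      # evaluated on hard class labels
print(log_loss(y, clf.predict_proba(X)))      # evaluated on predicted probabilities
```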
Don’t forget to put RANSAC on your list:
https://en.m.wikipedia.org/wiki/Random_sample_consensus
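A quick sketch with scikit-learn's RANSACRegressor (the line-plus-gross-outliers toy data are my own illustration; by default it wraps an ordinary linear regression):

```python
import numpy as np
from sklearn.linear_model import RANSACRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X.ravel() + 1.0 + rng.normal(scale=0.3, size=100)
y[:10] += 30.0                                   # inject gross outliers

ransac = RANSACRegressor(residual_threshold=2.0).fit(X, y)
print(ransac.estimator_.coef_, ransac.estimator_.intercept_)   # fit largely ignores the outliers
print(ransac.inlier_mask_.sum(), "points kept as inliers")
```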
I was hoping for one *interesting* chart per regression analysis type. That didn't happen, and I felt lost at sea. Please improve the post on such an amazing topic.
> In simple words, regression analysis is used to model the relationship between a dependent variable and one or more independent variables.

“model” isn’t a simple word.
This is just horrible quality material. What in the heck is this?

> It is to be kept in mind that the coefficients which we get in quantile regression for a particular quantile should differ significantly from those we obtain from linear regression. If it is not so then our usage of quantile regression isn't justifiable. This can be done by observing the confidence intervals of regression coefficients of the estimates obtained from both the regressions.
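For what it's worth, here is my guess (a hedged statsmodels sketch on made-up data) at the comparison that passage is fumbling toward: fit OLS and a quantile regression, then check whether the coefficient confidence intervals actually separate.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=500)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0 + 0.5 * x)   # heteroscedastic noise, so quantile slopes differ
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()
q90 = sm.QuantReg(y, X).fit(q=0.9)
print(ols.conf_int())   # confidence intervals for the OLS intercept and slope
print(q90.conf_int())   # intervals for the 0.9-quantile fit; compare overlap with the OLS ones
```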