Is it valid to look at the impact of a feature on residuals?

by roundsquare   Last Updated September 19, 2019 21:19 PM

Background

I'm trying to measure the causal impact of action on outcome (sorry for the vague names, but trying to keep this general). My data consists of the following for each record:

  • m other_features [the result of applying TruncatedSVD to a much larger set of features]
  • action, which is binary
  • outcome, which is continuous [in all cases here, I use the log of the actual value]

Based on business knowledge of the situation, I'm confident in a causal model that states:

  • the other_features have a causal effect on action
  • the other_features also have a causal effect on outcome
  • the action has a causal effect on outcome
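To make the assumed causal structure concrete, here is a minimal sketch that generates synthetic data with the same three arrows. Everything in it is an assumption for illustration only: the function name `simulate`, the logistic link from other_features to action, the Gaussian noise, and the value of `true_effect` are not taken from the real data.

```python
import numpy as np

def simulate(n=5000, m=5, true_effect=0.1, seed=0):
    """Draw synthetic (other_features, action, log_outcome) under the assumed causal model."""
    rng = np.random.default_rng(seed)
    other_features = rng.normal(size=(n, m))

    # other_features -> action: a logistic propensity (hypothetical choice of link)
    w_action = rng.normal(size=m)
    propensity = 1.0 / (1.0 + np.exp(-(other_features @ w_action)))
    action = (rng.random(n) < propensity).astype(float)

    # other_features -> outcome and action -> outcome, on the log scale
    w_outcome = rng.normal(size=m)
    noise = rng.normal(scale=0.5, size=n)
    log_outcome = other_features @ w_outcome + true_effect * action + noise
    return other_features, action, log_outcome

other_features, action, log_outcome = simulate()
```

Because `true_effect` is known in such a simulation, any estimation procedure can be checked against it before being trusted on the real records.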

I tried two techniques and got different results, and I would like to know how to interpret them.

Attempt 1: Vanilla Linear Regression

First, I tried to regress outcome against other_features and action. When I did this, I got the following results (I'm using $\beta$ to refer to linear coefficients):

  • $R^2 = 0.793$
  • $\beta_{action} = 0.0943$

Since outcome is on a log scale, this implies that action increases outcome by roughly 9.5%.
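A minimal sketch of this first attempt, run on synthetic data because the real records are not shown here. The helper names `fit_ols` and `log_coef_to_pct`, the data-generating choices, and the true effect of 0.1 are all assumptions for illustration. With a logged outcome, the exact conversion from a coefficient to a percentage change is $e^{\beta} - 1$, which for $\beta = 0.0943$ is about 9.9%, close to the rough figure quoted above.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2

def log_coef_to_pct(beta):
    """Convert a coefficient on a log outcome to a fractional change."""
    return np.exp(beta) - 1.0

# Synthetic stand-in for the real data (all values here are assumptions).
rng = np.random.default_rng(0)
n, m = 5000, 5
other_features = rng.normal(size=(n, m))
propensity = 1.0 / (1.0 + np.exp(-other_features[:, 0]))
action = (rng.random(n) < propensity).astype(float)
log_outcome = (other_features @ rng.normal(size=m)
               + 0.1 * action
               + rng.normal(scale=0.5, size=n))

# Regress the log outcome on other_features and action together.
coef, r2 = fit_ols(np.column_stack([other_features, action]), log_outcome)
beta_action = coef[-1]  # action is the last column of the design matrix
print(f"R^2 = {r2:.3f}, beta_action = {beta_action:.4f}, "
      f"implied change = {log_coef_to_pct(beta_action):.1%}")
```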

Attempt 2: Looking at Action vs Residuals

Second, I tried to regress outcome against other_features (i.e. without action). When I did this, I got the following results:

  • $R^2 = 0.793$ [very little difference from the original regression].

Then, I looked at the residuals of those with and without action and got the following:

[Image: residuals for records with and without action]

This seems to indicate that action has an impact of roughly -8.3% on outcome.
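A minimal sketch of this second attempt on synthetic data, alongside the coefficient from the full regression for comparison. The helper name `ols_residuals` and the simulated data are assumptions for illustration. For plain least squares fitted on the same rows with an intercept, the gap in mean residuals between the two groups equals the full-regression coefficient multiplied by $1 - R^2$ from regressing action on other_features (a consequence of the Frisch-Waugh-Lovell theorem), so the sketch checks that identity numerically.

```python
import numpy as np

def ols_residuals(X, y):
    """Fit y on X with an intercept; return (residuals, coefficients)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef, coef

# Synthetic stand-in for the real data (all values here are assumptions).
rng = np.random.default_rng(1)
n, m = 5000, 5
other_features = rng.normal(size=(n, m))
propensity = 1.0 / (1.0 + np.exp(-other_features[:, 0]))
action = (rng.random(n) < propensity).astype(float)
log_outcome = (other_features @ rng.normal(size=m)
               + 0.1 * action
               + rng.normal(scale=0.5, size=n))

# Step 1: regress the outcome on other_features only (action left out).
resid_y, _ = ols_residuals(other_features, log_outcome)

# Step 2: compare mean residuals for records with and without action.
gap = resid_y[action == 1].mean() - resid_y[action == 0].mean()

# For comparison: coefficient on action from the full regression.
_, coef_full = ols_residuals(np.column_stack([other_features, action]), log_outcome)
beta_action = coef_full[-1]

# How much of action is explained by other_features.
resid_a, _ = ols_residuals(other_features, action)
r2_action = 1.0 - (resid_a @ resid_a) / np.sum((action - action.mean()) ** 2)

print(f"residual gap      = {gap:.4f}")
print(f"beta_action       = {beta_action:.4f}")
print(f"beta * (1 - R2_a) = {beta_action * (1.0 - r2_action):.4f}")
```

In this setup the residual gap is a shrunken version of the regression coefficient, because part of the action effect has already been absorbed by other_features in step 1.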

Questions

  1. Is attempt 2 a valid way to proceed?
  2. If yes, how should I interpret the difference between the two approaches?
  3. Is there anything I ought to be wary of or careful about in using this approach?

(Please let me know if additional data or results would be helpful in interpreting these results; I can supplement the information here.)


