Derive machine learning features from dependent variables in each data sample such that maximum information is retained

by gau   Last Updated September 28, 2018 21:19 PM

I have a collection of values for two different variables X1 and X2 for each timestamp, as briefly represented below:

  • Timestamp 1: X1=[20,50,175], X2=[100,200,300]
  • Timestamp 2: X1=[15,55,55,150,500], X2=[100,200,300,400,500]
  • Timestamp 3: X1=[10,20,25,200], X2=[50,100,150,250]
  • and so on ...

For each timestamp, X1 and X2 have the same length (the length itself varies between timestamps), and X1 depends non-linearly on X2.

Each timestamp is one data sample for my supervised regression model (the response/target variable is omitted here).
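To make the data layout concrete, here is a minimal sketch of how the samples above could be held in Python. The dictionary layout and the helper name `check_paired_lengths` are only assumptions for illustration, not part of my actual pipeline. It checks that X1 and X2 are paired within each timestamp and shows that the number of pairs differs across timestamps, which is why fixed-length features have to be derived.

```python
# Hypothetical container: one entry per timestamp with its paired X1/X2 values
# (values copied from the three example timestamps in the question).
samples = {
    1: {"X1": [20, 50, 175], "X2": [100, 200, 300]},
    2: {"X1": [15, 55, 55, 150, 500], "X2": [100, 200, 300, 400, 500]},
    3: {"X1": [10, 20, 25, 200], "X2": [50, 100, 150, 250]},
}


def check_paired_lengths(samples):
    """Return {timestamp: number of (X1, X2) pairs}; raise if X1 and X2 differ in length."""
    lengths = {}
    for ts, values in samples.items():
        n1, n2 = len(values["X1"]), len(values["X2"])
        if n1 != n2:
            raise ValueError(f"timestamp {ts}: len(X1)={n1} != len(X2)={n2}")
        lengths[ts] = n1
    return lengths


lengths = check_paired_lengths(samples)
print(lengths)  # {1: 3, 2: 5, 3: 4}
```

Because the per-timestamp lengths are 3, 5 and 4, the raw lists cannot be stacked directly into a rectangular feature matrix.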

Question 1: What is the most promising technique for deriving useful features from the dependent variables X1 and X2 in each data sample such that as much information as possible is retained?

Question 2: Does this technique extend to higher dimensions (X1, X2, ..., Xn)?
