Scoring/predicting for new observations

by Simon   Last Updated November 17, 2017 12:19 PM

I have two data sets of variables, one of which (the new observations) has no dependent variable. The data set without a dependent variable has around 20 times as many records. Modeling/training on the first data set is fairly simple and gives reasonable results for scoring the new observations. I receive new instances of the two data sets periodically. In previous iterations, both data sets had very similar variable means, standard deviations, medians, and 25%/75% quantiles. But in the most recent instance they are very different.
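A comparison of this kind can be sketched as below. This is only a minimal illustration under assumptions: the function names `summarize` and `standardized_mean_difference` and the toy values are hypothetical, and it uses nothing beyond Python's standard `statistics` module. It computes the summary statistics mentioned above for one variable and a scaled difference in means between the labeled data and the new observations.

```python
import statistics


def summarize(values):
    """Return mean, standard deviation, and quartiles for one variable."""
    q25, median, q75 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "q25": q25,
        "median": median,
        "q75": q75,
    }


def standardized_mean_difference(train_values, new_values):
    """Difference in means, scaled by the standard deviation of the labeled data."""
    diff = statistics.mean(new_values) - statistics.mean(train_values)
    return diff / statistics.stdev(train_values)


# Hypothetical toy values for a single variable.
train_values = [1.0, 2.0, 3.0, 4.0, 5.0]   # data set with a dependent variable
new_values = [3.0, 4.0, 5.0, 6.0, 7.0]     # new observations to be scored

train_summary = summarize(train_values)
new_summary = summarize(new_values)
smd = standardized_mean_difference(train_values, new_values)
```

A large absolute value of `smd` for a variable is one simple signal that the two data sets have drifted apart on that variable; what counts as "large" is a judgment call.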

Are there any methodologies for scoring/predicting new observations in situations like the one above? Currently I'm modeling with a generalized linear model.

My initial thought is to weight the new observations in some fashion (I haven't thought long enough about the best way, or even whether it could be cheating in some way). Another idea is to sample from the new observations based on distributions determined from the data set with the dependent variable. Anyway, I'm curious whether others know of any literature or keywords to look for when googling.
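The weighting idea can be made concrete with a small sketch. One caveat: schemes of this kind, usually discussed under the names "covariate shift" or "importance weighting", typically go in the other direction and reweight the labeled records so that their distribution resembles that of the new observations. The code below is a rough illustration under assumptions, not a recommended method. It handles a single variable, estimates the density ratio crudely by binning, and the names `bin_index`, `importance_weights`, and `edges` are hypothetical.

```python
from collections import Counter


def bin_index(value, edges):
    """Return the index of the bin that `value` falls into, given sorted bin edges."""
    return sum(value >= edge for edge in edges)


def importance_weights(train_x, new_x, edges):
    """Weight each labeled record by (new share) / (labeled share) of its bin."""
    train_bins = [bin_index(v, edges) for v in train_x]
    train_counts = Counter(train_bins)
    new_counts = Counter(bin_index(v, edges) for v in new_x)
    n_train, n_new = len(train_x), len(new_x)

    weights = []
    for b in train_bins:
        train_share = train_counts[b] / n_train
        new_share = new_counts.get(b, 0) / n_new  # 0 if no new observation is in this bin
        weights.append(new_share / train_share)
    return weights


# Hypothetical toy values: the new observations are shifted toward larger values.
train_x = [1.0, 2.0, 3.0, 4.0]
new_x = [1.0, 3.0, 3.0, 4.0]
edges = [2.5]  # one cut point, so two bins

weights = importance_weights(train_x, new_x, edges)
```

Weights like these could then be supplied as per-observation weights when fitting the generalized linear model, if the fitting routine supports them. With many variables, binning breaks down and the ratio is usually estimated some other way, so this should be read as a toy version of the idea only.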


