Quantifying the Effects of Transfer Learning in Min-norm Interpolation
Friday, April 18
Speaker(s): Pragya Sur, Assistant Professor of Statistics, Harvard University
Min-norm interpolators naturally emerge as implicit regularized limits of modern machine learning algorithms. Recently, their out-of-distribution risk was studied in the setting where test samples are unavailable during training. In many applications, however, a limited amount of test data is typically available during training. The properties of min-norm interpolation in this setting are not well understood. In this talk, I will present a characterization of the generalization error of pooled min-L2-norm interpolation under covariate and model shifts. I will demonstrate that the pooled interpolator captures both early fusion and a form of intermediate fusion. Our results have several implications. Under model shift, when the signal-to-noise ratio is low, adding source data always hurts prediction. For higher signal-to-noise ratios, however, transfer learning helps as long as the shift-to-signal ratio lies below a threshold that I will define. I will derive precise thresholds capturing when the pooled interpolator outperforms the target-based interpolator, and further characterize the optimal number of target samples that minimizes the generalization error. Our results also show that under covariate shift, if the source sample size is small relative to the dimension, heterogeneity between domains improves the risk. This is based on joint work with Yanke Song and Sohom Bhattacharya.
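
To make the objects in the abstract concrete, here is a minimal numerical sketch (not the speaker's code) of a min-L2-norm interpolator and of the pooled (source plus target) versus target-only comparison in an overparameterized linear model. All function names, dimensions, and noise levels below are illustrative assumptions, not quantities from the talk.

```python
# Illustrative sketch only: min-l2-norm interpolation and a pooled vs.
# target-only comparison under a simulated model shift. Parameters are
# arbitrary choices for demonstration, not values from the talk or paper.
import numpy as np

rng = np.random.default_rng(0)


def min_l2_norm_interpolator(X, y):
    """Minimum-l2-norm solution of X b = y (the interpolating least-squares fit
    when the dimension exceeds the sample size)."""
    return np.linalg.pinv(X) @ y


# Overparameterized regime: dimension p exceeds each sample size.
p, n_source, n_target, n_test = 500, 200, 100, 1000
beta_target = rng.normal(size=p) / np.sqrt(p)                        # target model
beta_source = beta_target + 0.1 * rng.normal(size=p) / np.sqrt(p)    # shifted source model

X_s = rng.normal(size=(n_source, p))
X_t = rng.normal(size=(n_target, p))
y_s = X_s @ beta_source + 0.5 * rng.normal(size=n_source)
y_t = X_t @ beta_target + 0.5 * rng.normal(size=n_target)

# Target-only interpolator vs. pooled interpolator (early fusion of both samples).
b_target = min_l2_norm_interpolator(X_t, y_t)
b_pooled = min_l2_norm_interpolator(np.vstack([X_s, X_t]),
                                    np.concatenate([y_s, y_t]))

# Generalization error on fresh target-distribution data.
X_test = rng.normal(size=(n_test, p))
y_test = X_test @ beta_target
for name, b in [("target-only", b_target), ("pooled", b_pooled)]:
    print(name, float(np.mean((X_test @ b - y_test) ** 2)))
```

Varying the simulated signal-to-noise ratio and the size of the source shift in such a sketch is one way to see, empirically, the kind of thresholds the talk characterizes analytically.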