How to combine multiple kernels of large sample datasets?

by SD1024   Last Updated September 09, 2018 09:19 AM

I have multiple large sample datasets in matrix format (each has 15000 rows and 5-50 columns) corresponding to different experiments. Each matrix contains the same number of samples (rows), but the variables (columns) differ between matrices. My objective is to cluster the samples on the basis of all the experiments.

I tried unsupervised multiple kernel learning (UMKL) to integrate the datasets, followed by kernel PCA, using the "mixKernel" package in R (https://www.ncbi.nlm.nih.gov/pubmed/29077792). The UMKL step calculates a kernel for each dataset and then combines the kernels using one of three approaches:

1) calculating a consensus kernel from the multiple kernels
2) calculating a sparse kernel preserving the original topology of the data
3) calculating a full kernel preserving the original topology of the data
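To make the idea of combining per-dataset kernels concrete, here is a minimal sketch in plain Python. It is not the mixKernel algorithm: it uses a plain unweighted average of cosine-normalized linear kernels as a simple stand-in for a consensus kernel, and the toy matrices `exp_a` and `exp_b` and all function names are hypothetical.

```python
import math

def linear_kernel(X):
    """Gram matrix K[i][j] = <x_i, x_j> for the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def cosine_normalize(K):
    """Rescale K so every diagonal entry is 1: K_ij / sqrt(K_ii * K_jj).
    Assumes no sample is an all-zero row (that would divide by zero)."""
    n = len(K)
    d = [math.sqrt(K[i][i]) for i in range(n)]
    return [[K[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]

def mean_kernel(kernels):
    """Unweighted average of kernels computed on the same samples."""
    n = len(kernels[0])
    m = len(kernels)
    return [[sum(K[i][j] for K in kernels) / m for j in range(n)]
            for i in range(n)]

# Toy data: the same 3 samples measured in two hypothetical experiments
# with different variables (2 columns vs. 3 columns).
exp_a = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
exp_b = [[1.0, 2.0, 2.0], [2.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

# One kernel per experiment, normalized so they are on a comparable scale.
kernels = [cosine_normalize(linear_kernel(X)) for X in (exp_a, exp_b)]
combined = mean_kernel(kernels)
```

The combined matrix is samples by samples regardless of how many columns each experiment has, which is why the rows must be in the same order in every matrix. For real data a numerical library would replace the nested lists.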

The kernel calculation step was fine, but the kernel integration step (for all three approaches) runs for a very long time and my computer hangs.
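For scale, here is a rough back-of-the-envelope estimate of the size of one kernel matrix for 15000 samples. It assumes dense double-precision storage (8 bytes per entry), which is an assumption about the implementation and not something confirmed above, so it may or may not explain the hang.

```python
n_samples = 15000          # rows per dataset, as stated in the question
bytes_per_double = 8       # assumption: dense double-precision storage

# A kernel matrix is samples x samples, independent of the column count.
kernel_bytes = n_samples * n_samples * bytes_per_double
kernel_gb = kernel_bytes / 1e9

print(f"{kernel_gb:.1f} GB per kernel")  # prints "1.8 GB per kernel"
```

Any intermediate copies made while combining several such matrices would add to this figure.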

Is there any way to handle this problem? More specifically, is there any way to handle multiple kernel integration for large sample datasets?

Suggestions for alternatives to kernel methods are also welcome.


