Random subspace method

In machine learning the random subspace method,^[1] also called attribute bagging^[2] or feature bagging, is an ensemble learning method that attempts to reduce the correlation between estimators in an ensemble by training them on random samples of features instead of the entire feature set.

Motivation

In ensemble learning one tries to combine the models produced by several learners into an ensemble that performs better than the original learners. One way of combining learners is bootstrap aggregating or bagging, which shows each learner a randomly sampled subset of the training points so that the learners will produce different models that can be sensibly averaged.^[a] In bagging, one samples training points with replacement from the full training set.

The random subspace method is similar to bagging except that the features ("attributes", "predictors", "independent variables") are randomly sampled, with replacement, for each learner. Informally, this keeps individual learners from over-focusing on features that appear highly predictive or descriptive in the training set but are less predictive for points outside that set. For this reason, random subspaces are an attractive choice for high-dimensional problems where the number of features is much larger than the number of training points, such as learning from fMRI data^[3] or gene expression data.^[4]
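
The contrast between the two sampling schemes can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `bagging_sample` and `subspace_sample`, the toy 6 x 10 data matrix and the choice of 4 features per learner are all hypothetical and do not come from the cited papers. Features are drawn with replacement to match the description above; some implementations draw distinct features instead.

```python
import random

def bagging_sample(n_points, rng):
    """Row indices for one bagged learner: n_points draws with replacement."""
    return [rng.randrange(n_points) for _ in range(n_points)]

def subspace_sample(n_features, n_selected, rng):
    """Feature indices for one random-subspace learner (with replacement)."""
    return [rng.randrange(n_features) for _ in range(n_selected)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Toy data matrix: 6 training points, 10 features; entry (i, j) is 10*i + j.
X = [[10 * i + j for j in range(10)] for i in range(6)]

rows = bagging_sample(n_points=6, rng=rng)
cols = subspace_sample(n_features=10, n_selected=4, rng=rng)

# Bagging resamples training points but keeps every feature.
X_bagged = [X[i] for i in rows]

# The random subspace method keeps every training point but only sampled features.
X_subspace = [[row[j] for j in cols] for row in X]
```

Here `X_bagged` still has 10 columns but may repeat or omit rows, while `X_subspace` keeps all 6 rows and exposes only the 4 sampled columns to the learner.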

The random subspace method has been used for decision trees; when combined with "ordinary" bagging of decision trees, the resulting models are called random forests.^[5] It has also been applied to linear classifiers,^[6] support vector machines,^[7] nearest neighbours^[8]^[9] and other types of classifiers. This method is also applicable to one-class classifiers.^[10]^[11] The random subspace method has also been applied to the portfolio selection problem,^[12]^[13]^[14]^[15] where it has been reported to outperform the conventional resampled portfolio, which is essentially based on bagging.

To tackle high-dimensional sparse problems, a framework named Random Subspace Ensemble (RaSE)^[16] was developed. RaSE combines weak learners trained in random subspaces with a two-layer structure and an iterative process.^[17] RaSE has been shown to enjoy appealing theoretical properties and practical performance.^[16]

Algorithm

An ensemble of models employing the random subspace method can be constructed using the following algorithm:

Let the number of training points be N and the number of features in the training data be D.
Let L be the number of individual models in the ensemble.
For each individual model l, choose n_l (n_l < N) to be the number of input points for l. It is common to have only one value of n_l for all the individual models.
For each individual model l, choose d_l, the number of features used by l, then create a training set by choosing d_l of the D features with replacement and train the model on it.
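
The construction steps above can be sketched in plain Python. This is a minimal sketch under several assumptions, not a reference implementation: the base learner `NearestCentroid`, the function `fit_random_subspace` and its parameters are hypothetical names introduced for illustration; every model uses the same number of features; and all N training points are shown to every model, so only the feature-sampling step is demonstrated.

```python
import random
from collections import defaultdict

class NearestCentroid:
    """Toy base learner: predicts the class whose mean vector is closest."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] += 1
        # One centroid (mean vector) per class.
        self.centroids_ = {c: [s / counts[c] for s in acc]
                           for c, acc in sums.items()}
        return self

    def predict_one(self, row):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids_[c]))
        return min(self.centroids_, key=sq_dist)

def fit_random_subspace(X, y, n_models, n_features_per_model, seed=0):
    """Train n_models learners, each on its own random feature sample."""
    rng = random.Random(seed)
    D = len(X[0])  # number of features in the training data
    ensemble = []
    for _ in range(n_models):
        # Choose d_l feature indices from the D features, with replacement.
        features = [rng.randrange(D) for _ in range(n_features_per_model)]
        # Project every training point onto the chosen features.
        X_sub = [[row[j] for j in features] for row in X]
        # Keep the indices: they are needed to project unseen points later.
        ensemble.append((features, NearestCentroid().fit(X_sub, y)))
    return ensemble

# Usage on a tiny made-up data set (N = 4 points, D = 3 features).
X = [[0.0, 0.1, 5.0], [0.2, 0.0, 5.1], [1.0, 0.9, 5.0], [0.9, 1.1, 4.9]]
y = ["a", "a", "b", "b"]
ensemble = fit_random_subspace(X, y, n_models=5, n_features_per_model=2)
```

Storing the feature indices next to each fitted model is a design choice of this sketch: an unseen point has to be projected onto the same features before the corresponding model can score it.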

Now, to apply the ensemble model to an unseen point, combine the outputs of the L individual models by majority voting or by combining the posterior probabilities.
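
The two combination rules can be sketched as follows. The helper names `majority_vote` and `average_posteriors`, the class labels and the probability values are made up for illustration; averaging is used here as one simple way of combining posterior probabilities, and other combination rules exist.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most models.

    Ties go to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]

def average_posteriors(posteriors):
    """Average per-class probabilities over models.

    Returns the winning label and the averaged probabilities.
    """
    classes = posteriors[0].keys()
    mean = {c: sum(p[c] for p in posteriors) / len(posteriors) for c in classes}
    return max(mean, key=mean.get), mean

# Hypothetical outputs of L = 3 models for a single unseen point.
votes = ["spam", "ham", "spam"]
probs = [
    {"spam": 0.60, "ham": 0.40},
    {"spam": 0.10, "ham": 0.90},
    {"spam": 0.55, "ham": 0.45},
]

print(majority_vote(votes))          # spam
print(average_posteriors(probs)[0])  # ham
```

In this example the two rules disagree: two of the three models vote for "spam", but the single confident model pulls the averaged posterior towards "ham".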

Footnotes

^ If each learner follows the same, deterministic, algorithm, the models produced are necessarily all the same.

References

^ Ho, Tin Kam (1998). "The Random Subspace Method for Constructing Decision Forests" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 20 (8): 832–844. doi:10.1109/34.709601. S2CID 206420153. Archived from the original (PDF) on 2019-05-14.
^ Bryll, R. (2003). "Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets". Pattern Recognition. 36 (6): 1291–1302. doi:10.1016/s0031-3203(02)00121-8.
^ Kuncheva, Ludmila; et al. (2010). "Random Subspace Ensembles for fMRI Classification" (PDF). IEEE Transactions on Medical Imaging. 29 (2): 531–542. CiteSeerX 10.1.1.157.1178. doi:10.1109/TMI.2009.2037756. PMID 20129853.
^ Bertoni, Alberto; Folgieri, Raffaella; Valentini, Giorgio (2005). "Bio-molecular cancer prediction with random subspace ensembles of support vector machines" (PDF). Neurocomputing. 63: 535–539. doi:10.1016/j.neucom.2004.07.007. hdl:2434/9370.
^ Ho, Tin Kam (1995). Random Decision Forest (PDF). Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, 14–16 August 1995. pp. 278–282.
^ Skurichina, Marina (2002). "Bagging, boosting and the random subspace method for linear classifiers". Pattern Analysis and Applications. 5 (2): 121–135. doi:10.1007/s100440200011.
^ Tao, D. (2006). "Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 28 (7): 1088–99. doi:10.1109/tpami.2006.134. PMID 16792098.
^ Ho, Tin Kam (1998). "Nearest neighbors in random subspaces". Advances in Pattern Recognition. Lecture Notes in Computer Science. Vol. 1451. pp. 640–648. doi:10.1007/BFb0033288. ISBN 978-3-540-64858-1.
^ Tremblay, G. (2004). Optimizing Nearest Neighbour in Random Subspaces using a Multi-Objective Genetic Algorithm (PDF). 17th International Conference on Pattern Recognition. pp. 208–211. doi:10.1109/ICPR.2004.1334060. ISBN 978-0-7695-2128-2.
^ Nanni, L. (2006). "Experimental comparison of one-class classifiers for online signature verification". Neurocomputing. 69 (7): 869–873. doi:10.1016/j.neucom.2005.06.007.
^ Cheplygina, Veronika; Tax, David M. J. (2011-06-15). "Pruned Random Subspace Method for One-Class Classifiers". In Sansone, Carlo; Kittler, Josef; Roli, Fabio (eds.). Multiple Classifier Systems. Lecture Notes in Computer Science. Vol. 6713. Springer Berlin Heidelberg. pp. 96–105. doi:10.1007/978-3-642-21557-5_12. ISBN 9783642215568.
^ Varadi, David (2013). "Random Subspace Optimization (RSO)". CSS Analytics.
^ Gillen, Ben (2016). "Subset Optimization for Asset Allocation". CaltechAUTHORS.
^ Shen, Weiwei; Wang, Jun (2017), "Portfolio Selection via Subset Resampling", Proceedings of AAAI Conference on Artificial Intelligence (AAAI2017)
^ Shen, Weiwei; Wang, Bin; Pu, Jian; Wang, Jun (2019), "The Kelly growth optimal portfolio with ensemble learning", Proceedings of AAAI Conference on Artificial Intelligence (AAAI2019), 33: 1134–1141, doi:10.1609/aaai.v33i01.33011134
^ Tian, Ye; Feng, Yang (2021). "RaSE: Random Subspace Ensemble Classification". Journal of Machine Learning Research. 22 (45): 1–93. ISSN 1533-7928.
^ Tian, Ye; Feng, Yang (2021). "R Package "RaSEn": Random Subspace Ensemble Classification and Variable Screening". CRAN.
