Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning

10/11/2022
by Alex J. Chan, et al.

Consider making predictions on new test data with no opportunity to learn from a labelled training set; instead, we are given access to a set of expert models and their predictions, alongside some limited information about the data used to train them. In scenarios from finance to the medical sciences, and even consumer practice, stakeholders have developed models on private data that they either cannot or do not want to share. Given the value of personal information and the legislation surrounding it, it is not surprising that only the models, and not the data, will be released. The pertinent question then becomes: how best to use these models? Previous work has focused on global model selection or ensembling, which results in a single final model across the whole feature space. However, machine learning models perform notoriously poorly on data outside their training domain, and so we argue that when ensembling models, the weightings for individual instances must reflect the models' respective domains. In other words, models that are more likely to have seen information relevant to a given instance should have more attention paid to them. We introduce a method for such instance-wise ensembling of models, including a novel representation learning step for handling sparse, high-dimensional domains. Finally, we demonstrate the need for, and the generalisability of, our method on classical machine learning tasks, and highlight a real-world use case in the pharmacological setting of vancomycin precision dosing.
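To make the idea of instance-wise weighting concrete, here is a minimal sketch in Python. It is not the paper's method: it assumes, purely for illustration, that the "limited information" about each model's training data is a per-feature mean and standard deviation, and it uses an axis-aligned Gaussian density as a stand-in for how likely a model is to have seen a given instance. The representation learning step mentioned in the abstract is omitted entirely. All names (`gaussian_log_density`, `instance_weights`, `ensemble_predict`, `models`, `domains`) are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# Assumed domain summary: (per-feature mean, per-feature standard deviation).
Domain = Tuple[Vector, Vector]


def gaussian_log_density(x: Vector, mean: Vector, std: Vector) -> float:
    """Log-density of x under an axis-aligned Gaussian domain summary."""
    return sum(
        -0.5 * ((xi - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for xi, m, s in zip(x, mean, std)
    )


def instance_weights(x: Vector, domains: Sequence[Domain]) -> List[float]:
    """Normalise the per-model densities at x so that they sum to one."""
    logs = [gaussian_log_density(x, mean, std) for mean, std in domains]
    top = max(logs)  # subtract the maximum for numerical stability
    unnormalised = [math.exp(value - top) for value in logs]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def ensemble_predict(
    x: Vector,
    models: Sequence[Callable[[Vector], float]],
    domains: Sequence[Domain],
) -> float:
    """Weight each model's prediction by how well its domain covers x."""
    weights = instance_weights(x, domains)
    return sum(w * model(x) for w, model in zip(weights, models))


# Two toy "expert models" with constant predictions, trained (by assumption)
# on data centred at 0 and at 10 respectively.
models = [lambda x: 1.0, lambda x: 5.0]
domains = [([0.0], [1.0]), ([10.0], [1.0])]

near_first = ensemble_predict([0.0], models, domains)  # dominated by the first model
midway = ensemble_predict([5.0], models, domains)      # both models weighted equally
```

A global ensemble would apply the same weights at every point. In this sketch the weights change with the instance, so a test point near one model's training domain is predicted almost entirely by that model.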


research
07/06/2022

DIWIFT: Discovering Instance-wise Influential Features for Tabular Data

Tabular data is one of the most common data storage formats in business ...
research
05/25/2023

Ensemble Synthetic EHR Generation for Increasing Subpopulation Model's Performance

Electronic health records (EHR) often contain different rates of represe...
research
11/12/2021

Scalable Diverse Model Selection for Accessible Transfer Learning

With the preponderance of pretrained deep learning models available off-...
research
01/23/2023

RF+clust for Leave-One-Problem-Out Performance Prediction

Per-instance automated algorithm configuration and selection are gaining...
research
09/14/2020

Adaptive Generation Model: A New Ensemble Method

As a common method in Machine Learning, Ensemble Method is used to train...
research
12/22/2019

Unsupervised Representation Learning by Predicting Random Distances

Deep neural networks have gained tremendous success in a broad range of ...
research
09/16/2020

Transformer Based Multi-Source Domain Adaptation

In practical machine learning settings, the data on which a model must m...
