Learning Features For Relational Data

01/16/2018
by   Hoang Thanh Lam, et al.

Feature engineering is one of the most important but tedious tasks in data science projects. This work studies the automation of feature learning for relational data. We first prove that learning relevant features from relational data for a given predictive analytics problem is NP-hard. However, we show empirically that an efficient rule-based approach, which predefines transformations a priori based on heuristics, can extract very useful features from relational data. Indeed, the proposed approach outperformed state-of-the-art solutions by a significant margin. We further introduce a deep neural network that automatically learns appropriate transformations of relational data into a representation that predicts the target variable well, rather than relying on transformations predefined by users. In an extensive experiment with Kaggle competitions, the proposed methods could win late medals. To the best of our knowledge, this is the first time an automated system has won medals in Kaggle competitions with complex relational data.
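
To make the idea of predefined transformations concrete, the following is a minimal sketch, not the authors' system. It assumes a toy schema with a main table of customers and a related table of orders, and applies a fixed set of aggregation rules to the related rows of each customer. All table names, column names, and the `aggregate_features` helper are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: a main table with the target ("churned")
# and a related one-to-many table of orders.
customers = [
    {"customer_id": 1, "churned": 0},
    {"customer_id": 2, "churned": 1},
    {"customer_id": 3, "churned": 0},
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 5.0},
]

# Transformations fixed in advance (the "rules"); this set is an assumption.
AGGREGATIONS = {"count": len, "sum": sum, "mean": mean, "max": max}


def aggregate_features(main_rows, child_rows, key, column):
    """Join child rows to main rows on `key` and aggregate `column`."""
    # Collect the values of `column` for each key in the related table.
    groups = defaultdict(list)
    for row in child_rows:
        groups[row[key]].append(row[column])

    result = []
    for row in main_rows:
        values = groups.get(row[key], [])
        features = dict(row)
        for name, fn in AGGREGATIONS.items():
            if values:
                features[f"{column}_{name}"] = fn(values)
            else:
                # No related rows: the count is 0, other aggregates are missing.
                features[f"{column}_{name}"] = 0 if name == "count" else None
        result.append(features)
    return result


features = aggregate_features(customers, orders, "customer_id", "amount")
for row in features:
    print(row)
```

Each output row keeps the original columns and gains one feature per rule, which a standard tabular model could then consume. A learned approach, as described in the abstract, would replace the fixed `AGGREGATIONS` table with transformations fitted to the prediction task.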
