Gender Recognition in Informal and Formal Language Scenarios via Transfer Learning
The interest in demographic information retrieval based on text data has increased in the research community because applications have shown success in different sectors such as security, marketing, heath-care, and others. Recognition and identification of demographic traits such as gender, age, location, or personality based on text data can help to improve different marketing strategies. For instance it makes it possible to segment and to personalize offers, thus products and services are exposed to the group of greatest interest. This type of technology has been discussed widely in documents from social media. However, the methods have been poorly studied in data with a more formal structure, where there is no access to emoticons, mentions, and other linguistic phenomena that are only present in social media. This paper proposes the use of recurrent and convolutional neural networks, and a transfer learning strategy for gender recognition in documents that are written in informal and formal languages. Models are tested in two different databases consisting of Tweets and call-center conversations. Accuracies of up to 75% are achieved for both databases. The results also indicate that it is possible to transfer the knowledge from a system trained on a specific type of expressions or idioms such as those typically used in social media into a more formal type of text data, where the amount of data is more scarce and its structure is completely different.
READ FULL TEXT