Methodology and Results for the Competition on Semantic Similarity Evaluation and Entailment Recognition for PROPOR 2016
In this paper, we present the methodology and the results obtained by our team, dubbed Blue Man Group, in the ASSIN (from the Portuguese Avaliação de Similaridade Semântica e Inferência Textual) competition, held at PROPOR 2016 [International Conference on the Computational Processing of the Portuguese Language, http://propor2016.di.fc.ul.pt/]. Our team's strategy consisted of evaluating methods based on semantic word vectors, following two distinct directions: 1) using low-dimensional, compact feature sets, and 2) using deep learning-based strategies that deal with high-dimensional feature vectors. Evaluation results showed that the first strategy was more promising, so the results from the second strategy were discarded. As a result, considering the best run of each of the six teams, we achieved the best accuracy and F1 values in entailment recognition on the Brazilian Portuguese set, and the best F1 score overall. In the semantic similarity task, our team ranked second on the Brazilian Portuguese set and third considering both sets.