Few-shot Learning for Cross-Target Stance Detection by Aggregating Multimodal Embeddings
Despite the increasing popularity of the stance detection task, existing approaches are predominantly limited to using the textual content of social media posts for classification, overlooking the social nature of the task. Stance detection becomes particularly challenging in cross-target classification scenarios, where, even in few-shot training settings, the model must predict the stance towards new targets for which it has seen only a few relevant samples during training. To address cross-target stance detection in social media by leveraging the social nature of the task, we introduce CT-TN, a novel model that aggregates multimodal embeddings derived from both textual and network features of the data. We conduct experiments in a few-shot cross-target scenario on six different combinations of source-destination target pairs. Comparing CT-TN with state-of-the-art cross-target stance detection models, we demonstrate its effectiveness, achieving average performance improvements ranging from 11 across different baseline models. Experiments with different numbers of shots show that CT-TN can outperform other models after seeing 300 instances of the destination target. Further, ablation experiments demonstrate the positive contribution of each component of CT-TN towards the final performance. We also analyse the network interactions between social media users, which reveal the potential of using social features for cross-target stance detection.