Small-world networks for summarization of biomedical articles
In recent years, many methods have been developed to identify important portions of text documents. Summarization tools can use these methods to extract summaries from large volumes of textual information. However, identifying the concepts that represent the central ideas of a text document and extracting the most informative sentences that best convey those concepts remain two crucial tasks in summarization. In this paper, we introduce a graph-based method that addresses these two challenges in the context of biomedical text summarization. We show how a summarizer can discover meaningful concepts within a biomedical text document using the Helmholtz principle. The summarizer treats the meaningful concepts as the main topics and constructs a graph based on the topics that the sentences share. It then produces an informative summary by extracting the sentences with the highest degree values. We assess the performance of our method on the summarization of biomedical articles using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit. The results show that degree can be a useful centrality measure for identifying important sentences in this type of graph-based modelling. Our method can improve the performance of biomedical text summarization compared to some state-of-the-art and publicly available summarizers. By combining a concept-based modelling strategy with a graph-based approach to sentence extraction, our summarizer produces summaries with the highest informativeness scores among the comparison methods. This work can be regarded as a starting point for the study of small-world networks in the summarization of biomedical texts.
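The graph construction and degree-based extraction described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the meaningful topics of each sentence have already been found (the Helmholtz-principle step is not shown), and it assumes that two sentences are linked whenever they share at least one topic. The function name `summarize_by_degree` and the toy topic sets are hypothetical.

```python
from itertools import combinations


def summarize_by_degree(sentence_topics, k):
    """Pick the k sentences with the highest degree in a shared-topic graph.

    sentence_topics: list of sets, one set of topic labels per sentence.
    Assumption: an undirected edge joins two sentences that share
    at least one topic; the source does not state the exact edge rule.
    Returns (selected sentence indices in document order, degree list).
    """
    n = len(sentence_topics)
    degree = [0] * n
    for i, j in combinations(range(n), 2):
        if sentence_topics[i] & sentence_topics[j]:
            degree[i] += 1
            degree[j] += 1
    # Rank by descending degree; break ties by earlier position.
    ranked = sorted(range(n), key=lambda i: (-degree[i], i))
    return sorted(ranked[:k]), degree


# Toy input: topics per sentence are invented for illustration only.
topics = [{"gene", "protein"}, {"protein"}, {"dose"}, {"gene", "dose"}]
selected, degree = summarize_by_degree(topics, k=2)
print(selected, degree)
```

Returning the selected indices in document order keeps the extracted sentences in their original sequence, which is a common choice for extractive summaries but is likewise an assumption here.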