CaT: Balanced Continual Graph Learning with Graph Condensation
Continual graph learning (CGL) is purposed to continuously update a graph model with graph data being fed in a streaming manner. Since the model easily forgets previously learned knowledge when training with new-coming data, the catastrophic forgetting problem has been the major focus in CGL. Recent replay-based methods intend to solve this problem by updating the model using both (1) the entire new-coming data and (2) a sampling-based memory bank that stores replayed graphs to approximate the distribution of historical data. After updating the model, a new replayed graph sampled from the incoming graph will be added to the existing memory bank. Despite these methods are intuitive and effective for the CGL, two issues are identified in this paper. Firstly, most sampling-based methods struggle to fully capture the historical distribution when the storage budget is tight. Secondly, a significant data imbalance exists in terms of the scales of the complex new-coming graph data and the lightweight memory bank, resulting in unbalanced training. To solve these issues, a Condense and Train (CaT) framework is proposed in this paper. Prior to each model update, the new-coming graph is condensed to a small yet informative synthesised replayed graph, which is then stored in a Condensed Graph Memory with historical replay graphs. In the continual learning phase, a Training in Memory scheme is used to update the model directly with the Condensed Graph Memory rather than the whole new-coming graph, which alleviates the data imbalance problem. Extensive experiments conducted on four benchmark datasets successfully demonstrate superior performances of the proposed CaT framework in terms of effectiveness and efficiency. The code has been released on https://github.com/superallen13/CaT-CGL.
READ FULL TEXT