CrossLink

Enhancing Cross-domain Link Prediction via
Evolution Process Modeling

1Zhejiang University, 2Finvolution
*Equal Contribution
WWW 2025

Introduction

This paper proposes CrossLink, a novel framework for cross-domain link prediction. CrossLink learns the evolution pattern of a specific downstream graph and subsequently makes pattern-specific link predictions.

It employs a technique called conditioned link generation, which integrates both evolution and structure modeling to perform evolution-specific link prediction. This conditioned link generation is carried out by a transformer-decoder architecture, enabling efficient parallel training and inference.
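A minimal sketch of the sequence format that conditioned link generation implies: the decoder consumes link-prediction tasks interleaved with their ground-truth labels, so each prediction is conditioned on the graph's observed evolution. All names here are illustrative placeholders, not the paper's API.

```python
# Hypothetical sketch of conditioned link generation (CLG): interleave
# (task, label) pairs into one decoder input sequence so that a causal
# transformer decoder predicts each label conditioned on prior evolution.

def build_clg_sequence(tasks, labels):
    """Interleave (task, label) pairs into one decoder input sequence.

    tasks  : task tokens (e.g. embeddings of candidate node pairs)
    labels : ground-truth link labels (1 = edge formed, 0 = not)
    The final task is left unlabeled -- it is the query to predict.
    """
    seq = []
    for task, label in zip(tasks[:-1], labels):
        seq.append(("TASK", task))
        seq.append(("LABEL", label))
    seq.append(("TASK", tasks[-1]))  # query position: predict its label
    return seq

seq = build_clg_sequence(["t1", "t2", "t3"], [1, 0])
# -> [("TASK","t1"), ("LABEL",1), ("TASK","t2"), ("LABEL",0), ("TASK","t3")]
```

Because every position attends only to earlier tokens, all labels in the sequence can be predicted (and their losses computed) in one parallel decoder pass, which is what enables efficient training and inference.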

CrossLink is trained on extensive dynamic graphs across diverse domains, encompassing 6 million dynamic edges. Extensive experiments on eight untrained graphs demonstrate that CrossLink achieves state-of-the-art performance in cross-domain link prediction.

Compared to advanced baselines under the same settings, CrossLink shows an average improvement of 11.40% in Average Precision across eight graphs. Impressively, it surpasses the fully supervised performance of 8 advanced baselines on 6 untrained graphs.

Background

Dynamic links are widespread in the real world. Although they carry different semantics, they can all be modeled as a link prediction task.


Examples of dynamic links in the real world.

Link prediction (LP) is a crucial task in dynamic graph modeling. However, current methods mainly consider a single-graph setting: the model is trained using supervised learning on a given graph and then makes inferences on the same graph (the End2End setting).

This approach has several notable limitations when applied in real-world scenarios:

(1) High human/time costs: The End2End setting requires independently training different models for each graph. Each training process demands careful design and optimization of hyperparameters by experts. Additionally, the training process is time-consuming.

(2) Unsuitability for small datasets: The End2End setting typically requires a substantial number of samples for satisfactory domain-specific performance. This makes it ill-suited for small-scale application scenarios, such as B2B businesses or situations involving large graphs with limited data.

(3) Inability to learn more knowledge from different applications: Graphs in different applications may contain complementary knowledge. For instance, users purchasing items and users listening to music are both projections of human behavior. Therefore, learning from both user-item graphs and user-music graphs can help the model better understand behavior-related knowledge. However, End2End training is limited to a single graph.

Hence, we propose CrossLink, the first framework for cross-domain link prediction. CrossLink learns the evolution pattern of a specific downstream graph and subsequently makes pattern-specific link predictions.

Method

However, cross-domain link prediction faces a fundamental challenge: how to model ambiguous structures. Graphs evolve independently of one another, meaning the same structure may hold different meanings and evolve differently across various graphs.

As shown in the figure below, Graph A typically follows a triadic closure process, where two nodes with common neighbors are more likely to form edges, while Graph B exhibits a contrasting pattern. Consequently, even if a node pair (the red and blue nodes) in Graph A and Graph B has the same local structure, their ground truths differ. We refer to this type of local structure, which has diverse ground truths across graphs, as an ambiguous structure.
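The ambiguity can be made concrete with a toy example (illustrative, not from the paper): a structure-only predictor scores the red-blue pair identically in both graphs, yet the ground truths disagree.

```python
# Illustrative example of an "ambiguous structure": two graphs where the
# red-blue pair has an identical local structure (one common neighbor),
# yet the ground-truth future edge differs between the graphs.

def common_neighbors(adj, u, v):
    """Structure-only signal: shared neighbors of u and v."""
    return adj[u] & adj[v]

# Graph A: triadic closure -- the common neighbor leads to a new edge.
graph_a = {"red": {"green"}, "blue": {"green"}, "green": {"red", "blue"}}
# Graph B: identical local structure around the pair, but no edge forms.
graph_b = {"red": {"green"}, "blue": {"green"}, "green": {"red", "blue"}}

# Hypothetical ground truths: does a future red-blue edge appear?
ground_truth = {"A": 1, "B": 0}

# A structure-only predictor must output the same score for both graphs...
assert common_neighbors(graph_a, "red", "blue") == common_neighbors(graph_b, "red", "blue")
# ...so it cannot match both ground truths: local structure alone is ambiguous.
```

Only extra conditioning information, such as each graph's observed evolution, can break this tie.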

Current methods usually assume a single-graph setting, predicting future edges between two nodes solely from their local structure.

(a) shows a case of structure conflict: Graph A follows a triadic closure process, while Graph B exhibits a contrasting process. (b) shows that current methods cannot resolve this conflict, and (c) shows how prediction via evolution modeling resolves it.

Therefore, these methods struggle to effectively model ambiguous structures in the cross-domain setting. This limitation not only impedes the model's ability to accurately learn the meaning of ambiguous structures across multiple graphs but also hinders its capability to infer future edges correctly in target graphs, especially when those graphs contain many ambiguous structures.

To address these challenges, CrossLink adopts the following approach:


Framework of CrossLink. (a) Models the graph's evolution process via a sequence of link prediction tasks with ground truths; (b) Evolution-specific link prediction based on both nodes' representations and the evolution process.

The CrossLink process follows these key steps:

  • Choose one ego-graph from a random domain for analysis and processing
  • Sort all links based on their temporal appearance in the graph
  • Generate embeddings for each graph structure
  • Process the embeddings through a Transformer architecture for analysis
  • Generate label predictions through parallel computation
  • Calculate and optimize loss functions in parallel
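The steps above can be sketched as a small data-preparation routine. This is a hedged sketch of the training loop they imply; all names are illustrative placeholders, not the authors' implementation.

```python
# Hypothetical sketch of CrossLink-style sequence preparation: sample an
# ego-graph, order its links by time, embed them, and hand the sequence
# to a causal decoder for parallel label prediction and loss computation.

def prepare_training_sequence(ego_graph_links, embed, max_len=512):
    """Turn one ego-graph into a chronologically ordered sequence of
    link-prediction tasks, ready for parallel decoding.

    ego_graph_links : list of (src, dst, timestamp, label) tuples
    embed           : function mapping a (src, dst) pair to an embedding
    """
    # Step 2: sort all links by their temporal appearance in the graph
    ordered = sorted(ego_graph_links, key=lambda link: link[2])[:max_len]
    # Step 3: generate an embedding for each link's local structure
    tokens = [embed((src, dst)) for src, dst, _, _ in ordered]
    labels = [label for _, _, _, label in ordered]
    # Steps 4-6 would feed `tokens` through a causal transformer decoder,
    # predicting every label in parallel and averaging the losses.
    return tokens, labels

toy_links = [("a", "b", 3, 1), ("b", "c", 1, 0), ("a", "c", 2, 1)]
tokens, labels = prepare_training_sequence(toy_links, embed=lambda pair: pair)
# links re-ordered by timestamp: (b,c,1), (a,c,2), (a,b,3)
```

Because the decoder is causal, every position's label can be supervised simultaneously, which is what makes the training in steps 5-6 parallel rather than one prediction per forward pass.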

Results

Design Advantages:

  • Enhanced efficiency in processing and prediction
  • Improved generalization through in-context learning, adapting NLP concepts to cross-domain modeling

Key Insights:

  • Link prediction demonstrates consistency across various domains, making it a universally applicable task
  • CrossLink achieves significant improvements in cross-domain link prediction performance
  • The model surpasses fully supervised performance metrics across six distinct datasets

Performance of various methods on cross-domain link prediction. We report Average Precision (averaged over 3 runs; % omitted) across eight graphs.


  • Multi-domain training enhances model generalization, with CLG effectively modeling differences between domains
  • Extended evolution periods on target datasets improve prediction accuracy, highlighting the value of historical data

Analysis results of CrossLink regarding multi-domain training. (a) shows the results of ablation studies, where ``w/o'' removes a certain component of our model. (b) shows the performance of CrossLink with different maximum sequence lengths (for both training and inference). (c) shows the performance on evaluated graphs of models trained solely on a specific graph.


  • Optimal hidden size varies according to training dataset size, requiring dynamic adjustment
  • CrossLink shows scaling potential similar to prompt tuning in GPT, where performance improves with more prompts and in-context cases

Performance of CrossLink under diverse settings. (a) shows that the model's performance improves with more training samples. (b) shows that CrossLink's performance is influenced by the number of training graphs. (c) shows the best hidden size of the model under 6M training samples. (d) further shows the best hidden size under different numbers of training samples.

BibTeX

@misc{huang2024graphmodelcrossdomaindynamic,
  title={One Graph Model for Cross-domain Dynamic Link Prediction}, 
  author={Xuanwen Huang and Wei Chow and Yang Wang and Ziwei Chai and Chunping Wang and Lei Chen and Yang Yang},
  year={2024},
  eprint={2402.02168},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2402.02168}, 
}