Advances in deep learning have greatly improved structure prediction of
...
Image rescaling is a commonly used bidirectional operation, which first
...
Unlike vision and language data which usually has a unique format, molec...
The accurate protein-ligand binding affinity prediction is essential in ...
Recent years have witnessed significant success in Gradient Boosting Dec...
Relative Positional Encoding (RPE), which encodes the relative distance
...
This technical note describes the recent updates of Graphormer, includin...
This technical note describes the recent updates of Graphormer, includin...
The attention module, which is a crucial component in Transformer, canno...
In this technical report, we present our solution of KDD Cup 2021 OGB
La...
The Transformer architecture has become a dominant choice in many domain...
Semantic understanding of programs is a fundamental problem for programm...
Transformer has demonstrated its great power to learn contextual word
re...
Lossy image compression is one of the most commonly used operators for
d...
Pre-trained contextual representations (e.g., BERT) have become the
foun...
High-resolution digital images are usually downscaled to fit various dis...
A well-known issue of Batch Normalization is its significantly reduced
e...
The Transformer is widely used in natural language processing tasks. To ...
Recently, path norm was proposed as a new capacity measure for neural
ne...
It has been widely observed that many activation functions and pooling
m...