genCNN: A Convolutional Architecture for Word Sequence Prediction

03/17/2015
by Mingxuan Wang, et al.

We propose a novel convolutional architecture, named genCNN, for word sequence prediction. Unlike previous work on neural network-based language modeling and generation (e.g., RNN or LSTM), we choose not to greedily summarize the history of words as a fixed-length vector. Instead, we use a convolutional neural network to predict the next word from a word history of variable length. Also unlike existing feedforward networks for language modeling, our model can effectively fuse the local and global correlations in the word sequence, using a convolution-gating strategy specifically designed for the task. We argue that our model gives an adequate representation of the history and can therefore naturally exploit both short- and long-range dependencies. Our model is fast, easy to train, and readily parallelized. Our extensive experiments on text generation and n-best re-ranking in machine translation show that genCNN outperforms the state of the art by large margins.
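The general idea of a gated convolution over a variable-length word history can be illustrated with a toy NumPy sketch. This is not the genCNN architecture, whose layers and gating details are not given in the abstract. All sizes, weight names, and the function `next_word_distribution` are hypothetical, the weights are random rather than trained, and the max-pooling step is a simplification chosen only so that histories of any length map to one next-word distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes (not taken from the paper).
VOCAB, DIM, WIDTH = 50, 8, 3

# Randomly initialised parameters; a real model would learn these.
E = rng.normal(scale=0.1, size=(VOCAB, DIM))              # word embeddings
W_conv = rng.normal(scale=0.1, size=(WIDTH * DIM, DIM))   # convolution filter
W_gate = rng.normal(scale=0.1, size=(WIDTH * DIM, DIM))   # gating filter
W_out = rng.normal(scale=0.1, size=(DIM, VOCAB))          # output projection


def next_word_distribution(history):
    """Return a probability vector over the vocabulary for the next word.

    `history` is a sequence of word ids of any length >= 1.
    """
    x = E[np.asarray(history)]                            # (T, DIM)
    # Left-pad with zeros so every position has a full window of WIDTH words.
    x = np.vstack([np.zeros((WIDTH - 1, DIM)), x])        # (T + WIDTH - 1, DIM)
    windows = np.stack(
        [x[t:t + WIDTH].ravel() for t in range(len(history))]
    )                                                     # (T, WIDTH * DIM)
    # Sigmoid gate decides how much of each local feature passes through.
    gate = 1.0 / (1.0 + np.exp(-(windows @ W_gate)))
    local = np.tanh(windows @ W_conv) * gate              # (T, DIM)
    # Simplification: max-pool over positions to combine local features.
    pooled = local.max(axis=0)                            # (DIM,)
    logits = pooled @ W_out
    logits -= logits.max()                                # numerical stability
    p = np.exp(logits)
    return p / p.sum()


# The same function accepts short and long histories.
p_short = next_word_distribution([3, 7])
p_long = next_word_distribution([3, 7, 1, 4, 9, 2, 8, 5, 6])
print(p_short.shape, p_long.shape)
```

Because the same filters slide over every window, all positions can be computed independently, which is one reason convolutional models of this kind are easy to parallelize.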


research
03/23/2017

Sequential Recurrent Neural Networks for Language Modeling

Feedforward Neural Network (FNN)-based language models estimate the prob...
research
11/06/2022

Suffix Retrieval-Augmented Language Modeling

Causal language modeling (LM) uses word history to predict the next word...
research
08/22/2017

Long-Short Range Context Neural Networks for Language Modeling

The goal of language modeling techniques is to capture the statistical a...
research
05/06/2015

A Fixed-Size Encoding Method for Variable-Length Sequences with its Application to Neural Network Language Models

In this paper, we propose the new fixed-size ordinally-forgetting encodi...
research
11/06/2017

Distributed Representation for Traditional Chinese Medicine Herb via Deep Learning Models

Traditional Chinese Medicine (TCM) has accumulated a big amount of preci...
research
04/21/2022

Sequence-Based Target Coin Prediction for Cryptocurrency Pump-and-Dump

As the pump-and-dump schemes (P&Ds) proliferate in the cryptocurrency ...
research
11/02/2019

FCEM: A Novel Fast Correlation Extract Model For Real Time Steganalysis of VoIP Stream via Multi-head Attention

Extracting correlation features between codes-words with high computatio...
