[ICLR 2015] Multiple Object Recognition with Visual Attention

Keywords: attention, reinforcement learning, multiple object recognition
Jimmy Ba, Volodymyr Mnih, Koray Kavukcuoglu
paper: https://arxiv.org/abs/1412.7755
We present an attention-based model for recognizing multiple objects in images.
The proposed model is a deep recurrent neural network trained with reinforcement
learning to attend to the most relevant regions of the input image. We show that the
model learns to both localize and recognize multiple objects despite being given
only class labels during training. We evaluate the model on the challenging task of
transcribing house number sequences from Google Street View images and show
that it is both more accurate than the state-of-the-art convolutional networks and
uses fewer parameters and less computation.
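
To make "attend to the most relevant regions of the input image" concrete, here is a minimal sketch of the one step the abstract implies: cropping a small patch (a "glimpse") around a chosen location. The abstract does not specify how glimpses are extracted, so everything here is an assumption for illustration: the function name `extract_glimpse`, the normalized [-1, 1] coordinate convention, the single-resolution square crop, and the zero padding at the borders. The recurrent network that picks the next location and the reinforcement learning update are left out.

```python
def extract_glimpse(image, center, size):
    """Return a size x size patch of `image` centered at `center`.

    Hypothetical helper, not the authors' implementation.
    image  : list of rows (list of lists of numbers)
    center : (row, col) in normalized coordinates, each in [-1, 1]
    size   : side length of the square patch in pixels
    Pixels that fall outside the image are filled with 0.
    """
    height, width = len(image), len(image[0])
    # Map normalized coordinates to pixel indices.
    center_row = round((center[0] + 1) / 2 * (height - 1))
    center_col = round((center[1] + 1) / 2 * (width - 1))
    top = center_row - size // 2
    left = center_col - size // 2
    patch = []
    for r in range(top, top + size):
        row = []
        for c in range(left, left + size):
            inside = 0 <= r < height and 0 <= c < width
            row.append(image[r][c] if inside else 0)
        patch.append(row)
    return patch


# Usage: a 5x5 toy "image" whose pixel value encodes its position.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
center_patch = extract_glimpse(image, (0.0, 0.0), 3)    # middle of the image
corner_patch = extract_glimpse(image, (-1.0, -1.0), 3)  # top-left corner, zero padded
```

In a model of the kind the abstract describes, the location passed as `center` would be produced by the recurrent network at each step rather than fixed by hand, so the network only ever processes small patches instead of the full image.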
