126 Landmark Deep Learning Papers, Sorted by Category, from Getting Started to Applications | Practical Guide

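If you are truly determined to work in deep learning and do not want to be a mere bystander in the field, studying the papers of leading researchers is an unavoidable step. As a newcomer, your first question is probably: "With so many papers out there, which one should I read first?"

This article tries to answer that question. Its original title was "From Getting Started to Despair: The Endless Stream of Deep Learning Papers". Get your props ready and assume the pose of the legendary diligent scholar, hair tied to the roof beam and an awl poised over the thigh to stay awake.

Just kidding.

For developers without a formal background in the field, though, reading papers really can be painful. The good news: to keep beginners from getting lost, a top student who goes by songrotek has published a deep learning roadmap on GitHub. It sorts the DL papers that newcomers most need to read into categories and gives each paper a star rating by importance.

To date, this DL paper roadmap has collected nearly ten thousand stars on GitHub and is extremely popular. Leiphone (雷锋网, WeChat official account: 雷锋网) feels it is well worth introducing to readers.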

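Without further ado, the roadmap is organized according to the following four principles:

  • From outline to detail

  • From classic to cutting-edge

  • From general to specific areas

  • Focus on the latest research breakthroughs

Author's note: many of the papers are very new but well worth reading.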

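1 Deep Learning History and Basics

1.0 Books

[0] Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. "Deep learning." An MIT Press book. (2015). [pdf] (The deep learning textbook by Ian Goodfellow and other leading researchers, regarded as the bible of deep learning. You can study this book alongside the papers below.)  ★★★★★

Link: https://github.com/HFTrader/DeepLearningBook/raw/master/DeepLearningBook.pdf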

1.1 Survey

[1] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444. [pdf] (A survey by the three giants)  ★★★★★

Link: http://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf

1.2 Deep Belief Network (DBN, a milestone on the eve of deep learning)

[2] Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18.7 (2006): 1527-1554. [pdf] (The eve of deep learning)  ★★★

Link: http://www.cs.toronto.edu/~hinton/absps/ncfast.pdf

[3] Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507. [pdf] (Milestone, showed the promise of deep learning)  ★★★

Link: http://www.cs.toronto.edu/~hinton/science.pdf

1.3 ImageNet Evolution (where deep learning took off)

[4] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. [pdf] (AlexNet, the deep learning breakthrough)  ★★★★★

Link: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

[5] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014). [pdf] (VGGNet, neural networks become very deep)  ★★★

Link: https://arxiv.org/pdf/1409.1556.pdf

[6] Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. [pdf] (GoogLeNet)  ★★★

Link: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

[7] He, Kaiming, et al. "Deep residual learning for image recognition." arXiv preprint arXiv:1512.03385 (2015). [pdf] (ResNet, extremely deep networks, CVPR best paper)  ★★★★★

Link: https://arxiv.org/pdf/1512.03385.pdf

1.4 Speech Recognition Evolution

[8] Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." IEEE Signal Processing Magazine 29.6 (2012): 82-97. [pdf] (Breakthrough in speech recognition)  ★★★★

Link: http://cs224d.stanford.edu/papers/maas_paper.pdf

[9] Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. "Speech recognition with deep recurrent neural networks." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013. [pdf] (RNN)  ★★★

Link: http://arxiv.org/pdf/1303.5778.pdf

[10] Graves, Alex, and Navdeep Jaitly. "Towards End-To-End Speech Recognition with Recurrent Neural Networks." ICML. Vol. 14. 2014. [pdf]  ★★★

Link: http://www.jmlr.org/proceedings/papers/v32/graves14.pdf

[11] Sak, Haşim, et al. "Fast and accurate recurrent neural network acoustic models for speech recognition." arXiv preprint arXiv:1507.06947 (2015). [pdf] (Google's speech recognition system)  ★★★

Link: http://arxiv.org/pdf/1507.06947

[12] Amodei, Dario, et al. "Deep speech 2: End-to-end speech recognition in english and mandarin." arXiv preprint arXiv:1512.02595 (2015). [pdf] (Baidu's speech recognition system)  ★★★★

Link: https://arxiv.org/pdf/1512.02595.pdf

[13] W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig. "Achieving Human Parity in Conversational Speech Recognition." arXiv preprint arXiv:1610.05256 (2016). [pdf] (State of the art in speech recognition, Microsoft)  ★★★★

Link: https://arxiv.org/pdf/1610.05256v1

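After reading the papers above, you will have a basic understanding of the history of deep learning and of the basic model architectures (including CNN, RNN, and LSTM), and you will see how deep learning is applied to image and speech recognition. The papers that follow take you deeper into deep learning methods, applications in different areas, and the cutting edge. I suggest reading selectively, according to your interests and your work or research direction.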

2 Deep Learning Methods

2.1 Models

[14] Hinton, Geoffrey E., et al. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012). [pdf] (Dropout)  ★★★

Link: https://arxiv.org/pdf/1207.0580.pdf

[15] Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." Journal of Machine Learning Research 15.1 (2014): 1929-1958. [pdf]  ★★★

Link: http://www.jmlr.org/papers/volume15/srivastava14a.old/source/srivastava14a.pdf

[16] Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015). [pdf] (An outstanding work of 2015)  ★★★★

Link: http://arxiv.org/pdf/1502.03167

[17] Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016). [pdf] (An update of Batch Normalization)  ★★★★

Link: https://arxiv.org/pdf/1607.06450.pdf

[18] Courbariaux, Matthieu, et al. "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1." [pdf] (New model, fast)  ★★★

Link: https://pdfs.semanticscholar.org/f832/b16cb367802609d91d400085eb87d630212a.pdf

[19] Jaderberg, Max, et al. "Decoupled neural interfaces using synthetic gradients." arXiv preprint arXiv:1608.05343 (2016). [pdf] (An innovation in training methods, very good work)  ★★★★★

Link: https://arxiv.org/pdf/1608.05343

[20] Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. "Net2net: Accelerating learning via knowledge transfer." arXiv preprint arXiv:1511.05641 (2015). [pdf] (Modifies a previously trained network to shorten training time)  ★★★

Link: https://arxiv.org/abs/1511.05641

[21] Wei, Tao, et al. "Network Morphism." arXiv preprint arXiv:1603.01670 (2016). [pdf] (Modifies a previously trained network to shorten training time)  ★★★

Link: https://arxiv.org/abs/1603.01670
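For readers wondering how [16] and [17] differ, the following is a minimal sketch in plain Python, not code from either paper. It only shows which axis the statistics are taken over: batch normalization standardizes each feature across the examples in a batch, while layer normalization standardizes each example across its own features. The learned scale and shift parameters and the running statistics used at inference time are deliberately left out, and `toy_batch` is a made-up input.

```python
import math


def normalize(values, eps=1e-5):
    """Shift and scale a list of floats to zero mean and (nearly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def batch_norm(batch):
    """Normalize each feature across the examples in the batch (column-wise)."""
    columns = [normalize(list(col)) for col in zip(*batch)]
    # Transpose back so the result has the same row/column layout as the input.
    return [list(row) for row in zip(*columns)]


def layer_norm(batch):
    """Normalize each example across its own features (row-wise)."""
    return [normalize(row) for row in batch]


# Hypothetical toy batch: 3 examples with 2 features each.
toy_batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
bn = batch_norm(toy_batch)
ln = layer_norm(toy_batch)
```

Because layer normalization never looks across examples, its result for one example does not depend on the batch size, which is part of why it is convenient for recurrent models.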

2.2 Optimization

[22] Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." ICML (3) 28 (2013): 1139-1147. [pdf] (Momentum optimizer)  ★★

Link: http://www.jmlr.org/proceedings/papers/v28/sutskever13.pdf

[23] Kingma, Diederik, and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014). [pdf] (Probably the most commonly used optimizer at present)  ★★★

Link: http://arxiv.org/pdf/1412.6980

[24] Andrychowicz, Marcin, et al. "Learning to learn by gradient descent by gradient descent." arXiv preprint arXiv:1606.04474 (2016). [pdf] (Neural optimizer, amazing work)  ★★★★★

Link: https://arxiv.org/pdf/1606.04474

[25] Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding." CoRR, abs/1510.00149 (2015). [pdf] (ICLR best paper, a new direction for making NNs run fast, DeePhi Tech startup)  ★★★★★

Link: https://pdfs.semanticscholar.org/5b6c/9dda1d88095fa4aac1507348e498a1f2e863.pdf

[26] Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size." arXiv preprint arXiv:1602.07360 (2016). [pdf] (Another new direction for optimizing NNs, DeePhi Tech startup)  ★★★★

Link: http://arxiv.org/pdf/1602.07360
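As a rough companion to [22] and [23], here is a small sketch of the classical momentum update and the Adam update applied to a one-dimensional toy problem. The objective f(x) = (x - 3)^2, the starting point, the step counts, and the learning rates are all assumptions chosen for illustration; the Adam defaults for beta1, beta2, and eps follow the commonly quoted values.

```python
import math


def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2 (an assumed example)."""
    return 2.0 * (x - 3.0)


def sgd_momentum(x, steps=200, lr=0.1, mu=0.9):
    """Classical momentum: the velocity accumulates a decaying sum of past gradients."""
    velocity = 0.0
    for _ in range(steps):
        velocity = mu * velocity - lr * grad(x)
        x = x + velocity
    return x


def adam(x, steps=2000, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: per-step scaling by bias-corrected first and second moment estimates."""
    m = 0.0  # running mean of gradients
    v = 0.0  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        x = x - lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


x_momentum = sgd_momentum(0.0)
x_adam = adam(0.0)
```

Both runs should end close to the minimizer x = 3. The point of the sketch is the shape of the two update rules, not a comparison of their speed.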

2.3 Unsupervised Learning / Deep Generative Models

[27] Le, Quoc V. "Building high-level features using large scale unsupervised learning." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013. [pdf] (Milestone, Andrew Ng, Google Brain, Cat)  ★★★★

Link: http://arxiv.org/pdf/1112.6209.pdf

[28] Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013). [pdf] (VAE)  ★★★★

Link: http://arxiv.org/pdf/1312.6114

[29] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014. [pdf] (GAN, a very cool idea)  ★★★★★

Link: http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf

[30] Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015). [pdf] (DCGAN)  ★★★★

Link: http://arxiv.org/pdf/1511.06434

[31] Gregor, Karol, et al. "DRAW: A recurrent neural network for image generation." arXiv preprint arXiv:1502.04623 (2015). [pdf] (VAE with attention, outstanding work)  ★★★★★

Link: http://jmlr.org/proceedings/papers/v37/gregor15.pdf

[32] Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016). [pdf] (PixelRNN)  ★★★★

Link: http://arxiv.org/pdf/1601.06759

[33] Oord, Aaron van den, et al. "Conditional image generation with PixelCNN decoders." arXiv preprint arXiv:1606.05328 (2016). [pdf] (PixelCNN)  ★★★★

Link: https://arxiv.org/pdf/1606.05328

2.4 Recurrent Neural Network (RNN) / Sequence-to-Sequence Model

[34] Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013). [pdf] (LSTM, works very well, shows the power of RNNs)  ★★★★

Link: http://arxiv.org/pdf/1308.0850

[35] Cho, Kyunghyun, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014). [pdf] (The first Sequence-to-Sequence paper)  ★★★★

Link: http://arxiv.org/pdf/1406.1078

[36] Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014. [pdf] (Outstanding work)  ★★★★★

Link: http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

[37] Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate." arXiv preprint arXiv:1409.0473 (2014). [pdf]  ★★★★

Link: https://arxiv.org/pdf/1409.0473v7.pdf

[38] Vinyals, Oriol, and Quoc Le. "A neural conversational model." arXiv preprint arXiv:1506.05869 (2015). [pdf] (Seq-to-Seq chatbot)  ★★★

Link: http://arxiv.org/pdf/1506.05869.pdf

2.5 Neural Turing Machine

[39] Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014). [pdf] (A basic prototype of the computer of the future)  ★★★★★

Link: http://arxiv.org/pdf/1410.5401.pdf

[40] Zaremba, Wojciech, and Ilya Sutskever. "Reinforcement learning neural Turing machines." arXiv preprint arXiv:1505.00521 (2015). [pdf]  ★★★

Link: https://pdfs.semanticscholar.org/f10e/071292d593fef939e6ef4a59baf0bb3a6c2b.pdf

[41] Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014). [pdf]  ★★★

Link: http://arxiv.org/pdf/1410.3916

[42] Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015. [pdf]  ★★★★

Link: http://papers.nips.cc/paper/5846-end-to-end-memory-networks.pdf

[43] Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. "Pointer networks." Advances in Neural Information Processing Systems. 2015. [pdf]  ★★★★

Link: http://papers.nips.cc/paper/5866-pointer-networks.pdf

[44] Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature (2016). [pdf] (Milestone, combines the ideas of the papers above)  ★★★★★

Link: https://www.dropbox.com/s/0a40xi702grx3dq/2016-graves.pdf

2.6 Deep Reinforcement Learning

[45] Mnih, Volodymyr, et al. "Playing atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602 (2013). [pdf] (The first paper with deep reinforcement learning in its title)  ★★★★

Link: http://arxiv.org/pdf/1312.5602.pdf

[46] Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533. [pdf] (Milestone)  ★★★★★

Link: https://storage.googleapis.com/deepmind-data/assets/papers/DeepMindNature14236Paper.pdf

[47] Wang, Ziyu, Nando de Freitas, and Marc Lanctot. "Dueling network architectures for deep reinforcement learning." arXiv preprint arXiv:1511.06581 (2015). [pdf] (ICLR best paper, a great idea)  ★★★★

Link: http://arxiv.org/pdf/1511.06581

[48] Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." arXiv preprint arXiv:1602.01783 (2016). [pdf] (State-of-the-art method)  ★★★★★

Link: http://arxiv.org/pdf/1602.01783

[49] Lillicrap, Timothy P., et al. "Continuous control with deep reinforcement learning." arXiv preprint arXiv:1509.02971 (2015). [pdf] (DDPG)  ★★★★

Link: http://arxiv.org/pdf/1509.02971

[50] Gu, Shixiang, et al. "Continuous Deep Q-Learning with Model-based Acceleration." arXiv preprint arXiv:1603.00748 (2016). [pdf] (NAF)  ★★★★

Link: http://arxiv.org/pdf/1603.00748

[51] Schulman, John, et al. "Trust region policy optimization." CoRR, abs/1502.05477 (2015). [pdf] (TRPO)  ★★★★

Link: http://www.jmlr.org/proceedings/papers/v37/schulman15.pdf

[52] Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489. [pdf] (AlphaGo)  ★★★★★

Link: http://willamette.edu/~levenick/cs448/goNature.pdf

2.7 Deep Transfer Learning / Lifelong Learning / Reinforcement Learning

[53] Bengio, Yoshua. "Deep Learning of Representations for Unsupervised and Transfer Learning." ICML Unsupervised and Transfer Learning 27 (2012): 17-36. [pdf] (A tutorial)  ★★★

Link: http://www.jmlr.org/proceedings/papers/v27/bengio12a/bengio12a.pdf

[54] Silver, Daniel L., Qiang Yang, and Lianghao Li. "Lifelong Machine Learning Systems: Beyond Learning Algorithms." AAAI Spring Symposium: Lifelong Machine Learning. 2013. [pdf] (A brief discussion of lifelong learning)  ★★★

Link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.696.7800&rep=rep1&type=pdf

[55] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015). [pdf] (Work by the masters of the field)  ★★★★

Link: http://arxiv.org/pdf/1503.02531

[56] Rusu, Andrei A., et al. "Policy distillation." arXiv preprint arXiv:1511.06295 (2015). [pdf] (RL domain)  ★★★

Link: http://arxiv.org/pdf/1511.06295

[57] Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. "Actor-mimic: Deep multitask and transfer reinforcement learning." arXiv preprint arXiv:1511.06342 (2015). [pdf] (RL domain)  ★★★

Link: http://arxiv.org/pdf/1511.06342

[58] Rusu, Andrei A., et al. "Progressive neural networks." arXiv preprint arXiv:1606.04671 (2016). [pdf] (Outstanding work, a novel idea)  ★★★★★

Link: https://arxiv.org/pdf/1606.04671
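To give a feel for the distillation idea in [55], the sketch below computes temperature-softened "soft targets" from a teacher's logits and the cross-entropy of a student against them. It is a simplified illustration under stated assumptions, not the paper's full recipe: the logits and the temperature `T = 4.0` are invented, and the usual extra term for the true labels and the rescaling of the soft loss by the squared temperature are left out.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


# Hypothetical logits for a 3-class problem.
teacher_logits = [8.0, 2.0, 0.5]
student_logits = [5.0, 3.0, 1.0]
T = 4.0  # assumed distillation temperature

soft_targets = softmax(teacher_logits, T)   # softened teacher output
student_soft = softmax(student_logits, T)   # student output at the same temperature
distill_loss = cross_entropy(soft_targets, student_soft)
```

At temperature 1 the teacher puts almost all of its probability on the first class; at the higher temperature the smaller probabilities become visible, and those are what the student learns from.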

2.8 One-Shot Deep Learning

[59] Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. "Human-level concept learning through probabilistic program induction." Science 350.6266 (2015): 1332-1338. [pdf] (No deep learning, but worth reading)  ★★★★★

Link: http://clm.utexas.edu/compjclub/wp-content/uploads/2016/02/lake2015.pdf

[60] Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. "Siamese Neural Networks for One-shot Image Recognition." (2015) [pdf]  ★★★

Link: http://www.cs.utoronto.ca/~gkoch/files/msc-thesis.pdf

[61] Santoro, Adam, et al. "One-shot Learning with Memory-Augmented Neural Networks." arXiv preprint arXiv:1605.06065 (2016). [pdf] (A basic step toward one-shot learning)  ★★★★

Link: http://arxiv.org/pdf/1605.06065

[62] Vinyals, Oriol, et al. "Matching Networks for One Shot Learning." arXiv preprint arXiv:1606.04080 (2016). [pdf]  ★★★

Link: https://arxiv.org/pdf/1606.04080

[63] Hariharan, Bharath, and Ross Girshick. "Low-shot visual object recognition." arXiv preprint arXiv:1606.02819 (2016). [pdf] (A step toward larger-scale data)  ★★★★

Link: http://arxiv.org/pdf/1606.02819

3 Applications

3.1 Natural Language Processing (NLP)

[1] Antoine Bordes, et al. "Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing." AISTATS (2012) [pdf]  ★★★★

Link: https://www.hds.utc.fr/~bordesan/dokuwiki/lib/exe/fetch.php?id=en%3Apubli&cache=cache&media=en:bordes12aistats.pdf

[2] Mikolov, et al. "Distributed representations of words and phrases and their compositionality." ANIPS (2013): 3111-3119 [pdf] (word2vec)  ★★★

Link: http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf

[3] Sutskever, et al. "Sequence to sequence learning with neural networks." ANIPS (2014) [pdf]  ★★★

Link: http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

[4] Ankit Kumar, et al. "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing." arXiv preprint arXiv:1506.07285 (2015) [pdf]  ★★★★

Link: https://arxiv.org/abs/1506.07285

[5] Yoon Kim, et al. "Character-Aware Neural Language Models." NIPS (2015) arXiv preprint arXiv:1508.06615 (2015) [pdf]  ★★★

Link: https://arxiv.org/abs/1508.06615

[6] Jason Weston, et al. "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks." arXiv preprint arXiv:1502.05698 (2015) [pdf] (bAbI tasks)  ★★★

Link: https://arxiv.org/abs/1502.05698

[7] Karl Moritz Hermann, et al. "Teaching Machines to Read and Comprehend." arXiv preprint arXiv:1506.03340 (2015) [pdf] (CNN/Daily Mail cloze-style questions)  ★★

Link: https://arxiv.org/abs/1506.03340

[8] Alexis Conneau, et al. "Very Deep Convolutional Networks for Natural Language Processing." arXiv preprint arXiv:1606.01781 (2016) [pdf] (State of the art in text classification)  ★★★

Link: https://arxiv.org/abs/1606.01781

[9] Armand Joulin, et al. "Bag of Tricks for Efficient Text Classification." arXiv preprint arXiv:1607.01759 (2016) [pdf] (Slightly behind the state of the art, but much faster)  ★★★

Link: https://arxiv.org/abs/1607.01759

3.2 Object Detection

[1] Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. "Deep neural networks for object detection." Advances in Neural Information Processing Systems. 2013. [pdf]  ★★★

Link: http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf

[2] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. [pdf] (RCNN)  ★★★★★

Link: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

[3] He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." European Conference on Computer Vision. Springer International Publishing, 2014. [pdf] (SPPNet)  ★★★★

Link: http://arxiv.org/pdf/1406.4729

[4] Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015. [pdf]  ★★★★

Link: https://pdfs.semanticscholar.org/8f67/64a59f0d17081f2a2a9d06f4ed1cdea1a0ad.pdf

[5] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015. [pdf]  ★★★★

Link: https://arxiv.org/pdf/1506.01497

[6] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015). [pdf] (YOLO, outstanding work, very practical)  ★★★★★

Link: http://homes.cs.washington.edu/~ali/papers/YOLO.pdf

[7] Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint arXiv:1512.02325 (2015). [pdf]  ★★★

Link: http://arxiv.org/pdf/1512.02325

[8] Dai, Jifeng, et al. "R-FCN: Object Detection via Region-based Fully Convolutional Networks." arXiv preprint arXiv:1605.06409 (2016). [pdf]  ★★★★

Link: https://arxiv.org/abs/1605.06409

3.3 Visual Tracking

[1] Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." Advances in neural information processing systems. 2013. [pdf] (The first paper to use deep learning for visual tracking, DLT Tracker)  ★★★

Link: http://papers.nips.cc/paper/5192-learning-a-deep-compact-image-representation-for-visual-tracking.pdf

[2] Wang, Naiyan, et al. "Transferring rich feature hierarchies for robust visual tracking." arXiv preprint arXiv:1501.04587 (2015). [pdf] (SO-DLT)  ★★★★

Link: http://arxiv.org/pdf/1501.04587

[3] Wang, Lijun, et al. "Visual tracking with fully convolutional networks." Proceedings of the IEEE International Conference on Computer Vision. 2015. [pdf] (FCNT)  ★★★★

Link: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Wang_Visual_Tracking_With_ICCV_2015_paper.pdf

[4] Held, David, Sebastian Thrun, and Silvio Savarese. "Learning to Track at 100 FPS with Deep Regression Networks." arXiv preprint arXiv:1604.01802 (2016). [pdf] (GOTURN, very fast for a deep learning method, but still much slower than non-deep-learning methods)  ★★★★

Link: http://arxiv.org/pdf/1604.01802

[5] Bertinetto, Luca, et al. "Fully-Convolutional Siamese Networks for Object Tracking." arXiv preprint arXiv:1606.09549 (2016). [pdf] (SiameseFC, the latest state of the art in real-time object tracking)  ★★★★

Link: https://arxiv.org/pdf/1606.09549

[6] Martin Danelljan, Andreas Robinson, Fahad Khan, Michael Felsberg. "Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking." ECCV (2016) [pdf] (C-COT)  ★★★★

Link: http://www.cvl.isy.liu.se/research/objrec/visualtracking/conttrack/C-COT_ECCV16.pdf

[7] Nam, Hyeonseob, Mooyeol Baek, and Bohyung Han. "Modeling and Propagating CNNs in a Tree Structure for Visual Tracking." arXiv preprint arXiv:1608.07242 (2016). [pdf] (VOT2016 winner, TCNN)  ★★★★

Link: https://arxiv.org/pdf/1608.07242

3.4 Image Captioning

[1] Farhadi, Ali, et al. "Every picture tells a story: Generating sentences from images". In Computer Vision - ECCV 2010. Springer Berlin Heidelberg: 15-29, 2010. [pdf]  ★★★

Link: https://www.cs.cmu.edu/~afarhadi/papers/sentence.pdf

[2] Kulkarni, Girish, et al. "Baby talk: Understanding and generating image descriptions". In Proceedings of the 24th CVPR, 2011. [pdf]  ★★★★

Link: http://tamaraberg.com/papers/generation_cvpr11.pdf

[3] Vinyals, Oriol, et al. "Show and tell: A neural image caption generator". In arXiv preprint arXiv:1411.4555, 2014. [pdf]  ★★★

Link: https://arxiv.org/pdf/1411.4555.pdf

[4] Donahue, Jeff, et al. "Long-term recurrent convolutional networks for visual recognition and description". In arXiv preprint arXiv:1411.4389, 2014. [pdf]

Link: https://arxiv.org/pdf/1411.4389.pdf

[5] Karpathy, Andrej, and Li Fei-Fei. "Deep visual-semantic alignments for generating image descriptions". In arXiv preprint arXiv:1412.2306, 2014. [pdf]  ★★★★★

Link: https://cs.stanford.edu/people/karpathy/cvpr2015.pdf

[6] Karpathy, Andrej, Armand Joulin, and Fei Fei F. Li. "Deep fragment embeddings for bidirectional image sentence mapping". In Advances in neural information processing systems, 2014. [pdf]  ★★★★

Link: https://arxiv.org/pdf/1406.5679v1.pdf

[7] Fang, Hao, et al. "From captions to visual concepts and back". In arXiv preprint arXiv:1411.4952, 2014. [pdf]  ★★★★★

Link: https://arxiv.org/pdf/1411.4952v3.pdf

[8] Chen, Xinlei, and C. Lawrence Zitnick. "Learning a recurrent visual representation for image caption generation". In arXiv preprint arXiv:1411.5654, 2014. [pdf]  ★★★★

Link: https://arxiv.org/pdf/1411.5654v1.pdf

[9] Mao, Junhua, et al. "Deep captioning with multimodal recurrent neural networks (m-rnn)". In arXiv preprint arXiv:1412.6632, 2014. [pdf]  ★★★

Link: https://arxiv.org/pdf/1412.6632v5.pdf

[10] Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention". In arXiv preprint arXiv:1502.03044, 2015. [pdf]  ★★★★★

Link: https://arxiv.org/pdf/1502.03044v3.pdf

3.5 Machine Translation

Some milestone papers are listed in the RNN / Seq-to-Seq section.

[1] Luong, Minh-Thang, et al. "Addressing the rare word problem in neural machine translation." arXiv preprint arXiv:1410.8206 (2014). [pdf]  ★★★★

Link: http://arxiv.org/pdf/1410.8206

[2] Sennrich, et al. "Neural Machine Translation of Rare Words with Subword Units". In arXiv preprint arXiv:1508.07909, 2015. [pdf]  ★★★

Link: https://arxiv.org/pdf/1508.07909.pdf

[3] Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015). [pdf]  ★★★★

Link: http://arxiv.org/pdf/1508.04025

[4] Chung, et al. "A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation". In arXiv preprint arXiv:1603.06147, 2016. [pdf]  ★★

Link: https://arxiv.org/pdf/1603.06147.pdf

[5] Lee, et al. "Fully Character-Level Neural Machine Translation without Explicit Segmentation". In arXiv preprint arXiv:1610.03017, 2016. [pdf]  ★★★★★

Link: https://arxiv.org/pdf/1610.03017.pdf

[6] Wu, Schuster, Chen, Le, et al. "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". In arXiv preprint arXiv:1609.08144v2, 2016. [pdf] (Milestone)  ★★★★

Link: https://arxiv.org/pdf/1609.08144v2.pdf

3.6 Robotics

[1] Koutník, Jan, et al. "Evolving large-scale neural networks for vision-based reinforcement learning." Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 2013. [pdf]  ★★★

Link: http://repository.supsi.ch/4550/1/koutnik2013gecco.pdf

[2] Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." Journal of Machine Learning Research 17.39 (2016): 1-40. [pdf]  ★★★★★

Link: http://www.jmlr.org/papers/volume17/15-522/15-522.pdf

[3] Pinto, Lerrel, and Abhinav Gupta. "Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours." arXiv preprint arXiv:1509.06825 (2015). [pdf]  ★★★

Link: http://arxiv.org/pdf/1509.06825

[4] Levine, Sergey, et al. "Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection." arXiv preprint arXiv:1603.02199 (2016). [pdf]  ★★★★

Link: http://arxiv.org/pdf/1603.02199

[5] Zhu, Yuke, et al. "Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning." arXiv preprint arXiv:1609.05143 (2016). [pdf]  ★★★★

Link: https://arxiv.org/pdf/1609.05143

[6] Yahya, Ali, et al. "Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search." arXiv preprint arXiv:1610.00673 (2016). [pdf]  ★★★★

Link: https://arxiv.org/pdf/1610.00673

[7] Gu, Shixiang, et al. "Deep Reinforcement Learning for Robotic Manipulation." arXiv preprint arXiv:1610.00633 (2016). [pdf]  ★★★★

Link: https://arxiv.org/pdf/1610.00633

[8] A Rusu, M Vecerik, Thomas Rothörl, N Heess, R Pascanu, R Hadsell. "Sim-to-Real Robot Learning from Pixels with Progressive Nets." arXiv preprint arXiv:1610.04286 (2016). [pdf]  ★★★★

Link: https://arxiv.org/pdf/1610.04286.pdf

[9] Mirowski, Piotr, et al. "Learning to navigate in complex environments." arXiv preprint arXiv:1611.03673 (2016). [pdf]  ★★★★

Link: https://arxiv.org/pdf/1611.03673

3.7 Art

[1] Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "Inceptionism: Going Deeper into Neural Networks". Google Research. [html] (Deep Dream)  ★★★★

Link: https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html

[2] Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "A neural algorithm of artistic style." arXiv preprint arXiv:1508.06576 (2015). [pdf] (Outstanding work, the most successful method to date)  ★★★★★

Link: http://arxiv.org/pdf/1508.06576

[3] Zhu, Jun-Yan, et al. "Generative Visual Manipulation on the Natural Image Manifold." European Conference on Computer Vision. Springer International Publishing, 2016. [pdf] (iGAN)  ★★★★

Link: https://arxiv.org/pdf/1609.03552

[4] Champandard, Alex J. "Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks." arXiv preprint arXiv:1603.01768 (2016). [pdf] (Neural Doodle)  ★★★★

Link: http://arxiv.org/pdf/1603.01768

[5] Zhang, Richard, Phillip Isola, and Alexei A. Efros. "Colorful Image Colorization." arXiv preprint arXiv:1603.08511 (2016). [pdf]  ★★★★

Link: http://arxiv.org/pdf/1603.08511

[6] Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual losses for real-time style transfer and super-resolution." arXiv preprint arXiv:1603.08155 (2016). [pdf]  ★★★★

Link: https://arxiv.org/pdf/1603.08155.pdf

[7] Vincent Dumoulin, Jonathon Shlens and Manjunath Kudlur. "A learned representation for artistic style." arXiv preprint arXiv:1610.07629 (2016). [pdf]  ★★★★

Link: https://arxiv.org/pdf/1610.07629

[8] Gatys, Leon and Ecker, et al. "Controlling Perceptual Factors in Neural Style Transfer." arXiv preprint arXiv:1611.07865 (2016). [pdf] (Controls style transfer over spatial location, colour information, and across spatial scale)  ★★★★

Link: https://arxiv.org/pdf/1611.07865

[9] Ulyanov, Dmitry and Lebedev, Vadim, et al. "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images." arXiv preprint arXiv:1603.03417 (2016). [pdf] (Texture generation and style transfer)  ★★★★

Link: https://arxiv.org/pdf/1603.03417

3.8 Object Segmentation

[1] J. Long, E. Shelhamer, and T. Darrell. "Fully convolutional networks for semantic segmentation." In CVPR, 2015. [pdf]  ★★★★★

Link: https://arxiv.org/pdf/1411.4038v2.pdf

[2] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. "Semantic image segmentation with deep convolutional nets and fully connected crfs." In ICLR, 2015. [pdf]  ★★★★★

Link: https://arxiv.org/pdf/1606.00915v1.pdf

[3] Pinheiro, P.O., Collobert, R., Dollar, P. "Learning to segment object candidates." In NIPS, 2015. [pdf]  ★★★★

Link: https://arxiv.org/pdf/1506.06204v2.pdf

[4] Dai, J., He, K., Sun, J. "Instance-aware semantic segmentation via multi-task network cascades." In CVPR, 2016. [pdf]  ★★★

Link: https://arxiv.org/pdf/1512.04412v1.pdf

[5] Dai, J., He, K., Sun, J. "Instance-sensitive Fully Convolutional Networks." arXiv preprint arXiv:1603.08678 (2016). [pdf]  ★★★

Link: https://arxiv.org/pdf/1603.08678v1.pdf

Original source: https://github.com/songrotek/Deep-Learning-Papers-Reading-Roadmap (republished by Leiphone with permission)
