CVPR 2020 | JD AI Research's Thoughts on Vision and Language: From Self-Consistency and Interaction to Symbiosis (Part 4)

  [4] Yingwei Pan, Ting Yao, Yehao Li, and Tao Mei, “X-Linear Attention Networks for Image Captioning.” In CVPR, 2020.

  [5] Ting Yao, Yingwei Pan, Yehao Li, and Tao Mei, “Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects.” In CVPR, 2017.

  [6] Peter Anderson, Xiaodong He, Chris Buehler, Damien Teney, Mark Johnson, Stephen Gould, and Lei Zhang, “Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering.” In CVPR, 2018.

  [7] Kelvin Xu, Jimmy Lei Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio, “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.” In ICML, 2015.

  [8] Piyush Sharma, Nan Ding, Sebastian Goodman, and Radu Soricut, “Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning.” In ACL, 2018.

  [9] Lun Huang, Wenmin Wang, Jie Chen, and Xiao-Yong Wei, “Attention on Attention for Image Captioning.” In ICCV, 2019.