Vision and Language: Past, Present, and Future

Publisher: 闻天明

Release Time: 2022-02-22

Speaker: Jiebo Luo, University of Rochester

Time: 10:00-11:00, Feb. 25, 2022

Host: Shenghua Gao

Link: Zoom: https://zoom.us/j/94115364432?pwd=MDVtSm83MzdNWXIrWnRkV2lUUy9KZz09

Number: 941 1536 4432, Password: 294814

Bilibili: https://live.bilibili.com/22272691

Abstract:

Computer vision and natural language processing are two key branches of artificial intelligence. Since the goal of computer vision has always been automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images, it is natural for vision and language to come together to enable high-level computer vision tasks. Conversely, information extracted from images and videos can facilitate natural language processing tasks. Recent advances in machine learning and deep learning are facilitating reasoning about images and text in a joint fashion. In this talk, we will review a recently active area of research at the intersection of vision and language, including video-language alignment, image and video captioning, visual question answering, image retrieval using complex text queries, image generation from textual descriptions, language grounding in images and videos, as well as multimodal machine translation and vision-aided grammar induction.

Bio:

Jiebo Luo is a Professor of Computer Science at the University of Rochester, which he joined in 2011 after a prolific career of fifteen years at Kodak Research Laboratories. He has authored over 500 technical papers and holds over 90 U.S. patents. His research interests include computer vision, NLP, machine learning, data mining, computational social science, and digital health. He has been involved in numerous technical conferences, including serving as program co-chair of ACM Multimedia 2010, IEEE CVPR 2012, ACM ICMR 2016, and IEEE ICIP 2017, as well as general co-chair of ACM Multimedia 2018. He has served on the editorial boards of the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Transactions on Multimedia (TMM), IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), IEEE Transactions on Big Data (TBD), ACM Transactions on Intelligent Systems and Technology (TIST), Pattern Recognition, Knowledge and Information Systems (KAIS), and Intelligent Medicine. He is the current Editor-in-Chief of the IEEE Transactions on Multimedia. Professor Luo is a Fellow of ACM, AAAI, IEEE, SPIE, and IAPR.
