| Lihi Zelnik-Manor || Ramin Zabih || Long Quan |
| Technion, Israel || Google Research & Cornell Tech, USA || HKUST, Hong Kong |
| Yihong Wu || Jiaya Jia || Xi Li |
| Institute of Automation, Chinese Academy of Sciences || The Chinese University of Hong Kong || Zhejiang University |
| Haibin Ling || Lin Mei || Shiguang Shan |
| HiScene || Third Research Institute of the Ministry of Public Security || Institute of Computing Technology, Chinese Academy of Sciences |
| Dacheng Tao || Hai Tao || Ming Yang |
| University of Technology Sydney || Beijing Vion Technology || Horizon Robotics |
Abstract: By far, most of the bits in the world are image and video data. YouTube alone receives 300 hours of uploaded video every minute. Add to that personal photos and videos, TV channels, and the countless security cameras recording 24/7, and one quickly sees that the amount of visual data being recorded is colossal. In this talk I will discuss the problem of “saliency prediction”: separating the important parts of images and videos (the “wheat”) from the less important ones (the “chaff”). Predicting what people find important could be useful for many applications. In advertising, it may be important for the producer to know whether the key concept catches the viewer’s eye. Alternatively, if one knows where people are likely to look, relevant content can be placed there. In video editing, knowing where viewers look could help create smoother shot transitions. Reliable gaze prediction could also drive gaze-aware compression or key-frame selection.
In this talk I will discuss approaches to saliency prediction in images and videos and how the quality of these algorithms can be assessed. I will further explore the meaning of saliency in the context of different tasks, some of which call for a specifically tailored definition of “importance”. Finally, recognizing that saliency admits different definitions, I will discuss approaches to predicting task-oriented saliency.
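As a concrete illustration of bottom-up saliency prediction (a classic baseline, not the specific models discussed in the talk), the spectral-residual method of Hou & Zhang (CVPR 2007) can be sketched in a few lines of NumPy; the function names and filter radii below are illustrative choices:

```python
import numpy as np


def box_blur(a, r=1):
    """Mean filter with a (2r+1) x (2r+1) window, edge-padded."""
    p = np.pad(a, r, mode="edge")
    H, W = a.shape
    out = np.zeros((H, W), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + H, dx:dx + W]
    return out / (2 * r + 1) ** 2


def spectral_residual_saliency(img):
    """Saliency map via the spectral residual: the part of the log-amplitude
    spectrum that deviates from its local average pops out spatially."""
    F = np.fft.fft2(img.astype(float))
    log_amp = np.log(np.abs(F) + 1e-8)
    phase = np.angle(F)
    residual = log_amp - box_blur(log_amp, r=1)   # the "unexpected" spectrum
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = box_blur(sal, r=2)                      # smooth the final map
    return sal / (sal.max() + 1e-12)
```

On a flat image with one distinctive pixel, the map peaks at that pixel; practical systems run this kind of model on a downscaled grayscale frame and, as the talk argues, task-oriented saliency requires far richer definitions of importance than this purely bottom-up cue.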
Bio: Lihi Zelnik-Manor is an Associate Professor in the Faculty of Electrical Engineering at the Technion, Israel. Between 2014 and 2016 she was a visiting Associate Professor at Cornell Tech. Prior to the Technion, she was a postdoctoral fellow in the Department of Engineering and Applied Science at the California Institute of Technology (Caltech). She holds a PhD and MSc (with honors) in Computer Science from the Weizmann Institute of Science and a BSc (summa cum laude) in Mechanical Engineering from the Technion.
Prof. Zelnik-Manor’s awards and honors include the Israeli higher-education Planning and Budgeting Committee (Vatat) scholarship for outstanding Ph.D. students, the Sloan-Swartz postdoctoral fellowship, the Best Student Paper Award at IEEE SMI'05, the AIM@SHAPE Best Paper Award 2005, and the Outstanding Reviewer Award at CVPR'08. She is also a recipient of the Gutwirth prize for the promotion of research and several grants from ISF, MOST, the 7th European R&D Program, and others. Prof. Zelnik-Manor has served as Area Chair for ECCV and CVPR multiple times, as Program Chair of CVPR'16, and as Associate Editor at TPAMI. She has also had industrial collaborations with Intel, Adobe, and Microsoft Research.
Abstract: Many problems in computer vision involve making inferences about a pixel in the presence of locally ambiguous evidence. Markov Random Fields (MRFs) provide a natural way to formulate such problems, but the MRF inference problem is computationally extremely difficult. Graph cut techniques have been quite successful for first-order MRFs, and commonly produce results that are within a few percent of the global minimum. However, there is considerable evidence that a wide range of vision problems require higher-order priors. In this talk I will describe my research group's recent work on higher-order MRF inference.
This is joint work with many co-authors, but primarily with my PhD students Alex Fix and Chen Wang.
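To make the graph-cut idea concrete, here is a minimal, illustrative sketch (not the speaker's higher-order methods) of first-order binary MRF denoising solved exactly by an s-t min-cut. The textbook Edmonds-Karp max-flow routine, the function names, and the weights are assumptions for illustration:

```python
from collections import defaultdict, deque

import numpy as np


def min_cut_source_side(capacity, s, t):
    """Edmonds-Karp max-flow; returns the nodes on the source side of a
    minimum s-t cut. `capacity` maps directed edges (u, v) to capacities."""
    residual = defaultdict(float)
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # allow traversal of reverse residual arcs
    while True:
        parent = {s: None}  # BFS for a shortest augmenting path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:      # no augmenting path: flow is maximal
            return set(parent)   # nodes still reachable from s
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= push
            residual[(v, u)] += push


def denoise_binary(noisy, unary=1.0, pairwise=0.6):
    """Exact MAP inference for a first-order Potts MRF on a binary image."""
    H, W = noisy.shape
    cap = defaultdict(float)
    s, t = "s", "t"
    for y in range(H):
        for x in range(W):
            p = (y, x)
            # Unary term: penalty for disagreeing with the observation.
            if noisy[y, x] == 1:
                cap[(s, p)] += unary      # this edge is cut iff p gets label 0
            else:
                cap[(p, t)] += unary      # this edge is cut iff p gets label 1
            # Pairwise Potts term: penalty when 4-neighbors disagree.
            for q in ((y + 1, x), (y, x + 1)):
                if q[0] < H and q[1] < W:
                    cap[(p, q)] += pairwise
                    cap[(q, p)] += pairwise
    labels = np.zeros_like(noisy)
    for node in min_cut_source_side(cap, s, t) - {s}:
        labels[node] = 1
    return labels
```

With a strong pairwise weight an isolated flipped pixel is restored; shrinking it lets the data term win. Real systems use specialized max-flow solvers (e.g. Boykov-Kolmogorov) rather than this textbook routine, and the higher-order priors in the talk go beyond such pairwise constructions.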
Bio: Ramin Zabih's research interests lie in computer vision and medical imaging. He has worked on a variety of problems in early vision, including motion and stereo; many of these problems can be solved very accurately using algorithms based on graph cuts, work that received the Test of Time Award at ICCV 2011 and the Koenderink Prize at ECCV 2012. He served as a Program Chair for CVPR 2007 and a General Chair for CVPR 2013. He was Editor-in-Chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 through 2012, and from 2013 through mid-2015 he chaired the PAMI-TC, which runs the main vision conferences. He is also the president and founder of the Computer Vision Foundation. In 2018 he will be a General Chair for ECCV.
Since the fall of 2013 he has been at Cornell Tech, with a joint appointment in Weill Cornell Radiology. He is currently on leave from Cornell, running a research group at Google.
Abstract: In the first part of the talk, I will review the state of the art in three-dimensional reconstruction from images and photographs, as developed over the past three decades in computer vision. In the second part, I will focus on our most recent work on large-scale 3D reconstruction from drone photographs, and showcase the performance of our approach on case studies spanning hundreds of square kilometres in both high-rise metropolitan and low-rise rural areas across cities in different countries. I will also demonstrate the online cloud platform and portal www.altizure.com, developed and funded by the HKUST team.
Bio: Long QUAN is a Professor in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology (HKUST). He received his Ph.D. in Computer Science from INPL, France, in 1989. He joined the Centre National de la Recherche Scientifique (CNRS) as a permanent researcher in 1990 and was appointed to the Institut National de Recherche en Informatique et en Automatique (INRIA) in Grenoble, France. He joined HKUST in 2001 and was the founding Director of the HKUST Center for Visual Computing and Image Science. He is a Fellow of the IEEE Computer Society.
He works on vision geometry, 3D reconstruction, and image-based modeling. He supervised the dissertation that won the first French Best Ph.D. Dissertation in Computer Science award of 1998 (le prix de thèse SPECIF 1998, now le prix de thèse Gilles Kahn), as well as the winners of the Piero Zamperoni Best Student Paper Award at ICPR 2000 and the Best Student Poster Paper Award at IEEE CVPR 2008. He co-authored one of the six highlight papers of SIGGRAPH 2007. He was also elected one of the HKUST Best Ten Lecturers in 2004 and 2009. He has served as an Associate Editor of IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) and a Regional Editor of the Image and Vision Computing Journal (IVC). He is on the editorial boards of the International Journal of Computer Vision (IJCV), the Electronic Letters on Computer Vision and Image Analysis (ELCVIA), Machine Vision and Applications (MVA), and Foundations and Trends in Computer Graphics and Vision. He was a Program Chair of the IAPR International Conference on Pattern Recognition (ICPR) 2006 (Computer Vision and Image Analysis), a Program Chair of ICPR 2012 (Computer and Robot Vision), and a General Chair of the IEEE International Conference on Computer Vision (ICCV) 2011.
Bio: Yihong Wu is a Professor and Ph.D. supervisor at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. Her research interests include multi-view geometry, camera calibration and localization, SLAM, and mobile vision. She received her Ph.D. from the Institute of Systems Science, Chinese Academy of Sciences, in 2001. She has published more than 70 papers in major journals and conferences, including PAMI, IJCV, and ICCV. She currently serves on the editorial boards of the Journal of Computer-Aided Design & Computer Graphics, the Journal of Frontiers of Computer Science and Technology, and The Open Computer Science Journal.
Abstract: This talk gives a general review of computer vision research in recent years from two perspectives. The first is exemplified by computer vision goals that cannot easily be achieved by humans. These tasks involve solving a series of low-level problems such as filtering, stereo matching, depth estimation, deconvolution, and motion estimation. The second covers a few hot topics in simulating human intelligence for image understanding, including semantic segmentation, object classification, and object detection. Several techniques developed by our team will be demonstrated.
Bio: Jiaya Jia is currently a professor in the Department of Computer Science and Engineering at The Chinese University of Hong Kong (CUHK). He heads a research group focusing on computational photography, machine learning, practical optimization, and low- and high-level computer vision. He currently serves as an associate editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) and has served as an area chair for ICCV and CVPR. He has also served several times on the technical paper program committees of SIGGRAPH, SIGGRAPH Asia, ICCP, and 3DV, and co-chaired the Workshop on Interactive Computer Vision, held in conjunction with ICCV 2007. He received the Young Researcher Award 2008 and the Research Excellence Award 2009 from CUHK.
Bio: Xi Li is a Professor and Ph.D. supervisor at the Institute of Artificial Intelligence, College of Computer Science, Zhejiang University. He was selected for the fifth cohort of China's national "Young Thousand Talents" program and as a second-tier talent of Zhejiang Province's "151" program. His research and development work spans computer vision, pattern recognition, and machine learning. He has produced systematic research results on object tracking, action recognition, image annotation, video retrieval, hashing-function learning, and deep feature learning, with particular strength in motion tracking, understanding, and retrieval in video, where he has achieved several internationally influential innovations. He has published more than 80 papers in leading international journals and top conferences. He serves as an Associate Editor of the well-known neural computation journals Neurocomputing and Neural Processing Letters, and as a reviewer and program committee member for numerous international journals and conferences in computer vision and pattern recognition. He has received two best-paper awards at international conferences (ACCV 2010 and DICTA 2012), an ICIP 2015 Top 10% Paper Award, two Beijing Natural Science and Technology Awards (first class and second class), and a China Patent Excellence Award.
Abstract: With the popularity of Pokémon Go, the application and potential of augmented reality in gaming have been embraced by gamers and have attracted attention from businesses and investors. In this talk, we first review the shared history and development of augmented reality and games, and then argue why the two naturally combine and what advantages the combination brings. Next, drawing on practical experience, we discuss the relevant augmented reality techniques, challenges, and approaches. Finally, we summarize and showcase HiScene's (亮风台) efforts in this area and offer an outlook on the future.
Bio: Dr. Haibin Ling received his B.S. and M.S. degrees from Peking University in 1997 and 2000, respectively, and his Ph.D. from the University of Maryland in 2006, followed by a year of postdoctoral research at the University of California, Los Angeles. He was an assistant researcher at Microsoft Research Asia in 2001 and a research scientist at Siemens Corporate Research from 2007 to 2008. Since 2008 he has been at Temple University in the US, where he is now an Associate Professor of Computer Science. He is also a co-founder and the Chief Scientist of HiScene. His main research areas include computer vision, augmented reality, human-computer interaction, and medical image analysis. He received the Best Student Paper Award at ACM UIST 2003 and a US NSF CAREER Award in 2014. He served as an Area Chair for CVPR 2014 and CVPR 2016, and serves on the editorial boards of IEEE Trans. on Pattern Analysis and Machine Intelligence and Pattern Recognition.
Bio: Lin Mei received his Ph.D. in Engineering from Xi'an Jiaotong University in 2000. From 2000 to 2006 he conducted research as a postdoctoral fellow and senior visiting scholar at the Department of Computer Science and Engineering of Fudan University, the Department of Computer Science of the University of Freiburg, and the German Research Center for Artificial Intelligence. In 2007 he joined the Third Research Institute of the Ministry of Public Security as discipline leader for intelligent image processing at the Police Equipment R&D Center; he became Deputy Director of the Internet of Things Technology R&D Center in 2008 and its Director in February 2012. He was appointed Research Professor at the Third Research Institute in December 2012, and in 2015 he was named an Outstanding Technology Leader of Shanghai by the Shanghai Science and Technology Commission. His main research interests include computer vision, artificial intelligence, Internet of Things applications, and big data processing. He led the design of a new-generation video surveillance network architecture based on video structured description, a family of video-based policing application products for public security agencies at all levels, and the associated standards, laying a foundation for large-scale deployment of public-security video surveillance during the 13th Five-Year Plan period. He is currently a council member of the Shanghai Society of Image and Graphics, a member of the Rich Media Committee of the Chinese Institute of Command and Control, a member of the Ministry of Public Security's standardization technical committee for IoT applications in public safety, a member of the academic committee of the ACM Shanghai Chapter, and Executive Deputy Director of the Shanghai Engineering Research Center for Intelligent Video Surveillance. He previously served as Workshop Chair of BDSC 2014 and on the program committee of ICSSC 2013. In recent years he has published more than 60 papers in domestic and international journals and conferences (over 20 indexed by SCI/EI), filed nearly 50 national invention patent applications (9 granted), and registered 6 software copyrights.
Bio: Shiguang Shan, Ph.D., is a Professor and Ph.D. supervisor at the Institute of Computing Technology, Chinese Academy of Sciences, and Executive Deputy Director of the CAS Key Laboratory of Intelligent Information Processing. His research covers computer vision, pattern recognition, and machine learning. He has published more than 50 papers in CCF A-class venues, and his publications have been cited more than 9,000 times on Google Scholar. He has served as an Area Chair for international conferences including ICCV, ACCV, ICPR, and FG, and serves as an Associate Editor of the international journals IEEE Trans. on Image Processing, Neurocomputing, and Pattern Recognition Letters. His research results received a Second Prize of the State Science and Technology Progress Award in 2005 and a Second Prize of the State Natural Science Award in 2015. He received an NSFC Excellent Young Scientists Fund award in 2012 and the CCF Young Scientist Award in 2015.
Abstract: In recent years, many algorithms for learning from multi-view data by considering the diversity of different views have been proposed. These views may be obtained from multiple sources or from different feature subsets. For example, a person can be identified by face, fingerprint, signature, or iris, with information obtained from multiple sources, while an image can be represented by its color or texture features, which can be seen as different feature subsets of the image. In this talk, we will organize the similarities and differences among a wide variety of multi-view learning approaches, highlight their limitations, and demonstrate the fundamentals underlying the success of multi-view learning. A thorough investigation of the view-insufficiency problem and an in-depth analysis of the influence of view properties (consistency and complementarity) will benefit the continued development of multi-view learning.
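As a toy illustration of one multi-view strategy touched on above, combining classifiers trained on different feature subsets, the sketch below fuses per-view nearest-centroid scores by simple averaging. The function names and the equal weighting are assumptions for illustration, not the specific approaches surveyed in the talk:

```python
import numpy as np


def view_scores(train_X, train_y, test_X):
    """Per-view class scores: negative distance to each class centroid."""
    classes = np.unique(train_y)
    centroids = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_X[:, None, :] - centroids[None, :, :], axis=2)
    return -dists, classes


def late_fusion_predict(train_views, train_y, test_views):
    """Sum equally-weighted per-view scores, then pick the best class.
    Each element of train_views/test_views is one view's feature matrix."""
    total, classes = None, None
    for Xtr, Xte in zip(train_views, test_views):
        scores, classes = view_scores(Xtr, train_y, Xte)
        total = scores if total is None else total + scores
    return classes[np.argmax(total, axis=1)]
```

Fusion of this kind helps precisely when the views are complementary, i.e. each view resolves ambiguities the other cannot, which is one of the view properties the talk analyzes.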
Bio: Dacheng Tao is Professor of Computer Science with the Centre for Quantum Computation & Intelligent Systems and the Faculty of Engineering and Information Technology at the University of Technology Sydney. He mainly applies statistics and mathematics to data analytics problems, and his research interests span computer vision, data science, image processing, machine learning, and video surveillance. His research results are expounded in one monograph and more than 100 publications in prestigious journals and at prominent conferences, such as IEEE T-PAMI, T-NNLS, T-IP, JMLR, IJCV, NIPS, ICML, CVPR, ICCV, ECCV, AISTATS, ICDM, and ACM SIGKDD, with several best paper awards, including the best theory/algorithm paper runner-up award at IEEE ICDM'07, the best student paper award at IEEE ICDM'13, and the 2014 ICDM 10-Year Highest-Impact Paper Award. He is a Fellow of the IEEE, IAPR, OSA, and SPIE.
Abstract: This talk will demonstrate our recent progress in developing embedded systems in several key computer vision sub-fields including video-based face recognition, vehicle attribute analysis, urban management event detection, and high density crowd counting. The developed algorithms combine the traditional feature-plus-classifier approach with the recent advances in deep learning to make high performance computer vision systems practical and enable products in several vertical markets including intelligent transportation systems (ITS), business intelligence (BI), and smart video surveillance.
We will demonstrate a single-GPU video analytic box that can process up to 8 channels of analog or 2 channels of 1080p HD video inputs and a prototype 40-GPU server system capable of processing up to 80 channels of 1080p video inputs.
Bio: Dr. Tao received his BS and MS degrees in Automation from Tsinghua University in 1991 and 1993, respectively, and his PhD in Electrical Engineering from the University of Illinois at Urbana-Champaign in 1999. From 1999 to 2001, he was a member of the technical staff in the Vision Technology Laboratory at Sarnoff Corporation, NJ. From July 2001 to June 2010, he served as an assistant and then an associate professor in the Department of Computer Engineering at the University of California, Santa Cruz. Dr. Tao holds more than 10 US patents and has published more than 130 papers in image processing and computer vision. He has served as an associate editor of Computer Vision and Applications and Pattern Recognition, and has been a reviewer for CVPR, ICCV, ECCV, and other computer vision conferences. Dr. Tao is a founder and the CEO of Beijing Vion Technology, Inc., a company focused on developing world-leading computer vision and artificial intelligence algorithms and products, with applications in intelligent transportation systems (ITS), public safety, and business intelligence.
Abstract: This talk will cover both a brief introduction to Horizon Robotics and personal lessons learned in developing products built primarily on computer vision techniques. Artificial intelligence startup Horizon Robotics, founded in June 2015, strives to innovate turn-key solutions that integrate software, hardware, and cloud systems to make human life more convenient, safe, and fun. In productionizing image recognition techniques, especially deep convolutional neural networks, the major technical challenges include, but are not limited to, the balance between computational efficiency and recognition accuracy (i.e., cost versus performance), the trade-off between development time and functionality, and issues of product consistency, reliability, and deliverables. Nevertheless, the rapid advance of computer vision technology opens up more business opportunities, such as smart homes and autonomous driving.
Bio: Dr. Ming Yang is Co-founder and Vice President of Horizon Robotics Inc. He is one of the founding members of Facebook Artificial Intelligence Research (FAIR) and a former senior researcher at NEC Labs America. Dr. Yang is a well-recognized researcher in computer vision and machine learning. His research interests include object tracking, face recognition, large-scale image retrieval, and multimedia content analysis. Dr. Yang holds 14 US patents and has over 50 publications in top international conferences and journals, with more than 3,400 citations and an h-index of 28. During his tenure at Facebook, Dr. Yang led the deep learning research project “DeepFace”, which had a significant impact on the deep learning research community and was widely reported by media outlets including Science Magazine, MIT Technology Review, and Forbes. He received his B.Eng. and M.Eng. degrees from the Department of Electrical Engineering at Tsinghua University and his Ph.D. from the Department of Electrical Engineering and Computer Science at Northwestern University.