In conjunction with CVPR 2019.
Long Beach, CA - June 16th 2019 (Morning) - Room: Seaside 7
The exploitation of big data in recent years has led to a major step forward in many applications of Computer Vision. However, most of the tasks tackled so far involve mainly the visual modality, owing to the unbalanced number of labelled samples available across modalities (e.g., there are many large labelled datasets for images, but far fewer for audio- or IMU-based classification). This results in a large performance gap when algorithms are trained separately on each modality.
This workshop aims to bring together the machine learning and multimodal data fusion communities. We expect contributions involving video, audio, depth, IR, IMU, laser, text, drawings, synthetic data, etc. Position papers with feasibility studies and cross-modality issues with a strongly applied flavour are also encouraged; we therefore expect a positive response from both the academic and industrial communities.
This is an open call for papers soliciting original contributions on recent findings in the theory, methodologies, and applications of multimodal machine learning. Potential topics include, but are not limited to:
Material and information about the 1st edition @ ECCV 2018 in Munich can be found here.
Papers are limited to 8 pages in the CVPR format (cf. the main conference author guidelines). All papers will be reviewed by at least two reviewers under a double-blind policy. Papers will be selected based on relevance, significance and novelty of results, technical merit, and clarity of presentation, and will be published in the CVPR 2019 proceedings.
All papers should be submitted via the CMT website.
08:20 - Initial remarks and workshop introduction
08:30 - WiFi and Vision Multimodal Learning for Accurate and Robust Device-Free Human Activity Recognition - Han Zou; Jianfei Yang; Hari Prasanna Das; Huihan Liu; Yuxun Zhou; Costas Spanos.
08:50 - Invited Speaker: Kristen Grauman - Disentangling Object Sounds in Video
09:40 - Two Stream 3D Semantic Scene Completion - Martin Garbade; Yueh-Tung Chen; Johann Sawatzky; Jürgen Gall.
10:00 - Co-compressing and Unifying Deep CNN Models for Efficient Human Face and Speaker Recognition - Timmy S. T. Wan; Jia-Hong Lee; Yi-Ming Chan; Chu-Song Chen.
10:20 - Coffee Break
10:30 - Invited Speaker: Andrew Owens - Learning Sight from Sound
11:20 - Spotlight session (3-minute presentation for each poster)
12:00 - Poster Session - Pacific Arena Ballroom, Slots 157-166.
Posters: Authors of all accepted papers will present their work in the poster session. Please use the CVPR poster template. More instructions can be found in the 'News and Updates' section of the CVPR website. Check your paper ID to find your poster board.
Oral talks: Please review the program above to check whether your paper was accepted for oral presentation. Each presentation will last 15 minutes (plus 5 minutes for questions).
Spotlights: All other accepted papers will be presented in a 3-minute spotlight right before the poster session. Please send your slides (PDF) to mula.workshop@gmail.com by June 15th.
Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Scientist at Facebook AI Research (FAIR). Her research in computer vision and machine learning focuses on visual recognition and search. Before joining UT Austin in 2007, she received her Ph.D. from MIT. She is an Alfred P. Sloan Research Fellow, a Microsoft Research New Faculty Fellow, and a recipient of the NSF CAREER and ONR Young Investigator awards, the PAMI Young Researcher Award in 2013, the 2013 Computers and Thought Award from the International Joint Conference on Artificial Intelligence (IJCAI), the Presidential Early Career Award for Scientists and Engineers (PECASE) in 2013, and the Helmholtz Prize (computer vision test-of-time award) in 2017. She and her collaborators were recognized with the CVPR Best Student Paper Award in 2008 for their work on hashing algorithms for large-scale image retrieval, the Marr Prize at ICCV in 2011 for their work on modeling relative visual attributes, the ACCV Best Application Paper Award in 2016 for their work on automatic cinematography for 360-degree video, and a Best Paper Honorable Mention at CHI in 2017 for work on crowds and visual question answering.
Andrew Owens is a postdoctoral scholar at UC Berkeley. He received his Ph.D. in computer science from MIT in 2016. He is a recipient of a CVPR Best Paper Honorable Mention Award (2011), a Microsoft Research Ph.D. Fellowship (2015), and an NDSEG Graduate Fellowship (2011). He will begin as an assistant professor at the University of Michigan in spring 2020.
For additional info, please contact us here.