Visual Perception via Learning in an Open World

The 4th workshop on Open World Vision

Location: Seattle Convention Center

Time: 09:00 - 17:15 Local Time (PDT), June 18, 2024

in conjunction with CVPR 2024, Seattle, US


Overview

Visual perception is indispensable for numerous applications, spanning transportation, healthcare, security, commerce, entertainment, and interdisciplinary research. Visual perception algorithms developed in a closed-world setup often generalize poorly to the real open world, which is dynamic, vast, unpredictable, and full of never-before-seen situations. Visual perception algorithms must therefore be developed for the open world, addressing its complexities such as recognizing unknown objects, debiasing imbalanced data distributions, leveraging multimodal signals, and learning efficiently from few examples. Moreover, today's most powerful visual perception models are pretrained in an open world, e.g., on web-scale data consisting of images, languages, and so on. There has never been a better time to study Visual Perception via Learning in an Open World (VPLOW). We therefore invite you to our VPLOW workshop, where invited speakers and challenge competitions will cover a variety of VPLOW topics. We hope our workshop stimulates fruitful discussions.

You might be interested in our previous workshops:


Topics

Topics of interest include, but are not limited to:

  • data: long-tailed distribution, open-set, unknowns, streaming data, biased data, unlabeled data, anomaly, multi-modality, etc.
  • concepts: open-vocabulary, ontology/taxonomy of object classes, evolving class ontology, etc.
  • learning: X-shot learning, Y-supervised learning, lifelong/continual learning, domain adaptation/generalization, open-world learning, multimodal pretraining, prompt learning, foundation model tuning, etc.
  • social impact: safety, fairness, real-world applications, inter-disciplinary research, etc.
  • misc: datasets, benchmarks, interpretability, robustness, generalization, etc.

Examples

Let's consider the following motivational examples.

  • Open-world data follows a long-tailed distribution. Real-world data tends to follow a long-tailed distribution, and real-world tasks often emphasize the rarely seen data. A model trained on such long-tailed data can perform poorly on rare or underrepresented data. For example, a visual recognition model can misclassify underrepresented minorities and make unethical predictions (ref. case1, case2); one common mitigation is shown in the first code sketch after this list.
  • The open world contains unknown examples. Largely due to the long-tailed nature of the data distribution, visual perception models are invariably confronted by unknown examples in the open world. Failing to detect these unknowns can cause serious issues. For example, a Tesla Model 3 failed to identify an unknown overturned truck and crashed into it (ref. case); a simple rejection baseline is shown in the second code sketch after this list.
  • The open world requires learning with evolving data and labels. The world of interest changes over time, e.g., driving scenes (across cities and weather conditions) and search engines ("apple" means something different today than it did 20 years ago). In other words, the data distribution and semantics continually shift and evolve. How should we address such distribution shifts and concept drifts?
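
As a concrete handle on the first example above, here is a minimal sketch (in PyTorch) of one common mitigation for long-tailed data: reweighting the cross-entropy loss by inverse class frequency. The class counts, tensor shapes, and values below are hypothetical placeholders rather than numbers from any particular benchmark.

    # Minimal sketch: reweight cross-entropy by inverse class frequency to
    # counteract a long-tailed label distribution. All numbers are
    # hypothetical placeholders, not from any particular benchmark.
    import torch
    import torch.nn as nn

    # Hypothetical per-class sample counts of a long-tailed training set.
    class_counts = torch.tensor([5000.0, 1200.0, 300.0, 40.0, 5.0])

    # Inverse-frequency weights, normalized to average 1 across classes.
    weights = class_counts.sum() / (len(class_counts) * class_counts)

    criterion = nn.CrossEntropyLoss(weight=weights)

    # Dummy logits and labels; in practice these come from a model and a loader.
    logits = torch.randn(8, 5)
    labels = torch.randint(0, 5, (8,))
    print(criterion(logits, labels).item())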
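
For the second example, here is a minimal sketch of a standard open-set baseline: flag an input as unknown when the classifier's maximum softmax probability falls below a threshold. The threshold value and the example logits are illustrative assumptions.

    # Minimal sketch: reject inputs as "unknown" when the classifier's maximum
    # softmax probability falls below a threshold, a standard open-set baseline.
    # The threshold and logits here are illustrative assumptions.
    import torch
    import torch.nn.functional as F

    def predict_open_set(logits: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
        """Return the predicted class per input, or -1 for suspected unknowns."""
        probs = F.softmax(logits, dim=-1)
        conf, pred = probs.max(dim=-1)   # top-1 confidence and class index
        pred[conf < threshold] = -1      # low confidence -> flag as unknown
        return pred

    # Two dummy inputs over five known classes.
    logits = torch.tensor([[4.0, 0.1, 0.0, 0.0, 0.0],    # confident known class
                           [0.2, 0.3, 0.2, 0.2, 0.2]])   # diffuse -> unknown
    print(predict_open_set(logits))      # tensor([ 0, -1])

In practice the threshold would be calibrated on held-out data, and stronger open-set methods go beyond raw softmax confidence.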

Speakers


Shu Kong
UMacau, Texas A&M

Deva Ramanan
Carnegie Mellon University

Walter J. Scheirer
University of Notre Dame

Ziwei Liu
Nanyang Technological University

Xiaolong Wang
UC San Diego



Organizers

Please contact Shu Kong with any questions: aimerykong [at] gmail [dot] com


Shu Kong
UMacau, Texas A&M

Yanan Li
Zhejiang Lab

Yu-Xiong Wang
University of Illinois at Urbana-Champaign

Andrew Owens
University of Michigan

Deepak Pathak
Carnegie Mellon University

Carl Vondrick
Columbia University

Abhinav Shrivastava
University of Maryland


Advisory Board

Deva Ramanan
Carnegie Mellon University

Terrance Boult
University of Colorado Colorado Springs

Walter J. Scheirer
University of Notre Dame



Challenge Organizers


Shu Kong
UMacau, Texas A&M

Yanan Li
Zhejiang Lab

Qianqian Shen
Zhejiang University

Yunhan Zhao
UC Irvine

Xiaoyu Yue
University of Sydney

Wenwei Zhang
Shanghai AI Lab

Jiangmiao Pang
Shanghai AI Lab

Pan Zhang
Shanghai AI Lab

Xiaoyi Dong
Shanghai AI Lab

Yuhang Zang
Shanghai AI Lab

Jiaqi Wang
Shanghai AI Lab


Coordinators


Tian Liu
Texas A&M

Yunhan Zhao
UC Irvine




Important Dates and Details



Program Schedule

This section is under construction.

PDT / Time in Vancouver | Event | Title / Presenter
09:00 - 09:30 | Opening remarks | Shu Kong (Texas A&M): Visual Perception via Learning in an Open World
09:30 - 10:00 | Invited talk #1 | Zeynep Akata (University of Tübingen / MPI-INF): Explainability in Deep Learning Through Communication
10:00 - 10:30 | Invited talk #2 | Alex Berg (UCI): Labeling the open world: from active learning to Segment Anything
10:30 - 10:45 | Break / time buffer |
10:45 - 11:15 | Invited talk #3 | Andrew Owens (UMich): Image Forensics as Open World Vision
11:15 - 12:05 | Host remarks for Challenge-1 (ObjDisc) |
12:05 - 13:15 | Lunch break |
13:15 - 13:45 | Invited talk #4 | Aljosa Osep (CMU) & Laura Leal-Taixé (TUM): Learning to Understand the World from Video
13:45 - 14:15 | Invited talk #5 |
14:15 - 14:45 | Invited talk #6 | Serge Belongie (University of Copenhagen): Searching for Structure in Unfalsifiable Claims
14:45 - 15:00 | Break / time buffer |
15:00 - 15:30 | Invited talk #7 | Subhransu Maji (UMass-Amherst): Counting in the Open World
15:30 - 16:20 | Host remarks for Challenge-2 (FMDC) | FMDC hosts (SOCAR): Foundational Model without Descriptive Caption (FMDC) Challenge
16:20 - 17:10 | Talks by Challenge-3 (GRIT) hosts | Tanmay Gupta (AI2), Derek Hoiem (UIUC), Christopher Clark (AI2): The General Robust Image Task (GRIT) Benchmark
17:10 - 17:15 | Closing remarks |