Visual Perception via Learning in an Open World

The 3rd workshop on Open World Vision

Location: West 118-120

Time: 9:00am - 5:15pm Vancouver Local Time (PDT), June 18, 2023

The recorded workshop is released on YouTube:

in conjunction with CVPR 2023, Vancouver, Canada


Visual perception is indispensable for numerous applications, spanning transportation, healthcare, security, commerce, entertainment, and interdisciplinary research. Visual perception algorithms developed in a closed-world setup often generalize poorly to the real open world, which contains situations that are dynamic, vast, unpredictable, and never seen before. Algorithms must therefore be developed for the open world and address its complexities, such as recognizing unknown objects, debiasing imbalanced data distributions, leveraging multimodal signals, and learning efficiently from few examples. Moreover, today's most powerful visual perception models are pretrained in an open-world manner, e.g., on web-scale data consisting of images, language, and so on. Now is an excellent time to study Visual Perception via Learning in an Open World (VPLOW). We therefore invite you to our VPLOW workshop, where multiple speakers and challenge competitions will cover a variety of VPLOW topics. We hope the workshop stimulates fruitful discussions.

You might be interested in our previous workshops:


Topics of interest include, but are not limited to:

  • data: long-tailed distribution, open-set, unknowns, streaming data, biased data, unlabeled data, anomaly, multi-modality, etc.
  • concepts: open-vocabulary, ontology/taxonomy of object classes, evolving class ontology, etc.
  • learning: X-shot learning, Y-supervised learning, lifelong/continual learning, domain adaptation/generalization, open-world learning, multimodal pretraining, prompt learning, foundation model tuning, etc.
  • social impact: safety, fairness, real-world applications, inter-disciplinary research, etc.
  • misc: datasets, benchmarks, interpretability, robustness, generalization, etc.


Let's consider the following motivational examples.

  • Open-world data follows a long-tailed distribution. Real-world tasks often emphasize rarely seen data, yet a model trained on long-tailed data can perform poorly on rare or underrepresented data. For example, a visual recognition model can misclassify underrepresented minorities and make unethical predictions (ref. case1, case2).
  • The open world contains unknown examples. Largely due to the long-tailed nature of the data distribution, visual perception models are invariably confronted by unknown examples in the open world. Failing to detect the unknowns can cause serious issues. For example, a Tesla Model 3 did not identify an unknown overturned truck and crashed into it (ref. case).
  • The open world requires learning with evolving data and labels. The world of interest changes over time, e.g., driving scenes (in different cities and under different weather) and search engines ("apple" means different things today than it did 20 years ago). This means that the data distribution and semantics are continually changing and evolving. How should we address distribution shift and concept drift?


Shu Kong
Texas A&M

Subhransu Maji

Serge Belongie
University of Copenhagen

Lior Wolf
Tel Aviv University

Zeynep Akata
University of Tübingen / MPI-INF
Aljosa Osep
Tal Shaharabany
Tel Aviv University


Please contact Shu Kong with any questions: shu [at] tamu [dot] edu

Shu Kong
Texas A&M

Yu-Xiong Wang
University of Illinois at Urbana-Champaign
Andrew Owens
University of Michigan

Deepak Pathak
Carnegie Mellon University

Carl Vondrick
Columbia University

Abhinav Shrivastava
University of Maryland

Deva Ramanan
Carnegie Mellon University

Terrance Boult
University of Colorado Colorado Springs

Challenge Organizers

Tanmay Gupta
Allen Institute for AI (AI2)

Derek Hoiem
University of Illinois at Urbana-Champaign
Aniruddha (Ani) Kembhavi
Allen Institute for AI (AI2)

Amita Kamath
Allen Institute for AI (AI2)

Pulkit Kumar
University of Maryland

University of Maryland


Yanan Li
Zhejiang Lab

Important Dates and Details

Please go to each challenge's website for its exact dates!
  • Submission deadline for Challenge-1: Obj-Disc: June 10, 2023 at 11:59pm PST.
  • Submission deadline for Challenge-2: FMDC: June 15, 2023 at 11:59pm PST.
  • Submission deadline for Challenge-3: GRIT: June 10, 2023 at 11:59pm PST.
  • Workshop date: June 18, 2023

Program Schedule

PDT / Time in Vancouver
09:00 - 09:30
Opening remarks
Shu Kong, Texas A&M
Visual Perception via Learning in an Open World
09:30 - 10:00
Invited talk #1
Zeynep Akata, University of Tübingen / MPI-INF
Explainability in Deep Learning Through Communication
10:00 - 10:30
Invited talk #2
Alex Berg, UCI
Labeling the open world: from active learning to Segment Anything
10:30 - 10:45
Break / time buffer
10:45 - 11:15
Invited talk #3
Andrew Owens, UMich
Image Forensics as Open World Vision
11:15 - 12:05
Host remarks for Challenge-1 Obj-Disc
12:05 - 13:15
Lunch break
13:15 - 13:45
Invited talk #4
Aljosa Osep, CMU & Laura Leal-Taixé, TUM
Learning to Understand the World from Video
14:15 - 14:45
Invited talk #6
Serge Belongie, University of Copenhagen
Searching for Structure in Unfalsifiable Claims
14:45 - 15:00
Break / time buffer
15:00 - 15:30
Invited talk #7
Subhransu Maji, UMass-Amherst
Counting in the Open World
15:30 - 16:20
Host remarks for Challenge-2 FMDC
Foundational Model without Descriptive Caption (FMDC) Challenge
16:20 - 17:10
Talks by Challenge-3 GRIT hosts
The General Robust Image Task (GRIT) Benchmark
Tanmay Gupta, AI2
Derek Hoiem, UIUC
Christopher Clark, AI2
17:10 - 17:15
Closing remarks