Amazon-Berkeley Objects: A Large-Scale Dataset for 3D Object Understanding


Collecting large amounts of high-quality 3D annotations (such as voxels or meshes) for individual real-world objects poses a challenge. One way around this problem is to focus only on synthetic, computer-aided design models. This has the advantage that the data is large in scale, but most objects are untextured and there is no guarantee that the objects exist in the real world. As a result, 3D reconstruction methods trained on this data work quite well on clear-background renderings of synthetic objects, but do not necessarily generalize to real images or more complex object geometries.

In order to tie 3D annotations to the real world, other datasets annotate real images with exact, pixel-aligned 3D models. These datasets have helped bridge some of the synthetic-to-real domain gap; however, they are relatively small (in number of unique 3D models and categories), likely due to the difficulty of finding images that correspond to exact matches of 3D models. The provided 3D models are also untextured, so the annotations in these datasets are largely used for shape- or pose-based tasks, rather than tasks such as material prediction.

The goal of this work is to release a new, realistic dataset for 3D object understanding grounded by real images and objects. The dataset is derived from product listings, a natural data source for object-centric multi-view images. Overall, Amazon-Berkeley Objects (ABO) contains 147,702 product listings associated with 398,212 unique images and up to 18 different attribute annotations. It also includes "360-view" turntable-style images for 8,222 products and has 7,953 artist-designed 3D meshes. Because the 3D models are artist-designed, they are equipped with high-resolution, physically-based materials that allow for photorealistic rendering.

We use this realistic, object-centric 3D dataset to measure the domain gap for single-view 3D reconstruction networks trained on synthetic objects. We also use multi-view images from the dataset to measure the robustness of state-of-the-art metric learning approaches to different camera viewpoints. Finally, leveraging the physically-based rendering materials in ABO, we perform single- and multi-view material estimation for a variety of complex, real-world geometries. The full dataset is available for download at:


Jasmine Collins (

Thomas Dideriksen (

Matthieu Guillaumin (

Jitendra Malik (