Visualizing LiDAR data from Waymo Open Dataset

Kushal B Kusram
2 min read · Jan 28, 2021

My previous article about the Waymo Open Dataset discussed how to set up the environment and extract camera images and their corresponding labels. This article is a follow-up that outlines how to extract and visualize LiDAR data. However, since the codebase has changed slightly since then, I will walk through setting up the environment again before extracting the data. You can treat this article as updated documentation for the current version of the code.

Prerequisites

You need access to the Waymo Open Dataset storage bucket hosted on Google Cloud Platform; you will need it to download individual files.

  1. Sign in with your Google credentials here, and you will receive an email confirming that you now have access to the dataset, which includes the Google Cloud Storage bucket hosted here.
  2. Next, set up the environment: create a Python virtual environment for your project. Make sure python3-dev, python3-pip, and virtualenv are installed as well.
  3. Install gsutil and configure the utility with your Google account. For the project ID, create a free project on GCP and use its ID.
  4. Clone the WaymoDataToolkit repository.
  5. pip install waymo-open-dataset-tf-2-1-0
  6. pip install -r requirements.txt should install the remaining required libraries.

Extract Camera and LiDAR Data

The repository includes extract.py, which provides an abstraction for retrieving and extracting camera and range data.

The first step is to initialize the WaymoDataToolkit with a URL; the following does the job:

dataset = WaymoDataToolkit.WaymoDataToolkit(url)
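For context, here is a minimal sketch of that call. It assumes the URL is a gs:// path to a single segment .tfrecord in the Waymo Open Dataset bucket and that the toolkit is importable as WaymoDataToolkit; the bucket version and segment name below are placeholders, not values taken from the toolkit's documentation.

import WaymoDataToolkit

# Placeholder segment path; list real segments with
# `gsutil ls gs://waymo_open_dataset_v_1_2_0/training/`
url = ("gs://waymo_open_dataset_v_1_2_0/training/"
       "segment-<segment_id>_with_camera_labels.tfrecord")

dataset = WaymoDataToolkit.WaymoDataToolkit(url)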

The toolkit retrieves the file when you call the following method on the object you just created:

dataset.dataRetriever()

The retrieved file then needs to be extracted into images and labels, using the variables defined previously:

dataset.dataExtractor()

By default, the toolkit assumes the following directories are present in the root of the project:

data/camera/images, data/camera/labels, and data/range
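Putting these steps together, a minimal end-to-end extract script might look like the sketch below. It only combines the calls already shown; creating the directories up front is a precaution on my part, since whether the toolkit creates them itself is not documented here, and the URL is the same placeholder as before.

import os
import WaymoDataToolkit

# Create the directories the toolkit expects at the project root.
for path in ("data/camera/images", "data/camera/labels", "data/range"):
    os.makedirs(path, exist_ok=True)

url = "gs://..."  # gs:// path to a segment .tfrecord, as above

dataset = WaymoDataToolkit.WaymoDataToolkit(url)
dataset.dataRetriever()   # download the segment from the bucket
dataset.dataExtractor()   # write images, labels, and range data under data/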

The images are saved as PNGs, with their corresponding labels saved in text format. The LiDAR data is converted from the raw range images to point clouds using Waymo's utility function and saved as a raw byte list of NumPy arrays consisting of five point clouds, one per LiDAR sensor.
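For reference, the conversion itself presumably comes from the waymo-open-dataset package rather than the toolkit; a sketch of that step, following the package's official tutorial for the tf-2-1-0 release, looks roughly like this (the local .tfrecord path is a placeholder):

import tensorflow as tf
from waymo_open_dataset import dataset_pb2 as open_dataset
from waymo_open_dataset.utils import frame_utils

# Read one frame from a downloaded segment (placeholder path).
segment = tf.data.TFRecordDataset("data/segment.tfrecord", compression_type="")
frame = open_dataset.Frame()
frame.ParseFromString(bytearray(next(iter(segment)).numpy()))

# Parse the range images and convert them to point clouds;
# `points` is a list of five (N, 3) arrays, one per LiDAR sensor.
range_images, camera_projections, range_image_top_pose = (
    frame_utils.parse_range_image_and_camera_projection(frame))
points, cp_points = frame_utils.convert_range_image_to_point_cloud(
    frame, range_images, camera_projections, range_image_top_pose)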

Visualize Camera and LiDAR Data

The repository includes visualize.py, which provides you with example code that visualizes camera and range data. The toolkit uses OpenCV for images and Open3D for point cloud data.

Visualizing Camera Data as Images
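The actual example lives in visualize.py; as a rough stand-in, a minimal OpenCV sketch for stepping through the extracted PNGs could look like this (the glob over data/camera/images is my assumption about where the files end up, based on the defaults above):

import glob
import cv2

# Show each extracted camera image; press any key to advance.
for image_path in sorted(glob.glob("data/camera/images/*.png")):
    image = cv2.imread(image_path)
    cv2.imshow("Waymo camera image", image)
    cv2.waitKey(0)

cv2.destroyAllWindows()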
Visualizing LiDAR Data as a Point Cloud
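Again, visualize.py holds the real example. Assuming the five per-sensor arrays for a frame can be loaded back as a list of (N, 3) NumPy arrays (the file name and serialization below are placeholders, not the toolkit's documented format), displaying them with Open3D reduces to concatenating the points and handing them to a PointCloud:

import numpy as np
import open3d as o3d

# Placeholder load: substitute however the toolkit stored the frame's
# five per-LiDAR (N, 3) arrays under data/range.
points = np.load("data/range/frame_0.npy", allow_pickle=True)

# Merge all five sensors into one cloud and render it.
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(
    np.concatenate(points, axis=0).astype(np.float64))
o3d.visualization.draw_geometries([pcd])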

What Next?

This toolkit aims to give researchers a jump start on exploring Waymo data in their experiments and on loading it into their existing algorithms quickly. Hopefully, I will find time to work on machine learning algorithms based on this annotated data.
