2D Semantic Segmentation

Learn how to create an example 2D semantic segmentation project using the KITTI open source dataset


This quickstart will walk you through uploading data into Aquarium, starting with a standard open source dataset for a 2D semantic segmentation task.

Before you get started, there are also pages with background information on key concepts in Aquarium.

The main steps we will cover are:

  • Create a project within Aquarium

  • Upload labeled masks

  • Upload inference masks

By the end of this guide, you should have a good idea of how to upload your data into Aquarium and explore your semseg dataset!


To follow along with this quickstart guide, here are some things you'll need:

  • Download the quickstart dataset

    • The dataset contains the raw images, masks, and an end-to-end example upload script

  • Ensure you have installed the latest Aquarium client

    • pip install aquariumlearning

  • A development environment running a version of Python 3.6+

Python Client Library

The aquariumlearning package requires Python >= 3.6.

Aquarium provides a python client library to simplify integration into your existing ML data workflows. In addition to wrapping API requests, it also handles common needs such as efficiently encoding uploaded data or using disk space to work with datasets larger than available system memory.

You can install and use the library using the following code block:

!pip install aquariumlearning

import aquariumlearning as al
al_client = al.Client()

To get your API key, you can follow these instructions.

KITTI Dataset Ingestion

This quickstart leverages the KITTI open source dataset. The dataset contains images captured from a car driving around a mid-size city, with up to 15 cars and 30 pedestrians visible per image.

For this 2D semantic segmentation task, we'll be doing pixel-level classification. The KITTI dataset has many labeled classes and a smaller number of predicted classes. The predicted classes are:

  • road

  • sidewalk

  • building

  • wall

  • fence

  • pole

  • traffic light

  • traffic sign

  • vegetation

  • terrain

  • sky

  • person

  • rider

  • car

  • truck

  • bus

  • train

  • motorcycle

  • bicycle

You can download the quickstart dataset (334MB) at this link. Once it is downloaded and unzipped, you should be able to open the 2d_semseg_ingest.py file, add your API key, and run the file to upload data to your org! The quickstart folder is formatted as follows:

# Overall data structure
├── labels
│   ├── 000000_10.png
│   ├── 000001_10.png
│   └── ...
├── inferences
│   ├── 000000_10.png
│   ├── 000001_10.png
│   └── ...
├── rgb_images
│   ├── 000000_10.png
│   ├── 000000_10.webp
│   ├── 000001_10.png
│   ├── 000001_10.webp
│   └── ...
├── semseg_img_list.json
└── 2d_semseg_ingest.py

# in the rgb_images folder, the .webp images are compressed images 
# that are used as preview urls

# All images are mirrored online at
# https://storage.googleapis.com/aquarium-public/quickstart/semseg/inference_masks/inferences/<image_name>.png
# https://storage.googleapis.com/aquarium-public/quickstart/semseg/rgb_images/rgb_images/<image_name>.png
# https://storage.googleapis.com/aquarium-public/quickstart/semseg/label_masks/labels/<image_name>.png

The zip also includes a file named semseg_img_list.json. This file lists the names of all the images that will be uploaded, and exists as a utility to make it easier to loop through the data during upload.
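Assuming semseg_img_list.json is a flat JSON array of image file names (as in the quickstart download), the loop pattern for deriving frame ids and public bucket URLs looks like this sketch; the stand-in file written below is only so the snippet runs on its own:

```python
import json
import os

# write a tiny stand-in for semseg_img_list.json so this sketch is self-contained;
# the real file ships in the quickstart download
with open("semseg_img_list.json", "w") as f:
    json.dump(["000000_10.png", "000001_10.png"], f)

with open("semseg_img_list.json") as f:
    file_names = json.load(f)

RGB_DIR = "https://storage.googleapis.com/aquarium-public/quickstart/semseg/rgb_images/rgb_images/"

for img_filepath in file_names:
    # strip any directory prefix and the .png extension to get a frame id
    _, filename = os.path.split(img_filepath)
    frame_id = filename.split(".png")[0]
    # build the public URL for the full-quality image
    image_url = RGB_DIR + f"{frame_id}.png"
    print(frame_id, image_url)
```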

Uploading the Data

There is a complete upload script included in the downloadable quickstart dataset, as well as here in the docs. The following sections break out individual pieces of the process to discuss them in depth, but for a complete end-to-end example, please refer to the whole script.

Create a Project


Projects are the highest level grouping in Aquarium and they allow us to:

  • Define a specific core task - in this case, segmentation of road images (2D Semantic Segmentation)

  • Define a specific ontology/class map

  • Hold multiple datasets for a given task

You can click here for more information on defining projects and best practices!

In the examples below, you'll see a reference to a ./classnames.json file. This json is a simple list of the classnames we will be using in our project, and it is available in the downloadable quickstart files.

# provided classname file
CLASSNAMES_FILE = 'classnames.json'

# load in your classname file
with open(CLASSNAMES_FILE) as f:
    classnames = json.load(f)
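For readers without the download, the expected shape of classnames.json is just a flat JSON list whose order defines the class ids. Here is a stand-in version with an illustrative, truncated class list (the real file contains all 19 predicted classes):

```python
import json

# illustrative subset of the KITTI predicted classes; names only, order matters
classnames = ["road", "sidewalk", "building", "car", "sky"]
with open("classnames.json", "w") as f:
    json.dump(classnames, f)

# load it back the same way the ingest script does
with open("classnames.json") as f:
    loaded = json.load(f)

# the index of each name is the pixel value used for that class in the masks
print(loaded)
```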

Base Images

For a semantic segmentation use case, the image you are segmenting is uploaded for reference and rendering.

In the code block below, we highlight the function .add_image() used to add both the original full quality image that is being segmented as well as the compressed version of that image. The compressed version of the image is uploaded with the preview_url parameter and allows for quicker and more efficient viewing of the image when using Aquarium.

The function .add_image() is called on an object of type LabeledFrame. We will cover this in more detail in the next sections. See the complete upload script at the bottom of this guide for an example of a semantic segmentation ingestion script.

The following code block highlights the .add_image() function:

# we go into detail talking about a frame in later sections
# SENSOR_ID represents the sensor value for this image, can be used for querying
# add the full quality image along with its compressed .webp preview
frame.add_image(sensor_id=SENSOR_ID, image_url=image_url, preview_url=preview_url)
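The quickstart already ships the compressed .webp previews, but if you ever need to generate them for your own data, a Pillow conversion like the following works. This is a minimal sketch with hypothetical file names, using a generated solid-color image in place of a real photo:

```python
from PIL import Image

# stand-in for a real full-quality photo (hypothetical file names throughout)
Image.new("RGB", (128, 96), color=(30, 120, 200)).save("example_full.png")

# save a compressed webp copy to serve as the preview_url asset
Image.open("example_full.png").save("example_preview.webp", "WEBP", quality=60)

# verify the preview decodes at the same resolution
preview_size = Image.open("example_preview.webp").size
print(preview_size)
```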

Labeled Datasets

Aquarium supports both image masks and numpy arrays to represent labeled data, where each pixel value is a number that maps to a class id. For example, say your segmentation model distinguishes 6 unique classes. Your mask would be a greyscale image or a numpy array where each pixel holds a value from 0-5, and each numeric value maps to a class according to the order in which you defined your class map.
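As a concrete sketch of that encoding, here is a tiny hand-built 6-class mask in numpy (the class names and pixel layout are illustrative, not from the KITTI data):

```python
import numpy as np

# hypothetical 6-class map; a class's index in this list is its pixel value
classnames = ["road", "sidewalk", "building", "pole", "car", "sky"]

# a tiny 3x4 greyscale "mask": each pixel holds a class id from 0-5
mask = np.array([
    [5, 5, 5, 5],   # sky along the top
    [2, 2, 4, 4],   # building on the left, a car on the right
    [0, 0, 0, 1],   # road with a strip of sidewalk
], dtype=np.uint8)

# recover the class name for any pixel from its value
assert classnames[mask[0, 0]] == "sky"

# count pixels per class
counts = {name: int((mask == i).sum()) for i, name in enumerate(classnames)}
print(counts)
```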

For the quickstart example, we use image masks as our labels! To upload the image masks, you first need to create a LabeledDataset object to represent all your labels. Then we create a LabeledFrame object that we add all of our labels to. We'll break down the code in more detail below, but note that in practice (as you'll see in the completed upload script) data is usually uploaded in some kind of loop:

# define your dataset object
dataset = al.LabeledDataset()

# define a frame for each image we are working with
# each frame needs a unique id, image name is usually a good option
frame_id = filename.split('.png')[0]
# create the frame object that we will add labels to
frame = al.LabeledFrame(frame_id=frame_id)

# now we add the label mask
# when you upload labels in Aquarium, each label needs a unique id
label_id = f'{frame_id}_gt'
# in the quickstart we upload greyscale images 
# where each pixel value maps to a class id
# we format the image URL for the specific image
# Aquarium has all the images publicly available in a bucket
label_mask_path = os.path.join(LABEL_MASK_DIR, f'{frame_id}.png').replace("\\", "/")
# once the URL is formatted, calling the below function sets your label to the greyscale mask image
frame.add_label_2d_semseg(sensor_id=SENSOR_ID, label_id=label_id, mask_url=label_mask_path)

# lastly, we add the new frame to the dataset
dataset.add_frame(frame)

This is an example of the greyscale image that is being uploaded to Aquarium:

Inferences


Now that we have created a Project and a LabeledDataset, let's also upload those model inferences. Inferences, like labels, must be matched to a frame. For each labeled frame in your dataset, we will create an inference frame and then assign the appropriate inferences to that inference frame.

Inferences, just like labels, can be represented as a greyscale image or a numpy array. The same concepts discussed above for labels apply to inferences.

Creating InferencesFrame objects and adding inferences will look very similar to creating LabeledFrames and adding labels to them.

Important Things To Note:

  • Each InferencesFrame must exactly match to a LabeledFrame in the dataset. This is accomplished by ensuring the frame_id property is the same between corresponding LabeledFrames and InferencesFrames.

  • It is possible to assign inferences to only a subset of frames within the overall dataset (e.g. just the test set).

Breaking down some of what you'll see in the provided ingest script:

# just like we defined a labeled dataset
# we define an Inferences set to hold our predictions
inferences = al.Inferences()

# define the inference frame object
# make sure the frame_id matches what you used for the LabeledFrame object
inf_frame = al.InferencesFrame(frame_id=frame_id)

# just like we created a label id, we create an inference label id
# the individual inference id must also be unique 
inf_label_id = f'{frame_id}_{INFERENCE_SET_NAME}_inf'
# create the URL path to point to the inference mask
orig_inf_mask_path = os.path.join(INFERENCE_MASK_DIR, f'{frame_id}.png')
# add an inference mask using below function
inf_frame.add_inference_2d_semseg(sensor_id=SENSOR_ID, label_id=inf_label_id, mask_url=orig_inf_mask_path)

# add the new inference frame to your inference set
inferences.add_frame(inf_frame)

At this point we have created a project in Aquarium, and uploaded our labels and inferences. The data has been properly formatted, but now as our final step, let's use the client to actually upload the data to our project!

Submit the Datasets!

Now that we have the datasets, using the client we can upload the data:

# we run sanity checks as well to make sure the dataset and inference sets
# aren't already there before we create new ones, in case you are testing
# multiple uploads
if not al_client.dataset_exists(PROJECT_NAME, DATASET_NAME):
    print("Creating Dataset")
    al_client.create_dataset(
        PROJECT_NAME, DATASET_NAME, dataset=dataset, embedding_distance_metric='cosine'
    )

if not al_client.inferences_exists(PROJECT_NAME, DATASET_NAME, INFERENCE_SET_NAME):
    print("Submitting Inferences")
    al_client.create_inferences(PROJECT_NAME, DATASET_NAME, inferences_id=INFERENCE_SET_NAME, inferences=inferences, embedding_distance_metric='cosine')

With the code snippet above, your data will start uploading! Now we can monitor the status within the UI.

Monitoring Your Upload

When you start an upload, Aquarium performs some crucial tasks like indexing metadata and generating embeddings for your dataset, so it may take a little while before you can fully view your dataset. You can monitor the status of your upload in the application, as well as in your console after running your upload script. To view your upload status, log into Aquarium and click on your newly created project, then navigate to the "Streaming Uploads" tab, where you can view the status of your dataset uploads.

Once your upload is complete, you'll see a view like this under the "Datasets" tab:

And congrats!! You've uploaded your data into Aquarium! You're now ready to start exploring your data in the application!

Completed Upload Example Script

Putting it all together, here is the entire script you can use to replicate this project. You can download this script and the necessary data here. If you need help getting your API key, you can follow these instructions.

#!/usr/bin/env python3
import json
import os
import string
import random
import aquariumlearning as al

# setting the basics about our data
SENSOR_ID = 'img0'
IMG_WIDTH = 1242

# set up aquarium client
al_client = al.Client()

# Project names have to be unique so we add a random string to the end
# you can name the project whatever works for you
PROJECT_NAME = 'Semantic_Segmentation_'+''.join(random.choices(string.ascii_lowercase, k=5))
DATASET_NAME = 'initial_labels_segmentation'
INFERENCE_SET_NAME = 'initial_inferences'
# public bucket URLs that host the images and masks
RGB_DIR = 'https://storage.googleapis.com/aquarium-public/quickstart/semseg/rgb_images/rgb_images/'
INFERENCE_MASK_DIR = 'https://storage.googleapis.com/aquarium-public/quickstart/semseg/inference_masks/inferences/'
LABEL_MASK_DIR = 'https://storage.googleapis.com/aquarium-public/quickstart/semseg/label_masks/labels/'

# read in additional data
DATA_PATH = 'semseg_img_list.json'
CLASSNAMES_FILE = 'classnames.json'

# load in your classname file
with open(CLASSNAMES_FILE) as f:
    classnames = json.load(f)

# creating a list of all the image files
# read in file names we'll be working with
with open(DATA_PATH) as f:
    file_names = json.load(f)

# define your dataset and inference set
dataset = al.LabeledDataset()
inferences = al.Inferences()

# for each image file available
for img_idx, img_filepath in enumerate(file_names):
    # create frame id from image name
    dir_path, filename = os.path.split(img_filepath)
    frame_id = filename.split('.png')[0]
    frame = al.LabeledFrame(frame_id=frame_id)
    inf_frame = al.InferencesFrame(frame_id=frame_id)

    # pull the full image URL
    # the quickstart provides the images to you directly, but we upload into Aquarium with URLs
    # we have the images hosted in a bucket with public URLs you can leverage
    # The URL for each image will look something like: 
    # https://storage.googleapis.com/aquarium-public/quickstart/semseg/rgb_images/rgb_images/000000_10.png
    image_url = os.path.join(RGB_DIR, f'{frame_id}.png').replace("\\", "/")
    preview_url = os.path.join(RGB_DIR, f'{frame_id}.webp').replace("\\", "/")

    # add full color image for reference photo
    frame.add_image(sensor_id=SENSOR_ID, image_url=image_url, preview_url=preview_url)
    frame.add_user_metadata('split', 'training')

    # now we add the label mask
    # make label id, we create it by adding _gt to the frame id
    label_id = f'{frame_id}_gt'
    label_mask_path = os.path.join(LABEL_MASK_DIR, f'{frame_id}.png').replace("\\", "/")
    frame.add_label_2d_semseg(sensor_id=SENSOR_ID, label_id=label_id, mask_url=label_mask_path)

    # the inference label id is the frame id with the inference set name and _inf as a suffix
    inf_label_id = f'{frame_id}_{INFERENCE_SET_NAME}_inf'
    orig_inf_mask_path = os.path.join(INFERENCE_MASK_DIR, f'{frame_id}.png')
    inf_frame.add_inference_2d_semseg(sensor_id=SENSOR_ID, label_id=inf_label_id, mask_url=orig_inf_mask_path)

    # add the completed frames to the dataset and inference set
    dataset.add_frame(frame)
    inferences.add_frame(inf_frame)


if not al_client.project_exists(PROJECT_NAME):
    print("Creating Project")
    al_client.create_project(PROJECT_NAME, al.LabelClassMap.from_classnames(classnames), primary_task="2D_SEMSEG")

if not al_client.dataset_exists(PROJECT_NAME, DATASET_NAME):
    print("Creating Dataset")
    al_client.create_dataset(
        PROJECT_NAME, DATASET_NAME, dataset=dataset, embedding_distance_metric='cosine'
    )

if not al_client.inferences_exists(PROJECT_NAME, DATASET_NAME, INFERENCE_SET_NAME):
    print("Submitting Inferences")
    al_client.create_inferences(PROJECT_NAME, DATASET_NAME, inferences_id=INFERENCE_SET_NAME, inferences=inferences, embedding_distance_metric='cosine')

What Now?

Now that you have uploaded data, time to explore and understand your data better using Aquarium!
