# Labeled Datasets

## Overview

Labeled datasets contain your ground truth labels. A labeled dataset belongs to a [Project](https://legacy-docs.aquariumlearning.com/aquarium/getting-started/key-concepts#projects) and consists of multiple [labeled frames](https://legacy-docs.aquariumlearning.com/aquarium/getting-started/key-concepts#labeled-frame).

A labeled frame is one logical "frame" of data, such as an image from a camera stream. It can contain one or more media/sensor inputs, zero or more ground truth labels, and arbitrary user-provided metadata.

{% hint style="info" %}
For example, in a 2D classification use case, a frame would contain the image, its labels, and all associated metadata. \
\
In a 3D object detection use case, a frame can contain images, point clouds, labels, and metadata.
{% endhint %}

For real examples of uploading labeled data, please look at our [<mark style="color:blue;">quickstart guides</mark>](https://legacy-docs.aquariumlearning.com/aquarium/getting-started/quickstart-guides)!

## Prerequisites to Uploading Labeled Data

**In order to ensure the following steps will work smoothly, this guide assumes you have already:**

* [Created a project](https://legacy-docs.aquariumlearning.com/aquarium/integrating-with-aquarium/creating-projects-in-aquarium)
* Installed the Aquarium SDK
* Obtained URLs for your raw data (images, point clouds, etc.)
  * See our data sharing docs for more details on URL requirements.
* Obtained access to the labels for your raw data

To view your labeled data once uploaded, you will have to make sure that you have selected and set up the appropriate data sharing method for your team.

## Creating and Formatting Your Labeled Data

To ingest a labeled dataset, there are two main objects you'll work with:

* [LabeledFrame](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame)
  * An object containing all relevant information for a single frame/image. Data URLs, labels, metadata, etc.
* [LabeledDataset](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledDataset)
  * A collection of LabeledFrames.

For each datapoint, we create a LabeledFrame and add it to a LabeledDataset; that LabeledDataset is what we upload into Aquarium.

**This usually means looping through your data and creating LabeledFrames to add to the LabeledDataset object.**

{% hint style="info" %}
If you have generated your own embeddings and want to use them during your labeled data uploads, please also see [this](#custom-embeddings) section for additional guidance!
{% endhint %}

Defining these objects looks like this:

```python
labeled_dataset = al.LabeledDataset()

for frame_id, frame_data in my_list_of_data:
    # Frames must have a unique frame_id
    frame = al.LabeledFrame(frame_id=frame_id)
    ...
    labeled_dataset.add_frame(frame)
```

Once you've defined your frame, you need to associate some data with it! The next sections show how to add your main input data to the frame (images, point clouds, etc.), and then how to attach ground truth labels to that frame.

### Adding Data to Your Labeled Frame

Each LabeledFrame in your dataset can contain one or more input pieces of data. In many computer vision tasks, this may be a single image. In a robotics or self-driving task, this may be a full suite of camera images, lidar point clouds, and radar scans.&#x20;

Here are some common data types, their expected formats, and how to work with them in Aquarium:

{% tabs %}
{% tab title="Image" %}
Your ML task utilizes images and you would like to add an image to your labeled data

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_image)

**Example Usage**

```python
frame.add_image(
    image_url='https://storage.googleapis.com/aquarium-public/datasets/rareplanes/train/PS-RGB_tiled/96_10400100096C2500_tile_936.png',
    preview_url='',
    date_captured='2020-07-10 15:00:00.000',
    width=1280,
    height=720
)
```

\
**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="348">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>image_url</td><td><em>string</em> - URL to load the image by</td></tr><tr><td>preview_url (Optional)</td><td><em>string</em> - URL to a compressed form of the image for faster loading in browsers, must be same pixel dimensions as original image</td></tr><tr><td>date_captured (Optional)</td><td>ISO formatted date-time string</td></tr><tr><td>width (Optional)</td><td><em>int</em> - will be inferred otherwise</td></tr><tr><td>height (Optional)</td><td><em>int</em> - will be inferred otherwise</td></tr></tbody></table>
{% endtab %}

{% tab title="Metadata" %}
{% hint style="warning" %}
Metadata is indexed on Aquarium's side, so make sure you aren't sharing any private information.
{% endhint %}

You would like to attach arbitrary metadata (for example, a deployment ID) to your labeled frame for filtering and analysis

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.InferencesFrame.add_user_metadata)

**Example Usage**

```python
labeled_frame.add_user_metadata(
    key='deployment_id',
    val=value
)

# In the case of nullable values, you can also provide an explicit type
labeled_frame.add_user_metadata(
    key='nullable_field',
    val=maybe_null,
    val_type='int'
)
```

\
**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="348">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>key</td><td><em>string</em> - name of the metadata field</td></tr><tr><td>val</td><td>the metadata value to associate with the key</td></tr><tr><td>val_type (Optional)</td><td><em>string</em> - explicit type of the value (e.g. 'int'), useful when the value may be null</td></tr></tbody></table>
{% endtab %}

{% tab title="Geospatial" %}
{% hint style="warning" %}
Geospatial metadata is indexed on Aquarium's side, so make sure you aren't sharing any private information.
{% endhint %}

Your ML task utilizes geospatial data and you want to add the data as context for analysis and filtering

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_geo_latlong_data)

**Example Usage**

```python
# EPSG:4326 WGS84 Latitude Longitude coordinates
labeled_frame.add_geo_latlong_data(
    lat = 37.044030, 
    lon = -112.526130
)
```

\
**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="348">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>lat</td><td><em>float</em> - latitude of the geo location</td></tr><tr><td>lon</td><td><em>float</em> - longitude of the geo location</td></tr></tbody></table>
{% endtab %}

{% tab title="Audio" %}
{% hint style="info" %}
If your model also works against spectrograms, you can provide both an audio data input and an image. The Aquarium UI will then present both alongside each other.
{% endhint %}

Aquarium supports audio files that are natively playable in browsers. For maximum compatibility, we recommend providing .mp3 files.

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_audio)

**Example Usage**

```python
labeled_frame.add_audio(
    # A URL to load the mp3 file from
    audio_url='',
    # Optional: ISO formatted date-time string
    date_captured=''
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="348">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>audio_url</td><td><em>string</em> - URL to load the mp3 file from</td></tr><tr><td>date_captured</td><td><em>string</em> - ISO formatted date-time string</td></tr></tbody></table>
{% endtab %}

{% tab title="Point Cloud" %}
Because point cloud formats are less standardized than image data, we currently support two formats. Please reach out if you use a different representation; we'd be more than happy to support your in-house format.

### PCL / PCD

Aquarium supports the `*.pcd` file format used by the PCL library, including the binary and compressed binary encodings. Numeric values for the following column names are expected: x, y, z, intensity (optional), range (optional).

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.UnlabeledFrame.add_point_cloud_pcd)

**Example Usage**

```python
labeled_frame.add_point_cloud_pcd(
    pcd_url='',
    coord_frame_id="",
    date_captured=''
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="348">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>pcd_url</td><td><em>string</em> - URL to load the point cloud by</td></tr><tr><td>coord_frame_id (Optional)</td><td><em>string</em> - If your point cloud is relative to a specific coordinate frame, you can reference it by name here</td></tr><tr><td>date_captured (Optional)</td><td><em>string</em> - ISO formatted date-time string</td></tr></tbody></table>
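For reference, here is a sketch of a minimal ASCII `.pcd` payload with the columns described above (real pipelines typically write binary PCDs via the PCL or open3d libraries, but the header layout is the same):

```python
def ascii_pcd(points):
    """Render points [(x, y, z, intensity), ...] as a minimal ASCII .pcd string.
    A sketch of the expected column layout, not part of the Aquarium SDK."""
    header = "\n".join([
        "# .PCD v0.7 - Point Cloud Data file format",
        "VERSION 0.7",
        "FIELDS x y z intensity",
        "SIZE 4 4 4 4",
        "TYPE F F F F",
        "COUNT 1 1 1 1",
        f"WIDTH {len(points)}",
        "HEIGHT 1",
        "VIEWPOINT 0 0 0 1 0 0 0",
        f"POINTS {len(points)}",
        "DATA ascii",
    ])
    rows = "".join(f"{x} {y} {z} {i}\n" for x, y, z, i in points)
    return header + "\n" + rows
```

You would then write this string to a file, host it, and pass its URL as `pcd_url`.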

### KITTI-like binary files

Similar to the raw KITTI lidar formats, we can also take in raw, dense binary files of little-endian values. This is in many ways more fragile, but also requires no third party libraries.&#x20;

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.UnlabeledFrame.add_point_cloud_bins)

**Example Usage**

```python
frame.add_point_cloud_bins(
    point_cloud_url='',
    intensity_url='',
    range_url='',
    coord_frame_id="",
    date_captured=''
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="348">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>point_cloud_url</td><td><p><em>string</em> - URL for the point positions:</p><pre><code>float32 [x1, y1, z1, x2, y2, z2, ...]
</code></pre></td></tr><tr><td>intensity_url</td><td><p><em>string</em> - URL for the point intensities:</p><pre><code>unsigned int32 [i1, i2, i3, ...]
</code></pre></td></tr><tr><td>range_url</td><td><p><em>string</em> - URL for the point ranges:</p><pre><code>float32 [r1, r2, r3, ...]
</code></pre></td></tr><tr><td>coord_frame_id (Optional)</td><td><em>string</em> - If your point cloud is relative to a specific coordinate frame, you can reference it by name here.</td></tr><tr><td>date_captured (Optional)</td><td><em>string</em> - ISO formatted date-time string</td></tr></tbody></table>
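As a sketch of the binary layout described above, points can be serialized with the standard `struct` module (for large clouds, numpy's `ndarray.tobytes()` produces the same dense little-endian buffers):

```python
import struct

def pack_point_cloud_bins(points, intensities):
    """Pack points [(x, y, z), ...] into the dense little-endian buffers
    described above: float32 positions [x1, y1, z1, x2, ...] and uint32
    intensities [i1, i2, ...]. A sketch, not part of the Aquarium SDK."""
    flat = [coord for point in points for coord in point]
    positions = struct.pack(f"<{len(flat)}f", *flat)
    intens = struct.pack(f"<{len(intensities)}I", *intensities)
    return positions, intens
```

The resulting byte strings are what you would host at `point_cloud_url` and `intensity_url`.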
{% endtab %}

{% tab title="3D Geometry" %}
Aquarium supports rendering basic 3D geometry meshes. Please reach out if you have any needs that aren't captured here.

#### .OBJ (Wavefront) Files

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.UnlabeledFrame.add_obj)

**Example Usage**

```python
labeled_frame.add_obj(
    obj_url='',
    coord_frame_id="",
    date_captured=''
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="348">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>obj_url</td><td><em>string</em> - URL to a *.obj formatted text file</td></tr><tr><td>coord_frame_id (Optional)</td><td><em>string</em> - If your object geometry is relative to a specific coordinate frame, you can reference it by name here</td></tr><tr><td>date_captured (Optional)</td><td><em>string</em> - ISO formatted date-time string</td></tr></tbody></table>
{% endtab %}

{% tab title="3D Coordinate Frames" %}
In robotics applications, you often have multiple sensors in multiple coordinate frames. Aquarium supports specifying different coordinate frames, which will be used when interpreting 3D data inputs and labels.&#x20;

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_coordinate_frame_3d)

**Example Usage**

```python
labeled_frame.add_coordinate_frame_3d(
    coord_frame_id='robot_ego_frame',
    position={'x': 0, 'y': 0, 'z': 0},
    orientation={'w': 1, 'x': 0, 'y': 0, 'z': 0},
    parent_frame_id=""
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="348">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>coord_frame_id</td><td><em>string</em> - String identifier for this coordinate frame</td></tr><tr><td>position </td><td><em>dictionary with keys 'x', 'y', 'z'</em> - Position offset of this coordinate frame, values can be an int or float</td></tr><tr><td>orientation</td><td><em>dictionary with keys 'w', 'x', 'y', 'z' -</em> Rotation/Orientation of this coordinate frame, represented as a quaternion, values can be an int or a float</td></tr><tr><td>parent_frame_id (Optional)</td><td><em>string</em> - string ID of the parent coordinate frame that this one is relative to</td></tr></tbody></table>
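To build intuition for how `position` and `orientation` relate a child frame to its parent, here is a pure-Python sketch (independent of the SDK) that maps a point from a child frame into its parent frame:

```python
def quat_mul(q, r):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def to_parent(point, position, orientation):
    """Map a point from a child coordinate frame into its parent frame:
    rotate by the frame's orientation quaternion, then add its position."""
    w, x, y, z = orientation
    px, py, pz = point
    # Rotate: q * (0, p) * q_conjugate
    _, rx, ry, rz = quat_mul(quat_mul(orientation, (0.0, px, py, pz)),
                             (w, -x, -y, -z))
    return (rx + position[0], ry + position[1], rz + position[2])
```

For an identity orientation `(1, 0, 0, 0)`, this reduces to a plain translation by `position`, matching the `robot_ego_frame` example above.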
{% endtab %}
{% endtabs %}

### Adding Labels to Your Labeled Frame

Each labeled frame in your dataset can contain **zero or more ground truth labels**.

Here are some common label types, their expected formats, and how to work with them in Aquarium:

{% tabs %}
{% tab title="Classification" %}
You are working with 2D or 3D data and want to add a classification label

[**Python API Definition Link - 2D**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_label_2d_classification)

**Example Usage**

```python
# Standard 2D case
labeled_frame.add_label_2d_classification(
    label_id='unique_id_for_this_label',
    classification='dog'
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="324">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>label_id</td><td><em>string</em> - a unique id across all other labels in this dataset</td></tr><tr><td>classification </td><td><em>string</em> - what the label is classified as</td></tr><tr><td>user_attrs (Optional)</td><td><em>dict</em> - Any additional label-level metadata fields. Defaults to None.</td></tr></tbody></table>
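Since `label_id` must be unique across every label in the dataset, a common convention (a sketch of one scheme, not an SDK requirement) is to namespace ids on the already-unique frame id:

```python
def make_label_id(frame_id, suffix="gt", index=0):
    """Build a label_id that is unique across the dataset by namespacing it
    on the (already unique) frame_id. Hypothetical helper, not part of the SDK."""
    return f"{frame_id}_{suffix}_{index}"
```

Any scheme works as long as no two labels in the dataset ever share an id.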
{% endtab %}

{% tab title="2D Bounding Box" %}
You are creating a label on an image that is a 2D bounding box

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_label_2d_bbox)

**Example Usage**

<pre class="language-python"><code class="lang-python"><strong># top, left, width, and height are in pixels
</strong><strong>labeled_frame.add_label_2d_bbox(
</strong>    label_id='unique_id_for_this_label',
    classification='dog',
    top=200,
    left=300,
    width=250,
    height=150
)
</code></pre>

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="324">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>label_id</td><td><em>string</em> - a unique id across all other labels in this dataset</td></tr><tr><td>classification </td><td><em>string</em> - what the label is classified as</td></tr><tr><td>top, left, width, height</td><td><em>int or float -</em> Coordinates are in absolute pixel space</td></tr><tr><td>user_attrs (Optional)</td><td><em>dict</em> - Any additional label-level metadata fields. Defaults to None.</td></tr></tbody></table>
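If your source labels are in normalized center format (`cx, cy, w, h` in `[0, 1]`, YOLO-style; an assumption about your pipeline), converting them to the absolute-pixel values expected here looks like:

```python
def to_pixel_bbox(cx, cy, w, h, image_width, image_height):
    """Convert a normalized center-format box (values in [0, 1]) to the
    absolute-pixel top/left/width/height expected by add_label_2d_bbox.
    A sketch, not part of the Aquarium SDK."""
    width = w * image_width
    height = h * image_height
    left = cx * image_width - width / 2
    top = cy * image_height - height / 2
    return top, left, width, height
```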
{% endtab %}

{% tab title="3D Cuboid" %}
Aquarium supports 3D cuboid labels, with 6-DOF position and orientation.

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_label_3d_cuboid)

**Example Usage**

```python
labeled_frame.add_label_3d_cuboid(
    label_id="unique_id_for_this_label",
    classification="car",
    dimensions=[1.0, 0.5, 0.5],
    position=[2.0, 2.0, 1.0],
    rotation=[0.0, 0.0, 0.0, 1.0],
    coord_frame_id="robot_ego_frame"
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="324">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>label_id</td><td><em>string</em> - a unique id across all other labels in this dataset</td></tr><tr><td>classification </td><td><em>string -</em> classification of the cuboid</td></tr><tr><td>dimensions</td><td><em>list with values for xyz -</em> xyz dimensions of the cuboid</td></tr><tr><td>position (Optional)</td><td><em>list with values for xyz</em> - Position of the center of the object</td></tr><tr><td>rotation (Optional)</td><td><em>list with values for XYZW -</em> Ordered object rotation quaternion</td></tr><tr><td>coord_frame_id (Optional)</td><td><em>str</em> - If your cuboid is relative to a specific coordinate frame, you can reference it by name here</td></tr><tr><td>user_attrs (Optional)</td><td><em>dict</em> - Any additional label-level metadata fields. Defaults to None.</td></tr></tbody></table>
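Note that `rotation` here is an XYZW-ordered list, while `add_coordinate_frame_3d` takes a WXYZ-keyed dict; a tiny helper (hypothetical, not part of the SDK) avoids mixing the two up:

```python
def quat_dict_to_xyzw(q):
    """Convert a {'w','x','y','z'} quaternion dict (as used for coordinate
    frames) to the [x, y, z, w] list ordering expected by the cuboid's
    `rotation` parameter."""
    return [q["x"], q["y"], q["z"], q["w"]]
```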

#### 3D Cuboid Image Projection (Optional)

<figure><img src="https://391596125-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MI6IGz_3V5m6p1UIhXm%2Fuploads%2F5KIZcA0kQrGL5XrYjSXV%2FScreen%20Shot%202022-10-05%20at%205.55.11%20PM.png?alt=media&#x26;token=128be079-6663-4603-9442-40b3c664a238" alt=""><figcaption><p>Example of 3D cuboids projected onto image</p></figcaption></figure>

If you have images attached to your frames alongside 3D cuboids, we can project the cuboids onto the images! First you will need to add a 2D coordinate frame:

```python
frame.add_coordinate_frame_2d(
    # String identifier for this coordinate frame.
    coord_frame_id="camera_1_coordinate_frame",
    # focal length x in pixels.
    fx=fx,
    # focal length y in pixels.
    fy=fy,
    # Optional: Either "fisheye" for the fisheye model,
    # or "brown_conrady" for the pinhole model with
    # Brown-Conrady distortion.
    camera_model=camera_model,
    # Optional: Dict of the form {x, y, z}.
    position=position,
    # Optional: Quaternion rotation dict of the form {w, x, y, z}.
    orientation=orientation,
    # Optional: 4x4 row major order camera matrix mapping
    # 3d world space to camera space (x right, y down, z forward).
    # Keep in mind, if you pass in the camera matrix it will stack
    # on top of the position/orientation you pass in as well. This
    # is only needed if you cannot properly represent your camera
    # using the position/orientation parameters.
    camera_matrix=camera_matrix,
    # Optional: optical center pixel x coordinate.
    cx=cx,
    # Optional: optical center pixel y coordinate.
    cy=cy,
    # Optional: k1 radial distortion coefficient (Brown-Conrady, fisheye).
    k1=k1,
    # Optional: k2 radial distortion coefficient (Brown-Conrady, fisheye).
    k2=k2,
    # Optional: k3 radial distortion coefficient (Brown-Conrady, fisheye).
    k3=k3,
    # Optional: k4 radial distortion coefficient (Brown-Conrady, fisheye).
    k4=k4,
    # Optional: k5 radial distortion coefficient (Brown-Conrady).
    k5=k5,
    # Optional: k6 radial distortion coefficient (Brown-Conrady).
    k6=k6,
    # Optional: p1 tangential distortion coefficient (Brown-Conrady).
    p1=p1,
    # Optional: p2 tangential distortion coefficient (Brown-Conrady).
    p2=p2,
    # Optional: s1 thin prism distortion coefficient (Brown-Conrady).
    s1=s1,
    # Optional: s2 thin prism distortion coefficient (Brown-Conrady).
    s2=s2,
    # Optional: s3 thin prism distortion coefficient (Brown-Conrady).
    s3=s3,
    # Optional: s4 thin prism distortion coefficient (Brown-Conrady).
    s4=s4,
    # Optional: camera skew coefficient (fisheye).
    skew=skew,
    # Optional: String id of the parent coordinate frame.
    parent_frame_id=parent_frame_id
)
```

{% hint style="info" %}
View the [Python API docs](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_coordinate_frame_2d) for more information on the parameters to `add_coordinate_frame_2d`
{% endhint %}

And then you can use this 2D coordinate frame when adding your image to the frame:

```python
frame.add_image(
    # A unique name to refer to this image by
    sensor_id='camera_1',
    # A URL to load the image by
    image_url='',
    # A URL to a compressed form of the image for faster loading in browsers.
    # It must be the same pixel dimensions as the original image.
    preview_url='',
    # Optional: ISO formatted date-time string
    date_captured='',
    # Optional: width of image in pixels, will be inferred otherwise
    width=1280,
    # Optional: height of image in pixels, will be inferred otherwise
    height=720,
    # Optional: 2D coordinate frame to use for this image
    coord_frame_id="camera_1_coordinate_frame",
)
```

Now you will be able to see your 3D cuboids projected onto your images and have camera distortion properly accounted for!
{% endtab %}

{% tab title="2D Semseg" %}
2D Semantic Segmentation labels are represented by an image mask, where each pixel is assigned an integer value in the range of \[0,255]. For efficient representation across both servers and browsers, Aquarium expects label masks to be encoded as grey-scale PNGs of the same dimension as the underlying image.

If you have your label masks in the form of a numpy ndarray, we recommend using the pillow python library to convert it into a PNG:

```python
# Requires pillow: pip3 install pillow

from PIL import Image
...

# 2D array, where each value is [0,255] corresponding to a class_id
# in the project's label_class_map.
int_arr = your_2d_ndarray.astype('uint8')

Image.fromarray(int_arr).save(f"{imagename}.png")
```
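If pillow isn't available, an 8-bit greyscale PNG can also be written with only the standard library. Here is a minimal sketch (our own helper, not part of the SDK) that encodes a 2D list of class ids:

```python
import struct, zlib

def encode_greyscale_png(mask):
    """Encode a 2D list of ints in [0, 255] as 8-bit greyscale PNG bytes."""
    height, width = len(mask), len(mask[0])

    def chunk(tag, payload):
        # Each PNG chunk is: length, tag, payload, CRC over tag + payload.
        return (struct.pack(">I", len(payload)) + tag + payload
                + struct.pack(">I", zlib.crc32(tag + payload)))

    # IHDR: width, height, bit depth 8, color type 0 (greyscale).
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in mask)
    return (b"\x89PNG\r\n\x1a\n"
            + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw))
            + chunk(b"IEND", b""))
```

Write the returned bytes to `{imagename}.png` as in the pillow example above.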

Because this will be loaded dynamically by the web-app for visualization, this image mask will need to be hosted somewhere. To upload it as an asset to Aquarium, you can use the following utility:

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.Client.upload_asset_from_filepath)

**Example Usage**

<pre class="language-python"><code class="lang-python"><strong># The image mask needs to be a grey-scale PNG 
</strong><strong>mask_url = al_client.upload_asset_from_filepath(
</strong><strong>    project_id = '', 
</strong><strong>    dataset_id = '', 
</strong><strong>    filepath = ''
</strong>)
</code></pre>

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="324">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>project_id</td><td><em>string</em> - name of the project</td></tr><tr><td>dataset_id </td><td><em>string</em> - name of the labeled dataset</td></tr><tr><td>filepath</td><td><em>string -</em> The filepath to grab the assset data from</td></tr><tr><td>user_attrs (Optional)</td><td><em>dict</em> - Any additional label-level metadata fields. Defaults to None.</td></tr></tbody></table>

{% hint style="info" %}
This utility hosts and stores a copy of the label mask (not the underlying RGB image) with Aquarium. If you would like your label masks to remain outside of Aquarium, chat with us and we'll help figure out a good setup.
{% endhint %}

Now, we add the label to the frame like any other label type:

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.UpdateGTLabelSet.add_label_2d_semseg)

**Example Usage**

```python
frame.add_label_2d_semseg(
    # The sensor id of the image this label corresponds to
    sensor_id='some_camera',
    # A unique id across all other labels in this dataset
    label_id='unique_id_for_this_label',
    # Expected to be a PNG, with values in [0,255] that correspond
    # to the class_id of classes in the label_class_map
    mask_url='url_to_greyscale_png'
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="324">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>sensor_id</td><td><em>string</em> - the sensor id of the image this label corresponds to</td></tr><tr><td>label_id</td><td><em>string</em> - a unique id across all other labels in this dataset</td></tr><tr><td>mask_url</td><td><em>string</em> - URL to a greyscale PNG whose values in [0,255] correspond to the class_ids of classes in the label_class_map</td></tr></tbody></table>
{% endtab %}

{% tab title="2D Polygon Lists" %}
Aquarium represents instance segmentation labels as 2D Polygon Lists. Each label is represented by one or more polygons, which do not need to be connected.

[**Python API Definition Link**](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame.add_label_2d_polygon_list)

**Example Usage**

```python
labeled_frame.add_label_2d_polygon_list(
    label_id='unique_id_for_this_label',
    classification='dog',
    polygons=[
        {'vertices': [(x1, y1), (x2, y2), ...]},
        {'vertices': [(x1, y1), (x2, y2), ...]}
    ],
    center=[center_x, center_y]
)
```

**Relevant Function Parameter Descriptions**

<table><thead><tr><th width="324">Parameter Name</th><th>Description</th></tr></thead><tbody><tr><td>label_id</td><td><em>string</em> - Unique id across all other labels in this dataset</td></tr><tr><td>classification </td><td><em>string -</em> classification of the polygon label</td></tr><tr><td>polygons</td><td><em>dictionary with format</em><br><em>{'vertices': [(x1, y1), (x2, y2) ...]} -</em> coordinates are in absolute pixel space -- these are polygon vertices, not a line string, there should be no duplicated vertices in the list</td></tr><tr><td>center (Optional)</td><td><em>list</em> <em>of ints or floats</em> - indicate the center position of the object</td></tr><tr><td>user_attrs</td><td><em>dict</em> - Any additional label-level metadata fields. Defaults to None.</td></tr></tbody></table>
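Many labeling tools export closed rings whose last vertex repeats the first; since the vertex list here must contain no duplicated vertices, a small cleanup pass (a sketch, not part of the SDK) helps:

```python
def clean_ring(vertices):
    """Drop a closing vertex that duplicates the first, and collapse
    consecutive duplicates, leaving only the polygon's distinct vertices."""
    out = []
    for v in vertices:
        if not out or v != out[-1]:
            out.append(v)
    if len(out) > 1 and out[0] == out[-1]:
        out.pop()
    return out
```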
{% endtab %}
{% endtabs %}

### Making Metadata Fields Queryable

When you add metadata fields to your labels, we need to take one extra step so that you can query those fields and search for them in the Analysis view! \
\
Add the code below to your script right **after** you call `create_dataset()`:

{% code lineNumbers="true" %}

```python
# next section goes into detail on how to call this
# create dataset before updating metadata schema
al_client.create_dataset(
    PROJECT_NAME, 
    DATASET_NAME, 
    dataset=labeled_dataset
)

# this method takes a list of dict objects where you provide the
# name of each field and its type
al_client.update_dataset_object_metadata_schema(PROJECT_NAME, DATASET_NAME,
    [
        {"name": 'METADATA_FIELD_NAME_1', "type": "STRING"},
        {"name": 'METADATA_FIELD_NAME_2', "type": "STRING"},
        {"name": 'METADATA_FIELD_NAME_3', "type": "STRING"}
    ]
)
```

{% endcode %}

You can also run `update_dataset_object_metadata_schema()` on its own after an upload to make your metadata fields queryable:

{% code lineNumbers="true" %}

```python
import aquariumlearning as al

al_client = al.Client()
al_client.set_credentials(api_key='YOUR_API_KEY')

AL_PROJECT = 'YOUR PROJECT NAME'
AL_DATASET = 'YOUR DATASET NAME'

al_client.update_dataset_object_metadata_schema(AL_PROJECT, AL_DATASET,
    [
        {"name": 'METADATA_FIELD_NAME_1', "type": "STRING"},
        {"name": 'METADATA_FIELD_NAME_2', "type": "STRING"},
        {"name": 'METADATA_FIELD_NAME_3', "type": "STRING"}
    ]
)
```

{% endcode %}

## Putting It All Together

{% hint style="info" %}
In the [API docs](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.LabeledFrame) you can see the other operations associated with a LabeledFrame.
{% endhint %}

Now that we've discussed the general steps for adding labeled data, here is what this would look like for a 2D classification example:

```python
# Add an image to the frame
image_url = "https://storage.googleapis.com/aquarium-public/quickstart/pets/imgs/" + entry['file_name']
labeled_frame.add_image(image_url=image_url)

# Add the ground truth classification label to the frame
label_id = frame_id + '_gt'
labeled_frame.add_label_2d_classification(
    label_id=label_id, 
    classification=entry['class_name']
)

# once you have created the frame, add it to the dataset you created
labeled_dataset.add_frame(labeled_frame)
```

## Uploading Your Labeled Dataset

Now that we have everything all set up, let's submit your new labeled dataset to Aquarium!

{% hint style="info" %}
Aquarium does some processing of your data after it's submitted, like indexing metadata and possibly calculating embeddings, so you may see a delay before frames show up in the UI. You can view examples of what to expect, as well as tips for troubleshooting your upload, [here](https://legacy-docs.aquariumlearning.com/aquarium/integrating-with-aquarium/uploading-data/..#monitoring-upload-status)!
{% endhint %}

### Submitting Your Dataset

You can submit your LabeledDataset to be uploaded into Aquarium by calling [`.create_dataset()`](https://aquarium-not-pypi.web.app/aquariumlearning/docs/#aquariumlearning.Client.create_dataset).

{% hint style="info" %}
To spot check your data immediately, set the `preview_first_frame` flag to `True`; a link to a preview frame will appear in the console so you can make sure your data and labels look right.
{% endhint %}

This is an example of what the `create_dataset()` call will look like:

```python
DATASET_NAME = 'labels_v1'

# In order to create a dataset in Aquarium you must provide:
# - the name of your project
# - the name you would like for your labeled dataset
# - the LabeledDataset object you have created and added frames to
al_client.create_dataset(
    PROJECT_NAME, 
    DATASET_NAME, 
    dataset=labeled_dataset
)
```

After you kick off your upload, processing can take anywhere from minutes to multiple hours depending on your dataset size.

{% hint style="info" %}
You can monitor your uploads under the "Streaming Uploads" tab in the project view. [Here](https://legacy-docs.aquariumlearning.com/aquarium/integrating-with-aquarium/uploading-data/..#monitoring-upload-status) is a guide on how to find that page.
{% endhint %}

Once the upload completes, the Project page will show your project with an updated count of how many labeled datasets have been added to it (the count also includes the number of [unlabeled datasets](https://legacy-docs.aquariumlearning.com/aquarium/getting-started/key-concepts#unlabeled-datasets)).

![You can see your new labeled dataset fully uploaded and reflected in the count in the bottom left corner of the project card.](https://391596125-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MI6IGz_3V5m6p1UIhXm%2Fuploads%2F5rCyuCfgj0AYnZRZjwOf%2FScreen%20Shot%202022-08-01%20at%2010.16.15%20AM.png?alt=media\&token=eaa1388b-9646-4548-ba3f-59929e894c0c)

## Additional Features

### Multiple Sensor IDs

Sensor IDs are used to reference the data inputs that exist on a frame. They are usually omitted from frames and labels, but become necessary when a single frame contains more than one data input, such as a frame with multiple camera viewpoints.

```python
labeled_frame.add_image(sensor_id='camera_front', image_url='')
labeled_frame.add_image(sensor_id='camera_right', image_url='')
labeled_frame.add_image(sensor_id='camera_left', image_url='')

# 2D BBOX label on the `camera_front` image
labeled_frame.add_label_2d_bbox(
    sensor_id='camera_front',
    label_id='unique_id_for_this_label',
    classification='dog',
    top=200,
    left=300,
    width=250,
    height=150
)

# 2D BBOX label on the `camera_left` image
labeled_frame.add_label_2d_bbox(
    sensor_id='camera_left',
    label_id='unique_id_for_this_label',
    classification='cat',
    top=200,
    left=300,
    width=250,
    height=150
)

# Inferences MUST match the same sensor id as the
# corresponding base frame sensor id.
inference_frame.add_inference_2d_bbox(
    sensor_id='camera_front',
    label_id='abcd_inference',
    classification='cat',
    top=200,
    left=300,
    width=250,
    height=150,
    confidence=0.85
)
```

## Quickstart Examples

For examples of how to upload labeled datasets, check out our quickstart examples.&#x20;

{% content-ref url="../../getting-started/quickstart-guides" %}
[quickstart-guides](https://legacy-docs.aquariumlearning.com/aquarium/getting-started/quickstart-guides)
{% endcontent-ref %}
