
Image Data

hover supports bulk-labeling images through their URLs.

💡 Let's do a quickstart for images and note what's different from texts.

This page assumes that you know the basics, i.e. simple usage of dataset and annotator. Please visit the quickstart tutorial if you haven't done so.


Dataset for Images

hover handles images through their URL addresses. URLs are strings which can be easily stored, hashed, and looked up against. They are also convenient for rendering tooltips in the annotation interface.
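As a minimal sketch of why URL strings make convenient keys, records can be deduplicated and looked up by URL with plain Python containers. This is not hover's internal implementation; the `image` and `label` field names, the `deduplicate` helper, and the example.com URLs are assumptions made for illustration.

```python
# Hypothetical records: each image is represented only by its URL string.
records = [
    {"image": "http://example.com/cat.png", "label": "cat"},
    {"image": "http://example.com/dog.png", "label": "dog"},
    {"image": "http://example.com/cat.png", "label": "cat"},  # duplicate URL
]


def deduplicate(records, feature_key="image"):
    """Keep the first record for each URL; strings are hashable, so a set works."""
    seen = set()
    unique = []
    for record in records:
        url = record[feature_key]
        if url not in seen:
            seen.add(url)
            unique.append(record)
    return unique


unique_records = deduplicate(records)

# A dict keyed by URL gives constant-time lookup of any record.
record_by_url = {record["image"]: record for record in unique_records}
print(len(unique_records), record_by_url["http://example.com/dog.png"]["label"])
```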

Similarly to SupervisableTextDataset, we can build a supervisable dataset for images.

Vectorizer for Images

We can follow a URL -> content -> image object -> vector path.
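The four stages can be sketched with the standard library alone. Everything below is an assumption for illustration: the function names are hypothetical, the decoder only understands the simple binary PPM format (a real pipeline would likely use an imaging library), and the "embedding" is a toy mean-color vector rather than the output of a pretrained model.

```python
import base64
import re
from urllib.request import urlopen


def url_to_content(url):
    """URL -> content: fetch the raw bytes behind a URL."""
    with urlopen(url) as response:
        return response.read()


def content_to_image(content):
    """Content -> image object: decode bytes into a flat list of (r, g, b) tuples.

    Assumption: only binary PPM ("P6", maxval 255, no comments) is supported,
    to keep the sketch dependency-free.
    """
    header = re.match(rb"P6\s+(\d+)\s+(\d+)\s+(\d+)\s", content)
    if header is None:
        raise ValueError("not a binary PPM image")
    width, height, maxval = (int(group) for group in header.groups())
    data = content[header.end():]
    if maxval != 255 or len(data) != width * height * 3:
        raise ValueError("unsupported or truncated PPM image")
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]


def image_to_vector(pixels):
    """Image object -> vector: toy embedding, the mean of each color channel in [0, 1]."""
    return [sum(px[c] for px in pixels) / (255 * len(pixels)) for c in range(3)]


def vectorizer(url):
    """Chain the three steps: URL -> content -> image object -> vector."""
    return image_to_vector(content_to_image(url_to_content(url)))


# A 2x1 image (one red pixel, one blue pixel) packed into a data: URL,
# so the example runs without network access.
ppm = b"P6\n2 1\n255\n" + bytes([255, 0, 0, 0, 0, 255])
demo_url = "data:application/octet-stream;base64," + base64.b64encode(ppm).decode("ascii")
vector = vectorizer(demo_url)
print(vector)
```

Keeping each stage as its own function makes it easy to swap in a stronger decoder or embedding model later without touching the rest of the chain.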

Caching and reading from disk

This guide uses @wrappy.memoize in place of @functools.lru_cache for caching.

  • The benefit is that wrappy.memoize can persist the cache to disk, speeding up code across sessions.

Cached values for this guide have been pre-computed, making it much faster to run the guide.
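To illustrate what persisting a cache to disk buys, here is a minimal sketch of a disk-backed memoizer built on `pickle`. It is not wrappy's implementation or API; `disk_memoize`, `slow_vectorizer`, and the cache file name are hypothetical, and the sketch handles only single-argument functions with hashable, picklable inputs.

```python
import functools
import os
import pickle
import tempfile


def disk_memoize(path):
    """Hypothetical decorator: memoize a one-argument function and persist to `path`."""

    def decorator(func):
        cache = {}
        # Reload results saved by an earlier session, if any.
        if os.path.exists(path):
            with open(path, "rb") as f:
                cache.update(pickle.load(f))

        @functools.wraps(func)
        def wrapper(key):
            if key not in cache:
                cache[key] = func(key)
                # Rewrite the whole cache on every miss: simple, but not efficient.
                with open(path, "wb") as f:
                    pickle.dump(cache, f)
            return cache[key]

        return wrapper

    return decorator


cache_path = os.path.join(tempfile.mkdtemp(), "vectors.pkl")
calls = []  # records every time the expensive function really runs


def slow_vectorizer(url):
    calls.append(url)
    return [float(len(url)), 0.0]


first_session = disk_memoize(cache_path)(slow_vectorizer)
first_session("http://example.com/a.png")   # computed, then saved to disk
first_session("http://example.com/a.png")   # served from memory

# Wrapping again simulates a fresh session that starts from the saved file.
second_session = disk_memoize(cache_path)(slow_vectorizer)
result = second_session("http://example.com/a.png")  # served from disk
print(len(calls))
```

An in-memory cache such as `functools.lru_cache` would have recomputed the value in the second session, which is the cost that a persisted cache avoids.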

Embedding and Plot

This is exactly the same as in the quickstart, just switching to image data.

What's special for images?

Tooltips

For text, the tooltip shows the original value.

For images, the tooltip embeds the image based on URL.

  • images in the local file system can be served through python -m http.server.
  • they can then be accessed through http://localhost:<port>/relative/path/to/file (plain HTTP, since http.server does not serve HTTPS by default).
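The mapping from local files to URLs can be sketched as follows. The script starts the standard library's HTTP server from Python (a programmatic counterpart of the command above) on a temporary directory; the directory layout, the placeholder file, and the `local_path_to_url` helper are assumptions for illustration.

```python
import functools
import os
import tempfile
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import ProxyHandler, build_opener

# Throwaway directory holding one placeholder "image" file.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "images"))
local_file = os.path.join(root, "images", "cat.png")
with open(local_file, "wb") as f:
    f.write(b"placeholder bytes, not a real PNG")

# Programmatic counterpart of running `python -m http.server` inside `root`.
# Port 0 asks the operating system for any free port.
handler = functools.partial(SimpleHTTPRequestHandler, directory=root)
server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()


def local_path_to_url(path, root, port):
    """Map a file under `root` to the URL the server exposes it at."""
    relative = os.path.relpath(path, root).replace(os.sep, "/")
    return f"http://localhost:{port}/{relative}"


url = local_path_to_url(local_file, root, port)

# Fetch the file back over HTTP, bypassing any proxy configured in the environment.
opener = build_opener(ProxyHandler({}))
with opener.open(url) as response:
    served = response.read()

server.shutdown()
server.server_close()
print(url)
```

URLs built this way only resolve while the server is running, so it needs to stay up for as long as tooltips are being rendered.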

Search

For text, the search widget is based on regular expressions.

For images, the search widget is based on vector cosine similarity.

  • the dataset has remembered the vectorizer under the hood and passed it to the annotator.
  • please let us know if you think there's a better way to search images in this case.
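Similarity-based search can be sketched in a few lines. This is an illustration of cosine similarity ranking in general, not the code behind hover's search widget; the `search` function, its `threshold` parameter, and the example vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def search(query_vector, vector_by_url, threshold=0.9):
    """Return URLs whose vectors are similar enough to the query, best match first."""
    scored = [
        (cosine_similarity(query_vector, vector), url)
        for url, vector in vector_by_url.items()
    ]
    hits = [(score, url) for score, url in scored if score >= threshold]
    return [url for score, url in sorted(hits, reverse=True)]


# Hypothetical pre-computed image vectors keyed by URL.
vector_by_url = {
    "http://example.com/a.png": [1.0, 0.0],
    "http://example.com/b.png": [0.9, 0.1],
    "http://example.com/c.png": [0.0, 1.0],
}

hits = search([1.0, 0.0], vector_by_url)
print(hits)
```

In this sketch the query is already a vector; in practice it would come from running the same vectorizer on a query image, which is why the vectorizer has to be available at search time.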