hover.recipes

  • hover.recipes.stable

    High-level functions to produce an interactive annotation interface.

    Stable recipes whose function signatures should almost never change in the future.

    linked_annotator(dataset, **kwargs)

    Display the dataset on a 2D map in two views, one for search and one for annotation.
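
    | Param      | Type                  | Description                            |
    | :--------- | :-------------------- | :------------------------------------- |
    | `dataset`  | `SupervisableDataset` | the dataset to link to                 |
    | `**kwargs` |                       | kwargs to forward to each Bokeh figure |

    Expected visual layout:

    | SupervisableDataset | BokehDataFinder     | BokehDataAnnotator |
    | :------------------ | :------------------ | :----------------- |
    | manage data subsets | search -> highlight | make annotations   |
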
    Source code in hover/recipes/stable.py
    @servable(title="Linked Annotator")
    def linked_annotator(dataset, **kwargs):
        """
        ???+ note "Display the dataset on a 2D map in two views, one for search and one for annotation."
    
            | Param     | Type     | Description                          |
            | :-------- | :------- | :----------------------------------- |
            | `dataset` | `SupervisableDataset` | the dataset to link to  |
            | `**kwargs` |       | kwargs to forward to each Bokeh figure |
    
            Expected visual layout:
    
            | SupervisableDataset | BokehDataFinder     | BokehDataAnnotator |
            | :------------------ | :------------------ | :----------------- |
            | manage data subsets | search -> highlight | make annotations   |
        """
        layout, _ = _linked_annotator(dataset, **kwargs)
        return layout
    

    simple_annotator(dataset, **kwargs)

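    Display the dataset on a 2D map for annotation.

    | Param      | Type                  | Description                            |
    | :--------- | :-------------------- | :------------------------------------- |
    | `dataset`  | `SupervisableDataset` | the dataset to link to                 |
    | `**kwargs` |                       | kwargs to forward to each Bokeh figure |

    Expected visual layout:

    | SupervisableDataset | BokehDataAnnotator |
    | :------------------ | :----------------- |
    | manage data subsets | make annotations   |
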
    Source code in hover/recipes/stable.py
    @servable(title="Simple Annotator")
    def simple_annotator(dataset, **kwargs):
        """
        ???+ note "Display the dataset on a 2D map for annotation."
    
            | Param     | Type     | Description                          |
            | :-------- | :------- | :----------------------------------- |
            | `dataset` | `SupervisableDataset` | the dataset to link to  |
            | `**kwargs` |       | kwargs to forward to each Bokeh figure |
    
            Expected visual layout:
    
            | SupervisableDataset | BokehDataAnnotator |
            | :------------------ | :----------------- |
            | manage data subsets | make annotations   |
        """
        layout, _ = _simple_annotator(dataset, **kwargs)
        return layout
    


  • hover.recipes.experimental

    High-level functions to produce an interactive annotation interface.

    Experimental recipes whose function signatures might change significantly in the future. Use with caution.

    active_learning(dataset, vecnet, **kwargs)

    Display the dataset for annotation, putting a classification model in the loop.

    Currently works most smoothly with VectorNet.

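    | Param      | Type                  | Description                    |
    | :--------- | :-------------------- | :----------------------------- |
    | `dataset`  | `SupervisableDataset` | the dataset to link to         |
    | `vecnet`   | `VectorNet`           | model to use in the loop       |
    | `**kwargs` |                       | forwarded to each Bokeh figure |

    Expected visual layout:

    | SupervisableDataset | BokehSoftLabelExplorer    | BokehDataAnnotator | BokehDataFinder   |
    | :------------------ | :------------------------ | :----------------- | :---------------- |
    | manage data subsets | inspect model predictions | make annotations   | search and filter |
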
    Source code in hover/recipes/experimental.py
    @servable(title="Active Learning")
    def active_learning(dataset, vecnet, **kwargs):
        """
        ???+ note "Display the dataset for annotation, putting a classification model in the loop."
            Currently works most smoothly with `VectorNet`.
    
            | Param     | Type     | Description                          |
            | :-------- | :------- | :----------------------------------- |
            | `dataset` | `SupervisableDataset` | the dataset to link to  |
            | `vecnet`  | `VectorNet` | model to use in the loop          |
            | `**kwargs` |         | forwarded to each Bokeh figure       |
    
            Expected visual layout:
    
            | SupervisableDataset | BokehSoftLabelExplorer    | BokehDataAnnotator | BokehDataFinder     |
            | :------------------ | :------------------------ | :----------------- | :------------------ |
            | manage data subsets | inspect model predictions | make annotations   | search and filter   |
        """
        layout, _ = _active_learning(dataset, vecnet, **kwargs)
        return layout
    

    snorkel_crosscheck(dataset, lf_list, **kwargs)

    Display the dataset for annotation, cross-checking with labeling functions.
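
    | Param      | Type                  | Description                            |
    | :--------- | :-------------------- | :------------------------------------- |
    | `dataset`  | `SupervisableDataset` | the dataset to link to                 |
    | `lf_list`  | `list`                | a list of callables decorated by `@hover.utils.snorkel_helper.labeling_function` |
    | `**kwargs` |                       | kwargs to forward to each Bokeh figure |

    Expected visual layout:

    | SupervisableDataset | BokehSnorkelExplorer       | BokehDataAnnotator | BokehDataFinder   |
    | :------------------ | :------------------------- | :----------------- | :---------------- |
    | manage data subsets | inspect labeling functions | make annotations   | search and filter |
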
    Source code in hover/recipes/experimental.py
    @servable(title="Snorkel Crosscheck")
    def snorkel_crosscheck(dataset, lf_list, **kwargs):
        """
        ???+ note "Display the dataset for annotation, cross-checking with labeling functions."
    
            | Param     | Type     | Description                          |
            | :-------- | :------- | :----------------------------------- |
            | `dataset` | `SupervisableDataset` | the dataset to link to  |
            | `lf_list` | `list`   | a list of callables decorated by `@hover.utils.snorkel_helper.labeling_function` |
            | `**kwargs` |       | kwargs to forward to each Bokeh figure |
    
            Expected visual layout:
    
            | SupervisableDataset | BokehSnorkelExplorer       | BokehDataAnnotator | BokehDataFinder     |
            | :------------------ | :------------------------- | :----------------- | :------------------ |
            | manage data subsets | inspect labeling functions | make annotations   | search and filter   |
        """
        layout, _ = _snorkel_crosscheck(dataset, lf_list, **kwargs)
        return layout
    


  • hover.recipes.subroutine

    Building blocks of high-level recipes.

    Includes the following:

    • functions for creating individual standard explorers appropriate for a dataset.

    active_learning_components(dataset, vecnet, **kwargs)

    Active-learning specific components of a recipe.
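
    | Param      | Type                  | Description                                       |
    | :--------- | :-------------------- | :------------------------------------------------ |
    | `dataset`  | `SupervisableDataset` | the dataset to link to                            |
    | `vecnet`   | `VectorNet`           | vecnet to use in the loop                         |
    | `**kwargs` |                       | kwargs to forward to the `BokehSoftLabelExplorer` |
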
    Source code in hover/recipes/subroutine.py
    def active_learning_components(dataset, vecnet, **kwargs):
        """
        ???+ note "Active-learning specific components of a recipe."
    
            | Param      | Type     | Description                          |
            | :--------- | :------- | :----------------------------------- |
            | `dataset`  | `SupervisableDataset` | the dataset to link to  |
            | `vecnet`   | `VectorNet` | vecnet to use in the loop          |
            | `**kwargs` | | kwargs to forward to the `BokehSoftLabelExplorer` |
        """
        console = Console()
        softlabel = standard_softlabel(dataset, **kwargs)
        feature_key = dataset.__class__.FEATURE_KEY
    
        # patch coordinates for representational similarity analysis
        # some datasets may have multiple embeddings; use the one with lowest dimension
        embedding_cols = sorted(softlabel.find_embedding_fields())
        manifold_dim, _ = re.findall(r"\d+", embedding_cols[0])
        manifold_dim = int(manifold_dim)
        manifold_traj_cols = embedding_cols[:manifold_dim]
        for _col in manifold_traj_cols:
            _total_dim, _ = re.findall(r"\d+", _col)
            _total_dim = int(_total_dim)
            assert (
                _total_dim == manifold_dim
            ), f"Dim mismatch: {_total_dim} vs. {manifold_dim}"
            softlabel.value_patch_by_slider(
                _col, f"{_col}_traj", title="Manifold trajectory step"
            )
    
        # recipe-specific widget
        model_trainer = Button(label="Train model", button_type="primary")
    
        def retrain_vecnet():
            """
            Callback subfunction 1 of 2.
            """
            model_trainer.disabled = True
            console.print("Start training... button will be disabled temporarily.")
            dataset.setup_label_coding()
            vecnet.auto_adjust_setup(dataset.classes)
    
            train_loader = vecnet.prepare_loader(dataset, "train", smoothing_coeff=0.2)
            if dataset.dfs["dev"].shape[0] > 0:
                dev_loader = vecnet.prepare_loader(dataset, "dev")
            else:
                dataset._warn("dev set is empty, borrowing train set for validation.")
                dev_loader = train_loader
    
            _ = vecnet.train(train_loader, dev_loader)
            vecnet.save()
            console.print("-- 1/2: retrained vecnet")
    
        def update_softlabel_plot():
            """
            Callback subfunction 2 of 2.
            """
            # combine inputs and compute outputs of all non-test subsets
            use_subsets = ("raw", "train", "dev")
            inps = []
            for _key in use_subsets:
                inps.extend(dataset.dfs[_key][feature_key].tolist())
    
            probs = vecnet.predict_proba(inps)
            labels = [dataset.label_decoder[_val] for _val in probs.argmax(axis=-1)]
            scores = probs.max(axis=-1).tolist()
            traj_arr, _, _ = vecnet.manifold_trajectory(
                inps,
                method=hover.config["data.embedding"]["default_reduction_method"],
                reducer_kwargs=dict(dimension=manifold_dim),
                spline_kwargs=dict(points_per_step=5),
            )
    
            offset = 0
            for _key in use_subsets:
                _length = dataset.dfs[_key].shape[0]
                # skip subset if empty
                if _length == 0:
                    continue
                _slice = slice(offset, offset + _length)
                dataset.dfs[_key]["pred_label"] = labels[_slice]
                dataset.dfs[_key]["pred_score"] = scores[_slice]
                for i, _col in enumerate(manifold_traj_cols):
                    # all steps, selected slice
                    _traj = traj_arr[:, _slice, i]
                    # selected slice, all steps
                    _traj = list(np.swapaxes(_traj, 0, 1))
                    dataset.dfs[_key][f"{_col}_traj"] = _traj
    
                offset += _length
    
            softlabel._dynamic_callbacks["adjust_patch_slider"]()
            softlabel._update_sources()
            model_trainer.disabled = False
            console.print("-- 2/2: updated predictions. Training button is re-enabled.")
    
        def callback_sequence():
            """
            Overall callback function.
            """
            retrain_vecnet()
            update_softlabel_plot()
    
        model_trainer.on_click(callback_sequence)
    
        return softlabel, model_trainer
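
Two pieces of bookkeeping in the function above are easy to misread: picking the lowest-dimensional embedding from the column names, and cutting one flat prediction list back into per-subset pieces. The following is a minimal, library-free sketch of both. The column names (shaped like `embed_2d_0`) and the helper `split_by_subset` are assumptions made for illustration; they are not part of `hover`.

```python
import re

# Hypothetical column names; the `embed_<dim>d_<index>` shape is an assumption
# that matches the two-number unpacking used in the source above.
columns = ["embed_3d_1", "embed_2d_1", "embed_3d_0", "embed_2d_0", "embed_3d_2"]

# Lexicographic sorting puts the lowest-dimensional embedding first
# (this holds as long as the dimensions are single digits).
embedding_cols = sorted(columns)
manifold_dim, _ = re.findall(r"\d+", embedding_cols[0])
manifold_dim = int(manifold_dim)
manifold_traj_cols = embedding_cols[:manifold_dim]


def split_by_subset(flat_values, subset_lengths):
    """Split one flat list back into per-subset lists, in dict order."""
    result, offset = {}, 0
    for key, length in subset_lengths.items():
        if length == 0:
            continue  # empty subsets are skipped and do not move the offset
        result[key] = flat_values[offset : offset + length]
        offset += length
    return result


# Predictions were computed on the concatenation raw + train + dev.
labels = ["A", "B", "A", "C", "B"]
per_subset = split_by_subset(labels, {"raw": 3, "train": 0, "dev": 2})
```

The split only works because the predictions are produced in the same subset order that is later used to slice them, which is why the source iterates over `use_subsets` both times.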
    

    get_explorer_class(task, feature)

    Get the right hover.core.explorer class given a task and a feature.

    Can be useful for dynamically creating explorers without knowing the feature in advance.

    | Param     | Type  | Description |
    | :-------- | :---- | :---------- |
    | `task`    | `str` | name of the task, which can be `"finder"`, `"annotator"`, `"margin"`, `"softlabel"`, or `"snorkel"` |
    | `feature` | `str` | name of the main feature, which can be `"text"`, `"audio"` or `"image"` |

    Usage:

    # this creates an instance of BokehTextFinder
    explorer = get_explorer_class("finder", "text")(*args, **kwargs)
    

    Source code in hover/recipes/subroutine.py
    def get_explorer_class(task, feature):
        """
        ???+ note "Get the right `hover.core.explorer` class given a task and a feature."
    
            Can be useful for dynamically creating explorers without knowing the feature in advance.
    
            | Param     | Type  | Description                          |
            | :-------- | :---- | :----------------------------------- |
            | `task`    | `str` | name of the task, which can be `"finder"`, `"annotator"`, `"margin"`, `"softlabel"`, or `"snorkel"` |
            | `feature` | `str` | name of the main feature, which can be `"text"`, `"audio"` or `"image"` |
    
            Usage:
            ```python
            # this creates an instance of BokehTextFinder
            explorer = get_explorer_class("finder", "text")(*args, **kwargs)
            ```
        """
        assert task in EXPLORER_CATALOG, f"Invalid task: {task}"
        assert feature in EXPLORER_CATALOG[task], f"Invalid feature: {feature}"
        return EXPLORER_CATALOG[task][feature]
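
The lookup above is a two-level dictionary access: first by task, then by feature. Here is a minimal sketch of that pattern with stand-in classes. The miniature `CATALOG`, the class names, and `lookup_explorer_class` are hypothetical; the real catalog and classes live in `hover`. Unlike the source, this sketch raises `ValueError` instead of using `assert`, so the checks still run under `python -O`.

```python
# Stand-in classes for illustration only; not the real hover explorers.
class TextFinder:
    pass


class ImageFinder:
    pass


class TextAnnotator:
    pass


# Hypothetical miniature catalog keyed by task, then by feature.
CATALOG = {
    "finder": {"text": TextFinder, "image": ImageFinder},
    "annotator": {"text": TextAnnotator},
}


def lookup_explorer_class(task, feature):
    """Return the class registered for (task, feature), or raise ValueError."""
    if task not in CATALOG:
        raise ValueError(f"Invalid task: {task}")
    if feature not in CATALOG[task]:
        raise ValueError(f"Invalid feature: {feature}")
    return CATALOG[task][feature]


# The returned object is a class, so it is called again to build an instance.
finder_cls = lookup_explorer_class("finder", "text")
finder = finder_cls()
```

Returning the class rather than an instance lets the caller decide which constructor arguments to pass, which is what the usage example above relies on.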
    

    recipe_layout(*components, style='horizontal')

    Create a recipe-level layout of bokeh objects.
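
    | Param         | Type            | Description                    |
    | :------------ | :-------------- | :----------------------------- |
    | `*components` | `bokeh` objects | objects to be plotted          |
    | `style`       | `str`           | `"horizontal"` or `"vertical"` |
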
    Source code in hover/recipes/subroutine.py
    def recipe_layout(*components, style="horizontal"):
        """
        ???+ note "Create a recipe-level layout of bokeh objects."
    
            | Param      | Type     | Description                          |
            | :--------- | :------- | :----------------------------------- |
            | `*components` | `bokeh` objects | objects to be plotted      |
            | `style`    | `str`    | "horizontal" or "vertical"           |
        """
        if style == "horizontal":
            return row(*components)
        elif style == "vertical":
            return column(*components)
        else:
            raise ValueError(f"Unexpected layout style {style}")
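
Because `style` follows `*components` in the signature, it can only be passed by keyword; a positional `"vertical"` would be collected as one more component. The sketch below illustrates this with stand-in `row` and `column` functions that just return tuples. These stand-ins are assumptions so that the example runs without Bokeh; they are not the Bokeh layout functions.

```python
def row(*components):
    """Stand-in for a horizontal layout; records its arguments."""
    return ("row", components)


def column(*components):
    """Stand-in for a vertical layout; records its arguments."""
    return ("column", components)


def recipe_layout(*components, style="horizontal"):
    """Same dispatch logic as the source above, on top of the stand-ins."""
    if style == "horizontal":
        return row(*components)
    elif style == "vertical":
        return column(*components)
    else:
        raise ValueError(f"Unexpected layout style {style}")


# Passing style by keyword selects the vertical layout.
stacked = recipe_layout("finder", "annotator", style="vertical")

# Passing it positionally does not: the string becomes a third component
# and the default horizontal layout is used.
swallowed = recipe_layout("finder", "annotator", "vertical")
```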
    

    standard_annotator(dataset, **kwargs)

    Set up a BokehDataAnnotator for a SupervisableDataset.

    The annotator has a few standard interactions with the dataset:

    • read all subsets of the dataset
    • subscribe to all updates in the dataset
    • can commit annotations through selections in the "raw" subset
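
    | Param      | Type                  | Description                                   |
    | :--------- | :-------------------- | :-------------------------------------------- |
    | `dataset`  | `SupervisableDataset` | the dataset to link to                        |
    | `**kwargs` |                       | kwargs to forward to the `BokehDataAnnotator` |
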
    Source code in hover/recipes/subroutine.py
    def standard_annotator(dataset, **kwargs):
        """
        ???+ note "Set up a `BokehDataAnnotator` for a `SupervisableDataset`."
    
            The annotator has a few standard interactions with the dataset:
    
            -   read all subsets of the dataset
            -   subscribe to all updates in the dataset
            -   can commit annotations through selections in the "raw" subset
    
            | Param      | Type     | Description                          |
            | :--------- | :------- | :----------------------------------- |
            | `dataset`  | `SupervisableDataset` | the dataset to link to  |
            | `**kwargs` | | kwargs to forward to the `BokehDataAnnotator` |
        """
        # auto-detect the (main) feature to use
        feature = dataset.__class__.FEATURE_KEY
        explorer_cls = get_explorer_class("annotator", feature)
    
        # first "static" version of the plot
        subsets = explorer_cls.SUBSET_GLYPH_KWARGS.keys()
        annotator = explorer_cls.from_dataset(
            dataset,
            {_k: _k for _k in subsets},
            title="Annotator: apply labels to selected RAW points",
            **kwargs,
        )
        annotator.activate_search()
        annotator.plot()
    
        # subscribe for df updates
        dataset.subscribe_update_push(annotator, {_k: _k for _k in subsets})
    
        # annotators can commit to a dataset
        dataset.subscribe_data_commit(annotator, {"raw": "raw"})
    
        # annotators by default link the selection for preview
        dataset.subscribe_selection_view(annotator, ["raw", "train", "dev", "test"])
        return annotator
    

    standard_finder(dataset, **kwargs)

    Set up a BokehDataFinder for a SupervisableDataset.

    The finder has a few standard interactions with the dataset:

    • read all subsets of the dataset
    • subscribe to all updates in the dataset
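
    | Param      | Type                  | Description                                |
    | :--------- | :-------------------- | :----------------------------------------- |
    | `dataset`  | `SupervisableDataset` | the dataset to link to                     |
    | `**kwargs` |                       | kwargs to forward to the `BokehDataFinder` |
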
    Source code in hover/recipes/subroutine.py
    def standard_finder(dataset, **kwargs):
        """
        ???+ note "Set up a `BokehDataFinder` for a `SupervisableDataset`."
    
            The finder has a few standard interactions with the dataset:
    
            -   read all subsets of the dataset
            -   subscribe to all updates in the dataset
    
            | Param      | Type     | Description                          |
            | :--------- | :------- | :----------------------------------- |
            | `dataset`  | `SupervisableDataset` | the dataset to link to  |
            | `**kwargs` | | kwargs to forward to the `BokehDataFinder` |
        """
        # auto-detect the (main) feature to use
        feature = dataset.__class__.FEATURE_KEY
        explorer_cls = get_explorer_class("finder", feature)
    
        # first "static" version of the plot
        subsets = explorer_cls.SUBSET_GLYPH_KWARGS.keys()
        finder = explorer_cls.from_dataset(
            dataset,
            {_k: _k for _k in subsets},
            title="Finder: use search for highlight and filter",
            **kwargs,
        )
        finder.activate_search()
        finder.plot()
    
        # subscribe for df updates
        dataset.subscribe_update_push(finder, {_k: _k for _k in subsets})
        return finder
    

    standard_snorkel(dataset, **kwargs)

    Set up a BokehSnorkelExplorer for a SupervisableDataset.

    The snorkel explorer has a few standard interactions with the dataset:

    • read "raw" and "dev" subsets of the dataset, interpreting "dev" as "labeled"
    • subscribe to all updates in those subsets
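
    | Param      | Type                  | Description                                     |
    | :--------- | :-------------------- | :---------------------------------------------- |
    | `dataset`  | `SupervisableDataset` | the dataset to link to                          |
    | `**kwargs` |                       | kwargs to forward to the `BokehSnorkelExplorer` |
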
    Source code in hover/recipes/subroutine.py
    def standard_snorkel(dataset, **kwargs):
        """
        ???+ note "Set up a `BokehSnorkelExplorer` for a `SupervisableDataset`."
    
            The snorkel explorer has a few standard interactions with the dataset:
    
            -   read "raw" and "dev" subsets of the dataset, interpreting "dev" as "labeled"
            -   subscribe to all updates in those subsets
    
            | Param      | Type     | Description                          |
            | :--------- | :------- | :----------------------------------- |
            | `dataset`  | `SupervisableDataset` | the dataset to link to  |
            | `**kwargs` | | kwargs to forward to the `BokehSnorkelExplorer` |
        """
        # auto-detect the (main) feature to use
        feature = dataset.__class__.FEATURE_KEY
        explorer_cls = get_explorer_class("snorkel", feature)
    
        # first "static" version of the plot
        snorkel = explorer_cls.from_dataset(
            dataset,
            {"raw": "raw", "dev": "labeled"},
            title="Snorkel: □ for correct, x for incorrect, + for missed, o for hit; click on legends to hide or show LF",
            **kwargs,
        )
        snorkel.activate_search()
        snorkel.plot()
    
        # subscribe to dataset widgets
        dataset.subscribe_update_push(snorkel, {"raw": "raw", "dev": "labeled"})
        return snorkel
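
The `standard_*` functions differ mainly in the subset mapping they pass along: the annotator, finder and soft label explorer use an identity mapping over all subsets, while the snorkel explorer reads only `"raw"` and `"dev"` and renames `"dev"` to `"labeled"`. The sketch below shows what such a mapping could mean, assuming that keys are dataset-side subset names and values are explorer-side names. The helper `route_subsets` and the sample data are hypothetical and do not reflect how `hover` implements this internally.

```python
def route_subsets(dataset_subsets, mapping):
    """Regroup dataset subsets under the explorer-side names in `mapping`."""
    routed = {}
    for dataset_key, explorer_key in mapping.items():
        routed.setdefault(explorer_key, []).extend(dataset_subsets[dataset_key])
    return routed


# Toy data standing in for the four dataset subsets.
dataset_subsets = {
    "raw": ["r1", "r2"],
    "train": ["t1"],
    "dev": ["d1"],
    "test": [],
}

# Identity mapping, as built by `{_k: _k for _k in subsets}` in the source.
identity = {key: key for key in dataset_subsets}
all_views = route_subsets(dataset_subsets, identity)

# Snorkel-style mapping: only two subsets are read, and "dev" is renamed.
snorkel_mapping = {"raw": "raw", "dev": "labeled"}
snorkel_views = route_subsets(dataset_subsets, snorkel_mapping)
```

Subsets missing from the mapping, such as `"train"` and `"test"` in the snorkel case, are simply not routed to the explorer.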
    

    standard_softlabel(dataset, **kwargs)

    Set up a BokehSoftLabelExplorer for a SupervisableDataset.

    The soft label explorer has a few standard interactions with the dataset:

    • read all subsets of the dataset
    • subscribe to all updates in the dataset
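
    | Param      | Type                  | Description                                   |
    | :--------- | :-------------------- | :-------------------------------------------- |
    | `dataset`  | `SupervisableDataset` | the dataset to link to                        |
    | `**kwargs` |                       | kwargs to forward to `BokehSoftLabelExplorer` |
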
    Source code in hover/recipes/subroutine.py
    def standard_softlabel(dataset, **kwargs):
        """
        ???+ note "Set up a `BokehSoftLabelExplorer` for a `SupervisableDataset`."
    
            The soft label explorer has a few standard interactions with the dataset:
    
            -   read all subsets of the dataset
            -   subscribe to all updates in the dataset
    
            | Param      | Type     | Description                          |
            | :--------- | :------- | :----------------------------------- |
            | `dataset`  | `SupervisableDataset` | the dataset to link to  |
            | `**kwargs` | | kwargs to forward to `BokehSoftLabelExplorer` |
        """
        # auto-detect the (main) feature to use
        feature = dataset.__class__.FEATURE_KEY
        explorer_cls = get_explorer_class("softlabel", feature)
    
        # first "static" version of the plot
        subsets = explorer_cls.SUBSET_GLYPH_KWARGS.keys()
        softlabel = explorer_cls.from_dataset(
            dataset,
            {_k: _k for _k in subsets},
            "pred_label",
            "pred_score",
            title="SoftLabel: inspect predictions and use score range as filter",
            **kwargs,
        )
        softlabel.activate_search()
        softlabel.plot()
    
        # subscribe to dataset widgets
        dataset.subscribe_update_push(softlabel, {_k: _k for _k in subsets})
        return softlabel