##################
Tracking Platforms
##################


MLFlow
======

**Introduction**:

We implement an end-to-end pipeline example using MLFlow's ``Experiment`` and
``Run`` capabilities. Each experiment represents one stage of the pipeline, as
follows:

- <Algorithm name> - ``Train``
- <Algorithm name> - ``Evaluation``
- <Algorithm name> - ``Saliency Map Generation``

By defining each ML pipeline stage as a separate MLFlow ``Experiment``,
it is possible to store and query per-stage information in a scalable manner.
Within each experiment, information is logged through MLFlow's ``Run``-based
workflow. Each MLFlow ``Run`` has its own dashboard page with four main
sections:

- ``Parameters`` - Specifically used in the ``Train`` stage to store model
  parameters.
- ``Metrics`` - Different metrics are stored across different stages and are
  used, most importantly, in querying specific runs based on a given threshold
  value.
- ``Tags`` - Key-value pair entries containing information that is useful for
  querying and also to link runs across different experiments.
- ``Artifacts`` - Used for storing images, numpy arrays, model metadata, etc.
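The stage naming and the four run sections described above can be sketched as
follows. This is a minimal illustration rather than the notebook's actual code:
the helper names ``stage_experiment_names`` and ``log_example_run``, the
algorithm name ``ExampleAlgorithm``, and every logged value are hypothetical,
and calling ``log_example_run`` assumes that ``mlflow`` is installed.

```python
STAGES = ("Train", "Evaluation", "Saliency Map Generation")


def stage_experiment_names(algorithm_name):
    """Return one experiment name per pipeline stage."""
    return [f"{algorithm_name} - {stage}" for stage in STAGES]


def log_example_run(experiment_name):
    """Log one placeholder value into each of the four run sections."""
    import mlflow  # imported lazily so the helper above works without MLFlow

    # set_experiment creates the experiment if it does not exist yet.
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run() as run:
        mlflow.log_param("learning_rate", 0.001)        # Parameters
        mlflow.log_metric("accuracy", 0.9)              # Metrics
        mlflow.set_tag("stage", "Train")                # Tags
        mlflow.log_text("example notes", "notes.txt")   # Artifacts
    return run.info.run_id


if __name__ == "__main__":
    print(stage_experiment_names("ExampleAlgorithm"))
```

Keeping the stage names in one place means the same strings can be reused when
creating experiments and when querying them later.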

**Train**:

Under the ``Train`` experiment, we create a single MLFlow run to log model
``parameters`` such as learning rate, epochs, optimizer, and hidden layer size.
Additionally, we store some basic experiment info as ``tags``. Finally, in the
``artifacts`` section, we store ``model.pkl`` and files describing the Python
environment and dependencies.
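A ``Train`` run along these lines might look like the sketch below. It assumes
a scikit-learn model (consistent with the ``mlflow.sklearn.log_model`` function
mentioned in the summary at the end of this section); the helper names
``stringify_params`` and ``log_train_run`` and the values in ``example_params``
are hypothetical, and the exact set of files written next to the model can vary
between MLFlow versions.

```python
def stringify_params(params):
    """MLFlow stores parameter values as strings; convert them up front."""
    return {key: str(value) for key, value in params.items()}


def log_train_run(model, params, tags, experiment_name):
    """Create one run that holds the parameters, tags, and fitted model."""
    import mlflow
    import mlflow.sklearn

    mlflow.set_experiment(experiment_name)
    with mlflow.start_run() as run:
        mlflow.log_params(stringify_params(params))
        mlflow.set_tags(tags)
        # Stores the serialized model together with environment and
        # dependency files under the run's artifacts.
        mlflow.sklearn.log_model(model, "model")
    return run.info.run_id


# Hypothetical hyperparameters, named after scikit-learn's MLPClassifier.
example_params = {
    "learning_rate_init": 0.001,
    "max_iter": 200,
    "solver": "adam",
    "hidden_layer_sizes": (64,),
}
```

Converting values to strings before logging makes it explicit that a value such
as ``(64,)`` comes back as the string ``"(64,)"`` when the run is queried.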

**Evaluation**:

Under the ``Evaluation`` experiment, we create ``k`` MLFlow runs, one for each of ``k``
image samples from a given input dataset. For each run, we store the
``Predicted_class_conf`` and class-wise confidence scores under ``metrics`` and
the ``Image_id``, ``GT_class`` and ``Predicted_class`` under ``tags``. Linking each
image sample to an MLFlow run makes it possible to use MLFlow's run filter and
query capabilities to retrieve image samples based on a given set of ``tags`` and
``metrics``, both in the UI and at the backend using ``mlflow.search_runs()``.
Note: A call to ``mlflow.search_runs()`` with a specific filter string returns a pandas
DataFrame containing the matching runs.
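The per-sample logging and the backend query can be sketched as follows. The
tag and metric names come from the description above; the helper names, the
layout of the ``samples`` dictionaries, and the example class ``cat`` are
assumptions made for illustration.

```python
def build_filter(gt_class, min_confidence):
    """Build a filter string over one tag and one metric threshold."""
    return (
        f"tags.GT_class = '{gt_class}' "
        f"and metrics.Predicted_class_conf >= {min_confidence}"
    )


def log_evaluation_runs(samples, experiment_name):
    """Create one run per image sample (k samples give k runs)."""
    import mlflow

    mlflow.set_experiment(experiment_name)
    for sample in samples:
        with mlflow.start_run():
            mlflow.log_metric("Predicted_class_conf", sample["confidence"])
            # Hypothetical layout: {"<class name>": <confidence score>, ...}
            mlflow.log_metrics(sample["class_scores"])
            mlflow.set_tags({
                "Image_id": sample["image_id"],
                "GT_class": sample["gt_class"],
                "Predicted_class": sample["predicted_class"],
            })


def query_samples(experiment_name, gt_class, min_confidence):
    """Return the matching runs as a pandas DataFrame."""
    import mlflow

    return mlflow.search_runs(
        experiment_names=[experiment_name],
        filter_string=build_filter(gt_class, min_confidence),
    )
```

In the returned DataFrame, tags and metrics appear as prefixed columns such as
``tags.Image_id`` and ``metrics.Predicted_class_conf``. The same filter string
can also be typed into the filter field of the MLFlow UI.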

**Saliency Map Generation**:

Under the ``Saliency Map Generation`` experiment, we create a single MLFlow run based on the
``Image_id`` and related information obtained from the query result of the ``Evaluation``
experiment. In this run, we store the saliency map numpy array and the saliency map
visualizations (images) under ``artifacts`` for the queried ``Image_id``, and the saliency
algorithm parameters under the ``parameters`` tab. Images stored under ``artifacts`` can be
previewed in the MLFlow UI. Depending on the design choice for output generation, images and
numpy arrays can be stored either as separate files for each class or as a single file containing
information from all classes. Most importantly, as a test of reproducibility, it is
possible to query the saliency algorithm parameters from an existing saliency generation run
and regenerate the same results in a new run. It is also possible to modify
the values of an existing run to store and preview saliency maps for a different ``Image_id``
based on an updated backend query.
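The following sketch shows one way to log per-class saliency arrays and to read
the algorithm parameters back for a reproducibility check. The helper names,
the file naming scheme, and the ``saliency_arrays`` artifact folder are
hypothetical. The sketch covers the numpy arrays only; it assumes ``mlflow``
and ``numpy`` are installed.

```python
def artifact_file_names(image_id, class_names, per_class=True):
    """Name the saliency outputs: one file per class, or one combined file."""
    if per_class:
        return [f"{image_id}_{name}" for name in class_names]
    return [f"{image_id}_all_classes"]


def log_saliency_run(image_id, saliency_maps, class_names, algo_params,
                     experiment_name):
    """Log one saliency array per class for a single queried Image_id."""
    import os
    import tempfile

    import mlflow
    import numpy as np

    mlflow.set_experiment(experiment_name)
    with mlflow.start_run() as run:
        mlflow.log_params(algo_params)
        mlflow.set_tag("Image_id", image_id)
        with tempfile.TemporaryDirectory() as tmp_dir:
            names = artifact_file_names(image_id, class_names)
            for name, sal_map in zip(names, saliency_maps):
                path = os.path.join(tmp_dir, f"{name}.npy")
                np.save(path, sal_map)
                mlflow.log_artifact(path, artifact_path="saliency_arrays")
    return run.info.run_id


def load_saliency_params(run_id):
    """Fetch the parameters of an earlier run (values come back as strings)."""
    import mlflow

    return mlflow.get_run(run_id).data.params
```

Visualizations could be logged in the same loop with ``mlflow.log_image``,
which accepts a numpy array or a PIL image. Because ``load_saliency_params``
returns strings, numeric parameters need to be converted back before they are
passed to the saliency algorithm in a new run.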

**MLFlow UI**:

The MLFlow UI can be used side by side with a Jupyter notebook that calls specific MLFlow
functions. Creating each of the experiments mentioned above requires a call to
``mlflow.client.MlflowClient.create_experiment()``. Starting a run requires setting up a context
manager using ``mlflow.start_run()``. All the necessary information is logged within this context
manager code block. The MLFlow portions of the Jupyter notebook are set up so that each cell
creates the experiment and run(s) for one of the stages discussed above.
Finally, in the last section of the Jupyter notebook, we discuss an example of editing/updating an
existing MLFlow run based on modified query results.
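The experiment creation and run context manager described above can be sketched
as follows. The helper names are hypothetical, and the lookup before
``create_experiment()`` is an addition made here so that a notebook cell can be
re-run without failing on an experiment name that already exists.

```python
def get_or_create_experiment(client, name):
    """Return the ID of the named experiment, creating it if needed."""
    experiment = client.get_experiment_by_name(name)
    if experiment is not None:
        return experiment.experiment_id
    return client.create_experiment(name)


def start_logged_run(experiment_name, params):
    """Open a run in the experiment and log inside the context manager."""
    import mlflow
    from mlflow.client import MlflowClient

    experiment_id = get_or_create_experiment(MlflowClient(), experiment_name)
    with mlflow.start_run(experiment_id=experiment_id) as run:
        # Everything logged inside this block is attached to `run`.
        mlflow.log_params(params)
    return run.info.run_id
```

Passing the client in as an argument keeps ``get_or_create_experiment``
independent of any particular tracking server configuration.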

**Summarizing based on specific attributes**:

- **Compatibility**:

  - No modifications to ``xaitk-saliency`` source code as part of the MLFlow integration.

- **Customizability**:

  - Three API levels are available for logging information.

    - The high-level API uses the ``mlflow.sklearn.log_model`` function.
    - The mid-level API uses functions of the ``mlflow`` module to log information.
    - The low-level API uses the ``MlflowClient`` object to log information. Quoting MLFlow
      documentation - “The ``mlflow.client`` module provides a Python CRUD interface to
      MLflow Experiments, Runs, Model Versions, and Registered Models. This is a lower level
      API that directly translates to MLflow REST API calls.”

- **Interactability**:

  - The main interactive feature of the MLFlow UI is the filter field, which accepts a query
    string built from metric thresholds and tag values and retrieves the corresponding
    MLFlow runs.
  - By clicking on each run, we can access the MLFlow dashboard containing the following sections:

    - Frontend unmodifiable sections - Parameters, Metrics and Artifacts
    - Frontend modifiable sections - Description and Tags

  - Updating the parameters, metrics and artifacts is done strictly at the backend, by finding the run
    with the ``mlflow.search_runs()`` function and calling ``mlflow.start_run()`` with the same ``run_id``.

- **Scalability**:

  - Generating saliency maps for a given set of input samples scales well: an existing MLFlow
    run can be updated to log the saliency map images and numpy arrays for the updated input.
  - Generating metric values for a large dataset requires ``O(n)`` MLFlow runs, where
    ``n`` is the number of image samples.
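The difference between the mid-level and low-level APIs when updating an
existing run can be sketched as follows. The helper names are hypothetical; the
``log_metric`` and ``set_tag`` methods are part of ``MlflowClient``, and
``mlflow.start_run()`` accepts a ``run_id`` to reopen an existing run.

```python
def update_run_with_client(client, run_id, metrics, tags):
    """Low-level API: write to an existing run without activating it."""
    for key, value in metrics.items():
        client.log_metric(run_id, key, value)
    for key, value in tags.items():
        client.set_tag(run_id, key, value)


def update_run_fluent(run_id, metrics):
    """Mid-level API: reopen the run, then log through the mlflow module."""
    import mlflow

    with mlflow.start_run(run_id=run_id):
        mlflow.log_metrics(metrics)
```

The ``run_id`` in both cases would typically come from the ``run_id`` column of
a ``mlflow.search_runs()`` result.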