##################
Tracking Platforms
##################

MLFlow
======

**Introduction**:

We implement an end-to-end pipeline example using MLFlow's ``Experiment`` and ``Run`` capabilities. Each stage of the pipeline is represented by its own experiment:

- ``Train``
- ``Evaluation``
- ``Saliency Map Generation``

Defining each ML pipeline stage as a separate MLFlow ``Experiment`` makes it possible to store and query per-stage information in a scalable manner. Within each experiment, information is logged through MLFlow's run-based workflow. Each MLFlow ``Run`` opens a dashboard with 4 main information subsections:

- ``Parameters``

  - Used in the ``Train`` stage to store model parameters, and in the ``Saliency Map Generation`` stage to store saliency algorithm parameters.

- ``Metrics``

  - Different metrics are stored across different stages. Most importantly, they are used to query specific runs based on a given threshold value.

- ``Tags``

  - Key-value pair entries containing information that is useful for querying and for linking runs across different experiments.

- ``Artifacts``

  - Used for storing images, numpy arrays, model metadata, etc.

**Train**:

Under the ``Train`` experiment, we create a single MLFlow run to log model ``parameters`` such as learning rate, epochs, optimizer, hidden layer size, etc. Additionally, we store some basic experiment info as ``tags``. Finally, in the ``artifacts`` section, we store ``model.pkl`` and files describing the Python environment and dependencies.

**Evaluation**:

Under the ``Evaluation`` experiment, we create ``k`` MLFlow runs for ``k`` different image samples from a given input dataset, one run per sample. For each run, we store the ``Predicted_class_conf`` and the class-wise confidence scores under ``metrics``, and the ``Image_id``, ``GT_class`` and ``Predicted_class`` under ``tags``.
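The per-sample logging described above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names ``build_evaluation_payload`` and ``log_evaluation_run`` and the ``conf_`` metric prefix are hypothetical, and the logging function assumes the ``mlflow`` package is installed.

```python
from typing import Dict, Sequence, Tuple


def build_evaluation_payload(
    image_id: str,
    gt_class: str,
    class_names: Sequence[str],
    confidences: Sequence[float],
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Return (metrics, tags) for one image sample of the Evaluation stage."""
    if len(class_names) != len(confidences):
        raise ValueError("class_names and confidences must have equal length")
    # Index of the highest-confidence class (first one wins on ties).
    best = max(range(len(confidences)), key=lambda i: confidences[i])
    # Class-wise confidence scores; the "conf_" prefix is a hypothetical choice.
    metrics = {
        f"conf_{name}": float(conf)
        for name, conf in zip(class_names, confidences)
    }
    metrics["Predicted_class_conf"] = float(confidences[best])
    # Tag values are stored as strings, so convert them explicitly.
    tags = {
        "Image_id": str(image_id),
        "GT_class": str(gt_class),
        "Predicted_class": str(class_names[best]),
    }
    return metrics, tags


def log_evaluation_run(
    metrics: Dict[str, float],
    tags: Dict[str, str],
    experiment_name: str = "Evaluation",
) -> None:
    """Log one image sample as one run (requires the mlflow package)."""
    import mlflow  # lazy import: the helper above works without MLflow

    mlflow.set_experiment(experiment_name)
    with mlflow.start_run(run_name=tags["Image_id"]):
        mlflow.log_metrics(metrics)
        mlflow.set_tags(tags)
```

Calling ``log_evaluation_run(*build_evaluation_payload(...))`` once per image sample would yield the ``k`` runs described above. Keeping the payload construction separate from the logging call makes the metric and tag names easy to check without a tracking server.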
Linking each image sample to an MLFlow run makes it possible to use MLFlow's run filter and query capabilities to retrieve image samples (both in the UI and at the backend using ``mlflow.search_runs()``) based on a given set of ``tags`` and ``metrics``.

Note: A call to ``mlflow.search_runs()`` with a specific filter string returns, by default, a pandas DataFrame containing the matching runs.

**Saliency Map Generation**:

Under the ``Saliency Map Generation`` experiment, we create a single MLFlow run based on the ``Image_id`` and related information obtained through the query result from the ``Evaluation`` experiment. In this run, we store the saliency map numpy array and saliency map visualizations (images) under ``artifacts`` for the queried ``Image_id``, and the saliency algorithm parameters under the ``parameters`` tab. Images stored under ``artifacts`` can also be previewed in the MLFlow UI. Depending on the design choice for output generation, images and numpy arrays can be stored either as separate files for each class or as a single file containing information from all classes. Most importantly, as a test of reproducibility, it is possible to query the saliency algorithm parameters from an existing saliency generation run and regenerate the same results in a new run. Finally, it is also possible to modify the values of an existing run to store and preview saliency maps for a different ``Image_id`` based on an updated backend query.

**MLFlow UI**:

The MLFlow UI can be used side by side with a Jupyter notebook that calls the relevant MLFlow functions. Creating each of the experiments mentioned above requires a call to ``mlflow.client.MlflowClient.create_experiment()``. Starting a run requires setting up a context manager using ``mlflow.start_run()``. All the necessary information is logged inside this context manager's code block.
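The backend query that links the ``Evaluation`` and ``Saliency Map Generation`` stages might look like the sketch below. The helper names ``build_filter_string`` and ``query_evaluation_runs`` are hypothetical, the metric and tag keys are the ones named in this document, and class names are assumed not to contain single quotes. The query function assumes the ``mlflow`` package is installed.

```python
from typing import Optional


def build_filter_string(
    min_conf: float,
    gt_class: Optional[str] = None,
    predicted_class: Optional[str] = None,
) -> str:
    """Compose a search_runs filter from a metric threshold and optional tags."""
    clauses = [f"metrics.Predicted_class_conf >= {min_conf}"]
    if gt_class is not None:
        clauses.append(f"tags.GT_class = '{gt_class}'")
    if predicted_class is not None:
        clauses.append(f"tags.Predicted_class = '{predicted_class}'")
    # Filter clauses are combined with "and".
    return " and ".join(clauses)


def query_evaluation_runs(filter_string: str, experiment_name: str = "Evaluation"):
    """Return the matching runs as a pandas DataFrame (requires mlflow)."""
    import mlflow  # lazy import: the helper above works without MLflow

    experiment = mlflow.get_experiment_by_name(experiment_name)
    if experiment is None:
        raise ValueError(f"experiment {experiment_name!r} does not exist")
    return mlflow.search_runs(
        experiment_ids=[experiment.experiment_id],
        filter_string=filter_string,
    )
```

In the returned DataFrame, tag and metric values appear in columns prefixed with ``tags.`` and ``metrics.``, so the queried image would be read from the ``tags.Image_id`` column and the run to update from the ``run_id`` column.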
The MLFlow portions of the Jupyter notebook are set up so that each cell creates the experiment and run(s) for one of the stages discussed above. Finally, the last section of the Jupyter notebook discusses an example of editing/updating an existing MLFlow run based on modified query results.

**Summarizing based on specific attributes**:

- **Compatibility**:

  - No modifications to the ``xaitk-saliency`` source code are needed as part of the MLFlow integration.

- **Customizability**:

  - 3 different API levels to log information.

    - High level API uses the ``mlflow.sklearn.log_model`` function.
    - Mid level API uses the ``mlflow`` object to log information.
    - Low level API uses the ``MlflowClient`` object to log information. Quoting the MLFlow documentation: “The ``mlflow.client`` module provides a Python CRUD interface to MLflow Experiments, Runs, Model Versions, and Registered Models. This is a lower level API that directly translates to MLflow REST API calls.”

- **Interactability**:

  - The main interactive feature of the MLFlow UI is the filter field, which accepts a query string built from a given set of metric thresholds and tag values and retrieves the corresponding MLFlow runs.
  - By clicking on each run, we can access the MLFlow dashboard containing the following sections:

    - Sections that cannot be modified from the frontend: Parameters, Metrics and Artifacts
    - Sections that can be modified from the frontend: Description and Tags

  - Updating the parameters, metrics and artifacts is done strictly at the backend, by querying with the ``mlflow.search_runs()`` function and calling the ``mlflow.start_run()`` function with the same ``run_id``.

- **Scalability**:

  - Generating saliency maps for a given set of input samples is highly scalable, because an existing MLFlow run can be updated to log the saliency map images and numpy arrays of the updated input.
  - Generating metric values for a large dataset requires ``O(n)`` MLFlow runs, where ``n`` is the number of image samples.
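The backend update of an existing run through the low level API might be sketched as follows. The function name ``update_run`` is hypothetical, and the ``client`` argument is assumed to behave like ``mlflow.client.MlflowClient``, that is, to provide ``log_metric(run_id, key, value)`` and ``set_tag(run_id, key, value)``. Passing the client in as an argument keeps the sketch independent of a running tracking server.

```python
from typing import Any, Mapping


def update_run(
    client: Any,
    run_id: str,
    metrics: Mapping[str, float],
    tags: Mapping[str, str],
) -> int:
    """Write metrics and tags to an existing run; return the number written.

    `client` is assumed to expose log_metric(run_id, key, value) and
    set_tag(run_id, key, value), as mlflow.client.MlflowClient does.
    """
    written = 0
    for key, value in metrics.items():
        client.log_metric(run_id, key, float(value))
        written += 1
    for key, value in tags.items():
        # Tag values are stored as strings, so convert them explicitly.
        client.set_tag(run_id, key, str(value))
        written += 1
    return written
```

With MLFlow installed, this would be called as ``update_run(MlflowClient(), run_id, metrics, tags)``, where ``run_id`` comes from a ``mlflow.search_runs()`` result. The mid level equivalent is to reopen the run with ``mlflow.start_run(run_id=run_id)`` and log inside that context manager.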