HERE MLflow Plugin
MLflow is a popular open source platform for managing Machine Learning (ML) development. An MLflow plugin is provided to manage the ML lifecycle on the HERE platform. The plugin lets you manage your ML experiments on the platform and share them with other platform users, or make them available on the HERE Marketplace. You can train with any ML framework on any cloud platform and choose to manage the resulting ML artifacts on the platform. You can use the plugin while training a model, or upload an already trained model to the platform. The plugin supports:
- Tracking experiments to record and compare parameters and results
- Packaging ML code in a reusable, reproducible form in order to share with other data scientists
- Providing a central model store to collaboratively manage the full lifecycle of an MLflow model, including model versioning, stage transitions, and annotations through a catalog HRN.
Prerequisites
Installation
To install the module, use the following command:
pip install --extra-index-url https://repo.platform.here.com/artifactory/api/pypi/analytics-pypi/simple/ here-MLflow-plugin==2.18.0
This command installs version 2.18.0 of the MLflow plugin module in the current environment.
Notes
- If you encounter errors related to the GDAL or geopandas dependency when installing the package on Windows, follow these steps.
- If you encounter errors related to the Microsoft Visual C++ build tools, follow these steps.
Developer Flow
- Create a catalog for storing all the information.
here_mlflow_plugin_setup -c <catalog_id>
- Set the tracking URI to point to this catalog. Specify the catalog HRN, not the catalog ID.
For Linux/macOS:
export MLFLOW_TRACKING_URI=here+mlflow://catalog/v1/<catalog_hrn>
For Windows:
set MLFLOW_TRACKING_URI=here+mlflow://catalog/v1/<catalog_hrn>
Alternatively, set it in code:
mlflow.set_tracking_uri("here+mlflow://catalog/v1/<catalog_hrn>")
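As a minimal sketch of the in-code option, the snippet below builds the tracking URI from a catalog HRN and exports it through the MLFLOW_TRACKING_URI environment variable, which MLflow reads at run time. The helper name build_tracking_uri, the "hrn:" prefix check, the lowercase scheme, and the example HRN are illustrative assumptions, not part of the plugin.

```python
import os


def build_tracking_uri(catalog_hrn: str) -> str:
    """Return the tracking URI for a catalog HRN (hypothetical helper)."""
    # Assumption: an HRN starts with "hrn:", while a bare catalog ID does not.
    if not catalog_hrn.startswith("hrn:"):
        raise ValueError(f"expected a catalog HRN, got {catalog_hrn!r}")
    return f"here+mlflow://catalog/v1/{catalog_hrn}"


# Placeholder HRN for illustration only; substitute your own catalog HRN.
uri = build_tracking_uri("hrn:here:data::example-realm:example-catalog")

# MLflow picks up the tracking URI from this environment variable.
os.environ["MLFLOW_TRACKING_URI"] = uri
print(uri)
```

Rejecting values that do not look like an HRN catches the common mistake of passing the catalog ID before any request is made.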
Notebook Name | File Name | Description | Packages Required |
SDII Encoder | SDII_Encoder-Tensorflow.ipynb | Extract SDII data and use TensorFlow to create an encoder for the time series data. Save the model on the platform using the MLflow plugin. | HERE Geopandas Adapter, MLflow plugin, TensorFlow 1.x, Keras 2.3.x |
Predict Temperature at a Location Using Distributed XGBoost and Hyperparameter Tuning Using RayTune | Weather_Forecasting-Distributed_XGBoost.ipynb | Extract weather data and train an XGBoost model to predict the temperature at a location. Use RayTune to tune the hyperparameters, then save the best trained model and all the hyperparameters on the platform using the MLflow plugin. | HERE Geopandas Adapter, MLflow plugin, Ray, RayTune, xgboost_ray, scikit-learn |
Predict Temperature at a Location Using Distributed PyTorch | Weather_Forecasting-Distributed_PyTorch.ipynb | Extract weather data and train a PyTorch model to predict the temperature at a location. Save the trained model on the platform using the MLflow plugin. | HERE Geopandas Adapter, MLflow plugin, Ray, PyTorch, scikit-learn |
- Run the training locally or anywhere in the cloud, then upload the newly trained model, or an existing model, to the platform. For more information, see the example notebooks.
- Launch the MLflow UI locally to visualize and compare all the logged information. Find the run_id by selecting the relevant experiment name.
mlflow ui --backend-store-uri here+mlflow://catalog/v1/<catalog_hrn> --default-artifact-root here+mlflow://catalog/v1/<catalog_hrn>
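The training and lookup steps above can also be sketched in Python with the standard MLflow API, assuming the tracking URI is already set. The function names, the "loss" metric, and the optional api parameter (which lets a stand-in object replace the mlflow module for a dry run) are illustrative assumptions. Whether the plugin's backend supports run search has not been confirmed here, so treat find_run_ids as a possible alternative to reading the run_id from the UI, not a documented feature.

```python
def log_training_run(experiment_name, params, loss_by_epoch, api=None):
    """Log hyperparameters and per-epoch loss to one run; return its run_id."""
    if api is None:
        # Deferred import: requires mlflow and the plugin to be installed.
        import mlflow as api
    api.set_experiment(experiment_name)
    with api.start_run() as run:
        api.log_params(params)
        for epoch, loss in enumerate(loss_by_epoch):
            api.log_metric("loss", loss, step=epoch)
        return run.info.run_id


def find_run_ids(experiment_name, api=None):
    """Return the run_ids recorded under an experiment name, or [] if absent."""
    if api is None:
        import mlflow as api
    experiment = api.get_experiment_by_name(experiment_name)
    if experiment is None:
        return []
    runs = api.search_runs(
        experiment_ids=[experiment.experiment_id],
        output_format="list",  # return Run objects instead of a DataFrame
    )
    return [run.info.run_id for run in runs]
```

For example, log_training_run("weather-forecast", {"lr": 0.01}, [0.9, 0.6]) would record one run with two loss values, and find_run_ids("weather-forecast") would then list its run_id; the experiment name and values are placeholders.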
Appendix
here_mlflow_plugin_setup -d <catalog_hrn>
Layer Name | Layer Type | Content Type | Attribute 1 | Type | Attribute 2 | Type | Attribute 3 | Type | Attribute 4 | Type |
tracking-experiment | index | application/json | ingestion_time | timewindow(10 min) | experiment_id | string | experiment_name | string | - | - |
tracking-run | index | application/json | start_time | timewindow(10 min) | experiment_id | string | run_id | string | - | - |
artifact-metadata | index | application/json | ingestion_time | timewindow(10 min) | run_id | string | - | - | - | - |
artifact-data | version | application/octet-stream | Partition type | Generic | - | - | - | - | - | - |
model-metadata | index | application/json | ingestion_time | timewindow(10 min) | model_name | string | - | - | - | - |
model-version-metadata | index | application/json | ingestion_time | timewindow(10 min) | model_name | string | version | int | run_id | string |
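Most of the index layers above are keyed by a timewindow(10 min) attribute. As a rough illustration, the sketch below maps a timestamp to the start of its 10-minute window, assuming that a time window groups records by rounding the timestamp down to the window boundary. That rounding rule and the helper name window_start_ms are assumptions made for this example and are not stated in this document.

```python
from datetime import datetime, timezone

WINDOW_MS = 10 * 60 * 1000  # 10 minutes in milliseconds


def window_start_ms(timestamp_ms: int, window_ms: int = WINDOW_MS) -> int:
    """Round an epoch-millisecond timestamp down to its window start."""
    return timestamp_ms - (timestamp_ms % window_ms)


# Example: a record ingested at 12:34:56 UTC falls into the 12:30 window.
ts = int(datetime(2024, 1, 1, 12, 34, 56, tzinfo=timezone.utc).timestamp() * 1000)
start = window_start_ms(ts)
print(datetime.fromtimestamp(start / 1000, tz=timezone.utc).isoformat())
# prints 2024-01-01T12:30:00+00:00
```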