Databricks recently announced a new release of MLflow, an open source, multi-cloud framework for the machine learning lifecycle, now with R integration.
RStudio partnered with Databricks to develop an R API for MLflow v0.7.0, which was showcased at the Spark + AI Summit Europe.
According to a release issued by the company, before MLflow, the machine learning industry did not have a standard process or end-to-end infrastructure to develop and produce applications simply and consistently.
With MLflow, organizations can package their code as reproducible runs, execute and compare hundreds of parallel experiments, and leverage any hardware or software platform for training, tuning, or hyperparameter search. Organizations can also deploy and manage models in production on a variety of clouds and serving platforms.
“With MLflow, data science teams can systematically package and reuse models across frameworks, track and share experiments locally or in the cloud, and deploy models virtually anywhere,” said Matei Zaharia, Databricks CTO, the original creator of Apache Spark, and Tech Lead of MLflow. “The flurry of interest and contributions we’ve seen from the data science community validates the need for an open source framework to streamline the machine learning lifecycle.”
Since the launch of MLflow four months ago, the following new features have been added:
- In addition to R, MLflow supports Python, Java, and Scala, as well as a REST server interface which can be used from any language.
- Built-in integrations with the most popular machine learning libraries such as scikit-learn, TensorFlow, Keras, PyTorch, H2O, and Apache Spark MLlib.
- The ability to quickly deploy machine learning models to multiple cloud services, including Databricks, Azure Machine Learning, and Amazon SageMaker, depending on a team's needs. MLflow also uses AWS S3, Google Cloud Storage, and Azure Data Lake Storage for storage, allowing teams to track and share artifacts from their code.