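MLflow 0.2 Released, a New Project from the Spark Team: TensorFlow Integration, Tracking Server Updates, and New S3 Storage Support

Reading guide

1. What is built into MLflow 0.2?
2. Which machine learning components might MLflow integrate next?
3. Can MLflow 0.2 run in the cloud?
4. What do you see as MLflow's advantages?

Background:
MLflow may be unfamiliar to many readers. It is a new project from the Spark team. For more background, see the Spark MLFlow introduction: http://www.aboutyun.com/forum.php?mod=viewthread&tid=24862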

##################

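At this year's Spark + AI Summit, the Spark team introduced MLflow, an open source platform that simplifies the machine learning lifecycle. Since the release, we have seen interest from data scientists and engineers using MLflow. The MLflow GitHub repository (https://github.com/databricks/mlflow) already has 180 forks, and more than a dozen contributors have submitted issues. In addition, nearly 100 people attended the first MLflow meetup.

MLflow v0.2 has now been released. If you run pip install mlflow as described in the MLflow quickstart guide (https://mlflow.org/docs/latest/quickstart.html), MLflow 0.2 is what you get. This post covers the new features in this release.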

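Built-in TensorFlow integration
MLflow makes it easy to train models from any machine learning library, as long as they can be wrapped in a Python function, but for very commonly used libraries we want to provide built-in support. In this release we added the mlflow.tensorflow package, which makes it easy to log TensorFlow models to MLflow Tracking. Once a model is logged, you can use all of the deployment tools MLflow supports (for example a local REST server, Azure ML serving, or Apache Spark for batch inference).

The example below shows how to log a trained TF model and then deploy it through the built-in pyfunc functionality.

Training environment: saving the trained TF model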
[mw_shl_code=bash,true]import mlflow.tensorflow

# your_regressor, model_dir and receiver_fn are placeholders for your own
# trained estimator, export directory and serving input receiver function.

# Save the estimator in SavedModel format.
estimator_path = your_regressor.export_savedmodel(model_dir, receiver_fn)

# Log the exported SavedModel with MLflow.
# signature_def_key: name of the signature definition to compute
#                    when the SavedModel is loaded back for inference
#                    ref: (https://www.tensorflow.org/serving/signature_defs).
# artifact_path: path (under the artifact directory for the current run)
#                where the model will be saved as an artifact.
mlflow.tensorflow.log_saved_model(saved_model_dir=estimator_path,
                                  signature_def_key="predict",
                                  artifact_path="model")[/mw_shl_code]

Deployment environment: loading the saved TF model and predicting
[mw_shl_code=bash,true]import mlflow.tensorflow

estimator_path = ... # location where the TF model is saved

# We can load the TensorFlow estimator as a Python function.
# You can use a local file or pass a run ID to load an artifact from a previous run.
pyfunc = mlflow.tensorflow.load_pyfunc(estimator_path)

# We can now apply the model to a Pandas DataFrame to make predictions.
# df is a placeholder for your own input DataFrame.
predict_df = pyfunc.predict(df)[/mw_shl_code]
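
The idea that a model from any library can be used once it is wrapped in a Python function can be illustrated without MLflow or TensorFlow installed. The following is only a minimal sketch under stated assumptions, not MLflow's implementation: PyFuncLikeModel and linear_regressor are hypothetical names, the weights are made up, and a plain list of dicts stands in for a Pandas DataFrame.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


class PyFuncLikeModel:
    """Hypothetical wrapper: exposes any callable behind one predict() method."""

    def __init__(self, predict_fn: Callable[[Row], float]) -> None:
        self._predict_fn = predict_fn

    def predict(self, rows: List[Row]) -> List[float]:
        # Apply the wrapped function to each input row, preserving order.
        return [self._predict_fn(row) for row in rows]


def linear_regressor(row: Row) -> float:
    # Stand-in for a trained model: y = 2 * x + 1 (made-up weights).
    return 2.0 * row["x"] + 1.0


model = PyFuncLikeModel(linear_regressor)
predictions = model.predict([{"x": 0.0}, {"x": 1.5}])
print(predictions)
```

Because callers only see predict(), the same deployment code can serve a model regardless of which library trained it, which is the property the pyfunc format relies on.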
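
Tracking Server updates
MLflow 0.2 adds a new mlflow server command, which starts the MLflow Tracking server for tracking and querying experiment runs. Unlike the local mlflow ui command, mlflow server supports multiple worker threads and S3-backed storage. See the documentation (https://www.mlflow.org/docs/late ... g-a-tracking-server) for details.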
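
As a rough illustration of how a client might be pointed at such a server, the sketch below builds a tracking URI and exports it through an environment variable. The tracking_uri helper and the host name tracking.example.com are hypothetical. Port 5000 and the MLFLOW_TRACKING_URI variable are assumptions about a default setup that should be checked against the documentation for your MLflow version.

```python
import os


def tracking_uri(host: str, port: int = 5000, scheme: str = "http") -> str:
    """Hypothetical helper: build the URI a client would use for a tracking server."""
    return f"{scheme}://{host}:{port}"


# tracking.example.com is a made-up host name used only for illustration.
uri = tracking_uri("tracking.example.com")

# Assumption: the MLflow client reads MLFLOW_TRACKING_URI when no URI is set in code.
os.environ["MLFLOW_TRACKING_URI"] = uri
print(uri)
```

Setting the URI through the environment keeps training scripts unchanged when they move between a local run and a shared server.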

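S3-backed artifact storage
A key feature of MLflow is recording the output of training runs, which includes arbitrary files called "artifacts". The first release of MLflow only supported logging artifacts to a shared POSIX file system. In MLflow 0.2, we added support for S3 storage through the --artifact-root parameter of the mlflow server command. This makes it easy to run MLflow training jobs on multiple cloud instances and track their results. The following example shows how to start a tracking server with an S3-backed artifact store.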

Running the MLflow Server on an EC2 instance
[mw_shl_code=bash,true]% hostname
ec2-11-222-333-444.us-west-2.compute.amazonaws.com

% mlflow server \
   --file-store /mnt/persistent-disk/mlflow_data \
   --artifact-root s3://my-mlflow-bucket/[/mw_shl_code]

MLflow Client:
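[mw_shl_code=bash,true]import mlflow

# Point the client at the tracking server started above.
mlflow.set_tracking_uri("http://ec2-11-222-333-444.us-west-2.compute.amazonaws.com")
...
with mlflow.start_run():
    # Parameters and metrics are recorded by the tracking server.
    mlflow.log_param("x", 1)
    mlflow.log_metric("y", 2)
    ...
    # Artifacts are uploaded to the configured artifact root.
    mlflow.log_artifact("/tmp/model")[/mw_shl_code]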
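
To make the role of the artifact root more concrete, the sketch below shows one way a per-run artifact location could be derived from it. This is an illustration only: artifact_uri is a hypothetical helper, the experiment ID "0" and run ID "abc123" are made up, and the root/experiment/run/artifacts layout is an assumption. The actual layout is decided by the tracking server.

```python
import posixpath


def artifact_uri(artifact_root: str, experiment_id: str, run_id: str, local_path: str) -> str:
    """Hypothetical helper: compose a per-run artifact URI under a shared root."""
    # Drop a trailing slash so the joined URI has no empty path segment.
    root = artifact_root.rstrip("/")
    # Keep only the final component of the local path as the artifact name.
    name = posixpath.basename(local_path.rstrip("/"))
    return "/".join([root, experiment_id, run_id, "artifacts", name])


# Assumed layout: <artifact root>/<experiment id>/<run id>/artifacts/<name>.
uri = artifact_uri("s3://my-mlflow-bucket/", "0", "abc123", "/tmp/model")
print(uri)
```

Keeping each run under its own prefix is what lets jobs on several cloud instances write to the same bucket without overwriting one another's files.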

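Other improvements
Beyond these larger features, this release also includes a number of bug fixes and documentation fixes. The full list of changes is in the CHANGELOG. Feedback is welcome at mlflow-users@googlegroups.com, or you can file issues or submit patches on GitHub.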

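What's next for MLflow

We plan to keep updating MLflow throughout the alpha period. For example, ongoing work includes built-in integrations with more libraries (such as PyTorch, Keras, and MLlib) and further usability improvements to the tracking server.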
