We all know how crucial training data is for data scientists building quality machine learning models. But when productionizing machine learning, metadata is equally important.
Consider for example:
- Capture of Lineage Information (e.g., Which dataset influences which Model?)
- Capture of Audit Information (e.g., A given model was trained two months ago with the following training/validation performance)
- Reproducible Model Training
- Model Serving Policy (e.g., Which model should be deployed in production based on training statistics)
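As a rough illustration of the kinds of metadata listed above, here is a minimal sketch in plain Python. The record types (`DatasetRecord`, `ModelRecord`), their fields, and the two helper functions are hypothetical names invented for this example, and the sample values are made up. This is not ArangoML Pipeline's schema or API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical record identifying one version of a dataset."""
    name: str
    version: str


@dataclass
class ModelRecord:
    """Hypothetical record holding the metadata kept for one trained model."""
    name: str
    trained_on: DatasetRecord        # lineage: which dataset influenced this model
    trained_at: date                 # audit: when the model was trained
    train_accuracy: float            # audit: training performance
    validation_accuracy: float       # audit: validation performance
    hyperparameters: dict = field(default_factory=dict)  # needed to reproduce training


def models_influenced_by(dataset: DatasetRecord,
                         models: List[ModelRecord]) -> List[ModelRecord]:
    """Lineage query: return every model trained on the given dataset."""
    return [m for m in models if m.trained_on == dataset]


def pick_model_to_serve(models: List[ModelRecord],
                        min_validation_accuracy: float) -> Optional[ModelRecord]:
    """Serving policy: best validation accuracy above a threshold, else None."""
    eligible = [m for m in models
                if m.validation_accuracy >= min_validation_accuracy]
    return max(eligible, key=lambda m: m.validation_accuracy, default=None)


# Made-up sample records, for illustration only.
sales_v1 = DatasetRecord("sales", "v1")
sales_v2 = DatasetRecord("sales", "v2")
models = [
    ModelRecord("churn-a", sales_v1, date(2020, 1, 15), 0.93, 0.88, {"lr": 0.01}),
    ModelRecord("churn-b", sales_v2, date(2020, 3, 15), 0.95, 0.91, {"lr": 0.005}),
]

lineage = models_influenced_by(sales_v1, models)
to_serve = pick_model_to_serve(models, min_validation_accuracy=0.9)
print([m.name for m in lineage])
print(to_serve.name if to_serve else "no eligible model")
```

Keeping these records outside the training code is what makes questions such as "which models need retraining when this dataset changes?" answerable after the fact.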
This is why we built ArangoML Pipeline, a flexible metadata store that can be used with your existing ML pipeline. It plugs into existing ML pipelines through simple Python and HTTP APIs.
Check out this page for further details on the challenge of Metadata in Machine Learning and ArangoML Pipeline.
ArangoML Pipeline Cloud
Today we are happy to announce a first version of Managed ML Metadata. Now you can start using ArangoML Pipeline without having to even start a separate docker container.
Additionally, because it is built on ArangoDB’s managed cloud service Oasis, it can be up and running in just a few clicks, and in the Free-to-Try tier even without a lengthy registration.
If you already have a notebook for your machine learning project, getting started is as simple as adding an ArangoML Pipeline configuration that points to our Free-to-Try tier at arangoml.arangodb.cloud. A dedicated environment (an ArangoDB database with custom login credentials) will then be generated for you and persisted in the config.
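To make the idea of a persisted configuration concrete, here is a minimal sketch using only the Python standard library. It does not use the ArangoML Pipeline client, which ships its own configuration classes. The dictionary keys, the helper function names, and the choice of port 8529 (ArangoDB's default port) and HTTPS are assumptions made for illustration, and no request is sent to the service.

```python
import json
import tempfile
from pathlib import Path


def build_connection_settings(host: str = "arangoml.arangodb.cloud",
                              port: int = 8529,
                              protocol: str = "https") -> dict:
    """Collect connection settings in a plain dict (hypothetical key names)."""
    return {"host": host, "port": port, "protocol": protocol}


def service_url(settings: dict) -> str:
    """Build the base URL implied by the settings."""
    return f"{settings['protocol']}://{settings['host']}:{settings['port']}"


def save_config(settings: dict, path: Path) -> None:
    """Persist the settings as JSON so a later session can reuse them."""
    path.write_text(json.dumps(settings, indent=2))


def load_config(path: Path) -> dict:
    """Load previously persisted settings."""
    return json.loads(path.read_text())


settings = build_connection_settings()
url = service_url(settings)

# Round-trip through a temporary file to show that the config survives a reload.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "arangoml_config.json"
    save_config(settings, config_path)
    reloaded = load_config(config_path)

print(url)
print(reloaded == settings)
```

In a real setup, the generated database name and login credentials would be stored alongside these settings, so that later sessions reconnect to the same environment instead of creating a new one.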
ArangoML Pipeline Cloud currently comes with two different service levels:
The Free-to-Try tier allows for a no-hassle setup, as it automatically configures your own environment based on a simple API call. It is ideal for testing ArangoML Pipeline Cloud, but comes with no guarantees for your production data.
If you are considering using ArangoML Pipeline Cloud for a production setup, please reach out to firstname.lastname@example.org for sign-up and details.
How to get started
To show how easy it is to get started with ArangoML Pipeline Cloud in your existing ML pipeline, we have a notebook with a modified TensorFlow tutorial example, with no setup or signup required!
If you are already using ArangoML Pipeline and just want to see how to migrate to ArangoML Pipeline Cloud, we suggest taking a look at the minimal example notebook.
While these notebooks mostly focus on storing metadata, we also have a number of exciting notebooks with use cases that show how to further leverage and analyze metadata, including, for example, data shift analysis.
- Learn more by checking out our example notebook on Google Colab
- Check out the examples directory in our open source repository.
- Find a tutorial notebook here to get started with ArangoML Pipeline
- Learn more about using Arangopipe with common components of a machine learning stack like TensorFlow, Hyperopt, and PyTorch
- Learn more about ArangoML Pipeline: Visit the blog
- Join a webinar for a live demo of how ArangoML Pipeline Cloud works: Register here