ArangoML Part 4: Detecting Covariate Shift in Datasets

00ArangoML, General, Graphs, Machine LearningTags: ,

This post is the fourth in a series of posts introducing ArangoML and showcasing its benefits to your machine learning pipelines. Until now, we have focused on ArangoML’s ability to capture metadata for your machine learning projects, but it does much more. 

In this post we:

  • Introduce the concept of covariate shift in datasets
  • Showcase the built-in dataset shift detection API
ArangoML Pipeline Complete pipeline - ArangoDB Machine Learning
More info

A story of a memory leak in GO: How to properly use time.After()

00GeneralTags: ,

Recently, we decided to investigate why our application ARANGOSYNC for synchronizing two ArangoDB clusters across data centers used up a lot of memory – around 2GB in certain cases. The environment contained ~1500 shards with 5000 GOroutines. Thanks to tools like pprof (to profile CPU and memory usage) it was very easy to identify the issue. The GO profiler showed us that memory was allocated in the function time.After() and it accumulated up to nearly 1GB. The memory was not released so it was clear that we had a memory leak. We will explain how memory leaks can occur using the time.After() function through three examples.

More info

A Deep And Fuzzy Dive Into Search

00General

Together with my team, I took a deep dive into the available fuzzy search approaches and algorithms for quite a while, in order to find a performant solution for the various projects ArangoSearch gets used for. 

Since the introduction of ArangoSearch back in 2018, many of our users have asked for fuzzy search support. We worked hard on getting this done, and the whole team is now excited to finally make fuzzy search available with the upcoming ArangoDB 3.7.

In the following, we will share our learnings and hope they are useful for you.

More info

ArangoDB is Open Source

We code transparently on Github.
Support & join us there
Become a Stargazer
close-link