Adding a first-class primitive, abstraction, and process for writing and loading dynamic libraries can make it easier to extend the inner workings of Mesos. It makes it possible to have dynamically loadable resource allocators, isolators, containerizers, authenticators, and much more.
This could be a powerful feature: it gives us more extensible and flexible ways of setting up Mesos, isolates dependencies and complexity in external libraries, and eases experimentation with new features.
For example, imagine a loadable allocator that embeds a VM (Lua, Python, ...), which makes it possible to try out new allocator algorithms without forcing those dependencies into the project.

Definitions

Mesos version := Mesos releases.
Module API version := Bumped when the module management system changes.
Role := The purpose that a module fulfills. In a given Mesos implementation this is tied to a specific object type, e.g. “Allocator”, “Isolator”, “Authenticator”.

The API from the point of view of a module writer

 

To write a Mesos module, include the module API header file and the header that declares the module role type. Then place one macro call in your library source code to capture version information, add another macro call to declare the module, and follow it with a function body that returns a module instance. That’s all. Here is a minimal example that declares a module with the role “Calculator” and the name “example”.

 

#include <mesos/module.hpp> // module system API
#include <calculator.hpp> // module role type declaration is in here

MESOS_MODULE_LIBRARY() // declares the module library (and its versions)

class ExampleCalculator: public Calculator
{
public:
  // An example function that the module implements.
  virtual int compute(char a, long b)
  {
    return a + b;
  }
};

MESOS_MODULE(Calculator, example) // declares the module
{
  return new ExampleCalculator(); // creating and returning the module instance
}

Additional compatibility checks


By default, the above only works when the Mesos version of the module client and the Mesos version against which the library has been compiled are exactly the same. However, with the following extra declaration, you can enable backwards compatibility, controlled by a table inside Mesos that allows earlier library versions.

MESOS_IS_MODULE_COMPATIBLE(example)
{
  return true;
}

Here, instead of simply returning true, the module could also perform its own checks, and return false under certain conditions. In the latter case the module will not be admitted, no matter what the results of any other checks by Mesos indicate. The module’s own checks are open-ended. In particular, they may include queries about other loaded libraries and modules and their respective versions. We will later provide an API for that.
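
The admission rule described above can be sketched as follows. This is only one reading of the text, not the actual Mesos logic: the names CompatibilityHook and isAdmitted are hypothetical, and the two boolean parameters stand in for checks that Mesos would perform itself.

```cpp
#include <functional>

// Hypothetical stand-in for the function body that follows
// MESOS_IS_MODULE_COMPATIBLE. An empty hook means the module library
// did not make that declaration.
using CompatibilityHook = std::function<bool()>;

// exactVersionMatch: library was compiled against exactly this Mesos version.
// tableAllows: the compatibility table inside Mesos accepts the library version.
bool isAdmitted(
    bool exactVersionMatch,
    bool tableAllows,
    const CompatibilityHook& hook)
{
  if (!hook) {
    // No declaration: only the strict default applies.
    return exactVersionMatch;
  }

  // With a declaration, the table decides, but the module's own check
  // can still veto admission regardless of what Mesos found.
  return hook() && tableAllows;
}
```

Note that a hook returning false rejects the module even when the versions match exactly, which is the veto behaviour described in the paragraph above.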

The API from the point of view of a Mesos implementor


Only modules with pre-declared roles and binding sites can be loaded by Mesos. There is no means of dynamic discovery or binding of extra roles. To declare a loadable module, a Mesos developer needs to specify an abstract class with at least one virtual method. Here is an example that matches the module code above.

class Calculator {
public:
  Calculator() {}
  virtual ~Calculator() {}

  virtual int compute(char a, long b) = 0;
};

 

That’s all. To employ a specific instantiation of this module role/type, you can write something like this.

 

#include <module/module_manager.hpp>

Try<Calculator*> module = mesos::ModuleManager::loadModule<Calculator>("example");
if (module.isError())
{
  …
  // error handling
}

Calculator* calculator = module.get();

You can then use the module right away.

int n = calculator->compute('A', 10);
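
To illustrate how looking up a module by name and checking its role could fit together, here is a minimal self-contained sketch. It is not the Mesos ModuleManager: ModuleBase, Isolator, registry, registerModule and this version of loadModule are all hypothetical, and failure is signalled with a null pointer instead of a Try. The role check relies on dynamic_cast, which is why every role type must be polymorphic.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical common base that gives dynamic_cast a polymorphic type to inspect.
struct ModuleBase
{
  virtual ~ModuleBase() {}
};

// Two example roles; Isolator exists only to show a failed role check.
struct Calculator : ModuleBase
{
  virtual int compute(char a, long b) = 0;
};

struct Isolator : ModuleBase
{
  virtual void isolate() = 0;
};

struct ExampleCalculator : Calculator
{
  int compute(char a, long b) override { return static_cast<int>(a + b); }
};

// Hypothetical registry: module name -> factory function.
using Factory = std::function<ModuleBase*()>;

std::map<std::string, Factory>& registry()
{
  static std::map<std::string, Factory> factories;
  return factories;
}

void registerModule(const std::string& name, Factory factory)
{
  registry()[name] = factory;
}

// Returns the module as role T, or nullptr if the name is unknown or the
// instance does not implement T.
template <typename T>
T* loadModule(const std::string& name)
{
  auto it = registry().find(name);
  if (it == registry().end()) {
    return nullptr;
  }

  ModuleBase* instance = it->second();
  T* typed = dynamic_cast<T*>(instance);
  if (typed == nullptr) {
    delete instance; // wrong role: discard the instance
  }
  return typed;
}
```

After registering a factory under the name "example", loadModule<Calculator>("example") yields a usable instance, while loadModule<Isolator>("example") yields a null pointer because the role does not match.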

Module Libraries and Module Versioning

 

Before loading the above module, a dynamic library that contains the module needs to be loaded into Mesos. This happens early in Mesos startup code. The Mesos developer does not need to touch that code when introducing new module roles.
However, the developer is responsible for registering which versions of any given module are expected to remain compatible with Mesos as it progresses. This information is maintained in a table in src/module/module_manager.cpp. It contains an entry for every module role that Mesos recognizes, each with a corresponding Mesos release version number. The Mesos developer needs to set this number to the current Mesos version whenever compatibility between Mesos and modules compiled against it is broken. Because module implementations for older Mesos versions can still be written in the future, it may be impossible to tell whether compatibility has been broken; when in doubt, it is best to bump the required module version to the current Mesos version. But if one can be reasonably sure, assuming cooperative module developers, that a certain kind of module will continue to function across several Mesos versions, the table provides an easy way to specify this.

 

Mesos    Role version   Library   Is module loadable   Reason
0.18.0   0.18.0         0.18.0    YES
0.29.0   0.18.0         0.18.0    YES
0.29.0   0.18.0         0.21.0    YES
0.18.0   0.18.0         0.29.0    NO                   Library compiled against a newer Mesos release.
0.29.0   0.21.0         0.18.0    NO                   Module/Library older than the role version supported by Mesos.
0.29.0   0.29.0         0.18.0    NO                   Module/Library older than the role version supported by Mesos.

 

To summarize, for a module to load successfully, the following relationship must hold between the various versions:

Role version <= Library version <= Mesos version
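
The relationship above can be checked mechanically. The following is a minimal sketch, assuming versions are purely numeric, dot-separated strings such as "0.18.0"; the helper names parseVersion and isLoadable are hypothetical and not part of Mesos.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits "0.18.0" into {0, 18, 0}.
// Assumes purely numeric, dot-separated components.
std::vector<int> parseVersion(const std::string& version)
{
  std::vector<int> parts;
  std::istringstream stream(version);
  std::string part;
  while (std::getline(stream, part, '.')) {
    parts.push_back(std::stoi(part));
  }
  return parts;
}

// Implements: Role version <= Library version <= Mesos version.
bool isLoadable(
    const std::string& mesos,
    const std::string& role,
    const std::string& library)
{
  const std::vector<int> m = parseVersion(mesos);
  const std::vector<int> r = parseVersion(role);
  const std::vector<int> l = parseVersion(library);

  // std::vector compares lexicographically, component by component,
  // so 0.21.0 is correctly treated as older than 0.29.0.
  return r <= l && l <= m;
}
```

The arguments follow the column order of the table: for example, isLoadable("0.29.0", "0.18.0", "0.21.0") holds, while isLoadable("0.18.0", "0.18.0", "0.29.0") does not, because the library was compiled against a newer Mesos release.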

 

Assumptions, design decisions, design goals, and explanations

  • Modules can be used by all constituents of Mesos, in particular slaves as well as masters. Different sets of modules may or may not apply to either. Each Mesos version of a module client program defines its own finite set of module bindings, i.e. module injection points.
  • Modules come packed in dynamic libraries (dynlibs).
  • An installed module is a C++ object created by a call into a module implementation in a dynlib. So far there is no well-defined set of C++ features that work or don’t work across this boundary. However, we use a dynamic_cast with RTTI to verify that a module is compatible with its binding site in Mesos. At a minimum, this restricts module types to abstract classes with at least one virtual function. In case RTTI turns out not to be watertight, we will eventually have to enumerate the features that do work. This still seems less tedious than breaking our very simple API up into a much longer series of C-only constructs.
  • A compiler version and flag check may become necessary.
  • If really needed, an hourglass interface (http://cppcon2014.sched.org/event/e659d8c088904f7a1540524b196afbe9#.VBiSd0sjMeE, https://github.com/CppCon/CppCon2014/raw/master/Presentations/Hourglass%20Interfaces%20for%20C%2B%2B%20APIs%20-%20Stefanus%20Du%20Toit%20-%20CppCon%202014/Hourglass%20Interfaces%20for%20C%2B%2B%20APIs%20-%20Stefanus%20Du%20Toit%20-%20CppCon%202014.pdf) can be added later, with an extra Mesos Module System version.
  • Module installation has these phases: 1) Dynlib loading, 2) verification including dependency checking, 3) instantiation, 4) binding (assignment to an l-value that gets used somehow).
  • All modules are named in one command line flag, which gets parsed early. After all dynlibs have been loaded, all verification is run. Instantiation happens later, in various places in Mesos, wherever modules are involved. This is driven by other command line flags which then reference identifiers given by the module flag. At first we only need to support a very simple naming scheme where the module name is used directly and it is expected that there are never conflicting module names. Example:
    slave --modules="/root/path1:module1,path2:module2" --allocator="module1" --auth="module2"
  • There can be multiple modules in a given dynlib. This allows shared implementation and data elements and potential packaging convenience.
  • Each dynlib must indicate what version of the Mesos Module System (MMS) it is built for. A future MMS can then determine whether to use a given pre-existing dynlib or not. Conversely, an older Mesos/MMS can determine that a dynlib relies on a later MMS version. This also serves as a handshake between Mesos and any arbitrary dynlib to ensure it is dealing with a Mesos module dynlib at all.
  • Once it is thus established that Mesos is dealing with a proper module dynlib with a compatible version, the dynlib is trusted to behave cooperatively and non-maliciously.
  • Each dynlib indicates what minimum Mesos version it is compatible with.
  • Each module indicates its “role”, e.g. Isolator, Allocator, Authenticator.
  • There can be multiple modules for the same role in the same dynlib, especially isolator modules.
  • User code does not need to instantiate and bind all modules in a dynlib. It can cherry-pick.
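
As an illustration of the simple naming scheme in the --modules example above, the flag value could be split into (dynlib path, module name) pairs roughly as follows. This is a sketch only: parseModulesFlag is a hypothetical helper, the comma- and colon-separated format is inferred from the single example, and skipping malformed entries silently is a simplification.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parses "path1:module1,path2:module2" into {path, module name} pairs.
std::vector<std::pair<std::string, std::string>> parseModulesFlag(
    const std::string& flag)
{
  std::vector<std::pair<std::string, std::string>> result;
  std::istringstream stream(flag);
  std::string entry;

  while (std::getline(stream, entry, ',')) {
    // Split on the last colon so the module name is everything after it.
    const std::string::size_type colon = entry.rfind(':');
    if (colon == std::string::npos) {
      continue; // malformed entry; real code would report an error
    }
    result.emplace_back(entry.substr(0, colon), entry.substr(colon + 1));
  }

  return result;
}
```

For the example flag, this yields the pairs ("/root/path1", "module1") and ("path2", "module2"); other flags such as --allocator would then refer to the module names.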

Limitations and Simplifications

  • If you build a dynlib against a certain Mesos version, it will not be allowed to be used with older Mesos versions. So for simplicity we bake the respective Mesos version into the dynlibs.
  • For each role, a corresponding Mesos version is kept in a table. If the role changes in an incompatible way, its version must be bumped by the responsible Mesos developer to match the current Mesos version.
  • To use the module API, the module writer only needs a header file. Nothing needs to be linked in for this purpose. However, to implement the specific payload features of any given module, it may have to reference any number of other parts of Mesos, including having to link against those.
  • We consider corralling all Mesos features that are used by modules into a facade layer. Thus backward compatibility could be maintained with more ease, albeit at the cost of erecting the facade.

Things that we may want to support in the future and that should not be impeded by the first MMS implementation

  • Modules should be able to express and check interdependencies and mutual compatibility.
  • We do not check module versions beyond their type for now. In addition we may want to give them a version number that gets bumped even if the type remains the same, if the protocol for using the module changed.
  • Somehow organize automatic module deletion. So far it is up to client code in Mesos to explicitly call the destructor.
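
One possible direction for automatic module deletion is to hand the raw module pointer to a std::unique_ptr as soon as it is obtained. The sketch below is only an illustration under that assumption: createExampleModule and ownModule are hypothetical names, and the instance counter exists solely to make the effect observable.

```cpp
#include <memory>

struct Calculator
{
  virtual ~Calculator() {}
  virtual int compute(char a, long b) = 0;
};

// Counts live instances so that automatic deletion can be observed.
int liveInstances = 0;

struct ExampleCalculator : Calculator
{
  ExampleCalculator() { ++liveInstances; }
  ~ExampleCalculator() override { --liveInstances; }

  int compute(char a, long b) override { return static_cast<int>(a + b); }
};

// Hypothetical stand-in for the raw pointer that module loading returns.
Calculator* createExampleModule()
{
  return new ExampleCalculator();
}

// Takes ownership of the raw pointer; the destructor runs automatically
// when the returned owner goes out of scope.
std::unique_ptr<Calculator> ownModule(Calculator* raw)
{
  return std::unique_ptr<Calculator>(raw);
}
```

With this, client code no longer calls the destructor explicitly. Whether deleting a dynlib-created object from Mesos code is safe in all build configurations is a separate question that this sketch does not address.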