...

[dmitriy setrakyan]: I am not sure I see the reason for removing the word classLoader from the serialization part and keeping it in the deserialization part. I also think that the method names should be symmetric. With that in mind, "encodeClassLoader" and "decodeClassLoader" may not be the best names, but they are consistent with each other and symmetric. My vote would be to keep the naming.

[romain.gilles]: I agree that we are not serializing the classLoader, but I think the new name is too generic. Here we focus on how to resolve the ClassLoader. I like the single responsibility principle, so I think the name should reflect the scoped responsibility, which is resolving the classLoader. DeserialisationHintsCodec does not express this point, in my opinion. ClassLoaderHintsCodec is a mix of the two proposals. And maybe, in order to keep the symmetry between the method names (I agree with Dmitriy on this point), we could name the methods encodeHints(...) and decodeHints(...).

ClassLoaderCodec

The ClassLoaderCodec should be called for every Object during serialization and deserialization and should be part of the IgniteConfiguration:

Code Block
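public interface ClassLoaderCodec {
    // Returns the encoded class loader hint for cls, or null if nothing needs to be written.
    @Nullable public Object encodeClassLoader(Class<?> cls, ClassLoader clsLdr) throws IgniteException;
    // Resolves the class loader for the class name fqn, using the encoded hint if one was written.
    public ClassLoader decodeClassLoader(String fqn, @Nullable Object encodedClsLdr) throws IgniteException;
}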

...


[raul.kripalani]: See my comment above.

ClassLoaderCodec Implementations

Ignite will come with two OSGi class loader codecs out of the box, pessimistic and optimistic, leaving users the opportunity to provide their own custom class loader codecs as well (potentially for non-OSGi environments).

In general in OSGi, the same package may be exported by multiple bundles, and therefore an FQN may not be sufficient to look up the correct class loader. In such cases, the codec implementation must employ a pessimistic approach and encode enough information (for example, the bundle symbolic name plus the bundle version) for the deserializer to be able to resolve the FQN to the correct class loader. Such an implementation will work for all use cases, but it introduces some overhead and increases the size of the serialized messages.

However, for applications that can enforce a one-to-one mapping of packages to bundles, a simplified (optimistic) approach can be used instead. With this approach, no encoding of the class loader is required (encodeClassLoader() returns null), and only the FQN is used to decode the class loader.
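As a rough illustration of the optimistic approach, here is a minimal sketch in plain Java. It is not the proposed implementation: the interface is redeclared locally without IgniteException so that it compiles without Ignite, and OptimisticClassLoaderCodec, registerPackage and the package-to-loader map are hypothetical names. In a real OSGi deployment the map would presumably be filled from bundle information.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local stand-in for the proposed ClassLoaderCodec (IgniteException omitted on purpose).
interface ClassLoaderCodec {
    Object encodeClassLoader(Class<?> cls, ClassLoader clsLdr);
    ClassLoader decodeClassLoader(String fqn, Object encodedClsLdr);
}

// Hypothetical optimistic codec: assumes every package is provided by exactly one loader.
final class OptimisticClassLoaderCodec implements ClassLoaderCodec {
    // Package name -> class loader providing it (assumed to be populated elsewhere).
    private final Map<String, ClassLoader> ldrByPkg = new ConcurrentHashMap<>();

    void registerPackage(String pkg, ClassLoader ldr) {
        ldrByPkg.put(pkg, ldr);
    }

    @Override public Object encodeClassLoader(Class<?> cls, ClassLoader clsLdr) {
        // Nothing is written: the FQN alone must be enough on the receiving side.
        return null;
    }

    @Override public ClassLoader decodeClassLoader(String fqn, Object encodedClsLdr) {
        // Derive the package name from the fully qualified class name.
        int dot = fqn.lastIndexOf('.');
        String pkg = dot < 0 ? "" : fqn.substring(0, dot);

        ClassLoader ldr = ldrByPkg.get(pkg);

        if (ldr == null)
            throw new IllegalStateException("No class loader registered for package: " + pkg);

        return ldr;
    }
}
```

The sketch fails fast when a package is unknown rather than falling back to some default loader; whether that is the right behaviour is a design decision the proposal leaves open.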

[raul.kripalani]: I don't like transmitting bundle symbolic names over the wire, as it couples the serialising party with the deserialising party, forcing both to contain the class inside the same bundle. As I said on the mailing list, making this assumption would be a short-sighted strategy, as users may be sharing caches across applications in multiple containers, where classes live in different bundles in different containers.

[romain.gilles]: I see the point for the cache use case, but caching is not the only use case. If I remember correctly, Ignite comes from the GridGain company and was at the beginning more a distributed computing solution than a distributed caching solution. Maybe now we can see it as a distributed cache, but it is still very interesting for distributed computing. In this use case, I don't see a distributed computing cluster having different implementations of the computation unit. For example, let's say you are using it to price deals stored in a partitioned cache. The bank would be quite disappointed to get different (inconsistent) pricing results across the cluster. Secondly, you don't want to export the computation logic unit because it is a private detail; therefore it will not be in an exported package. How can we solve this kind of use case?

I also don't think it's necessary. We just need the package name + package version. An OSGi container cannot expose the same package under the same version number twice, so the tuple (package name, package version) is enough to unambiguously locate the Bundle that exports our class.
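To make the (package name, package version) idea concrete, here is a small sketch of such a lookup key and a registry built on it. All names (PackageKey, PackageRegistry, register, lookup) are hypothetical, and a plain long bundle ID stands in for an OSGi Bundle so that the example runs without an OSGi framework. It takes the uniqueness of the tuple as an assumption and reports a conflict if two different bundles claim the same one.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value type for the (package name, package version) tuple.
final class PackageKey {
    final String name;
    final String version;

    PackageKey(String name, String version) {
        this.name = Objects.requireNonNull(name);
        this.version = Objects.requireNonNull(version);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof PackageKey))
            return false;

        PackageKey other = (PackageKey)o;

        return name.equals(other.name) && version.equals(other.version);
    }

    @Override public int hashCode() {
        return Objects.hash(name, version);
    }
}

// Hypothetical registry; a long bundle ID stands in for an OSGi Bundle.
final class PackageRegistry {
    private final Map<PackageKey, Long> bundleIdByPkg = new ConcurrentHashMap<>();

    // Returns false if the tuple is already claimed by a different bundle.
    boolean register(String pkg, String ver, long bundleId) {
        Long prev = bundleIdByPkg.putIfAbsent(new PackageKey(pkg, ver), bundleId);

        return prev == null || prev == bundleId;
    }

    // Returns the bundle ID for the tuple, or null if none is registered.
    Long lookup(String pkg, String ver) {
        return bundleIdByPkg.get(new PackageKey(pkg, ver));
    }
}
```

With this shape, only the package version would need to travel with the serialized data in addition to the FQN, since the package name can be derived from the FQN.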

Now, what we need to do is determine HOW we locate the Bundle. I have two ideas in mind:

  1. Create a custom OSGi Manifest header Ignite-Export-Package that lists the packages to be made available to Ignite for deserialisation purposes. Our Activator would register a BundleTracker that introspects Bundle installations and maintains a Map between (package name, package version) => Bundle, of only those bundles where the user has expressly indicated that there are packages to be made available to Ignite.
    [romain.gilles]: How do you plan to introduce the headers into the bundle? It will make Ignite development for OSGi slightly different from the non-OSGi implementation. I think my point above should also be taken into account here.
  2. Avoid the header and use logic like the bundle:find-class command in Karaf: https://github.com/apache/karaf/blob/master/bundle/core/src/main/java/org/apache/karaf/bundle/command/FindClass.java. This logic queries all bundles in the container to locate the package/class. We would build an in-memory cache (ConcurrentHashMap) to avoid performing this lookup more than once for the same package. We also need a BundleTracker to clear the package cache for bundles that are uninstalled.
    [romain.gilles]: I was thinking about something like that in the optimistic mode. But if we have two (or maybe more) versions of the same bundle, we will have trouble finding the right one (wink), unless we assume that the cluster is homogeneous and every bundle / symbolic name is installed exactly once. Does it make sense? If we want to replace the String by an identifier, let's say an int or a long, we will have to make it consistent across the cluster. What is your current proposal for that?
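The caching part of the second idea could look roughly like the sketch below. It is only an illustration under stated assumptions: PackageLookupCache, find and onBundleUninstalled are hypothetical names, the expensive scan over all bundles is abstracted as a Function passed in by the caller, and a long bundle ID stands in for an OSGi Bundle. In a real implementation, onBundleUninstalled would presumably be driven by a BundleTracker callback.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical lookup cache; a long bundle ID stands in for an OSGi Bundle.
final class PackageLookupCache {
    // Package name -> ID of the bundle found to provide it.
    private final Map<String, Long> bundleIdByPkg = new ConcurrentHashMap<>();

    // Expensive scan over all bundles; returns null when no bundle provides the package.
    private final Function<String, Long> scanner;

    PackageLookupCache(Function<String, Long> scanner) {
        this.scanner = scanner;
    }

    // Scans only on a cache miss. Unsuccessful scans are not cached,
    // so a package that appears later can still be found.
    Long find(String pkg) {
        return bundleIdByPkg.computeIfAbsent(pkg, scanner);
    }

    // Drops every cached package that was resolved to the uninstalled bundle.
    void onBundleUninstalled(long bundleId) {
        bundleIdByPkg.values().removeIf(id -> id == bundleId);
    }
}
```

This sketch keys the cache by package name only, so it does not address the concern about several installed versions of the same bundle; keying by package name plus version would be one way to extend it.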

With either of these approaches, I think we don't need pessimistic and/or optimistic strategies. Just a single strategy would be enough.

[romain.gilles]: Finally, I think we may need a way to be aware of the start and the end of the serialization / deserialization of an object graph in order to handle some optimizations. Or at least provide a way in the method signature to manage the mapping.


Here's what the pessimistic codec implementation might look like (in pseudo-code):

...