Possible Alternative Approaches

Writing a Java Inference API that directly calls the native code - This would mean designing and implementing a Java Inference API that interacts with the native code through the existing JNI/JNA layer. As with the other solution, the API would be designed to make Java inference simple and idiomatic. The existing JNI code could be shared by both the existing Scala API and the new Java Inference API. The biggest drawback of this approach is that it involves a significant amount of duplicate work that would be very difficult to maintain with current resources.

1. Advantages

    • No overhead from converting collections.
    • No surprises from interacting with the Scala code.
    • Would likely be the faster implementation, since benchmarks generally show that Java outperforms Scala.

2. Disadvantages

    • Duplication of effort between this and the Scala API (this means reimplementing executor, ndarray, module, etc., which is a significant effort).
    • Off-heap memory management would have to be reimplemented.
    • Added design effort to define the Java API.
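
To make the shape of this alternative more concrete, here is a minimal sketch of a Java wrapper that owns a native handle and releases it deterministically. Every name in it (InferenceSketch, NativeBackend, Predictor, FakeBackend, the model path string) is hypothetical and chosen only for illustration; none of them come from the existing code base. A real version would replace the stand-in backend with JNI `native` methods, which is omitted here so the sketch runs without a native library.

```java
import java.util.Arrays;

public class InferenceSketch {

    /** Hypothetical boundary to the native library; a real version would use JNI native methods. */
    interface NativeBackend {
        long loadModel(String modelPath);            // returns an opaque native handle
        float[] forward(long handle, float[] input); // runs one forward pass
        void free(long handle);                      // releases the off-heap resources
    }

    /** Java-facing wrapper: plain arrays in and out, release via try-with-resources. */
    static final class Predictor implements AutoCloseable {
        private final NativeBackend backend;
        private final long handle;
        private boolean closed;

        Predictor(NativeBackend backend, String modelPath) {
            this.backend = backend;
            this.handle = backend.loadModel(modelPath);
        }

        float[] predict(float[] input) {
            if (closed) {
                throw new IllegalStateException("Predictor is closed");
            }
            return backend.forward(handle, input);
        }

        @Override
        public void close() {
            // Free the native handle exactly once, even if close() is called twice.
            if (!closed) {
                backend.free(handle);
                closed = true;
            }
        }
    }

    /** Stand-in backend (an assumption for this sketch): doubles each input value. */
    static final class FakeBackend implements NativeBackend {
        int liveHandles = 0;
        private long nextHandle = 1;

        public long loadModel(String modelPath) {
            liveHandles++;
            return nextHandle++;
        }

        public float[] forward(long handle, float[] input) {
            float[] out = new float[input.length];
            for (int i = 0; i < input.length; i++) {
                out[i] = 2 * input[i];
            }
            return out;
        }

        public void free(long handle) {
            liveHandles--;
        }
    }

    public static void main(String[] args) {
        FakeBackend backend = new FakeBackend();
        try (Predictor predictor = new Predictor(backend, "model-path-placeholder")) {
            System.out.println(Arrays.toString(predictor.predict(new float[] {1f, 2f})));
        }
        System.out.println("live handles: " + backend.liveHandles);
    }
}
```

Even in this reduced form, the wrapper has to track handle lifetime itself, which is a small instance of the off-heap memory management that would need to be reimplemented under this approach.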

Adopt Java as the primary JVM language - This approach would mean a very significant effort to rewrite the entire Scala API in Java. Once that was done, we could begin adding support for other JVM languages using Java as a base, and the current Scala API could eventually be deprecated. Long term, it would be reasonable to expect improved performance across all JVM languages (since benchmarks generally show Java outperforming Scala), and it would likely be easier to add support for other JVM languages. However, any performance gain would likely be small in practice, because most of the workload is done in the C++ native code and not in the JVM.

1. Advantages

    • Likely to see better performance across all JVM languages.
    • Easier to add support for more JVM languages in the future.

2. Disadvantages

    • Tremendous amount of upfront effort.
    • The Scala API already exists and has been well received.
    • Scala is a popular data science language.
    • Apache Spark, a popular analytics engine, is a Scala-first project.