Candidate roadmap for the remainder of 2016, with some potential ideas for 2018+ below. For more details, please see the JIRAs.
Predictive models
...
- Multi-class SVM https://issues.apache.org/jira/browse/MADLIB-1037
- Mixed effects modeling https://issues.apache.org/jira/browse/MADLIB-987
- Gradient boosted machines
- Gaussian Mixture Model using Expectation Maximization (EM)
...
Graph
Shortest path https://issues.apache.org/jira/browse/MADLIB-992
Standard traversal (depth-first search, breadth-first search, topological sort)
One-mode projection (converting a bipartite user-item graph to a user-user or item-item graph)
Connected components
PageRank
Hierarchical graph cut
Betweenness centrality
Minimum spanning tree
Utilities
Path functions (phase 2) https://issues.apache.org/jira/browse/MADLIB-977
Prediction metrics https://issues.apache.org/jira/browse/MADLIB-907
Sessionization https://issues.apache.org/jira/browse/MADLIB-909
Pivoting https://issues.apache.org/jira/browse/MADLIB-908
Anonymization https://issues.apache.org/jira/browse/MADLIB-911
URI tools https://issues.apache.org/jira/browse/MADLIB-910
Stratified sampling https://issues.apache.org/jira/browse/MADLIB-986
Usability
Expand coverage for PivotalR
Expand coverage for PMML export
Interface improvement and consistency
Implement an interface using named parameters
Python API
Performance and scalability
Work around PostgreSQL 1 GB field size limit https://issues.apache.org/jira/browse/MADLIB-991
Platform
- algorithm https://issues.apache.org/jira/browse/MADLIB-410
- Hierarchical clustering
- k-NN improvements https://issues.apache.org/jira/browse/MADLIB-1061, https://issues.apache.org/jira/browse/MADLIB-1181
- Deep learning
Graph
- Personalized PageRank https://issues.apache.org/jira/browse/MADLIB-1084
- Betweenness centrality https://issues.apache.org/jira/browse/MADLIB-1121
- Graph cut https://issues.apache.org/jira/browse/MADLIB-1074
- Triangle counting https://issues.apache.org/jira/browse/MADLIB-1125
- Minimum spanning tree
- Eigenvector centrality https://issues.apache.org/jira/browse/MADLIB-1123
- APSP performance improvements https://issues.apache.org/jira/browse/MADLIB-1155
Utilities
- Balanced datasets https://issues.apache.org/jira/browse/MADLIB-1168
- Summary: add more statistics https://issues.apache.org/jira/browse/MADLIB-1167
- Anonymization https://issues.apache.org/jira/browse/MADLIB-911
- URI tools https://issues.apache.org/jira/browse/MADLIB-910
Usability
- Expand coverage for PivotalR
- Interface improvement and consistency for 2.0 release (does not need to be backward compatible) https://issues.apache.org/jira/browse/MADLIB-1157
- Python API
Performance and scalability
- Mini-batching https://issues.apache.org/jira/browse/MADLIB-1048, https://issues.apache.org/jira/browse/MADLIB-1037, https://issues.apache.org/jira/browse/MADLIB-1200
- Improve the run-time and memory usage of decision tree and random forest https://issues.apache.org/jira/browse/MADLIB-1057, https://issues.apache.org/jira/browse/MADLIB-976
Platforms and Frameworks
- PostgreSQL 10 support https://issues.apache.org/jira/browse/MADLIB-1185
- Support modern versions of gcc https://issues.apache.org/jira/browse/MADLIB-1025
- TensorFlow support, or another deep learning framework
- GPU support
...