Currently, MXNet operators and NDArray only support tensors with fewer than 4294967296 (2^32) elements. This is because the MXNet backend uses uint32_t by default for both array element indexing and value storage.
To support large tensors, we need a systematic change across the entire MXNet backend and Python front end. Specifically, the following tasks are needed at a minimum:
- Choose a data type that scales beyond 2^32 to index elements in the array
- Update the data value type to use the new data type as well
- Ideally, the data type is not fixed and can be adjusted across different platforms
- Run performance tests on various platforms to make sure there is no significant runtime or memory degradation
- Document the change.
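As a rough illustration of the first and third tasks, the index type could be selected at compile time. This is only a minimal sketch under stated assumptions, not the actual MXNet change: the macro `SKETCH_USE_INT64_INDEX`, the alias `sketch_index_t`, and the helper `FitsIndexType` are hypothetical names introduced here, and the choice of signed 64-bit and 32-bit integers is an assumption.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical build flag (not an MXNet option): 1 selects a 64-bit index
// type, 0 falls back to a 32-bit one for platforms that prefer it.
#ifndef SKETCH_USE_INT64_INDEX
#define SKETCH_USE_INT64_INDEX 1
#endif

#if SKETCH_USE_INT64_INDEX
using sketch_index_t = std::int64_t;  // assumed wide index type
#else
using sketch_index_t = std::int32_t;  // assumed narrow index type
#endif

// Hypothetical helper: reports whether every element of a tensor with
// `num_elements` elements can be addressed with sketch_index_t.
inline bool FitsIndexType(std::uint64_t num_elements) {
  const auto max_index = static_cast<std::uint64_t>(
      std::numeric_limits<sketch_index_t>::max());
  return num_elements <= max_index;
}
```

With the flag left at its default, `FitsIndexType(4294967296ULL)` (a tensor of 2^32 elements) holds. Rebuilding with `-DSKETCH_USE_INT64_INDEX=0` makes the same check fail, which is the kind of platform-specific adjustment the list above asks for.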
A JIRA epic has been created to track this project: