Unlike most popular deep learning frameworks, DL4J was designed with Java principles and the JVM in mind. Its backends were once, well, all Java. Those days are long gone: ND4J now uses native backends for both CPU and CUDA.
Matrix operations in Deeplearning4j are powered by ND4J, a linear algebra library for n-dimensional arrays. ND4J can be described as “NumPy for the JVM”, with swappable backends supporting CPUs and GPUs. It is available on the most common operating systems, including Linux, macOS, and Windows on x86_64, as well as Linux on ppc64le (POWER8). Libnd4j, the native engine that powers ND4J, is written in C++. The CPU backend is implemented with OpenMP, using vectorizable loops with SIMD support, while the GPU backend is implemented with CUDA.
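To make the “NumPy for the JVM” comparison concrete, here is a minimal sketch of a matrix product in ND4J. It assumes an ND4J backend artifact (for example the CPU backend, `nd4j-native-platform`) is on the classpath. The class name `Nd4jSketch` and the helper `product()` are hypothetical names chosen for illustration only.

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// Hypothetical illustration class; not part of ND4J or DL4J.
public class Nd4jSketch {

    // Multiplies two small 2x2 matrices. The Java code is the same
    // regardless of which backend (CPU or CUDA) is on the classpath;
    // the work is delegated to the native engine.
    static INDArray product() {
        INDArray a = Nd4j.create(new double[][]{{1, 2}, {3, 4}});
        INDArray b = Nd4j.create(new double[][]{{5, 6}, {7, 8}});
        return a.mmul(b); // mathematically [[19, 22], [43, 50]]
    }

    public static void main(String[] args) {
        // Report which backend class ND4J loaded at startup.
        System.out.println("Backend: " + Nd4j.getBackend().getClass().getSimpleName());
        System.out.println(product());
    }
}
```

Under these assumptions, switching from CPU to GPU should be a matter of swapping the backend dependency in the build file rather than changing this code, which is what "swappable backends" is taken to mean here.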