Implementing deep neural networks in safety-critical systems, in particular in the aeronautical domain, requires adequate specification paradigms that preserve the semantics of the trained model on the final hardware platform. We propose to extend the NNEF language to allow traceable distribution and parallelization optimizations of a trained model. We show how such a specification can be implemented in CUDA on an NVIDIA Xavier platform.