The multi-source data generated by distributed systems, provide a holistic description of the system. Harnessing the joint distribution of the different modalities by a learning model can be beneficial for critical applications for maintenance of the distributed systems. One such important task is the task of anomaly detection where we are interested in detecting the deviation of the current behaviour of the system from the theoretically expected. In this work, we utilize the joint representation from the distributed traces and system log data for the task of anomaly detection in distributed systems. We demonstrate that the joint utilization of traces and logs produced better results compared to the single modality anomaly detection methods. Furthermore, we formalize a learning task - next template prediction NTP, that is used as a generalization for anomaly detection for both logs and distributed trace. Finally, we demonstrate that this formalization allows for the learning of template embedding for both the traces and logs. The joint embeddings can be reused in other applications as good initialization for spans and logs.