In knowledge graph reasoning, we observe a trend to analyze temporal data evolving over time. The additional temporal dimension is attached to facts in a knowledge base resulting in quadruples between entities such as (Nintendo, released, Super Mario, Sep-13-1985), where the relation between two entities is associated to a specific time interval or point in time. Multi-hop reasoning on inferred subgraphs connecting entities within a knowledge graph can be formulated as a reinforcement learning task where the agent sequentially performs inference upon the explored subgraph. The task in this work is to infer the predicate between a subject and an object entity, i.e., (subject, ?, object, time), being valid at a certain timestamp or time interval. Given query entities, our agent starts to gather temporal relevant information about the neighborhood of the subject and object. The encoding of information about the explored graph structures is referred to as fingerprints. Subsequently, we use the two fingerprints as input to a Q-Network. Our agent decides sequentially which relational type needs to be explored next expanding the local subgraphs of the query entities in order to find promising paths between them. The evaluation shows that the proposed method not only yields results being in line with state-of-the-art embedding algorithms for temporal Knowledge Graphs (tKG), but we also gain information about the relevant structures between subjects and objects.