As computing power becomes the core productive force of the digital economy era, the concept of Computing and Network Convergence (CNC), under which network and computing resources can be dynamically scheduled and allocated according to users' needs, has been proposed and has attracted wide attention. Based on the properties of each task, the network orchestration plane needs to flexibly deploy tasks to appropriate computing nodes and arrange paths to those nodes. This is an orchestration problem that involves both resource scheduling and path arrangement. Since CNC is relatively new, in this paper we first review existing research on and applications of CNC. Then, as a first attempt, we design a CNC orchestration method based on reinforcement learning (RL) that flexibly allocates and schedules computing and network resources, aiming at high profit and low latency. We define the optimization objective over multiple factors so that the orchestration strategy is optimized for overall performance across different aspects, such as cost, profit, latency, and system overload in our experiments. The experiments show that the proposed RL-based method achieves higher profit and lower latency than the greedy method, random selection, and the balanced-resource method. These results demonstrate that RL is suitable for CNC orchestration, and this paper sheds light on the application of RL to CNC orchestration.