Distributed software-defined networking (SDN), which consists of multiple inter-connected network domains and each managed by one SDN controller, is an emerging networking architecture that offers balanced centralized/distributed control. Under such networking paradigm, resource management among various domains (e.g., optimal resource allocation) can be extremely challenging. This is because many tasks posted to the network require resources (e.g., CPU, memory, bandwidth, etc.) from different domains, where cross-domain resources are correlated, e.g., their feasibility depends on the existence of a reliable communication channel connecting them. To address this issue, we employ the reinforcement learning framework, targeting to automate this resource management and allocation process by proactive learning and interactions. Specifically, we model this issue as an MDP (Markov Decision Process) problem with different types of reward functions, where our objective is to minimize the average task completion time. Regarding this problem, we investigate the scenario where the resource status among controllers is fully synchronized. Under such scenario, the SDN controller has complete knowledge of the resource status of all domains, i.e., resource changes upon any policies are directly observable by controllers, for which Q-learning-based strategy is proposed to approach the optimal solution.