Existing deep learning-based object removal methods produce plausible results in many cases. However, their results are unsatisfactory when the object to be removed is large, especially in facial images, because little information is available about the affected region. Most of these methods describe the object with a binary segmentation map, which conveys neither the face boundary nor the symmetric relations between facial semantic regions. To address this problem, we propose a two-stage GAN-based image-to-image translation method that exploits a semantic segmentation of the face instead of a binary segmentation map of the object. Specifically, in the first stage our model learns a complete facial segmentation map from the input image (a face image containing an unwanted object); in the second stage it translates the generated semantic segmentation map, combined with the input image, into a plausible face image without the object. We also employ a joint loss function consisting of a low-level loss, an adversarial loss, and a perceptual loss to produce semantically realistic facial images. Experimental results show that our method outperforms previous state-of-the-art methods both quantitatively and qualitatively.
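
As a rough sketch of the pipeline and objective described above, the two stages and the joint loss can be written as follows. The symbols are illustrative assumptions and do not come from the text: $x$ is the input face image with the unwanted object, $G_1$ and $G_2$ are the first- and second-stage generators, $\hat{S}$ is the predicted facial segmentation map, $\hat{y}$ is the output image, and the weights $\lambda$ are hypothetical hyperparameters. The abstract does not state how the three terms are weighted or which form each term takes, so a weighted sum is assumed here.

```latex
% Stage 1: predict a complete facial segmentation map from the occluded input.
\hat{S} = G_1(x)

% Stage 2: translate the input image together with the predicted map
% into a face image without the object.
\hat{y} = G_2(x, \hat{S})

% Joint loss (assumed weighted sum; the weights are hypothetical).
\mathcal{L}_{\text{joint}}
  = \lambda_{\text{low}}  \, \mathcal{L}_{\text{low}}
  + \lambda_{\text{adv}}  \, \mathcal{L}_{\text{adv}}
  + \lambda_{\text{perc}} \, \mathcal{L}_{\text{perc}}
```

Under this reading, $\mathcal{L}_{\text{low}}$ would typically be a pixel-wise distance between $\hat{y}$ and the ground-truth image, $\mathcal{L}_{\text{adv}}$ the GAN discriminator term, and $\mathcal{L}_{\text{perc}}$ a distance between deep features of $\hat{y}$ and the ground truth; these specific choices are assumptions, not details confirmed by the text.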