Simultaneous localization and mapping (SLAM) is a problem in robotics that aims to model the environment and estimate the pose of a device within it at the same time. The resulting solutions are a core technology for emerging applications such as self-driving cars, automated guided vehicles (AGVs), and domestic robots. Inevitably, the performance of SLAM algorithms depends heavily on input signals from optical sensors such as cameras, laser rangefinders, and LiDAR. Loop closure, the function that detects previously visited locations to correct accumulated errors, is a crucial element of a SLAM system. Conventionally, geometric features are used to interpret scenes for similarity estimation. In scenarios where nearly identical scenes exist, however, feature-based approaches are often ineffective. Semantic objects and multi-frame comparison can therefore be integrated into the process to provide an additional level of environmental information. In this article, we first provide an overview of the SLAM system. We then propose a semantic object-assisted approach with temporal and spatial sequence comparison to improve similarity measurement in the SLAM process. By integrating recognized objects such as landmarks and signs, we can classify similar scenes more reliably and significantly improve building-scale indoor mapping results. The performance of systems adopting various optical technologies is also compared in this work.
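The idea of combining semantic object labels with multi-frame comparison for loop-closure candidate scoring can be illustrated with a minimal sketch. This is not the authors' implementation; all function names, the Jaccard-style label similarity, and the fixed threshold are illustrative assumptions.

```python
# Hypothetical sketch: score scene similarity for loop-closure candidates by
# combining detected semantic object labels with a multi-frame sequence check.
from collections import Counter


def object_similarity(objs_a, objs_b):
    """Jaccard-style similarity over multisets of detected object labels."""
    a, b = Counter(objs_a), Counter(objs_b)
    inter = sum((a & b).values())   # labels present in both frames
    union = sum((a | b).values())   # all labels across both frames
    return inter / union if union else 0.0


def sequence_similarity(seq_a, seq_b):
    """Average per-frame label similarity over aligned frame sequences,
    so a single coincidentally similar frame does not trigger a match."""
    n = min(len(seq_a), len(seq_b))
    if n == 0:
        return 0.0
    return sum(object_similarity(seq_a[i], seq_b[i]) for i in range(n)) / n


def is_loop_closure(query_seq, candidate_seq, threshold=0.6):
    """Flag a revisited place when multi-frame semantic similarity is high.
    The threshold value here is an arbitrary placeholder."""
    return sequence_similarity(query_seq, candidate_seq) >= threshold
```

For example, two sequences of frames observing the same door and exit sign score 1.0 and are flagged as a loop closure, while sequences with disjoint object sets score 0.0; this is how semantic labels help separate geometrically similar but semantically distinct corridors.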