Indoor positioning and navigation in areas where GPS data is unavailable is a challenging problem. Indoor positioning is crucial in applications such as augmented reality, autonomous driving, and drone navigation inside tunnels. In this paper, a tandem architecture of deep-network-based systems is developed to address this problem, for the first time to our knowledge. The architecture is trained on scene images obtained by scanning the desired area segments using photogrammetry. A CNN based on EfficientNet is trained as a scene classifier, followed by a MobileNet-based CNN trained as a regressor. The proposed system achieves fine precision for both the Cartesian position and the orientation quaternion of the camera.
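The data flow of such a tandem system can be illustrated with a minimal sketch. It is not the implementation used in this work: the networks are replaced by plain callables, and several details are assumptions made only for illustration, namely that the regressor receives the predicted scene label, that it returns seven values ordered as (x, y, z, qw, qx, qy, qz), and that the quaternion is normalized after regression. The names `Pose`, `tandem_localize`, `position_error`, and `rotation_error_deg` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Pose:
    scene_id: int                                   # output of the scene classifier
    position: Tuple[float, float, float]            # Cartesian x, y, z
    quaternion: Tuple[float, float, float, float]   # assumed order: w, x, y, z


def normalize_quaternion(q: Sequence[float]) -> Tuple[float, ...]:
    """Scale q to unit length so that it represents a valid rotation."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    return tuple(c / norm for c in q)


def tandem_localize(
    image: Sequence[float],
    classify_scene: Callable[[Sequence[float]], int],
    regress_pose: Callable[[Sequence[float], int], Sequence[float]],
) -> Pose:
    """Stage 1 classifies the scene; stage 2 regresses the camera pose."""
    scene_id = classify_scene(image)       # stand-in for the classifier CNN
    raw = regress_pose(image, scene_id)    # stand-in for the regressor CNN
    if len(raw) != 7:
        raise ValueError("expected 3 position and 4 quaternion values")
    position = (raw[0], raw[1], raw[2])
    quaternion = normalize_quaternion(raw[3:7])
    return Pose(scene_id, position, quaternion)


def position_error(p: Sequence[float], p_ref: Sequence[float]) -> float:
    """Euclidean distance between predicted and reference positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, p_ref)))


def rotation_error_deg(q: Sequence[float], q_ref: Sequence[float]) -> float:
    """Angle in degrees between two orientations given as quaternions."""
    q, q_ref = normalize_quaternion(q), normalize_quaternion(q_ref)
    # q and -q encode the same rotation, hence the absolute value.
    dot = min(1.0, abs(sum(a * b for a, b in zip(q, q_ref))))
    return math.degrees(2.0 * math.acos(dot))
```

In practice the two callables would wrap the trained classifier and regressor, and the two error functions would be evaluated against the photogrammetry-derived reference poses.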