Augmented Reality (AR) offers tremendous potential for a wide range of fields including entertainment, medicine, and engineering. AR allows digital models to be integrated with a real scene (typically viewed through a video camera) to provide useful information in a variety of contexts. The difficulty in authoring and modifying scenes is one of the biggest obstacles to widespread adoption of AR. 3D models must be created, textured, oriented and positioned to create the complex overlays viewed by a user. This often requires using multiple software packages in addition to performing model format conversions. In this paper, a new authoring tool is presented which uses a novel method to capture product assembly steps performed by a user with a depth+RGB camera. Through a combination of computer vision and imaging process techniques, each individual step is decomposed into objects and actions. The objects are matched to those in a predetermined geometry library and the actions turned into animated assembly steps. The subsequent instruction set is then generated with minimal user input. A proof of concept is presented to establish the method’s viability.