Data visualization, as well as data analysis and data analytics, are all an integral part of the scientific process. Collectively, these technologies provide the means to gain insight into data of ever-increasing size and complexity. Over the past two decades, a substantial amount of visualization, analysis, and analytics R&D has focused on the challenges posed by increasing data size and complexity, as well as on the increasing complexity of a rapidly changing computational platform landscape. While some of this research focuses on solely on technologies, such as indexing and searching or novel analysis or visualization algorithms, other R&D projects focus on applying technological advances to specific application problems. Some of the most interesting and productive results occur when these two activities-R&D and application-are conducted in a collaborative fashion, where application needs drive R&D, and R&D results are immediately applicable to real-world problems.
This paper presents two new strategies that can be used to greatly improve the speed of connected component labeling algorithms. To assign a label to a new object, most connected component labeling algorithms use a scanning step that examines some of its neighbors. The first strategy exploits the dependencies among them to reduce the number of neighbors examined. When considering 8-connected components in a 2D image, this can reduce the number of neighbors examined from four to one in many cases. The second strategy uses an array to store the equivalence information among the labels. This replaces the pointer based rooted trees used to store the same equivalence information. It reduces the memory required and also produces consecutive final labels. Using an array instead of the pointer based rooted trees speeds up the connected component labeling algorithms by a factor of 5 ~ 100 in our tests on random binary images.