A novel line-based streaming labeling algorithm and its VLSI architecture are proposed in this paper. A line-based
neighborhood examination scheme is used for efficient extraction of local connected components. A novel reversed rooted
tree hook-up strategy, which is well suited to hardware implementation, is applied in the merging stage of
equivalent connected components. This strategy significantly reduces the on-chip memory requirement, which in turn
reduces the chip area. Clock-domain-crossing FIFOs connect the label core to the external memory interface, which
allows the label engine to run at a higher frequency and raises its throughput.
Several performance tests have been performed on our proposed hardware
implementation. In all real-image tests, the processing bandwidth of our hardware architecture reaches the I/O transfer
limit set by the external interface clock. Besides reducing processing time, our
hardware implementation supports images as large as 4096×4096, which is appealing for remote
sensing and other high-resolution imaging applications. The proposed architecture is synthesized
with the SMIC 180 nm standard cell library. The operating frequency of the label engine reaches 200 MHz.
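
To make the merging stage concrete, the following is a minimal sketch of a conventional rooted-tree hook-up over a label equivalence table. It is not the reversed variant proposed here, whose details this abstract does not give. The names `make_table`, `find_root`, `hook_up`, and `flatten` are hypothetical, and always hooking the larger root under the smaller one is an assumption made for the illustration.

```python
def make_table(num_labels):
    """Equivalence table: parent[i] == i means label i is a root."""
    return list(range(num_labels))


def find_root(parent, label):
    """Follow parent links until the root of the tree is reached."""
    while parent[label] != label:
        label = parent[label]
    return label


def hook_up(parent, a, b):
    """Record that labels a and b are equivalent by joining their trees.

    Assumption for this sketch: the larger root is hooked under the
    smaller one, so parent[i] <= i holds for every label i.
    """
    root_a, root_b = find_root(parent, a), find_root(parent, b)
    if root_a == root_b:
        return root_a
    low, high = min(root_a, root_b), max(root_a, root_b)
    parent[high] = low
    return low


def flatten(parent):
    """Resolve every label to its root in one ascending pass.

    This works because parent[i] <= i, so parent[parent[i]] has
    already been resolved by the time label i is visited.
    """
    for label in range(len(parent)):
        parent[label] = parent[parent[label]]
    return parent


table = make_table(6)
hook_up(table, 1, 2)
hook_up(table, 3, 4)
hook_up(table, 2, 4)   # joins the two trees above
flatten(table)
```

After `flatten`, labels 1 to 4 all map to root 1, while labels 0 and 5 remain their own roots. The table is a flat array indexed by label, which is the kind of structure that maps naturally onto an on-chip memory.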

A novel labeling algorithm is proposed in this paper for connected components labeling of high-resolution images. The
proposed method solves the labeling problem when the available computer memory is smaller than
existing algorithms require. Unlike conventional algorithms, which must load the entire image into memory
for labeling, our algorithm completes the labeling task with a memory requirement of only two image rows.
The proposed algorithm is compared with the state-of-the-art algorithm in tests on high-resolution images, where it
achieves a significant performance improvement.
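
The two-row idea can be illustrated with a small streaming sketch. This is not the algorithm proposed here, only a simplified illustration under stated assumptions: it counts 4-connected foreground components instead of writing out a label image, it reads rows one at a time from any iterable, and the name `count_components` is hypothetical. Only the previous row of labels, the current row, and an equivalence table for the labels alive in those two rows are kept in memory.

```python
def count_components(rows):
    """Count 4-connected components of nonzero pixels, one row at a time."""
    parent = {}       # equivalence table for labels alive in the two rows
    prev = []         # resolved labels of the previous row (0 = background)
    next_label = 1
    finished = 0      # components that ended above the current row

    def find(label):
        while parent[label] != label:
            label = parent[label]
        return label

    for row in rows:
        cur = [0] * len(row)
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            left = cur[x - 1] if x > 0 else 0
            up = prev[x] if x < len(prev) else 0
            if left and up:
                # Both neighbors are labeled: merge their trees.
                root_l, root_u = find(left), find(up)
                root = min(root_l, root_u)
                parent[root_l] = parent[root_u] = root
                cur[x] = root
            elif left or up:
                cur[x] = left or up
            else:
                parent[next_label] = next_label
                cur[x] = next_label
                next_label += 1
        # Resolve the current row to root labels.
        cur = [find(label) if label else 0 for label in cur]
        alive = set(cur) - {0}
        # A component of the previous row that no current pixel touches is done.
        prev_roots = {find(label) for label in prev if label}
        finished += len(prev_roots - alive)
        # Keep table entries only for labels still present in the new row.
        parent = {label: label for label in alive}
        prev = cur

    return finished + len(set(prev) - {0})


image = [
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
print(count_components(iter(image)))
```

Because the equivalence table is rebuilt after every row, its size is bounded by the row width and does not depend on the image height. That bound is the property the two-row memory requirement relies on.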

As the demand for higher image quality and greater processing capability grows, obtaining higher data bandwidth
for on-chip processing is becoming an increasingly important issue. The DMA (Direct Memory Access) component, as the
key element in a stream-processing SoC (System on Chip), must be carefully studied and designed to satisfy the
high data bandwidth requirements of the processing units. In this paper, we introduce a scalable high-performance DMA
architecture for complex SoCs that satisfies demanding requirements for high sustained bandwidth and versatile functionality.
Several techniques and structures are proposed. An integrated verification environment is also built to fully verify
the design's functionality, and the performance improvement obtained with our architecture is analyzed. Finally,
post-simulation and tape-out results are provided. The whole implementation has been proven in silicon
to be functional and efficient.