In this paper, we describe a computationally efficient ASIC design that leads to a highly efficient power and area implementation of space-time block decoder compared to a direct implementation of the original algorithm. Our study analyzes alternative methods of evaluating as well as implementing the previously reported maximum likelihood algorithms (Tarokh et al. 1998) for a more favorable hardware design. In our previous study (Cavus et al. 2001), after defining some intermediate variables at the algorithm level, highly computationally efficient decoding approaches, namely sign and double-sign methods, are developed and their effectiveness are illustrated for 2x2, 8x3 and 8x4 systems using BPSK, QPSK, 8-PSK, or 16-QAM modulation. In this work, alternative architectures for the decoder implementation are investigated and an implementation having a low computation approach is proposed. The applied techniques at the higher algorithm and architectural levels lead to a substantial simplification of the hardware architecture and significantly reduced power consumption. The proposed architecture is being fabricated in TSMC 0.18 μ process.