The Viterbi algorithm (VA) for decoding convolutionally encoded data has historically been implemented on special-purpose digital electronic hardware. For codes of short to moderate constraint length (K = 3 to 9), a primary design goal is to maximize the decoded bit rate while minimizing circuit area. In recent years, a number of special-purpose architectures based on shuffle-exchange networks, cube-connected cycles, ring-based networks, systolic arrays, or programmable processors have been designed to implement the VA efficiently at these and longer constraint lengths. Over the same period, however, the performance-to-cost ratio of high-end general-purpose computing machines has been improving dramatically. Given the substantial investment of time and resources required to design and build an ASIC-based decoder for codes of long constraint length (K = 10 to 15), the feasibility of implementing the VA as a background process on a readily available general-purpose parallel processing machine deserves exploration. We consider the limitations and benefits of a Viterbi decoder for long constraint length codes implemented in software on such a machine.