This paper proposes a design method for the pipeline and the bypassing unit (BPU) of our 32-bit RISC/DSP processor, MediaDsp3201 (MD32 for short). MD32 is implemented in 0.18 μm technology at 1.8 V with a 200 MHz working clock and can achieve 200 million multiply-accumulate (MAC) operations per second. It thoroughly merges the RISC architecture with DSP computation capability, implementing a fundamental RISC instruction set, extended DSP instructions, and single instruction multiple data (SIMD) instructions with various addressing modes in a unified, customized DSP pipeline architecture. We first describe the pipeline structure of MD32 and compare it with a typical RISC-style pipeline. We then study two bypassing schemes, the centralized and distributed BPU design strategies (CBPU and DBPU), in terms of their effectiveness in resolving pipeline data hazards. A bypassing circuit chain model is given for DBPU, in which register read is placed only at the ID pipeline stage. Because the processor's working clock is determined by the pipeline delay and the BPU forms a long serial path of combinational logic, the optimization of the serial priority-select circuit is also analyzed in detail. Finally, the performance improvement is analyzed.
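
As a rough illustration of what serial selection with priority means in a bypassing path, the following Python sketch models the behavior of a generic forwarding chain. It is not the MD32 circuit: the function name `select_operand`, the tuple layout `(valid, dest_reg, value)`, and the ordering of the forwarding sources are assumptions made only for this example. The chain is scanned from the youngest in-flight result to the oldest, and the register file value read at the ID stage is used only when no in-flight result matches.

```python
def select_operand(src_reg, regfile_value, forwards):
    """Behavioral model of a priority bypass chain (hypothetical, not MD32's RTL).

    src_reg       -- register number the instruction in ID wants to read
    regfile_value -- value read from the register file at the ID stage
    forwards      -- list of (valid, dest_reg, value) tuples, ordered from the
                     youngest in-flight result (highest priority) to the oldest
    """
    for valid, dest_reg, value in forwards:
        # The first valid match wins, which mirrors a serial priority selector.
        if valid and dest_reg == src_reg:
            return value
    # No pending result targets src_reg, so the register file value is current.
    return regfile_value


if __name__ == "__main__":
    # Two in-flight writes to r3: the younger one (111) must take priority.
    pending = [(True, 3, 111), (True, 3, 222), (False, 5, 333)]
    print(select_operand(3, 0, pending))
    # The entry for r5 is not valid, so the register file value is returned.
    print(select_operand(5, 9, pending))
```

In hardware, each loop iteration corresponds to one comparator and multiplexer level, so the length of the list hints at why such a chain can become a long combinational path that affects the clock period.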