Disaggregated Data Centers (DCs) have emerged as a powerful architectural framework towards increasing resource utilization and system power efficiency, requiring, however, a networking infrastructure that can ensure low-latency and high-bandwidth connectivity between a high-number of interconnected nodes. This reality has been the driving force towards high-port count and low-latency optical switching platforms, with recent efforts concluding that the use of distributed control architectures as offered by Broadcast-and-Select (BS) layouts can lead to sub-μsec latencies. However, almost all high-port count optical switch designs proposed so far rely either on electronic buffering and associated SerDes circuitry for resolving contention or on buffer-less designs with packet drop and re-transmit procedures, unavoidably increasing latency or limiting throughput. In this article, we demonstrate a 256x256 optical switch architecture for disaggregated DCs that employs small-size optical delay line buffering in a distributed control scheme, exploiting FPGA-based header processing over a hybrid BS/Wavelength routing topology that is implemented by a 16x16 BS design and a 16x16 AWGR. Simulation-based performance analysis reveals that even the use of a 2- packet optical buffer can yield <620nsec latency with >85% throughput for up to 100% loads. The switch has been experimentally validated with 10Gb/s optical data packets using 1:16 optical splitting and a SOA-MZI wavelength converter (WC) along with fiber delay lines for the 2-packet buffer implementation at every BS outgoing port, followed by an additional SOA-MZI tunable WC and the 16x16 AWGR. Error-free performance in all different switch input/output combinations has been obtained with a power penalty of <2.5dB.