Because different imaging sensors provide different signature cues for distinguishing targets from backgrounds, substantial effort has gone into merging the information from different sensors. Unfortunately, when the imagery from two different sensors is combined, the noise from each sensor is also combined in the resulting image. Additionally, attempts to enhance the distinctness of the target from the background also enhance the distinctness of false targets and clutter. Even so, there has been some progress in mimicking human vision capability through color contrast enhancement. What has not been tried is to mimic how the human vision system inherently performs this sensor fusion of our color cone sensors. We do our sensor fusion in the pre-attentive phase of human vision. This requires binocular stereo vision, because we have two eyes. In human vision, the images from each eye are split in half, and the halves are sent to opposite sides of the brain for massively parallel processing. We do not know exactly how this process works, but the result is a visualization of the world that is 3D in nature. This process automatically combines the color, texture, size, and shape of the objects that make up the two images that our eyes produce. It significantly reduces noise and clutter in our visualization of the world. In this pre-attentive phase of human vision, which takes just an instant to accomplish, our human vision process has performed an extremely efficient fusion of cone imagery. This sensor fusion process has produced a scene in which depth perception and surface contour cues are used to orient and distinguish objects in the scene before us. It is at this stage that we begin to attentively sort through the scene for objects or targets of interest. In many cases, however, the targets of interest have already been located because of their depth or surface contour cues.
Camouflaged targets that blend perfectly into complex backgrounds may be made to pop out because of their depth cues. In this paper we describe a new method, termed RGB stereo sensor fusion, that uses color coding of the separate pairs of sensor images, which are fused to produce wide-baseline stereo images that are displayed to observers for search and target acquisition. Performance enhancements for the technique are given, as well as the rationale for optimum color code selection. One important finding was that different colors (RGB) and different spatial frequencies are fused with different efficiencies by the binocular vision system.
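
As a rough illustration of the color-coding step only, the sketch below shows one possible reading of it: each sensor contributes a left and a right grayscale view, each sensor is assigned one RGB channel, and the coded views are assembled into a left-eye and a right-eye color image. The function names, the channel assignment, and the tiny sample images are hypothetical and are not taken from the paper; the actual color codes and display method may differ.

```python
# Hypothetical sketch: color-code two sensors' stereo views into an RGB stereo pair.
# Channel assignments and sample data are illustrative assumptions only.

CHANNELS = {"R": 0, "G": 1, "B": 2}


def color_code(images_by_channel):
    """Place each 2D grayscale image (values 0..1) into its assigned RGB channel."""
    first = next(iter(images_by_channel.values()))
    height, width = len(first), len(first[0])
    # Start from a black image; unassigned channels stay at 0.0.
    rgb = [[[0.0, 0.0, 0.0] for _ in range(width)] for _ in range(height)]
    for channel, image in images_by_channel.items():
        c = CHANNELS[channel]
        for y in range(height):
            for x in range(width):
                rgb[y][x][c] = image[y][x]
    return rgb


def rgb_stereo_pair(left_views, right_views):
    """Build the left-eye and right-eye color images from per-sensor views.

    Each argument maps a channel letter ("R", "G", or "B") to that sensor's
    grayscale view for the corresponding eye.
    """
    return color_code(left_views), color_code(right_views)


# Two assumed sensors, each with a left and right view of a 2x2 scene.
sensor_a_left = [[0.2, 0.8], [0.5, 0.1]]
sensor_a_right = [[0.8, 0.2], [0.1, 0.5]]
sensor_b_left = [[0.9, 0.3], [0.4, 0.7]]
sensor_b_right = [[0.3, 0.9], [0.7, 0.4]]

left_eye, right_eye = rgb_stereo_pair(
    {"R": sensor_a_left, "G": sensor_b_left},
    {"R": sensor_a_right, "G": sensor_b_right},
)
```

The two resulting images would then be presented separately to the observer's left and right eyes, leaving the fusion itself to the binocular vision system, as the text describes. Changing the dictionary keys is how a different color code would be tried in this sketch.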