Multiatlas methods have been successful for brain segmentation, but their application to smaller anatomies remains relatively unexplored. We evaluate seven statistical and voting-based label fusion algorithms (and six additional variants) for segmenting the optic nerves, eye globes, and chiasm. For nonlocal simultaneous truth and performance level estimation (STAPLE), we compare different intensity similarity measures (including mean square difference, locally normalized cross-correlation, and a hybrid approach). Each algorithm is evaluated in terms of the Dice overlap and symmetric surface distance metrics. Finally, we evaluate refinement of label fusion results using a learning-based correction method for consistent bias correction and Markov random field regularization. The multiatlas labeling pipelines were evaluated on a cohort of 35 subjects including both healthy controls and patients. Across all three structures, nonlocal spatial STAPLE (NLSS) with a mixed weighting type provided the most consistent results; for the optic nerves, NLSS resulted in a median Dice similarity coefficient of 0.81, a mean surface distance of 0.41 mm, and a Hausdorff distance of 2.18 mm. Joint label fusion resulted in slightly superior median performance for the optic nerves (0.82, 0.39 mm, and 2.15 mm) but slightly worse performance on the globes. The fully automated multiatlas labeling approach provides robust segmentations of orbital structures on magnetic resonance imaging, even in patients for whom significant atrophy (optic nerve head drusen) or inflammation (multiple sclerosis) is present.
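
To make the evaluation concrete, the following minimal sketch illustrates a voting-based label fusion step and the Dice overlap metric on a toy example. It is not the pipeline evaluated here: plain majority voting stands in as the simplest voting-based fusion baseline (it is not STAPLE, NLSS, or joint label fusion), the atlas label maps are assumed to be already registered to the target, and the function names `majority_vote` and `dice` and the toy arrays are hypothetical.

```python
import numpy as np


def majority_vote(atlas_labels):
    """Fuse registered atlas label maps by per-voxel majority vote.

    atlas_labels: sequence of integer arrays of identical shape,
    one per atlas, already resampled into the target space (assumption).
    Ties are resolved in favor of the lowest label index.
    """
    stacked = np.stack(atlas_labels, axis=0)  # shape: (n_atlases, ...)
    n_labels = int(stacked.max()) + 1
    # votes[k] counts how many atlases assign label k at each voxel
    votes = np.stack(
        [(stacked == k).sum(axis=0) for k in range(n_labels)], axis=0
    )
    return votes.argmax(axis=0)


def dice(seg, truth, label=1):
    """Dice similarity coefficient 2|A n B| / (|A| + |B|) for one label."""
    a = np.asarray(seg) == label
    b = np.asarray(truth) == label
    denom = a.sum() + b.sum()
    if denom == 0:
        # Label absent from both maps: treat as perfect agreement
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom


# Toy 2D example: three "atlases" voting on a 2x3 target grid
atlases = [
    np.array([[0, 1, 1], [0, 1, 0]]),
    np.array([[0, 1, 0], [0, 1, 0]]),
    np.array([[1, 1, 1], [0, 0, 0]]),
]
truth = np.array([[0, 1, 1], [0, 0, 0]])

fused = majority_vote(atlases)
print(fused)
print(dice(fused, truth))
```

In this toy case the fused map labels three voxels as foreground, two of which agree with the reference, so the Dice value is 2·2 / (3 + 2) = 0.8. Statistical fusion methods such as STAPLE replace the equal-weight vote with estimated rater performance, which this sketch does not attempt.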