Automated segmentation of vertebral bone from diagnostic computed tomography (CT) images has become an important part of the clinical workflow. There is an increasing need for computer-aided diagnosis applications for various spine disorders, including scoliosis, fracture detection, and even automated reporting. While model-based methods have been widely used, recent deep learning methods have shown great potential in this area. However, choosing the network configuration that yields the best segmentation performance is challenging. In this work, we explore the impact of different training and inference options in the U-Net architecture, including dimensionality (2D vs. 3D), activation function, batch normalization, kernel size, number of filters, patch size, and patch selection strategy. Twenty publicly available CT spine datasets from the SpineWeb repository were used in this study, divided into training and test sets. Training was repeated on these datasets for each configuration, and the best weights of each configuration were used for inference on the independent test dataset; the resulting test-set performances were then compared. 3D models performed consistently better than 2D approaches. Overlapped patch-based inference substantially improved accuracy, and the choice of training patch size was also crucial to model performance. Moreover, an effective balance of positive and negative training patches was found to be necessary. The best performance in our study was obtained using overlapped patch inference and training with ReLU activation and batch normalization in a 3D U-Net with a training patch size of 128×128×32, which yielded average values of precision = 97%, sensitivity = 96%, and F1 (Dice) = 96% on the test dataset.
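The evaluation metrics reported above (precision, sensitivity, and F1/Dice) can be computed directly from binary segmentation masks. The sketch below is an illustrative NumPy implementation, not the authors' evaluation code; the function name and the toy 4×4 masks are our own for demonstration.

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Precision, sensitivity (recall), and Dice (F1) for binary masks.

    pred, gt: arrays of the same shape; nonzero voxels count as foreground.
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()       # true positives
    fp = np.logical_and(pred, ~gt).sum()      # false positives
    fn = np.logical_and(~pred, gt).sum()      # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Dice coefficient equals the F1 score for binary segmentation
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return precision, sensitivity, dice

# Toy 2D example (the same code works on 3D volumes):
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt = np.array([[1, 1, 0, 0],
               [1, 0, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
p, s, d = segmentation_metrics(pred, gt)  # precision 0.75, sensitivity 1.0, Dice 6/7
```

Note that Dice and F1 are the same quantity for binary masks, which is why the abstract reports them as a single value.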