Human activity detection and recognition capabilities have broad applications in civilian, military, and homeland security domains. However, monitoring human activities is a complicated and tedious task, especially when multiple persons perform activities in confined spaces that impose significant obstruction, occlusion, and observability uncertainty. These applications require fast and reliable tracking systems that can observe dynamic objects and draw inferences about them from multiple coherent video sequences. In compact surveillance systems, the use of a multi-camera monitoring system is imperative for tracking, inference, and recognition of a variety of group activities. With multi-camera systems, occlusion can be handled by finding and correlating correspondences across the views of multiple cameras observing the same target at once. In this paper, we demonstrate one such multi-person tracking system developed in a virtual environment. By example, we demonstrate an efficient and effective technique for multi-target tracking, discrimination, and activity recognition in confined spaces. The exemplary scenario considered in this study represents activity on a bus, where multiple passengers arrive, take seats, and leave while being monitored by four concurrently operating surveillance cameras. We present how processing tasks are shared among the cameras and which object features they jointly detect, track, and identify. Furthermore, we present the computational intelligence techniques used to process multi-camera images for recognition of objects of interest, as well as for annotation of observed individual and group activities via meta-data imagery fusion. The proposed multi-camera processing system is shown to track multiple targets efficiently and effectively under different degrees of social interaction, either with one another or with the objects involved in their activities.