When completely automated systems don't yield acceptable accuracy, many practical pattern recognition systems involve the human either at the beginning (pre-processing) or towards the end (handling rejects). We believe that it may be more useful to involve the human throughout the recognition process rather than just at the beginning or end. We describe a methodology of interactive visual recognition for human-centered low-throughput applications, Computer Assisted Visual InterActive Recognition (CAVIAR), and discuss the prospects of implementing CAVIAR over the Internet. The novelty of CAVIAR is image-based interaction through a domain-specific parameterized geometrical model, which reduces the semantic gap between humans and computers. The user may interact with the computer anytime that she considers its response unsatisfactory. The interaction improves the accuracy of the classification features by improving the fit of the computer-proposed model. The computer makes subsequent use of the parameters of the improved model to refine not only its own statistical model-fitting process, but also its internal classifier. The CAVIAR methodology was applied to implement a flower recognition system. The principal conclusions from the evaluation of the system include: 1) the average recognition time of the CAVIAR system is significantly shorter than that of the unaided human; 2) its accuracy is significantly higher than that of the unaided machine; 3) it can be initialized with as few as one training sample per class and still achieve high accuracy; and 4) it demonstrates a self-learning ability.
We have also implemented a Mobile CAVIAR system, where a pocket PC, as a client, connects to a server through wireless communication. The motivation behind a mobile platform for CAVIAR is to apply the methodology in a human-centered pervasive environment, where the user can seamlessly interact with the system for classifying field-data. Deploying CAVIAR to a networked mobile platform poses the challenge of classifying field images and programming under constraints of display size, network bandwidth, processor speed, and memory size. Editing of the computer-proposed model is performed on the handheld while statistical model fitting and classification take place on the server. The possibility that the user can easily take several photos of the object poses an interesting information fusion problem. The advantage of the Internet is that the patterns identified by different users can be pooled together to benefit all peer users. When users identify patterns with CAVIAR in a networked setting, they also collect training samples and provide opportunities for machine learning from their intervention. CAVIAR implemented over the Internet provides a perfect test bed for, and extends, the concept of Open Mind Initiative proposed by David Stork. Our experimental evaluation focuses on human time, machine and human accuracy, and machine learning. We devoted much effort to evaluating the use of our image-based user interface and on developing principles for the evaluation of interactive pattern recognition system. The Internet architecture and Mobile CAVIAR methodology have many applications. We are exploring in the directions of teledermatology, face recognition, and education.