New generations of smart sensors must provide military and law-enforcement agencies with perceptual systems whose reliability approaches that of human vision. Traditional approaches cannot reliably separate an object from its background and clutter, a problem that human vision solves unambiguously. Vision is only one part of a system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding: the interpretation of visual information in terms of these knowledge models. Such mechanisms allow reliable recognition even when an object is occluded or cannot be recognized as a whole. Biologically inspired Network-Symbolic models convert image information into an "understandable" Network-Symbolic format, similar to relational knowledge models. The logic of visual scenes can be captured in Network-Symbolic models and used to disambiguate visual information. In Network-Symbolic systems, features, symbols, and predicates are equivalent. A linking mechanism binds these features/symbols into coherent structures, and an image can then be interpreted by higher-level knowledge structures. View-based recognition is a hard problem for traditional algorithms, which directly match a primary view of an object to a model. In Network-Symbolic models, the derived structure, not the primary view, is the subject of unambiguous recognition.
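As a rough, hypothetical illustration of the idea (not the paper's implementation), the sketch below treats detected features as symbols, binds them into a relational structure of predicate triples via a toy linking mechanism, and recognizes an object by matching that derived structure against a knowledge model. All names (`link_features`, `recognize`, the "face" model and its relations) are invented for this example. A partial structural match above a threshold still succeeds, mimicking recognition of an occluded object.

```python
# A "predicate" here is simply a (relation, subject, object) triple,
# reflecting the equivalence of features, symbols, and predicates.

def link_features(features, relations):
    """Bind detected features/symbols into a coherent relational structure."""
    return frozenset((r, a, b) for (r, a, b) in relations
                     if a in features and b in features)

def recognize(scene_structure, knowledge_models, threshold=0.6):
    """Match the derived structure (not the primary view) against models.

    A partial match above `threshold` still recognizes an occluded object.
    """
    best_label, best_score = None, 0.0
    for label, model in knowledge_models.items():
        overlap = len(scene_structure & model) / len(model)
        if overlap > best_score:
            best_label, best_score = label, overlap
    return best_label if best_score >= threshold else None

# Hypothetical knowledge model for a "face": relations among feature symbols.
face_model = frozenset({
    ("above", "eyes", "nose"),
    ("above", "nose", "mouth"),
    ("left_of", "left_eye", "right_eye"),
    ("inside", "eyes", "head"),
})

# A partially occluded scene: the mouth is hidden, so one predicate is missing.
features = {"eyes", "nose", "left_eye", "right_eye", "head"}
relations = [("above", "eyes", "nose"),
             ("left_of", "left_eye", "right_eye"),
             ("inside", "eyes", "head")]
scene = link_features(features, relations)

print(recognize(scene, {"face": face_model}))  # prints "face": 3 of 4 predicates match
```

Note that matching happens entirely at the level of the derived relational structure; the raw view never enters the comparison, which is the point the abstract makes about view-based recognition.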