Learning to recognize reusable software by induction
1 January 1990
Abstract
The goal of the Partial Metrics Project is the automatic acquisition of planning knowledge from target code modules in a program library. In the current prototype, the system takes a target code module written in Ada as input and produces a sequence of generalized transformations that can be used to design a class of related modules. This is accomplished by embedding techniques from Artificial Intelligence into the traditional structure of a compiler. The compiler performs compilation in reverse, starting with detailed code and producing an abstract description of it. The principal task facing the compiler is to find a decomposition of the target code into a collection of syntactic components that are nearly decomposable. Here, nearly decomposable means that each code segment must be nearly independent syntactically from the others. The most independent segments are then the target of the code generalization process. This process can be described as a form of chunking and is implemented here in terms of explanation-based learning. Chunking has been shown to be an important vehicle for learning in other application domains as well. Producing nearly decomposable code components becomes difficult when the target code module is not well structured. Users of the system must therefore be able to identify, from a library of modules, the well-structured modules that are suitable as input to the system. In this paper we describe the use of inductive learning techniques, namely variations on Quinlan’s ID3 system, that are capable of producing a decision tree that can be used to conceptually distinguish between well-structured and poorly structured code. To accomplish that task, a set of high-level concepts used by software engineers to characterize structurally understandable code was identified.
Next, each of these concepts was operationalized in terms of code complexity metrics that can be easily calculated during the compilation process. These metrics are related to various aspects of the program structure, including its coupling, cohesion, data structure, control structure, and documentation. Each candidate module was then described in terms of a collection of such metrics. Using a training set of positive and negative examples of well-structured modules, each described in terms of the chosen metrics, a decision tree was produced that was used to recognize other well-structured modules in terms of their metric properties. This approach was applied to modules from existing software libraries in a variety of domains such as vision and numerical methods. The results achieved by the system were then benchmarked against the performance of experienced programmers in recognizing well-structured code. In a test case involving 82 modules, the system was able to discriminate between poorly structured and well-structured code 99% of the time, compared with an 80% average for the 25 programmers sampled. The results suggest that such an inductive system can serve as a practical mechanism for effectively identifying reusable code modules in terms of their structural properties.
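The induction step described in the abstract can be illustrated with a minimal sketch of plain ID3 in Python. This is not the authors' implementation: the paper uses variations on ID3 over metrics computed from Ada modules, whereas the attribute names (`coupling`, `cohesion`, `documentation`), their discretized values, and the six training rows below are all hypothetical and chosen only to show how information gain selects the metric to split on.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the examples on `attr`."""
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

def id3(rows, labels, attrs):
    """Build a tree: a leaf is a label; an inner node is (attr, branches, default)."""
    if len(set(labels)) == 1:
        return labels[0]
    majority = Counter(labels).most_common(1)[0][0]
    if not attrs:
        return majority
    # Split on the attribute with the highest information gain.
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in set(row[best] for row in rows):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = id3([rows[i] for i in keep],
                              [labels[i] for i in keep], remaining)
    return (best, branches, majority)

def classify(tree, row):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, tuple):
        attr, branches, default = tree
        tree = branches.get(row[attr], default)
    return tree

# Hypothetical, hand-made training set (NOT data from the paper).
ATTRS = ["coupling", "cohesion", "documentation"]
TRAIN = [
    ({"coupling": "low",  "cohesion": "high", "documentation": "good"}, "well"),
    ({"coupling": "low",  "cohesion": "high", "documentation": "poor"}, "well"),
    ({"coupling": "high", "cohesion": "high", "documentation": "good"}, "poor"),
    ({"coupling": "high", "cohesion": "low",  "documentation": "poor"}, "poor"),
    ({"coupling": "low",  "cohesion": "low",  "documentation": "good"}, "poor"),
    ({"coupling": "low",  "cohesion": "low",  "documentation": "poor"}, "poor"),
]

tree = id3([r for r, _ in TRAIN], [l for _, l in TRAIN], ATTRS)
print(tree)
print(classify(tree, {"coupling": "low", "cohesion": "high", "documentation": "poor"}))
```

Real complexity metrics are numeric, so a practical variant would need to discretize them or choose split thresholds; the sketch sidesteps that by assuming the values are already binned.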
© (1990) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Juan Carlos Esteva, Robert G. Reynolds, "Learning to recognize reusable software by induction", Proc. SPIE 1293, Applications of Artificial Intelligence VIII, (1 January 1990); doi: 10.1117/12.21115; https://doi.org/10.1117/12.21115
PROCEEDINGS
17 PAGES

