This paper presents the application of the AgentMiner<SUP>TM</SUP> tool suite to improve the efficiency of detecting data anomalies in oil well log and production data sets, a task that has traditionally been done by hand or through database business rules. The data sets needed to be verified after cleansing and certification to ensure that the existing data certification process was effective. There was also a need to identify more complex relational data anomalies that simple business rules cannot address. Analysis techniques including statistical clustering, correlation, and 3-D data visualization were successfully used to identify potential complex data anomalies. A data pre-processing tool was also applied to automatically detect simple data errors such as missing, out-of-range, and null values. The pre-processing tools were also used to prepare the data sets for further statistical and visualization analyses. To enhance the discovery of data anomalies, two different data visualization tools were applied to the data clusters.
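The simple checks described above (missing, null, and out-of-range values) can be illustrated with a minimal Python sketch. This is not the AgentMiner pre-processing tool itself: the field names `depth_ft` and `porosity`, the range limits, and the function `find_simple_errors` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical valid ranges per field; real limits would come from domain experts.
VALID_RANGES = {
    "depth_ft": (0.0, 30000.0),
    "porosity": (0.0, 1.0),
}


def find_simple_errors(records, valid_ranges=VALID_RANGES):
    """Return a list of (row_index, field, problem) tuples for simple data errors."""
    problems = []
    for i, rec in enumerate(records):
        for field, (lo, hi) in valid_ranges.items():
            if field not in rec:
                # The field is absent from the record entirely.
                problems.append((i, field, "missing"))
                continue
            value = rec[field]
            if value is None or (isinstance(value, float) and math.isnan(value)):
                # The field is present but carries no usable value.
                problems.append((i, field, "null"))
            elif not (lo <= value <= hi):
                problems.append((i, field, "out_of_range"))
    return problems


# Made-up sample rows, not data from the study.
sample = [
    {"depth_ft": 5200.0, "porosity": 0.21},
    {"depth_ft": -15.0, "porosity": 0.18},
    {"depth_ft": 6100.0, "porosity": None},
    {"porosity": 0.25},
]
errors = find_simple_errors(sample)
```

Rows flagged this way could be corrected or excluded before clustering, while the more complex relational anomalies would still require the statistical and visualization analyses.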
Data mining and knowledge discovery in databases provide a means to analyze and discover new knowledge from large data sets. The growth of the Internet has made it easier for the average user to access and gather data. Many of the existing data mining tools, however, require users to have advanced knowledge. New graphical tools are needed to allow the average user to easily and quickly discover new patterns and trends from heterogeneous data. SAIC is developing an agent-based data mining tool called AgentMiner<SUP>TM</SUP> as part of an internal research project. AgentMiner<SUP>TM</SUP> will allow the user to perform advanced information retrieval and data mining to discover patterns and relationships across multiple distributed, heterogeneous data sources. The current system prototype uses an ontology to define the common concepts and data elements contained in the distributed data sources. AgentMiner<SUP>TM</SUP> can access data from relational databases, structured text, web pages, and open text sources. It is a Java-based application that contains a suite of graphical tools such as the Mission Manager, Graphical Ontology Builder (GOB), and Qualified English Interpreter (QEI). In addition, AgentMiner<SUP>TM</SUP> supports both 2-D and 3-D data visualization, including animation across a selected independent variable.
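The role of an ontology in relating data elements across heterogeneous sources can be sketched as a mapping from common concepts to source-specific field names. The sketch below is an assumption made for illustration only and does not reflect the actual AgentMiner ontology format or API; the source names (`log_db`, `production_csv`), the field names, and the helper `to_common` are hypothetical.

```python
# Hypothetical ontology: each common concept maps to the field name
# that carries it in each data source.
ONTOLOGY = {
    "well_id": {"log_db": "WELL_NO", "production_csv": "well"},
    "oil_rate": {"production_csv": "oil_bbl_day"},
}


def to_common(record, source, ontology=ONTOLOGY):
    """Translate a source-specific record into common-concept terms."""
    out = {}
    for concept, mapping in ontology.items():
        field = mapping.get(source)
        # Keep only concepts that this source defines and this record contains.
        if field is not None and field in record:
            out[concept] = record[field]
    return out


# Made-up records from two hypothetical sources.
log_row = to_common({"WELL_NO": "A-12", "GR": 85.0}, "log_db")
prod_row = to_common({"well": "A-12", "oil_bbl_day": 410.0}, "production_csv")
```

Once records from different sources share the same concept names, they can be joined on a common key such as `well_id` before further analysis.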