Metadata are data about data and can describe a wide range of information sources: journals, digital libraries, structured and semistructured documents, etc. Our approach mainly addresses the problem of discovering association rules in metadata repositories associated with semistructured documents. Extensions to heterogeneous documents and a possible application to unstructured documents are also taken into account. The metadata stored in metadata repositories are processed by translating them into a table, similar to the well-known basket of the association rule discovery problem. Slightly modified versions of the Apriori and AprioriAll algorithms are used to discover association rules among the values of metadata attributes. Experimental results over a selected collection of metadata stored in a repository are presented.
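As an illustration of the translation step and rule discovery described above, the following is a minimal sketch in Python. It assumes each metadata record is a flat mapping from attribute to value and encodes each pair as an "attribute=value" item, so that a record plays the role of a basket. It uses the standard textbook Apriori procedure, not the modified variant used in the paper, and it does not cover AprioriAll. The function names, the sample records, and the thresholds are all hypothetical.

```python
from itertools import combinations


def to_transactions(records):
    """Encode each metadata record (attribute -> value) as a set of 'attr=value' items."""
    return [frozenset(f"{attr}={val}" for attr, val in rec.items()) for rec in records]


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    def keep_frequent(candidates):
        supports = {c: support(c) for c in candidates}
        return {c: s for c, s in supports.items() if s >= min_support}

    current = keep_frequent({frozenset([i]) for t in transactions for i in t})
    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent


def rules(frequent, min_confidence):
    """Derive rules (lhs, rhs, support, confidence) from the frequent itemsets."""
    found = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), size):
                lhs = frozenset(lhs_items)
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, sup, confidence))
    return found


# Hypothetical metadata records, invented for illustration only.
records = [
    {"type": "article", "language": "en", "format": "xml"},
    {"type": "article", "language": "en", "format": "html"},
    {"type": "report", "language": "en", "format": "xml"},
    {"type": "article", "language": "en", "format": "xml"},
]

frequent = apriori(to_transactions(records), min_support=0.5)
for lhs, rhs, sup, conf in rules(frequent, min_confidence=0.9):
    print(sorted(lhs), "->", sorted(rhs), f"support={sup:.2f}", f"confidence={conf:.2f}")
```

Encoding items as "attribute=value" strings keeps values of different attributes distinct, which is one simple way to map an attribute table onto basket-style transactions; other encodings are possible.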
"Data mining in metadata repositories", Proc. SPIE 4730, Data Mining and Knowledge Discovery: Theory, Tools, and Technology IV, (12 March 2002); doi: 10.1117/12.460213; https://doi.org/10.1117/12.460213