Data independence is a core property of database management systems (DBMSs) that distinguishes them from file-based data management: the application must be presented with information in exactly the form it needs, without having to transform it in any way before processing and, in particular, independently of the format in which the data are stored within the database. For 2D digital images and other raster data such as 1D time series, 3D tomograms, 3D and 4D environmental sensor data, and high-dimensional simulation data, this means that the application is free to choose between a main memory representation suited to the target machine type at hand (e.g., to perform a convolution) and some other data format (e.g., to exploit MPEG hardware support). Previously, the concept of Multidimensional Discrete Data (MDD) was suggested to handle raster data of all kinds, and a specialized storage architecture was presented for the generic and efficient storage, manipulation, and retrieval of MDD. In this paper, we use this approach to show how strict separation of the logical and physical levels, together with a declarative query interface, leads to full data independence for MDD, as known from classical DBMS data types such as strings and numbers. At the same time, sufficient flexibility is preserved to support an arbitrary number of specialized formats in parallel. The application can specify that query results be delivered as pure, unencoded C/C++ main memory arrays or in any other format implemented in the DBMS that is capable of holding the data. In addition, owing to the enhanced semantics available in the database, storage format and database operations can be optimized according to various criteria such as data conversion overhead and transmission bandwidth. Data compression becomes an internal feature that is invisible to the application and tailorable to each client's actual needs. The benefits are illustrated through an application scenario.