With the development of computer technology, multimedia databases, and image databases in particular, have been widely applied in many fields, among which medical image management is one of the most important. Traditional DBMSs are good at managing formatted data. Images, however, are more complex than traditional data types in both structure and operations, which hinders their wide use in databases. As a result, most current systems provide only basic storage and retrieval operations and cannot meet advanced management requirements such as content queries. Content-based image retrieval (CBIR) has therefore drawn increasing research attention in the last few years.
In CBIR, most research effort has been devoted to static image retrieval, while little has been done on evolution queries over image sequences, which are of great significance in medical image management. The formation and evolution of each kind of disease exhibit specific patterns which, if they can be summarized, will help doctors, especially inexperienced ones, deepen their understanding of the disease and make better treatment plans. For example, when a new disease arises, doctors may want to use its evolution to query the most similar disease evolutions from the history records in the database. Doctors can also specify an abnormal image sequence manually as an example. They may even want to query the image sequences that exhibit the evolution of a certain disease.
Most existing systems aim to assist doctors in diagnosing a certain kind of disease, and they usually have drawbacks in semantic retrieval. Many are difficult for users to operate or offer insufficient functionality, and their support for evolution queries is weaker still. In this paper, we offer an evolution query solution based on medical image content.
We design a system to implement our ideas. It consists of four modules: data processing, feature management, model management, and image retrieval. In addition to querying formatted data and single medical images by content, the system supports evolution queries. Such a query searches for image sequences that evolve over time, which helps doctors find similar diagnostic records in the database as a reference.
Image data and the related formatted data are first preprocessed to eliminate erroneous and contradictory information and to fill in missing information. Feature extraction algorithms and an object rule model are then employed to recognize objects (disease symptoms such as lesions and hemorrhages). Finally, the system analyzes the semantics of each medical image with a static semantic model, for example to determine the kind and phase of the disease. On that basis, evolution queries at both the object and semantic levels can be provided. This paper mainly elaborates on evolution query methods for medical image databases.
Unlike querying a single image, querying disease evolution involves comparing two image sequences that evolve over time. Given a target sequence, the system must find a certain number of similar image sequences. Just as it is not easy to define the similarity of two single images, it is far more difficult to define the similarity of two disease evolutions: there are too many factors that can be compared between two evolutions, and in general even doctors cannot tell whether two evolutions are similar. In most cases, however, comparing one factor at a time is more meaningful in practice, because the similarity over multiple factors can be calculated by assigning a weight to each individual factor.
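The weighted combination of individual factors can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name combined_difference, the factor names, and the weight values are hypothetical, and the per-factor differences are taken as already computed (smaller meaning more similar).

```python
def combined_difference(per_factor_diff, weights):
    """Weighted sum of per-factor differences (smaller means more similar).

    Both arguments are dicts keyed by a hypothetical factor name.
    """
    missing = set(per_factor_diff) - set(weights)
    if missing:
        raise ValueError(f"no weight given for: {sorted(missing)}")
    return sum(weights[name] * diff for name, diff in per_factor_diff.items())


# Illustrative values only; real weights would be chosen per disease.
diffs = {"symptom_count": 2.0, "mean_size": 4.0}
weights = {"symptom_count": 0.5, "mean_size": 0.25}
print(combined_difference(diffs, weights))
```

Setting all weights but one to zero reduces this to the single-factor comparison discussed above.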
This paper therefore first discusses the factors that can be compared between two image sequences, that is, the selection of feature values. Each image can be represented by a feature vector composed of a number of feature values, and the evolution of an image sequence is the result of the evolution of its feature values. Feature selection depends on the specific application field, since doctors care about different feature values for different diseases, so we can only give some likely common feature values as examples. Typical feature values include the number of symptom types and the number, average location, and average size of the symptoms. By considering the evolution of each feature value individually, the comparison of two evolutions becomes, graphically, the comparison of two curves.
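As a rough sketch of this representation, the code below stores one feature vector per image and turns an image sequence into one curve per feature value. The class ImageFeatures, its field names, and the split of the average location into mean_x and mean_y are assumptions made for illustration; the actual feature set depends on the disease.

```python
from dataclasses import dataclass, fields


@dataclass
class ImageFeatures:
    """Hypothetical feature vector for one image in a sequence."""
    symptom_type_count: float  # number of symptom types
    symptom_count: float       # number of symptoms
    mean_x: float              # average symptom location (x), assumed 2D
    mean_y: float              # average symptom location (y), assumed 2D
    mean_size: float           # average symptom size


def feature_curves(sequence):
    """Map each feature name to its values over the time-ordered sequence."""
    return {
        f.name: [getattr(image, f.name) for image in sequence]
        for f in fields(ImageFeatures)
    }


# Illustrative values only.
sequence = [
    ImageFeatures(1, 2, 10.0, 12.0, 3.5),
    ImageFeatures(1, 3, 11.0, 12.5, 4.0),
    ImageFeatures(2, 5, 11.5, 13.0, 6.0),
]
curves = feature_curves(sequence)
print(curves["symptom_count"])
```

Each entry of curves is one of the curves that the comparison then works on.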
Because we want to compare the similarity of two evolutions rather than specific feature values, we have to normalize the curves to remove the influence of vertical offset. This paper proposes two normalization methods: normalizing the curve as a whole, or treating each part separately. The former removes only part of the vertical offset, so we use the latter, which breaks the whole curve into a sequence of independent line segments.
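The two normalization options might look like the sketch below. The exact formulas are not given here, so both functions are assumptions: whole-curve normalization is taken to mean subtracting the first value from every point, and per-segment normalization is taken to mean shifting each segment so that it starts at zero. The function names are hypothetical.

```python
def normalize_whole(curve):
    """Shift the whole curve so that it starts at zero (assumed definition)."""
    base = curve[0]
    return [value - base for value in curve]


def normalize_segments(curve):
    """Break the curve into independent segments, each shifted to start at zero.

    Returns one (start, end) pair per time period; start is always 0.0 and
    end is the change of the feature value over that period.
    """
    return [(0.0, float(b - a)) for a, b in zip(curve, curve[1:])]


curve = [3.0, 5.0, 4.0]
print(normalize_whole(curve))
print(normalize_segments(curve))
```

With the whole-curve variant, an offset that builds up in an early period still shifts every later point, whereas the per-segment variant resets the offset in every period.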
This paper then gives several methods to calculate the similarity of two curves after normalization. One option is to calculate the sum of the angles formed by the two lines in each time period, but this is somewhat complex to compute and is not very sensitive when the slopes become large. The alternative is to calculate the sum of feature value differences, which uses the sum of the line differences at each time point as a measure of evolution similarity. This method is equivalent to calculating the sum of the areas formed by the two lines in each time period. We use this second method as the basis of our calculation: the smaller the difference, the more similar the two curves. This method still has some drawbacks, however, and we refine it step by step.
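A minimal sketch of the sum of line differences is given below, assuming that both curves are sampled at the same time points and that each segment has been shifted to start at zero, so the line difference in a period is the gap between the two segment end points. The function names are hypothetical.

```python
def segment_deltas(curve):
    """Change of the feature value over each time period."""
    return [b - a for a, b in zip(curve, curve[1:])]


def line_difference_sum(curve_a, curve_b):
    """Sum over all periods of the gap between the two normalized segments.

    Smaller values mean more similar evolutions.
    """
    deltas_a = segment_deltas(curve_a)
    deltas_b = segment_deltas(curve_b)
    if len(deltas_a) != len(deltas_b):
        raise ValueError("curves must cover the same number of time periods")
    return sum(abs(da - db) for da, db in zip(deltas_a, deltas_b))


print(line_difference_sum([1.0, 2.0, 4.0], [5.0, 6.0, 6.0]))
```

Under these assumptions and a unit time step, the two segments in a period form a triangle whose area is half the end-point gap, so the area sum is proportional to this difference sum and both rank candidates in the same order.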
First, the line difference in a time period does not reflect the difference in orientation, because two lines can have the same line difference whether their orientations are the same or opposite. However, two lines that both go up or both go down look more similar than a pair in which one goes up and the other goes down. Our system therefore enlarges the line difference by multiplying it by a constant weight greater than one when the orientations are opposite.
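The orientation penalty can be sketched as follows. The weight value 2.0 and the function names are assumptions chosen for illustration; the only requirement taken from the text is that the weight is a constant greater than one.

```python
def segment_deltas(curve):
    """Change of the feature value over each time period."""
    return [b - a for a, b in zip(curve, curve[1:])]


def oriented_difference_sum(curve_a, curve_b, penalty=2.0):
    """Sum of line differences, enlarged where the orientations are opposite.

    penalty is a hypothetical constant weight and must be greater than one.
    """
    if penalty <= 1.0:
        raise ValueError("penalty must be greater than one")
    total = 0.0
    for da, db in zip(segment_deltas(curve_a), segment_deltas(curve_b)):
        diff = abs(da - db)
        # A negative product means one segment rises while the other falls.
        # A flat segment (delta of zero) is not treated as opposite.
        if da * db < 0:
            diff *= penalty
        total += diff
    return total


# Both pairs have a raw line difference of 2, but only the second
# pair has opposite orientations.
print(oriented_difference_sum([0.0, 1.0], [0.0, 3.0]))
print(oriented_difference_sum([0.0, -1.0], [0.0, 1.0]))
```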
Second, the sum of line differences treats each time period individually and thus does not take the time continuity of the evolution into account. In practice, two curves that are similar over consecutive periods look more similar than two curves that are similar only over scattered periods. We therefore use a variable weight instead of a constant one when calculating line differences. If two curves show continuous similarity, we decrease the weight progressively to reduce the line difference, which addresses this issue.
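One way such a variable weight could work is sketched below. The update rule is not specified here, so everything about it is an assumption: a period counts as "similar" when the two segments do not have opposite orientations, the weight shrinks geometrically by a hypothetical factor decay for each consecutive similar period, and it resets when the orientations are opposite, where the constant penalty applies instead.

```python
def segment_deltas(curve):
    """Change of the feature value over each time period."""
    return [b - a for a, b in zip(curve, curve[1:])]


def continuity_weighted_sum(curve_a, curve_b, penalty=2.0, decay=0.5):
    """Sum of line differences with a weight that shrinks along similar runs.

    penalty (> 1) and decay (between 0 and 1) are hypothetical parameters.
    """
    total = 0.0
    streak = 0  # number of consecutive similar periods seen so far
    for da, db in zip(segment_deltas(curve_a), segment_deltas(curve_b)):
        diff = abs(da - db)
        if da * db < 0:
            # Opposite orientations: apply the penalty and end the run.
            weight = penalty
            streak = 0
        else:
            # Same orientation: the longer the run, the smaller the weight.
            weight = decay ** streak
            streak += 1
        total += weight * diff
    return total


# Continuous similarity: weights 1, 0.5, 0.25 over three periods.
print(continuity_weighted_sum([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0]))
# Interrupted similarity: the middle period has opposite orientations.
print(continuity_weighted_sum([0.0, 1.0, 0.0, 1.0], [0.0, 2.0, 3.0, 5.0]))
```

In this sketch a long run of similar periods contributes less and less to the total, so curves with continuous similarity score lower than curves whose similar periods are scattered.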
We provide two query interfaces: query by example and query by concept. In query by example, users can specify an existing medical image sequence as the example, or describe an abnormal image sequence manually with tools that the system provides. Through personalized templates, we link certain image sequences with high-level concepts to support concept-based semantic queries.
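Query by example could be sketched as a ranking of stored sequences by their difference from the example. This is an illustrative assumption rather than the system's interface: the function query_by_example, the in-memory dictionary standing in for the database, the sequence identifiers, and the penalty value are all hypothetical, each sequence is reduced to a single feature curve, and all curves are assumed to cover the same time points.

```python
import heapq


def segment_deltas(curve):
    """Change of the feature value over each time period."""
    return [b - a for a, b in zip(curve, curve[1:])]


def curve_difference(curve_a, curve_b, penalty=2.0):
    """Sum of line differences, enlarged where orientations are opposite."""
    total = 0.0
    for da, db in zip(segment_deltas(curve_a), segment_deltas(curve_b)):
        diff = abs(da - db)
        if da * db < 0:
            diff *= penalty
        total += diff
    return total


def query_by_example(example, database, k=3):
    """Return the ids of the k stored curves most similar to the example."""
    scored = (
        (curve_difference(example, curve), seq_id)
        for seq_id, curve in database.items()
    )
    return [seq_id for _, seq_id in heapq.nsmallest(k, scored)]


# Hypothetical stored curves for one feature value.
database = {
    "case-a": [0.0, 1.0, 2.0],
    "case-b": [5.0, 6.0, 7.0],
    "case-c": [0.0, -1.0, -2.0],
}
print(query_by_example([10.0, 11.0, 12.0], database, k=2))
```

A linear scan like this is only practical for small collections, which is one reason an index structure is needed for fast evolution queries.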
Content-based evolution retrieval of medical images is a brand new research field. Although it is very important in practice, little effort has been devoted to it so far. The contributions of this paper are as follows. We first analyze the significance of evolution queries on medical images in applications. We then discuss the factors that can be compared between two evolutions, that is, the selection of feature values. Next, we propose two normalization methods to remove the influence of vertical offset. After that, we give several possible methods to calculate the similarity of two evolutions and elaborate on their advantages and drawbacks. Based on these efforts, we offer an improved method that addresses the issue of evolution comparison. In future work, we will try to build an efficient index structure to achieve fast evolution queries in medical image databases.