Vertical itemset mining is an important frequent pattern mining problem with broad applications. It is challenging because a traditional horizontal algorithm may need to examine a combinatorially explosive number of possible item patterns in a dataset. Because high-dimensional datasets typically contain many columns and few rows, vertical itemset mining algorithms, which extract frequent itemsets by enumerating combinations of row ids (rowsets), are a good alternative to horizontal algorithms for mining such datasets. Because a rowset can be produced simply by adding a row id to one of its sub-rowsets, many bottom-up vertical itemset mining algorithms have been designed and presented in the literature. However, bottom-up vertical mining algorithms suffer from a major drawback: they generate and test rowsets starting from the small rowsets and proceed to larger ones, yet small rowsets cannot yield frequent itemsets because they contain fewer rows than the minimum support threshold. In this paper, we describe a new, efficient vertical top-down algorithm called VTD (Vertical Top Down) for mining frequent itemsets in high-dimensional datasets. Our top-down approach uses the minimum support threshold to prune rowsets from which no frequent itemset can be extracted. Several experiments on real bioinformatics datasets showed that VTD is orders of magnitude better than previous closed pattern mining algorithms. Our performance study showed that the algorithm substantially outperformed the best previous algorithms.
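To make the idea of top-down rowset enumeration with minimum support pruning concrete, the following is a minimal Python sketch. It is not the VTD algorithm itself, whose details are not given here. It is a naive illustration under stated assumptions: the function name `top_down_rowset_mining`, the toy data, and the strategy of removing one row id at a time are all hypothetical. It reports only itemsets that are the common items of some rowset (closed itemsets), and it takes exponential time in the number of rows.

```python
from typing import Dict, FrozenSet, List, Set


def top_down_rowset_mining(
    rows: List[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Naive top-down rowset enumeration (illustrative sketch, not VTD).

    Returns a mapping from each itemset found to its support (row count).
    """
    n = len(rows)

    # Vertical layout: item -> set of row ids that contain the item.
    tidsets: Dict[str, Set[int]] = {}
    for rid, row in enumerate(rows):
        for item in row:
            tidsets.setdefault(item, set()).add(rid)

    results: Dict[FrozenSet[str], int] = {}

    def visit(rowset: FrozenSet[int], next_removable: int) -> None:
        # Prune: a rowset with fewer rows than the threshold, and every
        # rowset below it, cannot support a frequent itemset.
        if len(rowset) < min_support:
            return
        # Items shared by every row in the current rowset.
        common = frozenset(
            item for item, tids in tidsets.items() if rowset <= tids
        )
        if common:
            # Support is the number of rows containing all common items.
            support = len(set.intersection(*(tidsets[i] for i in common)))
            results[common] = support
        # Move down by removing one row id; removing ids in increasing
        # order visits each sub-rowset exactly once.
        for rid in range(next_removable, n):
            visit(rowset - {rid}, rid + 1)

    visit(frozenset(range(n)), 0)
    return results


if __name__ == "__main__":
    toy_rows = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    for itemset, support in sorted(
        top_down_rowset_mining(toy_rows, min_support=2).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), support)
```

Starting from the full rowset means the search never descends below the minimum support size, which is the pruning benefit the abstract attributes to a top-down strategy. A practical algorithm would add further pruning to avoid revisiting rowsets that yield the same itemset.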
In this paper, we investigated the efficiency of using only the upper half of the Farsi numerical digit structure. In other words, half of the data (the upper half of the digit shapes) was used to recognize Farsi numerical digits. This method can be used for both offline and online recognition. Using half of the data improves processing speed and data transfer and, in this application, accuracy. A hidden Markov model (HMM) was used to classify online Farsi digits. Evaluation was performed on the TMU dataset, which contains more than 1200 samples of online handwritten Farsi digits. The proposed method yielded a higher recognition rate.
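As a rough illustration of restricting an online sample to the upper half of the digit shape, the following Python sketch keeps only the pen points that lie in the top half of the sample's bounding box. How the upper half is actually defined in the paper is not stated here, so the bounding-box midpoint, the function name `upper_half`, and the `y_grows_downward` flag are assumptions. The HMM classification step is not shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def upper_half(points: List[Point], y_grows_downward: bool = True) -> List[Point]:
    """Keep pen points in the upper half of the bounding box (assumed definition)."""
    if not points:
        return []
    ys = [y for _, y in points]
    # Horizontal line through the vertical midpoint of the bounding box.
    mid = (min(ys) + max(ys)) / 2.0
    if y_grows_downward:
        # Screen-style coordinates: smaller y is higher on the page.
        return [(x, y) for x, y in points if y <= mid]
    # Math-style coordinates: larger y is higher on the page.
    return [(x, y) for x, y in points if y >= mid]
```

The retained points would then be turned into whatever feature sequence the classifier expects.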