This paper proposes a novel fast and robust image segmentation method based on superpixels (FRISS). To make the algorithm both adaptive and efficient, we first compute the superpixels of the image with a modified SLIC. A modified SimHash code is then computed for each superpixel, and similar superpixels are grouped together according to a similarity measure derived from the Hamming distance between their SimHash codes. FRISS segments an image under a given similarity threshold, which makes it adaptive. Moreover, because the similarity is computed from the Hamming distance between SimHash codes, it is much faster to evaluate than other similarity measures. Experimental results show that FRISS is fast and efficient.
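The grouping step can be illustrated with a minimal sketch. It assumes each superpixel already has an integer hash code and that a list of neighbouring superpixel pairs is available. The abstract does not specify the modified SLIC or the modified SimHash, so the names `hamming_similarity` and `merge_superpixels`, the normalisation of the distance to [0, 1], and the union-find merging are illustrative assumptions rather than the authors' implementation.

```python
def hamming_similarity(code_a: int, code_b: int, bits: int = 64) -> float:
    """Similarity in [0, 1]: 1 minus the normalised Hamming distance."""
    # XOR leaves a 1 in every bit position where the two codes differ.
    distance = bin(code_a ^ code_b).count("1")
    return 1.0 - distance / bits


def merge_superpixels(codes, adjacency, threshold, bits=64):
    """Group adjacent superpixels whose hash similarity reaches the threshold.

    codes:     list of integer hash codes, one per superpixel (assumed given)
    adjacency: iterable of (a, b) index pairs of neighbouring superpixels
    Returns one region label per superpixel.
    """
    parent = list(range(len(codes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in adjacency:
        if hamming_similarity(codes[a], codes[b], bits) >= threshold:
            parent[find(a)] = find(b)

    return [find(i) for i in range(len(codes))]


# Three superpixels with 4-bit codes; only the first two are similar enough.
labels = merge_superpixels([0b1111, 0b1110, 0b0000], [(0, 1), (1, 2)],
                           threshold=0.7, bits=4)
```

Raising the threshold yields more, smaller regions, and lowering it merges more aggressively, which is one way to read the adaptivity described above. Each comparison costs only one XOR and a bit count.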
Fuzzy c-means clustering (FCM), especially with spatial constraints (FCM_S), is an effective algorithm for image segmentation. Its reliability stems not only from representing the fuzziness of each pixel's membership but also from exploiting spatial contextual information. However, these algorithms still have problems when processing noisy images: they are sensitive to parameters that must be tuned according to prior knowledge of the noise. In this paper, we propose a new FCM algorithm that combines gray-level constraints and spatial constraints, called the spatial and gray-level denoised fuzzy c-means (SGDFCM) algorithm. The new algorithm overcomes the parameter drawbacks mentioned above by considering the possibility that each pixel is noise, with the aim of improving robustness and retaining more detail. Furthermore, this possibility of noise can be calculated in advance, which makes the algorithm both effective and efficient.
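For orientation, the following is a minimal sketch of plain FCM on one-dimensional gray values. It is the baseline algorithm only, not SGDFCM: the abstract does not give the spatial, gray-level, or noise-possibility terms, so none of them appear here. The function names, the evenly spaced initial centers, and the fixed iteration count are assumptions made for illustration.

```python
def fcm_memberships(values, centers, m=2.0, eps=1e-9):
    """Standard FCM membership update: u_i = 1 / sum_j (d_i / d_j)^(2/(m-1))."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        # eps keeps distances strictly positive when x coincides with a center.
        dists = [abs(x - c) + eps for c in centers]
        rows.append([1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                     for d_i in dists])
    return rows


def fcm_1d(values, n_clusters=2, m=2.0, n_iter=50):
    """Plain fuzzy c-means on gray values; returns (centers, memberships)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / n_clusters
               for i in range(n_clusters)]
    for _ in range(n_iter):
        u = fcm_memberships(values, centers, m)
        for i in range(n_clusters):
            # Center update: mean of the values weighted by u^m.
            weights = [row[i] ** m for row in u]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, fcm_memberships(values, centers, m)


# Two well-separated intensity groups, dark and bright.
centers, u = fcm_1d([10, 12, 11, 200, 205, 198])
```

Each pixel receives a graded membership in every cluster instead of a hard label. Spatially constrained variants such as FCM_S add neighbourhood terms to this objective, and those terms are where the noise-dependent parameters mentioned above come in.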
Visual Question Answering (VQA) is one of the most popular research fields in machine learning; it aims to let a computer learn to answer natural language questions about images. In this paper, we propose a new method called hierarchical dynamic memory networks (HDMN), which takes both question attention and visual attention into consideration. It is inspired by the Co-Attention method, currently among the best-performing algorithms. Additionally, we replace the original unit with bi-directional LSTMs, which retain more information from the question and image, so that information from both past and future sentences can be used. We then rebuild the hierarchical architecture for both question attention and visual attention. Furthermore, we accelerate training with a technique called Batch Normalization, which helps the network converge more quickly than other algorithms. Experimental results show that our model improves on the state of the art on the large COCO-QA dataset.
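As a reminder of what Batch Normalization computes, here is a minimal sketch of the training-time forward pass on a small batch of feature vectors. It is a generic illustration, not the HDMN implementation: the function name `batch_norm`, the scalar `gamma` and `beta` (normally learned per feature), and the omission of running statistics are simplifying assumptions.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise each feature over the batch, then scale and shift.

    batch: list of equal-length feature vectors (one per example).
    gamma, beta: scale and shift, shown as scalars for brevity (assumption).
    """
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        # Biased (population) variance over the batch.
        var = sum((x - mean) ** 2 for x in column) / n
        scale = gamma / math.sqrt(var + eps)
        for i in range(n):
            out[i][j] = (batch[i][j] - mean) * scale + beta
    return out


# Two examples with features on very different scales.
normalised = batch_norm([[1.0, 10.0], [3.0, 30.0]])
```

After normalisation both features lie on a comparable scale with roughly zero mean over the batch. This is the property usually credited with making training less sensitive to initialisation and learning rate.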