This paper presents an action event prediction method based on masked modeling for bidirectional time-series analysis in soccer. Since the optimal action event depends on how the match situation changes, it is important to consider time-series changes in both directions. To predict the next action event while taking the bidirectional time series into account, the proposed method learns the context of action event sequences by predicting masked action events from the preceding and following events. Our method achieves higher prediction accuracy than a unidirectional method, which shows the effectiveness of taking the bidirectional time series into account in soccer.
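The masking step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `[MASK]` token, the masking probability, the event names, and the idea of appending a mask after the history to query the next event are all assumptions made for the example.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, not taken from the paper


def mask_events(events, mask_prob=0.15, seed=0):
    """Hide some events; return the masked sequence and a {position: original} map."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, ev in enumerate(events):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = ev  # the model would be trained to recover this event
        else:
            masked.append(ev)
    return masked, targets


def next_event_query(history):
    """Assumed framing: next-event prediction as filling a mask placed after the history."""
    return list(history) + [MASK]


# Toy action event sequence (event names are illustrative only).
events = ["pass", "dribble", "pass", "cross", "shot"]
masked, targets = mask_events(events, mask_prob=0.4, seed=1)
query = next_event_query(events[:4])
```

A model trained on `(masked, targets)` pairs sees events on both sides of each hidden position, which is what distinguishes this setup from a unidirectional, left-to-right predictor.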
This paper presents a novel Transformer-based customer interest estimation method using videos from a security camera in a real store. Expectations for the application of Artificial Intelligence (AI) technology in various industrial fields have increased. In this study, we focus on the retail industry rather than on the manufacturing industry, where AI technology has already been widely introduced. In the retail industry, understanding customer interest in products is a significant issue, and Point-of-Sales (POS) data have been used to analyze customers. However, POS data contain no information about customers before a purchase, which is an inherent limitation of this type of data. To address this problem, we propose a new customer interest estimation method using visual information obtained from a security camera. The proposed method consists of three phases: a re-identification phase, a 3D posture estimation phase, and an interest estimation phase. The advantage of our architecture is that the re-identification phase enables individual identification of multiple persons, and the 3D posture estimation phase obtains highly expressive posture information. The interest estimation phase then estimates the interest of each individual. Finally, we achieve high customer interest estimation performance on data obtained from a real store.
KEYWORDS: Head, Artificial intelligence, Information technology, Information science, Visualization, Vector spaces, Fluctuations and noise, Data centers, Compact discs
This paper presents a novel artist recommendation method based on a knowledge graph and reinforcement learning. In the field of music services, subscription-based online platforms are becoming mainstream, and recommendation technology needs to be updated accordingly. In this field, it is desirable to achieve user-centered recommendation that satisfies various user preferences, rather than recommendation biased toward popular songs and artists. Our method realizes highly accurate and explainable artist recommendation by exploring a knowledge graph constructed from users’ listening histories and artist metadata. We have confirmed the effectiveness of our method by comparing it with an existing state-of-the-art method.
This paper presents cross-domain recommendation based on multilayer graph analysis using subgraph representation. The proposed method constructs two graphs, one in the source domain and one in the target domain, from user-item embeddings, and learns the link relationships between users’ embedding features on each graph via graph convolutional networks that take subgraph representation into account. The proposed method can thus obtain features with high representation ability, which is the main contribution of this paper. It then estimates users’ embedding features in the target domain from those in the source domain and recommends items to users by using the estimated features. Experiments on real-world e-commerce datasets verify the effectiveness of the proposed method.
This paper proposes a customer interest estimation method using a security camera to meet the demands of the retail industry. In the retail industry, understanding customers’ interests in a real store is considered useful for various marketing activities such as product development and store layout. It is therefore important to pay attention to customers’ behavior in the real store. Their behavior is often recorded by cameras installed in the store for security purposes. This paper presents a method for estimating customers’ interests from security camera videos. The novelty of our method is three-fold. Firstly, the experimental data of subjects in our group were captured with a security camera already installed in a real store. Secondly, we used a pre-trained posture estimation model and treated its results as features for training a two-layer neural network model. Finally, a professional annotated the subjects’ interests. The effectiveness of our method was confirmed by comparison with benchmark supervised machine learning models.
This paper presents a method for action detection based on Temporal Cycle Consistency (TCC) learning. The proposed method detects action segments of flexible length based on a frame-level action prediction technique. We compute similarities between spatio-temporal features based on TCC to detect target actions in input videos. Finally, our method determines temporal segments by smoothing the frame-level action detection results. Experimental results show the validity of the proposed method.
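The final step, turning frame-level detections into temporal segments by smoothing, can be illustrated with a small sketch. The abstract does not say which smoothing is used, so the moving-average filter, the window size, and the 0.5 threshold below are assumptions, and the frame scores are toy values rather than TCC outputs.

```python
def smooth(scores, window=3):
    """Moving-average smoothing of per-frame scores (window assumed odd)."""
    half = window // 2
    out = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out


def segments(scores, threshold=0.5):
    """Return (start, end) frame ranges, end exclusive, where score > threshold."""
    segs, start = [], None
    for i, s in enumerate(scores):
        if s > threshold and start is None:
            start = i
        elif s <= threshold and start is not None:
            segs.append((start, i))
            start = None
    if start is not None:
        segs.append((start, len(scores)))
    return segs


# Toy frame-level detections: 1 = target action detected in that frame.
frame_scores = [0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
raw = segments(frame_scores)
detected = segments(smooth(frame_scores, window=3))
```

In this toy input, smoothing bridges the one-frame gap at frame 3 and suppresses the isolated detection at frame 10, so three fragmented raw segments become a single segment covering frames 1 to 6.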
This paper presents a new interior coordination image retrieval method using object-detection-based and color features. Interior coordination requires simultaneous consideration of objects’ positional information and the overall atmosphere of the room. However, similar image retrieval methods that consider these coordination characteristics have not been proposed. In the proposed method, we extract different types of features from interior coordination images and retrieve similar interior coordination images based on our newly derived features.
In this paper, we propose a method to classify metastatic bone tumors using treatment-planning computed tomography images. The proposed method utilizes pre-trained deep convolutional neural network (DCNN) models as feature extractors and enables the metastatic bone tumor classification by using the obtained features. Performance of several state-of-the-art DCNN-based features was compared and evaluated in our experiment.
This paper presents a novel method for estimating field positions in soccer videos using Convolutional Neural Network (CNN)-based image features. CNN-based features are well known to be effective for machine learning tasks. Therefore, the proposed method adopts CNN-based image features. By using these image features, the proposed method estimates soccer field positions more accurately than handcrafted features, i.e., Hough transform-based features. We show the effectiveness of our method via experimental results using actual soccer videos.
This paper presents a method for classifying tourism categories based on heterogeneous features that considers the existence of reliable results. The proposed method estimates whether reliable results exist, based on a one-versus-one scheme, among three kinds of classification results obtained separately from tourism images, geotags, and textual tags. If a reliable result is included in these results, it is regarded as the final result. Otherwise, the final result is obtained by multiple annotator logistic regression. The proposed method realizes accurate classification by estimating the existence of reliable results from more than two kinds of results.
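The decision rule above (use a reliable result if one exists, otherwise fall back to a fused result) can be sketched as follows. This only illustrates the control flow under stated assumptions: the reliability score and its threshold are hypothetical stand-ins for the paper's one-versus-one reliability estimation, and the majority-vote fallback is a placeholder, not multiple annotator logistic regression.

```python
from collections import Counter


def majority_vote(labels):
    """Placeholder fallback; the paper uses multiple annotator logistic regression."""
    return Counter(labels).most_common(1)[0][0]


def classify(results, reliability_threshold=0.9, fallback=majority_vote):
    """results maps modality -> (label, reliability score in [0, 1]).

    The score and threshold are hypothetical; they stand in for the paper's
    estimate of whether a modality's result is reliable.
    """
    reliable = [(score, label) for label, score in results.values()
                if score >= reliability_threshold]
    if reliable:
        # A reliable result exists: take the most reliable one as final.
        return max(reliable)[1]
    # No reliable result: fuse all modality results instead.
    return fallback([label for label, _ in results.values()])


# Toy results for the three modalities named in the abstract.
with_reliable = {"image": ("temple", 0.95), "geotag": ("park", 0.6), "text": ("park", 0.7)}
without_reliable = {"image": ("temple", 0.5), "geotag": ("park", 0.6), "text": ("park", 0.7)}
final_a = classify(with_reliable)
final_b = classify(without_reliable)
```

In the first case the image result is trusted even though the other two modalities disagree with it; in the second case no single result is trusted, so the decision is deferred to the fusion step.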
KEYWORDS: Visualization, Feature extraction, Visual analytics, Matrices, Canonical correlation analysis, Information visualization, Principal component analysis, Information science, Information technology, Video
This paper presents gaze-based visual feature extraction via Discriminative Locality Preserving Canonical Correlation Analysis (DLPCCA) for visual sentiment estimation. The proposed method calculates novel visual features reflecting users’ visual sentiment by applying DLPCCA to gaze and original visual features. Consequently, accurate visual sentiment estimation becomes feasible by utilizing the novel visual features derived by the proposed method.