Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307401 (2024) https://doi.org/10.1117/12.3028902
This PDF file contains the front matter associated with SPIE Proceedings Volume 13074, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
Fifth International Conference on Image, Video Processing, and Artificial Intelligence
Muhammad Sabirin Hadis, Junichi Akita, Masashi Toda
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307402 (2024) https://doi.org/10.1117/12.3023756
In efforts to enhance face recognition performance, techniques ranging from super-resolution methods to the use of Local Binary Patterns (LBP) and deep learning have been explored. Among these, the pseudorandom pixel placement (PSE) technique has demonstrated potential in face recognition; however, its testing was previously limited to just 8 subjects. This study undertakes a comprehensive evaluation of the PSE technique with a larger sample, utilizing 2,000 subjects from the DigiFACE1M dataset and leveraging the state-of-the-art VGG-Face deep learning model. Through experiments involving 10 different PSE patterns on 144,000 face images, our findings indicate that, compared to Regular Pixel Placement (REG), PSE improved average accuracy by 1.05%, reduced the standard deviation by 1.47%, and enabled 31 additional subjects to achieve 100% accuracy. We conclude that PSE consistently outperforms REG in face recognition tasks using the VGG-Face model across the majority of tested scenarios.
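The abstract does not define PSE beyond its name; the Python sketch below illustrates one plausible reading of pseudorandom pixel placement as a fixed, seeded permutation of pixel locations applied identically to every image before it reaches an embedding model such as VGG-Face. The function names, the 224x224 input size, and the seed are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of pseudorandom pixel placement (PSE), assuming the technique
# amounts to shuffling pixel locations with a fixed seed so the same
# permutation is applied to every enrolled and probe image.
import numpy as np

def make_pse_pattern(height: int, width: int, seed: int) -> np.ndarray:
    """Return a fixed pseudorandom permutation of pixel indices."""
    rng = np.random.default_rng(seed)
    return rng.permutation(height * width)

def apply_pse(image: np.ndarray, pattern: np.ndarray) -> np.ndarray:
    """Rearrange the pixels of an HxWxC image according to the PSE pattern."""
    h, w, c = image.shape
    flat = image.reshape(h * w, c)
    return flat[pattern].reshape(h, w, c)

# Example: one of the "10 different PSE patterns" would correspond to one seed.
pattern = make_pse_pattern(224, 224, seed=0)                 # size is an assumption
face = np.random.randint(0, 256, (224, 224, 3), np.uint8)    # placeholder image
scrambled = apply_pse(face, pattern)
# `scrambled` would then be fed to VGG-Face (or any embedding model) in place
# of the regularly placed (REG) image.
```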
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307403 (2024) https://doi.org/10.1117/12.3023798
This paper presents a novel 'outside-in' hand tracking system for Virtual Reality (VR) and Augmented Reality (AR) interactions using an external camera, specifically a webcam. While current VR Head Mounted Displays (HMDs) primarily employ 'inside-out' systems that limit hand positioning and lead to user discomfort, our proposed system offers greater freedom in hand placement and supports a natural hand posture. With the potential to engage multiple users simultaneously, the system enhances collaborative experiences in immersive 3D VR/AR spaces. The system captures the user's hand movements through a webcam, processes the frames using a MediaPipe 3D hand pose model, predicts gestures, and calculates the hand's position and orientation. The proposed model achieved a gesture prediction accuracy of 99% in testing. Furthermore, a Unity3D demonstration showcased the system's capability to replicate precise hand articulations and perform tasks such as button pressing and cube stacking. Our approach addresses both usability and inclusivity challenges, offering a more ergonomic and economical alternative for VR/AR interaction.
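The capture-and-landmark stage described above can be sketched with the publicly available MediaPipe Hands solution; the paper's gesture classifier and 6-DoF pose computation are not reproduced here, and the loop below is a generic illustration rather than the authors' code.

```python
# Minimal outside-in hand tracking loop with a webcam and MediaPipe Hands.
# Only the capture/landmark stage is shown; a downstream model would predict
# the gesture and estimate the hand's position and orientation.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=2,
                                 min_detection_confidence=0.5,
                                 min_tracking_confidence=0.5)
cap = cv2.VideoCapture(0)  # external webcam

for _ in range(300):  # process a few hundred frames as a demo
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV delivers BGR frames.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            # 21 normalized (x, y, z) landmarks per detected hand.
            wrist = hand.landmark[0]
            print(f"wrist at ({wrist.x:.2f}, {wrist.y:.2f}, {wrist.z:.2f})")

cap.release()
hands.close()
```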
Yanjie Wang, Guanshan Liu, Suhe Huang, Yichao Qin, Zheng Li, Jie Liu, Jing Wang, Yue Ding
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307404 (2024) https://doi.org/10.1117/12.3023917
Underwater unmanned vehicles are important carriers for ocean exploration, and surface vessels are their main targets during surface detection. Taking surface ships as the research object, this article builds on basic radiation theory from physical optics, reviews the current state of research on surface ship detection, and studies four modeling and simulation problems: the infrared radiation emitted by the ship itself, based on a thermodynamic model; the infrared radiation reflected by the ship from its marine environment; the atmospheric transmittance during infrared radiation transmission, based on an atmospheric model; and the infrared imaging of ship targets, based on a linear quantization method, from which simulated infrared images of ships are obtained. The theoretical models of the ship-skin temperature field and infrared radiation characteristics developed here, together with the simulation results, provide a useful reference for research on the infrared detection and recognition of ship targets by underwater unmanned vehicles.
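As a small illustration of the final imaging step, the sketch below applies a linear quantization of a simulated radiance field onto 8-bit grey levels; the thermodynamic, reflection, and atmospheric models that would actually produce the radiance are represented only by a placeholder array.

```python
# Sketch of the "linear quantization" imaging step: mapping a simulated
# at-sensor radiance field (here a random placeholder) linearly onto 8-bit
# grey levels to obtain a simulated infrared image.
import numpy as np

def linear_quantize(radiance: np.ndarray, bits: int = 8) -> np.ndarray:
    """Linearly map at-sensor radiance to integer grey levels."""
    levels = 2 ** bits - 1
    lo, hi = radiance.min(), radiance.max()
    scaled = (radiance - lo) / max(hi - lo, 1e-12)
    return np.round(scaled * levels).astype(np.uint8)

radiance = np.random.rand(256, 256) * 5.0    # placeholder radiance field
ir_image = linear_quantize(radiance)          # simulated 8-bit infrared image
```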
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307405 (2024) https://doi.org/10.1117/12.3023762
The incidence of renal tumors continues to rise each year, posing a serious threat to human health. Accurate segmentation of lesions is crucial for effective treatment. To enhance the segmentation of kidneys and renal tumors in CT images, this paper proposes a deep learning-based segmentation framework. The framework adopts a two-stage approach that progresses from rough segmentation to fine segmentation. In the rough segmentation stage, a prior contour-assisted training technique is employed to extract the region of interest, namely the kidneys and renal tumors. In the fine segmentation stage, an improved 3D convolution-based U-net model is proposed. Additionally, a novel loss function incorporating the mean and variance of the pixel values of kidneys and renal tumors is introduced for fine-tuning. Given the limited data available, abdominal dataset images are used to pre-train the model; through transfer learning, the model can learn common features from abdominal images.
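The loss is described only as incorporating the mean and variance of the pixel values of kidneys and renal tumors; the PyTorch sketch below shows one plausible reading, a Dice term plus a penalty on the mismatch between intensity statistics inside the predicted and ground-truth masks. The weighting and the exact form of the statistics term are assumptions.

```python
# Hedged sketch of a segmentation loss that augments Dice with a penalty on
# the mean/variance of CT intensities inside the predicted vs. true masks.
# One plausible reading of the paper's description, not its exact loss.
import torch

def dice_loss(prob: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    inter = (prob * target).sum()
    return 1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)

def masked_stats(intensity: torch.Tensor, mask: torch.Tensor, eps: float = 1e-6):
    """Soft mean and variance of intensities under a (soft) mask."""
    w = mask / (mask.sum() + eps)
    mean = (w * intensity).sum()
    var = (w * (intensity - mean) ** 2).sum()
    return mean, var

def stats_aware_loss(logits, target, intensity, lam: float = 0.1):
    prob = torch.sigmoid(logits)
    m_p, v_p = masked_stats(intensity, prob)
    m_t, v_t = masked_stats(intensity, target)
    stats_term = (m_p - m_t) ** 2 + (v_p - v_t) ** 2
    return dice_loss(prob, target) + lam * stats_term

# Shapes: (B, 1, D, H, W) volumes; `ct` stands in for the input CT volume.
logits = torch.randn(1, 1, 32, 64, 64)
target = (torch.rand(1, 1, 32, 64, 64) > 0.7).float()
ct = torch.rand(1, 1, 32, 64, 64)
print(stats_aware_loss(logits, target, ct))
```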
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307406 (2024) https://doi.org/10.1117/12.3023731
Many deep learning methods have been proposed for Salient Object Detection (SOD) in natural images; however, they may not transfer well to remote sensing images because they ignore domain knowledge unique to that setting. For example, satellite images often contain more complex contexts than natural images, and many salient objects in satellite images are small, yet existing deep learning-based SOD methods for natural images do not take these characteristics into account. In this paper, we propose a Transformer-aware Encoder-Decoder Network (TEDNet) that combines a hybrid Convolutional Neural Network-Transformer encoder with a Transformer-enhanced decoder to learn complex context features, capturing local neighborhoods by convolution and long-range region dependencies by Transformer, for SOD in remote sensing images. Furthermore, we propose a new image-level and pixel-level size-guided loss for mining small salient objects to train the proposed TEDNet. Experimental results on a public remote sensing SOD dataset show the effectiveness and accuracy of the proposed method.
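The size-guided loss is not specified in the abstract; one common way to emphasize small salient objects, sketched below in PyTorch, is to up-weight foreground pixels of small ground-truth regions in a binary cross-entropy term. The weighting scheme is illustrative, not the TEDNet loss itself.

```python
# Illustrative pixel-level size-guided weighting for salient object detection:
# pixels of small ground-truth objects receive larger weights in a BCE loss.
# The scheme below is an assumption, not the paper's loss.
import torch
import torch.nn.functional as F

def size_guided_bce(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """logits, target: (B, 1, H, W); target in {0, 1}."""
    b = target.shape[0]
    area = target.flatten(1).sum(dim=1).clamp(min=1.0)      # object area per image
    total = torch.tensor(float(target[0, 0].numel()), device=target.device)
    # Smaller objects -> larger weight on their foreground pixels.
    fg_weight = (total / area).sqrt().view(b, 1, 1, 1)
    weight = torch.where(target > 0.5, fg_weight, torch.ones_like(target))
    return F.binary_cross_entropy_with_logits(logits, target, weight=weight)

logits = torch.randn(2, 1, 128, 128)
target = (torch.rand(2, 1, 128, 128) > 0.98).float()   # sparse, small objects
print(size_guided_bce(logits, target))
```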
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307407 (2024) https://doi.org/10.1117/12.3023724
Detecting tactile paving (blind lanes) is very helpful for assisting people with visual impairments. We propose a novel real-time blind lane detection method based on a context salient attention mechanism and transfer learning. The method rests on transfer learning and feature integration with a visual attention mechanism. First, bottleneck descriptors and salient attention-based features are extracted through transformer-like feature integration and then fused. Second, we train the new model from a pre-trained model with careful parameter tuning. In the experiments, blind lane images collected in different areas of Chengdu are used for training and validation across several model configurations. The experimental results show that the proposed method, which applies the Swin Transformer with transfer learning and an attention mechanism, has an advantage in classification precision. Our method outperforms the original model, increasing total accuracy from 93.98% to 96.03% on the standard flower photos dataset, and achieves a total precision of 99% on our own blind road dataset.
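The transfer-learning setup can be sketched as fine-tuning a pretrained Swin Transformer on a two-class blind-lane dataset, as below; the timm model name, frozen-backbone schedule, and hyperparameters are illustrative choices rather than the paper's configuration.

```python
# Hedged sketch of the transfer-learning setup: fine-tune a pretrained Swin
# Transformer on a two-class blind-lane dataset. Model name, image size, and
# hyperparameters are illustrative, not the paper's.
import timm
import torch
from torch import nn, optim

model = timm.create_model("swin_tiny_patch4_window7_224",
                          pretrained=True, num_classes=2)

# Freeze the backbone and tune only the classification head first.
for name, param in model.named_parameters():
    if "head" not in name:
        param.requires_grad = False

criterion = nn.CrossEntropyLoss()
optimizer = optim.AdamW(filter(lambda p: p.requires_grad, model.parameters()),
                        lr=1e-3)

# One illustrative training step on a dummy batch (replace with a DataLoader
# over the collected blind-lane images).
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```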
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307408 (2024) https://doi.org/10.1117/12.3023739
Point-based 3D object detection has been receiving increasing attention because it preserves the geometric information of a point cloud and avoids the quantization errors and information loss caused by voxelization or projection. Point sampling plays an important role in point-based 3D detectors yet has not been thoroughly explored. In this paper, we conduct a comparative analysis of three point sampling strategies to gain a deep understanding of the effect each strategy has on the final performance and intermediate stages of the network. We introduce density-aware and semantic-aware sampling strategies and fit them into the backbone of a lightweight and effective baseline model, aiming to reduce the density imbalance of the point cloud and better utilize semantic information. The density-aware strategy effectively balances the density, but its inference time is not suitable for real-time applications. Semantic-aware sampling biased toward foreground points achieves a 0.19% improvement over the baseline. Analysis of statistics and visualizations reveals future research directions. We build our models on the MMDetection3D platform and evaluate performance on the KITTI dataset.
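Semantic-aware sampling biased toward foreground points can be sketched as sampling without replacement with probabilities proportional to a per-point foreground score; the scoring network itself is not shown, and random scores stand in for its output.

```python
# Sketch of semantic-aware point sampling: keep `k` points, drawn without
# replacement with probability proportional to a per-point foreground score
# (random scores stand in for the output of a small semantic head).
import numpy as np

def semantic_aware_sample(points: np.ndarray, scores: np.ndarray, k: int,
                          seed: int = 0) -> np.ndarray:
    """points: (N, 3); scores: (N,) foreground probabilities; returns (k, 3)."""
    rng = np.random.default_rng(seed)
    prob = np.asarray(scores, dtype=np.float64)
    prob = prob / prob.sum()
    idx = rng.choice(len(points), size=k, replace=False, p=prob)
    return points[idx]

cloud = np.random.randn(16384, 3).astype(np.float32)   # placeholder scan
fg_scores = np.random.rand(16384)                       # placeholder semantics
sampled = semantic_aware_sample(cloud, fg_scores, k=4096)
print(sampled.shape)   # (4096, 3)
```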
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 1307409 (2024) https://doi.org/10.1117/12.3024927
In response to the low monitoring efficiency, high personal safety risks, and high labor costs in the inspection of riverbank slope collapses in river defense engineering, this study proposes an intelligent recognition method for riverbank slope collapses based on motion detection technology. The method first uses a lightweight attention-based U-net image segmentation model to accurately segment the engineered slope region. Optical flow is then used to identify motion occurring within the slope region. Finally, a lightweight convolutional neural network is designed to classify the detected moving objects and determine whether they are part of the slope itself.
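The motion-detection stage can be sketched with dense Farneback optical flow evaluated only inside the segmented slope mask, as below; the U-net segmenter, the classifier, and the decision threshold are placeholders.

```python
# Sketch of the motion-detection stage: dense optical flow between two frames,
# evaluated only inside a binary slope mask produced by the segmentation model.
# The mask, frames, and threshold here are placeholders.
import cv2
import numpy as np

def slope_motion_magnitude(prev_gray, curr_gray, slope_mask):
    """Return the mean flow magnitude inside the segmented slope region."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    magnitude = np.linalg.norm(flow, axis=2)
    return float(magnitude[slope_mask > 0].mean())

prev_gray = np.random.randint(0, 256, (480, 640), np.uint8)
curr_gray = np.random.randint(0, 256, (480, 640), np.uint8)
mask = np.zeros((480, 640), np.uint8)
mask[200:400, 100:500] = 1                                   # placeholder slope region
if slope_motion_magnitude(prev_gray, curr_gray, mask) > 1.0:  # illustrative threshold
    print("motion in slope region -> pass the crop to the CNN classifier")
```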
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740A (2024) https://doi.org/10.1117/12.3023733
The calibration of driving behavior parameters in traffic simulation models for pavement formed from different aggregate mixtures is crucial to ensuring the validity of traffic simulation evaluation results for various asphalt pavements. This study first examines different aggregate asphalt mixtures as asphalt pavement materials through experiments. Subsequently, using queue length as the evaluation indicator, a sensitivity analysis is conducted with the VISSIM traffic simulation software to identify the sensitive subset of driving behavior parameters for the asphalt pavements under study. The traffic simulation model is then used to determine the values of this driving behavior parameter set that yield the smallest error rate between simulated and measured queue lengths on the asphalt pavement, completing the calibration of the driving behavior parameters for the studied asphalt pavement. Finally, the proposed method is validated through real-world case studies. This study proposes a feasible parameter calibration method for traffic simulation scenarios of asphalt pavement formed from different gradations of asphalt mixtures and, at the same time, lays the foundation for optimizing and controlling intersection signals for the corresponding asphalt pavements.
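The calibration loop can be sketched as a search over candidate driving-behavior parameter sets that minimizes the relative error between simulated and measured queue lengths; in the sketch below the VISSIM run is replaced by a hypothetical stub (in practice VISSIM is driven through its COM interface), and the parameter names, ranges, and values are illustrative.

```python
# Sketch of the calibration step: search candidate driving-behavior parameter
# sets and keep the one whose simulated queue length is closest to the measured
# one. `run_vissim_queue_length` is a hypothetical stub standing in for a real
# VISSIM run; parameter names and ranges are illustrative.
import itertools

def run_vissim_queue_length(params: dict) -> float:
    """Placeholder: would launch a VISSIM run and return the simulated queue length (m)."""
    return 55.0 + 20.0 * params["desired_safety_distance"] - 0.5 * params["max_deceleration"]

measured_queue = 62.0  # field-measured queue length in metres (illustrative)

grid = {
    "desired_safety_distance": [0.5, 1.0, 1.5, 2.0],
    "max_deceleration": [2.0, 3.0, 4.0],
}

best_params, best_error = None, float("inf")
for values in itertools.product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    simulated = run_vissim_queue_length(params)
    error = abs(simulated - measured_queue) / measured_queue  # relative error rate
    if error < best_error:
        best_params, best_error = params, error

print(best_params, f"error rate = {best_error:.2%}")
```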
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740B (2024) https://doi.org/10.1117/12.3023736
In recent years, gradient mapping based on colormaps has gained popularity in image editing and digital painting. However, this mapping often results in the loss of detailed information in the image. In response to this challenge, we propose a method aimed at revealing hidden details without deviating from the original color tendencies of the gradient colormaps. To achieve this goal, we employ a non-linear gradient mapping combined with a generative adversarial network to iteratively adjust the colormap parameters guided by changes in the color space of the image. Through a series of experiments, we demonstrate that our method not only fulfills the desired objectives but also exhibits outstanding performance in color representation.
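Only the non-linear gradient mapping itself is illustrated below: a tunable gamma curve applied to luminance before the colormap lookup, which redistributes detail while preserving the colormap's color ordering. The generative adversarial network the paper uses to tune the curve parameters is not reproduced.

```python
# Illustration of non-linear gradient mapping: apply a gamma curve to image
# luminance before looking it up in a colormap, so detail can be redistributed
# without changing the colormap's color ordering.
import numpy as np
import matplotlib.pyplot as plt

def nonlinear_gradient_map(gray: np.ndarray, cmap_name: str = "magma",
                           gamma: float = 0.6) -> np.ndarray:
    """gray in [0, 1]; returns an RGB image mapped through the colormap."""
    adjusted = np.power(np.clip(gray, 0.0, 1.0), gamma)   # non-linear remap
    cmap = plt.get_cmap(cmap_name)
    return cmap(adjusted)[..., :3]                        # drop the alpha channel

gray = np.linspace(0.0, 1.0, 256 * 256).reshape(256, 256)  # placeholder luminance
mapped = nonlinear_gradient_map(gray, gamma=0.6)            # gamma < 1 lifts shadows
print(mapped.shape)  # (256, 256, 3)
```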
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740C (2024) https://doi.org/10.1117/12.3024834
In this study, we explore the visual appeal of character elements in digital human guides within the tourism industry, utilizing eye-tracking technology to understand how different character elements—such as facial expressions, clothing, and accessories—impact user attention and interaction. Our findings reveal significant differences in visual attraction to various character elements, with the face often holding the most visual interest. These insights not only contribute to the theoretical understanding of visual appeal and user engagement in digital guides but also offer practical design implications for creating more engaging and effective virtual guides in tourism. The study highlights the need for further multidimensional research to fully understand and optimize user experience in digital human interactions.
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740D (2024) https://doi.org/10.1117/12.3023746
The demand for video streaming from multiple Unmanned Aerial Vehicles (UAVs) is rapidly increasing in flying ad-hoc networks (FANETs). Due to the diverse network characteristics of FANETs, tradeoff design in harsh networks has become a research hotspot. The communication links between nodes, however, are often unstable, especially in harsh network environments. This article presents a deep learning-based throughput predictor (DLTP) for improving the Quality of Experience (QoE). Based on the DLTP, we propose an adaptive algorithm that balances bandwidth, load, and video parameters according to the UAV flying status and a Quality of Service (QoS) evaluation. Extensive experimental results verify that, compared with existing methods such as FESTIVE and BOLA, the proposed DLTP achieves a 38-76% improvement in latency reduction, a 34-53% improvement in congestion control, a 45-72% improvement in packet recovery, and a 32-68% improvement in rebuffering efficiency.
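A throughput predictor of this kind can be sketched as a small recurrent model that maps a short history of throughput samples to a next-step estimate, which an adaptive-bitrate controller would then consult; the architecture sizes below are illustrative and not the DLTP described in the paper.

```python
# Hedged sketch of a deep-learning throughput predictor: an LSTM that maps a
# short history of per-segment throughput samples to the next-step estimate.
import torch
from torch import nn

class ThroughputPredictor(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        """history: (B, T, 1) past throughput samples -> (B, 1) next estimate."""
        out, _ = self.lstm(history)
        return self.head(out[:, -1])

model = ThroughputPredictor()
history = torch.rand(4, 10, 1) * 8.0   # e.g. last 10 samples in Mbit/s
predicted = model(history)              # next-step throughput estimate
# An adaptive controller would pick the highest bitrate safely below `predicted`.
print(predicted.shape)  # torch.Size([4, 1])
```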
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740E (2024) https://doi.org/10.1117/12.3023778
This study visually analyzes the current situation and development trends of China's industrial software industry to explore its research hotspots and future directions. A total of 1,812 articles with the theme of "industrial software" published from January 2003 to November 2023 were retrieved from the China National Knowledge Infrastructure (CNKI). Knowledge graphs of literature keywords were drawn and analyzed with VOSviewer and CiteSpace. The results show that current research hotspots in industrial software are concentrated in areas such as the industrial internet, intelligent manufacturing, and digital twins. Research activity has been high in recent years, and the field of industrial software in China may achieve considerable development.
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740F (2024) https://doi.org/10.1117/12.3023769
Considering highway traffic congestion and the traffic queues on surrounding ramps, this paper improves a quantitative layered model to construct a multi-ramp control priority matrix and, based on the recognition of the traffic chaos state, derives a multi-ramp signal control sequencing method that accounts for the influence of ramp traffic; ramp traffic flow and queue length indicators are introduced to comprehensively weight each ramp. The simulation results show that, first, when ramp traffic demand exceeds 800, the improved sub-section control model carries a higher mainline traffic volume than ALINEA control. Second, compared with ALINEA control, the average speed in each zone increases by 27.6%, 34.6%, and 42.3% under the sub-section control model, the average ramp queue length decreases by 40.2%, 33.8%, and 25.8%, and the average vehicle delay decreases by 75.5%, 77.1%, and 66.9%, respectively. In summary, the sub-section control model can effectively improve traffic flow on congested ramps.
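The priority weighting can be sketched as a normalized combination of each ramp's flow and queue-length indicators, as below; the 0.5/0.5 weights and the example values are illustrative, not the paper's calibrated model.

```python
# Illustrative multi-ramp priority computation: normalize each ramp's flow and
# queue-length indicators and combine them into a priority score that orders
# the control sequence. Weights and values are placeholders.
import numpy as np

flows = np.array([850.0, 620.0, 910.0])   # ramp demand (illustrative)
queues = np.array([120.0, 60.0, 180.0])   # ramp queue length (illustrative)

def normalize(x: np.ndarray) -> np.ndarray:
    return (x - x.min()) / max(x.max() - x.min(), 1e-9)

priority = 0.5 * normalize(flows) + 0.5 * normalize(queues)
control_order = np.argsort(-priority)     # highest-priority ramp first
print(priority, control_order)
```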
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740G (2024) https://doi.org/10.1117/12.3023796
For any machine learning application, the two most important factors are the input data and the applied model. In this paper, we consider both factors and dive into our research topic: handwriting analysis for personality prediction. We conducted several experiments to validate our hypotheses and conclude that preprocessing operations, especially binarization, are necessary, and that the order of the preprocessing steps may affect the final results. We compared different existing state-of-the-art models used in general machine learning tasks on our specific handwriting analysis task and found ConvNeXt-Tiny to be the best applied model, with good prediction performance.
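The preprocessing the paper finds essential can be sketched with OpenCV: grayscale conversion followed by Otsu binarization of a handwriting scan; the blur kernel and the file name below are illustrative.

```python
# Minimal sketch of the essential preprocessing: grayscale conversion followed
# by Otsu binarization of a handwriting scan, before any model-specific resizing
# or normalization. Thresholding choices are illustrative.
import cv2

def preprocess_handwriting(path: str):
    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if image is None:
        raise FileNotFoundError(path)
    image = cv2.GaussianBlur(image, (3, 3), 0)   # mild denoising
    _, binary = cv2.threshold(image, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary

# binary = preprocess_handwriting("sample_handwriting.png")  # hypothetical file
```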
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740H (2024) https://doi.org/10.1117/12.3024821
In computer vision, vehicle re-identification (Re-ID) addresses the challenge of recognizing and distinguishing vehicles as they move through different environments, under varying lighting conditions, and with changing poses and perspectives. This task is essential for applications such as video surveillance and intelligent transportation systems. In this paper, we propose a Multi-details Vision Transformer (MD-ViT) approach for vehicle Re-ID. Our method leverages the power of transformers to handle multiple levels of detail in vehicle appearance, enabling more accurate and robust re-identification across diverse scenarios. We introduce a multiple-details feature extraction process to capture fine-grained information, improving the model's ability to distinguish between vehicles with similar attributes. Furthermore, we incorporate attention mechanisms to focus on relevant vehicle details, enhancing the model's discriminative capabilities. Through comprehensive experiments on benchmark datasets, we demonstrate the effectiveness of our approach, achieving state-of-the-art results in vehicle Re-ID. Our transformer-based framework offers a promising direction for advancing vehicle re-identification with multiple details, with potential applications in smart cities, traffic monitoring, and security systems.
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740I (2024) https://doi.org/10.1117/12.3023741
Both state-of-the-art local search and optimal branch-and-bound algorithms for the traveling salesman problem need an initial solution that can be found quickly and with reasonable quality. Classical heuristics require O(n²) time, and their quality may be insufficient to keep the efficiency of the local search high. In this paper, we present two fast variants that utilize a Delaunay neighborhood graph. The best methods achieve a 2.66% gap on the Dots datasets and a 15.04% gap on the selected TSPLIB datasets; however, these results are obtained with the slower algorithms. The best of the fast methods achieves a 5.76% gap on the Dots datasets and a 19.07% gap on the selected TSPLIB datasets in 307 milliseconds on average.
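The two variants are not reproduced here; the sketch below shows only the common underlying idea, restricting a greedy nearest-neighbor tour construction to the edges of a Delaunay triangulation, with a fallback to the globally nearest unvisited city when the local neighborhood is exhausted.

```python
# Sketch of a Delaunay-restricted greedy tour construction: prefer the nearest
# unvisited Delaunay neighbor, and fall back to the nearest unvisited city when
# no neighbor is available. Illustrates the general idea only.
import numpy as np
from scipy.spatial import Delaunay

def delaunay_greedy_tour(points: np.ndarray, start: int = 0) -> list:
    tri = Delaunay(points)
    neighbors = [set() for _ in range(len(points))]
    for a, b, c in tri.simplices:                 # collect Delaunay graph edges
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))

    visited = {start}
    tour = [start]
    current = start
    while len(tour) < len(points):
        candidates = [n for n in neighbors[current] if n not in visited]
        if not candidates:                         # fallback: nearest unvisited city
            candidates = [i for i in range(len(points)) if i not in visited]
        dists = np.linalg.norm(points[candidates] - points[current], axis=1)
        current = candidates[int(np.argmin(dists))]
        visited.add(current)
        tour.append(current)
    return tour

cities = np.random.rand(200, 2)
tour = delaunay_greedy_tour(cities)
print(len(tour), tour[:5])
```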
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740J (2024) https://doi.org/10.1117/12.3023737
Proving the completeness theorem for modal logic is difficult and interesting. The theorem is usually proved with the canonical model method. Is there another proof method? In this paper, we give an alternative proof of the completeness theorem with respect to the class of all general frames. The set of possible worlds, W, is taken to be the set of all assignments. We construct an accessibility relation over W and give a detailed proof of the completeness theorem for the system K.
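For reference, the statement being re-proved is the standard soundness and completeness of the minimal normal modal logic K; the LaTeX below gives the textbook formulation over all Kripke frames, which the paper adapts to the class of all general frames.

```latex
% Soundness and completeness of the minimal normal modal logic K, stated over
% all Kripke frames (the paper proves the analogous statement with respect to
% the class of all general frames): for every modal formula \varphi,
\[
  \vdash_{\mathbf{K}} \varphi
  \quad\Longleftrightarrow\quad
  \mathfrak{F} \models \varphi \ \text{ for every Kripke frame } \mathfrak{F}.
\]
```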
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740K (2024) https://doi.org/10.1117/12.3023730
This paper examines the basic prosodic unit “Douju” in the reading discourse of Guangzhou Cantonese from three dimensions: acoustics, physiological breathing, and psychological perception. It concludes that “Douju” is a phonological unit that native listeners can clearly perceive and that carries a distinct acoustic boundary when the speaker produces it within a complete breathing cycle. The main prosodic features of “Douju” in Cantonese reading discourse are as follows: (1) acoustically, there is usually a maximum pause (long pause) before and after a given “Douju”; (2) physiologically, each “Douju” is preceded and followed by a level 1 or 2 respiratory reset; (3) perceptually, each “Douju” is preceded and followed by an acoustic long pause, which corresponds to a complete respiratory cycle.
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740L (2024) https://doi.org/10.1117/12.3023745
This paper presents a vision of creating a more seamless sim-to-real robotic grasping and manipulation platform, bridging the gap between simulated environments and real-world applications. In recent years, the emergence of digital twin technology has revolutionized how we develop and test such applications. A digital twin can simulate the behavior of a physical system in a virtual replica of the real-world environment in real-time, enabling engineers and researchers to conduct detailed analysis and evaluation before deploying robotic systems in the real world. In this paper, we study the potential of creating a digital twin that allows researchers and engineers to seamlessly deploy a grasping system trained on synthetic data and tested in a simulation environment without writing any additional code or doing additional calibration. Using our proposed platform, we present a case study done using a robotic grasping algorithm and analyze the advantages and limitations of our current platform. In the end, we suggest some possible improvements and future directions to enhance the platform’s effectiveness and applicability.
Karina J. Shakhgeldyan, Valeriya V. Gribova, Boris I. Geltser, Elena A. Shalfeyeva, Bogdan V. Potapenko
Proceedings Volume Fifth International Conference on Image, Video Processing, and Artificial Intelligence (IVPAI 2023), 130740M (2024) https://doi.org/10.1117/12.3023751
Cardiovascular diseases (CVD) are the leading cause of death in most countries around the world, making the accurate assessment of risks and the selection of individual preventive strategies a current focus in healthcare. In this article, the authors presented a prototype of a Clinical Decision Support System (CDSS) for predicting and preventing cardiovascular risks based on a hybrid architecture that integrates machine learning models and ontological knowledge bases. A microservice architecture based on the Cloud-Edge approach is proposed for optimizing computational resources when processing tabular data, signals, video, and images, as well as for enhancing the effectiveness of integration with various Healthcare Information Systems (HIS). The CDSS supports the formalization not only of medical history data and results of studies but also the rules for interpreting the results of predictions based on machine learning models and methods of explainable artificial intelligence (XAI). The developed CDSS includes widely used tools in clinical cardiology and cardiothoracic surgery for risk assessment, as well as proprietary machine learning models for predicting in-hospital mortality, and others. These models contribute to making informed medical decisions for the diagnosis, prevention, and treatment of CVD. The prototype was implemented at the Medical Center of the Far Eastern Federal University and integrated with the "1C" HIS. The experience of implementing the prototype demonstrated the high potential of hybrid CDSS based on microservice architecture for use in clinical practice.