Unveiling ALMA software behavior using a decoupled log analysis framework
18 July 2014
Abstract
The ALMA software is a complex distributed system installed on more than one hundred computers that interacts with more than one thousand hardware device components. A normal observation follows a flow that interacts with almost that entire infrastructure in a coordinated way. The Software Operation Support team (SOFTOPS) comprises specialized engineers who analyze the generated software log messages on a daily basis to detect bugs and failures and to predict eventual failures. These log messages can reach up to 30 GB per day. We describe a decoupled, non-intrusive log analysis framework and the tools implemented to identify well-known problems, measure the time taken by specific tasks, and detect abnormal behavior in the system, so that engineers are alerted to take corrective action. The main advantage of this approach is that the analysis itself does not interfere with the performance of the production system, which allows multiple analyzers to run in parallel. In this paper we describe the selected framework and show the results of some of the implemented tools.
© (2014) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Juan Pablo Gil, Alexis Tejeda, Tzu-Chiang Shen, and Norman Saez "Unveiling ALMA software behavior using a decoupled log analysis framework", Proc. SPIE 9152, Software and Cyberinfrastructure for Astronomy III, 91521G (18 July 2014); https://doi.org/10.1117/12.2055352
PROCEEDINGS
7 PAGES

