The Atacama Large Millimeter/submillimeter Array (ALMA) has been in its operations phase since 2013. This transition changed the priorities within the observatory: most of the available time is now dedicated to science observations at the expense of the technical time required for testing newer versions of the ALMA software. Therefore, a process to design and implement a new simulation environment, which must be comparable to, or at least representative of, the production environment, was started in 2017. Concepts of model-in-the-loop and hardware-in-the-loop were explored. In this paper we review and present the experiences gained and lessons learned during the design and implementation of the new simulation environment.
The ALMA software is a large collection of modules implementing all the functionality needed for the observatory's day-to-day operations, from proposal preparation to scientific data delivery. ALMA software subsystems include, among many others: array/antenna control, correlator, telescope calibration, submission and processing of science proposals, and data archiving.
The implementation of new features and improvements for each software subsystem must be closely coordinated with observatory milestones, the need to respond rapidly to operational issues, regular maintenance activities, and the testing resources available to verify and validate new and improved software capabilities. This paper describes the main issues detected in managing all these factors together and the different approaches used by the observatory in search of an optimal solution.
In this paper, we describe the software delivery process adopted by ALMA during the construction phase and its further evolution in early operations. We also present the acceptance process implemented by the observatory for the validation of the software before it can be used for science observations. We provide details of the main roles and responsibilities during software verification and validation, as well as their participation in the process for reviewing and approving changes to the accepted software versions.
Finally, we present ideas on how these processes should evolve in the near future, considering the operational reality of the ALMA observatory as it moves into full operations, and summarize the progress implementing some of these ideas and lessons learnt.
Free-atmosphere and surface-layer optical turbulence have been extensively monitored over the years. The optical turbulence inside a telescope enclosure, on the other hand, has yet to be as fully characterized. For this latter purpose, an experimental concept, LOTUCE (LOcal TUrbulenCe Experiment), has been developed in order to measure and characterise the so-called dome seeing. LOTUCE2 is an upgraded prototype whose main aim is to measure optical-turbulence characteristics more precisely by minimising cross-contamination of signals. This characterisation is both quantitative (optical-turbulence strength) and qualitative (assessing the optical-turbulence statistical model). We present the new opto-mechanical design, with its theoretical capabilities and limitations with respect to the actual models.
The ALMA software is a complex distributed system installed on more than one hundred computers, which interacts with more than one thousand hardware device components. A normal observation follows a flow that involves almost that entire infrastructure in a coordinated way. The Software Operation Support team (SOFTOPS) comprises specialized engineers who analyze the generated software log messages on a daily basis to detect bugs and failures, and to predict eventual failures. These log messages can reach up to 30 GB per day. We describe a decoupled and non-intrusive log analysis framework and the tools implemented on top of it to identify well-known problems, measure the time taken by specific tasks, and detect abnormal behaviors in the system, in order to alert the engineers to take corrective actions. The main advantage of this approach, among others, is that the analysis itself does not interfere with the performance of the production system, allowing multiple analyzers to run in parallel. In this paper we describe the selected framework and show the results of some of the implemented tools.
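The decoupled, non-intrusive approach described above can be illustrated with a minimal sketch. This is not ALMA's actual framework: the log-line format, pattern names, and subsystem sources below are all hypothetical. The key idea is that analyzers scan copies of the log files offline, matching known problem signatures, so the production system is never touched.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) log line format: "2023-01-01T12:00:00 LEVEL Source: message"
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>\w+)\s+(?P<source>[\w./-]+):\s+(?P<msg>.*)"
)

# Signatures of well-known problems (illustrative examples only).
KNOWN_PROBLEMS = {
    "timeout": re.compile(r"timed? ?out", re.IGNORECASE),
    "connection_lost": re.compile(r"connection (lost|refused)", re.IGNORECASE),
}

def analyze(lines):
    """Scan log lines offline; count known problems and ERROR messages per source."""
    problems = defaultdict(int)
    errors_by_source = defaultdict(int)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not match the assumed format
        if m.group("level") == "ERROR":
            errors_by_source[m.group("source")] += 1
        for name, pattern in KNOWN_PROBLEMS.items():
            if pattern.search(m.group("msg")):
                problems[name] += 1
    return dict(problems), dict(errors_by_source)

sample = [
    "2023-01-01T12:00:00 ERROR CONTROL/ACC: command timed out",
    "2023-01-01T12:00:01 INFO CORR/Master: scan started",
    "2023-01-01T12:00:02 ERROR CONTROL/ACC: connection lost to device",
]
problems, errors = analyze(sample)
print(problems)  # {'timeout': 1, 'connection_lost': 1}
print(errors)    # {'CONTROL/ACC': 2}
```

Because each analyzer only reads static files, many such analyzers can run in parallel on a separate machine without affecting the observing system's performance.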
The Durham Adaptive Optics Real-Time Controller (DARC)1 is a real-time system for astronomical adaptive-optics systems, originally developed at Durham University and in use for the CANARY instrument. One of its main strengths is that it is a generic, high-performance real-time controller running on an off-the-shelf Linux computer. We are using DARC for two different implementations: BEAGLE,2 a Multi-Object AO (MOAO) bench system to experiment with novel tomographic reconstructors, and LOTUCE2,3 an in-dome turbulence instrument. We present the software architecture for each application, current benchmarks, and lessons learned for current and future DARC developers.
The ALMA Test Interferometer emerged as an infrastructure solution to increase both ALMA time availability for science activities and time availability for software testing and engineering activities, at a reduced cost (<30,000 USD) and with a low setup time of less than one hour. The Test Interferometer can include up to 16 antennas when used with only AOS resources, and a maximum of 4 antennas when configured using correlator resources at the OSF. A joint effort between ADC and ADE-IG took on the challenge of building the Test Interferometer from a design already defined for operations, which imposed many complex restrictions on how to implement it. Through intensive design and evaluation work, it was determined that an initial implementation is possible using the ACA Correlator, and the feasibility of implementing the Test Interferometer by connecting the test array at the AOS with correlator equipment installed at the OSF, separated by approximately 30 km, is now also being tested. Lastly, efforts will be made to achieve interferometry between AOS and OSF antennas with a baseline of approximately 24 km.
The ALMA software is a large collection of modules which implements the functionality needed for the observatory's day-to-day operations, including, among others, array/antenna control, correlator, telescope calibration, and data archiving. Many software patches must be applied periodically to fix problems detected during operations or to introduce enhancements after a release has been deployed and used under regular operational conditions. Under this scenario, it has been imperative to establish, besides a strict configuration control system, a weekly regression test to ensure that the modifications applied do not impact system stability and functionality. A test suite has been developed for this purpose, which reflects the operations performed by the commissioning and operations groups, and which aims to detect problems associated with the changes introduced in different versions of the ALMA software releases. This paper presents the evolution of the regression test suite, which started at the ALMA Test Facility and has been adapted to be executed under the current operational conditions. Topics such as the selection of the tests to be executed, the validation of the obtained data, and the automation of the test suite are also presented.
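The core of such a weekly regression check can be sketched as a comparison of each test case's results against a stored baseline. This is a minimal illustration, not ALMA's actual harness: the case names, metrics, and tolerance value below are hypothetical. The point is that timing metrics are allowed to drift within a tolerance, so only genuine regressions are flagged.

```python
# Hypothetical baseline results recorded from a previously validated release.
BASELINE = {
    "single_dish_pointing": {"status": "PASS", "duration_s": 120.0},
    "interferometric_scan": {"status": "PASS", "duration_s": 300.0},
}

def check_case(name, result, tolerance=0.20):
    """Return a list of deviations of `result` from the baseline for `name`."""
    expected = BASELINE[name]
    issues = []
    if result["status"] != expected["status"]:
        issues.append(f"{name}: status {result['status']} != {expected['status']}")
    # Allow durations to drift within `tolerance` (fractional) before flagging.
    if abs(result["duration_s"] - expected["duration_s"]) > tolerance * expected["duration_s"]:
        issues.append(f"{name}: duration {result['duration_s']}s outside tolerance")
    return issues

# A small drift passes; a failed case with a large slowdown is reported twice.
print(check_case("single_dish_pointing", {"status": "PASS", "duration_s": 125.0}))
print(check_case("interferometric_scan", {"status": "FAIL", "duration_s": 500.0}))
```

A real suite would populate the result dictionaries by executing observation scripts and parsing their output, but the baseline-comparison step is the part that turns a test run into a regression verdict.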
Starting in 2009, the ALMA project entered one of the most exciting phases of construction: the first antenna from one of the vendors was delivered to the Assembly, Integration and Verification team. With this milestone and the closure of the ALMA Test Facility in New Mexico, the JAO Computing Group in Chile found itself on the front line of the project's software deployment and integration effort. Among the group's main responsibilities are the deployment, configuration, and support of the observation systems, in addition to infrastructure administration, all of which needs to be done in close coordination with the development groups in Europe, North America, and Japan. Software support has been the primary point of interaction with the current users (mainly scientists, operators, and hardware engineers), as the software is normally the most visible part of the system.
During this first year of work with the production hardware, three consecutive software releases have been deployed and commissioned. In addition, the first three antennas have been moved to the Array Operations Site, at 5,000 meters elevation, and the complete end-to-end system has been successfully tested. This paper shares the experience of this 15-person group as part of the construction team at the ALMA site, working together with the Computing IPT, covering the achievements attained and the problems overcome during this period. It explores the excellent results of teamwork, as well as some of the troubles that such a complex and geographically distributed project can run into. Finally, it addresses the challenges still to come with the transition to the ALMA operations phase.