Significantly reducing the processing times of high-speed photometry data sets using a distributed computing model
24 September 2012
The scientific community is in the midst of a data analysis crisis. The increasing capacity and falling cost of scientific CCD instrumentation are contributing to an explosive generation of raw photometric data. This data must go through a process of cleaning and reduction before it can be used for high-precision photometric analysis. Many existing data processing pipelines either assume a relatively small dataset or are batch processed by a High Performance Computing centre. A radical overhaul of these processing pipelines is required so that terabyte-sized datasets can be cleaned and reduced at near-capture rates using an elastic processing architecture. The ability to access computing resources and to allow them to grow and shrink as demand fluctuates is essential, as is exploiting the parallel nature of the datasets. A distributed data processing pipeline is required. It should incorporate lossless data compression, allow for data segmentation, and support processing of data segments in parallel. Academic institutes can collaborate to provide an elastic computing model without the requirement for large centralized high performance computing data centres. This paper demonstrates how an order-of-magnitude (factor of ten) improvement in overall processing time has been achieved using the "ACN pipeline", a distributed pipeline spanning multiple academic institutes.
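The three requirements named above (lossless compression, data segmentation, and parallel processing of segments) can be illustrated with a minimal Python sketch. This is not the ACN pipeline itself, whose implementation the abstract does not describe. The function names, the constant bias subtraction used as a stand-in cleaning step, and the use of a local thread pool in place of workers at remote institutes are all assumptions made for illustration.

```python
import zlib
from array import array
from concurrent.futures import ThreadPoolExecutor


def segment(frames, size):
    """Split a list of frames into consecutive segments of at most `size` frames."""
    return [frames[i:i + size] for i in range(0, len(frames), size)]


def pack(seg):
    """Losslessly compress a segment of 16-bit pixel frames for transfer."""
    raw = b"".join(array("H", frame).tobytes() for frame in seg)
    return zlib.compress(raw)


def reduce_segment(payload, frame_len, bias):
    """Worker step: decompress a segment and subtract a constant bias.

    The bias subtraction is a hypothetical placeholder for real cleaning
    and reduction steps.
    """
    pixels = array("H")
    pixels.frombytes(zlib.decompress(payload))
    cleaned = [max(p - bias, 0) for p in pixels]
    return [cleaned[i:i + frame_len] for i in range(0, len(cleaned), frame_len)]


def run_pipeline(frames, segment_size, bias, workers=4):
    """Segment, compress, and reduce frames in parallel, preserving frame order.

    Local threads stand in for remote workers (an assumption of this sketch).
    """
    frame_len = len(frames[0])
    payloads = [pack(s) for s in segment(frames, segment_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: reduce_segment(p, frame_len, bias), payloads)
        return [frame for seg in results for frame in seg]
```

Because each segment is independent, the number of workers can grow or shrink without changing the result, which is the property an elastic architecture relies on.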
© (2012) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Paul Doyle, Fred Mtenzi, Niall Smith, Adrian Collins, Brendan O'Shea, "Significantly reducing the processing times of high-speed photometry data sets using a distributed computing model", Proc. SPIE 8451, Software and Cyberinfrastructure for Astronomy II, 84510C (24 September 2012); doi: 10.1117/12.924863; https://doi.org/10.1117/12.924863

