Abstract: In recent years, Big Data has become a more prominent issue as both the volume of data and the velocity at which it is produced increase exponentially. By 2020, the amount of data being stored is estimated to reach 44 zettabytes, and over 31 terabytes of data are currently being generated every second. Algorithms and applications must be able to scale effectively to the volume of data being generated. One application designed to work effectively and efficiently with Big Data is IBM’s Skylark. Skylark is part of DARPA’s XDATA program, an open-source catalog of tools for dealing with Big Data. Short for Sketching-based Matrix Computations for Machine Learning, Skylark is a library of functions designed to reduce the complexity of large-scale matrix problems; it also implements kernel-based machine learning tasks. Sketching reduces the dimensionality of matrices through randomization, compressing them while preserving key properties and thereby speeding up computations. Matrix sketches can be used to find accurate solutions in less time, or to summarize data by identifying important rows and columns. In this paper, we investigate the effectiveness of sketched matrix computations using IBM’s Skylark versus non-sketched computations. We judge effectiveness on two factors: computational complexity and validity of outputs. Initial results from testing with smaller matrices are promising, showing that Skylark achieves a considerable reduction ratio while still performing matrix computations accurately.
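To make the idea of sketching concrete, the following is a minimal NumPy illustration of a sketch-and-solve least-squares problem. It is not Skylark's API or the authors' experimental setup: the function names (`gaussian_sketch`, `sketched_lstsq`), the dense Gaussian sketching matrix, the problem sizes, and the noise level are all assumptions chosen for illustration. A random matrix S compresses the rows of A and b, and the smaller problem is solved in place of the original one; the sketched result is then compared with the non-sketched solution.

```python
import numpy as np

def gaussian_sketch(rows_out, rows_in, rng):
    """Dense Gaussian sketching matrix, scaled so that E[S.T @ S] = I."""
    return rng.standard_normal((rows_out, rows_in)) / np.sqrt(rows_out)

def sketched_lstsq(A, b, sketch_rows, rng):
    """Solve min ||S A x - S b|| instead of min ||A x - b||."""
    S = gaussian_sketch(sketch_rows, A.shape[0], rng)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

# Synthetic tall problem (sizes are illustrative assumptions).
rng = np.random.default_rng(0)
n, d, m = 4000, 10, 400          # original rows, columns, sketched rows
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Non-sketched baseline and sketched solution.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_lstsq(A, b, sketch_rows=m, rng=rng)

# Row reduction ratio and how much larger the sketched residual is.
reduction_ratio = n / m
res_exact = np.linalg.norm(A @ x_exact - b)
res_sketch = np.linalg.norm(A @ x_sketch - b)
rel_residual_gap = res_sketch / res_exact - 1.0

print(f"reduction ratio: {reduction_ratio:.0f}x")
print(f"relative residual gap: {rel_residual_gap:.4f}")
```

A dense Gaussian sketch is used here only because it is the simplest to write down; applying it costs more than solving this small problem directly, so a speedup in practice depends on structured or sparse sketching transforms that can be applied quickly.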
Soundararajan Ezekiel and Michael Giansiracusa, "Matrix sketching for big data reduction (Conference Presentation)," Proc. SPIE 10199, Geospatial Informatics, Fusion, and Motion Video Analytics VII, 101990F (Presented at SPIE Defense + Security: April 12, 2017; Published: 6 June 2017); https://doi.org/10.1117/12.2262937.5460300095001.