Pydron is a novel approach to using cloud computing for scientific data processing. By semi-automatically parallelizing sequential Python code, it makes the advantages of cloud computing available through an API consisting of only two function decorators.
I develop Pydron as part of my PhD research at ETH Zürich. My research focuses on use cases from astronomy, but Pydron can be applied in many other fields as well.
In astronomy, data processing is often exploratory in nature, leading to many processing iterations with code and parameter changes. With Pydron we study how automatic parallelization can provide an easy-to-use system that brings cloud infrastructure to software with a rapidly changing code base.
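As a rough illustration of the two-decorator idea described above, the sketch below shows what annotated sequential code could look like. The decorator names `schedule` and `functional`, the helper functions, and the toy data are all assumptions made for this example. The decorators are local stand-ins that only mark functions and run them sequentially; they are not Pydron's implementation and perform no parallelization.

```python
import functools


def functional(func):
    """Stand-in decorator (assumed name): marks a function as side-effect free."""
    func.is_functional = True
    return func


def schedule(func):
    """Stand-in decorator (assumed name): marks the function to be parallelized.

    This placeholder simply runs the function sequentially.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    wrapper.is_scheduled = True
    return wrapper


@functional
def calibrate(image):
    # Hypothetical per-image step without side effects.
    return [pixel * 2 for pixel in image]


@functional
def combine(images):
    # Hypothetical reduction step: pixel-wise sum over all images.
    return [sum(pixels) for pixels in zip(*images)]


@schedule
def process(raw_images):
    # Ordinary sequential Python. The calibrate() calls do not depend on
    # each other, so a real scheduler would be free to run them in parallel.
    calibrated = [calibrate(image) for image in raw_images]
    return combine(calibrated)


result = process([[1, 2], [3, 4]])
```

The point of this style is that the body of `process` stays plain Python: the annotations carry the only extra information a scheduler would need, which suits code that changes often.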
Dark Energy Survey
The Dark Energy Survey (DES) is an astronomical survey that studies the expansion of the universe and the growth of large-scale structures.
I'm a technical collaborator in the DES Data Management group. In particular, I made significant contributions to the software packaging system, for which we extended an existing system, EUPS, to meet the specific needs of the consortium.
The software and infrastructure used for data reduction are very heterogeneous because many institutions participate in the consortium. Software components use various programming languages and build systems and have to be installed on several operating systems.
To adapt EUPS to these requirements, we added several major components to it.
The system currently manages several hundred packages, including DES-specific software as well as third-party tools and libraries.
Work on this project continues as more and more of the production system is migrated into our packaging system.
Euclid - Mapping the Dark Universe
Euclid is a space mission currently under development by the European Space Agency. Its scientific goal is a better understanding of dark matter and dark energy.
Euclid's camera will produce about half a petabyte of raw data each year. This data, together with data from the ground-based Dark Energy Survey and Pan-STARRS (about 6 PB), has to undergo intensive processing to generate data products for the scientists.
As members of the Ground Segment Systems Team, we contribute to the ground-based data processing system, which is currently being designed.
We developed a prototype of the system component that integrates the resources of the collaborating institutions with each other.
I'm the teaching assistant for the Software Design module taught by Prof. Dominik Gruntz.
The lecture gives an introduction to design patterns with a very hands-on approach, letting the students develop a vector-oriented drawing application.