When designing the architecture of the TANGO project, we saw the need for a central component that speeds up the process of building an application for different heterogeneous target architectures, with different optimizations, and of deploying the results on the different testbeds, especially in the HPC use case. This led to the creation of the Application Lifecycle Deployment Engine, or ALDE. Let’s go step by step through how it will be shaped and which functions it is going to perform.
The user has the code of an application that can support different heterogeneous hardware, such as CPUs, GPUs, MCPs, FPGAs, etc. The user submits all of this to ALDE via its CLI or REST API. From it, ALDE can start preparing different versions of the application: one optimized for running on CPU, another optimized for CPU+GPU, another for CPU+MCP, etc. The build output can take different formats, depending on the platform where the application is going to run. Those formats are:
- Package – The simplest one: the different versions of the application are packaged into typical application package formats, such as deb, rpm, tar.gz, msi, etc.
- Image – ALDE builds a specific image, with the operating system, the necessary libraries and the application, ready to be used to boot a small embedded device. In this case only one image is going to be built per device type.
- Container – This could be a Docker or Singularity container. Although containers could also be used in other scenarios, the main target here is the HPC environment. In HPC we are seeing a move towards containers for executing applications, since they make an application more portable between HPC machines without the performance overhead of VMs. In this case, if the user desires, apart from building the application with different optimizations depending on the accelerators available in the HPC environment, ALDE could also build the application with different library versions.
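As a rough illustration of the container case, the Python sketch below enumerates the build variants that would result from combining hardware targets with library versions. All names here (`BuildVariant`, `plan_container_builds`, the target labels and the library version strings) are hypothetical and chosen only for this example; they are not ALDE's actual data model or API.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BuildVariant:
    """One container build: a hardware target plus a library version.

    Hypothetical structure for illustration only, not ALDE's real model.
    """
    target: str
    library_version: str


def plan_container_builds(targets, library_versions):
    """Return one BuildVariant per (target, library version) combination."""
    return [BuildVariant(t, v) for t, v in product(targets, library_versions)]


# Assumed example inputs: three hardware targets and two library versions.
targets = ["cpu", "cpu+gpu", "cpu+mcp"]
library_versions = ["lib-v1", "lib-v2"]

variants = plan_container_builds(targets, library_versions)
for variant in variants:
    print(f"{variant.target} with {variant.library_version}")
```

With three targets and two library versions, the plan contains six container builds. The number of builds grows multiplicatively, which is one reason to let the user opt in to library variations rather than building them by default.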
Once the application is built, the user can select the platform where to deploy it. In the HPC world in particular, the user can select different configurations: an ideal or favourite one, plus other alternatives. ALDE will then connect to the HPC workload manager and check the free slots on the different nodes. Using this information, it will decide which is the best reservation for launching the application, selecting the necessary containers to deploy everything.
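The selection step can be sketched in a few lines of Python. This is a minimal sketch under simplifying assumptions: each configuration asks for a number of nodes of one kind, the user's preferences are ordered from favourite to least preferred, and the free-node counts are hard-coded here rather than queried from a workload manager. The names `Configuration` and `choose_configuration` are hypothetical, and the real ALDE decision logic may weigh other factors.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Configuration:
    """A deployment option the user is willing to accept (hypothetical)."""
    name: str
    node_kind: str   # e.g. "cpu" or "cpu+gpu"
    nodes: int       # number of nodes requested


def choose_configuration(
    preferences: List[Configuration],
    free_nodes: Dict[str, int],
) -> Optional[Configuration]:
    """Return the first preferred configuration that fits the free nodes."""
    for config in preferences:
        if free_nodes.get(config.node_kind, 0) >= config.nodes:
            return config
    return None  # nothing fits right now


# Ordered from favourite to alternative.
preferences = [
    Configuration("favourite", "cpu+gpu", 4),
    Configuration("alternative", "cpu", 8),
]

# Assumed snapshot of free nodes; in practice this would come from
# the HPC workload manager.
free_nodes = {"cpu+gpu": 2, "cpu": 10}

chosen = choose_configuration(preferences, free_nodes)
print(chosen.name if chosen else "no configuration fits")
```

In this example only two CPU+GPU nodes are free, so the favourite configuration cannot be satisfied and the CPU-only alternative is chosen. The container built for that target would then be the one to deploy.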
The TANGO project is currently developing ALDE. The first version will support Docker and Singularity containers, with SLURM as the workload manager. By this summer, a first release of the ALDE component should be available here: GitHub TANGO - ALDE.