Galileo Security Overview
Self-Hosting vs Default Hosting
Galileo is a Platform-as-a-Service offering, and as such it allows the end user to host certain components of the platform on their own self-administered resources. Currently users can choose to host their own object storage and their own compute resources or to use the default storage and compute resources provided by Galileo. All authentication flows are handled via Auth0.
The default storage provider used by Galileo for Mission input files and job results is GCP Storage (standard level). More info on the security profile of GCP Storage can be found here. Users can instead store input and result files on their own self-administered object storage, provided Galileo already supports that particular storage technology. In this bring-your-own-storage scenario, the storage administrator has full control over the access and persistence of the files consumed and generated by the Galileo platform for the Galileo accounts targeting that resource as the storage destination. The Galileo web service requires a user-provided API key to access non-default object storage. These keys are stored in an encrypted format; access can be revoked by deactivating the API key provided to Galileo.
Users of Galileo can host their own computational resources (virtual machines, HPC, etc.) by running their own instances of the Landing Zone daemon process and authenticating it against their Galileo account. When a user runs their own LZ, they assume total control of the underlying host machine and any jobs sent to the LZ via the Galileo web service (for more info, see Access Control). The LZ daemon is written in Python and the source code can be obtained and audited for security analysis purposes.
Alternatively, users can run on the default compute resources provided in the communal Stations (the Linux and Windows Stations). Users are given access to these default computational resources when their account is created. These instances are sourced from GCP Compute Engine, AWS EC2, and Azure Virtual Machines. Be aware that the default computational resources run simultaneous workloads from multiple Galileo users in containerized virtual environments.
The LZ daemon communicates with the Galileo web service via HTTPS secured with TLS. The LZ daemon must be run with sufficient permissions to create containers with the targeted container runtime or scheduling environment (e.g. Docker, Singularity, Slurm). Reauthentication is facilitated by writing a local authentication token file (provided by Auth0). If the authentication token file is deleted, the user must log their LZ daemon back into their account the next time it is restarted.
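The token-file behavior described above can be illustrated with a minimal Python sketch. This is not the LZ daemon's actual code: the JSON file format, the "token" key, and the function names load_token, save_token, and needs_login are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Optional


def load_token(token_path: Path) -> Optional[str]:
    """Return the stored token, or None if the file is missing or unusable."""
    try:
        data = json.loads(token_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    # Hypothetical layout: {"token": "<opaque string>"}
    token = data.get("token") if isinstance(data, dict) else None
    return token if isinstance(token, str) and token else None


def save_token(token_path: Path, token: str) -> None:
    """Persist the token so the next start can skip the login step."""
    token_path.write_text(json.dumps({"token": token}))
    # Restrict the file to the owning user (best effort on non-POSIX systems).
    token_path.chmod(0o600)


def needs_login(token_path: Path) -> bool:
    """A missing or invalid token file means the user has to log in again."""
    return load_token(token_path) is None
```

In this sketch, deleting the file makes needs_login return True on the next start, which mirrors the behavior the paragraph describes.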
The Galileo platform has two primary features in which role-based access control is available: “Missions” and “Stations.”
Missions are reusable code/simulation buckets where a Galileo user can upload data files in the form of input files, scripts, binaries, etc. Importantly, a Galileo Mission can be set up as a pre-configured framework “type.” For example, a Galileo Mission can be configured as a Python project, an R project, or a Gromacs project. If a user sets up a Mission as one of the pre-supported framework types, then they do not have to supply their own Dockerfile as Galileo will produce this for them server-side. If a user does provide their own Dockerfile, Galileo identifies this Mission type as “user-defined.”
Within the context of a Mission, Galileo users can invite collaborators as role-based members. The role assigned to a member determines whether they have read access to the input data and results data and whether they have write/execute permission.
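As a rough illustration of how a member's role might map to Mission permissions, consider the Python sketch below. The role names "viewer" and "contributor", the MissionRole class, and the is_allowed helper are hypothetical; Galileo's actual role names and data model are not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MissionRole:
    name: str
    can_read: bool           # read access to input and results data
    can_write_execute: bool  # write/execute permission


# Hypothetical roles, for illustration only.
ROLES = {
    "viewer": MissionRole("viewer", can_read=True, can_write_execute=False),
    "contributor": MissionRole("contributor", can_read=True, can_write_execute=True),
}


def is_allowed(role_name: str, action: str) -> bool:
    """Check an action ("read", "write", or "execute") against a role."""
    role = ROLES.get(role_name)
    if role is None:
        return False  # unknown roles get no access
    if action == "read":
        return role.can_read
    if action in ("write", "execute"):
        return role.can_write_execute
    return False  # unknown actions are denied
```

The sketch denies by default: any role or action it does not recognize is refused.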
Stations allow computational resources running the Galileo “Landing Zone” daemon to be shared with an arbitrary number of other Galileo users. Within the context of a Station, administrators can set:
Which users are in the Station
Number of available Landing Zones
What Mission framework types are allowed to run (thus determining which base images are allowed to be pulled to the host machines)
Resource usage defaults, such as max CPU and memory on a per-job and per-user basis
Daily, weekly, monthly, and yearly usage quotas
Custom user roles with associated role capabilities and resource/quota limits
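One way to picture how these settings could gate an incoming job is the following Python sketch. It covers only membership, allowed framework types, and per-job resource limits; the StationSettings and JobRequest classes, their field names, and the rejection_reason function are assumptions for illustration, not Galileo's API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class StationSettings:
    members: FrozenSet[str]
    allowed_framework_types: FrozenSet[str]
    max_cpus_per_job: int
    max_memory_mb_per_job: int


@dataclass(frozen=True)
class JobRequest:
    user: str
    framework_type: str
    cpus: int
    memory_mb: int


def rejection_reason(settings: StationSettings, job: JobRequest) -> Optional[str]:
    """Return why a job would be refused, or None if it passes every check."""
    if job.user not in settings.members:
        return "user is not a member of the Station"
    if job.framework_type not in settings.allowed_framework_types:
        return "framework type is not allowed in this Station"
    if job.cpus > settings.max_cpus_per_job:
        return "requested CPUs exceed the per-job maximum"
    if job.memory_mb > settings.max_memory_mb_per_job:
        return "requested memory exceeds the per-job maximum"
    return None
```

Usage quotas and per-user limits would need additional state (usage accumulated so far) and are left out of this sketch.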
User permissions are controlled via a role-based permission scheme. Custom roles can be created by administrators through the Station settings UI. The Station owner and administrators can control if a particular user role can:
Add/remove Landing Zones
Control the state of running or queued jobs within the Station context
Invite members to and remove members from the Station
Edit/assign role types
Edit the default per-job and per-user max resource usage
Edit the per-user max runtime quota
Control which framework types are allowed to run within a Station context
Run interactive Mission types
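The capabilities listed above lend themselves to a bit-flag representation. The Python sketch below uses the standard library's enum.Flag; the Capability member names and the example OPERATOR role are hypothetical and chosen only to mirror the list, not taken from Galileo itself.

```python
from enum import Flag, auto


class Capability(Flag):
    MANAGE_LANDING_ZONES = auto()     # add/remove Landing Zones
    CONTROL_JOBS = auto()             # control running or queued jobs
    MANAGE_MEMBERS = auto()           # invite and remove members
    EDIT_ROLES = auto()               # edit/assign role types
    EDIT_RESOURCE_LIMITS = auto()     # per-job and per-user max resources
    EDIT_RUNTIME_QUOTA = auto()       # per-user max runtime quota
    CONTROL_FRAMEWORK_TYPES = auto()  # allowed framework types
    RUN_INTERACTIVE = auto()          # run interactive Mission types


# A hypothetical custom role combining two capabilities.
OPERATOR = Capability.CONTROL_JOBS | Capability.RUN_INTERACTIVE


def has_capability(role_caps: Capability, needed: Capability) -> bool:
    """True only if every capability in `needed` is present in `role_caps`."""
    return (role_caps & needed) == needed
```

For example, has_capability(OPERATOR, Capability.CONTROL_JOBS) is True, while has_capability(OPERATOR, Capability.EDIT_ROLES) is False.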