Galileo Security Overview
Self-Hosting vs Default Hosting
Galileo is a Platform-as-a-Service offering that allows the end user to host certain components of the platform on their own self-administered resources. Currently, users can choose to host their own object storage and compute resources, or to use the default storage and compute resources provided by Galileo. In both cases, Galileo gives administrators of storage and compute assets segmented control of both resource access and permissions. All authentication flows are handled via Auth0.
The default storage provider used by Galileo for Mission input files and job results is GCP Storage (standard level). More information on the security profile of GCP Storage is available in the GCP Storage documentation. Users can instead store input and result files on their own self-administered object storage, via the Cargo Bays feature, provided Galileo already supports that particular storage technology. In this bring-your-own-storage scenario, the storage administrator has full control over the access and persistence of the files consumed and generated by the Galileo platform for the Galileo accounts targeting that resource as the storage destination. The Galileo web service requires a user-provided API key to access non-default object storage. These keys are stored in an encrypted format. Access can be revoked by deleting the associated Cargo Bay, which removes the authentication credentials.
Users of Galileo can host their own computational resources (virtual machines, HPC, etc.) by running their own instances of the Landing Zone daemon process and authenticating it against their Galileo account. Running a Landing Zone does not require that the host machine be exposed to the wider internet via a public IP address, nor does it require any special VPN settings. When a user runs their own LZ, they retain total control of the underlying host machine and any jobs sent to the LZ via the Galileo web service (for more info, see Access Control). The LZ daemon is written in Python and the source code can be obtained and audited for security analysis purposes.
Alternatively, users can run on the default compute resources provided in the communal Stations (the Linux and Windows Stations), or they can purchase privately provisioned LZ instances. Users are given access to complimentary default computational resources when their account is created. Be aware that the default computational resources run simultaneous workloads from multiple Galileo users in containerized virtual environments. Provisioned Landing Zones, however, are private to the user who purchased them. All instances are sourced from GCP Compute Engine, AWS EC2, and Azure Virtual Machines.
The LZ daemon communicates with the Galileo web service over HTTPS (HTTP secured with TLS). The LZ daemon must be run with sufficient permissions to create containers with the targeted container runtime or scheduling environment (e.g. Docker, Singularity, Slurm). Reauthentication is facilitated by a local authentication token file (provided by Auth0) written to the host. If the authentication token file is deleted, the user must log their LZ daemon back into their account the next time it is restarted. The LZ daemon executes jobs as stand-alone Docker containers (or Singularity containers). For more information on container security, see the official Docker Introduction to Container Security.
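The token-file behaviour described above can be pictured with a minimal Python sketch. The function name `load_or_login`, the `token_path` argument, and the `login` callback are hypothetical and chosen only for illustration; the actual LZ daemon may organise this differently.

```python
from pathlib import Path
from typing import Callable


def load_or_login(token_path: Path, login: Callable[[], str]) -> str:
    """Return a cached token, or run a fresh login and cache the result.

    Hypothetical sketch: `token_path` and `login` are illustrative names,
    not part of the real LZ daemon.
    """
    if token_path.is_file():
        # A token file exists, so the daemon can reauthenticate silently.
        return token_path.read_text(encoding="utf-8").strip()
    # No token file (for example, it was deleted): a new login is required.
    token = login()
    token_path.write_text(token, encoding="utf-8")
    return token
```

In this sketch, deleting the file at `token_path` forces the `login` callback to run on the next start, which mirrors the behaviour described in the text.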
Software and Applications
Galileo supports a suite of software and applications in the form of officially supported Mission Framework Types. Software environments like Python, Julia, and R Language as well as interactive applications like Jupyter Notebooks, PCSWMM, and QGIS are officially supported and can be configured through the Mission Configuration Wizard.
Administrators of computational resources can restrict what applications are allowed to be accessed through the customizable role settings in the Station Settings feature. Enterprise accounts can enforce a customizable default storage provider for all users in the organization, ensuring all input and output data files are stored on a specific storage solution controlled by the enterprise account owner.
The Galileo platform has two primary features in which role-based access control is available: “Missions” and “Stations.”
Missions are reusable code/simulation buckets where a Galileo user can upload data files in the form of input files, scripts, binaries, etc. Importantly, a Galileo Mission can be set up as a pre-configured framework “type.” For example, a Galileo Mission can be configured as a Python project, an R project, or a Gromacs project. If a user sets up a Mission as one of the pre-supported framework types, they do not have to supply their own Dockerfile, because Galileo produces it for them server-side at runtime. Additionally, users cannot manually edit the Dockerfile of a pre-configured framework type; its structure is strictly controlled by the framework definition. If a user does provide their own Dockerfile, Galileo identifies this Mission type as “user-defined.”
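The distinction between pre-configured and user-defined Missions can be sketched as a small classification function. This is an illustration only: `SUPPORTED_FRAMEWORKS` lists just the three examples named above, and `mission_type` is a hypothetical helper, not a Galileo API.

```python
from typing import Optional

# Only the example frameworks named in the text; the real list is longer.
SUPPORTED_FRAMEWORKS = {"python", "r", "gromacs"}


def mission_type(framework: Optional[str], user_dockerfile: Optional[str]) -> str:
    """Classify a Mission (hypothetical helper, not a Galileo API)."""
    if user_dockerfile is not None:
        # The user supplied a Dockerfile, so the Mission is user-defined.
        return "user-defined"
    if framework in SUPPORTED_FRAMEWORKS:
        # Pre-configured type: the Dockerfile is generated server-side.
        return framework
    raise ValueError("choose a supported framework or supply a Dockerfile")
```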
Within the context of a Mission, Galileo users can invite collaborators as role-based members. The role assigned to a member determines whether they have read access to the input data and results data and whether they have write/execute permission.
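As a rough sketch of how a role could map to these two permissions, consider the following Python model. The role names `viewer` and `collaborator` are assumptions made for illustration; the actual role names and their granularity in Galileo may differ.

```python
from enum import Flag, auto


class Permission(Flag):
    NONE = 0
    READ = auto()           # read input data and results data
    WRITE_EXECUTE = auto()  # modify the Mission and run jobs


# Hypothetical role names; Galileo's real roles may be named differently.
ROLE_PERMISSIONS = {
    "viewer": Permission.READ,
    "collaborator": Permission.READ | Permission.WRITE_EXECUTE,
}


def can(role: str, needed: Permission) -> bool:
    """Return True if `role` grants every permission in `needed`."""
    granted = ROLE_PERMISSIONS.get(role, Permission.NONE)
    return (granted & needed) == needed
```

Unknown roles fall back to `Permission.NONE`, so the check denies access by default.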
Missions can be used in tandem with the Cargo Bays feature. During the configuration stage of a new Mission, if the user chooses a non-default Cargo Bay (like Dropbox), all input and result files will be stored in that storage resource; the data will not persist in Galileo-hosted resources. If you lose access to your third-party storage provider, Hypernet Labs will not be able to recover your data. Deleting a Cargo Bay necessarily deactivates any Mission referencing that resource as a storage provider, but does not delete the data stored there.
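The effect of deleting a Cargo Bay can be modelled with a short Python sketch. The `Account` and `Mission` classes and their fields are hypothetical stand-ins used only to show the described behaviour: credentials are removed, dependent Missions are deactivated, and nothing touches the data held by the external storage provider.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Mission:
    name: str
    cargo_bay: str
    active: bool = True


@dataclass
class Account:
    # Maps Cargo Bay name -> encrypted API key (illustrative model only).
    credentials: Dict[str, str] = field(default_factory=dict)
    missions: List[Mission] = field(default_factory=list)

    def delete_cargo_bay(self, bay: str) -> None:
        # Removing the stored credentials revokes Galileo's access.
        self.credentials.pop(bay, None)
        # Missions that used this Cargo Bay are deactivated, not deleted.
        for mission in self.missions:
            if mission.cargo_bay == bay:
                mission.active = False
```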
Stations allow computational resources running the Galileo “Landing Zone” daemon to be shared with an arbitrary number of other Galileo users without the need to expose the resource to the wider internet or set up a VPN. Within the context of a Station, administrators can set:
Which Galileo users are members of the Station
Which Landing Zones are accessible through the Station
What Mission framework types are allowed to execute (thus determining which container base images are allowed to be pulled to the host machines)
Resource usage defaults, such as max CPU, GPU, and memory on a per-job and per-user basis
Daily, weekly, monthly, and yearly usage quotas
Custom user roles with associated role capabilities and resource/quota limits
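To make the interaction of these settings concrete, the following Python sketch checks a job request against a Station's framework allowlist and per-job resource limits. All class, field, and function names here are hypothetical; the sketch covers only CPU and memory limits and leaves out GPU limits and usage quotas for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class StationPolicy:
    allowed_frameworks: FrozenSet[str]
    max_cpus_per_job: int
    max_memory_gb_per_job: float


@dataclass(frozen=True)
class JobRequest:
    framework: str
    cpus: int
    memory_gb: float


def rejection_reason(policy: StationPolicy, job: JobRequest) -> Optional[str]:
    """Return why the job is rejected, or None if it may run."""
    if job.framework not in policy.allowed_frameworks:
        # Disallowed frameworks never get their base image pulled.
        return f"framework '{job.framework}' is not allowed in this Station"
    if job.cpus > policy.max_cpus_per_job:
        return "requested CPUs exceed the per-job maximum"
    if job.memory_gb > policy.max_memory_gb_per_job:
        return "requested memory exceeds the per-job maximum"
    return None
```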
User permissions are controlled via a role-based permission scheme. Custom roles can be created by administrators through the Station settings UI. The Station owner and administrators can control if a particular user role can:
Add/remove Landing Zones
Control the state of running or queued jobs within the Station context
Invite members to and remove members from the Station
Edit/assign role types
Edit the default per-job and per-user max resource usage
Edit the per-user max runtime quota
Control which framework types are allowed to run within a Station context
Run interactive Mission types
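The capabilities listed above lend themselves to a simple role model, sketched below in Python. The capability identifiers and the `Role` class are assumptions introduced for illustration and do not reflect Galileo's internal naming.

```python
from dataclasses import dataclass
from typing import FrozenSet

# One identifier per capability in the list above (names are hypothetical).
CAPABILITIES = frozenset({
    "manage_landing_zones",
    "control_jobs",
    "manage_members",
    "edit_roles",
    "edit_resource_defaults",
    "edit_runtime_quota",
    "edit_allowed_frameworks",
    "run_interactive_missions",
})


@dataclass(frozen=True)
class Role:
    name: str
    capabilities: FrozenSet[str]

    def __post_init__(self) -> None:
        # Reject misspelled or unsupported capability names early.
        unknown = self.capabilities - CAPABILITIES
        if unknown:
            raise ValueError(f"unknown capabilities: {sorted(unknown)}")


def is_allowed(role: Role, capability: str) -> bool:
    """Return True if the role includes the given capability."""
    return capability in role.capabilities
```

A custom role is then just a named subset of capabilities, for example `Role("operator", frozenset({"control_jobs", "run_interactive_missions"}))`.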