TrinityX is the latest release of our open-source HPC cluster ecosystem, designed to make your cluster agile, reliable, and responsive to your individual needs.
TrinityX is modular; among its modules are Docker, OpenStack and cluster partitioning. OpenStack is of growing interest within computational life sciences due to its availability, wide industry support and growing user base. For traditional HPC installations, TrinityX adds the possibility for users to request a complete machine and install their own images or packages.
Docker is also becoming an important addition to HPC installations, where it is used to package applications together with their dependencies. This benefits research in two ways: applications are easier to move between environments, and results are easier to replicate, because the same application can still be run 5 to 10 years in the future to validate findings.
TrinityX is based on Luna, our in-house provisioning tool used for the discovery and deployment of nodes. Luna is based on the BitTorrent protocol. Images are distributed by peer-to-peer transfer from the controller and among the compute nodes, greatly reducing the time required for discovery and deployment of nodes. Booting a 500-node cluster goes from taking hours to taking just 5 minutes.
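As a rough illustration of why peer-to-peer distribution scales so well, the Python sketch below compares a controller that serves each node in turn with a scheme in which every machine that already holds the image passes it on to one more node per round. This is a toy model under simplifying assumptions (whole-image transfers, identical links, one upload per machine per round), not Luna's actual implementation; the function names are hypothetical. Real BitTorrent splits the image into pieces so that nodes can upload while they are still downloading.

```python
def sequential_transfers(nodes: int) -> int:
    """Transfers needed when the controller sends the image to each node in turn."""
    return nodes


def p2p_rounds(nodes: int) -> int:
    """Rounds needed when every image holder seeds one new node per round.

    The controller is the only holder at the start, so the number of
    holders doubles each round until all nodes are covered.
    """
    holders = 1  # the controller
    rounds = 0
    while holders < nodes + 1:  # +1 because the controller counts as a holder
        holders *= 2
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 100, 500):
        print(f"{n} nodes: {sequential_transfers(n)} sequential transfers, "
              f"{p2p_rounds(n)} peer-to-peer rounds")
```

In this model the number of rounds grows with the logarithm of the cluster size, whereas the controller-only approach grows linearly with it.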
Zabbix is the built-in tool for monitoring and reporting. The intelligent combination of all these tools allows us to manage and deliver virtual HPC or OpenStack environments as Infrastructure as a Service (IaaS). TrinityX is primarily developed to meet the demands of high-performance computing (HPC) clusters, but it is also flexible enough to make other applications in traditional data centres more efficient.
The following features are supported in TrinityX:
- Scalable to tens of thousands of nodes
- Parallel command execution with PDSH
- Shared storage for essential configuration files and home directories
- Our lightning-fast provisioning tool that can get 400 nodes up and running in under 5 minutes
- Full hardware integration (InfiniBand, Omni-Path, PXE, Intel Broadwell, IPMI, switches, etc.)
- Complete HPC user environment (modules environment, scientific libraries, compilers, profilers, debuggers)
- MPI libraries: Open MPI, Intel MPI, and optional MPICH/MVAPICH/MVAPICH2
- The Simple Linux Utility for Resource Management (SLURM) preconfigured to make full use of a cluster. Optional support for PBS, Torque, Moab, LSF & SGE.
- Full HPC performance using the optional Docker-based application containerization
- High availability for controllers, storage, and login nodes
- Parallel filesystem support: Lustre, IBM Spectrum Scale (GPFS), and BeeGFS.
- An integrated authentication system that can be plugged into existing back-ends with minor tweaks
- A comprehensive monitoring and metering system to keep track of critical events and resource usage
- Node-to-switch integration with automatic discovery
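To give a feel for the PDSH parallel command listed above, the Python sketch below expands a pdsh-style host range and builds the matching `pdsh -w` command line without running it. The node names and the helper functions are hypothetical and only handle a single bracketed range; pdsh itself accepts richer host lists.

```python
import re
import shlex
from typing import List


def expand_hostlist(pattern: str) -> List[str]:
    """Expand one bracketed range such as 'node[001-004]', keeping zero padding."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        return [pattern]  # no range present: treat it as a single host name
    prefix, start, end, suffix = match.groups()
    width = len(start)  # width of the zero-padded counter
    return [f"{prefix}{i:0{width}d}{suffix}"
            for i in range(int(start), int(end) + 1)]


def pdsh_command(hosts: str, remote_command: str) -> List[str]:
    """Build the argument list for running one command on all listed hosts."""
    return ["pdsh", "-w", hosts, remote_command]


if __name__ == "__main__":
    print(expand_hostlist("node[001-004]"))
    print(shlex.join(pdsh_command("node[001-004]", "uptime")))
```

Passing the resulting list to `subprocess.run` on a machine with pdsh installed would run `uptime` on every node in the range at once.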
TrinityX with OpenStack™
- Full control of environment customization for users with different requirements
- Ease of management and usage metering
- Allows customers to host their own private or public IaaS cloud (for general IT, selling cycles)
- Ability to re-partition the cluster according to current internal demands.
We maintain a roadmap backlog covering the coming 2 years. The priorities for major changes are determined 12 months before a major release. Both the priorities and the backlog are based on customer and engineering feedback. The next major release is planned for Q2 2017. Throughout the year we consult with our customers, presales, sales and engineering teams on what makes sense to include in upcoming minor or major releases.
A non-exhaustive list of features we would like to see in the upcoming releases:
- Dynamic scheduling of nodes between OpenStack™ and HPC partition
- Automated and integrated cloud bursting
- Support for more services from the OpenStack™ ecosystem
- Gluster support for HPC and OpenStack™
- OpenStack™ support for shared filesystems (e.g. OpenStack Manila)
- Further support for Docker applications, abstracting IB/OPA/40G devices