The Leading Parallel Cluster File System
What is BeeGFS?
BeeGFS (formerly FhGFS - developed at the Fraunhofer Institute for Industrial Mathematics ITWM) is the leading parallel cluster file system, developed with a strong focus on performance and designed for very easy installation and management. If I/O intensive workloads are your problem, BeeGFS is the solution.
Why use BeeGFS?
BeeGFS transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can simply scale performance and capacity of the file system to the level that you need, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.
Get The Most Out Of Your Data
The flexibility, robustness, and outstanding performance of BeeGFS help our customers around the globe to increase productivity by delivering results faster and by enabling new data analysis methods that were not possible without the advantages of BeeGFS.
BeeGFS offers maximum performance and scalability on various levels. It supports distributed file contents with flexible striping across storage servers on a per-file or per-directory basis as well as distributed metadata. BeeGFS is optimized especially for use in environments where performance matters to provide:
Best in class client throughput: 8 GB/s with only a single process streaming on a 100 Gbit/s network, while a few streams can fully saturate the network.
Best in class metadata performance: Linear scalability through dynamic metadata namespace partitioning.
Best in class storage throughput: BeeGFS servers allow flexible choice of underlying file system to perfectly fit the given storage hardware.
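To make the striping idea above concrete, here is a minimal sketch of generic round-robin striping: a file is cut into fixed-size chunks that are dealt out to storage targets in turn. It illustrates the general technique only and is not taken from the BeeGFS implementation; the function name `locate_chunk` and its parameters are hypothetical.

```python
def locate_chunk(offset, chunk_size, num_targets):
    """Map a file byte offset to (target_index, offset_on_target).

    Assumes plain round-robin striping: chunk 0 goes to target 0,
    chunk 1 to target 1, and so on, wrapping around after the last target.
    This is an illustrative model, not the actual BeeGFS layout code.
    """
    if offset < 0 or chunk_size <= 0 or num_targets <= 0:
        raise ValueError("offset must be >= 0; chunk_size and num_targets must be > 0")

    chunk_number = offset // chunk_size          # which chunk of the file
    target_index = chunk_number % num_targets    # which target holds that chunk
    stripe_row = chunk_number // num_targets     # how many chunks precede it on that target
    offset_on_target = stripe_row * chunk_size + offset % chunk_size
    return target_index, offset_on_target


if __name__ == "__main__":
    chunk = 512 * 1024  # assumed chunk size for the example
    for off in (0, chunk, 4 * chunk + 10):
        print(off, locate_chunk(off, chunk, num_targets=4))
```

Because consecutive chunks land on different targets, a large sequential read or write touches all targets in parallel, which is why adding servers raises throughput in this model.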
BeeGFS Storage Pools make different types of storage devices available within the same namespace. With SSDs and HDDs in different pools, pinning a user project to the flash pool gives the current project all-flash storage performance while still providing the cost-efficient high capacity of spinning disks for other data.
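The pinning described above can be pictured as a lookup from directory to pool. The sketch below is a simplified model under stated assumptions: pools are assigned per directory, the closest pinned ancestor decides, and everything else falls back to a default pool. The function `pool_for_path`, the pool names, and the example paths are hypothetical and do not reflect the BeeGFS API.

```python
from pathlib import PurePosixPath


def pool_for_path(path, pool_by_directory, default_pool="default"):
    """Return the storage pool for a file path.

    pool_by_directory maps pinned directories to pool names. The path
    itself and then each ancestor directory is checked, so the closest
    pinned directory wins. Illustrative model only.
    """
    p = PurePosixPath(path)
    for candidate in [p, *p.parents]:
        pool = pool_by_directory.get(str(candidate))
        if pool is not None:
            return pool
    return default_pool


if __name__ == "__main__":
    pins = {"/projects": "hdd", "/projects/genome": "flash"}
    print(pool_for_path("/projects/genome/run1/a.dat", pins))
    print(pool_for_path("/projects/archive/b.dat", pins))
    print(pool_for_path("/home/alice/notes.txt", pins))
```

In this model both pools share one namespace: moving a project between flash and disk changes the pin, not the paths that users and applications see.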
BeeGFS supports a wide range of Linux distributions such as RHEL/Fedora, SLES/openSUSE or Debian/Ubuntu as well as a wide range of Linux kernels from ancient 2.6.18 up to the latest vanilla. The storage services run on top of an existing local file system (such as xfs, zfs or others) using the normal POSIX interface, and clients and servers can be added to an existing system without downtime. BeeGFS supports multiple networks and dynamic failover in case one of the network connections is down.
BeeGFS client and server components can also run on the same physical machines. Thus, BeeGFS can turn a compute rack into a cost-efficient converged data processing and shared storage unit, eliminating the need for external storage resources and providing simplified management.
BeeGFS On Demand
The BeeGFS server components are userspace daemons, while the client is a native kernel module that does not require any patches to the kernel itself. All BeeGFS components can be installed and updated without even rebooting the machine.
For installation and updates there are rpm/deb package repositories available; for the startup mechanism, easy-to-use system service scripts are provided.
BeeGFS was designed with easy administration in mind. The graphical administration and monitoring system enables dealing with typical management tasks in a simple and intuitive way, while everything is of course also available from a command line interface:
Live load statistics, even for individual users
Storage service management
Excellent documentation helps to have the whole system up and running in one hour.
BeeOND (BeeGFS on demand) allows on-the-fly creation of a complete parallel file system instance on a given set of hosts with a single command.
BeeOND was designed to integrate with cluster batch systems to create temporary parallel file system instances on a per-job basis on the internal SSDs of the compute nodes that are part of a compute job. Such BeeOND instances not only provide a very fast and easy-to-use temporary buffer, but can also keep a lot of I/O load for temporary or random-access files away from the global cluster storage.
BeeGFS storage servers are typically used with an underlying RAID to transparently handle disk errors. BeeGFS can also be used with shared storage to handle server failures. The built-in BeeGFS Buddy Mirroring approach goes one step further by tolerating the loss of complete servers, including all data on their RAID volumes, using only commodity servers and shared-nothing hardware.
The built-in BeeGFS Buddy Mirroring automatically replicates data, handles storage server failures transparently for running applications and provides automatic self-healing when a server comes back online, efficiently resyncing only the files that have changed while the machine was offline.
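The "resync only what changed" idea can be sketched as a simple filter over modification times. This is a generic illustration, not the actual Buddy Mirroring mechanism, whose internals are not described here; the function `files_to_resync`, the timestamp representation, and the example paths are assumptions.

```python
def files_to_resync(primary_mtimes, offline_since):
    """Return the paths that must be copied to a returning mirror.

    primary_mtimes maps file paths on the surviving server to their last
    modification time (seconds since the epoch). offline_since is the
    time at which the other server went offline. Only files modified at
    or after that moment need to be transferred. Illustrative model only.
    """
    return sorted(
        path for path, mtime in primary_mtimes.items() if mtime >= offline_since
    )


if __name__ == "__main__":
    surviving = {"/data/a": 100.0, "/data/b": 250.0, "/data/c": 300.0}
    # The buddy went offline at t=200, so only b and c changed meanwhile.
    print(files_to_resync(surviving, offline_since=200.0))
```

The benefit in this model is that recovery time scales with the amount of data written during the outage rather than with the total size of the mirrored volume.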
BeeGFSv7: Newest features
Free space balancing when adding new hardware
Metadata event logging enhancement
Kernel 4.14, 4.15 support
NIC handling during startup
We already talked about the key aspects of BeeGFS (scalability, flexibility, and usability) and what's behind them. But there are many more features in BeeGFS:
Runs on various platforms, such as x86, OpenPOWER, ARM, Xeon Phi and more
Re-export through Samba or NFSv4 possible
Support for user/group quota and ACLs
Fair I/O option on user level to prevent a single user with multiple requests from stalling requests of other users
Automatic network failover, e.g. if InfiniBand is down, BeeGFS automatically switches to Ethernet and back later
Online file system sanity check that can analyze and repair while the system is in use
Built-in benchmarking tools to help with optimal tuning for specific hardware and evaluate hardware capabilities
Support for cloud deployment on e.g. Amazon EC2 or Microsoft Azure
Happy Users World-Wide
BeeGFS is widely popular among universities and the global research community, powering some of the fastest supercomputers in the world to help scientists analyze large amounts of data efficiently every day.
BeeGFS is the parallel file system of choice in life sciences. The fast-growing amount of genomics data that must be stored and analyzed quickly in fields like Precision Medicine makes BeeGFS the first choice for our customers.
Finance, Oil & Gas, Media, Automotive, ...
BeeGFS is used in many different industries all around the globe to provide fast access to storage systems of all kinds and sizes, from small scale up to enterprise-class systems with thousands of hosts.