Current Features of OrangeFS
All OrangeFS requests involve objects, which hold file data, file and directory metadata, directory entries, or symbolic links. Every object’s unique handle, along with other metadata, is contained in a set of key/value pairs. Each object also includes a bytestream to hold actual file data.
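The object model described above can be pictured with a minimal sketch. The class name `StorageObject`, its field names, the handle values, and the `"type"` and `"entries"` keys are all hypothetical and chosen only for illustration; they are not OrangeFS identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class StorageObject:
    """Hypothetical model of one object: a handle, key/value pairs, a bytestream."""
    handle: int
    keyvals: dict = field(default_factory=dict)
    bytestream: bytearray = field(default_factory=bytearray)

    def write(self, offset: int, data: bytes) -> None:
        # Grow the bytestream with zero bytes if the write lands past its end.
        end = offset + len(data)
        if end > len(self.bytestream):
            self.bytestream.extend(b"\x00" * (end - len(self.bytestream)))
        self.bytestream[offset:end] = data

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.bytestream[offset:offset + length])


# A data object holds file bytes; a directory object holds entries as key/value pairs.
datafile = StorageObject(handle=0x1001, keyvals={"type": "datafile"})
datafile.write(0, b"hello")
directory = StorageObject(
    handle=0x2001,
    keyvals={"type": "directory", "entries": {"a.txt": datafile.handle}},
)
```

In this sketch a directory entry is simply a name mapped to another object's handle, so a client never needs to know where the bytes physically live.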
Object-based storage is like letting a valet park your car for you: you never need to know where the car is. In OrangeFS, the metadata server is your data valet.
Separation of Data and Metadata
After accessing one of the Metadata Servers once for a file’s location, an OrangeFS Client can then interface directly with the data servers, eliminating a major bottleneck.
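A toy illustration of this access pattern follows. The classes `MetadataServer` and `Client`, the server names `io0` and `io1`, and the representation of a file's location as an ordered list of server names are assumptions made for the sketch, not the OrangeFS protocol.

```python
class MetadataServer:
    """Hypothetical metadata server: maps a path to the data servers holding it."""

    def __init__(self, layouts):
        self.layouts = layouts
        self.lookups = 0  # counts how often clients had to ask

    def lookup(self, path):
        self.lookups += 1
        return self.layouts[path]


class Client:
    """Hypothetical client that asks for a file's location once, then caches it."""

    def __init__(self, metadata_server, data_servers):
        self.mds = metadata_server
        self.data_servers = data_servers  # server name -> {path: bytes}
        self.layout_cache = {}

    def read(self, path):
        if path not in self.layout_cache:
            # The only contact with the metadata server for this file.
            self.layout_cache[path] = self.mds.lookup(path)
        # Every read goes straight to the data servers.
        return b"".join(
            self.data_servers[name][path] for name in self.layout_cache[path]
        )


data_servers = {"io0": {"/a": b"par"}, "io1": {"/a": b"allel"}}
mds = MetadataServer({"/a": ["io0", "io1"]})
client = Client(mds, data_servers)
first = client.read("/a")
second = client.read("/a")
```

The second read never touches the metadata server, which is the bottleneck the text says is eliminated.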
MPI programs can supply a description of the data based on MPI_Datatypes, which can describe complex non-contiguous patterns of data and ultimately enable MPI file views. This allows highly efficient access to file data for parallel applications.
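To give a feel for what a non-contiguous description means, here is a small pure-Python sketch that flattens a strided pattern into byte regions. It is loosely modeled on the count, block length, and stride parameters of MPI's `MPI_Type_vector`, but it is not the MPI API; the helper names `flatten_vector` and `read_regions` are hypothetical.

```python
def flatten_vector(count, blocklength, stride, elem_size, base_offset=0):
    """Return (offset, length) byte regions for `count` blocks of `blocklength`
    elements whose starts are `stride` elements apart."""
    if blocklength > stride:
        raise ValueError("blocks would overlap: blocklength exceeds stride")
    return [
        (base_offset + i * stride * elem_size, blocklength * elem_size)
        for i in range(count)
    ]


def read_regions(data, regions):
    """Gather the described regions from a flat byte buffer."""
    return b"".join(data[off:off + length] for off, length in regions)


# Column 1 of a 4x4 matrix of 1-byte elements stored row by row.
matrix = bytes(range(16))
column1 = read_regions(
    matrix,
    flatten_vector(count=4, blocklength=1, stride=4, elem_size=1, base_offset=1),
)
```

A single compact description like this stands in for four separate small reads, which is what lets a file system service the whole pattern in one request.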
Multiple Network Support
OrangeFS uses a networking layer named BMI, which provides a non-blocking message interface designed specifically for file systems. BMI has implementation modules for several networks used in high-performance computing, including TCP/IP, Myrinet, InfiniBand, and Portals.
OrangeFS servers do not share any state with each other or with clients. If a server crashes, another can easily be restarted in its place.
OrangeFS clients and servers run at the user level, and kernel modifications are not needed. An optional kernel module allows OrangeFS to be mounted like any other file system, or programs can link directly to a user interface such as MPI-IO or a POSIX-like interface. All this makes OrangeFS easier to install and less prone to causing system crashes.
The OrangeFS interface integrates at the system level. Its similarities with the Linux VFS make it easy to implement as a mountable file system, but it is equally adaptable to user-level interfaces such as MPI-IO or POSIX-like interfaces. Because it exposes many features of the underlying file system, other interfaces can take advantage of them.
OrangeFS uses server-to-server collective communication to improve the scalability of metadata operations. OrangeFS is implementing distributed directories to make directory operations more scalable. Another project is developing the ability to search metadata as an alternative to traditional path look-up for SSD support.
Small, Unaligned Accesses
The OrangeFS team is developing middleware-driven caching on the client side, including configurable semantics that provide a tradeoff between performance and consistency management.
WebDAV & S3
To make data easier to reach, OrangeFS offers several access interfaces, including cross-platform, user-level access via WebDAV and S3.
Scalable, Portable, Flexible
Disk performance of better than 1 GB/sec has been achieved on Linux clusters using standard IDE hard disks. OrangeFS brings state-of-the-art parallel I/O concepts to production parallel systems. It is designed to scale to petabytes of storage and provide access rates of hundreds of GB/sec.
OrangeFS relaxes POSIX consistency semantics where necessary to improve stability and performance. Cluster file systems that enforce POSIX consistency require stateful clients with locking subsystems, reducing the stability of the system in the face of failures. These systems can be difficult to maintain due to the overhead of lock management.
Optimized MPI-IO Support
OrangeFS is designed to support a number of access models, from collective I/O to independent I/O as well as non-contiguous and structured access patterns. OrangeFS provides an object-based, stateless client interface, leading to optimizations for metadata operations within MPI-IO.
OrangeFS operates on a wide variety of systems, including IA32, IA64, Opteron, PowerPC, Alpha, and MIPS. It is easily integrated into local or shared storage configurations, and provides support for high-end networking fabrics, such as InfiniBand and Myrinet.
OrangeFS is easy to deploy and manage. It builds out-of-the-box on a wide variety of Linux distributions and has only a few required dependencies. OrangeFS builds and runs directly on your Linux installation and requires no kernel patches or specific kernel versions.
A multi-institution team of parallel I/O, networking and storage experts develops OrangeFS. It embodies the expertise of designers who have worked for over a decade in the field of parallel I/O. OrangeFS remains a platform for active research in the parallel I/O field, and the code is designed to be easy to augment for research purposes.
Secure Access Control
The OrangeFS team has developed an internal security method based on signed credentials and capabilities. It is designed to work with federated authentication mechanisms while maintaining high-performance characteristics. Current research is focusing on flexible access control schemas beyond simple userID-based permissions.
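The general idea of a signed capability can be sketched as follows. The text does not specify the signature scheme, so this example uses a shared-secret HMAC from the Python standard library purely as a stand-in; the function names, the capability fields (`handle`, `ops`, `expires`), and the demo key are all hypothetical.

```python
import hashlib
import hmac
import json


def _payload(handle, ops, expires):
    # Canonical byte form of the fields covered by the signature.
    body = {"handle": handle, "ops": ops, "expires": expires}
    return json.dumps(body, sort_keys=True).encode()


def sign_capability(key, handle, ops, expires):
    """Issue a capability: permitted operations on one object until `expires`."""
    ops = sorted(ops)
    sig = hmac.new(key, _payload(handle, ops, expires), hashlib.sha256).hexdigest()
    return {"handle": handle, "ops": ops, "expires": expires, "sig": sig}


def check_capability(key, cap, handle, op, now):
    """Verify the signature, then the object, operation, and expiry."""
    expected = hmac.new(
        key, _payload(cap["handle"], cap["ops"], cap["expires"]), hashlib.sha256
    ).hexdigest()
    return (
        hmac.compare_digest(expected, cap.get("sig", ""))
        and cap["handle"] == handle
        and op in cap["ops"]
        and now < cap["expires"]
    )


key = b"demo-key"  # assumption: a secret shared by issuer and verifier
cap = sign_capability(key, handle=0x1001, ops=["read"], expires=1000)
tampered = dict(cap, ops=["read", "write"])
```

Because the verifier only needs the capability and a key, it can check each request locally without consulting a central authority, which is one way such a scheme can stay compatible with high throughput.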
The OrangeFS project is working to be production-ready, which includes improvements to the documentation base for all of OrangeFS.
OrangeFS provides a Windows client to allow Windows systems to seamlessly connect to the file system. Client applications can use files on OrangeFS in the same way as local files.
OrangeFS developers have created a fast, efficient way to use OrangeFS natively with the Hadoop ecosystem. This integration can be applied in traditional HPC workflows to provide high-performance data analysis.