iRODS Summary
A brief report on our experience with iRODS, as well as recommendations for PDS and possible futures with the system.
This document describes our experience to date with iRODS.
Overall Observations
The "i" Rule Oriented Data System, or iRODS, promises to be a next-generation middleware for large-scale, real-time digital archives. Developed by Data Intensive Cyber Environments Research and various collaborators, its rule-oriented approach promises to manage workflow, resources, services, and archive policies.
Although grid-based data-centric systems are few in number, and the majority are still academic projects, iRODS does distinguish itself as a viable contender for the Planetary Data System (PDS). In particular:
- Compilation and installation are dead simple. Other grid systems come with hefty setup guides and require an almost arcane level of operating system knowledge.
- Configuration is surprisingly easy. A single five-line configuration file is all it takes. Contending systems give system administrators migraines as they puzzle out dozens of settings in multiple configuration files.
- Running iRODS is immediate and obvious. iRODS comes with an entire array of executable commands (called "i-commands") that parallel the standard set of Unix commands. Other grid systems fail even to provide such executables, instead giving only a framework with which developers must later create such programs.
iRODS also stands out by being based on a series of micro-services orchestrated by an adaptive rules engine. Rather than creating a high-level service such as "upload a file", it provides a series of fine-grained services that together achieve the goal of uploading a file. Such a goal may involve authentication, selection of a target repository, identification of replica repositories, transfer of blocks in a multithreaded fashion, computation of a verification message digest, metadata population, post-processing for derived products, and so forth. The set of rules and services is flexible and may be updated for domain-specific needs.
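To make the idea concrete, here is a minimal Python sketch of fine-grained services chained by a rule. It does not use the iRODS rule language or its micro-service API; every function name, the `ctx` dictionary, and the repository labels are assumptions invented for this illustration.

```python
import hashlib

# Hypothetical micro-services: each one takes a shared context dict,
# performs one small step, and returns the updated context.
def authenticate(ctx):
    if not ctx.get("user"):
        raise PermissionError("no user supplied")
    ctx["authenticated"] = True
    return ctx

def select_target(ctx):
    # First configured repository is the target; the rest hold replicas.
    repos = ctx["repositories"]
    ctx["target"] = repos[0]
    ctx["replicas"] = repos[1:]
    return ctx

def compute_digest(ctx):
    ctx["md5"] = hashlib.md5(ctx["data"]).hexdigest()
    return ctx

def populate_metadata(ctx):
    ctx["metadata"] = {"size": len(ctx["data"]), "md5": ctx["md5"]}
    return ctx

# In this sketch a "rule" is just an ordered list of micro-services.
# Changing the list changes the policy without touching the services.
UPLOAD_RULE = [authenticate, select_target, compute_digest, populate_metadata]

def apply_rule(rule, ctx):
    for service in rule:
        ctx = service(ctx)
    return ctx

result = apply_rule(UPLOAD_RULE, {
    "user": "alice",
    "repositories": ["repo-a", "repo-b", "repo-c"],
    "data": b"example payload",
})
```

The point of the design is that a domain-specific need, such as an extra post-processing step, becomes one more entry in the list rather than a change to a monolithic "upload" service.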
Performance of iRODS is in line with that of contending systems. As gigabyte-scale data sets become the norm for scientific research, grid system developers have realized the inherent shortcomings of TCP/IP and have developed adaptations. The most prevalent adaptations are:
- Using a customized UDP/IP approach where datagrams are scattered/gathered at high speed.
- Using multi-threaded, concurrent, parallel TCP/IP connections.
iRODS uses the latter adaptation.
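One common way to use several parallel connections is to give each connection its own contiguous byte range of the file. The Python sketch below shows only that partitioning step; it is an assumption about how such a scheme can work in general, not a description of how iRODS actually divides a file, and `split_ranges` is a hypothetical helper.

```python
def split_ranges(size, streams):
    """Divide `size` bytes into at most `streams` contiguous ranges.

    Returns a list of (start, end) pairs with `end` exclusive.
    Ranges differ in length by at most one byte; empty ranges are dropped.
    """
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams must be >= 1")
    base, extra = divmod(size, streams)
    ranges = []
    start = 0
    for i in range(streams):
        # The first `extra` ranges absorb the remainder, one byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((start, start + length))
        start += length
    return ranges
```

Each range would then be read and sent on its own connection. Note that a file smaller than the number of streams yields fewer ranges, which is one reason small files gain little from parallel transfer.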
Reliability and Performance
Of paramount concern to PDS are the rate at which data can be moved between iRODS systems and the integrity of that data. The inventory of data sets within PDS ranges from the minuscule to the gargantuan, and our tests of iRODS have exercised a number of sizes.
In general, we've found that collections containing fewer large files transfer more quickly than those that contain many small files. The reason for this is simple: iRODS establishes a fresh transfer for each file sent. With many small files, the overhead of each transfer (file name, destination directory, computation of parallel transfer parameters, etc.) is large relative to the data sent. Further, the benefits of parallel, multi-threaded TCP/IP connections are lost with small files. However, a single large file has low overhead compared to the file size, and parallel TCP/IP connections do yield a boost in performance.
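A back-of-the-envelope model shows why per-file overhead matters. The sketch below assumes a fixed setup cost per file plus time proportional to the bytes sent; the 0.5 s overhead and 10 MiB/s link speed are made-up illustrative values, not measurements from our tests.

```python
MIB = 1024 * 1024

def effective_rate(total_bytes, n_files, overhead_s, bandwidth_bps):
    """Effective throughput in bytes/s when every file pays a fixed setup cost."""
    total_time = n_files * overhead_s + total_bytes / bandwidth_bps
    return total_bytes / total_time

total = 1024 * MIB        # 1 GiB of data in both scenarios
link = 10 * MIB           # assumed raw link speed: 10 MiB/s
overhead = 0.5            # assumed per-file setup cost in seconds

one_file = effective_rate(total, 1, overhead, link) / MIB
many_files = effective_rate(total, 10_000, overhead, link) / MIB
print(f"1 file:       {one_file:.2f} MiB/s")
print(f"10,000 files: {many_files:.2f} MiB/s")
```

Under these assumed numbers the setup cost is negligible for the single file but dominates the total time for the many-file case.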
This leads us to make the following recommendation: Where possible, always send or receive an archive of data. It doesn't matter whether the archive is of the cpio, tar, or zip formats, and it doesn't matter whether the archive is compressed. Transfer a single file wherever possible.
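As one way to follow this recommendation, the sketch below bundles a directory tree into a single uncompressed tar archive using Python's standard `tarfile` module. The `bundle` helper and the paths in the usage line are hypothetical; any archiving tool would serve equally well.

```python
import tarfile
from pathlib import Path

def bundle(directory, archive_path):
    """Write everything under `directory` into one uncompressed tar archive."""
    directory = Path(directory)
    # Mode "w" writes a plain tar file; "w:gz" would add gzip compression.
    with tarfile.open(str(archive_path), "w") as tar:
        # arcname keeps paths in the archive relative to the dataset directory.
        tar.add(str(directory), arcname=directory.name)
    return Path(archive_path)

# Hypothetical usage: bundle("my_dataset", "my_dataset.tar")
```

The resulting archive can then be transferred as a single file, for instance with the `iput` i-command.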
Performance versus Conventional TCP/IP
To compare the performance of iRODS against conventional TCP/IP file transfer such as FTP or HTTP, we created a large file of random data, then sent it to and retrieved it from a single iRODS server at the San Diego Supercomputer Center (SDSC) from various locations in the US. Using a single large file lets iRODS leverage its multithreaded transfer capability to greatly accelerate the movement of data. (All tests were conducted with iRODS 1.1.)
This diagram shows the systems tested:

The following table (and gratuitous graph) summarizes the results:
| System | iRODS throughput (MiB/s) | Conventional throughput (MiB/s) |
|---|---|---|
| Commercial data center | 6.92 | 1.47 |
| Engineering/Imaging node | 9.74 | 7.58 |
| Geo node | 9.65 | 0.74 |
| PPI node | 11.16 | 9.24 |
| Small Bodies node | 8.00 | 0.68 |
iRODS clearly provides enormous benefit to systems where conventional file transfer performs poorly, and less benefit to systems closer to SDSC. (In the case of the Geo node, we believe a packet-shaping firewall causes conventional transfers to suffer high latency. For the Small Bodies node, the number of network hops to SDSC may be quite large.)
Bear in mind that these tests are for a single large file. PDS data sets, unless archived together into a large file, won't feel such transfer benefits.
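The speedup implied by the single-file throughput table can be worked out with a few lines of Python. The figures are copied from that table; the dictionary layout and variable names are only for illustration.

```python
# (iRODS MiB/s, conventional MiB/s) for each system in the throughput table.
throughput = {
    "Commercial data center": (6.92, 1.47),
    "Engineering/Imaging node": (9.74, 7.58),
    "Geo node": (9.65, 0.74),
    "PPI node": (11.16, 9.24),
    "Small Bodies node": (8.00, 0.68),
}

# Speedup is the ratio of iRODS throughput to conventional throughput.
speedup = {name: irods / conv for name, (irods, conv) in throughput.items()}

# Print the systems from largest to smallest speedup.
for name, ratio in sorted(speedup.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.1f}x")
```

The Geo and Small Bodies nodes, where conventional transfer is slowest, come out with ratios above 10, while the nodes with fast conventional transfers gain well under a factor of two.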
Transferring PDS Data Sets
Sending an unbundled collection (directory hierarchy) of many differently sized files will be slower than sending a single large archive. However, for the sake of completeness, we ran tests transferring collections from a variety of PDS nodes using iRODS. These tests ran iRODS at 1, 4, and 8 parallel threads (with 1 thread being equivalent to doing a recursive FTP). The following map shows the nodes involved in the test:

Note: Because these tests take quite a bit of time to run, the results presented below are not yet complete.
The following table (and chart) summarizes the transfers of unbundled datasets from various PDS nodes to SDSC:
| Transfer | 1 thread (MiB/s) | 4 threads (MiB/s) | 8 threads (MiB/s) |
|---|---|---|---|
| Atmos 1G | 0.074 | 0.12 | 0.053 |
| Atmos 10G | | | |
| Atmos 20G | | | |
| Atmos 1T | | | |
| Eng 1G | 1.14 | 1.96 | 1.76 |
| Eng 10G | | | |
| Eng 20G | | | |
| Eng 1T | | | |
| PPI 1G | 2.16 | 1.91 | 1.91 |
| PPI 10G | 1.74 | 1.77 | 1.35 |
| PPI 20G | | | |
| PPI 1T | | | |
| Rings 1G | 0.76 | 0.59 | 0.29 |
| Rings 10G | | | |
| Rings 20G | | | |
| Rings 1T | | | |
| Small Bodies 1G | 0.46 | 0.45 | 0.56 |
| Small Bodies 10G | | | |
| Small Bodies 20G | | | |
| Small Bodies 1T | | | |
As can be seen, multithreading has almost no effect on an unbundled dataset. In fact, we're questioning why we must even continue running such tests. Both conceptually and from the above evidence, the transfer of unbundled datasets is utterly pointless. Many small files will receive no benefit from multithreaded parallel transfers, and much of the communication overhead is spent setting up and tearing down each file's individual transfer. It is far superior to bundle a dataset into an archive and send that instead. However, we will continue to gather this data as we have been instructed.
When we instead create a single archive file of all the files and directories that comprise a PDS dataset and send that, iRODS improves performance enormously. The following table (and gratuitous graph) summarizes the results.
| Node | Size (bytes) | Unbundled time (s) | Bundled time (s) | Unbundled rate (MiB/s) | Bundled rate (MiB/s) | Improvement |
|---|---|---|---|---|---|---|
| PPI | 962,007,040 | 385.14 | 88.41 | 2.38 | 10.38 | 335.63% |
| PPI | 10,581,995,520 | 4297.46 | 941.19 | 2.35 | 10.72 | 356.60% |
| Small Bodies | 1,008,431,104 | 1929.54 | 233.91 | 0.50 | 4.11 | 724.91% |
| Rings | 11,543,992,320 | 23127.24 | 6608.52 | 0.48 | 1.67 | 249.96% |
| Atmospheres | 9,619,989,504 | 84164.22 | 8327.10 | 0.11 | 1.10 | 910.73% |
On each node we sent an unbundled (many files in a directory tree) dataset to SDSC. We then created a TAR-format file archive of the dataset and sent it to SDSC. In both cases we let iRODS choose the number of concurrent threads for data transfer. Because of the boost in transfer performance, we recommend bundled transfers wherever possible. Note that the iRODS server does not yet support automatic unbundling upon receipt of an archive. That feature is expected in iRODS 1.2.
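The rate and improvement columns in the bundled-versus-unbundled table can be reproduced from the size and time columns. The sketch below assumes that 1 MiB is 1,048,576 bytes and that "Improvement" means the unbundled time divided by the bundled time, minus one, expressed as a percentage; that definition is inferred from the published figures rather than stated anywhere in the tests. The function names are hypothetical.

```python
MIB = 1024 * 1024

def rate_mib_s(size_bytes, seconds):
    """Average transfer rate in MiB/s."""
    return size_bytes / seconds / MIB

def improvement_pct(unbundled_s, bundled_s):
    """Percentage gain of the bundled transfer over the unbundled one."""
    return (unbundled_s / bundled_s - 1) * 100

# First PPI row of the table: size in bytes, unbundled and bundled times in seconds.
size, unbundled_s, bundled_s = 962_007_040, 385.14, 88.41

unbundled_rate = rate_mib_s(size, unbundled_s)
bundled_rate = rate_mib_s(size, bundled_s)
gain = improvement_pct(unbundled_s, bundled_s)
print(f"{unbundled_rate:.2f} MiB/s -> {bundled_rate:.2f} MiB/s ({gain:.2f}% improvement)")
```

Because the dataset size is the same in both transfers, the ratio of times equals the ratio of rates, so either pair of columns gives the same improvement figure.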
Reliability
The reliability of iRODS is contingent upon its ability to:
- Transfer data correctly: no incorrect bytes are saved or retrieved
- Transfer data successfully: iRODS must recover from network outages and hardware failures
iRODS includes built-in integrity checking for data transfers. iRODS can compute a message digest (the output of a hash function) of data after it is ingested, and this value can be compared to the digest of the source data as well as between replicas of the data. iRODS uses the Message-Digest Algorithm 5 (MD5) cryptographic hash function.
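The same kind of check can be done by hand on any pair of files. The following sketch uses Python's standard `hashlib` module to compute MD5 digests in fixed-size chunks, so that large files need not fit in memory. The helper names are hypothetical, and this illustrates the general technique rather than the iRODS implementation.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True when both files produce the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

Comparing the digest of the source file with the digest of the retrieved copy (or of each replica) flags any transfer that altered the data.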
In addition, iRODS includes a recovery mechanism for interrupted data transfers. During the transfer of data, iRODS client systems may save an auxiliary file containing the current state of the data transfer. Should the network connection be interrupted or a power failure occur, data transfers may be resumed using this auxiliary file.
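The general idea of resuming from saved state can be sketched as follows. This is not the iRODS auxiliary file format or its restart logic; it is a simplified local-file illustration in which a small JSON state file (a format chosen only for this example) records how many bytes have been safely written, and `resumable_copy` is a hypothetical helper.

```python
import json
import os

def resumable_copy(src, dst, state_path, chunk_size=1024 * 1024):
    """Copy src to dst in chunks, saving progress so an interrupted copy can resume."""
    offset = 0
    # Resume only if both the saved state and the partial output exist.
    if os.path.exists(state_path) and os.path.exists(dst):
        with open(state_path) as f:
            offset = json.load(f)["offset"]
    mode = "r+b" if os.path.exists(dst) else "wb"
    with open(src, "rb") as fin, open(dst, mode) as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.truncate()  # discard anything written past the last recorded offset
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            fout.flush()
            offset += len(chunk)
            # Record progress only after the chunk has been written.
            with open(state_path, "w") as f:
                json.dump({"offset": offset}, f)
    # The transfer finished, so the state file is no longer needed.
    if os.path.exists(state_path):
        os.remove(state_path)
```

If the process is interrupted, calling the function again with the same arguments picks up at the last recorded offset instead of starting over.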
All of our tests enabled the auxiliary file so that transfers could recover from failures. However, during our tests, none of the sites experienced failures except for the Geo node. We suspect that a traffic-management firewall at the Geo node is aggressively closing connections it believes to be either idle or using too much bandwidth. iRODS transfer logs showed periodic interruptions at the Geo node. However, the recovery mechanism of iRODS worked around these issues easily and without operator intervention.
As for data transfer integrity, the following table summarizes the results of our findings:
| Location | iRODS transfer integrity | Conventional transfer integrity |
|---|---|---|
| Atmospheres node | 100% | 100% |
| Commercial data center | 100% | 100% |
| Engineering/Imaging node | 95% | 95% |
| Geo node | 100% | 100% |
| PPI node | 100% | 100% |
| Small Bodies node | 100% | 100% |
Neither iRODS nor conventional file transfer suffered any data corruption, except on systems at the Jet Propulsion Laboratory (JPL). There, conventional transfers (plain FTP or HTTP) and iRODS transfers alike occasionally lost data integrity. We believe JPL's border connection to the public internet to be at fault.
Futures
Transferring a bundled archive of a collection is far more efficient than transferring the unbundled collection. However, researchers often need only a subset of a collection, which makes retrieving a large archive for the sake of a few small files a waste of bandwidth. The developers of iRODS have foreseen this, and are currently developing a bundling and unbundling capability.
This capability enables a curating PDS node to efficiently send a single large archive file. Once it arrives at SDSC, the file may be unbundled automatically into its collective directory hierarchy. Other consumers of the data may then download subsets of the files they require. Furthermore, such subsets may be converted into bundled archives at SDSC and then transferred to a consumer more efficiently as a single large file. These capabilities will be available in iRODS version 1.2.
iRODS also supports multiple replicas of data, enabling data to be automatically mirrored. This improves resilience in the face of failure of a particular node. It may also prove beneficial for data transfers by enabling a client to communicate with a closer replica rather than a more distant one. PDS may wish to investigate implementing multiple replicas.

