
iRODS decouples the data user from the infrastructure

Managing large amounts of data, as universities and research institutes do, is complicated enough. Searching through all that data efficiently, across different technologies, is an even greater challenge. This is precisely what the iRODS group intends to simplify.

iRODS would be a start-up with an academic background, at least if the solution were eventually spun off into a separate company. For now, iRODS is organized as a consortium, supported by a handful of partners from academia and business. KU Leuven and UCL are taking part. “Our roots go back to the 1990s in a supercomputing center in San Diego,” says Terrell Russell, director of iRODS. Data News met him as part of the IT Press Tour.

Find relevant data

In 2008, the iRODS team, now around ten collaborators, joined the Renaissance Computing Institute at the University of North Carolina, after which the current consortium took shape from 2013. iRODS is the acronym of ‘integrated rule-oriented data system’. It is an open-source, programmable file system. ‘It can run simply on a laptop,’ says Terrell Russell. ‘But also in a cluster, on site or geographically distributed.’ Its application areas include supercomputing, libraries and archives, genome research and healthcare.

‘The goal is to easily manage large amounts of data, spread across all kinds of storage technologies, and to control access to that data.’ The use of metadata is essential here. ‘We add metadata automatically, making it much easier for you, the user, to find the correct data and thereby increasing its value.’ This allows for highly targeted searching across various data sources, as well as a truly detailed audit.
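To make the idea of metadata-driven searching concrete, here is a minimal sketch in Python. It is not iRODS code and does not use the iRODS API: the `CatalogEntry` class, the `search` function, the sample paths and the storage locations are all hypothetical, chosen only to show how a catalog of metadata can answer one query across several storage technologies.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    """One catalog record (hypothetical): a logical name, where the
    bytes actually live, and free-form key/value metadata."""
    logical_path: str
    physical_location: str
    metadata: Dict[str, str] = field(default_factory=dict)


# Illustrative records only; the locations are made-up placeholders
# standing in for cloud, tape and on-premises storage.
CATALOG: List[CatalogEntry] = [
    CatalogEntry("/lab/genomes/sample_001.bam", "cloud:bucket-a/obj-001",
                 {"project": "genome", "year": "2023"}),
    CatalogEntry("/lab/genomes/sample_002.bam", "tape:vault-3/0042",
                 {"project": "genome", "year": "2024"}),
    CatalogEntry("/lab/scans/scan_7.tif", "nfs:/mnt/lab/scan_7.tif",
                 {"project": "archive", "year": "2024"}),
]


def search(catalog: List[CatalogEntry], **criteria: str) -> List[CatalogEntry]:
    """Return the entries whose metadata matches every given key/value."""
    return [
        entry for entry in catalog
        if all(entry.metadata.get(key) == value
               for key, value in criteria.items())
    ]


if __name__ == "__main__":
    # The user asks by metadata and never needs to know the backend.
    for entry in search(CATALOG, project="genome"):
        print(entry.logical_path, "->", entry.physical_location)
```

The query is expressed purely in terms of metadata; the storage backend only appears in the result, which is the decoupling the article describes.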

Terrell Russell, Director of iRODS: ‘A database that links to where the actual data is located.’

A layer of abstraction

‘In reality, our solution functions as a large database that links to where the actual data resides: in the cloud, on-premises, in an archiving system, etc.’ iRODS provides a layer of abstraction that essentially decouples both the user and the data from the underlying infrastructure. To do so, the solution uses automated workflows that, among other things, enforce all kinds of configurable rules. ‘I’ll give you an example,’ Terrell Russell continues. ‘Take a satellite that provides new data all the time. You can collect these in a landing zone, after which iRODS examines them, automatically provides them with metadata and determines the storage location.’
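The satellite example can be sketched in a few lines of Python. This is an illustration of the rule-driven idea only, under stated assumptions: it is not the iRODS rule language or API, and the `DataObject` class, the three rule functions, the `sat_` file-name prefix and the 1 GB threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class DataObject:
    """A newly arrived file in the landing zone (hypothetical model)."""
    name: str
    size_bytes: int
    metadata: Dict[str, str] = field(default_factory=dict)
    location: Optional[str] = None


# A rule is any function that inspects an object and updates it in place.
Rule = Callable[[DataObject], None]


def tag_source(obj: DataObject) -> None:
    # Assumption: satellite files are recognizable by a "sat_" prefix.
    if obj.name.startswith("sat_"):
        obj.metadata["source"] = "satellite"


def tag_format(obj: DataObject) -> None:
    # Record the file extension, lower-cased, as the format.
    if "." in obj.name:
        obj.metadata["format"] = obj.name.rsplit(".", 1)[1].lower()


def choose_storage(obj: DataObject) -> None:
    # Assumption: objects above 1 GB go to the archive tier.
    obj.location = "archive" if obj.size_bytes > 1_000_000_000 else "disk"


# The configurable part: which rules run, and in what order.
RULES: List[Rule] = [tag_source, tag_format, choose_storage]


def ingest(obj: DataObject, rules: List[Rule] = RULES) -> DataObject:
    """Apply every configured rule to an object from the landing zone."""
    for rule in rules:
        rule(obj)
    return obj


if __name__ == "__main__":
    obj = ingest(DataObject("sat_2024_pass17.TIF", 2_500_000_000))
    print(obj.metadata, obj.location)
```

Because the rules are plain entries in a list, changing the policy means editing the configuration rather than the ingest loop, which is the appeal of a rule-oriented design.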

Sometimes this process is reversed. ‘This is the case for data that cannot be moved, for example because moving it would be too complex or expensive. In this case, iRODS directs computing power towards the data rather than the other way around.’ This makes it possible, for example, to group data sources temporarily and simply under iRODS, particularly as part of a collaborative project.

Help others save

At the risk of repeating ourselves, iRODS is a consortium. ‘We are part of a research institution. We are not a commercial company and therefore cannot provide guarantees to our users, for example in the form of an SLA.’ It remains to be seen whether this will change in the long term. ‘We’re delighted with the way everything is going now,’ says Terrell Russell. ‘But sometimes it’s weird to realize how much money passes us by, just because we’re not a commercial organization. That doesn’t make what we do any less interesting. We actually help other organizations save money.’

If iRODS ever wanted to enter the commercial scene, investing in the user interface would certainly not be an unnecessary luxury. ‘We actually never paid any attention to it,’ admits Terrell Russell. That sets it apart from companies such as Starfish and Hammerspace, which operate in more or less the same field. ‘That’s right,’ concludes Terrell Russell. ‘But those companies sell a black box.’ Therein lies another difference: iRODS takes more time to learn and become familiar with.
