In the era of big data, two of the biggest challenges that IT professionals face are: 1) speeding insight from variable data sizes, formats, and streams, and 2) securing personally identifying information (PII) to protect corporate reputations and to comply with data privacy laws.
Some try to address the performance challenge by powering legacy data integration or virtualization suites with huge servers. Others try complex Hadoop programs or unfamiliar database technologies. For data security, they turn to costly classification and de-identification technologies, and specialized compliance experts.
Many continue to seek out proven solutions they can afford. IRI (Innovative Routines International, Inc.) is a data management ISV, founded in 1978, that focuses on fast, feature-rich processing and protection technology for data big and small.
According to IRI’s SVP and COO David Friedland, the company’s early roots were in moving mainframe sort/merge/report jobs into CP/M, DOS, UNIX, and Windows. This initial mission led IRI to develop and parallelize more data mapping functions, which also made its “CoSort” product popular with DW and BI architects needing a faster ETL and data preparation engine.
As IRI grew in big data processing markets (long before Hadoop emerged), it developed even more data-centric capabilities for profiling, processing, protection, presentation, and prototyping, including: data searching and classification, data integration and replication, data masking and encryption, data cleansing and reporting, and test data generation. Today, IRI delivers eight data management and protection software products, which are supported in more than 40 international offices.
Big Data Manipulation in a Managed Environment
CoSort is the default data processing engine in IRI’s modern “total data management” platform, Voracity. The platform can also use Hadoop engines, but more on that later.
The purpose of Voracity is to be a centralized data marshalling area and one-stop solution stack for data discovery, integration, migration, governance, and analytics. IRI touts Voracity as the “only affordable, high-speed platform for managing data in flat files, DBs, HDFS, and cloud apps, from profiling to presentation.”
Voracity uses a popular (and free) graphical integrated development environment (IDE) called IRI Workbench. Because it is built on Eclipse, the GUI for Voracity is automatically familiar to millions of users, and is a fully extensible solution stack. Many free and commercial plug-ins can open in Voracity’s user workspaces and run within Voracity workflows.
Using these flexible Eclipse “workspaces,” different stakeholders can work alone or in teams to profile and classify, integrate and harmonize, clean and mask, prototype or replicate, and blend or analyze their data, as well as track its changes through time.
More specifically, Voracity performs multiple functions within five key data management areas:
- Data Discovery — search, extract, structure, profile, classify, and diagram data sets
- Data Integration — extract, transform, load (ETL), change data capture, pivoting, etc.
- Data Migration — data type, file format, endian, and database conversion or replication
- Data Governance — cleansing, masking, test data, master data and metadata management
- Analytics — embedded reporting, BIRT and dashboard integration, or data wrangling
Fast Data Munging and Masking (With or Without Hadoop)
Many data stores now reside in the Hadoop Distributed File System (HDFS), and the cost of cluster hardware continues to fall. As a result, there’s an increasing need to process and protect data in HDFS files, Hive, etc. With this comes the challenge of a steep learning curve for Hadoop users.
IRI addresses these issues in the platform. The data transformation, masking, and reformatting jobs built visually for CoSort can also run automatically in Hadoop MapReduce 2, Spark, Spark Stream, Storm, or Tez. “Voracity’s seamless interchangeability of engines means that data analysts, ETL architects and governance teams can leverage the same Eclipse pane-of-glass to design and run jobs. There’s no need to learn Hadoop code.”
IRI sees the convergence of Hadoop commoditization and Voracity task consolidation as the perfect opportunity for smaller and mid-sized companies to capitalize on big data. IRI says that Voracity can manage both static and streaming data — from files, DBs, IoT, and more — without Apache-project complexity or mega-vendor costs.
IRI had been building toward Voracity long before realizing it. But once dedicated to the platform, IRI began to combine its existing products with other state-of-the-art technologies. Years of internal and external innovation later, Voracity is now more than the sum of its parts. For example, the platform added searching, profiling, and classification wizards to discover data ahead of analytic, quality, and masking operations.
Voracity also supports newer data delivery methods like Kafka and formats like JSON, along with old-school COBOL files and relational databases. It combines IRI’s proprietary engines with open source communication and parsing protocols to address all the challenges of big data today: volume, variety, velocity, veracity, and value.
A key design goal was also accessibility. IRI built Voracity to be user-friendly for a wide range of groups, including BI/DW architects, data scientists, DBAs, and GRC (governance, risk, and compliance) officers. Another goal was to future-proof it against change. The versatility of Voracity’s programs and the constant evolution of Eclipse functionality make the platform a chameleon, adaptable to multiple tasks today and to the data processing requirements of tomorrow.
Many thought leaders in the data management industry contributed to Voracity, guiding it during its development and positioning stages to make sure it hit its mark. Analysts at Gartner and consultants at Athena Solutions, Big Data Dimension, and the Data Governance Institute all weighed in on the platform, as did the inventors of AnalytiX DS Mapping Manager and the Data Vault (Dan Linstedt). Their input helped make Voracity a flexible, outcome-driven platform.
Although IRI continues to expand, it already has many multinational customers. Banks and insurance companies like Bank of America, AIG, HSBC, and AXA process their data with Voracity’s CoSort engine, as do airlines like American Airlines, Japan Airlines, and Lufthansa, and automotive companies like Hyundai, Nissan, and Mercedes-Benz. A host of other large companies like Visa, Nestle, Samsung, Capgemini, Accenture, Rolex, Sony, and The Walt Disney Company also rely on IRI’s data manipulation technology.
“We are proud to have thousands of users worldwide using IRI software in contexts like data integration and data masking, which leading publications and analyst firms like Gartner, IDC, and The Bloor Group all recognize,” David remarked.
Building Partnerships to Build Business
IRI routinely collaborates with specialty providers who contribute to the Voracity ecosystem. By blending into the IRI data fabric in Eclipse, these developers can add their value to the platform with minimal user impact.
To illustrate, IRI recently announced a partnership with AnalytiX DS to enhance Voracity’s metadata management capabilities for ETL and data quality users. The companies unveiled their complementary functions in adjacent exhibits at the Dataversity Enterprise Data World conference and expo in Atlanta this year.
Speaking about the partnership, David said, “Both companies are excited about bringing the combined technologies to market under a single, integrated offering.” On the benefits to clients, he said, “The bridge between our platforms is built on an API-level integration of metadata. This enables anyone with either stack to use the strengths of the other on a pay-to-play basis.”
Staying Ahead of the Data Management Industry
IRI attributes its technical success to: 1) an organically grown code base focused on big data processing speed; 2) simple and open metadata; 3) the extensibility of Voracity’s Eclipse design/deployment GUI; and 4) input from industry thought leaders.
IRI also credits its success to conservative growth, relatively low marketing overhead, and the priority given to its most loyal users in feature-function decisions, support resources, and licensing flexibility.
This combination of factors allows IRI to deliver enterprise-class performance and functionality at affordable subscription prices.
Moving into its next 40 years, IRI sees Voracity as a key to the growth of the company and the industry. IRI continually ranks among the data management industry’s top firms. Database Trends and Applications (DBTA) ranked the tools in IRI’s Data Protector Suite — also key components of Voracity — as a Trend Setting Product in 2017. CV Magazine named Voracity the Most Price-Performant Big Data Management Platform.
Still Active Founders
Paul Friedland, CEO, started the company in 1978. Paul’s innovation in high-performance, high-volume data processing began even earlier when he was cited in Knuth’s Sorting and Searching. Through decades of continuing innovation in co-routine architecture, multi-threading, data manipulation, and task consolidation, he made IRI a leader in the big data processing industry long before Hadoop was introduced. His son, David Friedland, COO, joined IRI in 1998 after working in technology marketing and international journalism. Today, he manages both partner and product line growth, works with analysts and stakeholders, blogs on technical topics, and speaks at trade conferences.
David said, “Despite the rich technical history I’ve seen at IRI, the years ahead bode even better. We are in the middle of exponential data growth. Voracity’s myriad solutions in data governance and analytic-related applications coupled with its price-performance position in Hadoop-fueled markets bring us new opportunities every day.”
While discussing the future of the company, he added, “I see us leveraging partner technologies for NLP, machine learning, and rich visualizations as a way to add even more value to the Voracity platform going forward.”