By defining intersections in database extracts and legacy files, CoSort users can simultaneously discover, transform, and report on related data. And by performing join and lookup functions on flat files, they can relieve their DBMS of query overhead and incorporate mainframe/index file, spreadsheet, and other data into the process.
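To make the idea of joining flat files outside the DBMS concrete, here is a minimal Python sketch. It is not CoSort syntax and does not reflect how CoSort performs joins internally; the sample data, the `customer_id` key, and the helper names `read_rows` and `join_on_key` are all hypothetical, chosen only to show a database extract being matched against a legacy file on a shared field.

```python
import csv
import io

# Hypothetical sample data: a database extract and a legacy flat file
# that share a customer_id field. All names and values are made up.
DB_EXTRACT = """customer_id,name
101,Avery
102,Blake
103,Casey
"""

LEGACY_FILE = """customer_id,balance
102,250.00
103,75.50
104,10.00
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_key(left_rows, right_rows, key):
    """Inner-join two lists of row dicts on a shared key field."""
    # Index the right-hand rows by key so each left row needs one dict probe.
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        for right in right_index.get(left[key], []):
            # Merge the two records; the shared key field appears once.
            joined.append({**left, **right})
    return joined


result = join_on_key(read_rows(DB_EXTRACT), read_rows(LEGACY_FILE), "customer_id")
for row in result:
    print(row["customer_id"], row["name"], row["balance"])
```

Only customers present in both files (102 and 103 in this sample) appear in the result. A real job would read the files from disk and, for very large inputs, would more likely sort both sides and merge them than hold an index in memory.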
Multi-File Joins
Joining large tables to satisfy queries taxes DBMS performance, the company notes. There has also been no efficient way to compare large files and identify field changes (inserts, updates, deletes) over time, it says. "In addition to offloading DBMS's, multi-file joins offload data integration tools by merging data before it hits the tool," said Philip Russom, senior manager at the Data Warehousing Institute. "At the high end, this is useful with the distributed architectures that many users apply to scaling up their data integration solutions. At the other extreme, multi-file joins may eliminate the need for a data integration tool."
Multi-Dimensional File Lookups
Data cleansing, multi-table joins, and complex computations that produce discrete solutions are resource-intensive operations. Where a simple lookup can replace a runtime computation (e.g., a mathematical expression or "pseudonymization"), the performance gain is significant because retrieving a value from memory is faster than computing it. To achieve these fast retrievals, CoSort users specify lookups against set files. By referencing multi-column files, users get faster answers to discrete questions, such as finding the right zip code for a given city and state.
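The city-and-state example can be sketched in Python as a multi-column lookup held in memory. This is an illustration of the general technique only, not CoSort's set-file format or syntax; the file layout, the sample rows, and the names `load_lookup` and `lookup_zip` are assumptions made for the example.

```python
import csv
import io

# Hypothetical multi-column lookup file: city, state, zip.
# The rows are illustrative sample values, not an authoritative ZIP table.
LOOKUP_FILE = """city,state,zip
Springfield,IL,62701
Springfield,MA,01103
Portland,OR,97201
"""


def load_lookup(text, key_fields, value_field):
    """Build an in-memory table mapping a tuple of key columns to one value column."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table[tuple(row[field] for field in key_fields)] = row[value_field]
    return table


# Load once; every later lookup is a single dictionary probe.
zip_by_city_state = load_lookup(LOOKUP_FILE, ("city", "state"), "zip")


def lookup_zip(city, state, default=None):
    """Return the zip for (city, state), or default when the pair is absent."""
    return zip_by_city_state.get((city, state), default)


print(lookup_zip("Springfield", "MA"))
```

Keying on both columns is what separates the two Springfield rows; a lookup on city alone would be ambiguous. The table is built once up front, so each subsequent retrieval costs one dictionary access rather than a fresh computation or query.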
Russom adds, "When multi-column files are sources for a data warehouse, multi-dimensional file lookups can generate cubes and other multi-dimensional structures for the warehouse and analysis tools."