20110526

Footprint Comparison for DV Leaders

Comparison of DV Tools is the most popular page (and post) on this site, visited by many thousands of people. Some of them keep asking me to extend this comparison with additional features. One such request is to compare the leading DV tools by file footprint, memory footprint, and reading and saving time.

I took a mid-sized dataset (428999 rows and 135 columns), exported it to CSV and compressed it to ZIP format, because all native DV formats (QVW by Qlikview, DXP by Spotfire, TWBX by Tableau, and XLSX by Excel and PowerPivot) are compressed one way or another. My starting file size (for the ZIPped dataset) was 56 MB. Here is what I got; see for yourself:
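For readers who want to reproduce the baseline step on their own data, here is a minimal Python sketch of compressing a CSV export to ZIP and comparing the two file sizes. The file names and the small generated dataset are hypothetical stand-ins, not the dataset used in this test, so the ratio it prints says nothing about the 56 MB figure above.

```python
import csv
import os
import tempfile
import zipfile


def zip_and_compare(csv_path, zip_path):
    """Compress csv_path into zip_path; return (csv_bytes, zip_bytes)."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(csv_path, arcname=os.path.basename(csv_path))
    return os.path.getsize(csv_path), os.path.getsize(zip_path)


def write_sample_csv(path, rows=1000, cols=10):
    """Write a small, repetitive CSV as a stand-in for a real export."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"col{c}" for c in range(cols)])
        for r in range(rows):
            writer.writerow([r % 7 for _ in range(cols)])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = os.path.join(tmp, "dataset.csv")  # hypothetical name
        zip_path = os.path.join(tmp, "dataset.zip")  # hypothetical name
        write_sample_csv(csv_path)
        raw, zipped = zip_and_compare(csv_path, zip_path)
        print(f"CSV: {raw} bytes, ZIP: {zipped} bytes, ratio: {raw / zipped:.1f}x")
```

The ZIP size is a convenient common baseline because each tool's native file can then be expressed as a multiple of it.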

[googleapps domain="spreadsheets" dir="spreadsheet/pub" query="hl=en_US&hl=en_US&key=0AuP4OpeAlZ3PdGszUVZDVGwxOE0wVExZcmhPdVlXQ2c&single=true&gid=0&output=html&widget=true" width="500" height="300" /]

One caveat: the numbers above are all relative to the hardware configuration used for the tests. They also depend on the other software I ran during the tests, because that software also requires RAM, CPU cycles and disk I/O, and even on the speed of repainting application windows on screen, especially for Excel. I will probably add more comments to this post/page, but my first impression from this comparison is that Tableau's new Data Engine (released in version 6.0 and soon to be updated in 6.1) made Tableau more competitive. Please keep in mind that the comparison of in-memory footprint is much less significant in the above test, because Qlikview, Excel and PowerPivot put the entire dataset into RAM, while Tableau and Spotfire can leave some data (not needed for visualization) on disk, treating it as "virtual memory". Also, Tableau uses 2 executables (not just one EXE like the others): tableau.exe (or tabreader.exe) and tdserver64.exe.
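As a rough illustration of the kind of measurement discussed above (elapsed time plus peak memory for one "read" operation), here is a small Python sketch. It is only an analogue: `measure` and `load_rows` are hypothetical helpers, and `tracemalloc` tracks allocations made by the Python interpreter, not the whole-process RAM of an external application such as the DV tools tested here.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once; return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def load_rows(n):
    """Hypothetical stand-in for reading a dataset: build n rows in memory."""
    return [(i, str(i)) for i in range(n)]


if __name__ == "__main__":
    rows, seconds, peak = measure(load_rows, 100_000)
    print(f"{len(rows)} rows in {seconds:.3f} s, peak {peak / 1e6:.1f} MB")
```

As with the tests in this post, repeated runs on a machine that is busy with other software will give different timings, so a single run is best read as an indication rather than a benchmark.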

Since Tableau is the only leading DV software capable of reading from SSAS Cubes and from PowerPivot (local SSAS) Cubes, I also took a large SSAS Cube and, for testing purposes, selected an SSAS Sub-Cube with 3 Dimensions, 2 Measures and 156439 "rows". I measured the Time and Footprint needed for Tableau to read the Sub-Cube, Refresh it in Memory and Save it to a local application file, and also measured its "Cubical" Footprint in Memory and on Disk. I then compared all results with the same tests run with Excel 2010 alone and with Excel 2010 plus PowerPivot:

[googleapps domain="spreadsheets" dir="spreadsheet/pub" query="key=0AuP4OpeAlZ3PdFZUczdEbEVqU2lOMXRDOXJBUXJnQ3c&single=true&gid=1&output=html&widget=true" width="500" height="300" /]

While Tableau's ability to read and visualize Cubes is cool, performance-wise Tableau is far behind Excel and PowerPivot, especially in Reading time and memory footprint. For Saving time and File footprint, Tableau does nothing, because it does not save the cube in its local application TWBX file (it keeps the data in the SSAS cube outside of Tableau), so Tableau's file footprint for SSAS Cubes is not an indicator. For PowerPivot-based local Cubes, however, Tableau does a better job (saving data into the local application file) than both Excel and PowerPivot!

1 comment:

  1. I tried a similar analysis, albeit not quite as comprehensive, but saw different results.
    While I noticed that QV opened quicker, it was a partial result set unless one scrolled through the data, and the import process took somewhat longer to get the data usable.

    Data File, MB
    txt, RAW 161MB
    txt, zipped 19MB

    App               Stored File, MB   Base RAM   Read file, s   Memory, MB   Save, s
    Qlikview, QVW     69                11         7              220          14
    Spotfire, DXP     32                83         33             380          20
    Excel 2010, XLSX  73                42         46             206          15

    These were with a default table display set.
