20110526

Footprint Comparison for DV Leaders

Comparison of DV Tools is the most popular page (and post) on this site, visited by many thousands of people. Some of them keep asking me to extend this comparison with additional features; one such request is a comparison of the leading DV tools' file and memory footprints, as well as their reading and saving times.

I took a mid-sized dataset (428999 rows and 135 columns), exported it to CSV and compressed it to ZIP format, because all native DV formats (QVW by Qlikview, DXP by Spotfire, TWBX by Tableau, and XLSX by Excel and PowerPivot) are compressed one way or another. My starting file size (of the ZIPped dataset) was 56 MB. Here is what I got; see for yourself:

[googleapps domain="spreadsheets" dir="spreadsheet/pub" query="hl=en_US&hl=en_US&key=0AuP4OpeAlZ3PdGszUVZDVGwxOE0wVExZcmhPdVlXQ2c&single=true&gid=0&output=html&widget=true" width="500" height="300" /]
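For readers who want to reproduce the baseline step (a CSV export compressed to ZIP), here is a minimal Python sketch. It is not the procedure I used for the table above: the synthetic data, the `zip_csv_footprint` helper and the file names are all assumptions made for illustration, and only the column count (135) is taken from my dataset.

```python
import csv
import os
import tempfile
import time
import zipfile


def zip_csv_footprint(rows, columns, directory):
    """Write a synthetic CSV, ZIP it, and return sizes (bytes) and ZIP time (seconds).

    Hypothetical helper: the data is generated here, not taken from the real dataset.
    """
    csv_path = os.path.join(directory, "dataset.csv")
    zip_path = os.path.join(directory, "dataset.zip")

    # Write a header row plus `rows` rows of integers.
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([f"col{c}" for c in range(columns)])
        for r in range(rows):
            writer.writerow([r * columns + c for c in range(columns)])

    # Time only the compression step.
    started = time.perf_counter()
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(csv_path, arcname="dataset.csv")
    elapsed = time.perf_counter() - started

    return {
        "csv_bytes": os.path.getsize(csv_path),
        "zip_bytes": os.path.getsize(zip_path),
        "zip_seconds": elapsed,
    }


with tempfile.TemporaryDirectory() as tmp:
    result = zip_csv_footprint(rows=1000, columns=135, directory=tmp)
    print(result)
```

The compression ratio of synthetic integers says nothing about real data; the point is only that the ZIPped size gives a common baseline against which each tool's native file can be compared.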

Keep in mind that the numbers above are all relative to the configuration of the hardware used for the tests and also depend on the other software I ran during the tests, because that software also requires RAM, CPU cycles and disk I/O; they even depend on the speed of repainting application windows on screen, especially for Excel. I will probably add more comments to this post/page, but my first impression from this comparison is that Tableau's new Data Engine (released in version 6.0 and soon to be updated in 6.1) made Tableau more competitive. Please keep in mind that the comparison of in-memory footprint was much less significant in the above test, because Qlikview, Excel and PowerPivot put the whole dataset into RAM, while Tableau and Spotfire can leave some data (unneeded for visualization) on disk, treating it as "virtual memory". Also, Tableau uses 2 executables (not just one EXE like the others): tableau.exe (or tabreader.exe) and tdserver64.exe.
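Because timings like these depend on whatever else the machine is doing, one way to reduce the noise is to repeat each measurement and report a median. The sketch below is an illustration under stated assumptions, not the method behind my numbers: `measure_csv_read` is a hypothetical helper, the sample data is synthetic, and `tracemalloc` only tracks memory allocated by Python itself, so it stands in for (and is not the same as) the process footprint a DV tool shows in Task Manager.

```python
import csv
import io
import statistics
import time
import tracemalloc


def measure_csv_read(text, repeats=5):
    """Parse CSV text several times; return row count, median time, and peak traced memory."""
    timings = []
    peaks = []
    row_count = 0
    for _ in range(repeats):
        tracemalloc.start()
        started = time.perf_counter()
        rows = list(csv.reader(io.StringIO(text)))
        timings.append(time.perf_counter() - started)
        _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
        peaks.append(peak)
        tracemalloc.stop()
        row_count = len(rows)
        del rows  # release the parsed data before the next run
    return {
        "rows": row_count,
        "median_seconds": statistics.median(timings),
        "max_peak_bytes": max(peaks),
    }


# Synthetic sample: one header line plus 1000 data lines.
sample = "a,b,c\n" + "\n".join(f"{i},{i * 2},{i * 3}" for i in range(1000))
report = measure_csv_read(sample)
print(report)
```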

Since Tableau is the only leading DV software capable of reading from SSAS Cubes and from PowerPivot (local SSAS) Cubes, I also took a large SSAS Cube and, for testing purposes, selected an SSAS Sub-Cube with 3 Dimensions, 2 Measures and 156439 "rows". I measured the time and footprint needed for Tableau to read the Sub-Cube, refresh it in memory and save it to a local application file, and also measured its "cubical" footprint in memory and on disk. I then compared all results with the same tests run with Excel 2010 alone and Excel 2010 with PowerPivot:

[googleapps domain="spreadsheets" dir="spreadsheet/pub" query="key=0AuP4OpeAlZ3PdFZUczdEbEVqU2lOMXRDOXJBUXJnQ3c&single=true&gid=1&output=html&widget=true" width="500" height="300" /]

While Tableau's ability to read and visualize Cubes is cool, performance-wise Tableau is far behind Excel and PowerPivot, especially in reading time and memory footprint. For saving time and file footprint, Tableau does nothing, because it does not save the cube locally in its TWBX application file (it keeps the data in the SSAS cube outside of Tableau), so Tableau's file footprint for SSAS Cubes is not an indicator. For PowerPivot-based local Cubes, however, Tableau does a better job (saving data into the local application file) than both Excel and PowerPivot!

20110505

Spotfire 3.3: mature, scalable, social

TIBCO released Spotfire 3.3, and the first thing that jumped out at me (see what is new here) was how mature this product is. For example, among the new features is improved scalability: each additional simultaneous user of a web analysis initially claims very little additional system memory:



Many Spotfire customers will be able to support a greater number of web users on their existing hardware by upgrading to 3.3. Spotfire Web Player 3.3 includes significant improvements in memory consumption (as shown above for certain scenarios). The goal is to minimize the amount of system memory needed to support larger numbers of simultaneous users on the same analysis file. The main use case: the larger the file and the greater the number of simultaneous web users on that file, the less initial system memory is required to support each additional user; it is greatly reduced compared to version 3.2.1 and earlier.

A comparison with the competition and thorough testing of the new Spotfire scalability still have to be done (similar to what Qliktech did with Qlikview here), but my initial reaction is as I said in the title: we are witnessing very mature software. Apparently the Defense Intelligence Agency (DIA) agrees with me: "Defense Intelligence Agency Selects TIBCO Spotfire Analytics Solutions for Department of Defense Intelligence Information System Community". "With more than 16,500 military and civilian employees worldwide, DIA is a major producer and manager of foreign military intelligence."

Spotfire 3.3 also includes collaborative bookmarking, which enables all Spotfire users to capture a dashboard (its complete configuration, including markings, drop-down selections and filter settings) and share that visualization immediately with other users of the same dashboard, regardless of the client in use. Spotfire is actually not just a piece of Data Visualization software but a real Analytical Platform with a large portfolio of products, including the completely integrated S-PLUS (a commercial implementation of the S language, on which the open-source R, with its millions of users, is also based), the best Web Client (you can go zero-footprint with Spotfire Web Player and/or the partially free Spotfire Silver), a free iPad Client version 1.1.1 (requires iTunes, so be prepared for Apple intrusion), a very rich API, an SDK, integration with Visual Studio, support for IronPython and JavaScript, a well-thought-out Web Architecture, a set of Extension Points, etc.

System requirements for Spotfire 3.3 can be found here. Coinciding with the 3.3 release, the Spotfire VAR Program was expanded too. Spotfire has a very rich set of training options; see them here. You can also find a set of good Spotfire videos in Colin White's Screencast Library, especially the 2011 Webcasts.

My only, and large, concern with Spotfire is its focus, since it is part of a large corporation, TIBCO, which has 50+ products and 50+ reasons to focus on something else. Indirectly this can be confirmed by sales: my estimate is that Tableau is growing much faster than Spotfire (sales-wise) and that Qlikview sales are probably 3 times larger (dollar-wise) than Spotfire sales. Since TIBCO bought Spotfire in 2007, I expected that Spotfire would be integrated with other great TIBCO products, but after 4 years this is still not the case... And TIBCO has no reason to change its corporate policies, since its business is good and its stock is doing well:



(at least a 500% increase in share price since the end of 2008!). Also see the article written by Ted Stamas for SeekingAlpha and the comparison of TIBX vs. ETF here:



I think it is interesting to note that TIBCO recently rejected a buyout offer from HP!