From nobody Wed Feb 18 09:54:24 1998 Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!newsfeed.internetmci.com!192.52.106.6!ncar!csn!nntp-xfer-1.csn.net!news-2.csn.net!not-for-mail From: jbeal@jbeal.org (Jeremy Beal) Newsgroups: sci.data.formats Subject: Re: Lots of File Formats for free Date: Wed, 18 Feb 1998 00:38:37 GMT Organization: SuperNet Inc. +1.303.285.0194 Denver Colorado Lines: 17 Message-ID: <34ea2d2f.2854007015@news-2.sni.net> References: <6c6gio$9tm$1@news.netvision.net.il> Reply-To: www.nvmedia.com/~`jbeal NNTP-Posting-Host: 198.233.40.11 X-Newsreader: Forte Free Agent 1.1/32.230 Xref: newsfeed.cv.nrao.edu sci.data.formats:233 On Sun, 15 Feb 1998 12:37:45 +0200, "Tsahi Carmona" wrote: >I've got lots of file formats to share with anyone who wishes to get that >information: >Multimedia (JPEG, GIF, BMP, AVI, WAV...) >Compresetions (LZEXE, ARJ, ZIP...) >And more... > There is a site at http://wotsit.simsware.com/ which is dedicated to collecting file formats. (Not necessarily scientific.) I'm sure they would appreciate any information... 
Jeremy Beal Get my e-mail address at www.nvmedia.com/jbeal (Tired of the damn spam) From nobody Thu Feb 19 09:33:31 1998 Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!news-peer.gip.net!news.gsl.net!gip.net!newspump.sol.net!sol.net!uwm.edu!uwvax!news From: Andy Glew Newsgroups: comp.databases,sci.engr.mecha,uwisc.misc,sci.data.formats,comp.data.administration,sci.econ,sci.op-research Subject: [Fwd: ISO Data Management, Graphing, and Visualization Tools] Date: Wed, 18 Feb 1998 17:03:25 -0600 Organization: U Wisc CS (& Intel) Lines: 530 Message-ID: <34EB68BD.6E7E7E2C@cs.wisc.edu> NNTP-Posting-Host: helga.cs.wisc.edu Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------9D034D89CA46DD3D0AD0D439" X-Mailer: Mozilla 4.03 [en] (WinNT; I) Xref: newsfeed.cv.nrao.edu comp.databases:6499 sci.data.formats:235 comp.data.administration:466 sci.econ:11545 sci.op-research:1197 This is a multi-part message in MIME format. --------------9D034D89CA46DD3D0AD0D439 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Brief ==== I seek information or recommendation for data management, graphing, exploration and visualization tools. Particularly hoping for ++ Graphical zoom - e.g. on a scatterplot, draw a box around points with mouse, and have axes re-offset and rescaled so that the box is fullscreen. ++ Ability to specify filter programs to be run on imported data, instead of reading it from a file (so that said filters can do the file format conversion task). ++ Database for results. 
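The "graphical zoom" item above comes down to a small piece of arithmetic: map the edges of the dragged box back into data coordinates and use them as the new axis limits. A minimal sketch follows, assuming Python (not named in the post) and assuming the box is given as fractions of the plot area measured from the lower-left corner. The names zoom_axis and zoom_box are hypothetical and do not belong to any of the packages discussed.

```python
import math


def zoom_axis(lo, hi, frac_a, frac_b, log=False):
    """Return new (lo, hi) limits for one axis after a box zoom.

    frac_a and frac_b are the two box edges along this axis, expressed
    as fractions (0..1) of the plot extent measured from the axis
    minimum.  This fractional convention is an assumption of the sketch.
    """
    f0, f1 = sorted((frac_a, frac_b))
    if f0 == f1:
        raise ValueError("zoom box has zero extent")
    if log:
        if lo <= 0 or hi <= 0:
            raise ValueError("log axis needs positive limits")
        # On a log axis the screen position is linear in log10(value).
        llo, lhi = math.log10(lo), math.log10(hi)
        return (10 ** (llo + f0 * (lhi - llo)),
                10 ** (llo + f1 * (lhi - llo)))
    span = hi - lo
    return (lo + f0 * span, lo + f1 * span)


def zoom_box(xlim, ylim, box, xlog=False, ylog=False):
    """Rescale both axes so the dragged box fills the whole plot.

    box is ((fx0, fy0), (fx1, fy1)): two opposite corners of the box.
    """
    (fx0, fy0), (fx1, fy1) = box
    return (zoom_axis(*xlim, fx0, fx1, log=xlog),
            zoom_axis(*ylim, fy0, fy1, log=ylog))


# Illustrative values only: linear x axis, logarithmic y axis.
new_xlim, new_ylim = zoom_box((0, 1000), (1, 10000),
                              ((0.25, 0.5), (0.75, 1.0)), ylog=True)
print(new_xlim, new_ylim)
```

The point of returning new limits, rather than scaling the picture, is that the tool can then redraw the axes and tick labels for the zoomed region.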
Please respond by email to glew@cs.wisc.edu

Detail
======

Newsgroups
----------

Friends suggested that I post the attached request, which I had sent
to my domain newsgroup, comp.arch, as well as some application software
newsgroups, to newsgroups where readers are more likely to have
experience of what I want:

sci.engr.mecha
other engineering fields
    because engineering fields other than CS (I'm an EE, so I dis CS)
    have more experience in doing good experimental work than CS.

sci.econ
sci.op-research
    because other fields, such as economics, may be doing this
    (SOMEBODY has to be buying SAS and SPSS - certainly not CS people)

sci.med
    because a lot of the examples from these software packages are
    medical

etc. etc.

Please respond by email to glew@cs.wisc.edu, since these newsgroups
are outside my reading list.

Status
------

The attachment describes what I am looking for in detail. Don't be
scared off by the wishlist - I'll happily settle for something like
gnuplot with graphical zoom, because I can couple its filters to an
external database. If you don't ask, you won't get...

Since I posted last night, I have received several emails and located
a software showroom where I tried SAS, SPSS, and Minitab demos and/or
tutorials.

Summary:

SAS: It may be the best seller, but SAS is an antiquated dinosaur -
mainframe, textual command oriented, lousy graphs. I just say no.

SPSS: Reasonably good GUI, as well as a reasonably good command
language.
    [Very GOOD]: pivoting table editor
    [BAD]: no graphical mouse zoom - to change axes on a scatterplot,
    you have to type in limits.
    Unclear if database JOINs are possible in its command language.

SAS/JMP: SAS Institute's graphical, GUI, interactive beast.
Unfortunately, their webpage is broken, so I haven't been able to try
the demo.

SAS/StatVIEW: A company recently acquired by SAS. Looks good, nicely
graphical. Unclear if it has graphical zoom.
Minitab: Slightly better than SAS, but still antiquated; more limited in analysis abilities. Other comments are in attachments. --------------9D034D89CA46DD3D0AD0D439 Content-Type: message/rfc822 Content-Transfer-Encoding: 7bit Content-Disposition: inline Path: uwvax!news From: Andy Glew Newsgroups: comp.arch Subject: ISO Data Management, Graphing, and Visualization Tools Date: Tue, 17 Feb 1998 19:00:46 -0600 Organization: U Wisc CS (& Intel) Message-ID: <34EA32BE.5CBFB5EB@cs.wisc.edu> NNTP-Posting-Host: helga.cs.wisc.edu Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------7EE2F594139DDCB6FBD5113B" X-Mailer: Mozilla 4.03 [en] (WinNT; I) This is a multi-part message in MIME format. --------------7EE2F594139DDCB6FBD5113B Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit This seems not to have posted from my UNIX system, so I will fall back to posting from Netscape. See attachment for questions about data management, graphing, and visualization tools. --------------7EE2F594139DDCB6FBD5113B Content-Type: text/plain; charset=us-ascii; name="k.txt" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="k.txt" Sender: glew@balder Newsgroups: comp.arch,comp.soft-sys.matlab,comp.soft-sys.math.mathematica,comp.soft-sys.sas,comp.soft-sys.stat.spss,comp.soft-sys.stat.systat,comp.graphics.visualization,comp.graphics.apps.gnuplot Subject: ISO data management, graphing, and visualization From: Andrew Glew Organization: CS Department, University of Wisconsin Lines: 378 X-Newsreader: Gnus v5.3/Emacs 19.34 Date: 17 Feb 1998 18:59:25 -0600 Message-ID: Brief ===== I seek recommendations about software packages to manage experimental data, and to prepare graphs and visualizations. Short list of wished-for features: a) underlying database - i.e. I would like to do joins of differing datasets b) decent graphs: e.g. log axes, different forms of graph c) interactive data browsing: e.g. 
click on a point, jump to related variable, drag box to control
   zooming in, etc. (but I also want good batch mode graph control)

Justification for Posting
=========================

I have probably posted to enough newsgroups to set off some spam
filters. Here's why I chose these newsgroups:

comp.arch
    My application domain is the field of computer architecture

comp.soft-sys.matlab
comp.soft-sys.math.mathematica
comp.soft-sys.sas
comp.soft-sys.stat.spss
comp.soft-sys.stat.systat
comp.graphics.apps.gnuplot
    Some applications that I know are in this field - I seek advice
    to help me choose between these, and others

comp.graphics.visualization
    The generic field of visualization

Detail
======

Context - Improving My Personal BKMs
------------------------------------

comp.arch readers may be aware that I returned to school to finish my
Ph.D. after having worked in industry for many years. I am finally
beginning to collect experimental data for my research, and I would
like to have a system to manage the data.

I would like to improve my personal "Best Known Methods" for this task
of data management and visualization. I.e. I would like to find better
tools to do this than the Perl scripts and gnuplot that I used in my
MS 8 years ago, or the not-much-better technology that I have used in
industry.

In particular, in my last industrial stint Excel was considered the
state of the art for preparing graphs. I consider this distinctly
unsatisfactory - especially since Excel was extremely slow in handling
large datasets (upwards of 64000 data points), which I have always
been able to handle in gnuplot. Furthermore, I do not consider the
injunction "reduce your data set" always desirable.

I have been playing with computer performance data for more than 15
years now. Surely the state of the art has improved somewhat?
Types of Data
-------------

The datasets that I wish to manage include:

Profiles: 2-tuples of (address,count) for many of the locations in a
program - often giving 100s of thousands of data points, which I wish
to scroll around in quickly, looking at a coarse scale (related to the
resolution of my screen) and zooming in and out.

Simulator Results: Typical measurements such as IPC (Instructions per
Clock) and estimated run time for particular benchmarks. 100 or so
parameters or measurements per benchmark.

Traces: Ideally, I believe that the approach I describe here (that of
a database) could be extended to traces of events such as
start,execute,retirement for instructions in a program => millions of
measurements, limited mainly by disk space.

I should note that version control of measurements is an issue - i.e.
my measurements are almost never just streams of numbers, but are, at
the minimum, streams of numbers augmented with a version, a few dates,
and a comment description, which I wish to be an intrinsic part of the
data.

The Simulator Result measurements are often ad-hoc - in that new
metrics are created on a daily basis, and discarded as readily. Thus
tools that make it painful to create new metric types - e.g. by
requiring a database schema to be created by hand - are undesirable.

Similarly, measurements are sparse.

Wish List - Database
--------------------

It has long seemed to me that many of the problems of data
manipulation could be handled by a Relational Database Management
system. For example, if you have (in a very schematic notation)

Experiment = {
    name = "Baseline",
    results = {
        { benchmark = spec95,126.gcc,I=test, ipc = 45 }
        { benchmark = spec95,126.gcc,I=ref, ipc = 22 }
        { benchmark = spec95,129.compress,I=test, ipc = 988 }
        ...
    }
}
Experiment = {
    name = "LatestGreatIdea",
    results = {
        { benchmark = spec95,126.gcc,I=test, ipc = 450 }
        { benchmark = spec95,126.gcc,I=ref, ipc = 212 }
        { benchmark = spec95,129.compress,I=test, ipc = 98 }
        ...
    }
}

then something like an SQL join should be able to perform the
comparison.

Straightforward attempts to put this into an RDBMS usually founder on

a) the extreme slowness of RDBMS
b) cost (perhaps the MiniSQL freeware will make this okay)
c) the painful contortions that most RDBMS require in handling
   inconsistent data - i.e. I don't want to define a schema, my schema
   is in my datafiles (i.e. the tuple field names)

Nonetheless, I have hope that an RDBMS, or perhaps an OODBMS, or even
an ORDBMS, may be okay.

Wish List - Graphing
--------------------

Of course, I wish to be able to produce any graph that I can find in
the computer architecture literature, and/or those that I am familiar
with from other fields...

Wish List - GUI *and* Batch
---------------------------

I would like to be able to interactively walk into and out of my data,
drawing boxes to zoom, etc. Tools like gnuplot seem to fall down in
this regard - it is a real pain to have to type "set xrange [4:1000]",
as opposed to using the mouse.

Wish List - Automation
----------------------

There seem to be a number of interactive tools around. Each, of
course, insists on slightly different input formats. It is
straightforward to write Perl scripts to do the conversion, but... it
would be *really* *nice* if those scripts could be wired into the
interface, so that, e.g., MATLAB could automatically use my .SSOUT to
.MAT converter when I try to read a .SSOUT file.

I.e. I don't want to automatically create every possible datafile
format. Data management nightmare. Rather, I want to convert as
needed.

Wish List - Good Built In Handling of Missing Data
--------------------------------------------------

As mentioned above, my data is often sparse - many metrics are not
available in all configurations. Math systems that permit e.g.
calculations of ratios and which handle missing data nicely are highly
desirable. I.e. can it give "NA" (Not Available) as an answer?
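The join-based comparison from "Wish List - Database" and the "NA" behaviour asked for here can be sketched together. This is only an illustration, assuming Python and its standard sqlite3 module (neither is mentioned in the post). The table names, the compare and fmt helpers, and the flattened benchmark labels are hypothetical, and the 129.compress row is deliberately left out of the second experiment to show a missing measurement.

```python
import sqlite3


def compare(rows_a, rows_b):
    """Join two experiments on benchmark name, keeping unmatched rows.

    rows_a and rows_b are lists of (benchmark, ipc) tuples.  A LEFT JOIN
    keeps every benchmark of the first experiment; where the second has
    no measurement, SQLite yields NULL (None in Python) for both the
    value and the ratio instead of a spurious 0.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE a (benchmark TEXT PRIMARY KEY, ipc REAL)")
    con.execute("CREATE TABLE b (benchmark TEXT PRIMARY KEY, ipc REAL)")
    con.executemany("INSERT INTO a VALUES (?, ?)", rows_a)
    con.executemany("INSERT INTO b VALUES (?, ?)", rows_b)
    result = con.execute(
        """
        SELECT a.benchmark, a.ipc, b.ipc, b.ipc / a.ipc
        FROM a LEFT JOIN b ON a.benchmark = b.benchmark
        ORDER BY a.benchmark
        """
    ).fetchall()
    con.close()
    return result


def fmt(value):
    """Render a missing value explicitly as NA."""
    return "NA" if value is None else f"{value:g}"


# Values taken from the schematic example; labels are simplified.
baseline = [("126.gcc/test", 45), ("126.gcc/ref", 22),
            ("129.compress/test", 988)]
latest = [("126.gcc/test", 450), ("126.gcc/ref", 212)]

for name, old, new, ratio in compare(baseline, latest):
    print(name, fmt(old), fmt(new), fmt(ratio))
```

A plain inner join would silently drop the 129.compress row; the outer join lists it with the undefined fields shown as NA.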
Giving spurious values such as 0 (Perl's default lacking programming)
is highly undesirable and misleading.

Note, especially, that you do not want simply to join on records where
all fields are defined - explicitly listing undefined is very useful.

Candidates
----------

The list of candidate tools for this that I am aware of includes

    DEVise
    Datadesk
    SPSS
    Sigmaplot
    SAS
    Statsoft
    SYSTAT
    JMP
    IDL
    IPL
    gnuplot
    xmgr
    MATLAB
    Mathematica

I am not, however, familiar with all of these - I do not feel that I
know enough to make a good decision. Hence this post, inviting
recommendations from others. (Please reply to me personally by mail -
I leave it up to you to decide if you want to reply to the newsgroups
as well.)

I don't know much, but I have gathered some random impressions and
information about these tools that I will summarize here - hoping that
others may correct me.

DEVise
------

"Database Exploration and Visualization". Academic work in progress,
from the University of Wisconsin. Contains an underlying database -
apparently ad-hoc coded, although there are plans to interface to a
standard commercial database. Can use an SQL subset. Primitive graphs -
doesn't even have logarithmic axes. Limited GUI querying - click on a
graphical object, see the data. Most data exploration (e.g. range
setting) seems to have to be done via menus. TCL/TK based =>
extensible? (in the sense that you can extend any source code that you
can read).

Datadesk
--------

Commercial. Seems to provide the most interactive data browsing:
twirling axes in 3D, zooming in and out via boxes, etc. Linked
variable windows. Datastructures seem primitive - straight vectors.
Q: does it support structures?

MATLAB
------

Commercial. Matrix based, but supports matrices of matrices, and
matrices with named tuple fields. Good internal programming language.
Many packages. Theoretically database type JOINs could be written, but
don't seem to exist in the standard set of add-ons.
Q: is there a database add-on?
Theoretically good interactive graph browsing could be written, but standard graphing tools seem to be command/text based. (Part of me seems to think that what I want is SQL for MATLAB, with interactive graph browsing.) SPSS, SAS --------- Commercial. I have used SPSS (and SAS) extensively back in the mainframe days, so I feel confident of their computing abilities. (I am considerably less confident of the computing abilities of many of the newer, graphics oriented, tools.) I am less familiar with the GUI interfaces that have been added in the last decade. Q: do they provide the interactive features, such as drag a box to zoom? Q: in the old days SAS and SPSS were basically sequential file of records oriented. Do they have any database features, like JOIN? IDL --- Commercial. Widely advertized, seems to have a general programming language with good graphics hooks. Q: database? gnuplot ------- Freeware. Very textual, does a good job for a very limited repertoire of graphs. Nice in that it seems to be one of the few programs to allow filters to be specified when importing data, e.g. plot '< grep-ssout VariableName*100 datafile', '< grep-ssout Variable2/Variable1 datafile' IPL --- Seems to be mainly batch oriented. Semi-freeware. xmgr ---- Does the graphical "zoom in by drawing a box" thing. Doesn't do much else. Mathematica ----------- Might be able to handle stuff like this, but seems to have performance problems. Not GUI that I can see. Conclusion ========== Any help in choosing tools that I can use will be appreciated. Basically, I want something that can work out of the box, but which I can also extend. I am somewhat shy of investing in tools without good, detailed explanations or demonstrations: a) $$$ - actually, I am willing to spend typical commercial software prices, but I get somewhat fatigued with the hassle of returning software when it turns out not to do what I want. 
Hence, I will appreciate people sending me in the direction of demoware packages - ideally time limited demoware, because my experience is that crippleware which, e.g, is limited in the data set sizes it can manage, often looks good on the small data sets but then dies on the large data sets. b) personal time - it takes a long time to learn how to use many of these packages. I am somewhat reluctant to dedicate such time. --- Andy "Krazy" Glew, glew@cs.wisc.edu Place URGENT in email subject line for mail filter prioritization. DISCLAIMER: private posting, not representative of employer. {{ VLIW: the leading edge of the last generation of computer architecture }} --------------7EE2F594139DDCB6FBD5113B-- --------------9D034D89CA46DD3D0AD0D439-- From nobody Wed Feb 25 13:43:25 1998 Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!nntp.abs.net!feed2.news.erols.com!erols!newsfeeds.sol.net!uwm.edu!uwvax!news From: Andy Glew Newsgroups: comp.arch,comp.soft-sys.matlab,comp.soft-sys.mathematica,comp.soft-sys.sas,comp.soft-sys.stat.systat,comp.soft-sys.stat.spss,comp.graphics.apps.gnuplot,sci.data.formats Subject: Re: ISO Data Management, Graphing, and Visualization Tools Date: Tue, 24 Feb 1998 17:00:27 -0600 Organization: U Wisc CS (& Intel) Lines: 615 Message-ID: <34F3510A.547376D7@cs.wisc.edu> References: <34EA32BE.5CBFB5EB@cs.wisc.edu> NNTP-Posting-Host: helga.cs.wisc.edu Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: Mozilla 4.03 [en] (WinNT; I) To: Andy Glew Xref: newsfeed.cv.nrao.edu comp.arch:5204 comp.soft-sys.matlab:6672 comp.soft-sys.sas:7500 comp.soft-sys.stat.systat:1100 comp.soft-sys.stat.spss:3236 comp.graphics.apps.gnuplot:2574 sci.data.formats:239 A little while back I posted in search of Data Management, Graphing, and Data Visualization tools. 
Off and on in the past week I have been evaluating such tools, reading
the glossies, running the demos, waiting for sales and technical
support to tell me that their package does not do what I hoped it
would do...

I thought that I would share the results of this evaluation of
existing software packages with y'all.

$Header: /u/g/l/glew/work/data-exploration-tool-search/RCS/evaluation-notes,v 1.4 1998/02/24 22:54:10 glew Exp $

Evaluation of Data Management, Graphing, and Exploration Tools
==============================================================

BOTTOM LINE: there is no single answer, unfortunately.

SPSS seems to be the nicest Windows enabled data management, stats,
and graphing tool. However, it has distinct limitations when it comes
to data exploration; specifically, it can't do "mouse based zoom", or
any of the other nice things that make data exploration quicker.

DataDesk is the nicest interactive data exploration tool, has mouse
based zoom and a whole slew of other interactive features, but it has
only an extremely limited repertoire of graphs, plots, and charts.

JMP does more graphs than DataDesk, and is reasonably interactive, has
mouse based zoom, but not much more - but it cannot do the most common
graph types in my field (e.g. clustered bar graphs, let alone
clustered stacked bar graphs).

None of these tools seem to do the "advanced" graphs, such as
clustered stacked bar charts.

In an ideal world I'd buy SPSS and Datadesk, and also a good "batch
mode" package such as MATLAB to draw the really sophisticated stuff.

Requirements
============

As I go through this process, I am learning more about what I want in
a tool.

Chart Interaction

    mouse based zoom
        draw a box around some data points on the screen, and have
        axes adjusted so that box fills full screen.

        Many (most) tools do not have a mouse based zoom, but instead
        require axis limits to be typed into a dialog box, or worse.

        Sometimes it hides as a magnifying glass tool, or as lasso or
        point selection.
    select/query
        select points, and be told

Chart Types

    MUST axis types {linear,log} x, {linear,log} y

    MUST
        {scatter, lines {w/wo symbols}}
            multiple datasets
            editable colors, symbols, linestyles, etc., assumed
        bar {vertical, horizontal}
        clustered bar
        stacked bar
        stacked clustered bar

    WANT
        ability to add comments (arrows, text, etc.) to charts
        to explain features.

    NICE
        overlay arbitrary graphs on top of each other
        3D graphs - 2D array of bars, etc.
        thumbnails
        small multiples (typically of scatterplots)

    Survey: I scanned quickly over the last volume of ISCA and MICRO
    conference journals to see what sorts of graphs are in use.
    Briefly:

                                Micro97   ISCA97
        XY Scatter                    7        6
        XY line graphs               30      103
        Simple Bar Graphs             7        2
        Clustered Bar graphs         73       38
        Cluster Cluster Bar           8        0
        Stacked Bar graphs           22       13
        Clustered Stacked Bars       20        8
        3D surface plots              2        0

Data Management:

    MUST Import arbitrary textfiles in some reasonable format
         (CSV, TSV, WSV, etc.)

    WANT Import from a pipe, or through a program filter - so that I
         can write a program to extract and reformat data from the
         database of a different program (and thus avoid the problems
         of proliferating data formats, temporary files, etc.)

    NICE Import Excel. OLE automation on import.

    WANT Ability to embed comments in a file - so that I can track
         data provenance, etc.

    WANT No limit on points in dataset - e.g. I have prepared graphs
         with 200,000 points in the past. However, it is often hard to
         verify, from demoware, what the capabilities are.

Output:

    MUST: EPS (Encapsulated PostScript)
    OR
    MUST: Windows objects that can be embedded in Word
          (OLE automation).

    MUST: Finally produce PostScript viewable by Ghostview
          (for acceptance by conferences).

Products
========

Evaluated
---------

DataDesk

    Good magazine reviews

    Evaluation: crippleware demo downloaded from web

    Graph Interaction: pretty good

        MUST(OK): graphical zoom

        Seems to be the most interactive package that I have tried so
        far. Overall, has the nicest forms of interaction with the
        graph I have seen.
        Not only does it have mouse based zoom (box based), but it
        also has the ability to drag the graph around, i.e. changing
        xmin,xmax,ymin,ymax without changing the scale, and other
        interactive features.

    Graph Types: poor

        MUST(FAILS): LIMITED GRAPH TYPES

        Unfortunately, while it seems to be very interactive, it does
        not seem to have a good selection of graph/plot/chart types.
        For example, its concept of lineplots is to plot the variable
        as Y against the case number as X. I.e. it does not have XY
        lineplots. Datadesk's barcharts are restricted in a similar
        manner, and understand nothing of clustering, let alone
        stacking, etc.

        While I might purchase DataDesk just for data exploration of
        scatterplots, I definitely cannot consider it as the answer to
        my graphing needs. Particularly not at its cost (600$ or so).

    Klugey User Interface: originally MAC based, not at all natural
    feeling in the Windows environment.

SPSS

    Cost: 175$, as a "student gradpack".

    Comments:
        SPSS 7 and 8 look reasonably good on Windows - much better
        than the old interface, which carried over from mainframes.
        However, still have access to the old command language - can
        run from command line, or within a procedure automation tool -
        which is good.

        All of the SPSS products (SPSS, Sigmaplot, etc.) seem to have
        good tutorials and help.

    Database
        still SPSS sequential files ????
        unclear if can do a JOIN

    Pivoting table editor - VERY GOOD!!!

    Graphs/Charts - a reasonable variety of graphs.
        Q: can it do clustered bars, stacked bars, clustered stacked
        bars?

    BAD: doesn't seem to be able to do mouse based zooms.

SPSS/Deltagraph

    Cost: 295$

    Recently acquired company.

    http://www.spss.com/software/DeltaGraph/

    Comment: Seems to be the "PowerPoint" of PC graphing packages -
    produces sexy looking graphs, fancy backgrounds, etc. Seems to be
    recommended by many social science departments (psychology,
    demographics).
    Chart Types

        2-D Charts
            Area, Area %, Bar, Bar-Stacked, Bar-Floating,*
            Bar-Segmentation,* Bar-Stacked Segmentation,* Bubble,
            Build-Up,* Build-Up-Stacked,* Column, Column-Stacked,
            Column-Segmentation,* Column-Stacked Segmentation,*
            Column-Floating,* Column-XY,* Combination, Contour Fill,
            Contour Line, Double X axis, Double XY axes, Double Y axis,
            High-Low, High Low Open Close (Candlestick and Whisker),
            Line, Line-Filled, Line-XY, Line-Paired XY, Pictograph,
            Pie, Pie-Donut, Pie-Multiple, Pie-Stacked, Polar,
            Quality Control X bar r, Quality Control X bar s,
            Quality Control p, Quality Control n, Quality Control u,
            Quality Control c, Radar, Range, Scatter,
            Scatter-Paired XY, Scatter (with optional droplines),
            Spider, Step, Table, Ternary, Ternary %, Time Line,
            Vector-Gridded, Vector-Radius/Angle, Vector-XY,
            XYZ Contour Fill, and XYZ Contour Line.

        Text Charts
            Bullet, Organization, Table.

        3-D Charts
            Area, Column, Ribbon, Scatter, Scatterline, Surface Fill,
            Surface Line, Wireframe, True 3-D XYZ Surface Fill,
            True 3-D XYZ Surface Line.

        Statistical Charts
            Histogram, Ogive, Pareto, Box Plot, Survival.*

        Doesn't seem to have clustered stacked bars (as are a fad in
        computer architecture papers recently).

        UNKNOWN: mouse based zoom?

    Data management

        Limit: 32,000 points per data set

        MUST(OK): text files.

        Unclear if interaction with Excel, etc., is by producing
        files, or by OLE automation.

SAS + SAS/GRAPH

    Cost: no student price
          150$ departmental
          >1000$ industrial
          very complicated and particular licencing - even though
          Intel has a licence, I can't install it on my office machine
          because I can't access the licence server

    Comments: Very clunky mainframe like interface. Lots of typing in
    old command language. Poor tutorial and help.

    Charts:

        MUST (FAILS): Limited types, it seems.
        They even use line printer examples!!!!!
        MUST (FAILS) (Supposed to be OK, not verified):
        Mouse Based Zoom: I tried to find it and could not, although
        later SAS/GRAPH technical support said it is in
        "Edit/Graphics/Magnify Tool".
        => lousy help and tutorials if I couldn't find this!!!

SAS/JMP

    Recently acquired company => SAS will probably absorb its function
    into main SAS.

    Evaluation Demo: crippleware

    Poor help, no tutorial to speak of.

    Chart Interaction:

        MUST (OK): has mouse based zoom (magnify tool)
        (BUG: selecting magnify tool from right click menu is broken;
        only works when selected from toolbar)

        Nice GUI for spinning 3D

    Chart:

        MUST (FAILS): Limited graph types.
        MISSING clustered bars, stacked bars, etc.
        Has scattergraphs - are in help file - but I could not find
        how to make them.

XMGR

    Cost: freeware

    Chart Interaction:
        MUST (OK): mouse based zooming

    Chart:
        MUST (FAILS): Limited graph types
        MISSING clustered bars, etc.
        (so write them yourself...)

DEVise

    University of Wisconsin project

    Cost: freeware?

    Written in TCL/TK, hence supposedly extensible.

    Limited functionality SQL database underneath it.
    (will be extended to Oracle?)

    Chart Interaction:
        MUST (FAILS): no mouse based zoom.

    Chart:
        MUST (FAILS): Limited graph types.
        Very poor quality graphs.

Sigmaplot

    From SPSS

    Demo online

    NICE!!!!: Tufte macro package

    Evaluation: demo online
        very badly behaved demo: requires admin to install, etc.
        good help and tutorial (true for all SPSS products)

    Data Management
        NICE (OK): OLE automation
        MUST (OK): EPS output (GIFF, etc.)
        Limits: 16K columns by 64K rows

    Chart Interaction
        NICE (FAILS): doesn't seem to have "click on a point to
        query" - i.e. doesn't highlight that point in dataset (at
        least not trivially)

        MUST (FAILS): doesn't have "mouse based zoom".
        Actually, has "mouse based magnify", but it just magnifies,
        doesn't redraw axes.

    Chart types:
        lots of types, including (more on web page):
        note that it has grouped bar and stacked bar
        Q: does it have grouped stacked bar?
        2D
            Scatter - 14 types
            Line - 4 types
            Scatter and Line - 10 types
            Step - 8 types
            Vertical Bar - 2 types
            Horizontal Bar - 2 types
            Vertical, Grouped Bar - 2 types
            Horizontal, Grouped Bar - 2 types
            Vertical, Stacked Bar
            Horizontal, Stacked Bar
            Box - 2 types
            Polar - 3 types
            Histograms - 6 types
            Ternary - 3 types
            Time-Series
            Bubble
            Pie
            Control Charts
            Needle
            Quadrant
            Population

        3D
            Multiple, intersecting plots with hidden line removal,
            smooth or discrete shading, transparent or opaque fills,
            and light source shading
            3D rotation
            Perspective preview
            Scatter
            Bar
            3D line - trajectory and waterfall
            Mesh
            Contour

S-Plus

    From Mathsoft

    Cost: 500$

    Checked out web page.

    Extensible embedded language. Website with lots of S-plus
    libraries.

    MUST(Fails?): mouse based zoom
        As is usual from the silly demos online, cannot tell if it
        has mouse based zoom.

    MUST(OK): clustered bars, scatter

    Mathsoft also sells:
        Axum - 200$ - graphing
        Mathcad - 130$ - misc stuff

Axum

    From Mathsoft

    Cost: 200$

    MUST(Fails?): mouse based zoom
        As is usual from the silly demos online, cannot tell if it
        has mouse based zoom. My guess is not.

    MUST(OK): clustered bars

    Basically cannot evaluate it from its demos or literature

Statistica

    Evaluated: Statistica 5.1 demo

    MUST(FAILS?): no mouse based zoom
        Actually has a mouse based magnify, but it doesn't rescale or
        reaxis - instead it is just a visual magnify

    Overall feels very cluttered - not a good UI.

    Competent techsupport, though

Systat

    Cost: 185$ gradpak.

    MUST(OK-50%): has mouse based zoom
        Specifically, can lasso or otherwise select points, and replot
        so that only those points are visible. Unfortunately, the
        selection tools are available only for scatterplots.

        Wasn't able to find out how to unselect data without reloading
        the whole bloody data set. (The demo says "Be careful to
        unselect, or else you will be restricted to the selected
        points.")

        Looks possible, but looks like it will have some kluges that
        make it awkward to use.
Statview

    From SPSS

    Demo in the mail

    MUST(Fails): does not have mouse based zoom
    (reported by SPSS sales)

Excel

    Microsoft

    Although everybody tells me "you can do that in Excel", I am
    reasonably certain that you cannot.

    MUST(FAILS): no mouse based zoom

    Reasonable selection of graph types.

    Can only handle 32K data points.

dataplot

    http://www.itl.nist.gov/div898/software/dataplot

    cost: freeware?

    Recommended by net.folk

    command driven.
    somewhat powerful language
    doesn't seem to have mouse interface
    F77
    lots of nice graphs
    SEMATECH standard

Scilab

    INRIA/France

    Cost: freeware.

    Web page: http://www-rocq.inria.fr/scilab/

    Seems to be command driven (based on reading the web page).
    Seems to be comparable to MATLAB in abilities.

To be evaluated
---------------

Minitab

    Chart Interaction
        MUST (FAILS?): no mouse based zoom.

    Charts:
        MUST (OK): lots of graph types, language easily allows
        customizing, lots of examples in tutorial

MATLAB

    Data Management:
        MUST (FAILS?): Only textual import seems to be M-files, which
        are trivially related to ascii text files, but still not the
        same.

    See: Octave

Octave:

    MATLAB free clone?

IPL

    Cost: freeware?

    Charts:
        MUST (OK): lots of graph types

    Chart interaction:
        MUST (FAILS): batch mode only

IDL

    A Fortran like language advertised in IEEE magazines as producing
    lots of neat graphs.

xplot

    obsolete???

Mathematica

    Barely possible.
    Net.folk say difficult to explore data with.
    Good internal language - anything possible.
    Slow for large datasets?

Statsoft

HiQ

    Demo in the mail

S

    old stat package (from AT&T, if I remember correctly)
    haven't seen anyone using it.
    see: R

R

    S free clone

Mineset

    vendor: SGI
    Highly recommended by net.folk
    "could do complicated data exploration in only five clicks"
    cost: 20,000$
    host: SGI

JGraph

    unclear what this is. net.searching yields several shareware Java
    Graph tool references.

Other
-----

MySQL

    http://www.tcx.se
    cost: freeware?
    database

MiniSQL

    cost: freeware?
database From nobody Wed Feb 25 16:13:11 1998 Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!peerfeed.ncal.verio.net!news.ncal.verio.com!mlyle From: mlyle@wco.com (Michael Lyle) Newsgroups: comp.arch,comp.soft-sys.matlab,comp.soft-sys.mathematica,comp.soft-sys.sas,comp.soft-sys.stat.systat,comp.soft-sys.stat.spss,comp.graphics.apps.gnuplot,sci.data.formats Subject: Re: ISO Data Management, Graphing, and Visualization Tools Date: Wed, 25 Feb 1998 08:35:13 -0800 Organization: Flex Products, Inc. Lines: 32 Message-ID: References: <34EA32BE.5CBFB5EB@cs.wisc.edu> <34F3510A.547376D7@cs.wisc.edu> NNTP-Posting-Host: venus37.wco.com Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Newsreader: Yet Another NewsWatcher 2.4.0 Xref: newsfeed.cv.nrao.edu comp.arch:5232 comp.soft-sys.matlab:6711 comp.soft-sys.sas:7533 comp.soft-sys.stat.systat:1105 comp.soft-sys.stat.spss:3246 comp.graphics.apps.gnuplot:2584 sci.data.formats:240 In article <34F3510A.547376D7@cs.wisc.edu>, Andy Glew wrote: > A little while back I posted in search of Data Management, Graphing, and Data Visualization tools. > > Off and on in the past week I have been evaluating such tools, reading the glossies, > running the demos, waiting for sales and technical support to tell me that their package > does not do what I hoped it would do... > > I thought that I would share the results of this evaluation of existing software packages > with y'all. If you decide to roll your own, the X/Motif based XRT graph widget (and all his cousins in the PDS suite) is a great starting point. It meets most of your criteria for the graphing part, and is not performance and size limited. You still have to do the statistics, but you could probably interface your own stuff to one of the standard stat packages. Using X-Windows is no easy task, but you get exactly what you want, and in an environment that is reliable and portable. 
www.klg.com

I have no affiliation with klg, just a satisfied customer. However, the
one or two times I had to use their tech support, they were very
unresponsive. YMMV.

-- Michael

Michael Lyle (mlyle@wco.com)

From nobody Wed Feb 25 16:13:31 1998
Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!newsfeed.internetmci.com!4.1.16.34!cpk-news-hub1.bbnplanet.com!news.bbnplanet.com!newsfeeds.sol.net!uwm.edu!uwvax!news
From: Andy Glew
Newsgroups: comp.arch,comp.soft-sys.matlab,comp.soft-sys.mathematica,comp.soft-sys.sas,comp.soft-sys.stat.systat,comp.soft-sys.stat.spss,comp.graphics.apps.gnuplot,sci.data.formats
Subject: Re: ISO Data Management, Graphing, and Visualization Tools
Date: Wed, 25 Feb 1998 12:15:48 -0600
Organization: U Wisc CS (& Intel)
Lines: 70
Message-ID: <34F45FD4.BF4E4161@cs.wisc.edu>
References: <34EA32BE.5CBFB5EB@cs.wisc.edu> <34F3510A.547376D7@cs.wisc.edu>
NNTP-Posting-Host: helga.cs.wisc.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.03 [en] (WinNT; I)
To: Andy Glew
Xref: newsfeed.cv.nrao.edu comp.arch:5242 comp.soft-sys.matlab:6718 comp.soft-sys.sas:7541 comp.soft-sys.stat.systat:1106 comp.soft-sys.stat.spss:3248 comp.graphics.apps.gnuplot:2590 sci.data.formats:241

Life is like this... Just when I think that I have finished my (vain)
search for data management, graphing, and exploration tools, a new one
comes to my attention: first from somebody responding to my "final"
post, and then in the form of an ad in Scientific Computing and
Automation magazine.

I just downloaded the demo for Origin 5.0 from www.microcal.com.
It passes all of the tests that I can perform on the demo with flying
colours.

My "bottom line" section of my notes now reads:

    Wed Feb 25 1998: NEW BOTTOM LINE:

    Origin 5.0 seems to be the winner!

    It has the chart interaction features I want, e.g. mouse based zoom.
    It has most of the chart types I want built in.
    Its extension language seems to be able to build the chart types
    that are missing - e.g. clustered bar - at least, such chart types
    appear in their gallery of examples.
    It integrates well with Excel and Windows NT.

    Origin 5.0 Professional appears to allow extensions, such as
    readers for new input file formats. It is not clear, however,
    whether it is flexible enough to allow filters to be specified.

    Further evaluation will have to come from actually using it.

So I'm going to buy it... (damn, it's expensive!).

Andy Glew wrote:

> A little while back I posted in search of Data Management, Graphing, and Data Visualization tools.
>
> Off and on in the past week I have been evaluating such tools, reading the glossies,
> running the demos, waiting for sales and technical support to tell me that their package
> does not do what I hoped it would do...
>
> I thought that I would share the results of this evaluation of existing software packages
> with y'all.
>
> $Header: /u/g/l/glew/work/data-exploration-tool-search/RCS/evaluation-notes,v 1.4 1998/02/24 22:54:10 glew Exp $
>
> Evaluation of Data Management, Graphing, and Exploration Tools
> ==============================================================
>
> BOTTOM LINE: there is no single answer, unfortunately.
>
> SPSS seems to be the nicest Windows enabled data management, stats,
> and graphing tool. However, it has distinct limitations when it comes
> to data exploration; specifically, it can't do "mouse based zoom", or
> any of the other nice things that make data exploration quicker.
>
> DataDesk is the nicest interactive data exploration tool, has mouse
> based zoom and a whole slew of other interactive features, but it has
> only an extremely limited repertoire of graphs, plots, and charts.
>
> JMP does more graphs than DataDesk, and is reasonably interactive, has
> mouse based zoom, but not much more - but it cannot do the most common
> graph types in my field (e.g. clustered bar graphs, let alone
> clustered stacked bar graphs).
>
> None of these tools seems to do the "advanced" graphs, such as
> clustered stacked bar charts.
>
> In an ideal world I'd buy SPSS and DataDesk, and also a good "batch
> mode" package such as MATLAB to draw the really sophisticated stuff.