From nobody Wed Feb 18 09:54:24 1998
Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!newsfeed.internetmci.com!192.52.106.6!ncar!csn!nntp-xfer-1.csn.net!news-2.csn.net!not-for-mail
From: jbeal@jbeal.org (Jeremy Beal)
Newsgroups: sci.data.formats
Subject: Re: Lots of File Formats for free
Date: Wed, 18 Feb 1998 00:38:37 GMT
Organization: SuperNet Inc. +1.303.285.0194 Denver Colorado
Lines: 17
Message-ID: <34ea2d2f.2854007015@news-2.sni.net>
References: <6c6gio$9tm$1@news.netvision.net.il>
Reply-To: www.nvmedia.com/~`jbeal
NNTP-Posting-Host: 198.233.40.11
X-Newsreader: Forte Free Agent 1.1/32.230
Xref: newsfeed.cv.nrao.edu sci.data.formats:233

On Sun, 15 Feb 1998 12:37:45 +0200, "Tsahi Carmona" <tsahi@iris.co.il> wrote:

>I've got lots of file formats to share with anyone who wishes to get that
>information:
>Multimedia (JPEG, GIF, BMP, AVI, WAV...)
>Compresetions (LZEXE, ARJ, ZIP...)
>And more...
>

There is a site at http://wotsit.simsware.com/ which is dedicated to collecting
file formats. (Not necessarily scientific.) I'm sure they would appreciate any
information...


Jeremy Beal
Get my e-mail address at www.nvmedia.com/jbeal
(Tired of the damn spam)

From nobody Thu Feb 19 09:33:31 1998
Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!news-peer.gip.net!news.gsl.net!gip.net!newspump.sol.net!sol.net!uwm.edu!uwvax!news
From: Andy Glew <glew@cs.wisc.edu>
Newsgroups: comp.databases,sci.engr.mecha,uwisc.misc,sci.data.formats,comp.data.administration,sci.econ,sci.op-research
Subject: [Fwd: ISO Data Management, Graphing, and Visualization Tools]
Date: Wed, 18 Feb 1998 17:03:25 -0600
Organization: U Wisc CS (& Intel)
Lines: 530
Message-ID: <34EB68BD.6E7E7E2C@cs.wisc.edu>
NNTP-Posting-Host: helga.cs.wisc.edu
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------9D034D89CA46DD3D0AD0D439"
X-Mailer: Mozilla 4.03 [en] (WinNT; I)
Xref: newsfeed.cv.nrao.edu comp.databases:6499 sci.data.formats:235 comp.data.administration:466 sci.econ:11545 sci.op-research:1197

This is a multi-part message in MIME format.
--------------9D034D89CA46DD3D0AD0D439
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Brief
====

I seek information or recommendation for data management, graphing,
exploration and visualization tools.

Particularly hoping for

++ Graphical zoom - e.g. on a scatterplot, draw a box around points
    with mouse, and have axes re-offset and rescaled so that the
    box is fullscreen.

++ Ability to specify filter programs to be run on imported data,
    instead of reading it from a file (so that said filters can do the
    file format conversion task).

++ Database for results.

Please respond by email to glew@cs.wisc.edu

Detail
=====

Newsgroups
--------------

Friends suggested that I post the attached request, which I had sent to
my domain newsgroup, comp.arch, as well as some application software
newsgroups, to newsgroups where readers are more likely to have experience
of what I want:

sci.engr.mecha
    other engineering fields, because engineering fields other than CS
    (I'm an EE, so I dis CS) have more experience in doing good
    experimental work than CS.

sci.econ
sci.op-research
    because other fields, such as economics, may be doing this
    (SOMEBODY has to be buying SAS and SPSS - certainly not CS
    people)

sci.med
    because a lot of the examples from these software packages are
    medical

etc. etc.

Please respond by email to glew@cs.wisc.edu, since these newsgroups
are outside my reading list.



Status
-------

The attachment describes what I am looking for in detail.
Don't be scared off by the wishlist - I'll happily settle for
something like gnuplot with graphical zoom, because I can
couple its filters to an external database. If you don't ask,
you won't get...


Since I posted last night, I have received several emails
and located a software showroom where I tried SAS,
SPSS, and Minitab demos and/or tutorials.

Summary:

SAS:
    It may be the best seller, but SAS is an antiquated dinosaur
    - mainframe, textual command oriented, lousy graphs.
    I just say no.

SPSS:
    Reasonably good GUI,
    as well as a reasonably good command language.

    [Very GOOD]: pivoting table editor

    [BAD]: no graphical mouse zoom - to change axes on a
        scatterplot, you have to type in limits.

    Unclear if database JOINs are possible in its command language.

SAS/JMP:
    SAS Institute's graphical, GUI, interactive beast.
    Unfortunately, their webpage is broken, so I haven't
    been able to try the demo.

SAS/StatVIEW:
    A company recently acquired by SAS.
    Looks good, nicely graphical.
    Unclear if it has graphical zoom.

Minitab:
    Slightly better than SAS, but still antiquated;
    more limited in analysis abilities.

Other comments are in attachments.


--------------9D034D89CA46DD3D0AD0D439
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Path: uwvax!news
From: Andy Glew <glew@cs.wisc.edu>
Newsgroups: comp.arch
Subject: ISO Data Management, Graphing, and Visualization Tools
Date: Tue, 17 Feb 1998 19:00:46 -0600
Organization: U Wisc CS (& Intel)
Message-ID: <34EA32BE.5CBFB5EB@cs.wisc.edu>
NNTP-Posting-Host: helga.cs.wisc.edu
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------7EE2F594139DDCB6FBD5113B"
X-Mailer: Mozilla 4.03 [en] (WinNT; I)

This is a multi-part message in MIME format.
--------------7EE2F594139DDCB6FBD5113B
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

This seems not to have posted from my UNIX system,
so I will fall back to posting from Netscape.

See attachment for questions about data management, graphing, and visualization tools.


--------------7EE2F594139DDCB6FBD5113B
Content-Type: text/plain; charset=us-ascii; name="k.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="k.txt"

Sender: glew@balder
Newsgroups: comp.arch,comp.soft-sys.matlab,comp.soft-sys.math.mathematica,comp.soft-sys.sas,comp.soft-sys.stat.spss,comp.soft-sys.stat.systat,comp.graphics.visualization,comp.graphics.apps.gnuplot
Subject: ISO data management, graphing, and visualization
From: Andrew Glew <glew@cs.wisc.edu>
Organization: CS Department, University of Wisconsin
Lines: 378
X-Newsreader: Gnus v5.3/Emacs 19.34
Date: 17 Feb 1998 18:59:25 -0600
Message-ID: <pg2d8gleniq.fsf@balder.i-have-a-misconfigured-system-so-shoot-me>


Brief
=====

I seek recommendations about software packages to manage experimental data,
and to prepare graphs and visualizations.

Short list of wished-for features:

a) underlying database - i.e. I would like to do
    joins of differing datasets

b) decent graphs: e.g. log axes, different forms of graph

c) interactive data browsing: e.g. click on a point, jump to related
    variable, drag box to control zooming in, etc.
    (but I also want good batch mode graph control)

Justification for Posting
=========================

I have probably posted to enough newsgroups to set off some spam filters.
Here's why I chose these newsgroups:

comp.arch
    My application domain is the field of computer architecture

comp.soft-sys.matlab
comp.soft-sys.math.mathematica
comp.soft-sys.sas
comp.soft-sys.stat.spss
comp.soft-sys.stat.systat
comp.graphics.apps.gnuplot
    Some applications that I know are in this field
    - I seek advice to help me choose between these, and others

comp.graphics.visualization
    The generic field of visualization


Detail
======

Context - Improving My Personal BKMs
------------------------------------

comp.arch readers may be aware that I returned to school to finish my
Ph.D.  after having worked in industry for many years.

I am finally beginning to collect experimental data for my research,
and I would like to have a system to manage the data.

I would like to improve my personal "Best Known Methods" for this task
of data management and visualization. I.e. I would like to find better
tools to do this than the Perl scripts and GNUplot that I used in my
MS 8 years ago, or the not-much-better technology that I have used in
industry.

In particular, in my last industrial stint "Excel" was considered the
state of the art for preparing graphs. I consider this distinctly
unsatisfactory - especially since Excel was extremely slow in
handling large datasets (upwards of 64000 data points), which I have
always been able to do in GNUplot.  Furthermore, I consider the injunction
"reduce your data set" not always desirable.

I have been playing with computer performance data for more than 15
years now.  Surely the state of the art has improved somewhat?

Types of Data
-------------

The datasets that I wish to manage range from

Profiles:
    2-tuples of (address,count)
    for many of the locations in a program
    - often giving 100s of thousands of data points,
    which I wish to quickly scroll around in,
    looking at coarse scale (related to the precision of my screen)
    so that I can zoom in and out.

Simulator Results:
    Typical measurements such as IPC (Instructions per Clock)
    and estimated run time for particular benchmarks.
    100 or so parameters or measurements per benchmark.

Traces:
    Ideally, I believe that the approach I describe here (that of a database)
    could be extended to traces of events such as start,execute,retirement
    for instructions in a program => millions of measurements,
    limited mainly by disk space.
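
The coarse-scale view described under Profiles above can be sketched as
follows. This is only an illustration, assuming integer addresses and a
plot area a fixed number of pixels wide; the function name bin_profile
and its parameters are hypothetical, not taken from any of the tools
discussed here. Each pixel column keeps the minimum and maximum count
that falls into it, so spikes stay visible at any zoom level.

```python
def bin_profile(points, lo, hi, pixels):
    """Reduce (address, count) pairs in [lo, hi) to one (min, max) per pixel column.

    points: iterable of (address, count) with integer addresses
    lo, hi: visible address range, hi exclusive
    pixels: width of the plot area in screen pixels
    """
    columns = {}
    for address, count in points:
        if not (lo <= address < hi):
            continue  # outside the current zoom window
        # Integer arithmetic keeps the column index strictly below `pixels`.
        col = (address - lo) * pixels // (hi - lo)
        low, high = columns.get(col, (count, count))
        columns[col] = (min(low, count), max(high, count))
    return columns
```

Zooming in or out is then just another call with a narrower or wider
[lo, hi) window; the work per redraw is one pass over the data, and the
number of things drawn never exceeds the number of pixel columns.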

I should note that version control of measurements is an issue
- i.e. my measurements are almost never just streams of numbers,
but are, at the minimum, streams of numbers augmented with a version,
a few dates, and a comment description, which I wish to be an intrinsic
part of the data.

The Simulator Result measurements are often ad-hoc - in that new metrics
are created on a daily basis, and discarded as readily. Thus tools
that make it painful to create new metric types - e.g. by requiring
a database schema to be created by hand - are undesirable.

Similarly, measurements are sparse.
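
A minimal sketch of what "the schema is in the data" could look like,
assuming each measurement arrives as a set of named fields. The helper
names (field_names, to_table) are hypothetical. The column list is
simply the union of the field names seen so far, so a new metric needs
no hand-written schema, and a metric missing from a record shows up as
None rather than as a made-up number.

```python
def field_names(records):
    """Return the union of field names across records, in first-seen order."""
    names = []
    for record in records:
        for name in record:
            if name not in names:
                names.append(name)
    return names


def to_table(records):
    """Turn a list of sparse dict records into (column names, rows).

    Fields absent from a record become None in that row.
    """
    names = field_names(records)
    rows = [[record.get(name) for name in names] for record in records]
    return names, rows
```

The version, dates and comment mentioned above could travel the same
way, as ordinary fields on each record or on an enclosing record for the
whole run.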


Wish List - Database
--------------------

It has long seemed to me that many of the problems of data
manipulation could be handled by a Relational Database Management
system.

For example, if you have (in a very schematic notation)

    Experiment = {
	name = "Baseline",
	results = {
	    {	benchmark = spec95,126.gcc,I=test, ipc = 45 }
	    {	benchmark = spec95,126.gcc,I=ref, ipc = 22 }
	    {	benchmark = spec95,129.compress,I=test, ipc = 988 }
	    ...
	}
    }
    Experiment = {
	name = "LatestGreatIdea",
	results = {
	    {	benchmark = spec95,126.gcc,I=test, ipc = 450 }
	    {	benchmark = spec95,126.gcc,I=ref, ipc = 212 }
	    {	benchmark = spec95,129.compress,I=test, ipc = 98 }
	    ...
	}
    }

then something like an SQL join should be able to perform the comparison.
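
As an illustration only, here is how that comparison might look with
the sqlite3 module from the Python standard library. The table layout
(one row per experiment, benchmark and input set) and the column names
are assumptions made for this sketch; the numbers are the ones from the
schematic example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results "
    "(experiment TEXT, benchmark TEXT, input_set TEXT, ipc REAL)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        ("Baseline", "126.gcc", "test", 45),
        ("Baseline", "126.gcc", "ref", 22),
        ("Baseline", "129.compress", "test", 988),
        ("LatestGreatIdea", "126.gcc", "test", 450),
        ("LatestGreatIdea", "126.gcc", "ref", 212),
        ("LatestGreatIdea", "129.compress", "test", 98),
    ],
)

# Self-join: pair each Baseline row with the matching LatestGreatIdea row.
query = """
SELECT b.benchmark, b.input_set,
       b.ipc AS baseline_ipc,
       n.ipc AS new_ipc,
       n.ipc / b.ipc AS ratio
FROM results AS b
JOIN results AS n
  ON n.benchmark = b.benchmark AND n.input_set = b.input_set
WHERE b.experiment = 'Baseline' AND n.experiment = 'LatestGreatIdea'
ORDER BY b.benchmark, b.input_set
"""
comparison = conn.execute(query).fetchall()
```

Each row of comparison then holds one benchmark/input pair with both
IPC values side by side and their ratio.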

Straightforward attempts to put this into an RDBMS usually
founder on

a) the extreme slowness of RDBMS
b) cost (perhaps the MiniSQL freeware will make this okay)
c) the painful contortions that most RDBMS require in handling
    inconsistent data - i.e. I don't want to define a schema,
    my schema is in my datafiles (i.e. the tuple field names)

Nonetheless, I have hope that an RDBMS, or perhaps an OODBMS, or even
an ORDBMS, may be okay.


Wish List - Graphing
--------------------

Of course, I wish to be able to produce any graph that I can find in
the computer architecture literature, and/or those that I am familiar
with from other fields...

Wish List - GUI *and* Batch
---------------------------

I would like to be able to interactively walk into and out of my data,
drawing boxes to zoom, etc.

Tools like GNUplot seem to fall down in this regard --- it is a real
pain to have to type "set xrange [4:1000]", as opposed to using the mouse.

Wish List - Automation
----------------------

There seem to be a number of interactive tools around.
Each, of course, insists on slightly different input formats.

It is straightforward to write Perl scripts to do the conversion,
but...  it would be *really* *nice* if those scripts could be wired
into the interface, so that, e.g., MATLAB could automatically use my
.SSOUT to .MAT converter when I try to read a .SSOUT file.

I.e. I don't want to automatically create every possible datafile
format.  Data management nightmare.  Rather, I want to convert as
needed.
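
The "convert as needed" idea can be sketched as a small registry that
maps a file extension to a filter command and runs that command only
when such a file is actually opened. This is a sketch under stated
assumptions, not the interface of MATLAB or any other tool: the names
CONVERTERS, register and open_converted are hypothetical, and the filter
is assumed to take the file name as its last argument and write the
converted text to standard output.

```python
import os
import subprocess
import sys

CONVERTERS = {}  # hypothetical registry: lower-case extension -> command list


def register(extension, command):
    """Associate a filter command (a list of arguments) with an extension."""
    CONVERTERS[extension.lower()] = list(command)


def open_converted(path):
    """Return the text of `path`, piped through its registered filter if any.

    No converted copy is written to disk; the filter runs on demand.
    """
    extension = os.path.splitext(path)[1].lower()
    command = CONVERTERS.get(extension)
    if command is None:
        with open(path) as handle:
            return handle.read()
    result = subprocess.run(
        command + [path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

A Perl converter would be registered with something like
register(".ssout", ["perl", "ssout2mat.pl"]), where the script name is
again only a placeholder.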

Wish List - Good Built In Handling of Missing Data
--------------------------------------------------

As mentioned above, my data is often sparse - many metrics are not
available in all configurations.  Math systems that permit
e.g. calculations of ratios and which handle missing data nicely are
highly desirable.

I.e. can it give "NA" (Not Available) as an answer?  Giving spurious
values such as 0 (Perl's default lacking programming) is highly
undesirable and misleading.

Note, especially, that you do not want simply to join on records
where all fields are defined - explicitly listing undefined is very
useful.
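
A minimal sketch of the behaviour I mean, using None to stand for NA.
The function names are hypothetical. A ratio with a missing operand
yields NA instead of a spurious 0, and the join keeps keys that appear
on only one side, with the undefined value listed explicitly.

```python
NA = None  # stands for "Not Available"


def ratio(numerator, denominator):
    """Return numerator / denominator, or NA if the result is undefined."""
    if numerator is NA or denominator is NA or denominator == 0:
        return NA
    return numerator / denominator


def outer_join(left, right):
    """Join two {key: value} maps on key, keeping keys found on one side only.

    Returns a sorted list of (key, left value or NA, right value or NA).
    """
    keys = sorted(set(left) | set(right))
    return [(key, left.get(key, NA), right.get(key, NA)) for key in keys]
```

For example, joining a baseline that has both 126.gcc and 129.compress
against a run that only has 126.gcc gives one complete row and one row
whose second value, and hence whose ratio, is NA.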





Candidates
----------

The list of candidate tools for this that I am aware of includes

    DEVise
    Datadesk
    SPSS
    Sigmaplot
    SAS
    Statsoft
    SYSTAT
    JMP
    IDL
    IPL
    gnuplot
    xmgr
    MATLAB
    Mathematica

I am not, however, familiar with all of these - I do not feel that I
know enough to make a good decision. Hence this post, inviting
recommendations from others. (Please reply to me personally by mail -
I leave it up to you to decide if you want to reply to the newsgroups
as well.)

I don't know much, but I have gathered some random impressions and
information about these tools that I will summarize here - hoping that
others may correct me.

DEVise
------

"Database Exploration and Visualization".

Academic work in progress, from the University of Wisconsin.

Contains an underlying database - apparently ad-hoc coded, although there
are plans to interface to a standard commercial database.  Can use an SQL
subset.

Primitive graphs - doesn't even have logarithmic axes.

Limited GUI querying - click on a graphical object, see the data.
Most data exploration (e.g. range setting) seems to have to be done
via menus.

TCL/TK based => extensible? (in the sense that you can extend any
source code that you can read).

Datadesk
--------

Commercial. 

Seems to provide the most interactive data browsing:
twirling axes in 3D, zooming in and out via boxes, etc.

Linked variable windows.

Datastructures seem primitive - straight vectors.
Q: does it support structures?

MATLAB
------

Commercial.

Matrix based, but supports matrices of matrices,
and matrices with named tuple fields.

Good internal programming language. Many packages.

Theoretically database type JOINs could be written, but
don't seem to exist in standard set of add-ons.
Q: is there a database add-on?

Theoretically good interactive graph browsing could be written,
but standard graphing tools seem to be command/text based.

(Part of me seems to think that what I want is SQL for MATLAB,
with interactive graph browsing.)

SPSS, SAS
---------

Commercial.

I have used SPSS (and SAS) extensively back in the mainframe days, so
I feel confident of their computing abilities. (I am considerably less
confident of the computing abilities of many of the newer, graphics
oriented, tools.)

I am less familiar with the GUI interfaces that have been added in the
last decade. Q: do they provide the interactive features, such as drag
a box to zoom?

Q: in the old days SAS and SPSS were basically sequential file of
records oriented.  Do they have any database features, like JOIN?


IDL
---

Commercial.

Widely advertised, seems to have a general programming language
with good graphics hooks. 

Q: database?

gnuplot
-------

Freeware.

Very textual, does a good job for a very limited repertoire of graphs.

Nice in that it seems to be one of the few programs to allow filters
to be specified when importing data, e.g.

plot '< grep-ssout VariableName*100 datafile', '< grep-ssout Variable2/Variable1 datafile'

IPL
---

Seems to be mainly batch oriented.
Semi-freeware.


xmgr
----

Does the graphical "zoom in by drawing a box" thing.
Doesn't do much else.

Mathematica
-----------

Might be able to handle stuff like this, but seems to have performance
problems.  Not GUI that I can see.


Conclusion
==========

Any help in choosing tools that I can use will be appreciated.

Basically, I want something that can work out of the box, but which I
can also extend.

I am somewhat shy of investing in tools without good, detailed
explanations or demonstrations:
    a) $$$ - actually, I am willing to spend typical commercial
software prices, but I get somewhat fatigued with the hassle of
returning software when it turns out not to do what I want.  Hence, I
will appreciate people sending me in the direction of demoware
packages - ideally time limited demoware, because my experience is
that crippleware which, e.g, is limited in the data set sizes it can
manage, often looks good on the small data sets but then dies on the
large data sets.
    b) personal time - it takes a long time to learn how to use many
of these packages. I am somewhat reluctant to dedicate such time.






---
Andy "Krazy" Glew, glew@cs.wisc.edu
Place URGENT in email subject line for mail filter prioritization.
DISCLAIMER: private posting, not representative of employer.
{{ VLIW: the leading edge of the last generation of computer architecture }}


--------------7EE2F594139DDCB6FBD5113B--


--------------9D034D89CA46DD3D0AD0D439--

From nobody Wed Feb 25 13:43:25 1998
Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!nntp.abs.net!feed2.news.erols.com!erols!newsfeeds.sol.net!uwm.edu!uwvax!news
From: Andy Glew <glew@cs.wisc.edu>
Newsgroups: comp.arch,comp.soft-sys.matlab,comp.soft-sys.mathematica,comp.soft-sys.sas,comp.soft-sys.stat.systat,comp.soft-sys.stat.spss,comp.graphics.apps.gnuplot,sci.data.formats
Subject: Re: ISO Data Management, Graphing, and Visualization Tools
Date: Tue, 24 Feb 1998 17:00:27 -0600
Organization: U Wisc CS (& Intel)
Lines: 615
Message-ID: <34F3510A.547376D7@cs.wisc.edu>
References: <34EA32BE.5CBFB5EB@cs.wisc.edu>
NNTP-Posting-Host: helga.cs.wisc.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.03 [en] (WinNT; I)
To: Andy Glew <glew@cs.wisc.edu>
Xref: newsfeed.cv.nrao.edu comp.arch:5204 comp.soft-sys.matlab:6672 comp.soft-sys.sas:7500 comp.soft-sys.stat.systat:1100 comp.soft-sys.stat.spss:3236 comp.graphics.apps.gnuplot:2574 sci.data.formats:239

A little while back I posted in search of Data Management, Graphing, and Data Visualization tools.

Off and on in the past week I have been evaluating such tools, reading the glossies,
running the demos, waiting for sales and technical support to tell me that their package
does not do what I hoped it would do...

I thought that I would share the results of this evaluation of existing software packages
with y'all.


$Header: /u/g/l/glew/work/data-exploration-tool-search/RCS/evaluation-notes,v 1.4 1998/02/24 22:54:10 glew Exp $


Evaluation of Data Management, Graphing, and Exploration Tools
==============================================================

BOTTOM LINE: there is no single answer, unfortunately.

SPSS seems to be the nicest Windows enabled data management, stats,
and graphing tool. However, it has distinct limitations when it comes
to data exploration; specifically, it can't do "mouse based zoom", or
any of the other nice things that make data exploration quicker.

DataDesk is the nicest interactive data exploration tool, has mouse
based zoom and a whole slew of other interactive features, but it has
only an extremely limited repertoire of graphs, plots, and charts.

JMP does more graphs than DataDesk and is reasonably interactive (it has
mouse based zoom, but not much more), but it cannot do the most common
graph types in my field (e.g. clustered bar graphs, let alone
clustered stacked bar graphs).

None of these tools seem to do the "advanced" graphs, such as
clustered stacked bar charts.

In an ideal world I'd buy SPSS and Datadesk, and also a good "batch
mode" package such as MATLAB to draw the really sophisticated stuff.




Requirements
============

    As I go through this process, I am learning more about what
    I want in a tool.



    Chart Interaction

 mouse based zoom
     draw a box around some data points on the screen,
     and have axes adjusted so that box fills full screen.
  Many (most) tools do not have a mouse based zoom,
     but instead require axis limits to be typed
     into a dialog box, or worse.
  Occasionally hides as a magnifying glass tool,
     or as lasso or point selection.



 select/query
     select points, and be told



    Chart Types
 MUST
     axis types
  {linear,log} x, {linear,log} y

 MUST

     {scatter, lines {w/wo symbols}} multiple datasets
  editable colors, symbols, linestyles, etc., assumed

     bar {vertical, horizontal}

  clustered bar
  stacked bar
  stacked clustered bar

 WANT
     ability to add comments (arrows, text, etc.) to charts
     to explain features.

 NICE
     overlay arbitrary graphs on top of each other

     3D graphs - 2D array of bars, etc.

     thumbnails

     small multiples
  (typically of scatterplots)


 Survey: I scanned quickly over the last volume of ISCA and MICRO
     conference journals to see what sorts of graphs are in use.
     Briefly:

     Graph type                Micro97    ISCA97
     XY Scatter                      7         6
     XY line graphs                 30       103
     Simple Bar Graphs               7         2
     Clustered Bar graphs           73        38
     Cluster Cluster Bar             8         0
     Stacked Bar graphs             22        13
     Clustered Stacked Bars         20         8
     3D surface plots                2         0



    Data Management:
 MUST
     Import arbitrary textfiles in some reasonable
     format (CSV,TSV,WSV, etc.)

 WANT
     Import from a pipe, or through a program filter
     - so that can write a program to extract and reformat
     data from the database of a different program
     (and thus avoid the problems of proliferating data formats,
     temporary files, etc.)
 NICE
     Import Excel.
     OLE automation on import.

 WANT
     Ability to embed comments in a file - so that
     can track data provenance, etc.

 WANT
     No limit on points in dataset - e.g. I have prepared
     graphs with 200,000 points in the past.
     However, it is often hard to verify, from demoware,
     what the capabilities are.
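
A small sketch of the import side of this list, assuming
whitespace-separated values and "#" as the comment marker (the marker
is an assumption, although gnuplot data files use the same convention).
The function name read_wsv is hypothetical. It reads from any iterable
of lines, so the same code serves for a file, a pipe from a filter
program, or a string, and it keeps the comments so that provenance notes
are not lost on import.

```python
import io


def read_wsv(stream, comment="#"):
    """Read whitespace-separated values from any iterable of text lines.

    Returns (comments, rows): comment lines with the marker stripped,
    and data rows as lists of string fields. Blank lines are skipped.
    """
    comments, rows = [], []
    for line in stream:
        text = line.strip()
        if not text:
            continue
        if text.startswith(comment):
            comments.append(text[len(comment):].strip())
        else:
            rows.append(text.split())
    return comments, rows


# Usage with an in-memory stream; an open file or a subprocess pipe
# opened in text mode can be passed in the same way.
sample = io.StringIO("# version 1.4\nbenchmark ipc\n126.gcc 45\n")
sample_comments, sample_rows = read_wsv(sample)
```

Nothing here limits the number of rows beyond available memory.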


    Output:
 MUST:
     EPS (Encapsulated PostScript)
 OR MUST:
     Windows objects that can be embedded in Word
     (OLE automation).

 MUST:
     Finally produce Postscript viewable by Ghostview
     (for acceptance by conferences).







Products
========

Evaluated
---------

DataDesk
    Good magazine reviews

    Evaluation: crippleware demo downloaded from web

    Graph Interaction: pretty good

 MUST(OK): graphical zoom
     Seems to be the most interactive package that I have tried so far.

 Overall, has the nicest forms of interaction with the graph I have seen.
 Not only does it have mouse based zoom (box based),
 but it also has the ability to drag the graph around,
 i.e. changing xmin,xmax,ymin,ymax without changing the scale,
 and other interactive features.

    Graph Types: poor

 MUST(FAILS): LIMITED GRAPH TYPES
     Unfortunately, while it seems to be very interactive,
     it does not seem to have a good selection of graph/plot/chart types.
     For example, its concept of lineplots is to plot the
     variable as Y against the case number as X. I.e. it does not
     have XY lineplots.

     Datadesk's barcharts are restricted in a similar manner,
     and understand nothing of clustering, let alone stacking, etc.

    While I might purchase DataDesk just for data exploration of scatterplots,
    I definitely cannot consider it as the answer to my graphing needs.
    Particularly not at its cost (600$ or so).

    Klugey User Interface: originally MAC based,
 not at all natural feeling in the Windows environment.


SPSS

    Cost: 175$, as a "student gradpack".

    Comments:

 SPSS 7 and 8 look reasonably good on Windows
 - much better than the old interface, which carried
 over from mainframes.

 However, still have access to the old command language
 - can run from command line, or within a procedure automation tool -
 which is good.

 All of the SPSS products (SPSS, Sigmaplot, etc.)
 seem to have good tutorials and help.

    Database
 still SPSS sequential files
 ???? unclear if can do a JOIN

    Pivoting table editor - VERY GOOD!!!

    Graphs/Charts - a reasonable variety of graphs.
 Q: can it do clustered bars, stacked bars, clustered stacked bars?

 BAD: doesn't seem to be able to do mouse based zooms.

SPSS/Deltagraph
    Cost: 295$
    Recently acquired company.

    http://www.spss.com/software/DeltaGraph/

    Comment:
 Seems to be the "PowerPoint" of PC graphing packages
 - produces sexy looking graphs, fancy backgrounds, etc.
 Seems to be recommended by many social science departments
 (psychology, demographics).

    Chart Types

                     2-D Charts Area, Area %, Bar, Bar-Stacked,
                     Bar-Floating,* Bar-Segmentation,* Bar-Stacked
                     Segmentation,* Bubble, Build-Up,*
                     Build-Up-Stacked,* Column, Column-Stacked,
                     Column-Segmentation,* Column-Stacked
                     Segmentation,* Column-Floating,* Column-XY,*
                     Combination, Contour Fill, Contour Line, Double X
                     axis, Double XY axes, Double Y axis, High-Low,
                     High Low Open Close (Candlestick and Whisker),
                     Line, Line-Filled, Line-XY, Line-Paired XY,
                     Pictograph, Pie, Pie-Donut, Pie-Multiple, Pie-Stacked,
                     Polar, Quality Control X bar r, Quality Control X bar
                     s, Quality Control p, Quality Control n, Quality
                     Control u, Quality Control c, Radar, Range, Scatter,
                     Scatter-Paired XY, Scatter (with optional droplines),
                     Spider, Step, Table, Ternary, Ternary %, Time Line,
                     Vector-Gridded, Vector-Radius/Angle, Vector-XY,
                     XYZ Contour Fill, and XYZ Contour Line.

                     Text Charts Bullet, Organization, Table.

                     3-D Charts Area, Column, Ribbon, Scatter,
                     Scatterline, Surface Fill, Surface Line, Wireframe,
                     True 3-D XYZ Surface Fill, True 3-D XYZ Surface
                     Line.

                     Statistical Charts Histogram, Ogive, Pareto, Box
                     Plot, Survival.*

     Doesn't seem to have clustered stacked bars (as are a fad
     in computer architecture papers recently).

     UNKNOWN: mouse based zoom?

    Data management

 Limit: 32,000 points per data set

 MUST(OK): text files.
     Unclear if interaction with Excel, etc., is by
     producing files, or by OLE automation.



SAS + SAS/GRAPH
    Cost:
 no student price
 150$ departmental
 >1000$ industrial
 very complicated and particular licencing
     - even though Intel has a licence,
     I can't install it on my office machine
     because I can't access the licence server

    Comments:
 Very clunky mainframe like interface.
 Lots of typing in old command language.
 Poor tutorial and help.

    Charts:
 MUST (FAILS):
     Limited types, it seems.
     They even use line printer examples!!!!!

 MUST (FAILS) (Supposed to be OK, not verified):
     Mouse Based Zoom:
  I tried to find it, could not,
  although later SAS/GRAPH technical support
  said it is in
  "Edit/Graphics/Magnify Tool".

  => lousy help and tutorials if I couldn't find this!!!

SAS/JMP
    Recently acquired company
    => SAS will probably absorb its function into main SAS.

    Evaluation
 Demo: crippleware

 Poor help, no tutorial to speak of.

    Chart Interaction:
 MUST (OK): has mouse based zoom (magnify tool)
     (BUG: selecting magnify tool from right click menu is broken;
     only works when selected from toolbar)
 Nice GUI for spinning 3D

    Chart:
 MUST (FAILS):
     Limited graph types.
     MISSING clustered bars, stacked bars, etc.
     Has scattergraphs - are in help file
  - but I could not find how to make them.

XMGR
    Cost: freeware
    Chart Interaction:
 MUST (OK): mouse based zooming
    Chart:
 MUST (FAILS):
     Limited graph types
     MISSING clustered bars, etc.
  (so write them yourself...)

DEVise
    University of Wisconsin project
    Cost: freeware?

    Written in TCL/TK, hence supposedly extensible,

    Limited functionality SQL database underneath it.
    (will be extended to Oracle?)

    Chart Interaction:
 MUST (FAILS): no mouse based zoom.

    Chart:
 MUST (FAILS):
     Limited graph types.
     Very poor quality graphs.



Sigmaplot
    From SPSS
    Demo online
    NICE!!!!: Tufte macro package

    Evaluation:
 demo online

 very badly behaved demo: requires admin to install, etc.

 good help and tutorial (true for all SPSS products)

    Data Management
 NICE (OK): OLE automation
 MUST (OK): EPS output (GIFF, etc.)
 Limits: 16K columns by 64K rows

    Chart Interaction
 NICE (FAILS): doesn't seem to have "click on a point
     to query" - i.e. doesn't highlight that
     point in dataset (at least not trivially)
 MUST (FAILS): doesn't have "mouse based zoom".
     Actually, has "mouse based magnify",
     but it just magnifies, doesn't redraw axes.

    Chart types:
     lots of types, including (more on web page):
     note that it has grouped bar and stacked bar
  Q: does it have grouped stacked bar?

                    2D
                        Scatter - 14 types
                        Line - 4 types
                        Scatter and Line - 10 types
                        Step - 8 types
                        Vertical Bar - 2 types
                        Horizontal Bar - 2 types
                        Vertical, Grouped Bar - 2
                        types
                        Horizontal, Grouped Bar -
                        2 types
                        Vertical, Stacked Bar
                        Horizontal, Stacked Bar
                        Box - 2 types
                        Polar - 3 types
                        Histograms - 6 types
                        Ternary - 3 types
                        Time-Series
                        Bubble
                        Pie
                        Control Charts
                        Needle
                        Quadrant
                        Population
                    3D
                        Multiple, intersecting plots
                        with hidden line removal,
                        smooth or discrete shading,
                        transparent or opaque fills,
                        and light source shading
                        3D rotation
                        Perspective preview
                        Scatter
                        Bar
                        3D line - trajectory and
                        waterfall
                        Mesh
                        Contour


S-Plus
    From Mathsoft
    Cost: 500$

    Checked out web page. Extensible embedded language.
    Website with lots of S-plus libraries.

    MUST(Fails?): mouse based zoom
 As is usual from the silly demos online, cannot tell if
 it has mouse based zoom.

    MUST(OK): clustered bars, scatter

    Mathsoft also sells:
 Axum - 200$ - graphing
 Mathcad - 130$ - misc stuff


Axum
    From Mathsoft
    Cost: 200$

    MUST(Fails?) Mouse based zoom
 As is usual from the silly demos online, cannot tell if
 it has mouse based zoom. My guess is not.

    MUST(OK): clustered bars

    Basically cannot evaluate it from its demos or literature


Statistica
    Evaluated: Statistica 5.1 demo
    MUST(FAILS?): no mouse based zoom
 Actually has a mouse based magnify,
 but it doesn't rescale or reaxis - instead
 it is just a visual magnify
    Overall feels very cluttered - not a good UI.
    Competent techsupport, though
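
    To make the distinction concrete: a real zoom recomputes the axis
    limits and tick marks from the dragged range, whereas a visual magnify
    only enlarges what is already drawn. A minimal Python sketch of the
    former follows. It is an illustration only, not how Statistica or any
    other package listed here works; the names zoom and nice_ticks and the
    1-2-5 tick rule are assumptions made for this sketch.

```python
import math


def nice_ticks(lo, hi, target=5):
    """Pick round tick values inside [lo, hi] using a 1-2-5 step rule."""
    span = hi - lo
    if span <= 0:
        raise ValueError("empty axis range")
    raw = span / target                      # ideal, but ugly, step size
    mag = 10 ** math.floor(math.log10(raw))  # power of ten at or below raw
    for m in (1, 2, 5, 10):
        step = m * mag
        if raw <= step:
            break
    first = math.ceil(lo / step) * step      # first tick at or above lo
    count = math.floor((hi - first) / step + 1e-9)
    return [round(first + i * step, 10) for i in range(count + 1)]


def zoom(points, x_from, x_to):
    """Restrict (x, y) points to the dragged x range and re-derive the axes."""
    lo, hi = sorted((x_from, x_to))          # allow dragging right to left
    visible = [(x, y) for x, y in points if lo <= x <= hi]
    if not visible:
        raise ValueError("no points inside the zoom range")
    ys = [y for _, y in visible]
    return {
        "points": visible,
        "xlim": (lo, hi),
        "ylim": (min(ys), max(ys)),          # y axis rescaled to visible data
        "xticks": nice_ticks(lo, hi),        # x axis re-ticked for new range
    }
```

    Calling zoom(points, 47, 42) on a data set returns only the points
    with x between 42 and 47, together with new limits and ticks for that
    window, which is the "rescale and reaxis" step that a pure magnifier
    skips.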

Systat
    Cost: 185$ gradpak.
    MUST(OK-50%): has mouse based zoom
 Specifically, can lasso or otherwise select points,
 and replot so that only those points are visible.

 Unfortunately, the selection tools are available only for scatterplots.

 Wasn't able to find out how to unselect data without
 reloading the whole bloody data set.
 (The demo says "Be careful to unselect, or else you
 will be restricted to the selected points.")

    Looks possible, but looks like it will have some kluges
    that make it awkward to use.
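
    The unselect problem goes away if the selection is kept as a
    separate set of row indices rather than applied destructively to the
    data. A minimal Python sketch of that idea, assuming rows are (x, y)
    tuples; the Dataset class and its method names are hypothetical and
    say nothing about how Systat is actually implemented:

```python
class Dataset:
    """Rows plus an optional selection; the rows themselves never change."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._selected = None  # None means "no restriction"

    def select(self, predicate):
        """Restrict the view to rows for which predicate(row) is true."""
        self._selected = {
            i for i, row in enumerate(self._rows) if predicate(row)
        }

    def unselect(self):
        """Drop the restriction; no reload of the data is needed."""
        self._selected = None

    def visible(self):
        """Return the rows a plot should draw under the current selection."""
        if self._selected is None:
            return list(self._rows)
        return [r for i, r in enumerate(self._rows) if i in self._selected]


def in_box(x_lo, x_hi, y_lo, y_hi):
    """Build a rectangular stand-in for a lasso selection."""
    return lambda row: x_lo <= row[0] <= x_hi and y_lo <= row[1] <= y_hi
```

    With this layout, ds.select(in_box(1, 3, 4, 10)) followed by
    ds.unselect() leaves ds.visible() identical to the original rows.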


Statview
    From SPSS
    Demo in the mail

    MUST(Fails): does not have mouse based zoom
 (reported by SPSS sales)

Excel
    Microsoft

    Although everybody tells me "you can do that in Excel",
    I am reasonably certain that you cannot.

    MUST(FAILS): no mouse based zoom

    Reasonable selection of graph types.

    Can only handle 32K data points.


dataplot
    http://www.itl.nist.gov/div898/software/dataplot
    cost: freeware?
    Recommended by net.folk

    command driven.
 somewhat powerful language
 doesn't seem to have mouse interface
    F77

    lots of nice graphs
 SEMATECH standard


Scilab
    INRIA/France
    Cost: freeware.
    Web page:  http://www-rocq.inria.fr/scilab/
    Seems to be command driven (based on reading the web page).
    Seems to be comparable to MATLAB in abilities.



To be evaluated
---------------

Minitab
    Chart Interaction
 MUST (FAILS?): no mouse based zoom.

    Charts:
 MUST (OK): lots of graph types,
     language easily allows customizing,
     lots of examples in tutorial

MATLAB
    Data Management:
 MUST (FAILS?):
     Only textual import seems to be M-files,
     which are trivially related to ascii text files,
     but still not the same.
    See: Octave

Octave:
    MATLAB free clone?

IPL
    Cost: freeware?
    Charts: MUST (OK) lots of graph types
    Chart interaction: MUST (FAILS): batch mode only

IDL
    A Fortran like language advertised in IEEE magazines
    as producing lots of neat graphs.


xplot
    obsolete???


Mathematica
    Barely possible.
    Net.folk say difficult to explore data with.
    Good internal language - anything possible.
    Slow for large datasets?

Statsoft

HiQ
    Demo in the mail

S
    old stat package (from AT&T, if I remember correctly)
    haven't seen anyone using it.
    see: R

R
    S free clone

Mineset
    vendor: SGI
    Highly recommended by net.folk "could do complicated data
 exploration in only five clicks"
    cost: 20,000$
    host: SGI

JGraph
    unclear what this is.
    net.searching yields several shareware Java Graph tool references.


Other
-----

MySQL
    http://www.tcx.se
    cost: freeware?
    database

MiniSQL
    cost: freeware?
    database



From nobody Wed Feb 25 16:13:11 1998
Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!peerfeed.ncal.verio.net!news.ncal.verio.com!mlyle
From: mlyle@wco.com (Michael Lyle)
Newsgroups: comp.arch,comp.soft-sys.matlab,comp.soft-sys.mathematica,comp.soft-sys.sas,comp.soft-sys.stat.systat,comp.soft-sys.stat.spss,comp.graphics.apps.gnuplot,sci.data.formats
Subject: Re: ISO Data Management, Graphing, and Visualization Tools
Date: Wed, 25 Feb 1998 08:35:13 -0800
Organization: Flex Products, Inc.
Lines: 32
Message-ID: <mlyle-ya02408000R2502980835130001@news.wco.com>
References: <34EA32BE.5CBFB5EB@cs.wisc.edu> <34F3510A.547376D7@cs.wisc.edu>
NNTP-Posting-Host: venus37.wco.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Newsreader: Yet Another NewsWatcher 2.4.0
Xref: newsfeed.cv.nrao.edu comp.arch:5232 comp.soft-sys.matlab:6711 comp.soft-sys.sas:7533 comp.soft-sys.stat.systat:1105 comp.soft-sys.stat.spss:3246 comp.graphics.apps.gnuplot:2584 sci.data.formats:240

In article <34F3510A.547376D7@cs.wisc.edu>, Andy Glew <glew@cs.wisc.edu> wrote:

> A little while back I posted in search of Data Management, Graphing, and
> Data Visualization tools.
>
> Off and on in the past week I have been evaluating such tools, reading
> the glossies, running the demos, waiting for sales and technical support
> to tell me that their package does not do what I hoped it would do...
>
> I thought that I would share the results of this evaluation of existing
> software packages with y'all.

If you decide to roll your own, the X/Motif based XRT graph widget (and all
its cousins in the PDS suite) is a great starting point.  It meets most of
your criteria for the graphing part, and is not limited in performance or
size.  You still have to do the statistics, but you could probably
interface your own stuff to one of the standard stat packages.  Using
X-Windows is no easy task, but you get exactly what you want, and in an
environment that is reliable and portable.

www.klg.com

I have no affiliation with klg, just a satisfied customer.  However, the
one or two times I had to use their tech support, they were very
unresponsive.  YMMV.

-- 
Michael
Michael Lyle (mlyle@wco.com)

From nobody Wed Feb 25 16:13:31 1998
Path: newsfeed.cv.nrao.edu!newsgate.duke.edu!nntprelay.mathworks.com!newsfeed.internetmci.com!4.1.16.34!cpk-news-hub1.bbnplanet.com!news.bbnplanet.com!newsfeeds.sol.net!uwm.edu!uwvax!news
From: Andy Glew <glew@cs.wisc.edu>
Newsgroups: comp.arch,comp.soft-sys.matlab,comp.soft-sys.mathematica,comp.soft-sys.sas,comp.soft-sys.stat.systat,comp.soft-sys.stat.spss,comp.graphics.apps.gnuplot,sci.data.formats
Subject: Re: ISO Data Management, Graphing, and Visualization Tools
Date: Wed, 25 Feb 1998 12:15:48 -0600
Organization: U Wisc CS (& Intel)
Lines: 70
Message-ID: <34F45FD4.BF4E4161@cs.wisc.edu>
References: <34EA32BE.5CBFB5EB@cs.wisc.edu> <34F3510A.547376D7@cs.wisc.edu>
NNTP-Posting-Host: helga.cs.wisc.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.03 [en] (WinNT; I)
To: Andy Glew <glew@cs.wisc.edu>
Xref: newsfeed.cv.nrao.edu comp.arch:5242 comp.soft-sys.matlab:6718 comp.soft-sys.sas:7541 comp.soft-sys.stat.systat:1106 comp.soft-sys.stat.spss:3248 comp.graphics.apps.gnuplot:2590 sci.data.formats:241

Life is like this....

Just when I think that I have finished my (vain) search for data management, graphing,
and exploration tools, a new one comes to my attention, first by somebody
responding to my "final" post, and then in the form of an ad in Scientific Computing
and Automation magazine.

I just downloaded the demo for Origin 5.0 from www.microcal.com.
It passes all of the tests that I can perform on the demo with flying colours.

My "bottom line" section of my notes now reads:


Wed Feb 25 1998: NEW BOTTOM LINE: Origin 5.0 seems to be the winner!
It has the chart interaction features I want, e.g. mouse based zoom.
It has most of the chart types I want built in.
Its extension language seems to be able to build the chart
    types that are missing - e.g. clustered bar
    - at least, such chart types appear in their gallery of examples.
It integrates well with Excel and Windows NT.
Origin 5.0 Professional appears to allow extensions,
    such as readers for new input file formats.
    Not clear, however, whether it is flexible enough to
    allow filters to be specified.

Further evaluation will have to come from actually using it. So I'm
going to buy it... (damn, it's expensive!).



Andy Glew wrote:

> A little while back I posted in search of Data Management, Graphing, and Data Visualization tools.
>
> Off and on in the past week I have been evaluating such tools, reading the glossies,
> running the demos, waiting for sales and technical support to tell me that their package
> does not do what I hoped it would do...
>
> I thought that I would share the results of this evaluation of existing software packages
> with y'all.
>
> $Header: /u/g/l/glew/work/data-exploration-tool-search/RCS/evaluation-notes,v 1.4 1998/02/24 22:54:10 glew Exp $
>
> Evaluation of Data Management, Graphing, and Exploration Tools
> ==============================================================
>
> BOTTOM LINE: there is no single answer, unfortunately.
>
> SPSS seems to be the nicest Windows enabled data management, stats,
> and graphing tool. However, it has distinct limitations when it comes
> to data exploration; specifically, it can't do "mouse based zoom", or
> any of the other nice things that make data exploration quicker.
>
> DataDesk is the nicest interactive data exploration tool, has mouse
> based zoom and a whole slew of other interactive features, but it has
> only an extremely limited repertoire of graphs, plots, and charts.
>
> JMP does more graphs than DataDesk, and is reasonably interactive, has
> mouse based zoom, but not much more - but it cannot do the most common
> graph types in my field (e.g. clustered bar graphs, let alone
> clustered stacked bar graphs).
>
> None of these tools seem to do the "advanced" graphs, such as
> clustered stacked bar charts.
>
> In an ideal world I'd buy SPSS and Datadesk, and also a good "batch
> mode" package such as MatLAB to draw the really sophisticated stuff.
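
For anyone unsure what the "clustered stacked bar" chart in the quoted
notes looks like, the geometry is easy to state: bars sit side by side
within a cluster, and each bar is itself a stack of segments. A minimal
Python sketch of that layout follows. It only computes rectangles and
draws nothing; the function name, the nested-list input shape, and the
default widths are assumptions made for illustration, not any package's
API.

```python
def clustered_stacked_bars(data, bar_width=1.0, gap=1.0):
    """Lay out rectangles for a clustered stacked bar chart.

    data[g][b][s] is the height of segment s in bar b of cluster g.
    Returns tuples (cluster, bar, segment, x_left, y_bottom, width, height).
    """
    rects = []
    x = 0.0
    for g, cluster in enumerate(data):
        for b, stack in enumerate(cluster):
            bottom = 0.0
            for s, height in enumerate(stack):
                rects.append((g, b, s, x, bottom, bar_width, height))
                bottom += height      # next segment sits on top of this one
            x += bar_width            # bars in a cluster touch each other
        x += gap                      # blank space between clusters
    return rects
```

In my field a cluster would be one benchmark, each bar one machine
configuration, and each segment one component of the total.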



