From GROTH at pupgg.princeton.edu Thu Oct 10 23:23:04 1991
Newsgroups: alt.sci.astro.fits
Reply-To: groth at pupgg.princeton.edu
Organization: Physics Department, Princeton University
Nntp-Posting-Host: pupggg.princeton.edu
From: GROTH at pupgg.princeton.edu (Edward J. Groth   609-258-4361)
Subject: Legal line in header???
Date: 8 Oct 91 21:35:47 GMT

I've just come across a FITS header with a line of the following
form:

IDENT4  = R.A 00H  -19 DEC +89 /title of map

I thought what could follow an equals sign was T, F, a legal
integer, a legal floating point number with exponent field if
desired, or alphanumeric data enclosed in ' '.

If so, the above line is illegal.  

So, is the line legal or not?  If it's legal, what are the syntax
rules for header lines?

				- Ed


/----------------------------------------------------------------------\
| Edward J. Groth            | Phone: 609-258-4361                     |
| Physics Dept., Jadwin Hall | Fax:   609-258-1124                     |
| Princeton University       | SPAN/HEPNET:  PUPGG::GROTH=44117::GROTH |
| Princeton, NJ 08544        | Internet:     groth at pupgg.princeton.edu |
\----------------------------------------------------------------------/

From bschlesinger at nssdcb.gsfc.nasa.gov Thu Oct 10 23:23:15 1991
Newsgroups: alt.sci.astro.fits
Organization: NASA - Goddard Space Flight Center
News-Software: VAX/VMS VNEWS 1.41
Nntp-Posting-Host: nssdcb.gsfc.nasa.gov
From: bschlesinger at nssdcb.gsfc.nasa.gov (Barry Schlesinger)
Subject: Re: Legal line in header???
Date: 9 Oct 91 17:11:00 GMT

In article <15029 at princeton.Princeton.EDU>, groth at pupgg.princeton.edu writes...
>I've just come across a FITS header with a line of the following
>form:
> 
>IDENT4  = R.A 00H  -19 DEC +89 /title of map
> 
>I thought what could follow an equals sign was T, F, a legal
>integer, a legal floating point number with exponent field if
>desired, or alphanumeric data enclosed in ' '.
> 
>If so, the above line is illegal.  
> 
>So, is the line legal or not?  If it's legal, what are the syntax
>rules for header lines?

	The fixed format for keyword value, defined in the original
FITS paper, is strictly required only for mandatory keywords.  Such 
keywords would include the required keywords for the primary header, 
the keywords required by the agreement on generalized extensions, and 
the keywords required for individual extension types.  For other 
keywords, the fixed format is not required but strongly recommended.
	As a matter of practice, I would urge that the fixed format be 
used except in cases where it is clearly impossible to do so, such as 
a double precision floating point number in exponential form that 
requires more than 20 columns. 
	For the case in question, strictly speaking, the line is
permissible.  It appears to be intended to be a character string.  In
that case, I would urge the writers to reorganize the value in a way
that allows them to surround it with quotes.  (Another, somewhat
obscure, interpretation is that the entire field starting in column 9
is a comment.  If the keyword has a value, then column 9 must contain
an equal sign. However, the presence of an equal sign in column 9 does
not require that the keyword have a value!  I would strongly recommend
against putting an equal sign in column 9 when there is no value,
because it will lead to confusion.) 
	So in summary, I would say that the line is "legal".  But not 
everything that is legal is good practice.
				Barry Schlesinger
				NSSDC/NOST FITS Support Office

From dwells at fits.cx.nrao.edu Thu Oct 10 23:23:19 1991
Newsgroups: alt.sci.astro.fits
In-Reply-To: bschlesinger at nssdcb.gsfc.nasa.gov's message of 9 Oct 91 17: 11:00 GMT
Organization: National Radio Astronomy Observatory, Charlottesville, VA
From: dwells at fits.cx.nrao.edu (Don Wells)
Subject: Re: Legal line in header???
Date: Wed, 9 Oct 1991 22:13:40 GMT

In article <9OCT199112110371 at nssdcb.gsfc.nasa.gov>
bschlesinger at nssdcb.gsfc.nasa.gov (Barry Schlesinger) writes:

 BS> ... the case in question... appears to be intended to be a
 BS> character string.... the presence of an equal sign in column 9 ...
 BS> [FITS] does not require that the keyword have a value ... in summary, 
 BS> I would say that the line is "legal".  

I myself would say that it should be illegal for a header line to have
an equal sign in column 9 with a value field in an invalid format.
Reasonable people can easily differ about this (as Barry and I are
doing), because the FITS papers are ambiguous on this subtle point, as
on many others.

I just checked NOST-100-0.2f (14May91), the NASA draft standard for
FITS. Its wording is also somewhat ambiguous. In 5.1.1 (Syntax) it
says "if a value is present, column 9 shall contain an equal sign...
column 10 ... [a] blank, and columns 11-80 ... as specified in Section
[5.3]. If no value is present... 9-80 may contain any ASCII text."
(actually 5.1.1 says the "remainder" of 5.2, but I think it really
means 5.3 --- this should be checked). While 5.1.1 does not explicitly
say that an = in 9 *requires* a valid value, it does seem to me that
that is the most plausible reading of the wording. The wording of
5.1.2.2 (Value Indicator (bytes 9-10)) is consistent with this
interpretation.  Section 5.2.3.1 (Additional Keywords, Requirements)
also appears to be consistent, but is somewhat more ambiguous. The
wording of 5.3 (Value), interpreted in the context of the whole of
Section 5 (Headers), is clearly restricting value fields to 5 possible
formats with syntactic rules taken from Fortran-77 list-directed READ:
string ('-quoted), logical, integer, float and complex (two floats),
with /-delimited comments.
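
As a rough illustration of the five value formats listed above, the
following Python sketch classifies the value field of a single
80-column header card.  The function name classify_value and the
regular expressions are assumptions made for this example; they are a
simplification of the Fortran-77 list-directed rules and are not taken
from the NOST draft.

```python
import re

# Simplified patterns; the real list-directed READ rules are richer.
INTEGER = r"[+-]?\d+"
FLOAT = r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[ED][+-]?\d+)?"
STRING = r"'(?:[^']|'')*'\s*(?:/.*)?"


def classify_value(card):
    """Classify the value field of a header card (hypothetical helper)."""
    # Column 9 (index 8) must hold "=" for the card to carry a value.
    if card[8:9] != "=":
        return "no value indicator"
    # Columns 11-80 hold the value and an optional /-delimited comment.
    text = card[10:80].strip()
    if text.startswith("'"):
        return "string" if re.fullmatch(STRING, text) else "invalid"
    value = text.split("/", 1)[0].strip()
    if value in ("T", "F"):
        return "logical"
    if re.fullmatch(INTEGER, value):
        return "integer"
    if re.fullmatch(FLOAT, value):
        return "float"
    # Treat two blank-separated numbers as a complex value.
    parts = value.split()
    if len(parts) == 2 and all(re.fullmatch(FLOAT, p) for p in parts):
        return "complex"
    return "invalid"


print(classify_value("IDENT4  = R.A 00H  -19 DEC +89 /title of map"))
```

Under these assumptions the IDENT4 line that started this thread is
reported as "invalid", while the same text enclosed in quotes is
classified as a string.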

I cannot think of any future expansion need that would make it useful
to allow arbitrary strings behind an equal sign in column 9.

I recommend that the NOST Technical Panel review this technical point,
and provide an explicit rule in Sections 5.x of NOST-100-x.x.

--

Donald C. Wells             Associate Scientist        dwells at nrao.edu
National Radio Astronomy Observatory                   +1-804-296-0277
520 Edgemont Road                                 Fax= +1-804-296-0278
Charlottesville, Virginia 22903-2475 USA            78:31.1W, 38:02.2N 

From dwells at fits.cx.nrao.edu Thu Oct 10 23:23:22 1991
Newsgroups: alt.sci.astro.fits
In-Reply-To: dwells at fits.cx.nrao.edu's message of Wed, 9 Oct 1991 22: 13:40 GMT
Organization: National Radio Astronomy Observatory, Charlottesville, VA
From: dwells at fits.cx.nrao.edu (Don Wells)
Subject: Re: Legal line in header???
Date: Thu, 10 Oct 1991 19:48:57 GMT

In article <DWELLS.91Oct9171340 at fits.cx.nrao.edu>
dwells at fits.cx.nrao.edu (Don Wells) writes:

 DW> I myself would say that it should be illegal for a header line to
 DW> have an equal sign in column 9 with a value field in an invalid
 DW> format.  

Barry Schlesinger has informed me that Preben Grosbol has pointed out
an exception to the interpretation which I suggested: 

Section 5.2.2.4 ("Commentary Keywords") of the NOST document specifies
the following rule for keywords COMMENT, HISTORY and blank: "this
keyword shall have no associated value; columns 9-80 may contain any
ASCII text". This rule allows an "=" in column 9 with columns 10-80
containing an invalid format. 
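
A minimal sketch of how a reader might apply this exception, assuming
the stricter reading in which "= " in columns 9-10 otherwise signals a
value.  The name card_has_value is hypothetical and the logic is only
one possible interpretation of the draft wording.

```python
# Commentary keywords never carry a value; "" stands for the blank keyword.
COMMENTARY = {"COMMENT", "HISTORY", ""}


def card_has_value(card):
    """Return True if the card should be parsed for a value (hypothetical)."""
    keyword = card[:8].rstrip()
    if keyword in COMMENTARY:
        # Columns 9-80 are free text, even if column 9 happens to be "=".
        return False
    # Otherwise "= " in columns 9-10 is taken as the value indicator.
    return card[8:10] == "= "


print(card_has_value("COMMENT = this is still only a comment"))
print(card_has_value("NAXIS   =                    2"))
```

Checking the keyword before the value indicator is what makes the three
commentary cases an explicit exception rather than an ambiguity.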

I recommend that the NOST Technical Panel revise the text so that
these three cases will be an explicit exception to an explicit rule.

--

Donald C. Wells             Associate Scientist        dwells at nrao.edu
National Radio Astronomy Observatory                   +1-804-296-0277
520 Edgemont Road                                 Fax= +1-804-296-0278
Charlottesville, Virginia 22903-2475 USA            78:31.1W, 38:02.2N 

From pence at heawk1.gsfc.nasa.gov Wed Oct 23 10:25:37 1991
Newsgroups: alt.sci.astro.fits
Keywords: fits
Nntp-Posting-Host: heawk1.gsfc.nasa.gov
Organization: Goddard Space Flight Center
From: pence at heawk1.gsfc.nasa.gov ( William Pence)
Subject: Comments on 'Binary Table Extension' draft
Date: Thu, 3 Oct 1991 21:06:21 GMT


Comments on the "Binary Table Extension to FITS" Draft by Cotton and Tody
dated 20 September 1991:

For the most part, I have found the binary table format proposed
in this document to be unambiguously defined and straightforward to
implement.  It appears general enough to be able to handle virtually
any type or format of data.  I was able to support all the binary
table features in the FITSIO software package except for the 'Variable
length array facility' which will be added in the near future.  I have
just a couple of suggestions for changes to this draft:

1)  Sections 5.7 and 5.8 state:  'The IEEE NaN (not a number) values are
used to indicate an invalid number; ... All IEEE special values are recognized.'

I have 2 comments on this.  First, why do we need more than one value
to represent an undefined pixel?  An IEEE NaN value is defined to have
all the exponent bits set to 1 and a non-zero set of fractional bits
which is a very large set of different possible values.  This makes it
more difficult for FITS readers, especially those written in Fortran,
to test if a pixel is undefined.  It would be very much simpler if FITS
were to adopt a single NaN value to represent undefined values; an
obvious choice would be to adopt a value with ALL bits = 1 (which is
equivalent to a value of -1 in 2's complement integer notation).  This
would have the advantage that the value has the same representation on
every type of computer (you don't have to worry about where the
exponent bits are on machines with swapped bytes)  and it would make it
much more efficient for FITS readers to test for undefined values.  (I
realize that the FITS definition for primary array states that any NaN
value may be used to represent an undefined value and it may be too
late to change this now.  But this does not imply that we have to adopt
the same convention for the binary table extension).
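
To make the bit-level argument concrete, here is a small Python sketch
of the two tests being compared for a 32-bit IEEE value stored in FITS
(big-endian) byte order.  The names is_nan_bits32 and ALL_ONES are
chosen for this example only.

```python
import struct

ALL_ONES = b"\xff\xff\xff\xff"  # the single pattern proposed above


def is_nan_bits32(raw):
    """General IEEE test: exponent bits all 1 and a non-zero fraction."""
    (bits,) = struct.unpack(">I", raw)
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return exponent == 0xFF and fraction != 0


def is_all_ones(raw):
    """Simpler test that only recognizes the all-bits-set pattern."""
    return raw == ALL_ONES


# The all-ones pattern is itself a NaN, and reads as -1 as a signed integer.
print(is_nan_bits32(ALL_ONES), struct.unpack(">i", ALL_ONES)[0])
```

The general test needs shift and mask operations on the correct byte
order, whereas the all-ones test is a plain comparison that is the same
on any machine.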

Secondly, what does it mean that 'All IEEE special values are recognized'?
If a pixel has the IEEE value representing + or - infinity, for instance, 
what is a FITS reader supposed to do with it?   These special values are 
not supported on all machines, so the FITS reader can't simply pass them
on without possible loss of information.

2)  The first paragraph of section 5 states: 'The last 2880-byte record
should be zero byte filled past the end of valid data.'  I believe that
this is an overly strict requirement, and is in fact unenforceable. 
Are FITS reading programs supposed to read these bytes to be sure they
are zero, and if not, refuse to read the rest of the FITS file??  I
would suggest modifying this sentence to state: 'The value of any bytes
in the last 2880-byte record past the end of valid data are undefined
and FITS reading software should make no assumptions as to their values.'
This is a rather trivial objection for static FITS files, but is more
important in applications where a FITS file is used in a dynamic real
time environment where the contents are frequently modified (e.g.,
in a database type application).  It might impose needless overhead
on the software if it has to explicitly reset these bytes to zero
any time the length of the table is shortened.
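
The arithmetic involved is small.  The Python sketch below shows how a
writer might compute the fill for the last 2880-byte record, and how a
reader can ignore the fill entirely by relying on the data length
derived from the header.  The function names are assumptions made for
illustration.

```python
BLOCK = 2880  # FITS logical record length in bytes


def pad_length(nbytes):
    """Number of fill bytes needed to reach the next record boundary."""
    return (-nbytes) % BLOCK


def pad_data(data):
    """Writer side: zero-fill past the end of valid data, as the draft asks."""
    return data + b"\x00" * pad_length(len(data))


def valid_data(records, nbytes):
    """Reader side: keep only the valid bytes and never inspect the fill."""
    return records[:nbytes]


table = b"x" * 3000
padded = pad_data(table)
print(len(padded), pad_length(len(table)))
```

A reader written this way behaves the same whether the fill bytes are
zero or left over from a longer table.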

3)  The biggest problem I have with the current Binary Table draft has
to do with the way character fields are handled.   Basically this
standard treats these as arrays of single characters rather than as
strings.  I believe that this is too primitive a level for most FITS
applications which would prefer to treat a string of characters as a
single unit.  This becomes very important when one has to deal with
arrays of strings (and not single characters) in a table.  It is
perhaps clearer to illustrate the problem with an example:

Suppose a table contains a column with TFORM = '20A', i.e., each row of
the column contains 20 characters.  Now suppose a user calls a Fortran
subroutine (in the FITSIO package for example) to read values from this
column.  The user passes a character string array to the subroutine
(for the sake of argument, suppose the user has declared the character
array as CHARACTER*10 CARRAY(50)) and the user expects the subroutine
to fill up the array with the first 50 strings contained in the FITS
table column.  But the subroutine cannot do this because it only knows
that there are a total of 20 characters in each row, but has no idea if
these should be interpreted as a single 20-character string, 2
10-character strings, 4 5-character strings, etc.  Thus, the subroutine
cannot pack characters from the FITS column into the user's Fortran
string array because it does not know the length of each FITS string.

The draft document proposes a way to solve this problem by defining
another keyword in the header, e.g.,  TDIM = '(5,4)',  to signify that
the 20A column is to be interpreted as 4 substrings of 5 characters
each.  But I object to this on 2 grounds:  first, this proposed
convention for the use of TDIM is only defined in Appendix A and is not
part of the formal binary table definition and thus is not binding on
the creators of FITS files.  FITS readers then cannot be sure that the
users will always adopt this convention.  The second, and to me more
compelling objection is that a subroutine interface like FITSIO does
not need to look for a TDIM keyword for any other column data type
except for ASCII characters.  For all other datatypes the TFORM keyword
itself is sufficient for the FITS reading subroutine to read the data.
For example, if the user wants to read a column of double precision
numbers, the reading program only needs to look at the TFORM keyword
(e.g., TFORM = '10D') to know how many values are contained in each row
of the column.  It is only in the case of ASCII strings that the
subroutine has to go hunting around for more information in order to be
able to interpret the data.
(This is analogous to the situation if you were asked to read a column
of integer values, which contained 4 bytes per row, but you did not
know whether the column contained 4 1-byte integers, 2 2-byte
integers, or 1 4-byte integer in each row).  Note in passing that in a
subroutine interface like FITSIO, one does not need to worry about the
higher dimensionality of the data in the FITS files;  as far as the
FITSIO interface is concerned, every FITS array or table column can be
interpreted as a one-dimensional array of data;  it is the
responsibility of the calling program to group the data into higher
dimensions, if required.

As a better solution to this problem, I propose that the TFORM value
for ASCII columns be changed to the form:  TFORM = 'rAw'  where 'w' is
the length of a unit string, and 'r' is a repeat count.  For instance,
'3A5' would indicate that the field contains 3 strings each 5
characters in length.  An exception to this general rule would be the
case where no width is specified, as in TFORM = '20A'.  This would be
interpreted as if it were written 'A20', i.e., that this is a single
string 20 characters in width and not 20 1-character strings.
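
A short Python sketch of the proposed 'rAw' rule, including the
exception for a missing width.  The function names are hypothetical and
this is not the FITSIO implementation, only an illustration of how the
TFORM value alone would determine the string length.

```python
import re


def parse_char_tform(tform):
    """Return (repeat, width) for a TFORM value of the proposed form 'rAw'."""
    match = re.fullmatch(r"(\d*)A(\d*)", tform.strip())
    if match is None:
        raise ValueError("not a character TFORM: " + repr(tform))
    repeat, width = match.groups()
    if width == "":
        # No width given: '20A' is read as 'A20', one 20-character string.
        return 1, int(repeat) if repeat else 1
    return (int(repeat) if repeat else 1), int(width)


def split_field(field, tform):
    """Cut the characters of one table field into its unit strings."""
    repeat, width = parse_char_tform(tform)
    return [field[i * width:(i + 1) * width] for i in range(repeat)]


print(parse_char_tform("3A5"), parse_char_tform("20A"))
print(split_field("ALPHABRAVOCHARL", "3A5"))
```

With this rule the reading routine can fill a string array without
consulting any keyword other than TFORM.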

This modified form for TFORM is also the same as is used in FITS ASCII
table extensions, except for the fact that ASCII table columns cannot
have a repeat count.

This new convention for the TFORM keyword has already been implemented
in the version 3.xx FITSIO software for anyone who wants to try it
out.  This is of course completely unofficial, so users should use this
feature sparingly for any real data that is to be distributed to other
users.

4) I do not care much for the ability to terminate a character string
before its explicit length by an ASCII NULL character.  This feature is
useless in Fortran implementations, and just requires more overhead
to test for NULL characters, and then pad them out with blanks.  Using
a NULL in the first character to represent an undefined string is fine
though.
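
For reference, the extra work a reader has to do under the NULL rule
looks roughly like the Python sketch below.  The function name and the
use of None to mark an undefined string are assumptions made for this
example.

```python
def blank_padded_string(raw, width):
    """Convert a NULL-terminated FITS field to a blank-padded string."""
    if raw[:1] == b"\x00":
        # A NULL in the first character marks an undefined string.
        return None
    # Everything after the first NULL is ignored, then padded with blanks.
    text = raw.split(b"\x00", 1)[0]
    return text.decode("ascii").ljust(width)


print(repr(blank_padded_string(b"abc\x00zz", 6)))
print(blank_padded_string(b"\x00bcdef", 6))
```

Every field has to be scanned for the NULL byte before it can be handed
to a fixed-length Fortran CHARACTER variable.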

-Bill Pence, HEASARC    pence at tetra.gsfc.nasa.gov  or  LHEAVX::PENCE


From dwells at fits.cx.nrao.edu Wed Oct 23 10:25:43 1991
Newsgroups: alt.sci.astro.fits
In-Reply-To: pence at heawk1.gsfc.nasa.gov's message of Thu, 3 Oct 1991 21: 06:21 GMT
Organization: National Radio Astronomy Observatory, Charlottesville, VA
From: dwells at fits.cx.nrao.edu (Don Wells)
Subject: Re: Comments on 'Binary Table Extension' draft
Date: Fri, 11 Oct 1991 05:35:26 GMT

In article <pence.686523981 at heawk1> pence at heawk1.gsfc.nasa.gov
(William Pence) writes:
 WP> Date: Thu, 3 Oct 1991 21:06:21 GMT  
 WP> Comments on the "Binary Table Extension to FITS" Draft by Cotton
 WP> and Tody dated 20 September 1991:  

 WP> ... Sections 5.7 and 5.8 states:  'The IEEE NaN (not a
 WP> number) values are used to indicate an invalid number; ... All
 WP> IEEE special values are recognized.' ... An IEEE NaN value is
 WP> defined to have all the exponent bits set to 1 and a non-zero set
 WP> of fractional bits ... difficult for FITS readers ... written in
 WP> Fortran, to test if a pixel is undefined... simpler ... adopt a
 WP> single NaN value ... ALL bits = 1 ... equivalent to -1 in 2's
 WP> complement ...  

This BINTABLE NaN rule is consistent with the NaN rule of the Floating
Point Agreement which became effective 01-January-1990. The motivation
of the NaN rule was to allow IEEE machines to use their hardware
support to propagate NaNs automatically.  The majority of computers
used in astronomical computing today support IEEE FP.

The usual solution to the problem that standard Fortran is unsuitable
for systems programming is to resort to a set of machine-dependent
utility subroutines ("Z" routines in the parlance of IRAF and AIPS).
This problem does not arise for C programmers because the standard
language includes shift and mask operators.

 WP> ... what does it mean that 'All IEEE special values are
 WP> recognized'? If a pixel has the IEEE value representing + or -
 WP> infinity, for instance, what is a FITS reader supposed to do with
 WP> it?   These special values are not supported on all machines, so
 WP> the FITS reader can't simply pass them on without possible loss
 WP> of information.  

IEEE machines support +/-Inf in their hardware and libraries(?). Most
data analysis systems do not exploit this fact, but I contend that
they should. My opinion is not just theoretical, I put it into
practice in the IPPS [Interactive Picture Processing System] which I
and others built at KPNO during ~1974 to ~1980, and which ran in
production until May 1985.  I operated the CDC-6000-series CPU in the
mode in which its hardware and its library routines would calculate
and propagate infinities by the formal mathematical rules.  In the
IPPS when you took the logarithm of an image the zero pixels produced
-Infs, and when you exponentiated that image you got your zeroes back!
The arctangent of +Inf was indeed pi/2, etc.  When the IPPS had to
convert floating to integer for output it would interpret Infs as
NaNs. This is just what a FITS writer should do today when writing
floating numbers in an integer format.  When reading floating FITS
data on a non-IEEE machine I recommend that the Infs and NaNs be
construed as "blanks".  If your datasystem does not support the
concept of "blanks" you will have to make some arbitrary decision.
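
A minimal Python sketch of the float-to-integer rule described above,
in which Infs and NaNs are both written as the "blank" value.  The
function name and the choice of -32768 as the blank value are
assumptions for illustration only.

```python
import math


def to_integer_pixels(values, blank):
    """Round finite values; map NaN and +/-Inf to the blank value."""
    result = []
    for value in values:
        if math.isnan(value) or math.isinf(value):
            result.append(blank)
        else:
            result.append(int(round(value)))
    return result


pixels = [1.4, float("inf"), float("nan"), -2.6, float("-inf")]
print(to_integer_pixels(pixels, blank=-32768))
```

In an integer FITS array the chosen blank value would normally be
recorded in the header so that readers can recognize it.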

Incidentally, Crays support NaNs and Infs, even though they are
not IEEE machines. 

 WP> 2)  The first paragraph of section 5 states: 'The last 2880-byte
 WP> record should be zero byte filled past the end of valid data.'  I
 WP> believe that this is an overly strict requirement, and is in fact
 WP> unenforceable. ...  

This is consistent with the zero-padding rule of the Basic FITS
Agreement of 1979. I view the rule as a matter of neatness, of
craftsmanship. I can imagine some programmer new to FITS examining a
hex dump of a file and getting very confused when the data values in
the last logical record do not end at the expected place, and indeed
appear random -- or worse, display some pattern left over in the
memory of the writing program. The zeroes will reassure hir...

The draft NOST standard document specifies the zero-padding rule in
Sect.4.3.2 (Primary Data Array) and specifies ASCII-blank-padding for
the TABLE extension in 8.1.3 (Data Sequence). 

I myself am reluctant to relax these rules because of my neatness
motivation, but I do concede that you have a point when you say that
they are "overly strict requirement[s], and [are] in fact
unenforceable".
--

Donald C. Wells             Associate Scientist        dwells at nrao.edu
National Radio Astronomy Observatory                   +1-804-296-0277
520 Edgemont Road                                 Fax= +1-804-296-0278
Charlottesville, Virginia 22903-2475 USA            78:31.1W, 38:02.2N 

From olson at anchor.esd.sgi.com Wed Oct 23 10:25:57 1991
Newsgroups: alt.sci.astro.fits
Organization: Silicon Graphics, Inc.  Mountain View, CA
From: olson at anchor.esd.sgi.com (Dave Olson)
Subject: Re: Exabyte tape marks (was Re: Maximum blocking factor)
Date: 11 Oct 91 04:45:09 GMT

In <1991Oct5.075855.2333 at noao.edu> tody at noao.edu (Doug Tody) writes:

...
| Whether or not one makes use of short filemarks depends upon the driver
| used.  The Sun SCSI tape driver, at least to date, writes only long
| filemarks.  Most people with Suns use the Sun st driver since it comes with
| SunOS.   Better drivers can be found from third party vendors.  ApUNIX sells
| such a driver, although we have not tried it here (we do use their DAT
| driver).  Features such as short filemarks for Exabyte or fast file
| seeks for DAT may not be available with a poor driver.
| 
| If I recall correctly DAT filemarks are 130 Kb or so...

The ARDAT Python uses 4 bytes (not even Kbytes) for a FM.  Obviously
if no data is written, it can still take up an entire frame, which
is ~122Kbytes if I remember right.  Most other DAT drives conforming
to DDS are the same.
--

     Dave Olson
----============----
Life would be so much easier if we could just look at the source code.

From vicki at quake.Stanford.EDU Wed Oct 23 10:26:09 1991
Newsgroups: alt.sci.astro.fits
Reply-To: vicki at quake.Stanford.EDU (Vicki Johnson)
Organization: Stanford
From: vicki at quake.Stanford.EDU (Vicki Johnson)
Subject: keyword with multiple values
Date: 15 Oct 91 22:33:21 GMT

What is the recommended keyword approach for describing
a parameter with several  (say < 100) values?  Assume a FITS file is 
generated from a program run with PARMS=(1.1, 2.2, 3.3).  One approach would
be FITS keywords of the form PARMS1=1.1, PARMS2=2.2, etc.,
similar to the NAXIS usage.   I've seen mention of hierarchical keywords,
but I gather that is discouraged.   Are there other approaches?
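
The indexed-keyword approach could be sketched in Python as follows.
The function name indexed_cards is hypothetical, the values are assumed
to be floating point, and the layout (value right-justified to column
30) follows the usual fixed format rather than any rule specific to
PARMS.

```python
def indexed_cards(root, values):
    """Build one 80-column card per value: ROOT1, ROOT2, ... (a sketch)."""
    cards = []
    for index, value in enumerate(values, start=1):
        keyword = f"{root}{index}"
        if len(keyword) > 8:
            raise ValueError("keyword longer than 8 characters: " + keyword)
        # Keyword in columns 1-8, "= " in 9-10, value ending in column 30.
        card = f"{keyword:<8}= {value!r:>20}"
        cards.append(card.ljust(80))
    return cards


for card in indexed_cards("PARMS", [1.1, 2.2, 3.3]):
    print(card.rstrip())
```

With a five-letter root such as PARMS the 8-character keyword limit
leaves room for indices up to 999, which covers the "< 100" case.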
Vicki

From thompson at stars.gsfc.nasa.gov Wed Oct 23 10:26:14 1991
Newsgroups: alt.sci.astro.fits
Organization: NASA/GSFC-Laboratory for Astronomy and Solar Physics
News-Software: VAX/VMS VNEWS 1.4-b1
Nntp-Posting-Host: stars.gsfc.nasa.gov
From: thompson at stars.gsfc.nasa.gov (William Thompson, code 682.1, x2040)
Subject: Extensions
Date: 21 Oct 91 17:06:00 GMT

I was told that I could get a list of all registered FITS extensions by sending
a mail message to Preben Grosbol at pgrosbol at eso.org.  I've been trying this
for some time now, and it doesn't work.  Does anybody know how to make contact
with this person, or how to get this information in another way?

I'm particularly interested in getting a description of something called the
'IMAGE' (or possibly 'MATRIX') extension.

Bill Thompson

From thompson at stars.gsfc.nasa.gov Wed Oct 23 10:26:31 1991
Newsgroups: alt.sci.astro.fits
Organization: NASA/GSFC-Laboratory for Astronomy and Solar Physics
News-Software: VAX/VMS VNEWS 1.4-b1
Nntp-Posting-Host: stars.gsfc.nasa.gov
From: thompson at stars.gsfc.nasa.gov (William Thompson, code 682.1, x2040)
Subject: Re: Extensions
Date: 22 Oct 91 14:51:00 GMT

In article <21OCT199113063901 at stars.gsfc.nasa.gov>, I wrote...

>I was told that I could get a list of all registered FITS extensions by
>sending a mail message to Preben Grosbol at pgrosbol at eso.org.  I've been
>trying this for some time now, and it doesn't work.  Does anybody know how to
>make contact with this person, or how to get this information in another way?

>From some of the responses that I've gotten, I now realize that this sounds
like Dr. Grosbol doesn't respond to his mail.  I regret this misapprehension;
this was not the case.  My problem was simply that my mail messages always came
back as undeliverable.  I expect this is because there is something wrong with
the address I have, which is why I asked for help.

I apologize for any offense I may have given anyone.

Bill Thompson

From thompson at stars.gsfc.nasa.gov Wed Oct 23 10:26:49 1991
Newsgroups: alt.sci.astro.fits
Organization: NASA/GSFC-Laboratory for Astronomy and Solar Physics
News-Software: VAX/VMS VNEWS 1.4-b1
Nntp-Posting-Host: stars.gsfc.nasa.gov
From: thompson at stars.gsfc.nasa.gov (William Thompson, code 682.1, x2040)
Subject: IMAGE extension
Date: 21 Oct 91 19:46:00 GMT

Could somebody please post the proposal for the IMAGE extension?  I understand
there is one available from the IAU office.

Thank you,

Bill Thompson

From dmehring at zia.aoc.nrao.edu Wed Oct 23 10:27:01 1991
Newsgroups: alt.sci.astro.fits
Distribution: usa
Organization: National Radio Astronomy Observatory, Socorro NM
From: dmehring at zia.aoc.nrao.edu (David Mehringer)
Subject: program to copy a FITS image (request)
Date: 22 Oct 91 20:40:26 GMT

Sorry if this is a FAQ, but does someone have a program whose sole
purpose is to copy a FITS image at disk level and if so, may I have a
copy?  I am trying to manipulate images at disk level and am in need
of such a program.  Since I will be changing the pixel values, a
program that also checks for the max and min pixel values and
incorporates them into the header would be great.
Thanks in advance.


-- 
Dave Mehringer            | "Every so often someone comes along and tries
dmehring at zia.aoc.nrao.edu | to re-invent the wheel, but usually ends up 
National Radio Astronomy  | with an octagon that has an off-center hole."
Observatory, Socorro, NM  |    -- E.N. Parker (the solar wind guy)

From pence at heawk1.gsfc.nasa.gov Thu Oct 24 15:14:53 1991
Newsgroups: alt.sci.astro.fits
Distribution: usa
Organization: Goddard Space Flight Center
Nntp-Posting-Host: heawk1.gsfc.nasa.gov
From: pence at heawk1.gsfc.nasa.gov ( William Pence)
Subject: Re: program to copy a FITS image (request)
Date: 23 Oct 91 16:49:56 GMT

dmehring at zia.aoc.nrao.edu (David Mehringer) writes:

>Sorry if this is a FAQ, but does someone have a program whose sole
>purpose is to copy a FITS image at disk level and if so, may I have a
>copy?  I am trying to manipulate images at disk level and am in need
>of such a program.  Since I will be changing the pixel values, a
>program that also checks for the max and min pixel values and
>incorporates them into the header would be great.
>Thanks in advance.

David

I am not exactly clear what you want the program to do, but the following
Fortran program is an example of how you could read a FITS image
with 3 calls to the FITSIO subroutine package.

-Bill Pence

C----------------------------------------------------------------------
	program rdimag

C	simple program to open and read a FITS image

C	this assumes that the image is in INTEGER*4 format, and is no
C	larger than 100 X 100 pixels.  Would need to change the following
C	declaration statements if this is not the case...
	parameter (ndim1=100)
	parameter (ndim2=100)
	integer image(ndim1,ndim2),nulval

	character*40 name
	integer iunit,rwmode,block,status
	integer bitpix,naxis,naxes(99),pcount,gcount
	logical simple,extend,anyflg

	print *,'enter name of FITS file:'
	read(*,1000)name
1000	format(a)

C	use fortran unit number 12
	iunit=12
C	open the FITS file with readonly access:
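	rwmode=0
C	FITSIO routines return at once if status is nonzero on entry,
C	so initialize it before the first call:
	status=0
	call ftopen(iunit,name,rwmode,block,status)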

C	read the important primary array header keywords:
	call ftgprh(iunit,simple,bitpix,naxis,naxes,pcount,gcount,
     &              extend,status)

C	should check that this is really a 2-D array, and that the
C	FITS array size does not exceed the IMAGE array dimensions,
C       but we will just assume that everything is OK...

C	assign the value to be used to represent undefined pixels:
	nulval=-999

C	now read the array
	call ftg2dj(iunit,0,nulval,ndim1,naxes(1),naxes(2),
     &              image,anyflg,status)

C	Now do whatever you want with the IMAGE array....

	end
C---------------------------------------------------------------------------

From rlw at stsci.edu Tue Oct 29 17:44:47 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["8637" "" "28" "October" "91" "18:16:22" "GMT" "Rick L. White" "rlw at stsci.edu " "<4273323 at toto.iv>" "152" "image compression" "^From:" nil nil "10" "1991102818:16:22" "image compression" (number " " mark "     Rick L. White     Oct 28  152/8637  " thread-indent "\"image compression\"\n") nil]
	nil)
Newsgroups: alt.sci.astro.fits
Keywords: image compression
Organization: Space Telescope Science Institute
From: rlw at stsci.edu (Rick L. White)
Subject: image compression
Date: 28 Oct 91 18:16:22 GMT

I got into this discussion late (thanks to Bob Hanisch for directing me to
this group and to Don Wells for maintaining the archive); I apologize that
this posting is so long.  As an introduction, I am the person who developed
the image compression technique we will be using to distribute the digitized
sky survey plates that were used in the construction of the Guide Star
Catalog.  There are about 1500 images, each with 14000x14000 I*2 pixels,
so the total data volume is 600 GBytes.  For distribution we will compress
the images by about a factor of 10, i.e.  the compressed data will require
about 1.5 bits/pixel.  This is, of course, lossy compression -- but the only
loss (according both to my analysis and to tests we have done) is that we
have thrown away most of the noise in the image.  In fact, at 1.5 bits/pixel
we are still keeping 1 bit of noise everywhere -- we can compress down to
about 0.7 bits/pixel before we start to throw away real information.  We
will soon be distributing a CD-ROM sampler with 8 different plates (north &
south, galactic plane & pole, etc.) compressed at 3 different levels (about
1.5, 0.7, and 0.3 bits/pixel).

Archie Warnock (warnock at nssdca.gsfc.nasa.gov) writes: 
% In general, compression is tricky - binary files just don't compress
% very well.

Greg Scott Hennessy (gsh7w at fermi.clas.Virginia.EDU) writes:
% As with many things, the answer is "that depends". 

Don Wells (dwells at fits.cx.nrao.edu) writes: 
% I agree with both statements. The real issue is what fraction of the
% bits of the file are "random"...

Don has it exactly right.  It is easy to put a lower bound on the data
volume required to store an image (thus an upper bound on the
compression) if it is to be compressed with no loss:  just count the
total number of bits of noise in the image.  For the GS digitized
plates, the noise is typically sigma = 200 DN in each pixel, so
lossless compression will always require at least 8.5 bits/pixel (+/- 1
sigma requires 8.5 bits).

Not only is this a lower bound to the data volume, but it is actually a
pretty good estimate of what can be achieved.  The real structure in
most astronomical images can be stored with very little information,
typically less than 1--2 bits/pixel (even for crowded fields in the
galactic plane).  If your lossless method requires much more than N+2
bits/pixel, where N is the number of noise bits, then you probably need
a better method.

An upper bound to the data volume can be calculated from the entropy of
the image, S = -Sum[f*log2(f)], where the sum is over all different
pixel values and f is the fraction of pixels having the given value.
This is an upper bound because if there are any correlations between
pixels (as there always are in real images), then less than S bits/pixel
are required.
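
As a rough illustration of this bound, the zero-order entropy (the negative
sum of f*log2(f) over the distinct pixel values) can be computed as in the
sketch below.  The function name and the flat-list input are assumptions made
for the example; it ignores correlations between pixels, which is exactly why
the result is only an upper bound.

```python
from collections import Counter
from math import log2


def zero_order_entropy(pixels):
    """Return S in bits/pixel for a flat sequence of pixel values.

    S = sum(f * log2(1/f)), which equals -sum(f * log2(f)), where f is the
    fraction of pixels having each distinct value.
    """
    counts = Counter(pixels)
    total = len(pixels)
    return sum((n / total) * log2(total / n) for n in counts.values())


# A two-valued image with equal populations needs 1 bit/pixel;
# a constant image needs none.
two_level = zero_order_entropy([0, 0, 1, 1])
constant = zero_order_entropy([7, 7, 7, 7])
```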

Note that one can generate a highly incompressible image simply by
adding a whole bunch of noise bits to each pixel.  One way to do this,
for example, is to convert the image to float and divide by a flat
field.  A flat-fielded R*4 image is very difficult to compress -- if you
insist on lossless compression!  Of course the key is to allow some loss
in the compression.  Lossless compression may make sense for low-noise CCD
images, but no sensible person cares whether a compressed R*4 image agrees
with the original in the last bit.

Markus Buchhorn (markus at mso.anu.edu) writes:
% ... one way of helping the compression algorithm might be to store the data
% as differences from one pixel to the next. If the Universe was well behaved
% and skies were truly smooth, the gain would be astounding ...  As the noise
% increases, the efficiency would decrease.  Another advantage (possibly the
% main one) is that all the dp/dx values would be closer in number-space,
% which means that a trick like compressing high and low bytes separately
% would work much better.

Archie Warnock (warnock at nssdca.gsfc.nasa.gov) writes: 
% ... this is virtually the algorithm IHW used for compressing the 
% digitized large-scale images.  We observed that, for most of our 1600 
% images, although the data itself required 2 bytes for storage, the 
% pixel-to-pixel differences fit in 1 byte.

This method is often suggested for compression, but it does not work very
well for noisy images.  Suppose we have a completely empty image with each
pixel = SKY+NOISE.  If we take the difference of adjacent pixels, then each
pixel has SQRT(2)*NOISE (assuming that the noise is independent and equal in
each pixel).  Taking differences of consecutive pixels _increases_ the
noise, which actually makes the image _harder_ to compress (by about 0.5
bit/pixel).
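
The SQRT(2) factor and the resulting 0.5 bit/pixel can be checked numerically
with a small simulation.  This is only a sketch: the sky level, the sigma of
200, the sample size and the seed are arbitrary values chosen for the
illustration, and the noise is assumed to be Gaussian and independent from
pixel to pixel.

```python
import random
from math import log2, sqrt
from statistics import pstdev


def difference_noise_ratio(n=100_000, sky=1000.0, sigma=200.0, seed=1):
    """Ratio of the noise in adjacent-pixel differences to the pixel noise."""
    rng = random.Random(seed)
    pixels = [sky + rng.gauss(0.0, sigma) for _ in range(n)]
    diffs = [b - a for a, b in zip(pixels, pixels[1:])]
    return pstdev(diffs) / pstdev(pixels)


ratio = difference_noise_ratio()   # expected to be close to sqrt(2)
extra_bits = log2(ratio)           # expected to be close to 0.5 bit/pixel
```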

Even when there is real structure in the image, taking differences often
does not help.  Astronomical images are filled with point sources, so when a
signal appears it tends to change very rapidly from one pixel to the next.
If you take the derivative of an image you spread the structure over more
pixels and make it more complicated, which does not help compression.
Smooth objects are pretty rare in astronomical images.

I think Markus is right that the main reason this method has been found to
be useful is that it shifts the values of pixels so that they are centered
around zero.  One would do better, though, just to subtract the SKY value
directly so that the noise does not get combined.  Of course, if SKY varies
(i.e. there is structure in the image), this doesn't work and one needs a
fancier method.  Nonetheless, just taking differences of adjacent pixels is
not a very good first step in compressing images.

Markus Buchhorn (markus at mso.anu.edu) writes:
% But what about a lossy compression technique ? Before anyone jumps on
% that, look at the way HST will store their digitised sky survey...
% they have achieved quite respectable compression (4:1), and, *they claim*
% that photometry is not compromised by this. (Dunno about crowded fields
% though ???)

Archie Warnock (warnock at nssdca.gsfc.nasa.gov) writes: 
% Excellent idea - especially if we can arrange the compression so what's 
% lost is the noise, not the signal.  Still, astronomers are conservative 
% folks and want to preserve every last bit of noise.

Don Wells (dwells at fits.cx.nrao.edu) writes: 
% The general strategy which I speculate will generally make the maximum
% gain is to split the datastream into streams which have homogeneous
% statistics, and compress each stream separately...  It may be advantageous
% to compress even and odd bytes separately for 16-bit data, or use four
% streams for 32-bit.

Jean-Pierre Veran (veran at cfht.hawaii.edu) writes: 
% The images are generally very noisy and I have found that I can improve
% considerably the compression if I get rid of a certain number of the least
% significant bits, then use COMPACT 1.0 to compress the remaining most
% significant bits and finally just append the least significant bits to
% the file, without trying any compression on them.

Yes, the trick is to separate the noise and the signal.  The signal is
highly compressible, while you are wasting your time if you try to compress
the noise.  I've tried a number of techniques similar to those mentioned
above.  The difficulty with just stripping off the bottom N bits is that
there is still some signal buried in those bits.  If you throw the bits away
(lossy compression) you have lost some potentially valuable information; if
you write them out directly, you have saved a lot of noise to get a little
more information.  If you leave more of the noise bits in the signal
when you do your split, the signal becomes much harder to compress (e.g.
run-length coding fails completely in the presence of even a little
noise.)

The trick in the method I use -- based on the H-transform, see Fritze et
al.  Astron.Nachr (1978) 298, 189 and Capaccioli et al. (1988) Astron.Nachr.
309, 69 -- is that it preserves the mean of the image on all scales.
Effectively it blocks the image in groups of 1x1, 2x2, 4x4, ... pixels and
preserves the average of each such block if it differs significantly from the
average of neighboring blocks.  That means that low-surface brightness
objects, which are not detected in single pixels but _are_ detected if one
averages over blocks, can be preserved even if you throw away the noise.
This method manages to keep as much as possible of the information which is
buried in the pixel noise without keeping too much of the noise itself.
Of course, you can still keep the noise if you want it -- but with this
method it is much clearer where to draw the line and stop trying to
compress the data.
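
To make the block-average idea concrete, here is a sketch of a single level
of a Haar-like transform on 2x2 blocks.  It is not the code used for the GS
plates: the function names, the unnormalized sums and the sign conventions
are assumptions made for illustration.  A full transform would repeat the
step on the block sums to reach the 4x4, 8x8, ... scales, and a lossy version
would quantize the difference coefficients before coding them.

```python
def h_level_forward(img):
    """One level: each 2x2 block -> (sum, row diff, column diff, diagonal diff).

    img is a list of rows of integers with even dimensions.
    """
    out = []
    for r in range(0, len(img), 2):
        out_row = []
        for c in range(0, len(img[0]), 2):
            a00, a01 = img[r][c], img[r][c + 1]
            a10, a11 = img[r + 1][c], img[r + 1][c + 1]
            h0 = a00 + a01 + a10 + a11      # block sum (4 x block mean)
            hx = a10 + a11 - a00 - a01      # lower row minus upper row
            hy = a01 + a11 - a00 - a10      # right column minus left column
            hc = a00 + a11 - a01 - a10      # diagonal term
            out_row.append((h0, hx, hy, hc))
        out.append(out_row)
    return out


def h_level_inverse(coeffs):
    """Exact inverse of h_level_forward for integer input."""
    rows = []
    for coeff_row in coeffs:
        top, bottom = [], []
        for h0, hx, hy, hc in coeff_row:
            top += [(h0 - hx - hy + hc) // 4, (h0 - hx + hy - hc) // 4]
            bottom += [(h0 + hx - hy - hc) // 4, (h0 + hx + hy + hc) // 4]
        rows += [top, bottom]
    return rows


image = [[10, 12, 11, 9],
         [13, 11, 10, 10],
         [50, 52, 11, 12],
         [51, 49, 9, 10]]
coeffs = h_level_forward(image)
restored = h_level_inverse(coeffs)
```

The first coefficient of each block carries the block sum, so the mean
survives at the coarser scale even if the three difference terms are
discarded.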

Rick White,   rlw at stsci.edu
Space Telescope Science Institute

From rlw at stsci.edu Tue Oct 29 17:44:54 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["4211" "" "28" "October" "91" "18:18:32" "GMT" "Rick L. White" "rlw at stsci.edu " "<1242799 at toto.iv>" "74" "FITS compression" "^From:" nil nil "10" "1991102818:18:32" "FITS compression" (number " " mark "     Rick L. White     Oct 28   74/4211  " thread-indent "\"FITS compression\"\n") nil]
	nil)
Newsgroups: alt.sci.astro.fits
Keywords: FITS image compression
Organization: Space Telescope Science Institute
From: rlw at stsci.edu (Rick L. White)
Subject: FITS compression
Date: 28 Oct 91 18:18:32 GMT

A few remarks on the FITS compression proposal:

Greg Scott Hennessy (gsh7w at fermi.clas.Virginia.EDU) writes: 
% It is very common already to have FITS data sets that will not fit
% into RAM, and I think that it would be a mistake to rely on a
% compression scheme that required reading the whole data set into ram. 

Doug Tody NOAO/IRAF CCS (tody at noao.edu) writes: 
% Unless there are good reasons to do otherwise it would be wise to use a
% compression technique which is local in nature ... This is particularly
% important if the data is to be randomly accessed at runtime, for example,
% when reading a FITS image from a CD-ROM, or from disk.  It would also aid in
% recovery from data losses.  A simple technique would be to compress each
% image line independently, using an index to record the offset of each
% compressed line.

This is an important point.  The GS digitized plates are 14000x14000.  It
would be terrible to have to uncompress the entire plate to get at any of
the data.  In fact, it would be terrible to have to uncompress an entire
14000 pixel line of data just to extract a small section of the plate!
The approach we have taken is to compress the image in blocks (500x500);
then one can extract an image section anywhere on the plate by
uncompressing only the required blocks.  The compression technique we're
using is fast enough that I don't think we would bother breaking small
images up -- e.g. it takes only 4 secs to decompress a 500x500 image on a
Sparc-1.
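
The bookkeeping for block-wise access is simple.  The sketch below works out
which blocks overlap a requested section; the function name, the 0-based
pixel coordinates and the (column, row) block indexing are assumptions for
the example, not a description of the actual software.

```python
def blocks_for_section(x0, y0, width, height, block=500):
    """Return (block_col, block_row) indices overlapping a pixel section.

    x0, y0 are the 0-based coordinates of the section's first pixel.
    """
    first_col, last_col = x0 // block, (x0 + width - 1) // block
    first_row, last_row = y0 // block, (y0 + height - 1) // block
    return [(bx, by)
            for by in range(first_row, last_row + 1)
            for bx in range(first_col, last_col + 1)]


# A 14000x14000 plate holds 28 x 28 = 784 blocks of 500x500 pixels.
blocks_per_plate = (14000 // 500) ** 2

# A 100x100 section straddling block boundaries touches only 4 of them.
needed = blocks_for_section(450, 950, 100, 100)
```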

Archie Warnock (warnock at nssdca.gsfc.nasa.gov) writes: 
% The second "trick" [of the FITS compression scheme] is that, when you
% compress the data, you compress everything, including a FITS header.  The
% reason for this is that it allows you to compress _any_ FITS data stream,
% and the result of the decompression will always be a fully-qualified FITS
% data stream, which you can feed directly back into your FITS reader.

Any scheme which is appropriate for binary data will not be appropriate
for text, so it does not really make sense to think of putting both
the header and data through some generic compression tool.  (E.g. if
you apply your successive differences to text and then feed it to
unix compress, I am sure it will not compress as well as the original
text.)  The beauty of the FITS compression proposal, though, is that
one is free to define a compression method that applies one technique
to the header (or does nothing to the header) and an entirely different
technique to the data.  After all, you only have to look for the
END record to know when to change horses.
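
Finding the point at which to change horses could look like the sketch below.
It relies only on the FITS layout of 80-character header records packed into
2880-byte blocks, with the header ending in the block that holds the END
record; the function name and the in-memory byte string are assumptions made
for the example.

```python
CARD = 80      # bytes per header record
BLOCK = 2880   # bytes per FITS logical record (36 header records)


def split_header_and_data(stream):
    """Return (header_bytes, data_bytes), split after the block holding END."""
    for offset in range(0, len(stream), CARD):
        card = stream[offset:offset + CARD]
        if card[:8] == b"END     ":
            header_len = (offset // BLOCK + 1) * BLOCK
            return stream[:header_len], stream[header_len:]
    raise ValueError("no END record found")


# Build a tiny one-block header followed by some binary data.
cards = [b"SIMPLE  =                    T".ljust(CARD), b"END".ljust(CARD)]
header_block = b"".join(cards).ljust(BLOCK)
pixels = bytes(range(16))
header, data = split_header_and_data(header_block + pixels)
```

The two parts can then be handed to different compressors, or the header can
be left untouched.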

Greg Scott Hennessy (gsh7w at fermi.clas.Virginia.EDU) writes: 
% Table 1 in the Warnock et al. draft clearly shows what we all know now.
% Some data sets compress well (using a generic compress, not referring to any
% particular algorithm) and some data sets do not. My conclusion from
% Table 1 is that the space savings from different algorithms is a small
% fraction of the range of savings possible, hence picking a good but
% admittedly not best for all cases algorithm to standardize on is a win
% in keeping down software complexity over supporting multiple
% compression algorithms to gain 5-10 percent savings. 

Archie Warnock (warnock at nssdca.gsfc.nasa.gov) writes: 
% Bang for the buck - a good concept.  Computing time goes up 
% substantially with increasing compression ratio.  I hope no one claims 
% to be able to decide for everyone else what investment level is 
% sufficient.

Actually, deciding how much noise to keep is far more important than
computing time.  It may be that the best technique for lossy compression
is not the best for lossless compression.  The JPEG group took this
approach -- their lossless method is tacked on after the fact and
has little in common with their main (lossy) method.  (Incidentally,
the method we're using for the GS plates performs well for either
lossless or lossy compression.)  In any case, I think the decisions
to be made can affect the compression achieved at the factor of 2--3
level, not just the 5-10 percent level.  I think the idea of having
a fair number of methods to choose from is a good one.

Rick White,   rlw at stsci.edu
Space Telescope Science Institute

From veran at cfht.hawaii.edu Wed Oct  9 09:37:37 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["6394" "Thu" "3" "October" "1991" "00:09:13" "GMT" "Jean-Pierre Veran" "veran at cfht.hawaii.edu " "<864584 at toto.iv>" "175" "compression of fits" "^From:" nil nil "10" "1991100300:09:13" "compression of fits" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Keywords: fits, image compression
Nntp-Posting-Host: nano.cfht.hawaii.edu
Reply-To: veran at cfht.hawaii.edu (Jean-Pierre Veran)
Organization: Canada-France-Hawaii Telescope Corporation
From: veran at cfht.hawaii.edu (Jean-Pierre Veran)
Subject: compression of fits
Date: Thu, 3 Oct 1991 00:09:13 GMT

The Canada-France-Hawaii Telescope has large-format CCDs of 2K by 2K pixels
(2 bytes per pixel, even though only 15 bits are used at the moment).  An
average of 100 images are taken each night, i.e. almost 1 gigabyte per night.
The 8-megabyte images are stored in FITS format on optical disk after being
compressed with the Unix COMPRESS program.  The performance of COMPRESS is
generally poor, so we decided to start a research project to find out how the
compression ratio of our FITS files could be improved.  I am the one in
charge of it.
First I tried compression programs available on the public network
(COMPRESS 4.1, COMPACT 1.0, ARC 5.21, LHARC 1.0, ZOO 2.10, FREEZE 2.2, ZIP 0.9)
on a set of object, flat and bias images.
I found that the best was generally COMPACT 1.0, which uses adaptive
Huffman coding.
The images are generally very noisy and I have found that I can improve
considerably the compression if I get rid of a certain number of the least
significant bits, then use COMPACT 1.0 to compress the remaining most
significant bits and finally just append the least significant bits to
the file, without trying any compression on them.  The problem is that the
optimal number of LSBs to remove depends on the image, and I am trying to
work out a criterion based on simple statistical properties.  This does not
seem very difficult: it needs some more calculation, but only during
compression, not during decompression, and the calculations can be performed
on only a part of the image, the central part for example.
Before compressing the MSBs, I tried using a linear predictor.  The results
show that the compression ratio is generally slightly better, but not always.
It is certainly possible to improve the performance of the predictor, with a
larger neighborhood for example, but it will not lead to miracles anyway.
I have also tested the previous-pixel compression method suggested in the
FITS Compression proposal.  The algorithm is very simple and the results are
pretty good.  Sometimes it is possible to reduce the size of the resulting
file even further with a compression program (COMPRESS 4.1 seems to be the
best).  Still, the results show that better performance can be achieved
simply by separating the least significant bits from the most significant
ones as described above, and I do think that some improvements can still be
made in that direction.
You will find below the results of some tests.  If anybody is interested, I
can mail some more figures.  Any comments or suggestions are very welcome.

Jean-Pierre Veran
Canada-France-Hawaii Telescope
Email: veran at cfht.hawaii.edu

------------------------------------------------------------------------
This document gives the results of compression tests of 3 FITS object images,
without header.

First the SHIFT least significant bits were removed from the image, i.e.
Image = Image >> SHIFT was performed.

Then a linear prediction was computed, using the predictor:
P(X) = 0.351*A + 0.346*C + 0.292*B		B C D
						A X
and each pixel was replaced by its error of prediction X-P(X).
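
A sketch of this prediction step follows, reading the diagram as A = left
neighbor, B = upper-left neighbor and C = the pixel directly above X.  The
function names, the rounding of the prediction to an integer and the use of 0
for neighbors outside the image are assumptions made for the example; the
tests above may have handled the borders differently.

```python
def predict(img, r, c):
    """Predicted value of pixel (r, c) from already-known neighbors."""
    a = img[r][c - 1] if c > 0 else 0                  # A: left
    b = img[r - 1][c - 1] if r > 0 and c > 0 else 0    # B: upper left
    above = img[r - 1][c] if r > 0 else 0              # C: directly above
    return int(round(0.351 * a + 0.346 * above + 0.292 * b))


def prediction_errors(img):
    """Replace each pixel X by its prediction error X - P(X)."""
    return [[img[r][c] - predict(img, r, c) for c in range(len(img[0]))]
            for r in range(len(img))]


def reconstruct(errors):
    """Invert prediction_errors by rebuilding the image in raster order."""
    img = [[0] * len(errors[0]) for _ in errors]
    for r in range(len(errors)):
        for c in range(len(errors[0])):
            img[r][c] = errors[r][c] + predict(img, r, c)
    return img


image = [[100, 102, 101],
         [ 99, 103, 100],
         [101, 100, 250]]
errors = prediction_errors(image)
restored = reconstruct(errors)
```

Because the predictor only looks at pixels that come earlier in raster order,
the decoder can rebuild them first and the step loses nothing.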

Then this image was compressed with an adaptive Huffman coding (Compact 1.0)

Finally all the bits first removed were compacted and added to the file without
any compression.
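
The bit-splitting itself can be sketched as follows.  The function names and
the packing of the removed bits into a big-endian byte string are assumptions
for the illustration, and pixel values are assumed to be non-negative
integers (as for the 15-bit data described above).

```python
def split_bits(pixels, shift):
    """Separate each pixel into its upper part and its SHIFT low bits."""
    mask = (1 << shift) - 1
    msb = [p >> shift for p in pixels]
    lsb = [p & mask for p in pixels]
    return msb, lsb


def join_bits(msb, lsb, shift):
    """Rebuild the original pixels from the two parts."""
    return [(m << shift) | l for m, l in zip(msb, lsb)]


def pack_lsb(lsb, shift):
    """Pack the removed bits, SHIFT bits per pixel, with no compression."""
    acc = 0
    for value in lsb:
        acc = (acc << shift) | value
    nbits = shift * len(lsb)
    return acc.to_bytes((nbits + 7) // 8, "big")


pixels = [1000, 1003, 998, 1001, 1005, 997, 1002, 999]
msb, lsb = split_bits(pixels, 3)
packed = pack_lsb(lsb, 3)          # 8 pixels x 3 bits = 3 bytes
restored = join_bits(msb, lsb, 3)
```

Only the msb stream would go through the Huffman coder; the packed low bits
are appended to the file as they are.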

The tests have been carried out without and with prediction.


          Size of the original image    2048*2048*2     8388608
RATIO = ----------------------------- = ----------- = -----------
	 Size of the compressed image   KBYTES*1000   KBYTES*1000


<-- indicates a global maximum of the compression ratio.

For comparison, the performance of the old version of COMPRESS and of
COMPACT 1.0 used alone is repeated.

Then the previous pixel compression algorithm was tried (PREVPIX),
followed by compression of the resulting file with COMPRESS 4.1 (PREVPIX +
COMPRESS 4.1), which turned out to be the best compression program for
those files.

**********************************************
* Source image: 112069o.fits  Size = 8388608 *
**********************************************
Un. COMPRESS: Ratio = 1.25	KBytes = 6719
COMPACT 1.0 : Ratio = 1.31	KBytes = 6398

PREVPIX:		Ratio = 1.53	Kbytes = 5481
PREVPIX + COMPRESS 4.1:	Ratio = 1.53	Kbytes = 5481

	    ### SHIFT + COMPACT 1.0 ### 

SHIFT	RATIO	KBYTES		SHIFT	RATIO	KBYTES
-----	-----	------		-----	-----	------
  0	1.31	 6398		  4	1.70	 4936
  1	1.48	 5684		  5	1.77	 4750
  2	1.52	 5506		  6	1.81	 4624
  3	1.62	 5178		  7	1.83	 4594	<--

	### SHIFT + PREDICTOR + COMPACT 1.0 ###
SHIFT	RATIO	KBYTES		SHIFT	RATIO	KBYTES
-----	-----	------		-----	-----	------
  0	1.40	 5996		  4	1.77	 4737
  1	1.57	 5326		  5	1.83	 4564
  2	1.60	 5253		  6	1.89	 4444	<--
  3	1.70	 4943		  7	1.85	 4533

**********************************************
* Source image: 111907o.fits  Size = 8388608 *
**********************************************
Un. COMPRESS: Ratio = 1.79	KBytes = 4696
COMPACT 1.0 : Ratio = 1.87	KBytes = 4485

PREVPIX:		Ratio = 1.98	Kbytes = 4250
PREVPIX + COMPRESS 4.1:	Ratio = 1.98	Kbytes = 4250

	    ### SHIFT + COMPACT 1.0 ###
SHIFT	RATIO	KBYTES		SHIFT	RATIO	KBYTES
-----	-----	------		-----	-----	------
  0	1.87	 4485		  4	2.31	 3636
  1	2.01	 4177		  5	2.33	 3597	<--
  2	2.12	 3954		  6	2.30	 3645
  3	2.23	 3766		  7	2.14	 3912

	### SHIFT + PREDICTOR + COMPACT 1.0 ###
SHIFT	RATIO	KBYTES		SHIFT	RATIO	KBYTES
-----	-----	------		-----	-----	------
  0	1.95	 4304		  4	2.34	 3580
  1	2.09 	 4005		  5	2.38	 3528	<--	 
  2	2.21	 3795		  6	2.31	 3631
  3	2.33	 3607		  7	2.21	 3800

**********************************************
* Source image: 121685o.fits  Size = 8388608 *
**********************************************
Un. COMPRESS: Ratio = 2.86	KBytes = 2937
COMPACT 1.0 : Ratio = 3.02	KBytes = 2777

PREVPIX:		Ratio = 1.97	Kbytes = 4251
PREVPIX + COMPRESS 4.1:	Ratio = 2.97	Kbytes = 2825

	    ### SHIFT + COMPACT 1.0 ###
SHIFT	RATIO	KBYTES		SHIFT	RATIO	KBYTES
-----	-----	------		-----	-----	------
  0	3.02	 2777		  4	3.19	 2627
  1	3.19	 2630		  5	3.06	 2737
  2	3.32	 2523		  6	2.58	 3250
  3	3.35	 2501	<--	  7	2.23	 3763

	### SHIFT + PREDICTOR + COMPACT 1.0 ###
SHIFT	RATIO	KBYTES		SHIFT	RATIO	KBYTES
-----	-----	------		-----	-----	------
  0	2.97	 2827		  4	3.15	 2665
  1	3.13	 2673		  5	3.07	 2736
  2	3.23	 2589	<--	  6	2.59	 3243
  3	3.20	 2619		  7	2.23	 3753

Jean-Pierre Veran
Canada-France-Hawaii Telescope
Email: veran at cfht.hawaii.edu

From koffley at nrlvx1.nrl.navy.mil Wed Oct  9 09:36:44 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["1829" "" "1" "October" "91" "10:29:29" "GMT" "koffley at nrlvx1.nrl.navy.mil" "koffley at nrlvx1.nrl.navy.mil" "<3732164 at toto.iv>" "37" "Re: FITS V3.0 available from NRLREAD/NEXT" "^From:" nil nil "10" "1991100110:29:29" "FITS V3.0 available from NRLREAD/NEXT" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Organization: NRL SPACE SYSTEMS DIVISION
From: koffley at nrlvx1.nrl.navy.mil
Subject: Re: FITS V3.0 available from NRLREAD/NEXT
Date: 1 Oct 91 10:29:29 GMT

In article <pence.686247157 at heawk1>, pence at heawk1.gsfc.nasa.gov 
( William Pence) writes:

>In general I would discourage users from obtaining 2nd hand copies of the
>FITSIO software because one may not get the latest version.  As a case in
>point, the new version 3.01 of FITSIO, which fixes a relatively minor bug,
>was released just last Friday.  Currently there are 2 ways to receive the 
>most recent version of FITSIO:
>
>1) by anonymous ftp from  tetra.gsfc.nasa.gov  in subdirectory pub/fitsio3
>
>2) over the SPAN network (e.g., with DECNET copy) from
>
>      NDADSA::HEASARC:[EXOSAT.XANADU.FITSIO.VERSION3]
>
>If any users have difficulty accessing the files by either of these
>methods, then they should send me a message.
>
Bill,

   I wasn't trying to subvert GSFC's efforts here. The FITS package was placed
here as a service to our local community who for the most part are not too
savvy about the network at large and even less so about FTP or DECNET. I only
posted to the group because I figured there might be more of the same out 
in the general population. Everyone seems to know how to send e-mail. 

BTW, was there a general announcement that I missed re: 3.01 ?? I didn't see
it in this group. Is there some forum that such an announcement would take
place without having to FTP to GSFC to find out ? I only ask because the only
reason I would have known there was a 3.00 was the announcement in this
group.
---
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
< Joe Koffley                        KOFFLEY at NRLVAX.NRL.NAVY.MIL             >
< Naval Research Laboratory          KOFFLEY at CCF.NRL.NAVY.MIL                >
< Space Systems Division             AT&T  :  202-767-0894                   >
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

From pence at heawk1.gsfc.nasa.gov Wed Oct  9 09:36:53 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["888" "" "1" "October" "91" "15:54:16" "GMT" " William Pence" "pence at heawk1.gsfc.nasa.gov " "<1318779 at toto.iv>" "17" "Re: FITS V3.0 available from NRLREAD/NEXT" "^From:" nil nil "10" "1991100115:54:16" "FITS V3.0 available from NRLREAD/NEXT" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Organization: Goddard Space Flight Center
Nntp-Posting-Host: heawk1.gsfc.nasa.gov
From: pence at heawk1.gsfc.nasa.gov ( William Pence)
Subject: Re: FITS V3.0 available from NRLREAD/NEXT
Date: 1 Oct 91 15:54:16 GMT

koffley at nrlvx1.nrl.navy.mil writes:

>BTW, was there a general announcement that I missed re: 3.01 ?? I didn't see
>it in this group. Is there some forum that such an announcement would take
>place without having to FTP to GSFC to find out ? I only ask because the only
>reason I would have known there was a 3.00 was the announcement in this
>group.

I am unsure of the best way to advertise new releases of the FITSIO software,
especially since minor bug fixes or enhancements will probably be released
nearly every week for the next month or 2.  Would users prefer to see an
announcement here, or should they be expected to check the distribution
directories (by FTP or DECNET) to see if there is a newer version?  In any
case, I would post a message here whenever a significant new release was
available.

< Bill Pence, HEASARC         pence at tetra.gsfc.nasa.gov  or  LHEAVX::PENCE  >

From thompson at stars.gsfc.nasa.gov Wed Oct  9 09:37:06 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["1393" "" "1" "October" "91" "15:36:52" "GMT" "Bill Thompson" "thompson at stars.gsfc.nasa.gov " "<746049 at toto.iv>" "26" "Proposed TAXISnnn convention" "^From:" nil nil "10" "1991100115:36:52" "Proposed TAXISnnn convention" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Reply-To: thompson at stars.gsfc.nasa.gov
Organization: NASA Goddard Space Flight Center - Greenbelt, MD, USA
News-Software: VAX/VMS VNEWS 1.3-4
Nntp-Posting-Host: stars.gsfc.nasa.gov
From: thompson at stars.gsfc.nasa.gov (Bill Thompson)
Subject: Proposed TAXISnnn convention
Date: 1 Oct 91 15:36:52 GMT


                      "Multidimensional Table" Convention

One way to look at binary tables is as a way of expressing data structures.  In
this view, each row is an element in an array whose data type is a structure,
and each column represents a member of this structure.

The basic binary tables convention expresses this structure array as one
dimensional.  However, one could also organize these structure elements as a
multidimensional array.  For instance, if a binary table had 15 rows, then one
could also organize this as a 5 x 3 array.

The "Multidimensional table" convention would express this organization using
the optional keywords TAXIS, TAXIS1, TAXIS2, etc., which are defined in the
manner of NAXIS, NAXIS1, NAXIS2, etc.  The product of the values of the keyword
variables TAXIS1 x TAXIS2 x ... x TAXISn is required to be equal to the total
number of rows in the binary table as given by NAXIS2.  The example discussed above
would have the following values for these keywords.

        NAXIS2  =                   15  /  The table has 15 rows
        TAXIS   =                    2  /  These rows form a 2D array
        TAXIS1  =                    5  /  The first dimension is 5
        TAXIS2  =                    3  /  The second dimension is 3

The adherence to this convention will be indicated by the presence of the
keyword TAXIS, along with the associated TAXISnnn keywords.
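
A reader could check and apply the convention along the lines of the sketch
below.  The function names are hypothetical, and the ordering (first TAXIS
dimension varying fastest, 1-based indices) is an assumption borrowed from the
usual FITS array ordering; the proposal text itself does not spell it out.

```python
from math import prod


def check_taxis(naxis2, taxis):
    """True if TAXIS1 x TAXIS2 x ... x TAXISn equals the number of rows."""
    return prod(taxis) == naxis2


def row_to_indices(row, taxis):
    """Map a 1-based row number to 1-based indices in the TAXIS array.

    Assumes the first dimension varies fastest.
    """
    offset = row - 1
    indices = []
    for size in taxis:
        indices.append(offset % size + 1)
        offset //= size
    return tuple(indices)


taxis = [5, 3]                       # TAXIS1 = 5, TAXIS2 = 3
valid = check_taxis(15, taxis)       # NAXIS2 = 15
sixth = row_to_indices(6, taxis)     # first row of the second "line"
last = row_to_indices(15, taxis)
```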

From thompson at stars.gsfc.nasa.gov Wed Oct  9 09:37:02 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["2559" "" "1" "October" "91" "15:37:52" "GMT" "Bill Thompson" "thompson at stars.gsfc.nasa.gov " "<1116841 at toto.iv>" "43" "Proposed TDESCnnn convention" "^From:" nil nil "10" "1991100115:37:52" "Proposed TDESCnnn convention" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Reply-To: thompson at stars.gsfc.nasa.gov
Organization: NASA Goddard Space Flight Center - Greenbelt, MD, USA
News-Software: VAX/VMS VNEWS 1.3-4
Nntp-Posting-Host: stars.gsfc.nasa.gov
From: thompson at stars.gsfc.nasa.gov (Bill Thompson)
Subject: Proposed TDESCnnn convention
Date: 1 Oct 91 15:37:52 GMT


			 "Dimension Label" Convention

This convention describes a way to assign labels to the dimensions of arrays in
a binary table.  The purpose of this convention is to provide a simple standard
for documenting the interrelationships between arrays in a binary table.  This
convention operates by assigning the keyword TDESCnnn to a column nnn which
contains an array with one or more dimensions.  This keyword would have the
format TDESCnnn='(label1,label2,...)' where label1, etc. are the dimension
labels.  There would be one dimension label for each fundamental dimension.
The labels are character strings of up to eight characters.  The same
rules that apply to user defined keywords also apply to dimension labels.
Adherence to this convention is indicated by the presence of the keywords
TDESCnnn.

Optional documentation can be assigned to the dimension labels in one of two
ways.  The simplest way is to have a keyword with the same name as the
dimension label.  This keyword would take a character string value which would
describe the meaning of the dimension label.  A second approach would be to
have a separate FITS extension with the keyword EXTNAME='extname.labeln', where
extname is the value of EXTNAME in the current extension, and labeln is the
dimension label.  The structure of this associated extension is left up to the
user.  It is not required that the dimension label be documented with either of
these two schemes.

For instance, suppose that column 1 in a binary table contains a 100x100x50
array, where the first two dimensions represent spatial position in right
ascension and declination, and the third dimension represents time.  Column 2
contains an array which depends only on RA and DEC, and the third column
contains an array depending only on time.  The binary tables extension header
could then contain the following entries:

        TFORM1  = '500000I'             /  16 bit integers
        TDIM1   = '(100,100,50)'        /  Dimensions
        TDESC1  = '(RA,DEC,TIME)'       /  Dimension labels
        TFORM2  = '10000I '             /  16 bit integers
        TDIM2   = '(100,100)'           /  Dimensions
        TDESC2  = '(RA,DEC)'            /  Dimension labels
        TFORM3  = '50I    '             /  16 bit integers
        TDESC3  = '(TIME) '             /  Dimension label
        RA      = 'Right Ascension in 1 arcsec inc.'    /  Meaning of RA
        DEC     = 'Declination in 1 arcsec increments'  /  Meaning of DEC
        TIME    = 'Time in 10 second increments'        /  Meaning of TIME
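
As a rough illustration of how a reader might use these keywords, the Python
sketch below pairs each TDESCnnn label with the matching TDIMnnn size.  The
helper names (parse_tuple, label_dimensions) and the plain-dictionary header
are assumptions made for this example only; they are not part of the proposal
or of any FITS library.

```python
def parse_tuple(value):
    """Split a '(a,b,c)' keyword value into a list of stripped strings."""
    inner = value.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError(f"not a parenthesised list: {value!r}")
    return [item.strip() for item in inner[1:-1].split(",")]


def label_dimensions(header, column):
    """Return [(label, size), ...] for one column of a binary table.

    `header` is a hypothetical dict mapping keyword names to string values.
    If TDIMnnn is absent, the column is treated as one-dimensional and its
    length is taken from the leading repeat count of TFORMnnn.
    """
    labels = parse_tuple(header[f"TDESC{column}"])
    tdim = header.get(f"TDIM{column}")
    if tdim is not None:
        sizes = [int(s) for s in parse_tuple(tdim)]
    else:
        count = ""
        for ch in header[f"TFORM{column}"].strip():
            if not ch.isdigit():
                break
            count += ch
        sizes = [int(count) if count else 1]
    if len(labels) != len(sizes):
        raise ValueError("number of labels does not match number of dimensions")
    for label in labels:
        if not 1 <= len(label) <= 8:
            raise ValueError(f"label must be 1 to 8 characters: {label!r}")
    return list(zip(labels, sizes))


# The header entries shown above, as a plain dictionary (illustrative only).
EXAMPLE_HEADER = {
    "TFORM1": "500000I", "TDIM1": "(100,100,50)", "TDESC1": "(RA,DEC,TIME)",
    "TFORM2": "10000I ", "TDIM2": "(100,100)", "TDESC2": "(RA,DEC)",
    "TFORM3": "50I    ", "TDESC3": "(TIME) ",
}

for col in (1, 2, 3):
    print(col, label_dimensions(EXAMPLE_HEADER, col))
```

Columns that share a label (RA, DEC or TIME here) can then be matched up by
name rather than by position.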

From thompson at stars.gsfc.nasa.gov Wed Oct  9 09:37:10 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["2960" "" "1" "October" "91" "15:38:43" "GMT" "Bill Thompson" "thompson at stars.gsfc.nasa.gov " "<1557667 at toto.iv>" "60" "Proposed TNDIMnnn convention" "^From:" nil nil "10" "1991100115:38:43" "Proposed TNDIMnnn convention" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Reply-To: thompson at stars.gsfc.nasa.gov
Organization: NASA Goddard Space Flight Center - Greenbelt, MD, USA
News-Software: VAX/VMS VNEWS 1.3-4
Nntp-Posting-Host: stars.gsfc.nasa.gov
From: thompson at stars.gsfc.nasa.gov (Bill Thompson)
Subject: Proposed TNDIMnnn convention
Date: 1 Oct 91 15:38:43 GMT


		       "Variable Dimensions" Convention

This convention describes not only how columns in a binary table can contain
multidimensional arrays as their elements, but also how the number and size of
the dimensions can be different for each row of the table.  The convention
works by storing the dimensions of an array stored in one column (nnn) in
another array stored in a second column (mmm).  The format of this second
description array is discussed below.  The interrelationship between the data
arrays (column nnn) and the description arrays (column mmm) is indicated by the
use of the optional keyword "TNDIMnnn=mmm".  Adherence to this convention will
be indicated by the presence of the TNDIMnnn keyword.

The description array used by this convention is a 16 bit integer array with
the following format: the first element contains the number of dimensions in
the data array being described; the second element contains the size of the
most rapidly varying dimension of the data array; the third the size of the
next most rapidly varying dimension; and so forth up to the total number of
dimensions.  Any extra elements in the description array after the final
dimension parameter will be ignored, and should be padded with zeros.  The
product of the dimension sizes must be less than or equal to the size of the
data array as given by the keyword TFORMnnn.  The number of dimensions can be
zero to signal that no data is stored in the data array for that row in the
binary table.

If multiple columns in the binary table represent arrays with the same
dimensional structure, then they can all refer to the same description array.

The "Variable length array" facility could also be used with this convention.
Either the data array or description array, or both, could be defined as
variable length arrays.

As an example, suppose that both columns 3 and 4 contain a series of
multidimensional data arrays of different sizes, and that the associated array
sizes are in column 5.  The header could contain the following keywords and
values:

        TFORM3  = '100I    '            /  16 bit integers
        TNDIM3  =                    5  /  Dimensions contained in column 5
        TFORM4  = '100I'                /  16 bit integers
        TNDIM4  =                    5  /  Dimensions contained in column 5
        TFORM5  = '4I      '            /  Dims. stored as 16 bit integers

The data arrays in columns 3 and 4 could then contain up to three dimensions,
one less than the size of the array in column 5.  Valid values of the array
stored in column 5 can be represented symbolically as

        (1,100)
        (2,10,10)
        (2,20,5)
        (3,5,5,2)
        (3,4,4,4)

even though the last example adds up to only 64 points and not 100.  However,
the description array

        (3,5,5,5)

would not be valid because this requires 5 x 5 x 5 = 125 elements, which is
bigger than the 100 elements allocated to the arrays in columns 3 and 4.
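
The validity rule can be sketched in a few lines of Python.  This is only an
illustration of the rule as described above; the function name
check_description and its arguments are assumptions made for the example.

```python
from math import prod


def check_description(desc, capacity):
    """Return the dimension sizes encoded in a description array.

    desc[0] is the number of dimensions; the next desc[0] elements are the
    dimension sizes, most rapidly varying first.  Trailing elements are
    ignored.  `capacity` is the element count given by TFORMnnn for the data
    column.  Raises ValueError if the description does not fit.
    """
    if len(desc) == 0:
        raise ValueError("empty description array")
    ndim = desc[0]
    if ndim < 0 or ndim > len(desc) - 1:
        raise ValueError("dimension count does not fit in description array")
    dims = list(desc[1:1 + ndim])
    if ndim == 0:
        return []  # zero dimensions: no data stored for this row
    if any(d < 0 for d in dims):
        raise ValueError("dimension sizes must not be negative")
    if prod(dims) > capacity:
        raise ValueError(
            f"{prod(dims)} elements required, only {capacity} allocated")
    return dims


for desc in [(1, 100), (2, 10, 10), (2, 20, 5), (3, 5, 5, 2), (3, 4, 4, 4)]:
    print(desc, "->", check_description(desc, 100))
```

Calling check_description((3, 5, 5, 5), 100) raises ValueError, matching the
invalid case above.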

From tody at noao.edu Wed Oct  9 09:37:14 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["1991" "" "1" "October" "91" "16:24:19" "GMT" "Doug Tody" "tody at noao.edu " "<2157090 at toto.iv>" "40" "Re: Proposed TAXISnnn/TDESCnnn/TNDIMnnn conventions" "^From:" nil nil "10" "1991100116:24:19" "Proposed TAXISnnn/TDESCnnn/TNDIMnnn conventions" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Organization: National Optical Astronomy Observatories, Tucson, AZ, USA
From: tody at noao.edu (Doug Tody)
Subject: Re: Proposed TAXISnnn/TDESCnnn/TNDIMnnn conventions
Date: 1 Oct 91 16:24:19 GMT

Bill, I suppose any number of conventions can be proposed and tried.
Since conventions are not part of the FITS standard one is free to try
anything in this regard.  For the sake of argument, I will offer some
comments on your proposals.

TAXISnnn
	The one-dimensional array-of-records format upon which binary tables
	are based was not chosen arbitrarily; it is what is used in a
	relational database.  There is a wealth of experience with this
	model, not only in commercial databases, but in the existing "table"
	implementations found in various astronomical software systems.
	Your N-dimensional array of structures, on the other hand, is in my
	opinion a fairly arbitrary generalization of the simple table, and
	does not justify departing from the relational model.  If we want to
	abandon the relational model and propose a database with more complex
	structures then there is no end of ways this can be done - and
	currently no standards (this is a hot area of research).

TDESCnnn
	You are starting to reinvent world coordinate systems.  See the
	paper by Hanisch et al. describing a proposed convention for
	representing world coordinate systems in FITS.  In the general case
	this is a very difficult problem.

TNDIMnnn
	This is a good idea (allowing the array dimension to be specified
	separately for each record).  A generalization of the TDIMnnn
	convention along these lines is probably needed.

This discussion illustrates further why Appendix A is a convention and not
part of the core proposal.

These proposed conventions would all be layered upon the basic binary tables
format.  Ultimately many such conventions are likely to be developed to
represent different types of data.  Discussion of the basic binary tables
format is also welcome.
-- 
Doug Tody, National Optical Astronomy Observatories, Tucson AZ, 602-325-9217
UUCP: {arizona,decvax,ncar}!noao!tody  or  uunet!noao.edu!tody 
Internet: tody at noao.edu             SPAN/HEPNET: NOAO::TODY (NOAO=5355)

From tody at noao.edu Wed Oct  9 09:38:03 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["2317" "" "4" "October" "91" "21:37:17" "GMT" "Doug Tody" "tody at noao.edu " "<2720600 at toto.iv>" "40" "Re: Maximum blocking factor" "^From:" nil nil "10" "1991100421:37:17" "Maximum blocking factor" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Organization: National Optical Astronomy Observatories, Tucson, AZ, USA
From: tody at noao.edu (Doug Tody)
Subject: Re: Maximum blocking factor
Date: 4 Oct 91 21:37:17 GMT

>From article <9110031830.AA05079 at fits.cx.nrao.edu>, by CUR%STARLINK.RUTHERFORD.AC.UK at VTVM2.CC.VT.EDU (Malcolm J. Currie):
> In the IRAF 2.9.1 patch of Sept. 6 there is an `enhancement' that makes
> the maximum FITS blocking factor 22 rather than 10.  In my humble
> opinion this is a mistake.  I think that the importance of
> transportability far outweighs the efficiency gain.  Users should be
> protected from shooting themselves (or the recipients of the
> non-standard tape) in the foot.

The program is capable of the higher blocking factor but will issue a stern
warning message that the tape is non standard and may be unreadable
elsewhere if the user attempts to do this.  Obviously, any user producing a
tape to be read elsewhere should produce a standard FITS tape.  FITS is not
always used to produce data to be read elsewhere.

> On the question of increasing the maximum blocking factor for fixed-
> block sequential media, I'm not convinced that we need blocksizes
> greater than 28800 bytes for Exabyte...

The blocking factor has little effect on tape usage for either Exabyte or
DAT since the drives are intelligent and will reblock the data and store
it on the tape in fixed size blocks.  The only wasted storage is that
required to pad the last physical device record to accommodate the logical
block size specified by the client.  In principle a larger block size
could aid streaming but in tests I have performed the block size has had
little effect on the sustained transfer rate.

The larger block size is useful mainly for 6250 bpi reel tape and
cartridge tape.  6250 bpi tapes require more storage for a smaller blocking
factor.  In the case of cartridge tapes this doesn't matter but the larger
block can aid streaming (this is best dealt with at a lower level than FITS
however).

Yes, the huge (2 Mb) file mark associated with current Exabyte drives is a
serious problem if you write many small FITS files.  If you are going to
write many small files you are probably better off with DAT, which has a
small file mark and which can position to any file very quickly.
-- 
Doug Tody, National Optical Astronomy Observatories, Tucson AZ, 602-325-9217
UUCP: {arizona,decvax,ncar}!noao!tody  or  uunet!noao.edu!tody 
Internet: tody at noao.edu             SPAN/HEPNET: NOAO::TODY (NOAO=5355)

From sla at fast.ucsc.edu Wed Oct  9 09:38:09 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["1574" "" "4" "October" "91" "22:39:32" "GMT" "Steve Allen" "sla at fast.ucsc.edu " "<7221301 at toto.iv>" "27" "Exabyte tape marks (was Re: Maximum blocking factor)" "^From:" nil nil "10" "1991100422:39:32" "Exabyte tape marks (was Re: Maximum blocking factor)" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Summary: many device drivers are immature or incomplete
Keywords: 8-mm Exabyte tape EOF marks
Organization: UCO/Lick Observatory
From: sla at fast.ucsc.edu (Steve Allen)
Subject: Exabyte tape marks (was Re: Maximum blocking factor)
Date: 4 Oct 91 22:39:32 GMT

In article <9110031830.AA05079 at fits.cx.nrao.edu> CUR%STARLINK.RUTHERFORD.AC.UK at VTVM2.CC.VT.EDU (Malcolm J. Currie) writes:
>                             More of a limitation on Exabyte is the
>maximum number of files caused by the huge tape marks (equivalent to
>about 2Mb).  You cannot get more than 320 files on a labelled cartridge
>or 950 on an unlabelled cartridge.
>I was intrigued by Richard Stover's findings.  Are there no inter-block
>gaps on Exabyte?
>Malcolm Currie                               Janet: CUR at UK.AC.RL.STAR
>Starlink                                     SPAN : RLVAD::CUR

Richard Stover has complete control over his Exabyte, including the
ability to write the short file marks (as opposed to the default
long file marks).  The short file marks take up only ~300 kbytes
and in addition to being short, they are faster to write.  This is extremely
important for some of the rapid-readout of large CCDs done at Lick.
All Exabyte hardware should be able to write these short marks;
the deficiency usually lies in the device driver software of the OS.

Richard Stover's data acquisition system actually puts a limit of
1600 FITS files per Exabyte tape to prevent having too much valuable
data on a single easy-to-lose-or-destroy tape.


_______________________________________________________________________________
Steve Allen          |                                |   sla at helios.ucsc.edu
UCO/Lick Observatory |     This space for rent.       | If the UC were opining,
Santa Cruz, CA 95064 |                                | it wouldn't tell me.

From tody at noao.edu Wed Oct  9 09:38:16 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["1558" "" "5" "October" "91" "07:58:55" "GMT" "Doug Tody" "tody at noao.edu " "<7645480 at toto.iv>" "28" "Re: Exabyte tape marks (was Re: Maximum blocking factor)" "^From:" nil nil "10" "1991100507:58:55" "Exabyte tape marks (was Re: Maximum blocking factor)" nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Organization: National Optical Astronomy Observatories, Tucson, AZ, USA
From: tody at noao.edu (Doug Tody)
Subject: Re: Exabyte tape marks (was Re: Maximum blocking factor)
Date: 5 Oct 91 07:58:55 GMT

>From article <21756 at darkstar.ucsc.edu>, by sla at fast.ucsc.edu (Steve Allen):
> Richard Stover has complete control over his Exabyte, including the
> ability to write the short file marks (as opposed to the default
> long file marks).  The short file marks take up only ~300 kbytes
> and in addition to being short, they are faster to write.

The Exabyte can write two types of filemarks: short filemarks (about 0.5 Mb)
and long filemarks (2.2 Mb).  The difference is that the long filemark
includes a gap allowing the filemark to be overwritten.  So long as one
only writes to a tape at EOT the difference is not important.

Whether or not one makes use of short filemarks depends upon the driver
used.  The Sun SCSI tape driver, at least to date, writes only long
filemarks.  Most people with Suns use the Sun st driver since it comes with
SunOS.   Better drivers can be found from third party vendors.  ApUNIX sells
such a driver, although we have not tried it here (we do use their DAT
driver).  Features such as short filemarks for Exabyte or fast file
seeks for DAT may not be available with a poor driver.

If I recall correctly DAT filemarks are 130 Kb or so...

These drives employ considerable data buffering internally (246 Kb in the
case of Exabyte) which is why the block size has little effect on the
sustained transfer rate.
-- 
Doug Tody, National Optical Astronomy Observatories, Tucson AZ, 602-325-9217
UUCP: {arizona,decvax,ncar}!noao!tody  or  uunet!noao.edu!tody 
Internet: tody at noao.edu             SPAN/HEPNET: NOAO::TODY (NOAO=5355)

From dwells at fits.cx.nrao.edu Tue Oct  1 10:50:56 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["3891" "Tue" "1" "October" "1991" "14:40:40" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " nil "82" "Draft Proposal re Fixed-block Sequential Media." "^From:" nil nil "10" nil nil nil nil]
	nil)
X-VM-Summary-Format: "%n %*%a %-17.17F %-3.3m %2d %4l/%-5c %I\"%s\"\n"
X-VM-Labels: nil
X-VM-VHeader: ("Resent-" "From:" "Sender:" "To:" "Apparently-To:" "Cc:" "Subject:" "Date:") nil
X-VM-Bookmark: 2
Newsgroups: alt.sci.astro.fits
Organization: National Radio Astronomy Observatory
From:  dwells at fits.cx.nrao.edu (Don Wells)
Subject:  Draft Proposal re Fixed-block Sequential Media.
Date: Tue, 1 Oct 1991 14:40:40 GMT


This draft proposal has been placed in the anonymous-FTP server on
fits.cx.nrao.edu [192.33.115.8] in directory FITS/doc:
-rw-r--r--  1 dwells   vlb          2183 Oct  1 09:51 blocking90.txt
This is a short file, and so I am appending it to this announcement.

FITS is a bitstream data format which is media-independent (contrary
to the impression given by the Wells-Greisen-Harten Basic FITS paper
of 1981). FITS *logical* records have length 2880 bytes. Blocking is
regarded as being a media-dependent matter, not a part of the FITS
standard per se. For 9-tk tapes the FITS community has agreed to use
blocking factors between 1 and 10 (28800 bytes/block) inclusive for
FITS files.  Other types of media need to be separately negotiated.
During the past few years it has become clear that 9tk tapes are being
superseded by several types of cartridge media. This text proposes
rules for several of these media.

This draft has been under consideration by the FITS committees for
about two years, and has not yet been endorsed by them. I encourage
everyone interested in FITS to consider the implications carefully.
For example, what would be the increase in efficiency if DATs and/or
Exabytes were allowed to have blocking factor greater than 10, and do
we want that efficiency enough to change our software which already
supports 28800-byte blocks? 

Donald C. Wells             Associate Scientist        dwells at nrao.edu
National Radio Astronomy Observatory                   +1-804-296-0277
Edgemont Road                                     Fax= +1-804-296-0278
Charlottesville, Virginia 22903-2475 USA            78:31.1W, 38:02.2N 

=-=-=-=-=-=-=-=-=-=-=-=-=-=-= cut here =-=-=-=-=-=-=-=-=-=-=-=-=-=-=

                            Draft Proposal
                                 for
               Blocking of Fixed-block Sequential Media
                                 and
                           Bitstream Devices

                        P.Grosbol and D.Wells
                            1990-August-10


                             Introduction

A number of high density storage media (e.g. optical disks and helical
scan tapes) are of significant interest for the exchange of large volumes of
astronomical data in FITS format. Many controllers and devices for these
media can only access data in blocks of fixed length, typically 2**n bytes.
Thus, a special blocking agreement for FITS files is required for this
type of media.

                 Draft Proposal for fixed-block media.

For fixed-block sequential media, FITS files consisting of an integer
number of 2880 byte logical records should be regarded as a bitstream
written out with the fixed blocking size of the media with the last
block being padded out with zeros to the length of the fixed blocks.
Reading an incomplete FITS logical record should be regarded as an
end-of-file. This proposal applies to optical disks (accessed as a
sequential set of records), QIC format 1/4 inch cartridge tapes
and Local Area Networks.

The fixed-block blocking proposal conforms to the general rules
for blocking of FITS files (Grosbol et al. 1988, Astron. Astrophys.
Suppl. 73, p359) using a blocking factor of 2**n/2880 for a medium
with a fixed block size of 2**n bytes.
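
A minimal Python sketch of the writing and reading rules for fixed-block
media follows.  The function names pad_to_blocks and whole_records, and the
use of in-memory byte strings in place of a real device, are assumptions made
for illustration; they are not taken from the proposal.

```python
LOGICAL_RECORD = 2880  # bytes per FITS logical record


def pad_to_blocks(fits_bytes, block_size):
    """Zero-pad a FITS bitstream to a whole number of fixed-size blocks."""
    if len(fits_bytes) % LOGICAL_RECORD != 0:
        raise ValueError("FITS file must be a whole number of 2880-byte records")
    shortfall = -len(fits_bytes) % block_size  # bytes needed to fill last block
    return fits_bytes + bytes(shortfall)


def whole_records(stream):
    """Yield complete 2880-byte records; an incomplete one ends the file.

    Zero padding can itself span a whole logical record when the block size
    is large; this sketch does not try to detect that case.
    """
    for start in range(0, len(stream) - LOGICAL_RECORD + 1, LOGICAL_RECORD):
        yield stream[start:start + LOGICAL_RECORD]


# Three logical records written to a medium with 2048-byte (2**11) blocks.
data = bytes([1]) * (3 * LOGICAL_RECORD)
padded = pad_to_blocks(data, 2048)
print(len(data), "bytes padded to", len(padded))
print(len(list(whole_records(padded))), "complete logical records read back")
```

In this example 8640 bytes occupy five 2048-byte blocks, and the 1600 bytes
of padding are too short to be mistaken for a further logical record.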

                   Draft Proposal for bitstream devices.

For bitstream devices, FITS files should be written with a blocking
factor of one i.e. with fixed blocks of 2880 bytes corresponding to
the logical record size. This proposal applies to FITS files written
to logical file systems.

                 Draft Proposal for variable-block media.

For variable block length sequential media, FITS files may be written
with an integer blocking factor between 1 and 10 inclusive, as specified
in the blocking agreement for 1/2 inch 9 track tapes. This proposal applies
to FITS files written to DDS/DAT 4mm cartridge tapes and
8mm cartridge tape (Exabyte).


From sla at fast.ucsc.edu Tue Oct  1 20:51:41 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["1984" "" "1" "October" "91" "22:07:37" "GMT" "Steve Allen" "sla at fast.ucsc.edu " nil "35" "Re: Draft Proposal re Fixed-block Sequential Media." "^From:" nil nil "10" nil nil nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Summary: efficiency gains are not large
Keywords: FITS, exabyte
Organization: UCO/Lick Observatory
From: sla at fast.ucsc.edu (Steve Allen)
Subject: Re: Draft Proposal re Fixed-block Sequential Media.
Date: 1 Oct 91 22:07:37 GMT

In article <9110011440.AA29235 at fits.cx.nrao.edu> dwells at fits.cx.nrao.edu (Don Wells) writes:
>This draft has been under consideration by the FITS committees for
>about two years, and has not yet been endorsed by them. I encourage
>everyone interested in FITS to consider the implications carefully.
>For example, what would be the increase in efficiency if DATs and/or
>Exabytes were allowed to have blocking factor greater than 10, and do
>we want that efficiency enough to change our software which already
>supports 28800-byte blocks?
>

Richard Stover at Lick Observatory has investigated the issues of blocksize
on Exabyte tapes.  Richard is the author and maintainer of the Lick
data acquisition system.  His observations are included below.
--------------------
Years ago I looked into Exabyte efficiency with respect to blocking.
One can query the Exabyte for the number of 1024-byte blocks remaining
on the tape.  Therefore, one can write some number of n-byte variable
length blocks and then ask the Exabyte how many 1024-byte blocks are
left.  From this one can easily deduce the tape efficiency.  My results
were:
	2048 or 4096 byte blocks are 100% efficient.
	2880 byte blocks are 93% efficient.
	28800 byte blocks are 96.7% efficient.
Given these small efficiency differences I have chosen to write 2880-byte
records, although the software was designed to handle a blocking factor
up to 10.  One reason to write 28800-byte records instead of 2880-byte
records might be that the number of system calls is reduced by a factor
of 10.  But machines are so fast these days that this didn't seem to
me like an overriding concern.

Richard Stover
--------------
_______________________________________________________________________________
Steve Allen          |                                |   sla at helios.ucsc.edu
UCO/Lick Observatory |     This space for rent.       | If the UC were opining,
Santa Cruz, CA 95064 |                                | it wouldn't tell me.

From CUR%STARLINK.RUTHERFORD.AC.UK at VTVM2.CC.VT.EDU Fri Oct  4 07:18:03 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["285" "Fri" "4" "October" "1991" "10:04:00" "GMT" "\"Malcolm J. Currie\"" "CUR%STARLINK.RUTHERFORD.AC.UK at VTVM2.CC.VT.EDU" nil "7" "Optimum blocking factor for Exabyte" "^From:" nil nil "10" nil nil nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Organization: National Radio Astronomy Observatory
From:  "Malcolm J. Currie" <CUR%STARLINK.RUTHERFORD.AC.UK at VTVM2.CC.VT.EDU>
Subject:   Optimum blocking factor for Exabyte
Date: Fri, 4 Oct 1991 10:04:00 GMT

Since the efficiency differences on Exabyte are due to the padding of
data blocks to a multiple of 1024-bytes, the optimum FITS blocking
factor is 6.
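
The arithmetic behind this can be checked with a short Python sketch.  It
assumes, as stated above, that the only loss is the padding of each block to
the next multiple of 1024 bytes; the function name efficiency is made up for
the example.

```python
LOGICAL_RECORD = 2880  # bytes per FITS logical record


def efficiency(factor, pad_unit=1024):
    """Fraction of tape holding data when each block is padded to pad_unit."""
    block = factor * LOGICAL_RECORD
    padded = -(-block // pad_unit) * pad_unit  # round up to a multiple
    return block / padded


for factor in range(1, 11):
    print(f"blocking factor {factor:2d}: {100 * efficiency(factor):5.1f}%")

best = max(range(1, 11), key=efficiency)
print("best blocking factor in 1..10:", best)
```

Under that assumption a blocking factor of 6 (17280 bytes, padded to 17408)
comes out at about 99.3%, the highest value for any factor from 1 to 10.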


Malcolm Currie                               Janet: CUR at UK.AC.RL.STAR
Starlink                                     SPAN : RLVAD::CUR

From forveill at gag.observg.grenet.fr Tue Oct  8 09:11:39 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["2916" "Tue" "8" "October" "1991" "08:33:01" "GMT" "Thierry Forveille" "forveill at gag.observg.grenet.fr " nil "54" "Blocking factor" "^From:" nil nil "10" nil nil nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Path: cv3.cv.nrao.edu!mail-to-news-gateway
Message-ID:  <199110080835.AA17688 at corton.inria.fr>
Organization: National Radio Astronomy Observatory
Lines: 54
From:  forveill at gag.observg.grenet.fr (Thierry Forveille)
Sender: news at nrao.edu
Subject:  Blocking factor
Date: Tue, 8 Oct 1991 08:33:01 GMT

>>From article <9110031830.AA05079 at fits.cx.nrao.edu>, by CUR%STARLINK.RUTHERFORD.AC.UK at VTVM2.CC.VT.EDU (Malcolm J. Currie):
>> In the IRAF 2.9.1 patch of Sept. 6 there is an `enhancement' that makes
>> the maximum FITS blocking factor 22 rather than 10.  In my humble
>> opinion this is a mistake.  I think that the importance of
>> transportability far outweighs the efficiency gain.  Users should be
>> protected from shooting themselves (or the recipients of the
>> non-standard tape) in the foot.
>
>The program is capable of the higher blocking factor but will issue a stern
>warning message that the tape is non standard and may be unreadable
>elsewhere if the user attempts to do this.  Obviously, any user producing a
>tape to be read elsewhere should produce a standard FITS tape.  FITS is not
>always used to produce data to be read elsewhere.
>
>> On the question of increasing the maximum blocking factor for fixed-
>> block sequential media, I'm not convinced that we need blocksizes
>> greater than 28800 bytes for Exabyte...
>
>The blocking factor has little effect on tape usage for either Exabyte or
>DAT since the drives are intelligent and will reblock the data and store
>it on the tape in fixed size blocks.  The only wasted storage is that
>required to pad the last physical device record to accommodate the logical
>block size specified by the client.  In principle a larger block size
>could aid streaming but in tests I have performed the block size has had
>little effect on the sustained transfer rate.
>
>The larger block size is useful mainly for 6250 bpi reel tape and
>cartridge tape.  6250 bpi tapes require more storage for a smaller blocking
>factor.  In the case of cartridge tapes this doesn't matter but the larger
>block can aid streaming (this is best dealt with at a lower level than FITS
>however).

Even with stern warning messages, this alteration of the maximum blocking 
factor remains a violation of the FITS standard. If the IRAF group wants 
to stick to their "enhancement", they should simultaneously set SIMPLE
to F, and shouldn't call the result a FITS file. 

A standard is a standard! When the IRAF group vetoed hierarchical keywords
a few years ago, the radio community kept complying with the unmodified
FITS agreement, even though most of these tapes will never enter a non-radio
package. Even though many FITS tapes aren't meant to be read elsewhere in 
principle, some of them do turn out to be. Even worse, some tapes will not be 
read elsewhere, but they will be read later, and some of them will survive 
IRAF.

The fact that this modification is only justified by a marginal capacity 
enhancement on an already obsolescent medium makes it even harder to 
justify. Who cares about 20% on a 6250 bpi reel when DATs and Exabyte
drives are now cheaper than any tape drive?


	Thierry Forveille
        Observatoire de Grenoble
        forveill at frgag51.bitnet

From dwells at fits.cx.nrao.edu Tue Oct 22 09:44:32 1991
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil]
	["1157" "Tue" "22" "October" "1991" "12:28:42" "GMT" "Don Wells" "dwells at fits.cx.nrao.edu " nil "23" "XTENSION='IMAGE' text" "^From:" nil nil "10" nil nil nil nil]
	nil)
Newsgroups: alt.sci.astro.fits
Organization: National Radio Astronomy Observatory
From:  dwells at fits.cx.nrao.edu (Don Wells)
Subject:  XTENSION='IMAGE' text
Date: Tue, 22 Oct 1991 12:28:42 GMT

I asked J. Ramon Munoz to supply a machine-readable version of the IUE
'IMAGE' proposal, and he has done so, and I have put his text into the
anonymous-FTP server on fits.cx.nrao.edu [192.33.115.8] in directory
FITS/doc:

-rw-r--r--  1 dwells   vlb          6043 Oct 22 08:17 image_extension.txt

This is not the published version. It is "a portion of an internal
note (1988) where a short description of the IMAGE Extension is
given".

NOTE: The XTENSION='IMAGE' proposal is still unofficial. I have heard
that the European FITS committee asked the IUE group to pick another
name so that 'IMAGE' could be reserved for a possible universal
agreement in the FITS community. This would be analogous to the
history of binary tables; originally they were to be called '3DTABLE',
but at the request of the committee members the prototype
implementation was called 'A3DTABLE'.

Donald C. Wells             Associate Scientist        dwells at nrao.edu
National Radio Astronomy Observatory                   +1-804-296-0277
520 Edgemont Road                                 Fax= +1-804-296-0278
Charlottesville, Virginia 22903-2475 USA            78:31.1W, 38:02.2N 

