From a108poon@cdf.toronto.edu Fri Dec 2 21:33:51 1994
Newsgroups: comp.compression
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!math.ohio-state.edu!howland.reston.ans.net!pipex!uunet!utcsri!cdf.toronto.edu!news
From: Poon Jacob Tin Hang
Subject: Re: silly question:hqx compressed files
In-Reply-To: <3b91tc$m55@ixnews1.ix.netcom.com>
X-Sender: a108poon@eddie
Content-Type: TEXT/PLAIN; charset=US-ASCII
Message-ID:
Sender: news@cdf.toronto.edu (Usenet News)
Nntp-Posting-Host: eddie
Organization: University of Toronto Computing Disciplines Facility
References: <3b8g7g$e3e@potogold.rmii.com> <3b91tc$m55@ixnews1.ix.netcom.com>
Mime-Version: 1.0
Date: Sun, 27 Nov 1994 15:19:16 GMT
Lines: 21
On 27 Nov 1994, Gerald Lee wrote:
> I just downloaded a file w/ the extension .hqx from a friendly ftp site,
> but I am unable to figure what program I need to decomp it. I looked
> through some grps and a couple FAQ's but, there's no mention of .hqx.
> Is it new? What will I need to expand it, and where should I look for
> the appropriate program? Thanks much in advance!
>
>
> --
> ------------------------------------------------------------
> Gerald Lee "sleep and jumbo frets...the best!" > gbones@ix.netcom.com
> ------------------------------------------------------------
>
>
The .hqx file belongs to an application called BinHex 4.0, originally
created on the Mac platform years ago. It is NOT a compression program; it is
an encoding program like uuencode. To extract .hqx files, you need a file
named binhex13.zip (which can also create .hqx files) available in the
/pub/msdos/mac directory at the oak.oakland.edu ftp site.
From rrajala@tnclus.tele.nokia.fi Fri Dec 2 21:35:07 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!newsxfer.itd.umich.edu!gatech!swrinde!pipex!uunet!news.tele.fi!news.csc.fi!nokia.fi!tnclus!rrajala
From: rrajala@tnclus.tele.nokia.fi (Risto Rajala NTC/SWS/CPC Kalkki2 puh 90-511 5843)
Newsgroups: comp.compression
Subject: Re: Reed Solomon code needed
Date: 28 Nov 94 15:44:02 EET
Organization: Nokia Telecommunications.
Lines: 869
Message-ID: <1994Nov28.154402.1@tnclus>
References: <3bapvg$iau$1@mhadg.production.compuserve.com>
NNTP-Posting-Host: tne02.tele.nokia.fi
In article <3bapvg$iau$1@mhadg.production.compuserve.com>,
Andrew Powell <100330.1701@CompuServe.COM> writes:
> I am looking for source code for the Reed Solomon ECC algorithm.
Here are the R-S code articles collected and posted by Tom Moore two years ago.
Risto Rajala, rrajala@tnclus.tele.nokia.fi
--------------------------------------------------------------------------------
Subject: REED-SOLOMON PACKAGE
From: tmoore@ewu.UUCP (Tom Moore)
The collection of Reed-Solomon goodies.
The following collection should help you with the theory and
actual code behind REED-SOLOMON Codes.
The included files are responses sent back to me by request.
If you use anything, please give credit to the proper
authors. As of yet, I have NOT written anything to do
with RS (But I will shortly........)
Tom Moore {...uunet!isc-br!ewu!tmoore}
Eastern Washington University {...tmoore%ewu@uunet.uu.net}
Head Consultant, Computer Graphics Lab
(509) 623-4235
---------------------------------------------------------------------------
# This is a shell archive. Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by cg5!tmoore on Fri Nov 15 14:24:06 PST 1991
# Contents: rs.galois rs.rply.1 rs.rply.2 rs.rply.3
echo x - rs.galois
sed 's/^@//' > "rs.galois" <<'@//E*O*F rs.galois//'
@From: colin@array.UUCP (Colin Plumb)
Subject: REED/SOLOMON -Galois fields
Date: 13 Nov 91 02:31:18 GMT
Organization: Array Systems Computing, Inc., Toronto, Ontario, CANADA
tmoore@ewu.uucp writes:
>There has got to be someone out there who has messed with REED/SOLOMON
>error correcting codes. I know, I know, you all are interested in
>getting every little bit of redundancy out of text, and here I come
>wanting to put some back in. It may not sit well with you, but hey,
>I need some help here, so how about a little mail to me.
Okay, time for the old tutorial...
First, Galois fields. In number theory, we have a thing called a
"group". This is a bunch of things I'll call numbers, and
an operation, usually written as addition or multiplication.
The axioms of a group are, roughly:
For each a, and b in the group, a+b is in the group
There is an identity element, often written e, such that a+e = e + a = a.
For each a in the group, there is an inverse -a so a + -a = e.
For all a, b and c, (a+b)+c = a+(b+c)
Some groups are commutative, or Abelian (after the mathematician Abel),
in which case you also have a+b = b+a. Everything I'll work with here
is, in fact, Abelian, although there are structures that aren't.
An example of a group is the integers, or the integers modulo 10,
with normal addition. 5+5 = 0, so -5 = 5, and so on.
Fancier than that is a ring, which has addition and multiplication.
Multiplication also stays within the ring, and it distributes:
a*(b+c) = (a*b)+(a*c). There's also a multiplicative identity,
usually written as 1. a*1 = 1*a = a.
The integers are a ring, as are the integers modulo 10.
Fancier still is a field, which has multiplicative inverses. For every
a other than 0 (the additive inverse), there is a multiplicative
inverse a', such that a*a' = a'*a = 1. The integers are not a field,
although the rationals are. So are the integers modulo any prime
number. Modulo 10, 5 does not have a multiplicative inverse: times
anything, it's either 5 or 0. Modulo 9 (also not a prime), 3 does not
have a multiplicative inverse. But modulo 11, everything has a
multiplicative inverse. (This property comes up in the prime-testing
algorithms that public-key cryptography has produced so much excitement
in, but I'm drifting...)
Anyway, it turns out that given a ring, you can define primes.
And a ring, modulo a prime, produces a field. (There are actually
some complications involved in prime vs. irreducible, but I'll ignore
them - this theory is *not* designed to let you pass your exam
tomorrow; it's the bare minimum motivation for what's to follow.)
So, anyway, we now have the material for producing rings of "nice"
sizes. 2 is prime, so we can make a field of size 2, but it's not
big enough to do much with. And no other prime is a power of 2,
as is suitable to work with on binary computers. Galois to the rescue!
Given a ring, we can define polynomials over that ring. You know, like
3x^2 + 2x + 1. Basically, it's just a list of numbers in the ring
(e.g. 1, 2, 3 for the above - equivalent to 1, 2, 3, 0, 0, 0,...).
There are a few more things you can do if the numbers are from a field,
and it turns out we'll be using a field, so I'll stop making the
distinction.
Now, we can multiply polynomials, and add them, and so on. Furthermore,
we have an additive identity, additive inverses, a multiplicative
identity, and so on. It turns out to be a ring.
So, let's take as our basic field the integers modulo 2. I.e.
0 and 1. 0+0 = 1+1 = 0, 0+1 = 1+0 = 1, 0*0 = 0*1 = 1*0 = 0, 1*1 = 1.
We can now form polynomials over the integers modulo 2, in which
each power of x (like, say, x^12) is either present (coefficient of 1)
or absent (coefficient of 0). These can be expressed quite nicely
as bit strings. E.g. 10001000000100001 is the polynomial x^16+x^12+x^5+1.
This polynomial can be factored as 1111000000011111 * 11,
x^15+x^14+x^13+x^12+x^4+x^3+x^2+x+1 * x+1. Both of these are prime -
they cannot be expressed as the product of smaller polynomials.
Now, observe that adding polynomials involves adding corresponding
coefficients, modulo 2. This is also known as the xor operation.
Also observe that each polynomial is its own additive inverse, so
subtracting is the same as adding. And multiplication is just
shift-and-add, like usual, except it's shift-and-xor.
Now, given primes, we can form the polynomials modulo a prime.
Let's take 1011, x^3+x+1, a polynomial which happens to be prime.
We can take any given polynomial (say, 11111, x^4+x^3+x^2+x+1), and
subtract multiples of 1011 until it has degree less than 3.
First, subtract x times 1011, i.e. 10 times 1011, or 10110:
11111
10110
=====
1001
Then, subtract 1 times 1011:
1001
1011
=====
0010
Thus, x^4+x^3+x^2+x+1 modulo x^3+x+1 is x.
If you think for a bit, you'll realize that exactly the polynomials with
degree less than the modulus are the possible remainders. Other than
that, any coefficients are possible. So if the modulus is of degree n
(highest coefficient is x^n), then there are 2^n possible remainders,
corresponding to all possible combinations of x^0, x^1, ..., x^(n-1).
These are written as all possible n-bit strings.
Now, we're getting somewhere! We have numbers we can work with (it
turns out that multiplicative inverses exist, if the modulus is
irreducible, which is prime plus a little bit that I won't get into),
and we have a nice equivalence with bit-strings.
So, let's take some degree-8 polynomial and use it as the modulus,
so all possible bytes are our numbers.
Now, consider what would happen if we were to store, after a bunch of
bytes a0, a1, ... a(n-1), their sum s (using this field, which means we
xor them). The total would be a0+...+a(n-1)+s = a0+...+a(n-1)+a0+...+a(n-1)
= 0. This is basically parity.
If an error hit one of the bytes, say an error e in position k, so the
received data is b0 = a0, b1 = a1, ..., bk = ak+e, ... b(n-1) = a(n-1),
bn = s. Then the sum wouldn't add up, and we'd get, instead of 0, e.
Now, this is an error-detection code, but isn't much better than parity.
It is, in fact, 8-way interleaved parity. But we can get better.
Now, let's come up with a second sum character, t.
t = a0*x^0 + a1*x^1 + a2*x^2 + ... + a(n-1)*x^(n-1).
Now, if an error occurs, then summing t on the received side will
give us, not the received t, but t+e*x^k. We can subtract off the
received t to get e*x^k.
Now, we have e and e*x^k. Since e is not 0 (if it was, there'd be
no error), it has a multiplicative inverse, e'. We can multiply
this by e*x^k to get x^k. As long as n is low enough that each
possible k has a different x^k (there are 255 such possibilities for
an 8-bit code), then from this we can recover the position, k, of
the error. Since we know the error pattern and the location,
we can fix it. Bingo, error correction!
(If the error is in the user data, then x^k is non-zero, so both
e and e*x^k are non-zero. If one or the other is zero, then the
error is in the corresponding check byte and can be ignored. If
both are zero, there was no error.)
This is the simplest single-error-correcting Reed-Solomon code.
It turns out that by adding more check bytes, computed by
u = a0*x^0+a1*x^2+a2*x^4+..., v = a0*x^0+a1*x^3+a2*x^6+...,
etc. and solving more complex equations to find the solutions,
we can determine the patterns and locations of multiple errors.
For example, suppose we have 4 check bytes and 2 errors. e1 at
location k1 and e2 at location k2.
We can derive, from the check bytes, e1+e2, e1*x^k1+e2*x^k2,
e1*x^(2*k1)+e2*x^(2*k2) and e1*x^(3*k1)+e2*x^(3*k2). From
these four equations, we can derive the four unknowns e1, k1, e2, and
k2 we need to fix the error. See various journal articles and
textbooks for efficient ways to solve these systems of equations. With
r check bytes, you can detect r errors, correct r/2 errors, or any
combination. (Say, correct up to r/3 errors and detect up to 2r/3.) In
fact, even with more errors, you'll *probably* detect them, but it's
just not absolutely guaranteed.
What CD's do is a bit more complicated... they have two levels of
Reed-Solomon ECC. First, they scatter the data around on the disc so a
burst error will take a bit out of a bunch of blocks rather than
totally destroying a few blocks. Then, they arrange the blocks in a
grid and apply error correction codes to the rows. If there are just
one or two errors, they repair them, but if there are a bunch, they
mark the whole row as bad, and don't try to correct it.
Next, they apply error correction down the columns. Because any rows
with errors are already marked bad, we don't have to derive the error
locations k1, k2, k3, etc. and so with the number of check bytes we
have (which determines the number of equations), we can have more
errors e1, e2,... without having too many unknowns. (It's also easier
to solve the equations for the errors than the locations, so this makes
the implementation easier.) This results in really spectacular
error-correction ability, over a large block, without having to deal
with hundreds of equations in hundreds of unknowns, which would make CD
players ridiculously expensive. (Remember, you can't slow down the
music when an error occurs - that's just as bad as no correction at
all.)
That was a *really* quick summary, but the essentials are there.
--
-Colin
@//E*O*F rs.galois//
chmod u=rw,g=,o= rs.galois
echo x - rs.rply.1
sed 's/^@//' > "rs.rply.1" <<'@//E*O*F rs.rply.1//'
@From: kosmach@mot.com
To: tmoore%ewu@uunet.uu.net
Subject: REED/SOLOMON
Cc: kosmach@ukraine.noname
If you have no experience with error correcting codes, then
I think it may be a little difficult to start with RS codes.
But anyway,
RS codes are a general case of BCH codes (i.e., RS codes were
discovered first and then they were generalized to BCH codes).
So, any literature you find about nonbinary BCH codes can be
applied to RS codes. A good starter book on coding theory is:
ERROR CONTROL CODING: FUNDAMENTALS AND APPLICATIONS written by
Shu Lin and Daniel J. Costello, Jr. In particular, Chapter 6,
discusses the encoding and decoding of binary and nonbinary BCH
codes along with RS codes. It should be very simple to write the
encoder. The hard part is determining the type of decoder you want
(i.e., do I want a very fast algorithm, am I worried about memory,
how large are the codewords, how much parity do I need, and do I
need to correct erasures).
If you have any questions, let me know.
Jim Kosmach
Research Engineer
Motorola Corporate Research Center
kosmach@ukraine.corp.mot.com
@//E*O*F rs.rply.1//
chmod u=rw,g=,o= rs.rply.1
echo x - rs.rply.2
sed 's/^@//' > "rs.rply.2" <<'@//E*O*F rs.rply.2//'
@From: rpw3@rigden.wpd.sgi.com (Rob Warnock)
Subject: Re: REED/SOLOMON
Rob Warnock, MS-1L/515 rpw3@sgi.com
Silicon Graphics, Inc. (415)335-1673
2011 N. Shoreline Blvd.
Mountain View, CA 94039-7311
=========================================================================
============= attachment 1 of 2 =========================================
=========================================================================
@From: rpw3@rigden.wpd.sgi.com
Subject: ECC bibliography
Organization: Silicon Graphics Inc., Mountain View, CA
Here's my bibliography for ECC:
- Lin & Costello, "Error Control Coding: Fundamentals and
Applications" (Prentice-Hall)
- Peterson & Weldon, "Error Correcting Codes" (MIT Press)
- Blahut, "Theory and Practice of Error Control Codes" (Addison-Wesley)
- Berlekamp, "Algebraic Coding Theory" (Aegean Park Press)
But for beginners, good books are:
- Hamming, "Coding and Information Theory" (Prentice-Hall)
- Arazi, "A Commonsense Approach to the Theory of Error-Correcting
Codes" (MIT Press)
Hamming's little book is an easy start, but only goes through single-error
correction. Lin & Costello is a pretty good cross-section, and has some
good stuff about Reed-Solomon codes used in disks & tapes. For example,
on pp.525-531 they discuss the particularly simple single-error correcting
shortened (174,171) R-S code over GF(256) used in the IBM 3370 disk. [You
can also use it in the unshortened (255,252) form.] This R-S code can easily
be done in software, with small table lookups. [I posted some rough, untested
pseudo-C for it in this newsgroup back in July...]
=========================================================================
============= attachment 2 of 2 =========================================
=========================================================================
@From: rpw3@rigden.wpd.sgi.com
Subject: Re: Compression is great, but, robustness?
Organization: Silicon Graphics Inc., Mountain View, CA
madler@nntp-server.caltech.edu (Mark Adler) writes:
+---------------
| Bill Davidsen notes:
| >> You can, of course, put error detection and correction in the data,
| >> using something simple like Hamming code, or larger and more complex
| >> like Fire codes.
+---------------
Fire codes are totally obsolete these days. Their miscorrection rates are
quite high compared to the available alternatives.
+---------------
| Better still are Reed-Solomon codes. The algebra is a little complicated,
| but the run-time of a good implementation is pretty fast. They can correct
| rather long error bursts and are naturals for computer data streams--you
| can use eight bit symbols in the algebra which correspond to the bytes in
| the data stream.
+---------------
Particularly simple is the shortened (174,171) R-S code over GF(256) used in
the IBM 3370 disk. [You can also use it in the unshortened (255,252) form.]
Each of the three checksum bytes is computed independently, rather than as
the remainder of one large polynomial, computed (roughly) as:
for (s0 = s1 = sm1 = i = 0; i < BLOCK_SIZE; ++i) {
s0 = s0 ^ input[i];
s1 = GF_mult_by_alpha[ s1 ^ input[i] ];
sm1 = GF_mult_by_alpha_inverse[sm1 ^ input[i] ];
}
where "GF_mult_by_alpha" is a lookup table to multiply an element of GF(256)
by the constant "alpha", the chosen primitive element. The two tables are
256 bytes each. [These checksums can also be computed in hardware at the
speed of one clock per byte, with a simple PAL per checksum byte (3).]
Correction is just as simple, but needs more memory to store the division
table (since both operands are variables). You start by computing the
"syndrome" bytes the same was as the checksum bytes, but this time the
transmitted checksum bytes are folded in too. then, assuming just one
byte in error, the location of the byte in error is given by:
err_i = GF_log_base_alpha[ GF_divide[s1][s0] ];
where "GF_log_base_alpha" is another 256-byte table and "GF_divide" is a
65,536-byte table. Given the location "err_i" of the error, correction is
then simply a matter of:
input[err_i] ^= s0;
(The "sm1" term is actually redundant, but can be used to help protect
against miscorrections that can occur when there are multiple errors.
It's cheap insurance.)
Although this code corrects only one symbol (byte) in error, interleaving
multiple blocks can improve the burst performance almost as well as one
likes. [The IBM 3370 uses 3:1 interleaving to correct any burst confined
to three consecutive bytes.] Still, it can only correct one burst. But the
cost is really low...
+---------------
| The guys from JPL would tell you this for no beers at all!
+---------------
Reference:
Lin & Costello, "Error Control Coding: Fundamentals & Applications",
(Prentice-Hall 1983), pp.525-531.
@//E*O*F rs.rply.2//
chmod u=rw,g=,o= rs.rply.2
echo x - rs.rply.3
sed 's/^@//' > "rs.rply.3" <<'@//E*O*F rs.rply.3//'
@From: "samsung@tigger.jvnc.net -- Advanced Media Laboratory of Samsung Electronics 1009 Lenox Dr. Lawrenceville"
Subject: Reed Solomon Code
Here is some code that came down the net some time ago.
It works really well. I have put this code into a Khoros routine.
Steve Jaffe
samsung@tigger.jvnc.net
Received: from munnari.OZ.AU by tigger.jvnc.net (5.65/1.34)
id AA17750; Wed, 26 Jun 91 02:11:52 -0400
Received: from augean.ua.oz.au by munnari.oz.au with SMTP (5.64+1.3.1+0.50)
id AA22760; Wed, 26 Jun 1991 16:11:30 +1000 (from simon@augean.ua.oz.au)
Date: Wed, 26 Jun 1991 16:11:30 +1000
@From: simon@augean.ua.oz.au
Message-Id: <9106260611.22760@munnari.oz.au>
Received: by augean (5.61+IDA+MU/4.8N)
id AA27938; Wed, 26 Jun 1991 15:32:11 +0930
To: ASTERIAN@BNR.CA, Tom.Lyon@Eng.Sun.COM, brianc@ecn.purdue.edu,
curiger@iis.ethz.ch, danbl@arcturis.cna.tek.com,
dlee@hplabsb.hpl.hp.com, eho@ee.ubc.ca, gd@erg.sri.com,
hptasins@sdcc6.UCSD.EDU, jpc@avdms8.msfc.nasa.gov,
jsun@sunpix.East.Sun.COM, jw7348@mexico.medtronic.com,
kahrs@alice.att.com, lovell@s1.elec.uq.oz.au,
magore@watserv1.uwaterloo.ca, nieto@trantor.harris-atd.com,
plx!plxsun!ming@Sun.COM, probert%cs@hub.ucsb.edu,
rand@zeta.sps.mot.com, rchann@cebnet.CEB.McMaster.CA, rjk@sequent.com,
robert@wiliki.eng.hawaii.edu, ron@chopin.udel.edu,
ron@monu6.cc.monash.edu.au, rtu@bsac.berkeley.edu,
samsung@tigger.jvnc.net, sdw@hpsadsdw.hp.com, srm@matrx.matrix.com,
tmb@ai.mit.edu, tony@comperex.cx.oz.au, vchin@kean.ucs.mun.ca
Subject: Re: Reed-Solomon code
To all those people who requested the Reed-Solomon encoder/decoder program,
here it is. I had to dig it out of my archive, and having posted that I had
it without checking it was all okay, discovered to my chagrin that it dumped
core when run after being compiled using cc -O on our new Sparc 2. (Okay on cc
with no optimize, and also on gcc with or without optimize, and ran heaps of
times on our old Dec and Sun boxes - funny how these bugs can hide for years!).
So after a night of reviewing my coding theory so I could follow the program
(it's been a while...), I found and fixed the bug (you guessed it, reading one
past the end of an array to give a compiler dependent segmentation violation!).
At the same time I put in a couple more comments so it hopefully won't be too
hard to follow with the aid of a reference text such as those mentioned in the
comments (can't recall the titles off-hand, just the authors, but they are
standard texts).
Anyhow, do what you will with it. It may or may not suit your needs, but it's
one way of doing the job (I have no idea of its speed compared to other
techniques or implementations). Hopefully it may save someone some work!
Simon
----------------------- Cut here - code follows ------------------------------
/* rs.c */
/* This program is an encoder/decoder for Reed-Solomon codes. Encoding is in
systematic form, decoding via the Berlekamp iterative algorithm.
In the present form, the constants mm, nn, tt, and kk=nn-2tt must be
specified (the double letters are used simply to avoid clashes with
other n,k,t used in other programs into which this was incorporated!)
Also, the irreducible polynomial used to generate GF(2**mm) must also be
entered -- these can be found in Lin and Costello, and also Clark and Cain.
The representation of the elements of GF(2**m) is either in index form,
where the number is the power of the primitive element alpha, which is
convenient for multiplication (add the powers modulo 2**m-1) or in
polynomial form, where the bits represent the coefficients of the
polynomial representation of the number, which is the most convenient form
for addition. The two forms are swapped between via lookup tables.
This leads to fairly messy looking expressions, but unfortunately, there
is no easy alternative when working with Galois arithmetic.
The code is not written in the most elegant way, but to the best
of my knowledge, (no absolute guarantees!), it works.
However, when including it into a simulation program, you may want to do
some conversion of global variables (used here because I am lazy!) to
local variables where appropriate, and passing parameters (eg array
addresses) to the functions may be a sensible move to reduce the number
of global variables and thus decrease the chance of a bug being introduced.
This program does not handle erasures at present, but should not be hard
to adapt to do this, as it is just an adjustment to the Berlekamp-Massey
algorithm. It also does not attempt to decode past the BCH bound -- see
Blahut "Theory and practice of error control codes" for how to do this.
Simon Rockliff, University of Adelaide 21/9/89
26/6/91 Slight modifications to remove a compiler dependent bug which hadn't
previously surfaced. A few extra comments added for clarity.
Appears to all work fine, ready for posting to net!
Notice
--------
This program may be freely modified and/or given to whoever wants it.
A condition of such distribution is that the author's contribution be
acknowledged by his name being left in the comments heading the program,
however no responsibility is accepted for any financial or other loss which
may result from some unforeseen errors or malfunctioning of the program
during use.
Simon Rockliff, 26th June 1991
*/
#include <stdio.h>
#include <math.h>
#define mm 4 /* RS code over GF(2**4) - change to suit */
#define nn 15 /* nn=2**mm -1 length of codeword */
#define tt 3 /* number of errors that can be corrected */
#define kk 9 /* kk = nn-2*tt */
int pp [mm+1] = { 1, 1, 0, 0, 1} ; /* specify irreducible polynomial coeffts */
int alpha_to [nn+1], index_of [nn+1], gg [nn-kk+1] ;
int recd [nn], data [kk], bb [nn-kk] ;
void generate_gf()
/* generate GF(2**mm) from the irreducible polynomial p(X) in pp[0]..pp[mm]
lookup tables: index->polynomial form alpha_to[] contains j=alpha**i;
polynomial form -> index form index_of[j=alpha**i] = i
alpha=2 is the primitive element of GF(2**mm)
*/
{
register int i, mask ;
mask = 1 ;
alpha_to[mm] = 0 ;
for (i=0; i<mm; i++)
 { alpha_to[i] = mask ;
   index_of[alpha_to[i]] = i ;
   if (pp[i]!=0)
     alpha_to[mm] ^= mask ;
   mask <<= 1 ;
 }
index_of[alpha_to[mm]] = mm ;
mask >>= 1 ;
for (i=mm+1; i<nn; i++)
 { if (alpha_to[i-1] >= mask)
alpha_to[i] = alpha_to[mm] ^ ((alpha_to[i-1]^mask)<<1) ;
else alpha_to[i] = alpha_to[i-1]<<1 ;
index_of[alpha_to[i]] = i ;
}
index_of[0] = -1 ;
}
void gen_poly()
/* Obtain the generator polynomial of the tt-error correcting, length
nn=(2**mm -1) Reed Solomon code from the product of (X+alpha**i), i=1..2*tt
*/
{
register int i,j ;
gg[0] = 2 ; /* primitive element alpha = 2 for GF(2**mm) */
gg[1] = 1 ; /* g(x) = (X+alpha) initially */
for (i=2; i<=nn-kk; i++)
{ gg[i] = 1 ;
for (j=i-1; j>0; j--)
if (gg[j] != 0) gg[j] = gg[j-1]^ alpha_to[(index_of[gg[j]]+i)%nn] ;
else gg[j] = gg[j-1] ;
gg[0] = alpha_to[(index_of[gg[0]]+i)%nn] ; /* gg[0] can never be zero */
}
/* convert gg[] to index form for quicker encoding */
for (i=0; i<=nn-kk; i++) gg[i] = index_of[gg[i]] ;
}
void encode_rs()
/* take the string of symbols in data[i], i=0..(k-1) and encode systematically
to produce 2*tt parity symbols in bb[0]..bb[2*tt-1]
data[] is input and bb[] is output in polynomial form.
Encoding is done by using a feedback shift register with appropriate
connections specified by the elements of gg[], which was generated above.
Codeword is c(X) = data(X)*X**(nn-kk)+ b(X) */
{
register int i,j ;
int feedback ;
for (i=0; i<nn-kk; i++) bb[i] = 0 ;
for (i=kk-1; i>=0; i--)
{ feedback = index_of[data[i]^bb[nn-kk-1]] ;
if (feedback != -1)
{ for (j=nn-kk-1; j>0; j--)
if (gg[j] != -1)
bb[j] = bb[j-1]^alpha_to[(gg[j]+feedback)%nn] ;
else
bb[j] = bb[j-1] ;
bb[0] = alpha_to[(gg[0]+feedback)%nn] ;
}
else
{ for (j=nn-kk-1; j>0; j--)
bb[j] = bb[j-1] ;
bb[0] = 0 ;
} ;
} ;
} ;
void decode_rs()
/* assume we have received bits grouped into mm-bit symbols in recd[i],
i=0..(nn-1), and recd[i] is index form (ie as powers of alpha).
We first compute the 2*tt syndromes by substituting alpha**i into rec(X) and
evaluating, storing the syndromes in s[i], i=1..2tt (leave s[0] zero) .
Then we use the Berlekamp iteration to find the error location polynomial
elp[i]. If the degree of the elp is >tt, we cannot correct all the errors
and hence just put out the information symbols uncorrected. If the degree of
elp is <=tt, we substitute alpha**i , i=1..n into the elp to get the roots,
hence the inverse roots, the error location numbers. If the number of errors
located does not equal the degree of the elp, we have more than tt errors
and cannot correct them. Otherwise, we then solve for the error value at
the error location and correct the error. The procedure is that found in
Lin and Costello. For the cases where the number of errors is known to be too
large to correct, the information symbols as received are output (the
advantage of systematic encoding is that hopefully some of the information
symbols will be okay and that if we are in luck, the errors are in the
parity part of the transmitted codeword). Of course, these insoluble cases
can be returned as error flags to the calling routine if desired. */
{
register int i,j,u,q ;
int elp[nn-kk+2][nn-kk], d[nn-kk+2], l[nn-kk+2], u_lu[nn-kk+2], s[nn-kk+1] ;
int count=0, syn_error=0, root[tt], loc[tt], z[tt+1], err[nn], reg[tt+1] ;
/* first form the syndromes */
for (i=1; i<=nn-kk; i++)
{ s[i] = 0 ;
for (j=0; j<nn; j++)
  if (recd[j]!=-1)
    s[i] ^= alpha_to[(recd[j]+i*j)%nn] ; /* recd[j] in index form */
if (s[i]!=0) syn_error=1 ; /* set flag if non-zero syndrome => error */
s[i] = index_of[s[i]] ;
} ;
if (syn_error) /* if errors, try and correct */
{
/* compute the error location polynomial via the Berlekamp iterative algorithm,
following the terminology of Lin and Costello : d[u] is the 'mu'th
discrepancy, where u='mu'+1 and 'mu' (the Greek letter!) is the step number
ranging from -1 to 2*tt (see L&C), l[u] is the
degree of the elp at that step, and u_l[u] is the difference between the
step number and the degree of the elp.
*/
/* initialise table entries */
d[0] = 0 ; /* index form */
d[1] = s[1] ; /* index form */
elp[0][0] = 0 ; /* index form */
elp[1][0] = 1 ; /* polynomial form */
for (i=1; i<nn-kk; i++)
  { elp[0][i] = -1 ; /* index form */
    elp[1][i] = 0 ;  /* polynomial form */
  }
l[0] = 0 ;
l[1] = 0 ;
u_lu[0] = -1 ;
u_lu[1] = 0 ;
u = 0 ;
do
{
  u++ ;
  if (d[u]==-1)
    { l[u+1] = l[u] ;
      for (i=0; i<=l[u]; i++)
        { elp[u+1][i] = elp[u][i] ;
          elp[u][i] = index_of[elp[u][i]] ;
        }
    }
  else
    /* search for the step q with greatest u_lu[q] for which d[q]!=0 */
    { q = u-1 ;
      while ((d[q]==-1) && (q>0)) q-- ;
/* have found first non-zero d[q] */
if (q>0)
{ j=q ;
do
{ j-- ;
if ((d[j]!=-1) && (u_lu[q]<u_lu[j]))
  q = j ;
} while (j>0) ;
} ;
/* have now found q such that d[u]!=0 and u_lu[q] is maximum */
/* store degree of new elp polynomial */
if (l[u]>l[q]+u-q) l[u+1] = l[u] ;
else l[u+1] = l[q]+u-q ;
/* form new elp(x) */
for (i=0; i >tt errors and cannot solve */
for (i=0; itt hence cannot solve */
for (i=0; i no errors: output received codeword */
for (i=0; i
References: <3b7usa$sen@pita.cs.huji.ac.il>
NNTP-Posting-Host: callisto.nt.tuwien.ac.at
In article <3b7usa$sen@pita.cs.huji.ac.il> Eitan Frachtenberg writes:
> Hello all,
>
> I'm writing (or at least trying to write) a new compression
> algorithm/program, and like many others, I'm using hash tables
> to access strings from the input.
> I've experimented a few days, but wasn't able to find anything
> satisfactory, so instead of re-inventing the wheel, I ask for
> your kind suggestions: What hash function is well suited for
> this (i.e. from strings of varied length to integers).
> If interest is shown, I'll post a summary of replies.
>
> Thanx,
>
> --Eitan.
Take a look at the article "Fast Hashing of Variable Length Text Strings"
by Peter K. Pearson in Communications of the ACM, June 1990, Volume 33,
Number 6.
AFAIK, it is the most appropriate thing for you.
I am writing some compression programs too, so I hope we can exchange some
ideas?
All the best,
Khanh Nguyen
Vienna University of Technology
From cwrea@tuzo.erin Fri Dec 2 21:36:38 1994
Newsgroups: comp.compression
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!newsxfer.itd.umich.edu!gatech!swrinde!cs.utexas.edu!utnut!erin.utoronto.ca!tuzo.erin!cwrea
From: cwrea@tuzo.erin (Chris W. Rea [UL])
Subject: Re: Hash function for compression algorithms
Message-ID:
Sender: news@credit.erin.utoronto.ca (Usenet News)
Nntp-Posting-Host: tuzo.erin
Reply-To: cwrea@tuzo.erin.utoronto.ca
Organization: Erindale College, University of Toronto, Canada
References: <3b7usa$sen@pita.cs.huji.ac.il>
Date: Tue, 29 Nov 1994 22:54:09 GMT
Lines: 53
In article <3b7usa$sen@pita.cs.huji.ac.il>,
Eitan Frachtenberg wrote:
>
>I'm writing (or at least trying to write) a new compression
>algorithm/program, and like many others, I'm using hash tables
>to access strings from the input.
>I've experimented a few days, but wasn't able to find anything
>satisfactory, so instead of re-inventing the wheel, I ask for
>your kind suggestions: What hash function is well suited for
>this (i.e. from strings of varied length to integers).
>If interest is shown, I'll post a summary of replies.
This is one I use for a project I'm currently working on:
/* HashWord takes W, the ASCII string we want to hash
** S, the length of string W (ie: strlen(W))
** R, choose the next R'th value to resolve a collision
*/
ULONG HashWord(UBYTE *W, UBYTE S, UWORD R)
{
ULONG HashValue=0;
UWORD Seed[3]={0,0,0}, i=0;
Seed[0]=R+119;
Seed[1]=R*727+R*R;
Seed[2]=5033-R*R*R;
seed48(Seed);
for(i=0; i~~
References:
NNTP-Posting-Host: nmerha3f.bnr.ca
Keywords: JPEG, Tiff
Originator: jbotte@nmerha3f
Xref: solitaire.cv.nrao.edu comp.compression:14966 alt.binaries.pictures:21113
In article , swlau@cs.cuhk.hk (Lau Siu Wah) writes:
|> Does someone know if there is a program to convert compressed image
|> in TIFF format to JPEG format?
|>
|> Clarence.
The best PC based system for doing this sort of thing is a program
called Graphics Workshop from a company called Alchemy Mindworks.
It is shareware and is available on many FidoNet type systems. I do
not know if there is an FTP site for it. I get asked this a lot...
is there any FTP site that would be willing to carry this package?
I would be willing to "upload" it regularly. In addition to converting
from one graphics format to another, the program also allows you to
view, crop, scale, dither, and a bunch of other image processing
operations. It is free to try out, but you would eventually have to
pay for it (there are worldwide "distributors").
If someone volunteers an FTP site, perhaps this could become a FAQ?
Jim
#include // My opinions are my own,
// but you can have them if you like.
From ts@uwasa.fi Sun Dec 4 22:54:21 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!newsxfer.itd.umich.edu!gatech!swrinde!pipex!sunic!news.funet.fi!zippo.uwasa.fi!uwasa.fi!ts
From: ts@uwasa.fi (Timo Salmi)
Newsgroups: comp.compression
Subject: Re: LHA Latest version
Date: 2 Dec 1994 03:21:57 GMT
Organization: University of Vaasa
Lines: 29
Message-ID: <3bm3sl$d4p@zippo.uwasa.fi>
References: <3bl422$6hm@gold.interlog.com>
NNTP-Posting-Host: uwasa.fi
In article <3bl422$6hm@gold.interlog.com> richard@panchax.gryphon.com writes:
:What and where is the latest lha. Trying to find the FAQ on rtfm but it
:was too busy to accept anonymous connections.
44416 Aug 1 1991 garbo.uwasa.fi:/pc/arcers/lha213.exe
lha213.exe LHa compression for .lzh files, tight, in English
70656 Nov 24 1992 garbo.uwasa.fi:/pc/arcers/lha255b.exe
lha255b.exe LHa compression for .lzh files, Kanji docs, development vers
2849 Apr 20 1993 garbo.uwasa.fi:/pc/arcers/lha255b.txt
lha255b.txt LHA 2.55B READ.ME translation by Masaoki Kobayashi
96999 Jan 11 1993 garbo.uwasa.fi:/unix/arcers/lha101u.tar.Z
lha101u.tar.Z Lha for .lzh compression on Unix
90174 Nov 9 11:03 garbo.uwasa.fi:/pc/doc-net/faqc9411.zip
faqc9411.zip Frequently Asked Questions of comp.compression, J.Gailly
98684 Dec 1 10:20 garbo.uwasa.fi:/pc/INDEX.ZIP
INDEX.ZIP The annotated list of garbo.uwasa.fi MsDos files, zipped
All the best, Timo
..................................................................
Prof. Timo Salmi Co-moderator of comp.archives.msdos.announce
Moderating at garbo.uwasa.fi anonymous FTP archives 193.166.120.5
Faculty of Accounting & Industrial Management; University of Vaasa
Internet: ts@uwasa.fi BBS +(358)-61-3170972; FIN-65101, Finland
From ts@uwasa.fi Sun Dec 4 22:54:30 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!hookup!swrinde!pipex!sunic!news.funet.fi!zippo.uwasa.fi!uwasa.fi!ts
From: ts@uwasa.fi (Timo Salmi)
Newsgroups: comp.compression
Subject: Re: GZIP
Date: 2 Dec 1994 11:34:00 GMT
Organization: University of Vaasa
Lines: 20
Message-ID: <3bn0n8$mif@zippo.uwasa.fi>
References: <3bm2os$csp@freenet.unbc.edu>
NNTP-Posting-Host: uwasa.fi
In article <3bm2os$csp@freenet.unbc.edu> ad467@freenet.unbc.edu (Claude Pitre) writes:
:I'm looking for GZIP for DOS. Can someone either tell me where I can get
:it by ftp or email it to me uuencoded.
117185 Aug 19 10:25 garbo.uwasa.fi:/pc/unix/gzip124.zip
gzip124.zip GNU zip for .(g)z files, don't confuse with (pk)zip
98684 Dec 1 10:20 garbo.uwasa.fi:/pc/INDEX.ZIP
INDEX.ZIP The annotated list of garbo.uwasa.fi MsDos files, zipped
90174 Nov 9 11:03 garbo.uwasa.fi:/pc/doc-net/faqc9411.zip
faqc9411.zip Frequently Asked Questions of comp.compression, J.Gailly
All the best, Timo
..................................................................
Prof. Timo Salmi Co-moderator of comp.archives.msdos.announce
Moderating at garbo.uwasa.fi anonymous FTP archives 193.166.120.5
Faculty of Accounting & Industrial Management; University of Vaasa
Internet: ts@uwasa.fi BBS +(358)-61-3170972; FIN-65101, Finland
From pecampbe@mtu.edu Sun Dec 4 22:55:35 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!sol.ctr.columbia.edu!news.mtu.edu!news.mtu.edu!not-for-mail
From: pecampbe@mtu.edu (Paul E. Campbell)
Newsgroups: comp.compression,comp.programming
Subject: Re: File Structure for Disk Doubler
Followup-To: comp.compression,comp.programming
Date: 2 Dec 1994 17:04:38 -0500
Organization: Michigan Technological University
Lines: 62
Message-ID: <3bo5lm$97r@metlab4.my>
References:
NNTP-Posting-Host: metlab4.my.mtu.edu
X-Newsreader: TIN [version 1.2 PL1]
Xref: solitaire.cv.nrao.edu comp.compression:15080 comp.programming:12910
Art Pollard (pollarda@uhunix3.uhcc.Hawaii.Edu) wrote:
: I am familiar with the Log Structured File System (Sprite) which has been
: used in one study (by DEC) to create a compressed file system. However,
: the resources for this method are too great for a mere PC. There are
: several other approaches similar to this with which I am familiar;
: however, they all have memory requirements which are too great for PCs or
: would fragment too quickly to support the high sequential access
: across blocks needed while reading a file efficiently.
This is what I worked out when I had the time and desire to build a log
structured file system (I no longer have the time)...
First, you need to do a fast compression routine.
Second, on average, most systems use say 8-16K block sizes for good
compression. 4K is really too small.
Third, the way the LFS structures work is pretty good assuming you have that
1 megabyte free to use for those purposes. So this is my proposed alternative:
First, the segment idea is sound.
Instead of keeping it in memory, do this:
Continuously ADD to the current segment as you get stuff to write to it. This
may entail more writing than the other methods (oh well...). Trust your system
at least enough to do some caching. On the first pass, don't worry about
following a block size requirement religiously.
Store each new chunk of data sequentially in the log. Store EVERYTHING in
8-16K chunks. Pack in the data as close to your arbitrary size as possible.
Put a CRC on each chunk. That way, when you have to restart (someone shuts
off the machine), just start reading in the last segment you marked as in
use (in your "superblock" equivalent). Keep reading blocks until you fail to
get a good CRC. Even put file pointers in these "chunks".
Pointers to "sectors" or other arbitrary pieces of files look like this:
<segment>, <block start>, <length>
That way, you can find the right block to read with:
Offset = <segment> * segment size + <block start>
When writing, the rule for whether to write to the current block or start a
new one is simple:
Known: The size of the block so far
The desired block size
The size of the data to write out (no compression)
If (Desired block size-Current size)>(size needed), append it
Then when you finish your write, stick on the extra bits for "end of block"
and CRC. When you write again, you start on the current sector where the
"end of block" section is at, but overwrite that part (you end up rewriting
a few bytes but that doesn't matter too much). Caching helps reduce the
number of writes and rewrites you'd normally end up doing.
As far as DOS's usual "FAT's" or the equivalent structures in Unix systems,
you don't use them. You supply both OS's with as much of a legitimate view
of the system (given their myopic view of the world) as possible and handle
everything else yourself.
The advantage of this system is that you get compression, the advantages of
a log structured system, and low memory overhead.
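The pointer arithmetic and the append rule above can be sketched as two
small C helpers (SEGMENT_SIZE and BLOCK_SIZE are illustrative assumptions;
the post only says blocks of 8-16K):

```c
#define SEGMENT_SIZE (64 * 1024)  /* assumed segment size */
#define BLOCK_SIZE   (8 * 1024)   /* the post suggests 8-16K */

/* Byte offset of a block, given a <segment, block start> pointer. */
long block_offset(long segment, long block_start)
{
    return segment * SEGMENT_SIZE + block_start;
}

/* Append to the current block only if the (uncompressed) data fits
 * within the desired block size; otherwise start a new block. */
int should_append(long current_size, long data_size)
{
    return (BLOCK_SIZE - current_size) > data_size;
}
```

A write that fails the `should_append` test closes the current chunk with
its end-of-block marker and CRC, exactly as described above.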
From marknel@utdallas.edu Sun Dec 4 22:55:50 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!darwin.sura.net!haven.umd.edu!news.umbc.edu!europa.eng.gtefsd.com!howland.reston.ans.net!torn!nott!bcarh189.bnr.ca!bcarh8ac.bnr.ca!corpgate!news.utdallas.edu!marknel
From: marknel@utdallas.edu (Mark R Nelson)
Newsgroups: comp.compression
Subject: Re: Any Recommended Books?
Date: 4 Dec 1994 03:41:20 GMT
Organization: The University of Texas at Dallas
Lines: 20
Message-ID: <3brdp0$gv4@news.utdallas.edu>
References:
NNTP-Posting-Host: csclass.utdallas.edu
NNTP-Posting-User: marknel
X-Newsreader: TIN [version 1.2 PL2]
Keith Williams (csi083@coventry.ac.uk) wrote:
: Someone mentioned a book called "The Data Compression Book" by Nelson..
: on this usegroup earlier... I'm having problems tracking it down and
: would appreciate a few more details like publisher etc...
_The Data Compression Book_ is published by M&T Books. ISBN 1-55851-216-0.
M&T is now owned by Henry Holt, and they don't seem to do any direct sales,
so you have to get it through a bookstore. Sorry, but I don't know
anything about who carries it in the UK. If you are at the end of your rope
and can't find it I can supply mail order/phone dealer numbers from the US.
You should also look for _Text Compression_ by Bell, Cleary, and Witten
from Prentice Hall, ISBN 0-13-911991-4. The best reference to references
is the comp.compression FAQ, available in the normal places, such as
rtfm.mit.edu.
Mark Nelson
marknel@utdallas.edu
From ivanh@iconz.co.nz Fri Dec 9 21:14:58 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!darwin.sura.net!tulane!ames!waikato!status.gen.nz!iconz.co.nz!ivanh
From: ivanh@iconz.co.nz (Ivan)
Newsgroups: comp.compression
Subject: Re: Comparison of image compressions
Date: Fri, 9 Dec 1994 06:13:35 UNLISTED
Organization: Desktop Publishing Services
Lines: 37
Message-ID:
References: <3c764i$ice@nms.telepost.no>
NNTP-Posting-Host: ivanh.iconz.co.nz
X-Newsreader: Trumpet for Windows [Version 1.0 Rev B final beta #4]
In article <3c764i$ice@nms.telepost.no> hans2@spacetec.no (Hans Sandsdalen) writes:
>I am looking for an article which compares different image compression
>techniques, still images.
This article may answer your questions. I don't know about Dr. Dobb's,
though, apart from the fact that the journal frequently appears on our local BBS.
Path: status.gen.nz!waikato!wupost!simtel.coast.net!msdos-ann-request
From: w8sdz@SimTel.Coast.NET (Keith Petersen)
Newsgroups: comp.archives.msdos.announce
Subject: Info on ARC, ARJ, LZH, ZIP, ZOO, LBR, Compressed & Squeezed files
Message-ID: <9410191526.kp18796@SimTel.Coast.NET>
Date: Wed, 19 Oct 1994 15:26:37 GMT
Expires: 2 Nov 94 20:00:00 GMT
Followup-To: comp.archives.msdos.d
Sender: msdos-ann-request@simtel.coast.net
Organization: SimTel, the Coast to Coast Software Repository (tm)
Approved: w8sdz@SimTel.Coast.NET
Lines: 188
[File: /SimTel/msdos/starter/00-files.doc Revised: Sept. 15, 1994]
All about ARC, ARJ, LZH, ZIP, ZOO, LBR, Compressed and Squeezed files
Some of the files available from SimTel, the Coast to Coast Software
Repository (tm), have been transformed by using one of the standard
freely-distributable utilities that either SQueezes, LiBRaries, ARChives,
ARJs, LZHs, ZIPs, or ZOOs files. Some files in other collections have
been compressed.
____________________________
Ivan Hyslop
Desktop Publishing Services.
Howick, Auckland, New Zealand.
E-MAIL: ivanh@iconz.co.nz
From marknel@utdallas.edu Fri Dec 9 22:30:26 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!darwin.sura.net!gatech!howland.reston.ans.net!pipex!bnr.co.uk!bcarh8ac.bnr.ca!corpgate!news.utdallas.edu!marknel
From: marknel@utdallas.edu (Mark R Nelson)
Newsgroups: comp.compression
Subject: Re: PKZIP Runtime Library?
Date: 9 Dec 1994 21:54:57 GMT
Organization: The University of Texas at Dallas
Lines: 20
Distribution: world
Message-ID: <3cajnh$83f@news.utdallas.edu>
References: <1994Nov24.092605.527@pat.uwe.ac.uk> <786013380snz@jaburt.demon.co.uk> <3bj67l$ova$1@rosebud.ncd.com>
NNTP-Posting-Host: csclass.utdallas.edu
NNTP-Posting-User: marknel
X-Newsreader: TIN [version 1.2 PL2]
: Does anybody know of an archiving package or routines for un-archiving useable
: from a Windows C program? Whether it's C, Pascal, object, library or .DLL is
: no problem. It's also unimportant that the file format be .ZIP or any other
: particular format. If you do, please drop me a line ASAP. Thanks!!!
Greenleaf Software makes a product called ArchiveLib. It is a C++ class
library, but also has a complete interface for C programmers. It has both
16 bit and 32 bit Windows DLLs, along with VB, C, C++, and TPW example
programs.
If you want more information, you can call Greenleaf Technical Sales at
800-523-9830 or 214-248-2561. Fax 214-248-7830. There is also a small
amount of benchmark and demo information in the PCVENB area of CompuServe,
but nothing available via ftp at this time.
Mark Nelson
Greenleaf Software
marknel@utdallas.edu
From marmisa@tid.tid.es Wed Dec 14 08:54:47 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!usenet.cis.ufl.edu!usenet.eel.ufl.edu!news.mathworks.com!europa.eng.gtefsd.com!newsxfer.itd.umich.edu!gatech!howland.reston.ans.net!news.sprintlink.net!EU.net!goya!tid.tid.es!tid.tid.es!not-for-mail
From: marmisa@tid.tid.es (Luis Jose Marmisa Gazo)
Newsgroups: comp.compression
Subject: Re: Improvement of LZW compression.
Date: 13 Dec 1994 20:54:34 +0100
Organization: Telefonica I+D
Lines: 42
Message-ID: <3cku5q$p8f@tid.tid.es>
References: <1994Dec9.143135.351@ign.fr>
NNTP-Posting-Host: tid.tid.es
X-Newsreader: TIN [version 1.2 PL2]
Corine Plazanet (plazanet@sturm.ign.fr) wrote:
: I've coded the LZW compression algorithm for images, text, fax, exe. I would like to decrease the processing time.
: Does anybody know about algorithms to improve LZW or where I can find any?
: I heard about H-coding, what about you?
: Thanks,
: Corinne PLAZANET
: IGN.
Hi Corinne:
I am not an expert in the compression field, but I can suggest to you
two fast algorithms:
1) V.42bis from ITU. This is the standard for data compression in modems. An
8-bit microprocessor (Z80) is able to compress / uncompress a bidirectional
14400 bps data link. I believe that V.42bis is an LZW variant (LZ78).
2) I had some experience with lzrw3-a from Ross Williams. This is the fastest
algorithm in the comparison table of the compression FAQ. However, it is clear
that this algorithm is not the best at compression.
'C' sources are available by ftp in:
- ftp.adelaide.edu.au:/pub/compression/lzrw3-a.c [129.127.40.3]
lzrw3-a is not an LZW algorithm (I think); it's LZ77. I tested this
algorithm on a 16-bit MC68000 microprocessor, both to increase the data
transfer speed (19200 bps) and for file compression. I got good
results.
I hope this information is useful to you.
Best regards,
Luis Marmisa (marmisa@tid.es)
P.S.: There are some problems with your e-mail address. I cannot send you
any messages.
From nevries@aip.nl Fri Dec 16 01:25:08 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!portal.gmu.edu!europa.eng.gtefsd.com!howland.reston.ans.net!EU.net!sun4nl!aipnl!nevries
From: nevries@aip.nl (Nico de Vries [AIP-NL])
Newsgroups: comp.compression
Subject: Re: Improvement of LZW compression.
Distribution: world
Message-ID: <787391473snx@aip.nl>
References: <1994Dec9.143135.351@ign.fr>
Date: Wed, 14 Dec 94 07:51:13 GMT
Organization: AIP-NL (Ad Infinitum Programs)
Lines: 24
plazanet@sturm.ign.fr writes in article <1994Dec9.143135.351@ign.fr>:
>
> I've coded the LZW compression algorithm for images, text, fax, exe. I
> would like to decrease the processing time.
> Does anybody know about algorithms to improve LZW or where I can find any?
> I heard about H-coding, what about you?
FTP garbo.uwasa.fi:/pc/programming/lds_11.zip contains 2 sources which
are faster than LZW at compression: Finish and LZRW. It also contains
the source AR002, which is probably faster at decompressing than your
LZW implementation, but slower at compressing. OTOH, AR002 has a much
better compression ratio than LZW.
> Thanks,
>
>
> Corinne PLAZANET
> IGN.
Take care,
Nico de Vries.
--
nevries@aip.nl [AIP-NL, UltraCompressor II development]
From hle1@delphi.com Mon Dec 26 07:58:03 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!math.ohio-state.edu!howland.reston.ans.net!news2.near.net!news.delphi.com!usenet
From: hle1@delphi.com
Newsgroups: comp.compression
Subject: Re: max. huffman code length?
Date: Sun, 25 DEC 94 21:40:18 -0500
Organization: Delphi (info@delphi.com email, 800-695-4005 voice)
Lines: 17
Message-ID: <5Y7U3Bi.hle1@delphi.com>
References: <3da2u2$1ahr@rs18.hrz.th-darmstadt.de>
NNTP-Posting-Host: bos1e.delphi.com
X-To: Ulrich Graef
Ulrich Graef writes:
>> is there any way to calculate the maximal codelength one will get with
>> dynamic huffman? (2^16 characters of 8 bit)
>
>Which data and which dynamic Huffman method (See my last post!)?
>
>If you use the first bit to decide whether to code or to copy the data,
>you can limit the size of the code to
>
> 2^16 * 8 + 1
The Huffman method is somewhat related to the minimum description length
(MDL) and minimum message length (MML) methods, which is why I am joining in.
I have been searching for specific detailed examples of how to code
these methods in a spreadsheet. Thanks in advance for any advice you
may have. Harry Edwards hle1@delphi.com
From graef@iti.informatik.th-darmstadt.de Wed Dec 28 00:35:46 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!zip.eecs.umich.edu!newsxfer.itd.umich.edu!gatech!swrinde!pipex!uunet!zib-berlin.de!news.th-darmstadt.de!iti.informatik.th-darmstadt.de!graef
From: graef@iti.informatik.th-darmstadt.de (Ulrich Graef)
Newsgroups: comp.compression
Subject: Re: max. huffman code length?
Date: 27 Dec 1994 17:28:42 GMT
Organization: TH Darmstadt, FG Systemprogrammierung
Lines: 53
Distribution: world
Message-ID: <3dpisa$eqp@rs18.hrz.th-darmstadt.de>
References: <3da2u2$1ahr@rs18.hrz.th-darmstadt.de> <5Y7U3Bi.hle1@delphi.com>
NNTP-Posting-Host: spelunke.iti.informatik.th-darmstadt.de
In article <5Y7U3Bi.hle1@delphi.com>, hle1@delphi.com writes:
> Ulrich Graef writes:
>
> >> is there any way to calculate the maximal codelength one will get with
> >> dynamic huffman? (2^16 characters of 8 bit)
> >
> >Which data and which dynamic Huffman method (See my last post!)?
> >
> >If you use the first bit to decide whether to code or to copy the data,
> >you can limit the size of the code to
> >
> > 2^16 * 8 + 1
>
> The Huffman method is somewhat related to the minimum description length
> (MDL) and minimum message length (MML) methods which is why I am joining in.
> I have been searching for specific detailed examples of how to code
> these methods in a spreadsheet. Thanks in advance for any advice you
> may have. Harry Edwards hle1@delphi.com
I think the methods that generate Huffman code with dynamic adaptation to the
message source are not easy to program in a spreadsheet.
This is because you have to build a Huffman tree to be able to generate
your binary Huffman prefix code _and_ you must update the tree at periodic
intervals to adapt (dynamically) to the probabilities in your message source.
This is hard to program with good performance,
even if you have a programming language like C or Pascal.
I think most spreadsheets do not contain the features you need, including:
- tree representation and manipulation.
- iterative processing of data
(to perform the coding process and the dynamic model update)
Does your spreadsheet allow this (without heavy add-ins)?
If yes, which one do you use?
If you want to use the spreadsheet anyway, you should consider arithmetic
coding; it would be very helpful, especially when you only want to compute
the minimal representation length. But the need for iterative processing
of the data does not disappear.
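For what it's worth, the static tree-building step (without the dynamic
update) is short in C. Here is a minimal sketch, with the symbol limit and
all names being my own, that returns the longest code length the tree
produces; on maximally skewed (Fibonacci-like) weights it returns n-1, the
worst case the original question about maximal code lengths runs into:

```c
#define MAXSYM 64  /* illustrative symbol limit, an assumption */

/* Longest Huffman code length for n weighted symbols, built with a
 * simple O(n^2) "merge the two lightest nodes" loop and parent links. */
int huffman_max_len(const long *weight, int n)
{
    long w[2 * MAXSYM];
    int parent[2 * MAXSYM], alive[2 * MAXSYM];
    int i, total = n, remaining = n, longest = 0;

    for (i = 0; i < n; i++) {
        w[i] = weight[i];
        alive[i] = 1;
        parent[i] = -1;
    }
    while (remaining > 1) {
        int a = -1, b = -1;          /* a = lightest, b = second */
        for (i = 0; i < total; i++) {
            if (!alive[i])
                continue;
            if (a < 0 || w[i] < w[a]) { b = a; a = i; }
            else if (b < 0 || w[i] < w[b]) b = i;
        }
        w[total] = w[a] + w[b];      /* new internal node */
        alive[total] = 1;
        parent[total] = -1;
        alive[a] = alive[b] = 0;
        parent[a] = parent[b] = total;
        total++;
        remaining--;
    }
    for (i = 0; i < n; i++) {        /* code length = leaf depth */
        int d = 0, p = parent[i];
        while (p >= 0) { d++; p = parent[p]; }
        if (d > longest) longest = d;
    }
    return longest;
}
```

The tree and the iteration here are exactly the two features listed above
that spreadsheets lack.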
Greetings,
Uli
--
Ulrich Graef | home: 06155 62493 int: +49 6155 62493
Lichtenbergweg 11 | office: 06103 752 364 int: +49 6103 752 359
D-64347 Griesheim +-------------------------------------------------
Germany | e-mail: graef@iti.informatik.th-darmstadt.de
From roe2@ellis.uchicago.edu Wed Dec 28 10:16:09 1994
Newsgroups: comp.compression
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!msunews!uchinews!ellis!roe2
From: roe2@ellis.uchicago.edu (Cave Newt)
Subject: Re: GZIP file format
Message-ID: <1994Dec28.062253.4076@midway.uchicago.edu>
Sender: news@uchinews.uchicago.edu (News System)
Reply-To: roe2@midway.uchicago.edu
Organization: University of Chicago
References: <3d9ple$dai@sun4.bham.ac.uk>
Date: Wed, 28 Dec 1994 06:22:53 GMT
Lines: 21
phillipm@eee.bham.ac.uk (Mark Phillips) writes:
>I have an application (running under DOS) which generates large
>files and I wish to compress them to the PKZIP format on-line, that is,
>as they are created, so that they may be transferred and decompressed on
>a UNIX platform with PKUNZIP at a later date.
>
>Is it possible to obtain the PKZIP file format specification (including
>a specification of the variant of the Lempel-Ziv compression algorithm
>PKZIP uses ) or, better still, the source code for PKZIP ? Obviously,
>I am prepared to pay a reasonable fee for the latter.
Obviously you're a bit confused here (gzip has almost nothing to do with
PKZIP/Zip), but free source for zip/unzip is available from ftp.uu.net in
/pub/archiving/zip. If you build the source code into your app you might
be able to pull this off, but anything running under DOS has a huge strike
against it already. Forget about "pipes," unless you're running in a DOS
box under OS/2 and incorporate the named variety...
Greg Roelofs
Info-ZIP
From scott@santarosa.edu Fri Dec 30 12:18:24 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!spool.mu.edu!agate!library.ucla.edu!csulb.edu!nic-nac.CSU.net!floyd!floyd!not-for-mail
From: scott@santarosa.edu (Scott Doty)
Newsgroups: comp.compression,comp.dsp
Subject: Re: Compression algorithms for DSP over SLIP/PPP
Followup-To: comp.compression,comp.dsp
Date: 29 Dec 1994 23:54:35 -0800
Organization: Santa Rosa Junior College, Santa Rosa, CA (US)
Lines: 41
Message-ID: <3e0ebr$mpj@floyd.santarosa.edu>
References: <3dikiu$cpr@News1.mcs.com>
NNTP-Posting-Host: floyd.santarosa.edu
X-Newsreader: TIN [version 1.2 PL2]
Xref: solitaire.cv.nrao.edu comp.compression:15530 comp.dsp:13619
Adam W. Dace (thekind@myhost.subdomain.domain) wrote:
[snip]
> ztalk uses GSM compression, and I've been wondering if there's anything
> better out there that I can get specs on *freely*. I've poked through the
> FAQ here, albeit a bit quickly and haven't really found much of use.
> I'm by no means a good C programmer nor have I played with DSP much until
> recently. If anyone could suggest whether GSM is the way to go, or not
> please let me know.
There are two LPC libraries that I've got to work under Linux:
_[ snippet 1 ]_
Ron Frederick (frederic@parc.xerox.com) of Xerox PARC,
Palo Alto, CA, contributed the LPC codec which is based on
an implementation done by Ron Zuckerman
(ronzu@isu.comm.mot.com) of Motorola which was posted to
the Usenet group comp.dsp on 26 June 1992.
_ _ _ _
ftp from beta.xerox.com, /pub/net-research/lpc.tar.Z
616 bytes/second, appears to run real-time on a 386/33. Speech
quality is...er..."usable." :)
_[ snippet 2]_
U.S. Department of Defense
LPC-10 2400 bps Voice Coder
Release 1.0
October 1993
_ _ _ _
ftp from dspsun.eas.asu.edu, in /pub/speech
Better audio quality, but seems to be more CPU intensive.
I've used the former to implement a general-purpose audio tool,
which will compress and transmit (using TCP) audio from a ulaw
source (including /dev/audio). I use this to listen to audiocasts
over a PPP link. (ftp.santarosa.edu, pub/Linux/vot/zvot-0.81.tar.gz).
Hopefully, this is signal, and not noise...
-Scott
From john@pc.xs4all.nl Sat Dec 31 13:26:36 1994
Path: solitaire.cv.nrao.edu!hearst.acc.Virginia.EDU!caen!math.ohio-state.edu!howland.reston.ans.net!news.sprintlink.net!EU.net!sun4nl!hacktic!pc.xs4all.nl!pc.xs4all.nl!john
From: john@pc.xs4all.nl (Jan-Pieter Cornet)
Newsgroups: comp.compression
Subject: Re: compressor for word-aligned data?
Message-ID: <123094173545HNR.1.4@pc.xs4all.nl>
Date: Fri, 30 Dec 94 17:35:45 +0100
References:
Organization: Jan-Pieter Cornet at home, A Jacobsstr 8, 2642 AD Pijnacker, Netherlands
X-Newsreader: HNR 1.4 by Kafka & The Dude
Lines: 24
arndt@marvin.physik.uni-giessen.de (Arndt Brenschede) once said:
> we are dealing with large data files containing essentially
> 12-bit ADC data. The files are word-aligned; that means 16 bits are used
> instead of 12. It would be nice
> sometimes to compress these files, but gzip (and probably most
> other compressors) does not do a good job.
Try UC2 in the "tight/multimedia" mode (UC A -TT arch file ...). When
your file is indeed better compressed in 16-bit mode, UC2 should
auto-detect this and produce a tighter archive.
If you want something on another platform than DOS/Windows/OS2, your
best bet is to write a filter (actually, two filters with opposite
effects) that pre-processes your file into something that compresses
better. Examples: take word-wise delta values, or re-order the bytes in
your file from 0 1 2 3 4 5 ... to: 0 2 4 ... 1 3 5 ...
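The byte re-ordering suggestion can be sketched as a pair of in-memory
filters (the names, and the in-memory rather than streaming form, are my
assumptions):

```c
#include <stddef.h>

/* Split a buffer of 16-bit samples into two planes: all even-offset
 * (low) bytes first, then all odd-offset (high) bytes, i.e.
 * 0 1 2 3 4 5  ->  0 2 4 1 3 5.  Assumes n is even. */
void deinterleave(const unsigned char *in, unsigned char *out, size_t n)
{
    size_t half = n / 2, i;
    for (i = 0; i < half; i++) {
        out[i]        = in[2 * i];      /* low bytes  */
        out[half + i] = in[2 * i + 1];  /* high bytes */
    }
}

/* Inverse filter, applied after decompression. */
void interleave(const unsigned char *in, unsigned char *out, size_t n)
{
    size_t half = n / 2, i;
    for (i = 0; i < half; i++) {
        out[2 * i]     = in[i];
        out[2 * i + 1] = in[half + i];
    }
}
```

After de-interleaving, the high-byte plane of 12-bit samples carries only
4 significant bits per byte, which a byte-oriented compressor like gzip
models far better than the interleaved stream.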
Good luck,
=====BEGIN FRACTAL-COMPRESSED SIGNATURE===== | Jan-Pieter Cornet
!PGP0XA4E77CCB/KVC=1FCBE41048A009550F68867928EB8DDF |
=====END FRACTAL-COMPRESSED SIGNATURE===== ;-)
My v2.6 decompressor (out soon!) will expand this to a 72 minutes MPEG movie!