FRDB Archives

Freethought & Rationalism Archive

The archives are read only.


Old 01-16-2002, 08:46 AM   #1
Contributor
 
Join Date: Jul 2001
Location: Deep in the heart of mother-lovin' Texas
Posts: 29,689
Talking Firm touts "perfect compression"

I have some resort property in Afghanistan to sell you if you believe this hype:

<a href="http://www.wired.com/news/technology/0,1282,49599,00.html" target="_blank">Perfect compression?</a>

From the interview:

WN: You have this working right now?

PSG: It's only operating on a limited bitstream of a few hundred bits. It needs a lot of work to make it a commercial technology.


In other words, they've designed a "perfect" compression algorithm that works on one data set, but they haven't (in 12 years) gotten it to work on other data sets.
Mageth is offline  
Old 01-16-2002, 02:37 PM   #2
Veteran Member
 
Join Date: Jun 2000
Location: Greensboro, NC, U.S.A.
Posts: 2,597
Cool

Quote:
Originally posted by Mageth:
<strong>I have some resort property in Afghanistan to sell you if you believe this hype:</strong>
I also really enjoyed his assertion that propeller-driven planes and jet-engine-driven planes operated on "completely different methodologies."

LOL. I bet Bernoulli would be pretty surprised to hear that!

Regards,

Bill Snedden
Bill Snedden is offline  
Old 01-16-2002, 02:56 PM   #3
Contributor
 
Join Date: Jul 2000
Location: Lebanon, OR, USA
Posts: 16,829
Exclamation

I've had an abundance of experience with data-compression algorithms, and it is certainly possible to devise algorithms that achieve high levels of compression for certain datasets. However, in absolute generality, there is a simple counting argument that shows that an absolutely general lossless compressor is impossible. By "lossless" I mean a compressor that produces a compressed data object from which one can recover the original object exactly, and not simply some approximation of it.

Imagine that one wants to compress a data object with N bits, each of which can be either possible value without any connection to the others -- in effect, a totally general data object. There are 2^N possible such objects.

Now compress this object to M bits. Counting the possible compressed objects yields 2^M of them. Since M less than N is clearly desired here (a less voluminous object), 2^M is smaller than 2^N, so at least 2^N - 2^M of the original objects cannot be mapped onto distinct compressed objects, no matter what algorithm one chooses. There will always be some incompressible objects.
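The counting argument can be checked directly for a tiny N. A minimal Python sketch (my own illustration, not from the article), using the stronger version that counts *all* outputs shorter than N bits:

```python
from itertools import product

# Pigeonhole sketch: for N = 3 there are 2**3 = 8 distinct inputs,
# but only 2**0 + 2**1 + 2**2 = 7 bitstrings shorter than 3 bits.
# So no lossless compressor can shorten every 3-bit input.
N = 3
inputs = [''.join(bits) for bits in product('01', repeat=N)]
shorter_outputs = [''.join(bits)
                   for m in range(N)
                   for bits in product('01', repeat=m)]
print(len(inputs), len(shorter_outputs))
```

Since there are fewer possible short outputs than inputs, any mapping from inputs to shorter strings must send two inputs to the same output, making exact recovery impossible.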

However, real-world experience seems to violate this contention, at least judging from the success of various compression algorithms. There are two reasons for that.

The first reason is that much real-world data, such as text files, violates the proof's conditions by occupying only a small subset of the possible data objects. For example, in an English word, consonants and vowels usually alternate, with clusters of usually 2 or rarely 3 vowels or consonants.

If one has several data objects one wants to send, such as ASCII characters, one can do compression by using fewer bits for the more common letters; there is even an algorithm, the Huffman algorithm, that finds the best possible choice along with a simple way of reconstruction.
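For the curious, here is a toy Huffman sketch in Python (my own illustration; it uses the standard greedy merge of the two least-frequent subtrees on a heap):

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code: a dict mapping each symbol to a bitstring.
    More frequent symbols receive shorter codewords."""
    freq = Counter(text)
    # Each heap entry: (frequency, tiebreak, {symbol: code-so-far}).
    # The unique tiebreak keeps heapq from ever comparing the dicts.
    heap = [(f, i, {sym: ''}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    if len(heap) == 1:  # degenerate case: only one distinct symbol
        (_, _, table), = heap
        return {sym: '0' for sym in table}
    while len(heap) > 1:
        # Merge the two least-frequent subtrees; prefix their codes.
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in t1.items()}
        merged.update({s: '1' + c for s, c in t2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

code = huffman_code("abracadabra")
encoded = ''.join(code[ch] for ch in "abracadabra")
```

Here 'a' occurs 5 times out of 11 characters and ends up with a one-bit codeword, while the rare letters get longer ones; the whole string encodes in 23 bits versus 88 for 8-bit ASCII.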

Also, redundancy can be handled by having one's compressed data include commands like "repeat previous one".
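A "repeat previous one" scheme is essentially run-length encoding; a toy Python sketch:

```python
def rle_encode(data):
    """Run-length encoding: each run of equal values becomes a
    [count, value] pair."""
    runs = []
    for value in data:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1  # extend the current run
        else:
            runs.append([1, value])  # start a new run
    return runs

def rle_decode(runs):
    """Exact inverse of rle_encode: expand each run back out."""
    return [v for count, v in runs for _ in range(count)]

data = list("aaaabbbcca")
runs = rle_encode(data)
```

The round trip is exact, so this is lossless; it wins only when the data actually contains long runs, which is the "small subset of possible objects" condition again.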

The second reason is that in many applications, such as images and sound, one does not need absolute precision; one can often get away with reconstructing some approximation of the original. This allows one to use a much smaller quantity of possible compressed data objects, and thus much shorter ones.
lpetrich is offline  
Old 01-16-2002, 07:20 PM   #4
Veteran Member
 
Join Date: Dec 2001
Location: southeast
Posts: 2,526
Post

The more you know about a dataset, the more compression you can achieve. For example, I could compress a 10,000 word book into a single phrase, if both sides have access to the same library and the title is sufficient to locate the book.

But the mathematical truth is still there: some datasets must always be incompressible. Otherwise you could feed the compressor its own output over and over, and every possible dataset would collapse to a single bit.
Asha'man is offline  
Old 01-18-2002, 07:45 AM   #5
Veteran Member
 
Join Date: Jun 2000
Location: Denver, Colorado, USA
Posts: 4,834
Post

I agree that knowledge of the dataset is crucial. You can have lossless compression of a dataset, but only if the data set contains less data than it appears to at first glance and there is a clear understanding between the sender and the receiver.

For example, if you have a screen that consists of colored squares, you really only need to know two corner coordinates, and about three numbers to identify the color (one number if you have a pre-arranged color coding system). Using this system compresses data dramatically compared to a bitmap of the same thing without data loss.
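A toy Python sketch of that idea (the screen size, palette codes, and rectangle layout are made up for illustration):

```python
# A hypothetical 100x100 screen made of solid colored rectangles.
# A raw bitmap needs one number per pixel; the rectangle description
# needs only five numbers per rectangle (two corners plus a color code
# from a pre-arranged palette), and it reconstructs the bitmap exactly.
WIDTH = HEIGHT = 100

# (x0, y0, x1, y1, color_code)
rectangles = [
    (0, 0, 49, 99, 1),    # left half of the screen, color 1
    (50, 0, 99, 99, 2),   # right half of the screen, color 2
]

def render(rects, width, height):
    """Expand the rectangle description back into a full bitmap."""
    bitmap = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1, color in rects:
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                bitmap[y][x] = color
    return bitmap

bitmap = render(rectangles, WIDTH, HEIGHT)
compressed_size = len(rectangles) * 5   # 10 numbers
bitmap_size = WIDTH * HEIGHT            # 10,000 numbers
```

Ten numbers stand in for ten thousand, with no loss, precisely because this screen carries far less information than an arbitrary 100x100 image could.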

The transmission of a book with one line of reference data is another good example. To give some other real world examples, police commonly communicate crimes in process with numbers referring to criminal code sections, and doctors commonly bill insurance companies based on standard descriptions of medical disorders with pre-agreed code numbers.

Shorthand such as "the series of integers from 1 to 10^23" similarly is a very efficient data compression compared to writing it all out.

Even such an obvious step as writing a multi-volume book on a computer and saving it to a CD-ROM, while a lossless form of compression (in the sense that it saves everything you care about into a smaller space), works only because you don't really care about a lot of the data (e.g., paper color and quality, defects in printing, etc.).
ohwilleke is offline  
 


This custom BB emulates vBulletin® Version 3.8.2
Copyright ©2000 - 2015, Jelsoft Enterprises Ltd.