Re: Weener and Ma - thanks guys!! (alt.binaries.pictures.vladmodels)

extremeusenet.nl

2015/07/22 20:58

On 2015-07-23 01:45:36 +0000, Weener said:

> Ma <Doff@Modo.de> wrote in news:201507222043245273-Doff@Modode:
>
>> most people in
>> this group will use compression when posting and you have to decompress
>> the file that they post so it's not like they'd have an exact copy once
>> extracted, I thought either they are ignorant or just bitching to be
>> bitching.
>
> Winrar and 7zip do indeed produce exact copies once extracted. Csv's and
> crc values with PiCheck are what Vlad collectors use to verify them. Just
> so you know.
>
> Weener

ZIP is a very popular compressed archive format supported by many popular
programs such as WinZip, 7-Zip, and recent versions of Microsoft
Windows. ZIPping a file or set of files can often reduce their size
significantly, at the cost of needing to unzip them before they can be
used.
Note though that I said, “…often reduce their size.”
Unfortunately, “often” doesn’t mean “always.”

The short answer
ZIPping photographs, music, and videos will typically not make them
significantly smaller and can even make them slightly larger.
To understand why that might be, we need to look into how compression
works at a high level.
About compression
While the specifics of many different compression algorithms are often
the stuff of research, theses, and even patents, the concepts of
compression are actually fairly simple.
The idea is that information stored on disk is often stored in a way
that is less than optimal for storage. It may be optimal for other
purposes, but as a side effect, there may be redundant information in
the data that could be represented differently.
A simple compression algorithm is “run length encoding.”
Consider the following text:
This is a row of 10 asterisks: ********** followed by text.
That’s 59 characters long. If we define the character “+” to be not a
plus character but an indicator that the next two characters are a
count, and the character after the count is the one to repeat that many
times, we get this:
This is a row of 10 asterisks: +10* followed by text.
We’ve shortened or “compressed” the text to only 53 characters, but it
still means exactly the same thing. When decompressed, the “+” is
encountered, causing the “10*” that follows it to be read and replaced
with 10 asterisks. The original uncompressed text is restored:
This is a row of 10 asterisks: ********** followed by text.
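
The decompression step described above can be sketched in a few lines of Python. This is only an illustration of the toy scheme, not how ZIP works; the function name `rle_decode` and the `marker` parameter are my own, and it assumes every “+” in the input starts a well-formed token (marker, two-digit count, character).

```python
def rle_decode(text: str, marker: str = "+") -> str:
    """Expand every marker + two-digit count + character token."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == marker:
            count = int(text[i + 1:i + 3])   # the two characters after the marker
            out.append(text[i + 3] * count)  # the character to repeat
            i += 4                           # skip the whole 4-character token
        else:
            out.append(text[i])              # ordinary character, copy as-is
            i += 1
    return "".join(out)


compressed = "This is a row of 10 asterisks: +10* followed by text."
restored = rle_decode(compressed)
print(len(compressed), len(restored))  # 53 59
print(restored)
```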
Compression doesn’t always compress
In the example above, we took a line of 59 characters and “compressed”
it to 53 characters. It’s not a great compression algorithm, but it
worked.
Now, let’s compress this text using the same algorithm:
Here's a single plus sign: + followed by text.
That’s 46 characters long.
The problem is that because it actually contains the plus sign, the
character we said was special in our compression algorithm, we can’t
just let it be. If we do, the decompression algorithm will look at it
and say, “Oh, the next two characters are a count of the number of
times I should repeat the third character following,” which is simply
wrong.
Unless we specially encode the plus character:
Here's a single plus sign: +01+ followed by text.
That allows the decompressor to follow its algorithm: “+” means the
next two characters are a count (one, in this case) of the number of
times to repeat the character that follows them (“+”). The compression
and decompression algorithms work.
The only problem is that the “compressed” data at 49 characters is now
larger than the original 46.
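
Here is a rough sketch of the compression side of the toy scheme, including the special handling of the plus sign. The name `rle_encode` and the `min_run` threshold are assumptions of mine (a token costs four characters, so I only encode runs of five or more), and runs longer than 99 are simply split because the count has two digits.

```python
import re


def rle_encode(text: str, marker: str = "+", min_run: int = 5) -> str:
    """Toy run-length encoder: marker + two-digit count + character."""
    def token(match):
        run = match.group(0)
        ch = run[0]
        # The marker must always be encoded, even when that costs space.
        if ch == marker or len(run) >= min_run:
            return f"{marker}{len(run):02d}{ch}"
        return run  # short runs are cheaper left alone

    # (.)\1{0,98} matches a run of 1 to 99 identical characters.
    return re.sub(r"(.)\1{0,98}", token, text, flags=re.DOTALL)


asterisks = "This is a row of 10 asterisks: ********** followed by text."
plus_sign = "Here's a single plus sign: + followed by text."

for sample in (asterisks, plus_sign):
    packed = rle_encode(sample)
    print(len(sample), "->", len(packed), packed)
```

The first sample shrinks from 59 to 53 characters, while the second grows from 46 to 49, matching the counts above.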
Every compression algorithm faces this problem. My little example above
was crafted to make it easy to show, but even the most advanced
compression algorithm will have situations where compressing particular
forms of data may cause the “compressed” data to be larger than the
original.
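
A counting argument shows why no lossless scheme can avoid this. This is a sketch in my own notation (n is just a length in bits, not something taken from the example above):

```latex
\[
\#\{\text{bit strings of length } n\} = 2^{n},
\qquad
\#\{\text{bit strings of length} < n\} = \sum_{k=0}^{n-1} 2^{k} = 2^{n} - 1 .
\]
```

Since there are more inputs of length n than there are shorter outputs, a compressor that must restore every input exactly cannot shrink all of them. At least one input has to stay the same size or grow.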
Compressing already compressed data
One of the most common ways that compressed data can end up larger than
the original is if the original is itself already compressed.
Let’s look at the compressed version of my silly little example again:
This is a row of 10 asterisks: +10* followed by text.
What happens if we try to compress that data again? Well, as we saw,
that single “+” sign is a problem and needs to be treated specially:
This is a row of 10 asterisks: +01+10* followed by text.
The result, at 56 characters, is bigger than the 53-character input.
Compressing the already-compressed data made it larger.
This happens most reliably when you compress twice using the same
algorithm, but if the compression techniques you’re using are
relatively efficient, then the algorithms don’t matter as much. ZIPping
something twice typically makes the second zip no smaller, and often
slightly larger, than the first. Likewise, ZIPping a “RAR” file, which
is also compressed, will typically result in something slightly bigger
than the RAR file itself.
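
You can try the compress-twice experiment yourself with Python's standard `zlib` module, which implements the DEFLATE method that ZIP files normally use. This is only a sketch: the word list and the seeded generator are made-up test data, standing in for an ordinary text file.

```python
import random
import zlib

# Made-up, repetitive "text file": 20,000 words drawn from a small vocabulary.
rng = random.Random(0)
words = ["compress", "archive", "photo", "music", "video", "zip", "rar", "file",
         "larger", "smaller", "often", "always", "data", "size", "text", "plus"]
original = " ".join(rng.choice(words) for _ in range(20_000)).encode("ascii")

once = zlib.compress(original, 9)   # first pass: plenty of redundancy to remove
twice = zlib.compress(once, 9)      # second pass: input is already compressed

print(len(original), len(once), len(twice))
```

I would expect the second number to be far smaller than the first, and the third to be about the same as the second or a few bytes larger, since the second pass has almost no redundancy left to work with.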
With that as background, we can finally explain our answer to the question.
Pictures, music, and videos are already compressed

Pictures in popular formats such as .jpg, .png or .gif are already compressed.
Music files in formats like .mp3, .ogg, .aac, and so on are already compressed.
Video files in formats like .wmv, .m4v, .mov, and more are already compressed.
And as we’ve seen by now, depending on the type of compression you’re
using, compressing an already-compressed file at best does very little
and at worst makes the file bigger.
So there’s typically no space-saving advantage to ZIPping a photo, a
movie, or an MP3.
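
A quick way to see this without hunting for a real photo is to zip random bytes, which behave much like the inside of a JPEG or MP3 as far as a compressor is concerned. This is a sketch using Python's standard `zipfile` module; the helper `zip_size` and the file name `photo.jpg` are placeholders of mine, and the random payload is an assumed stand-in for real media.

```python
import io
import os
import zipfile


def zip_size(payload: bytes, method: int) -> int:
    """Return the size of an in-memory ZIP archive holding one member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("photo.jpg", payload)
    return len(buf.getvalue())


payload = os.urandom(1_000_000)  # stand-in for already-compressed media
stored = zip_size(payload, zipfile.ZIP_STORED)      # no compression at all
deflated = zip_size(payload, zipfile.ZIP_DEFLATED)  # normal ZIP compression

print(len(payload), stored, deflated)
```

Both archives should come out slightly larger than the payload, because the ZIP headers add overhead and the compressor finds nothing to squeeze.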

I use KeKa to pack folders into 7z archives for cross-platform use,
with NO COMPRESSION!
For movie files like mpg or mp4 I simply split the file. Linux and Unix
users can use the split command from the shell, Windows users can use
HJSplitter, OS X users can use Split&Concat.
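
If you'd rather not install anything, splitting and rejoining is simple enough to do in a few lines of Python. This is just a sketch of the idea, not what any of those tools do internally; the function names, the `.000` style part suffix, and the chunk size are all my own choices, and the "movie" is random bytes in a temporary folder.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def split_file(path: Path, chunk_size: int) -> List[Path]:
    """Write path's contents into numbered parts of at most chunk_size bytes."""
    parts = []
    with path.open("rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part = path.with_name(f"{path.name}.{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_files(parts: List[Path], target: Path) -> None:
    """Concatenate the parts, in order, into target."""
    with target.open("wb") as dst:
        for part in parts:
            dst.write(part.read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    movie = Path(tmp) / "movie.mp4"
    movie.write_bytes(os.urandom(250_000))  # placeholder data, not a real video
    parts = split_file(movie, 100_000)
    rejoined = Path(tmp) / "rejoined.mp4"
    join_files(parts, rejoined)
    part_names = [p.name for p in parts]
    same = rejoined.read_bytes() == movie.read_bytes()

print(part_names, same)
```

Because nothing is compressed, the rejoined file is byte-for-byte identical to the original as long as the parts are put back in order.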
