How does recursive packing of a file changes its size?

How the size of a packed file will change if you will pack it again? How will it change if you do that again and again? Will it be the same, bigger of smaller? I checked this with popular kinds of files and below I will show the results. I used Bash script and R language to check this. Charts shows how the file size changes in each iteration, first bar is the size of original, unpacked file.

How the script is used:

bash-3.2$ ./packer.sh cleaning_silver_jewelery.JPG 20 > cleaning_silver_jewelery.JPG.log
bash-3.2$ Rscript chart.r "cleaning_silver_jewelery.JPG.log" "JPG file"
Read 22 items
bash-3.2$

packer.sh

#!/bin/bash
## $1 stores filename of compressed file
## $2 stores number of iterations

if [ ! $# -eq 2 ] ; then
    echo "usage: $0 file_name number_of_iterations"
    exit 1
elif [ ! -f $1 ]; then
    echo "$1 - file does not exists"
    exit 1
else
    ls -l $1 | awk '{ print $5 }' 
    zip -q test_$1_0.zip $1 && ls -l test_$1_0.zip | awk '{ print $5 }' 
    for (( i=1; i<=$2; i++ )); do
        zip -q test_$1_$i.zip test_$1_$((i-1)).zip && ls -l test_$1_$i.zip | awk '{ print $5 }'
    done
    rm test*zip
fi

chart.r

args <- commandArgs(TRUE)
sizes <- scan(args[1])
image_name <- paste( "./chart", args[1], ".png")

png(filename=image_name, height=700, width=500, bg="white")

barplot(sizes, main=args[2], xlab="iterations",
    ylab="size of zip archive [bytes]", xpd=F, col="black",
    ylim=c(min(sizes), max(sizes)))

axis(2, at=c(min(sizes), max(sizes)))

They say that if you type "Google" into Google, then you will blow up the Internet, fortunately this similar experiment went safe :)

You can download source code (directory FileSizeAfterMultipleCompression) by cloning git@github.com:RobertGawron/snippets.git

1 comment: