Archive - Originally posted on "The Horse's Mouth" - 2009-05-26 19:36:56 - Graham Ellis
Why can you run Java applications straight from a jar, when you'll never find mainstream applications running their code out of a tar file? It all comes down to how the two formats are indexed.
A tar file was originally designed as a "tape archive". So it's a sequential file: a header giving the file name and file length (plus details such as permissions and timestamps), then the file data for the first file, then the same thing for the second file, for the third file, and so on. It means that when a tape is written, you have high data integrity even on a device that doesn't have random access facilities. Each entry is complete as soon as it's written, so there's no need to go back and store statistics about a file once it's been added, nor to rely on details looked up right at the start or end of writing which may - on a multiuser system - have changed in the meantime. But it does mean that if you want to read back something from a "tar" file you have to look at each header block in turn until you find what you need.
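To make that sequential search concrete, here's a minimal Python sketch that walks the header blocks one at a time. It assumes the plain "ustar" layout (512-byte blocks, the name in the first 100 bytes of each header, the size as octal text in bytes 124 to 135) and short file names. build_tar and find_in_tar are names made up for this illustration; real code would just use the tarfile module.

```python
import io
import tarfile

BLOCK = 512  # tar headers and file data are padded to 512-byte blocks


def build_tar(files):
    """Build a small in-memory tar from a {name: bytes} dict (demo helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def find_in_tar(raw, wanted):
    """Walk the header blocks in order until `wanted` turns up.

    Returns (data, headers_examined), or (None, headers_examined) if absent.
    """
    pos = 0
    examined = 0
    while pos + BLOCK <= len(raw):
        header = raw[pos:pos + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks the end of the archive
            break
        examined += 1
        name = header[0:100].rstrip(b"\0").decode("utf-8")
        size = int(header[124:136].rstrip(b"\0 ") or b"0", 8)  # octal text field
        data_start = pos + BLOCK
        if name == wanted:
            return raw[data_start:data_start + size], examined
        # Not this one: skip its data, rounded up to a whole number of blocks.
        pos = data_start + ((size + BLOCK - 1) // BLOCK) * BLOCK
    return None, examined


raw = build_tar({"a.txt": b"alpha", "b.txt": b"bravo", "c.txt": b"charlie"})
data, examined = find_in_tar(raw, "c.txt")
print(data, "found after reading", examined, "headers")
```

The further into the archive the file you want is, the more headers you have to read (and the more data you have to skip past) to get there.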
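By contrast, a jar file [same format as zip] contains the data for all its entries first, then an index of where each entry starts and how long it is at the end of the file. In zip terms that index is the "central directory" - not to be confused with the jar's manifest (META-INF/MANIFEST.MF), which is just an ordinary entry in the archive. Which means you can very quickly read the list of contents and jump to what you need even in a huge file - but it does require random access.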
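As a rough illustration in Python, the sketch below builds a small zip in memory, finds the index by looking for the "end of central directory" record at the tail of the file, and then reads one entry directly. build_zip and central_directory_info are hypothetical names for this example, and the search for the record is simplified: it assumes no zip64 extensions and that the record's signature doesn't also appear inside the file data.

```python
import io
import struct
import zipfile


def build_zip(files):
    """Build a small in-memory zip from a {name: bytes} dict (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def central_directory_info(raw):
    """Return (entry_count, index_offset) from the end-of-central-directory record."""
    eocd = raw.rfind(b"PK\x05\x06")  # the record's signature, searched from the end
    if eocd < 0:
        raise ValueError("no end-of-central-directory record found")
    # Fields: signature, disk numbers (2), entry counts (2), index size,
    # index offset, comment length. 22 bytes in total.
    fields = struct.unpack("<4s4H2LH", raw[eocd:eocd + 22])
    entry_count, index_offset = fields[4], fields[6]
    return entry_count, index_offset


raw = build_zip({"a.txt": b"alpha", "b.txt": b"bravo", "c.txt": b"charlie"})
count, index_offset = central_directory_info(raw)
print(count, "entries; index starts at byte", index_offset, "of", len(raw))

with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    for info in zf.infolist():
        # Every entry's data sits before the index, at a known offset.
        print(info.filename, "starts at byte", info.header_offset)
    print(zf.read("c.txt"))  # jumps straight to the entry, no scan of a.txt or b.txt
```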
You may ask why the list of contents for a jar is at the end. That's so that a jar file can be updated / have things added to it without having to rewrite the whole file - adding a new file is as simple as reading the old index (the zip "central directory"), writing the new file's data starting where the index used to start, and then writing an updated index after it.
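Here's a small Python sketch of that update, using the zipfile module's append mode on an in-memory archive. It checks that everything in front of the old index is left untouched and that the new entry starts exactly where the old index used to be. build_zip, index_offset and append_entry are names invented for this example, and the index lookup makes the same simplifying assumptions as any quick hack (no zip64, signature not present in the data).

```python
import io
import struct
import zipfile


def build_zip(files):
    """Build a small in-memory zip from a {name: bytes} dict (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def index_offset(raw):
    """Byte offset at which the central directory (the index) starts."""
    eocd = raw.rfind(b"PK\x05\x06")  # end-of-central-directory signature
    return struct.unpack("<4s4H2LH", raw[eocd:eocd + 22])[6]


def append_entry(raw, name, data):
    """Add one entry to an existing archive and return the new bytes."""
    buf = io.BytesIO(raw)
    with zipfile.ZipFile(buf, "a", zipfile.ZIP_STORED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


old = build_zip({"a.txt": b"alpha", "b.txt": b"bravo"})
new = append_entry(old, "c.txt", b"charlie")
old_index = index_offset(old)

# The existing entries' bytes are not rewritten...
print(new[:old_index] == old[:old_index])
with zipfile.ZipFile(io.BytesIO(new)) as zf:
    # ...the new entry lands where the old index used to start...
    print(zf.getinfo("c.txt").header_offset == old_index)
    print(zf.namelist())
# ...and a fresh index is written further along.
print(index_offset(new) > old_index)
```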