[8037] in bugtraq
Re: tar "features"
daemon@ATHENA.MIT.EDU (der Mouse)
Sat Sep 26 22:28:07 1998
Date: Sat, 26 Sep 1998 11:18:51 -0400
Reply-To: der Mouse <mouse@RODENTS.MONTREAL.QC.CA>
From: der Mouse <mouse@RODENTS.MONTREAL.QC.CA>
X-To: Kragen <kragen@dnaco.net>
To: BUGTRAQ@NETSPACE.ORG
>> But this sort of thing is why, quite some time ago, I added a key (I
>> picked "j") to my tar to watch for exactly this kind of thing: add j
>> to an x operation and tar will refuse to extract such things.
> Is this a patch you can release?
No and yes. I can release it, but it is not a patch. "[M]y tar" is a
complete rewrite, and as a result I can release the whole thing.
I'm not about to send it to bugtraq, since it's over 260K. Even after
gzip --best | btoa, it's still 85858 bytes. But I will happily mail a
copy to anyone who asks for it. (That size figure does not include the
manpage, which is another 45668 bytes uncompressed - I need to check it
to make sure it's up-to-date, though.)
> Why do you provide the option of not doing this checking?
Mmm. I am not about to drop the option of not doing the checking,
though one could certainly argue it should be turned on by default
despite the efficiency penalty (lots more syscalls per thing
extracted).
[excerpt from the comment I quoted, and reply to it]
>> * This code is full of potential races,
> That's an interesting thing to point out. I can't count the number
> of tar files I've extracted that had world-writable directories in
> them. The races you mention exist in just ordinary tar, as well as
> your modified version, I assume.
Yes and no. The races the comment is referring to are specifically
races in the checking code. Since "ordinary tar" doesn't have that
checking, those races can't exist there.
There undoubtedly are plenty of other races in tar, though I do try to
minimize them. For example, extracted directories are mode 700 until
tar is done with them, whereupon it chmod()s them to their in-archive
mode. (This breaks certain unusual tarfiles, but such tarfiles are
usually no better with other tars and often much worse.) Of course,
this does nothing about a directory that's world-writeable in the
tarfile; it'll come out world-writeable when tar is done with it. I
don't intend to change that, though it does occur to me that it might
be good to have a pseudo-umask argument that is used to mask all the
in-archive modes to get the final modes.
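
[A minimal sketch of the two ideas in the paragraph above: creating
directories as mode 700 and chmod()ing them to their final mode only at
the end, and masking in-archive modes with a pseudo-umask. This is not
der Mouse's code. The names final_mode, extract_dirs and pseudo_umask
are hypothetical, and the input is assumed to be a list of
(relative path, in-archive mode) pairs with parents listed before
children.]

```python
import os
import stat


def final_mode(archive_mode: int, pseudo_umask: int) -> int:
    """Clear every permission bit that is set in pseudo_umask."""
    return stat.S_IMODE(archive_mode) & ~pseudo_umask


def extract_dirs(root, dirs, pseudo_umask=0o000):
    """Create directories privately, then apply their final modes.

    dirs is a list of (relative_path, archive_mode) pairs, parents
    before children (an assumption of this sketch).
    """
    pending = []
    for rel, mode in dirs:
        path = os.path.join(root, rel)
        # Owner-only while extraction is in progress.
        os.mkdir(path, 0o700)
        pending.append((path, final_mode(mode, pseudo_umask)))

    # File members would be extracted here, while the directories
    # are still mode 700.

    # Children first, so a restrictive parent mode cannot block the
    # chmod of something beneath it.
    for path, mode in reversed(pending):
        os.chmod(path, mode)
```

[With pseudo_umask=0o022, a directory stored as rwxrwxrwx comes out as
rwxr-xr-x; with the default of 0 it comes out world-writeable, which is
the behaviour described above.]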
>> Of course, on systems with symlink modes [the paranoia code] will
>> break for an archive that looks like
>> --x--x--x ./foo -> /etc
>> rwxrwxrwx ./foo/profile
>> because it won't be able to readlink() the extracted symlink.
> I assume this means that you're using readlink() to tell if it's a
> symlink or not.
Good guess, but that isn't why. It's because to tell whether it's safe
to extract ./foo/profile I not only have to tell whether ./foo is a
symlink, I have to tell where it points. I don't want to kvetch about
./foo -> ./bar/foo-dir
./foo/datafile
because that's all within the to-be-extracted-into subtree. But the
only ways to tell the difference between that and what I quoted above
are (a) to remember the ./foo link when it's extracted, for later, and
(b) to readlink() it. The former can get arbitrarily hard, as in
./blee/bloo -> ..
./blee/bloo/foo -> /somewhere/evil
./foo/file
It also requires a bunch of storage, and doesn't help if the symlink is
already present (perhaps through extracting a companion archive
earlier; either archive can be entirely innocent on its own). Thus, I
went for (b), though as I say it has problems with un-readlink()-able
symlinks (on systems where such things exist).
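
[A rough sketch of approach (b), written in Python for illustration
and not taken from the tar under discussion. Before extracting a
member, it walks the directories leading to it, readlink()s any symlink
it meets, and checks that the result is still inside the extraction
root. The name stays_inside, the max_links limit, and the choice to
refuse when readlink() fails are all assumptions of this sketch. Like
the checking code described above, it is racy: the filesystem can
change between the check and the extraction.]

```python
import os


def stays_inside(root: str, rel_path: str, max_links: int = 40) -> bool:
    """Return True if the parent directories of rel_path resolve to a
    location inside root, following symlinks by hand."""
    root = os.path.realpath(root)
    parts = [p for p in rel_path.split("/") if p not in ("", ".")]
    # Only the directories leading to the member are checked, unless
    # the member name itself is "..".
    pending = parts if parts and parts[-1] == ".." else parts[:-1]
    cur = root
    links = 0
    while pending:
        part = pending.pop(0)
        if part == "..":
            cur = os.path.dirname(cur)
            continue
        candidate = os.path.join(cur, part)
        if not os.path.islink(candidate):
            cur = candidate
            continue
        links += 1
        if links > max_links:
            return False  # probable symlink loop
        try:
            target = os.readlink(candidate)
        except OSError:
            return False  # un-readlink()-able symlink: refuse
        if os.path.isabs(target):
            cur = os.sep
        # Resolve the link target's components before the rest.
        pending = [p for p in target.split("/") if p not in ("", ".")] + pending
    return os.path.commonpath([root, cur]) == root
```

[For the examples above: with ./foo -> ./bar/foo-dir already on disk,
stays_inside(root, "./foo/datafile") is true; with ./foo -> /etc it is
false for "./foo/profile". Because it consults the filesystem rather
than remembered archive entries, it also catches a symlink left behind
by a companion archive.]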
der Mouse
mouse@rodents.montreal.qc.ca
7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B