[325] in arla-drinkers


probable malloc()/free() bug in arlad: hashtabadd()

daemon@ATHENA.MIT.EDU (Ray Jones)
Wed Oct 7 14:28:51 1998

Date: Wed, 7 Oct 1998 14:20:08 -0400
Message-Id: <199810071820.OAA04259@pixie.mit.edu>
From: Ray Jones <rjones@pobox.com>
To: arla-drinkers@stacken.kth.se
Subject: probable malloc()/free() bug in arlad: hashtabadd()
Reply-to: rjones@pobox.com
Sender: owner-arla-drinkers@stacken.kth.se
Precedence: bulk

(this is with arla-0.13)

arlad seems to be calling free() on memory not allocated via malloc.
here is the backtrace of the crash (modulo uninteresting
top-of-stack):

#3  0x8078198 in free (p=0x80f20b4) at wrapper.c:391
#4  0x806c8c4 in hashtabadd (htab=0x80e4720, ptr=0x80f1718) at hash.c:124
#5  0x8056a39 in get_info (e=0x80f1550, volname=0xbffff1e8 "536873038", ce=0x81325a8)
    at volcache.c:397
#6  0x8056bb0 in add_entry (volname=0xbffff1e8 "536873038", cell=0, ce=0x81325a8) at volcache.c:448
#7  0x8056ca2 in volcache_getbyid (id=536873038, cell=0, ce=0x81325a8) at volcache.c:510
#8  0x804e406 in fcache_recover_state () at fcache.c:780
#9  0x804e855 in fcache_init (alowvnodes=3000, ahighvnodes=4000, alowbytes=94371840, 
    ahighbytes=104857600, recover=1) at fcache.c:944
#10 0x804c0d4 in main (argc=2, argv=0xbffffdd8) at arla.c:816

the pointer passed to free() is not a pointer from malloc().

the relevant lines from get_info are:

393         if (e->entry.flags & VLF_RWEXISTS) {
394             e->num_ptr[RWVOL].cell = e->cell;
395             e->num_ptr[RWVOL].vol  = e->entry.volumeId[RWVOL];
396             e->num_ptr[RWVOL].ptr  = e;
397             hashtabadd (volidhashtab, (void *)&e->num_ptr[RWVOL]);


the relevant lines from hashtabadd are:

116     hashtabadd(Hashtab * htab, void *ptr)
117     {
118         Hashentry *h = _search(htab, ptr);
119         Hashentry **tabptr;
120
121         assert(htab && ptr);
122
123         if (h)
124             free((void *) h->ptr);
125         else {

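my best guess is that:
1- get_info adds some volid to the hashtable.  h->ptr in the new hash
   entry points to the interior of 'e' (see line 397 above).  i don't
   know, but i suspect that 'e' was created by malloc().
2- get_info adds the same volid to the hashtable again.  line 124
   calls free() on the interior pointer from the previous step.

the addresses in the backtrace fit this: ptr=0x80f1718 in frame #4 is
0x1c8 bytes past e=0x80f1550 in frame #5, and the pointer passed to
free(), 0x80f20b4, is the same 0x1c8 bytes past 0x80f1eec, which is
the entry already in the table (see the gdb output below).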
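
the interior-pointer part of the guess above can be shown in
isolation.  this is only a sketch, not arla code: struct vol_entry,
struct num_ptr_sketch and interior_offset() are made-up names, and the
layout is just loosely modelled on the gdb output.  it computes how far
the address that would go into the hashtable sits from the start of
the malloc()ed block.  any nonzero distance means free() must not be
given that address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* hypothetical, cut-down stand-ins for the arla structures */
struct vol_entry;

struct num_ptr_sketch {
    int cell;
    unsigned vol;
    struct vol_entry *ptr;      /* back pointer to the owning entry */
};

struct vol_entry {
    char name[65];
    struct num_ptr_sketch num_ptr[3];
};

/*
 * distance in bytes from the start of the malloc()ed block to the
 * address that would be stored in the hash table for slot `which`.
 * returns -1 if malloc() fails.
 */
ptrdiff_t
interior_offset(int which)
{
    struct vol_entry *e;
    ptrdiff_t off;

    assert(which >= 0 && which < 3);

    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;

    /* this is the kind of pointer line 397 hands to hashtabadd() */
    off = (char *)&e->num_ptr[which] - (char *)e;

    free(e);                    /* only the base pointer may be freed */
    return off;
}
```

with this layout interior_offset(0) equals offsetof(struct vol_entry,
num_ptr), which is greater than zero, so the stored pointer can never
be the value malloc() returned.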

here's some possibly relevant information about the volid being added
to the hashtable (from within hashtabadd()), and the one that was
already there:

(gdb) p *((struct num_ptr *) ptr)->ptr
$15 = {entry = {name = "home.smith", '\000' <repeats 54 times>,
volumeType = 0, nServers = 1, serverNumber = {-2098337235, 0, 0, 0, 0,
0, 0, 0}, serverPartition = {5, 0, 0, 0, 0, 0, 0, 0}, serverFlags =
{4, 0, 0, 0, 0, 0, 0, 0}, volumeId = {536873037, 536873038,
536873039}, cloneId = 0, flags = 20480}, volsync = {spare1 = 0, spare2
= 0, spare3 = 0, spare4 = 0, spare5 = 0, spare6 = 0}, cell = 0,
refcount = 0, flags = {validp = 0}, name_ptr = {{cell = 0, name =
'\000' <repeats 64 times>, ptr = 0x0}, {cell = 0, name = '\000'
<repeats 64 times>, ptr = 0x0}, {cell = 0, name = '\000' <repeats 64
times>, ptr = 0x0}}, num_ptr = {{cell = 0, vol = 536873037, ptr =
0x80f1550}, {cell = 0, vol = 0, ptr = 0x0}, {cell = 0, vol = 0, ptr =
0x0}}}

(gdb) p *((struct num_ptr *) h->ptr)->ptr
$16 = {entry = {name = "home.smith", '\000' <repeats 54 times>,
volumeType = 0, nServers = 1, serverNumber = {-2098337235, 0, 0, 0, 0,
0, 0, 0}, serverPartition = {5, 0, 0, 0, 0, 0, 0, 0}, serverFlags =
{4, 0, 0, 0, 0, 0, 0, 0}, volumeId = {536873037, 536873038,
536873039}, cloneId = 0, flags = 20480}, volsync = {spare1 = 0, spare2
= 0, spare3 = 0, spare4 = 0, spare5 = 0, spare6 = 0}, cell = 0,
refcount = 1, flags = {validp = 1}, name_ptr = {{cell = 0, name =
"home.smith", '\000' <repeats 54 times>, ptr = 0x80f1eec}, {cell = 0,
name = '\000' <repeats 64 times>, ptr = 0x0}, {cell = 0, name =
"home.smith.backup", '\000' <repeats 47 times>, ptr = 0x80f1eec}},
num_ptr = {{ cell = 0, vol = 536873037, ptr = 0x80f1eec}, {cell = 0,
vol = 0, ptr = 0x0}, {cell = 0, vol = 536873039, ptr = 0x80f1eec}}}

i don't have a suggested fix at this time.  perhaps hashtabadd should
take a 'freep' argument, recorded in the hashentry, that says whether
the value should be freed when the hashentry is replaced or deleted.
(this would supersede the freep argument to hashtabdel.)
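
a rough sketch of what a per-entry freep flag could look like.  none
of this is the real arla hash.c interface: sk_tab, sk_entry, sk_add(),
sk_clear() and sk_demo() are invented for illustration, and the table
is keyed on a plain unsigned instead of arla's hash/compare callbacks.
the point is only that the add path frees the old value when, and only
when, the entry was marked as owned.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_BUCKETS 16

/* hypothetical entry: remembers whether the table owns `ptr` */
struct sk_entry {
    struct sk_entry *next;
    unsigned key;
    void *ptr;
    int freep;                  /* nonzero: free(ptr) on replace/delete */
};

struct sk_tab {
    struct sk_entry *bucket[SKETCH_BUCKETS];
    int nfreed;                 /* counts freed values, for the demo */
};

static struct sk_entry *
sk_search(struct sk_tab *t, unsigned key)
{
    struct sk_entry *h;

    for (h = t->bucket[key % SKETCH_BUCKETS]; h != NULL; h = h->next)
        if (h->key == key)
            return h;
    return NULL;
}

/* add or replace; returns 0 on success, -1 if out of memory */
int
sk_add(struct sk_tab *t, unsigned key, void *ptr, int freep)
{
    struct sk_entry *h;

    assert(t && ptr);

    h = sk_search(t, key);
    if (h != NULL) {
        /* free the old value only if the table owns it */
        if (h->freep && h->ptr != ptr) {
            free(h->ptr);
            t->nfreed++;
        }
    } else {
        h = malloc(sizeof(*h));
        if (h == NULL)
            return -1;
        h->key = key;
        h->next = t->bucket[key % SKETCH_BUCKETS];
        t->bucket[key % SKETCH_BUCKETS] = h;
    }
    h->ptr = ptr;
    h->freep = freep;
    return 0;
}

/* drop every entry, freeing values the table owns */
void
sk_clear(struct sk_tab *t)
{
    int i;

    for (i = 0; i < SKETCH_BUCKETS; i++) {
        struct sk_entry *h = t->bucket[i];

        while (h != NULL) {
            struct sk_entry *next = h->next;

            if (h->freep) {
                free(h->ptr);
                t->nfreed++;
            }
            free(h);
            h = next;
        }
        t->bucket[i] = NULL;
    }
}

/* usage: returns how many values the table freed, or -1 on failure */
int
sk_demo(void)
{
    struct sk_tab t = {0};
    int inside[2] = {0, 0};     /* stands in for fields inside 'e' */
    int *owned = malloc(sizeof(*owned));

    if (owned == NULL)
        return -1;
    sk_add(&t, 7, &inside[0], 0);   /* interior pointer, not owned */
    sk_add(&t, 7, &inside[1], 0);   /* replace: nothing is freed */
    sk_add(&t, 9, owned, 1);        /* heap pointer, table owns it */
    sk_add(&t, 9, &inside[0], 0);   /* replace: 'owned' is freed */
    sk_clear(&t);
    return t.nfreed;
}
```

in this sketch the volcache caller would pass freep=0 for the
&e->num_ptr[...] pointers, so adding the same volid twice replaces the
entry without touching free().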

btw, this is a linux 2.0.35, glibc system.  i was running "arlad -t"
with the default configs.  if there is something i can do to provide
more info, let me know.

ray jones
