[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-kernel
Subject:    Re: I found a synchronization problem in mm/vmalloc.c
From:       Nick Piggin <npiggin () suse ! de>
Date:       2010-01-13 4:55:38
Message-ID: 20100113045538.GA3901 () nick
[Download RAW message or body]

On Tue, Jan 12, 2010 at 06:32:09PM +0900, Yongseok Koh wrote:
> Sorry, Mr. Morton.
> 
> Even though it is somewhat late, I am doing cc the mailing list.
> 
> Thanks.
> 
> -----Original Message-----
> 
> On Thu, 7 Jan 2010 20:22:30 +0900
> "Yongseok Koh" <yongseok.koh@samsung.com> wrote:
> 
> > Dear all,
> > 
> > I___m Yongseok Koh in Korea.
> > 
> 
> Thanks for the report.
> 
> Please do cc a mailing list when reporting bugs so that everyone else knows
> what's going on.
> 
> > 
> > I just got a new message in linux-2.6.28.10 (plz refer to the below)
> > 
> > And, one of my colleagues found that there is a synchronization 
> > problem in mm/vmalloc.c
> > 
> >  
> > 
> > In free_unmap_area_noflush(), va->flags is marked as VM_LAZY_FREE 
> > first, and then vmap_lazy_nr is increased atomically.
> > 
> > But, in __purge_vmap_area_lazy(), while traversing of vmap_are_list, 
> > nr is counted by checking VM_LAZY_FREE is set to va->flags.
> > 
> > After counting the variable nr, kernel reads vmap_lazy_nr atomically 
> > and checks a BUG_ON condition whether nr is greater than vmap_lazy_nr.
> > 
> >  
> > 
> > The problem is that, if interrupted right after marking VM_LAZY_FREE, 
> > increment of vmap_lazy_nr can be delayed.
> > 
> > Consequently, BUG_ON condition can be met because nr is counted more 
> > than vmap_lazy_nr.
> > 
> >  
> > 
> > What I mentioned is highly probable when vmalloc/vfree are called 
> > frequently.
> > 
> > And my colleagues have verified this scenario by adding delay between 
> > marking VM_LAZY_FREE and increasing vmap_lazy_nr in 
> > free_unmap_area_noflush().
> > 
> >  
> > 
> > Am I right ?
> > 
> 
> Looks plausible to me and as far as I can tell, current code has the same
> issue.

Yes, I think it's a good catch.

 
> Wakey wakey, Nick!  What makes that BUG_ON() safe?  Not purge_lock afacit?

No I think it is a bug. I would say that we can just get rid of the BUG_ON
now. atomic_t is signed, so it should be OK if it momentarily goes negative
(and anyway it's only used in a heuristic).

So, thanks for the report. Would you care to send a patch, or propose
another way to fix the problem?

Thanks,
Nick

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic