Thread synchronisation bug in BufferManager implementation #2

Closed
GoogleCodeExporter opened this issue Mar 13, 2015 · 5 comments

Comments

@GoogleCodeExporter
Collaborator

Encountered a failure in testCase4() - assertion failure. 

Original issue reported on code.google.com by d.majum...@gmail.com on 17 Dec 2006 at 2:21

@GoogleCodeExporter
Collaborator Author

The bug is in locatePage(). In the loop that checks whether the page is already in the
hash bucket, we need to check both whether the page is being read (frameIndex == -1)
and whether it is being written (writeInProgress == true). For writeInProgress, however,
the rule can be relaxed if the page is being accessed read-only (SHARED latch).

This bug caused some page updates to be lost, which is why the test failed. By
increasing the number of iterations and updates, I was able to reproduce the bug
consistently.
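
The check described above could look roughly like the following. This is only a minimal sketch, not the actual locatePage() code: the class LocatePageSketch, the LatchMode enum and the mustWait() helper are hypothetical names introduced for illustration; only frameIndex, writeInProgress and the SHARED latch mode come from the comment.

```java
public class LocatePageSketch {

    /** Hypothetical latch modes; only SHARED is mentioned in the issue. */
    public enum LatchMode { SHARED, EXCLUSIVE }

    /** Hypothetical, stripped-down buffer control block. */
    public static final class BCB {
        /** -1 means the page is still being read into a frame. */
        public int frameIndex = -1;
        /** True while the page is being written out. */
        public boolean writeInProgress = false;
    }

    /**
     * Returns true if a caller that found this BCB in the hash bucket
     * must wait before using the page.
     */
    public static boolean mustWait(BCB bcb, LatchMode mode) {
        if (bcb.frameIndex == -1) {
            // Page is being read in: nobody may use it yet.
            return true;
        }
        if (bcb.writeInProgress && mode != LatchMode.SHARED) {
            // Page is being written out: only read-only access may proceed.
            return true;
        }
        return false;
    }
}
```

In this sketch a caller would loop on mustWait(), waiting and re-checking the bucket, before it is allowed to latch the page.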

Original comment by d.majum...@gmail.com on 18 Dec 2006 at 12:12

  • Changed state: Fixed

@GoogleCodeExporter
Collaborator Author

Reopening this issue as there are still concurrency issues with the buffer manager
implementation. The current implementation is too complex and therefore hard to
keep bug-free. The code needs to be refactored to make it simpler.

Original comment by d.majum...@gmail.com on 31 Dec 2006 at 11:49

  • Changed state: Started

@GoogleCodeExporter
Collaborator Author

Made the thread synchronisation more conservative. Removed synchronisation of the
BCBs.

Original comment by d.majum...@gmail.com on 6 Jan 2007 at 12:51

  • Changed state: Fixed

@GoogleCodeExporter
Collaborator Author

Reopening the issue as a new concurrency bug has been found.
There is a gap between reading a page and its fixcount being set to 1, during which
the page is unprotected and therefore available to the BufferWriter for eviction.
This means that the page can be replaced right in the middle of a call to fix().

Original comment by d.majum...@gmail.com on 2 Nov 2007 at 12:52

  • Changed state: Accepted
  • Added labels: Priority-Critical
  • Removed labels: Priority-Medium

@GoogleCodeExporter
Collaborator Author

Defect fixed by ensuring that when a new BCB is allocated, its fixcount is always
set to 1 instead of 0.
The code that increments the fixcount has been made conditional: it does not
increment the fixcount of a newly allocated BCB.
This fix has also allowed the page latch to be released early, as originally
intended.
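
The idea behind the fix can be sketched as follows. This is a simplified illustration under stated assumptions, not the real BufferManager: the FixCountSketch class, its map-based page table, and the unfix() and isEvictable() methods are hypothetical, and the single synchronized fix() here hides the window that exists in the real code, where the page read happens between allocation and pinning. The point it shows is that a BCB is never visible with a fixcount of 0 before its first user has pinned it.

```java
import java.util.HashMap;
import java.util.Map;

public class FixCountSketch {

    /** Hypothetical, stripped-down buffer control block. */
    public static final class BCB {
        public final int pageId;
        public int fixcount;

        BCB(int pageId, int fixcount) {
            this.pageId = pageId;
            this.fixcount = fixcount;
        }
    }

    private final Map<Integer, BCB> pages = new HashMap<>();

    /** Pins the page, allocating a BCB for it if necessary. */
    public synchronized BCB fix(int pageId) {
        BCB bcb = pages.get(pageId);
        boolean newlyAllocated = false;
        if (bcb == null) {
            // A new BCB starts with fixcount 1, so it is pinned from birth.
            bcb = new BCB(pageId, 1);
            pages.put(pageId, bcb);
            newlyAllocated = true;
        }
        if (!newlyAllocated) {
            // Only pages that were already cached get an extra pin here.
            bcb.fixcount++;
        }
        return bcb;
    }

    /** Releases one pin on the page. */
    public synchronized void unfix(BCB bcb) {
        bcb.fixcount--;
    }

    /** A page may be evicted only when nobody has it pinned. */
    public synchronized boolean isEvictable(int pageId) {
        BCB bcb = pages.get(pageId);
        return bcb != null && bcb.fixcount == 0;
    }
}
```

With this shape, a writer thread that consults isEvictable() can never select a page that is in the middle of its first fix() call.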

Historical detail:
The bug initially manifested itself as a missed update. This led me to think
that the problem was with locking, and that somehow two threads were obtaining
access to the same page. Only recently, after I ran the tests on Solaris 10 on a
dual-core processor, did the failures start occurring elsewhere, indicating that
the page itself was somehow getting corrupted. This led to the discovery that,
because of a period of unprotected status, the page was being replaced even while
one thread thought it had the page pinned.

Original comment by d.majum...@gmail.com on 3 Nov 2007 at 11:17

  • Changed state: Fixed
