Possible bug: Missing a fsync() or msync() call after creating MANIFEST-000001 #189

Open
cmumford opened this issue Sep 9, 2014 · 3 comments

cmumford commented Sep 9, 2014

Original issue 183 created by madthanu on 2013-06-30T21:26:50.000Z:

This is about the scenario where a power crash happens while a database is being created. The bug is triggered only if the crash happens within a narrow time interval, and only on certain filesystems (e.g., ext4). Furthermore, the bug does not actually corrupt any data; it only causes an IO error to be reported on a database that was still being created (i.e., is empty).

So I'm not sure this behavior is "wrong". Please ignore if you already knew about this behavior.

What steps will reproduce the problem?

  1. Use a Fedora/Ubuntu machine. Create a new leveldb database on an ext4 partition that no other process is writing to.
  2. During the creation, use some trick to crash the machine soon after the msync() corresponding to "MANIFEST-000002" happens. Specifically, [A] the crash should happen before any sync-like call after rename("000002.dbtmp"), and [B] the crash should happen within roughly a few seconds. A probable trick would be to add a sleep() after msync("MANIFEST-000002"), and manually pull the plug as soon as the sleep() is triggered.
  3. Reboot the machine. The database would now have a CURRENT file pointing to MANIFEST-000001, but MANIFEST-000001 would be empty (behavior would be slightly different depending on the filesystem). Try opening the database again with leveldb.
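
The post-crash state described in step 3 can be checked without opening the database. The following is a minimal sketch, not part of leveldb: `check_current_manifest` and `simulate_post_crash_state` are hypothetical helper names, and the only thing assumed about the on-disk layout is that CURRENT holds the manifest file name followed by a newline.

```python
import os


def check_current_manifest(db_dir):
    """Return (manifest_name, status); status is 'ok', 'missing', or 'empty'."""
    # CURRENT is assumed to contain the manifest file name plus a newline.
    with open(os.path.join(db_dir, "CURRENT"), "r") as f:
        manifest_name = f.read().strip()
    manifest_path = os.path.join(db_dir, manifest_name)
    if not os.path.exists(manifest_path):
        return manifest_name, "missing"
    if os.path.getsize(manifest_path) == 0:
        # This is the state reported in step 3: the name was persisted,
        # but the manifest contents never reached the disk.
        return manifest_name, "empty"
    return manifest_name, "ok"


def simulate_post_crash_state(db_dir):
    """Hypothetical helper: recreate the reported state without a real crash."""
    with open(os.path.join(db_dir, "CURRENT"), "w") as f:
        f.write("MANIFEST-000001\n")
    # Create a zero-length manifest file.
    open(os.path.join(db_dir, "MANIFEST-000001"), "wb").close()
```

Running `check_current_manifest` on the database directory after the reboot would distinguish an empty manifest from a missing one, which may help when comparing behavior across filesystems.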

What is the expected output? What do you see instead?
Expected: leveldb just opens the (empty) database and continues working. Instead: leveldb reports an IO Error.
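
For the database to open cleanly after a crash, the manifest contents need to be durable before CURRENT names that manifest. The following is a sketch of the general write, fsync, rename, fsync-the-directory pattern under that assumption. It is not leveldb's actual code: `write_file_durably` and `publish_manifest` are hypothetical names, the `.tmp` suffix is illustrative, and fsync on a directory descriptor is assumed to behave as it does on Linux.

```python
import os


def write_file_durably(dir_path, final_name, data):
    """Write data to dir_path/final_name so that it survives a power loss."""
    tmp_path = os.path.join(dir_path, final_name + ".tmp")
    final_path = os.path.join(dir_path, final_name)

    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # flush file contents before the name becomes visible
    finally:
        os.close(fd)

    os.replace(tmp_path, final_path)  # atomic rename on POSIX filesystems

    # Flush the directory entry so the rename itself is durable (Linux).
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return final_path


def publish_manifest(db_dir, manifest_name, manifest_bytes):
    """Hypothetical ordering: make the manifest durable, then point CURRENT at it."""
    write_file_durably(db_dir, manifest_name, manifest_bytes)
    write_file_durably(db_dir, "CURRENT", (manifest_name + "\n").encode("ascii"))
```

The ordering in `publish_manifest` is the point of the sketch: CURRENT is only rewritten after the manifest it names has been synced, so a crash between the two steps leaves either the old CURRENT or a new CURRENT with a non-empty manifest.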

What version of the product are you using? On what operating system?
Leveldb-1.12.0. I used Ubuntu 12.04, although most Linux OSes should behave the same way. I suspect that filesystems other than ext4 behave the same way too.

Please provide any additional information below.
I'm more involved in filesystem research than in using leveldb, so I might be totally wrong. Do let me know if any additional tests from my side would be useful; I will be happy to help.


cmumford commented Sep 9, 2014

Comment #1 originally posted by madthanu on 2013-07-15T00:57:42.000Z:

Ding? Is the leveldb community interested in bugs like these at all?

I think I have discovered a couple of similar bugs, but would it be useful to you guys to have new issues created?

cmumford self-assigned this Sep 9, 2014

cmumford commented Sep 9, 2014

Comment #2 originally posted by sanjay@google.com on 2013-07-15T23:29:42.000Z:

Sorry for not responding. There aren't too many people spending a significant amount of time on leveldb, so bugs are on the back burner except for urgent things like corruption or crashes. Furthermore, this particular bug might be affected by some directory syncing work that is in progress.

I suspect that bugs like this one will probably also get looked at at some point when somebody has some time. So it would be helpful if you have other similar things you can point out.

Thanks.


cmumford commented Sep 9, 2014

Comment #3 originally posted by madthanu on 2013-07-15T23:37:18.000Z:

Sure! I'll report all the potential bugs.
