Why is querying the same data faster after reopening the db than after compaction? #98

Closed
cmumford opened this issue Sep 9, 2014 · 2 comments


cmumford commented Sep 9, 2014

Original issue 92 created by david-zf@163.com on 2012-05-26T02:32:37.000Z:

What steps will reproduce the problem?
1. Put 100,000,000 records in the db; the resulting db size is 2.8 GB.
2. Wait 2-3 minutes until the number of files stops changing (compaction should be complete), then query data (90 records scattered across the db). This takes 12 s. After closing and reopening the db, the same query takes 47 ms.
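
To compare the two timings under the same conditions, the query pass can be wrapped in a small timing helper. The following is only a sketch: `TimeLookupsMs` and its `get` callback are hypothetical names, and `get` stands in for whatever wrapper performs the actual database read and reports whether the key was found.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Times one pass over `keys`, calling `get(key)` once per key.
// `get` is a hypothetical stand-in for the real lookup (for example, a
// wrapper around the database read call) and returns true if the key exists.
// Returns the elapsed wall-clock time in milliseconds; if `found` is not
// null, it receives the number of keys that were found.
template <typename Lookup>
double TimeLookupsMs(const std::vector<std::string>& keys, Lookup get,
                     std::size_t* found = nullptr) {
  using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
  std::size_t hits = 0;
  const Clock::time_point start = Clock::now();
  for (const std::string& key : keys) {
    if (get(key)) ++hits;
  }
  const Clock::time_point stop = Clock::now();
  if (found != nullptr) *found = hits;
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Running the same key list through this helper once before closing the db and once after reopening it gives two numbers that can be compared directly.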

What is the expected output? What do you see instead?
The two query times should be the same, or differ only slightly.

What version of the product are you using? On what operating system?
LevelDB 1.4, Windows 7 64-bit

Please provide any additional information below.


cmumford commented Sep 9, 2014

Comment #1 originally posted by dhruba on 2012-05-26T04:53:28.000Z:

What is the value of numopenfiles in your test case? By default, leveldb keeps at most 1000 files open. Also, did you change the size of the block cache?

There are two possibilities:

  1. When the db is not restarted, the data is cached in the leveldb block cache. Once you restart the server, its cache is cold and it takes longer to query the same data.
  2. leveldb does buffered IO, so data could also be cached in the OS block cache. It is possible that when you restart the db process, it closes the files that were mmapped, which tells the kernel that the corresponding OS pages can be purged. Thus, it might take longer to read the same data after you restart the db process.
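
The warm versus cold cache effect described in the first possibility can be illustrated with a toy LRU cache. This is a minimal sketch, not LevelDB's cache implementation: `ToyBlockCache`, its integer block ids, and the placeholder block contents are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Toy LRU cache keyed by block id. It only models the idea that recently
// read blocks stay in memory up to a fixed capacity.
class ToyBlockCache {
 public:
  explicit ToyBlockCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ > 0);
  }

  // Returns true on a cache hit. On a miss, "loads" the block (here just a
  // placeholder string) and evicts the least recently used entry if full.
  bool Read(int block_id) {
    auto it = index_.find(block_id);
    if (it != index_.end()) {
      // Hit: move the entry to the front (most recently used position).
      lru_.splice(lru_.begin(), lru_, it->second);
      ++hits_;
      return true;
    }
    ++misses_;
    if (lru_.size() == capacity_) {
      index_.erase(lru_.back().first);
      lru_.pop_back();
    }
    lru_.emplace_front(block_id, "block-" + std::to_string(block_id));
    index_[block_id] = lru_.begin();
    return false;
  }

  std::size_t hits() const { return hits_; }
  std::size_t misses() const { return misses_; }

 private:
  using Entry = std::pair<int, std::string>;
  std::size_t capacity_;
  std::list<Entry> lru_;  // front = most recently used
  std::unordered_map<int, std::list<Entry>::iterator> index_;
  std::size_t hits_ = 0;
  std::size_t misses_ = 0;
};
```

In this model, constructing a fresh `ToyBlockCache` corresponds to restarting the process: the first `Read` of any block misses, and only repeated reads of a block that is still resident count as hits.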

@cmumford

Old question - assuming answered.

maochongxin pushed a commit to maochongxin/leveldb that referenced this issue Jul 21, 2022
move reporter internals in both headers and source