My favorites | Sign in
Project Logo
                
New issue | Search
for
| Advanced search | Search tips
Issue 104: Develop a patch to improve the performance of restoring courses
5 people starred this issue and may be notified of changes. Back to list
Status:  Completed
Owner:  talktodan
Closed:  Jan 2008
Code
ClaimedBy-battlef...rkingdom


Sign in to add a comment
 
Reported by talktodan, Jan 17, 2008
In some moodle testing, I discovered that some parts of the backup and restore process can 
cause a huge amount of database queries in certain scenarios. ( 
http://tracker.moodle.org/browse/MDL-12594 ).

Vast amounts of database queries are one of the major bottlenecks of current moodle 
performance.

Your task is to examine the restore_create_users() function and develop a patch to reduce the 
number of database queries which are performed. This patch can then be used by Moodle 
developers to discover the problem areas and  improve the performance of this function for 
future versions of moodle. This patch should try and optimise the number of database queries 
without creating bottleknecks in other areas (e.g. memory)

For further reference, see this forum thread from Martin Lanhoff which details some ideas on 
optimising moodle database queries based on some recent work hes done to improve moodle 
performance by reducing database queries:  http://moodle.org/mod/forum/discuss.php?
d=79455

Comment 1 by fosterh, Jan 17, 2008
Estimated time: 1 week
Status: Open
Labels: -Type-Defect -Priority-Medium Code
Comment 2 by battleforkingdom, Jan 21, 2008
I claim this Task.
Comment 3 by fosterh, Jan 21, 2008
Hi Pat,

You are welcome to claim this task, if you wish :-) However, if you would prefer to
work on another selenium task, please claim  issue116  (newly added!) and leave a
comment here saying so.

Helen
Comment 4 by battleforkingdom, Jan 21, 2008
Hi Helen,

Thank you very much for notifying me, Yes I want to claim  issue 116  instead. But this
task is also challenging. I think that I'm experienced with selenium so It'd better
to claim that task instead. and I may have time left to claim this task too. (If I
post the text "I claim this task." before January 22, 2008, 12:00 AM Pacific Time /
08:00 UTC but you label it as Claimed after that, will I get the task? Thanks)

Pat
Comment 5 by fosterh, Jan 21, 2008
Hi Pat,

To claim this task you need to have  issue116  marked as completed first, then add the
comment "I claim this task" here before the deadline.

You can still claim this task after the deadline, though if you do so, your work
won't qualify for any prizes.

Helen
Comment 6 by battleforkingdom, Jan 21, 2008
I claim this Task.
Comment 7 by talktodan, Jan 21, 2008
Wow Cool! You have this task!
Labels: ClaimedBy-battleforkingdom DueDate-20080127
Comment 8 by talktodan, Jan 21, 2008
(No comment was entered for this change.)
Labels: -DueDate-20080127 DueDate-20080128
Comment 9 by battleforkingdom, Jan 21, 2008
Hi Dan, Nice to meet you again

I think I need some clarification. restore_create_users() function in which files ?
Thanks

Comment 10 by talktodan, Jan 21, 2008
Hi Pat,

The function is in backup/restorelib.php. It is reproduced mostly easily using a
large backup file with a large amount of users.

I will generate an example backup file which triggers the problem for you shortly.
Comment 11 by talktodan, Jan 21, 2008
Hi Pat,
I've attached the expensive backup file to the bug at 
http://tracker.moodle.org/browse/MDL-12594

One of the cases which we really should try and optimise is:
- Upload the backup file once (and create all the users from the backup file).
- Upload the backup file again, then the existing users wont be re-created.

We should really try and optimise this process so that if a user already exists and
we aren't doing anything, we shouldn't be generating lots of database queries for
users we are not adding.

My biggest tip will be to try and get database queries outside of loops. I am pretty
sure we can do much better by doing checks outside of loops rather than inside loops.

Good Luck, and hope that helps

Dan
Comment 12 by fosterh, Jan 21, 2008
(No comment was entered for this change.)
Status: Claimed
Comment 13 by battleforkingdom, Jan 22, 2008
I've restored your backup and, Wow It took really long though I upload it the second
time and choose "Add to existing course" (esp. while restoring user and gradebook).
I've taken a look at restore_create_user. It might take some time.

By the way. What is the 'patch' you want me to create? an optimized version of
restore_create_user function ?

Thanks in advance
Comment 14 by talktodan, Jan 22, 2008
Hi Pat,

If you switch on performance debugging by adding these two lines in the config.php, you will get a footer 
printed displaying the number of database queries used. 
define('MDL_PERF' , true);
define('MDL_PERFDB' , true);

Yes, i'd like you to try and create an optimised version of restore_create_user. You may find it useful to add 
additional debugging to see where database queries are being used. Or turning on sql logging on your 
database server.

Here are some guidelines on how to create a patch:
http://docs.moodle.org/en/Development:How_to_create_a_patch

If you are not comfortable creating a patch I am happy to do that for you, if you just send me the before/after 
version of your function.

thanks,
Dan 
Comment 15 by battleforkingdom, Jan 23, 2008
Hi, just tried and the result is, a whopping 76000 queries!! (took 300 secs). I'm now
looking through the function and forum.
Comment 16 by battleforkingdom, Jan 24, 2008
I've put some special query in the starting and ending of restore_create_users() to
separate the queries in log created by restore_create_user() from the whole 70000
queries. The function processes these 2000 times (amount of user)

		      1 Query       SELECT * FROM mdl_backup_ids WHERE backup_code = '1201177218'
AND table_name = 'user' AND old_id = '1014' LIMIT 1
		      1 Query       DELETE FROM mdl_backup_ids 
                               WHERE backup_code = 1201177218 AND
                                     table_name = 'user' AND
                                     old_id = '1014'
		      1 Query       INSERT INTO mdl_backup_ids ( BACKUP_CODE, TABLE_NAME, OLD_ID,
NEW_ID, INFO ) VALUES ( 1201177218, 'user', 1014, 0, 'infile' )


And these for another 2000 times. 

		      1 Query       INSERT INTO mdl_backup_ids ( BACKUP_CODE, TABLE_NAME, OLD_ID,
NEW_ID, INFO ) VALUES ( 1201177218, 'user', 12, 10, 's:14:"exists,needed,";' )
		      1 Query       SELECT * FROM mdl_backup_ids WHERE backup_code = '1201177218'
AND table_name = 'user' AND old_id = '12' LIMIT 1
		      1 Query       SELECT * FROM mdl_user_preferences WHERE userid = '10' AND name
= 'auth_forcepasswordchange' LIMIT 1
		      1 Query       SELECT * FROM mdl_user_preferences WHERE userid = '10' AND name
= 'create_password' LIMIT 1
		      1 Query       SELECT * FROM mdl_backup_ids WHERE backup_code = '1201177218'
AND table_name = 'user' AND old_id = '102' LIMIT 1
		      1 Query       SELECT * FROM mdl_user WHERE username = 'firstname100' AND
mnethostid = '1' LIMIT 1
		      1 Query       DELETE FROM mdl_backup_ids
                               WHERE backup_code = 1201177218 AND
                                     table_name = 'user' AND
                                     old_id = '102'

It cost about 80% of all restore_create_users query. I'm now looking at function
structure to see if there's anyway to reduce it.
Comment 17 by battleforkingdom, Jan 24, 2008
The log above is executed under setting "Restore to existing course, adding new
data". Can I check that If there's already user id, then ignore it instead of delete
the record and insert again? In this case will save around 4000 queries.
Comment 18 by talktodan, Jan 24, 2008
(I'm afraid i'm not 100% familiar with all this all works currently). 

I think it should try and determine 'early' if the user already exists then know that we do not have to do any 
additional work to create the user and can move on to the next.

However you have to be careful to ensure the checks if a user is the same person should be the same as they 
are in the code now (its not just backupuserid = userid I shouldnt think). 

Great work for finding those excessively used queries. 

The user preference queries are intriguing - do you know what they are being used for? (I would imagine 
these would be generated from lines of code which call get_user_preference function ).
Comment 19 by battleforkingdom, Jan 26, 2008
I've made a mistake in the post above, the first 3 queries are right. But the second
path should be

1  SELECT * FROM mdl_backup_ids WHERE backup_code = '1201177218' AND table_name =
'user' AND old_id = '1015' LIMIT 1
2  SELECT * FROM mdl_user WHERE username = 'firstname1013' AND mnethostid = '1' LIMIT 1
3  DELETE FROM mdl_backup_ids
                        WHERE backup_code = 1201177218 AND
                              table_name = 'user' AND
                              old_id = '1015'
4  INSERT INTO mdl_backup_ids ( BACKUP_CODE, TABLE_NAME, OLD_ID, NEW_ID, INFO )
VALUES ( 1201177218, 'user', 1015, 26, 's:6:"exists";' )
5  SELECT * FROM mdl_backup_ids WHERE backup_code = '1201177218' AND table_name =
'user' AND old_id = '1015' LIMIT 1
6  DELETE FROM mdl_backup_ids
                        WHERE backup_code = 1201177218 AND
                              table_name = 'user' AND
                              old_id = '1015'
7  INSERT INTO mdl_backup_ids ( BACKUP_CODE, TABLE_NAME, OLD_ID, NEW_ID, INFO )
VALUES ( 1201177218, 'user', 1015, 26, 's:14:"exists,needed,";' )
8  SELECT * FROM mdl_backup_ids WHERE backup_code = '1201177218' AND table_name =
'user' AND old_id = '1015' LIMIT 1
9  SELECT * FROM mdl_user_preferences WHERE userid = '26' AND name =
'auth_forcepasswordchange' LIMIT 1
10 SELECT * FROM mdl_user_preferences WHERE userid = '26' AND name =
'create_password' LIMIT 1

Total cost is 13 queries per student user if there's already existing user.

It seems to be unable reduce the queries if that user wasn't exist. The total cost 
for new restore is 19 queries per student user, number may vary from it's role.

Why restoring course to an existing database (which already stored all the data I'm
going to insert) still cost 13 queries/user while restoring to a new,
never-contain-user-before Moodle cost 19 queries/user.

The last 3 queries in this post is in create_preference section, They're executed if
$create_preferences was set to true. But the only way to set it to false is, that
user must not be a teacher or student.

Currently my idea is, if the user is already exist (by checking the username), then
set $create_preferences to false. (This will save 3 queries/user)

But Why developer didn't code that first? (may be by intention, or by some reason I
don't know that I can't just ignore the create_preference step if the user is exist)

And to your question, the whole function doesn't call any get_user_preference (but
may be run by some function I don't know)

Thanks very much
Pat
Comment 20 by talktodan, Jan 27, 2008
You've done some great digging there Pat, thanks!

I think that the user preferences part might be an obscure bug. From the sql you've discovered it looks like 
we might force a users password to be changed by uploading a backup file which contains the preference to 
force them to change password. This isn't really what we want to do. The reason it was probably originally 
designed to do this was because there were some course related user preferences which we want to keep.

In any case, this is a really useful discovery and I think there needs to be some discussion between developers 
about how to handle that scenario as its not clear what the best solution is here.

In order to complete this task, it would be great if you could create a modified version of the function which 
doesn't create user preferences (and any other optimisations you may have found), attach it to the bug in the 
moodle tracker, with a short discussion of what you found and a link to this discussion here. We'll have a 
starting point and might be able to get rid of another bug!

thanks
Dan
Comment 21 by battleforkingdom, Jan 28, 2008
Hi, Thanks for your comment Dan!
I've more information here. From my previous post, 

5  SELECT * FROM mdl_backup_ids WHERE backup_code = '1201177218' AND table_name =
'user' AND old_id = '1015' LIMIT 1
6  DELETE FROM mdl_backup_ids
                        WHERE backup_code = 1201177218 AND
                              table_name = 'user' AND
                              old_id = '1015'
7  INSERT INTO mdl_backup_ids ( BACKUP_CODE, TABLE_NAME, OLD_ID, NEW_ID, INFO )
VALUES ( 1201177218, 'user', 1015, 26, 's:14:"exists,needed,";' )

Queries 5-7 is in create_role section (executed if $create_role is true), which I
think they're useless in this case as well as queries 8-10.

I think the reason why developers don't set $create_role and $create_preference to
false if there's already a user exist is, the user might has some special preferences
or roles assigned after the current database (and need to be inserted though it's
already exist).

What we need now is to figure out early that if the preferences/roles of the user in
database and in backup file are the same, then set $create_roles/$create_preferences
to false.

My modified function is just to set $create_role and $create_preference to false if
$create_user is false (line 2271 to 2275). But as I mentioned above, This is more
complicated than I thought. And this file shouldn't be used as the reason I mentioned.

I think I should do this task best, So I'm going to stay on this until the issue is
resolved. My modified file was posted here http://tracker.moodle.org/browse/MDL-12594.

I'd like to see the result of this, so It'd be really appreciated if you notify me
about the developer's discussion about this. My address is battleforkingdom [at]
gmail [dot] com

Thanks very much for an opportunity to do these tasks (including previous XMLDB
Documentation and this task is really challenging).
Pat
0 bytes Download
Comment 22 by talktodan, Jan 28, 2008
Thanks heaps Pat!

If you click on the 'Watch it' button on MDL-12594 you will be notified when developers have discussions 
about the issue so you don't need to be told about it from me ;) I've added you as a watcher there for you so 
you'll get notified as changes. Feel free to continue posting updates to that tracker item. I am a bit busy this 
week, but I hope to have a further look into this at the weekend and hopefully we could get some other 
developers to have a look.

Thanks for your contributions to the moodle community and the GHOP contest! And more  congratulations on successfully completing this task :-)
Status: Completed
Labels: -DueDate-20080128
Sign in to add a comment

Hosted by Google Code