Issue 78: Question 5 : Relocation may waste time
Reported by postri...@gmail.com, Mar 14, 2009
When relocating a file, there is a case in which you will waste time.

Explanation:

Suppose inode 20 is to be relocated from tier 1 to tier 3.

If tier 3 is full, you still take new blocks from the file system,
which may hand out blocks from tier 1 itself.

So ultimately you end up not relocating the file at all and just
WASTING time.

You should check for free space on the tiers, shouldn't you?

Is this correct?

Can you avoid this? If yes, then please do it.
Mar 14, 2009
Project Member #1 imreckless@gmail.com
The current implementation has this limitation, but future versions will check
free space before relocating. Moreover, the admin should write policies sensibly:
I mean, I won't write a policy that lands my files in a tier which I know is already full.
We are planning to show tier usage statistics and raise notifications if a tier is full.

Mar 16, 2009
Project Member #2 sandeepksinha
Open an issue and mark it for 2.0
Apr 24, 2009
#3 rishi.b....@gmail.com
This will be dealt with when coding resumes; putting the issue on hold.
Status: On-Hold
Jul 13, 2009
Project Member #4 sandeepksinha
This would be easier if we start maintaining the stats for tiers in the SAM table.
I think this would be worthwhile, as the information can be used by the GUI as well.
Also, the user can get this information on the fly through /sys or an ioctl.

Rishi, what do you think the challenges could be?
One of them could be: when do you intend to update this information? On every block
allocation request, or every X seconds?
I think being 100% correct is not always necessary, and updating periodically
should be absolutely fine. But at the same time, heavy I/O may eat up the whole tier
in a couple of seconds.

But that's acceptable, as long as we are consistent.
What's your take on this?
Jul 14, 2009
#5 rishi.b....@gmail.com
Don't touch the SAM table. It's very simple and very effective.

Maintain a table somewhere in the kernel which will be persistent across reboots.

Reserve a block for it or something similar.

Whenever the ext2_get_block() function allocates a block, it knows the tier
from which it is allocating the block.

ext2_get_block() can be modified to update the table.

The table will maintain the tier specific data like:
blocks present
blocks free
blocks allocated
device number of tier
and other such data
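
As a rough sketch of what one entry of such a table could look like (the struct name, field names and helper are hypothetical and not taken from the OHSM sources), the free count can be derived from the other two counters so that the three values never disagree:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-tier record; names are illustrative, not OHSM's. */
struct tier_stats {
    uint64_t blocks_present;   /* total blocks that belong to the tier */
    uint64_t blocks_allocated; /* blocks currently in use */
    uint32_t dev;              /* device number backing the tier */
};

/* Free blocks are computed rather than stored, so they cannot drift. */
uint64_t tier_blocks_free(const struct tier_stats *t)
{
    assert(t->blocks_allocated <= t->blocks_present);
    return t->blocks_present - t->blocks_allocated;
}
```

For a tier with 30 blocks of which 10 are allocated, tier_blocks_free() returns 20.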

Using OHSM we can create tiers on the fly. Change the entries in device.xml and
the tier configuration changes in the system.

This feature has to be taken into account when implementing the above method of
maintaining tier-specific data.

Every time a new tier device configuration enters the system (which is a RARE case)
we need to make suitable changes to the table inside the kernel, which is a trivial
task. We only need to do some arithmetic calculations.


Another way of keeping the table is to hold device-specific information rather
than tier-specific information.

I mean maintain information like: device /dev/sda1 is on tier 3 and has 10 blocks in
total, of which 3 blocks are allocated

rather than maintaining:
tier 1 has 3 devices, namely (or numbered) /dev/sda1, /dev/sda2 and /dev/sda3, and has
30 blocks in total, of which 10 blocks are allocated.

Either way you have the same data, but the device-level form gives additional knowledge.
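
A small sketch of why the device-level records lose nothing: the tier-level totals can always be recomputed by summing over the devices of that tier. The struct and function names here are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device record. */
struct dev_stats {
    const char *name;          /* e.g. "/dev/sda1" */
    unsigned tier;             /* tier the device belongs to */
    uint64_t blocks_total;
    uint64_t blocks_allocated;
};

/* Sum the allocated blocks of every device that belongs to `tier`. */
uint64_t tier_allocated(const struct dev_stats *devs, size_t n, unsigned tier)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        if (devs[i].tier == tier)
            sum += devs[i].blocks_allocated;
    return sum;
}

/* Sum the total blocks of every device that belongs to `tier`. */
uint64_t tier_total(const struct dev_stats *devs, size_t n, unsigned tier)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        if (devs[i].tier == tier)
            sum += devs[i].blocks_total;
    return sum;
}
```

The reverse is not possible: from tier totals alone you cannot tell which device is filling up.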



Jul 14, 2009
#6 rishi.b....@gmail.com
Another way of providing the tier-specific information is to write a file system
scanner which reads all the group descriptors of the file system and builds a table
on the fly.

This scanning will take a lot of time because of the disk operations involved.
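
A minimal user-space sketch of the scanning idea, assuming the group descriptors are already in memory and that each tier covers a contiguous, inclusive range of block groups. `bg_free_blocks_count` is the name of the real ext2 group descriptor field; the structs and the function are simplified stand-ins, not kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the ext2 group descriptor. */
struct group_desc {
    uint16_t bg_free_blocks_count;
};

/* Hypothetical tier layout: an inclusive range of block groups. */
struct tier_range {
    uint32_t first_group;
    uint32_t last_group;
};

/* Add up the free blocks of all groups that fall inside the tier. */
uint64_t scan_tier_free(const struct group_desc *gd, size_t ngroups,
                        struct tier_range r)
{
    uint64_t free_blocks = 0;
    for (uint32_t g = r.first_group; g <= r.last_group && g < ngroups; g++)
        free_blocks += gd[g].bg_free_blocks_count;
    return free_blocks;
}
```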


Jul 14, 2009
#7 rishi.b....@gmail.com
 Issue 80  has been merged into this issue.
Jul 14, 2009
Project Member #8 sandeepksinha
Hi Rishi,

Of the three suggestions that you had,
1. Changing ext2_get_block()
2. Maintaining info at device level
3. On-the-fly scanner

I would go for the third one. Though there is a chance of this data being stale
by a couple of milliseconds or seconds, I would still go for the third one.

No matter which way we implement it, if the file system is being written to
heavily, the data will be stale to some extent.
So, we should keep things as simple as possible.

>Either way you have the same data, but the device-level form gives additional knowledge.

I think when we say the information is per tier, we should focus on the
granularity of the tier and not the devices. Right?


Tier No:
Devices in the tier: Can have One or Many
Total blocks present:
Total blocks allocated:
Total blocks free:
>>device number of tier
?? What do you mean by this
>>and other such data
What else ???

  
Jul 14, 2009
Project Member #9 imreckless@gmail.com
When we allocate blocks for a file, there is a chance of it being relocated
to the same tier, or it may end up spanning two tiers.
I would also suggest going for a scanner to check space availability.
The scanner won't take much time: it does not read the entire disk, only the group
descriptors, which are at fixed offsets since the block group size is static.
This can be done at the time of enabling. Instead of scanning each time, just
keep this data in core once (number of free blocks, etc.); whenever a block is
allocated, decrement the count, and increment it when a block is freed. This removes
the need to scan the file system each time. Just check this value to get the free
space available. Maintain this info per tier. I don't think a check plus a decrement
or increment operation will hurt performance.
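
The in-core bookkeeping described here could look roughly like the following. All names are hypothetical, the array would be filled once from the scan at enable time, and locking is left out; a kernel version would have to protect the counters against concurrent allocations.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TIERS 8  /* hypothetical limit */

/* Free blocks per tier, seeded once when OHSM is enabled. */
uint64_t tier_free_blocks[MAX_TIERS];

/* Seed the counter for one tier from the enable-time scan. */
void tier_set_free(unsigned tier, uint64_t free_blocks)
{
    assert(tier < MAX_TIERS);
    tier_free_blocks[tier] = free_blocks;
}

/* Called when `count` blocks are allocated from `tier`. */
void tier_on_alloc(unsigned tier, uint64_t count)
{
    assert(tier < MAX_TIERS);
    assert(tier_free_blocks[tier] >= count);
    tier_free_blocks[tier] -= count;
}

/* Called when `count` blocks are returned to `tier`. */
void tier_on_free(unsigned tier, uint64_t count)
{
    assert(tier < MAX_TIERS);
    tier_free_blocks[tier] += count;
}
```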

Jul 14, 2009
#10 rishi.b....@gmail.com
One more way to stop "relocating to the same tier"

I think when we get the block number from the ext2_get_block() function, we can easily
work out the block group number as well.

Once we have the block group number, we can derive the tier from it.

If the tier is not the desired tier, we stop relocating the file.
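
A sketch of that mapping. The group calculation follows the usual ext2 formula, (block - first data block) / blocks per group; the group-to-tier table and all names are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map entry: block groups first..last belong to `tier`. */
struct tier_range {
    int tier;
    uint32_t first_group;
    uint32_t last_group;   /* inclusive */
};

/* ext2-style block number to block group number. */
uint32_t block_to_group(uint64_t block, uint32_t first_data_block,
                        uint32_t blocks_per_group)
{
    return (uint32_t)((block - first_data_block) / blocks_per_group);
}

/* Return the tier that owns `group`, or -1 if no range matches. */
int group_to_tier(const struct tier_range *map, size_t n, uint32_t group)
{
    for (size_t i = 0; i < n; i++)
        if (group >= map[i].first_group && group <= map[i].last_group)
            return map[i].tier;
    return -1;
}
```

The relocation loop would compare the result of group_to_tier() with the destination tier and stop on a mismatch.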
Jul 15, 2009
#11 rishi.b....@gmail.com
I proposed the on-the-fly scanner as an option for sending the tier information to user space.

For relocation, the questions are:

When will you run the on-the-fly scanner?

And how many times will you run it during the relocation process?

A large file of 20 GB can change the whole picture of the underlying tiers, so the
on-the-fly scanner is not suited for this situation.

The method mentioned above (comment 10) will be more effective and will ensure 100%
accuracy.


Maintaining persistent tables will also help in providing information like 80%
full, 90% full, etc.

Moreover, do we have to add priority-based allocation as a feature?

If priority-based allocation and relocation is enabled, then we don't need to check
how full or empty a tier is.

Jul 15, 2009
Project Member #12 imreckless@gmail.com
Maybe you did not read my earlier post.

When will you run the on-the-fly scanner?
I said run it once during enabling,
to get the number of free blocks or the size for each tier.

How many times will you run it?
Only once during enabling. Once we have this per-tier information,
we can update it on the fly:
when we allocate a block, decrement the free blocks or size of the tier,
and vice versa when freeing blocks.
And when you want to check if space is available, check this table.

The method mentioned above in comment 11
will end up with a file spanning two tiers.
And that situation is already handled: ranged block allocation
does exactly that, it won't allocate in other tiers. Only for the
sake of data consistency did we provide the whole BG range of the fs at the end
of each entry in the SAM table.

Prioritizing tiers also requires you to scan for free space,
and relocate only if space is available.




Jul 15, 2009
#13 rishi.b....@gmail.com
So the algorithm is like this:

Whenever OHSM is enabled, fs_bg_scanner() will run and build a table or
update one of the tables inside the OHSM module.

Whenever ohsm_ext2_get_blocks() allocates blocks, it will refer to this table and also
update it with the blocks it has allocated.

The table will be cleared once OHSM is disabled.
Jul 15, 2009
Project Member #14 imreckless@gmail.com
Yes, something like this.

Whenever ohsm_ext2_new_block() is called we update the table,
and whenever ext2_free_blocks() is called we update the table.

And before relocating a file, just check this table for free space
in a particular tier.
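
The pre-relocation check could be as small as the sketch below. It assumes a per-tier free-block array like the one discussed above and a known block count for the file; the enum and function names are invented for the example.

```c
#include <stdint.h>

/* Hypothetical outcome of the check done before relocating a file. */
enum reloc_verdict {
    RELOC_OK,
    RELOC_SAME_TIER,  /* source and destination are the same tier */
    RELOC_NO_SPACE    /* destination tier cannot hold the file */
};

/* tier_free[] holds the current free-block count of each tier. */
enum reloc_verdict reloc_precheck(const uint64_t *tier_free,
                                  unsigned src_tier, unsigned dst_tier,
                                  uint64_t file_blocks)
{
    if (src_tier == dst_tier)
        return RELOC_SAME_TIER;
    if (tier_free[dst_tier] < file_blocks)
        return RELOC_NO_SPACE;
    return RELOC_OK;
}
```

For the case in the original report (tier 1 to a full tier 3), the check returns RELOC_NO_SPACE and no blocks are copied.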
Jul 17, 2009
#15 rishi.b....@gmail.com
So do we plan to fix this in target release 2.0.0?

Can SKS comment on this?
Status: Review-Req
Owner: sandeepksinha
Jul 17, 2009
Project Member #16 sandeepksinha
We would like to do this independently of the file system. That's always a better approach.
 
No, let's target this for v1.2. We have enough bandwidth right now to get this in.
Labels: Target-Release1.2