weed-fs - issue #54

Bug - empty collections


Posted on Nov 23, 2013 by Massive Cat

When you assign with /dir/assign, it creates an empty collection on only one volume server.

Steps to reproduce:
- start 2 volume servers
- assign a key with /dir/assign (replication=000)
- upload something
- shut down the volume server with the assigned volumes
- /dir/assign again; the master keeps choosing the previously assigned volumes on the dead node no matter what
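For anyone trying to reproduce this, a minimal Go probe like the one below shows which volume server each assignment lands on. The master address and the response field names (fid, url, publicUrl) are assumptions based on the assign API used above; adjust to your setup.

// assign_probe.go: repeatedly call /dir/assign on the master and print which
// volume server each file id is assigned to. Master address and JSON field
// names are assumptions; adjust to your setup.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

type assignResult struct {
	Fid       string `json:"fid"`
	Url       string `json:"url"`
	PublicUrl string `json:"publicUrl"`
	Error     string `json:"error"`
}

func main() {
	master := "http://localhost:9333" // assumed master address
	for i := 0; i < 10; i++ {
		resp, err := http.Get(master + "/dir/assign?replication=000")
		if err != nil {
			fmt.Println("assign failed:", err)
			return
		}
		var r assignResult
		if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
			resp.Body.Close()
			fmt.Println("decode failed:", err)
			return
		}
		resp.Body.Close()
		fmt.Printf("fid=%s assigned to %s\n", r.Fid, r.Url)
	}
}

If the bug is present, every assignment keeps printing the URL of the dead volume server.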

I would like to have the old behavior when the collection is empty or not specified.

Comment #1

Posted on Nov 23, 2013 by Grumpy Cat

The bug description is different from what I am experiencing locally.

1) weed master
2) weed volume -port=8081 -dir=/tmp/1
3) weed volume -port=8082 -dir=/tmp/2

The /tmp/1 and /tmp/2 are empty folders. After the first few /dir/assign calls:

chris@clu-dt2:~$ ls -al /tmp/1
total 44
drwxrwxr-x  2 chris chris  4096 Nov 23 13:12 .
drwxrwxrwt 30 root  root  20480 Nov 23 13:12 ..
-rw-r--r--  1 chris chris     8 Nov 23 13:12 2.dat
-rw-r--r--  1 chris chris     0 Nov 23 13:12 2.idx
-rw-r--r--  1 chris chris     8 Nov 23 13:12 3.dat
-rw-r--r--  1 chris chris     0 Nov 23 13:12 3.idx
-rw-r--r--  1 chris chris     8 Nov 23 13:12 4.dat
-rw-r--r--  1 chris chris     0 Nov 23 13:12 4.idx
-rw-r--r--  1 chris chris     8 Nov 23 13:12 6.dat
-rw-r--r--  1 chris chris     0 Nov 23 13:12 6.idx
chris@clu-dt2:~$ ls -al /tmp/2
total 40
drwxrwxr-x  2 chris chris  4096 Nov 23 13:12 .
drwxrwxrwt 30 root  root  20480 Nov 23 13:15 ..
-rw-r--r--  1 chris chris     8 Nov 23 13:12 1.dat
-rw-r--r--  1 chris chris     0 Nov 23 13:12 1.idx
-rw-r--r--  1 chris chris     8 Nov 23 13:12 5.dat
-rw-r--r--  1 chris chris     0 Nov 23 13:12 5.idx
-rw-r--r--  1 chris chris     8 Nov 23 13:12 7.dat
-rw-r--r--  1 chris chris     0 Nov 23 13:12 7.idx

And if I shut down either volume server now, /dir/assign assigns to the other volume server.

Maybe you did not wait until the master concluded that the shut-down volume server was off and marked it offline?
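For context, the check being described is a heartbeat/grace-period liveness test: the master records when it last heard from each volume server and stops assigning to any server whose heartbeat is older than a grace period. A minimal sketch of that idea follows; the names (DataNode, gracePeriod, IsAlive) and the 30-second timeout are illustrative, not weed-fs's actual code.

// A minimal sketch of heartbeat-based liveness checking, with hypothetical
// names; the real master logic and timeout may differ.
package main

import (
	"fmt"
	"time"
)

type DataNode struct {
	Url      string
	LastSeen time.Time // updated whenever a heartbeat arrives
}

const gracePeriod = 30 * time.Second // assumed grace period

// IsAlive reports whether the node has heartbeated within the grace period.
func (n *DataNode) IsAlive(now time.Time) bool {
	return now.Sub(n.LastSeen) < gracePeriod
}

func main() {
	nodes := []*DataNode{
		{Url: "10.16.200.14:9341", LastSeen: time.Now()},
		{Url: "10.16.200.13:9341", LastSeen: time.Now().Add(-5 * time.Minute)},
	}
	for _, n := range nodes {
		fmt.Printf("%s alive=%v\n", n.Url, n.IsAlive(time.Now()))
	}
}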

Comment #2

Posted on Nov 25, 2013 by Massive Cat

{ "Topology": { "DataCenters": [ { "Free": 11, "Max": 16, "Racks": [ { "DataNodes": [ { "Free": 8, "Max": 8, "PublicUrl": "10.16.200.14:9341", "Url": "10.16.200.14:9341", "Volumes": 0 }, { "Free": 3, "Max": 8, "PublicUrl": "10.16.200.13:9341", "Url": "10.16.200.13:9341", "Volumes": 5 } ], "Free": 11, "Max": 16 } ] } ], "Free": 11, "Max": 16, "layouts": [ { "collection": "", "replication": "000", "writables": [ 3, 6 ] }, { "collection": "", "replication": "001", "writables": null } ] }, "Version": "0.45" }

There are no errors in the logs (debug=true).
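One way to keep an eye on this state is to decode the master's topology output (like the JSON above) and print each data node and each layout's writable volume ids. In the sketch below the endpoint path (/dir/status) and master address are assumptions; the field names match the output pasted above.

// topology_check.go: print each data node's volume count and each layout's
// writable volume ids from the master's topology output. Endpoint path and
// master address are assumptions.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

type status struct {
	Topology struct {
		DataCenters []struct {
			Racks []struct {
				DataNodes []struct {
					Url     string `json:"Url"`
					Volumes int    `json:"Volumes"`
				} `json:"DataNodes"`
			} `json:"Racks"`
		} `json:"DataCenters"`
		Layouts []struct {
			Collection  string `json:"collection"`
			Replication string `json:"replication"`
			Writables   []int  `json:"writables"`
		} `json:"layouts"`
	} `json:"Topology"`
	Version string `json:"Version"`
}

func main() {
	resp, err := http.Get("http://localhost:9333/dir/status") // assumed master address
	if err != nil {
		fmt.Println("status failed:", err)
		return
	}
	defer resp.Body.Close()
	var s status
	if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, dc := range s.Topology.DataCenters {
		for _, rack := range dc.Racks {
			for _, dn := range rack.DataNodes {
				fmt.Printf("node %s holds %d volumes\n", dn.Url, dn.Volumes)
			}
		}
	}
	for _, l := range s.Topology.Layouts {
		fmt.Printf("collection=%q replication=%s writables=%v\n", l.Collection, l.Replication, l.Writables)
	}
}

In the output above, volumes 3 and 6 are still listed as writable for replication 000 even though the node holding them is down.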

Comment #3

Posted on Nov 25, 2013 by Grumpy Cat

So that means the server 10.16.200.13:9341 is shut down, but the master did not notice it?

Comment #4

Posted on Nov 25, 2013 by Massive Cat

Yes, even after a couple of days.

Comment #5

Posted on Nov 25, 2013 by Grumpy Cat

Hmm, I am out of clues now... Does restarting the master help? Is it consistently repeatable?

Comment #6

Posted on Nov 25, 2013 by Massive Cat

Restarting the master did not help. However, after wiping all data on all volumes and restarting everything, it seems to work properly. No idea what causes this behavior. I will consider this issue closed for now.

Comment #7

Posted on Nov 25, 2013 by Grumpy Cat

If this happens again, hopefully there will be a setup I can reproduce locally. Please keep an eye on this.

Status: Fixed

Labels:
Type-Defect Priority-Medium