|
PerformanceTestAOLQueryLog
Performance Test - AOL Query Logs
The Data (AOL Query log)

Here's an example of what the AOL query log data looks like:

AnonID   Query                          ItemRank  ClickURL
1636218  www.airtime500.com             2         http://www.airtime500.com
2272416  theunorthodoxjew.blogspot.com  1         http://theunorthodoxjew.blogspot.com
172627   www.yahoolagins.com
2569723  www.homesforsale
1196769  zip codes                      1         http://www.usps.com
724416   propertytaxsales.com
30011    schwab learning                1         http://www.schwablearning.org
[...]

The row key used was the AnonID. The Query, ItemRank, and ClickURL columns were inserted into a table that was created with the following HQL command:

CREATE TABLE "query-log" ( Query, ItemRank, ClickURL );

The query log was sorted by timestamp. Each line of the query log could generate from 1 to 3 inserts, depending on how many column values were present. After the entire log was inserted, the table contained 75,274,825 cells. On average, the size of each row key was about 7 bytes and the size of each value inserted was 15 bytes.

Machine Profile
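As a rough illustration of the "1 to 3 inserts per line" rule described in The Data section, here is a minimal Python sketch. It is not Hypertable's loader: the function name `line_to_cells` is hypothetical, and it assumes tab-separated input with the field order AnonID, Query, ItemRank, ClickURL shown in the sample.

```python
from typing import List, Tuple

# Column names taken from the CREATE TABLE statement in the text.
COLUMNS = ("Query", "ItemRank", "ClickURL")


def line_to_cells(line: str) -> List[Tuple[str, str, str]]:
    """Turn one log line into (row_key, column, value) cells.

    Assumption: fields are tab-separated in the order
    AnonID, Query, ItemRank, ClickURL. Missing or empty
    fields produce no cell, so a line yields 1 to 3 cells.
    """
    fields = line.rstrip("\n").split("\t")
    row_key, values = fields[0], fields[1:]
    return [(row_key, col, val) for col, val in zip(COLUMNS, values) if val]


if __name__ == "__main__":
    # A line with a click produces three cells, one without produces one.
    print(line_to_cells("1196769\tzip codes\t1\thttp://www.usps.com"))
    print(line_to_cells("172627\twww.yahoolagins.com"))
```

Under these assumptions a query with no click contributes a single Query cell, which is why the cell count is lower than three times the number of log lines.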
Insert Test

NOTE: The Hypertable Range Servers write their commit log into HDFS; however, HDFS currently does not support a sync (or flush) operation. The insert rate may drop somewhat once sync is implemented and called after each commit log write.

Eight machines (motherlode001-motherlode008) were used to run both HDFS (version 0.14.4) and Hypertable. The Hypertable master and Hyperspace (Hypertable's Chubby equivalent) were run on motherlode001. HDFS was configured to use only three of the four available drives on each machine and was configured with 3-way replication.

The log was sorted by timestamp and split into 5 pieces (a.tsv, b.tsv, c.tsv, d.tsv, e.tsv). Two separate machines (motherlode000 and motherlode009) were used to run insert clients. The following HQL commands were started more-or-less simultaneously:

motherlode000:
  LOAD DATA INFILE ROW_KEY_COLUMN=AnonID 'a.tsv' INTO TABLE "query-log";
  LOAD DATA INFILE ROW_KEY_COLUMN=AnonID 'b.tsv' INTO TABLE "query-log";

motherlode009:
  LOAD DATA INFILE ROW_KEY_COLUMN=AnonID 'c.tsv' INTO TABLE "query-log";
  LOAD DATA INFILE ROW_KEY_COLUMN=AnonID 'd.tsv' INTO TABLE "query-log";

These four insert jobs completed with the following stats:

Elapsed time: 143.88 s
Avg value size: 15.25 bytes
Avg key size: 7.10 bytes
Throughput: 2125750.02 bytes/s
Total inserts: 14825279
Throughput: 103039.68 inserts/s

Elapsed time: 144.80 s
Avg value size: 15.26 bytes
Avg key size: 7.11 bytes
Throughput: 2163621.98 bytes/s
Total inserts: 15185349
Throughput: 104871.25 inserts/s

Elapsed time: 150.32 s
Avg value size: 15.20 bytes
Avg key size: 7.03 bytes
Throughput: 2080001.83 bytes/s
Total inserts: 15208310
Throughput: 101173.45 inserts/s

Elapsed time: 148.21 s
Avg value size: 15.22 bytes
Avg key size: 7.11 bytes
Throughput: 2095660.00 bytes/s
Total inserts: 15080926
Throughput: 101754.55 inserts/s

Aggregating the numbers yields 410,838.93 random inserts/s.

NOTE: The file 'e.tsv' was inserted afterwards, separately.

Query Test

The file 'dump-query-log.hql' contained the following lines:

select * from "query-log";
quit;

And here is the result of the timed execution:

$ time /data1/doug/hypertable/bin/hypertable --batch --timestamp-format=usecs < dump-query-log.hql > /dev/null

real 1m52.149s
user 1m30.203s
sys 0m22.025s

75,274,825 cells / 112.149 s ≈ 671,204 cells/s

Summary

The AOL query logs were inserted into an 8-node Hypertable cluster. The average size of each row key was ~7 bytes and each value was ~15 bytes. The insert rate (with 4 simultaneous insert processes) was approximately 410K inserts/s. The table was scanned at a rate of approximately 671K cells/s.

Update: as of 5 Jan '09, performance is about 60% better (on the same hardware, of course) than the figures reported here.
|
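The headline figures in the summary can be re-derived from the per-job stats and the `time` output. The short Python sketch below does only that arithmetic; the variable names are illustrative and every number is copied from the page.

```python
# Per-job stats copied from the four insert jobs:
# (total inserts, elapsed seconds, reported inserts/s, reported bytes/s)
jobs = [
    (14825279, 143.88, 103039.68, 2125750.02),
    (15185349, 144.80, 104871.25, 2163621.98),
    (15208310, 150.32, 101173.45, 2080001.83),
    (15080926, 148.21, 101754.55, 2095660.00),
]

# The jobs ran concurrently, so the aggregate rate is the sum of the
# per-job rates rather than total inserts divided by total time.
aggregate_inserts_per_s = sum(rate for _, _, rate, _ in jobs)
aggregate_bytes_per_s = sum(bps for _, _, _, bps in jobs)

# Full-table scan: cell count divided by the wall-clock ("real") time.
total_cells = 75_274_825
scan_seconds = 1 * 60 + 52.149
scan_cells_per_s = total_cells / scan_seconds

print(f"aggregate insert rate: {aggregate_inserts_per_s:,.2f} inserts/s")
print(f"aggregate insert throughput: {aggregate_bytes_per_s / 1e6:.2f} MB/s")
print(f"scan rate: {scan_cells_per_s:,.0f} cells/s")
```

Summing the per-job rates is only a fair aggregate because the four jobs overlapped almost completely (elapsed times between roughly 144 s and 150 s).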
Would it be possible to repeat these benchmarks using KFS as the file system?
What are the specs on the 8 data nodes? Are they also 4x7200RPM JBOD?
@cra... we've run the same benchmark on KFS. The numbers are not as good as with HDFS yet, but the KFS people are working on it.
@tlipcon: Yes, they're 4x7.2k JBOD.
How about queries with some conditions?
This is very interesting. I just wanted to give several pointers.
1. The random insertion test is a bulk load. I tested MySQL with its LOAD DATA from file function, and it loaded the 36 million rows in under 60 seconds on a dual-core 4GB box with 1 hard drive. The select from the query log took under 20 seconds. So it looks like one instance of MySQL is about an order of magnitude faster than your current implementation.
2. I am not trying to criticize your software at such an early stage, but I suspect most of the performance deficiencies are I/O issues related to HDFS. Your I/O throughput of 8MB/sec over the cluster is just abysmal. I have a feeling that if you used Lustre over ext3, you would probably get an instant 10x performance gain. The trick is that the Java implementation of DFS is hiding behind Yahoo's massive I/O chunks. Lustre, on the other hand, had to deal with writing millions of files in the KB range back in 2004, so it can handle both large chunks and tiny chunks. And drop KFS, it's going nowhere.
3. The AOL data is so small that it fits in RAM. So all benchmarks are pretty much invalidated, because you don't know whether the Linux cache is caching everything. Do a test where the dataset is 10x the size of total RAM in the cluster to test the file system. Again, I think Lustre should be 10x the speed of HDFS if 7MB/sec is the throughput you are getting now.
4. Look into the Mars project, which is a MapReduce framework on GPUs. http://www.cse.ust.hk/gpuqp/Mars.html It looks like you can get a 4x speedup if each cluster node included a GPU to do basic stuff like sorting on large datasets. I think this is the reason why Google acquired PeakStream: to do MapReduce on GPUs.
Thanks for the note, taoshen!
A few things to consider:
1. The test machines are old 1.6GHz Opterons with three 7200rpm SATA drives allocated to HDFS.
2. Our LOAD DATA INFILE command actually exercises the API for every record, while MySQL's LOAD DATA INFILE treats the entire load as one transaction and the entire thing happens in RAM.
3. We use the default range size of 200MB and a cell cache size of 50MB, so there is splitting and compacting going on in the background. We believe the numbers are closer to the sustained rate for data much larger than RAM, where MySQL will fall over completely due to seeks in B-trees.
4. The data is being replicated 3 ways.
Can you tell us the spec of your test machine (CPU, disk, etc.) and the config for your MySQL (table type, MyISAM or InnoDB; key_buffer_size or innodb_buffer_pool_size; and make sure you have an index in the schema), so we can replicate your test on comparable hardware?
We're aware of Lustre and other alternatives for underlying DFS and will publish new test results when we release the beta next month.
BTW, why do you think KFS is going nowhere? There're indications that it will go somewhere in the near future :)
Machine spec: Core 2 Duo 2.13GHz, 4 GB DDR2-800 RAM, one 200GB SATA drive. MySQL: standard install, no index, InnoDB. The thing is that 2GB of data gets loaded into the MySQL database using the batch load in 60 seconds, which is about 35MB/sec, or about the speed of the hard drive. Of course the dataset was smaller than the 4GB of RAM, so it is possible that some of it came from RAM instead.
I didn't know your batch load exercised the insert path for each record. That's pretty good if it is done one by one. However, it goes to show that you are maxed out by HDFS at 7-8MB/sec over a cluster of 8 machines (replicated 3 ways), so effectively about 2 machines, each with 4 hard drives in JBOD. In a perfect system, since you insert the records from 2 machines, each with a GigE link I presume, you should be able to insert at close to 200MB/sec (the max of two GigE links).
I really want to see a Hypertable implementation over Lustre. Keep it up. I need a good distributed file + DB system for a distributed tick storage server holding about 10TB of tick data. Right now, I am stuck on MySQL (and KDB+ 32-bit personal edition) for a while, and it is slow. I can do 800MB-1GB/sec over 28 disks (dual MD1000 attached to PERC5) on a single node. Still, it would take about 3-4 hours to scan through the entire tick database.
The next system would be a 4-node Dell 1950 III cluster (each node with 2 MD1000 + 2 PERC5) with 10Gb InfiniBand, running Lustre and hopefully Hypertable, if you guys are up to it. That system should push 3-4GB/sec.
http://www.youtube.com/watch?v=SPzt_aBLk0I
I forgot to add that Lustre is unique in that it is a POSIX file system, so you don't have to go through FUSE to use it. It is mature, too. I am amazed that Hadoop didn't use Lustre from the start, since Lustre is open source.
Thanks for getting back to us, taoshen. No wonder your MySQL load is so fast: it only needs to append, without an index. You need to add a primary key on AnonID in the schema for it to be comparable. There are only 3 disks allocated to HDFS, replicating 3 ways, which means the max throughput for a single node is the speed of a single disk.
The problem with Lustre is that an OST node cannot also be used (reliably) as a Lustre client, last time I checked (http://osdir.com/ml/file-systems.lustre.user/2007-12/msg00059.html). Such a limitation goes against our (as well as Google's) "push computation to data" philosophy (vs. the traditional "pull data to computation" approach).
For further discussion, please use http://groups.google.com/group/hypertable-user
@taoshen: FYI... KFS is being used in production at Quantcast. We recently set up a big cluster for our jobs.
@cra...: For KFS performance #'s, look at: http://code.google.com/p/hypertable/wiki/KFSvsHDFS
Lustre clients running on OSS nodes are quite stable now (major customers have this in production). Lustre doesn't have a mirroring mechanism yet, but that is getting much closer (ask on Lustre devel).
I have a question.
When a tablet/range is being split, compacted, or moved, are inserts blocked for a while? I found that this issue continually blocks my insert job.
Sorry, I mean I found that this issue in Hadoop HBase continually blocks my insert job. How about Hypertable? In fact, I will test Hypertable in a few days.
Hi. I am looking for a dataset like the sample query log, with the time spent on each URL. Can you help me find such a dataset? Thanks