Understanding log files?
Updated May 21, 2011 by patrick....@gmail.com

Log files

What are log files?

Log files can be created using the CLI command get_job_log_file. (See CLI commands for more details.)

Log files are useful for understanding exactly what is going on 'under the hood'. When running jobs, reviewing the log file is essential for confirming that the job is progressing as expected.

The log file is a plain text file in CSV (comma-separated values) format, so it can be opened in any spreadsheet program.
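Because the file is plain CSV, it can also be read by a script. The following is a minimal sketch using Python's standard csv module. The sample text, including the presence of a header row and all the values in it, is a hypothetical illustration and not real output from get_job_log_file.

```python
import csv
import io

# Hypothetical sample text; a real log file may differ in its details.
SAMPLE_LOG = """ActId,TransId,Task,Node,Outcome,ExecTime,StartTime
0,0,SEND:Dev,SLAVE1,Processing,None,0.0
1,0,RECV:Dev,SLAVE1,TaskSuccess,1.5,2.0
"""

def read_log_rows(text):
    """Parse CSV log text into a list of rows, each a list of strings."""
    return list(csv.reader(io.StringIO(text)))

rows = read_log_rows(SAMPLE_LOG)
# Assumption: the first row holds the column names.
header, data = rows[0], rows[1:]
```

To read a file on disk instead, open it with `open(path, newline="")` and pass the file object to `csv.reader`.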

In order to understand what all the values mean in the log file, it is important to review the description provided in the introduction.

Actions and transactions

For each job, there is a lot of activity - in particular, sending individuals back and forth between the master and various slaves. In order to keep track of all this activity, each action is given specific IDs. There are two types of IDs that are assigned: an action ID and a transaction ID.

An action is defined as anything that changes the contents of any of the processing or pending sets for the tasks. The action ID is simply a sequential integer ID, starting at 0, that the master assigns to every action it takes. So for example, if some individuals are sent to a slave, then that action is assigned an action ID such as 345, and when those individuals are received back from the slave, another action ID is assigned, maybe 543.

A transaction is defined as a pair of actions: a send action sends individuals to a slave, and a receive action receives the individuals back from the slave. Both actions in a send-receive pair are given the same transaction ID.
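The relationship between the two IDs can be sketched in a few lines of Python. The records below are hypothetical (the action IDs 345 and 543 echo the example above, while the transaction IDs are invented), and the function name pair_transactions is only for illustration.

```python
from collections import defaultdict

# Hypothetical (action ID, transaction ID, action kind) records.
actions = [
    (345, 12, "SEND"),
    (346, 13, "SEND"),
    (543, 12, "RECV"),
]

def pair_transactions(records):
    """Group action IDs by transaction ID, keyed by action kind."""
    pairs = defaultdict(dict)
    for act_id, trans_id, kind in records:
        pairs[trans_id][kind] = act_id
    return dict(pairs)

pairs = pair_transactions(actions)
# A transaction with a SEND but no RECV has not come back from the slave yet.
open_transactions = [t for t, p in pairs.items() if "RECV" not in p]
```

Here transaction 12 is complete (actions 345 and 543), while transaction 13 is still open.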

The log file columns

In the log file, each action is recorded as one row. The first 7 columns are as follows:

  • ActId: The action ID.
  • TransId: The transaction ID.
  • Task: The task type and task name. The task type could be MasterTask, SEND, or RECV. The name is the name of the slave task. So, for example, SEND:Dev means that individuals were sent to a slave for processing by a slave task called Dev.
  • Node: The node on which the task was executed - which could either be MASTER or SLAVE#.
  • Outcome: The outcome of the task. If all is well, SEND actions should show Processing and RECV actions should show TaskSuccess. If you see TaskFail, then something has gone wrong.
  • ExecTime: The total time taken for processing, in seconds. If the action is a SEND action, then None will be recorded. If the action is a RECV action, then the time taken to process the individuals on the slave will be recorded.
  • StartTime: The time when the action started executing.
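As a rough sketch, the first seven columns of a row can be turned into a small record and scanned for failures. The sample rows are hypothetical, and two details are assumptions rather than documented behaviour: that the Task cell always uses a colon between type and name, and that the transaction ID and start time are best kept as plain strings because their exact format is not described here.

```python
from typing import NamedTuple, Optional

class Action(NamedTuple):
    act_id: int
    trans_id: str            # kept as text; its format is an assumption
    task_type: str           # e.g. SEND or RECV
    task_name: str           # e.g. Dev
    node: str
    outcome: str
    exec_time: Optional[float]
    start_time: str          # kept as text; its format is an assumption

def parse_action(row):
    """Build an Action from the first seven cells of a log row."""
    act_id, trans_id, task, node, outcome, exec_time, start_time = row[:7]
    task_type, _, task_name = task.partition(":")
    seconds = None if exec_time == "None" else float(exec_time)
    return Action(int(act_id), trans_id, task_type, task_name,
                  node, outcome, seconds, start_time)

# Hypothetical rows, not taken from a real log file.
sample_rows = [
    ["345", "12", "SEND:Dev", "SLAVE1", "Processing", "None", "10.0"],
    ["543", "12", "RECV:Dev", "SLAVE1", "TaskSuccess", "2.5", "14.0"],
    ["544", "13", "RECV:Eval", "SLAVE2", "TaskFail", "0.3", "15.0"],
]

actions = [parse_action(r) for r in sample_rows]
failed = [a for a in actions if a.outcome == "TaskFail"]
```

Any entry in `failed` points to a row worth inspecting by hand.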

These first 7 columns are always the same. The remaining columns, however, differ depending on the tasks that you have defined. For each task there is a group of 5 columns, and the groups are separated by one empty column. The five columns are as follows:

  • ForTask : The name of the task
  • Pend : The size of the pending set
  • Proc : The size of the processing set
  • Succ : The total number of individuals successfully processed
  • AvgTime : The average execution time for the task
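The per-task groups can be pulled out of a row with a short helper. This is only a sketch under stated assumptions: that empty cells occur only as separators between groups (so no value cell is ever empty), and that Pend, Proc and Succ are always integers. The sample row and the name parse_task_groups are hypothetical.

```python
def parse_task_groups(row, fixed=7, width=5):
    """Return {task name: stats} from the columns after the fixed ones."""
    # Drop the empty separator cells, then read the rest in groups of five.
    cells = [c for c in row[fixed:] if c.strip() != ""]
    groups = {}
    for i in range(0, len(cells) - width + 1, width):
        name, pend, proc, succ, avg_time = cells[i:i + width]
        groups[name] = {
            "Pend": int(pend),
            "Proc": int(proc),
            "Succ": int(succ),
            "AvgTime": avg_time,  # left as text; it may not be numeric
        }
    return groups

# Hypothetical row: seven fixed columns, then groups for Dev and Eval.
sample_row = [
    "543", "12", "RECV:Dev", "SLAVE1", "TaskSuccess", "2.5", "14.0",
    "", "Dev", "4", "0", "3", "2.5",
    "", "Eval", "3", "0", "0", "None",
]

groups = parse_task_groups(sample_row)
```

Filtering out empty cells, rather than stepping by a fixed stride of six, means the helper works whether or not an empty column also precedes the first group.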

Checking what is going on

For example, let's assume that you have a JobDef that defines a number of slave tasks, including a development task (called Dev) that generates a model of a design, and an evaluation task (called Eval) that evaluates the model. Let's also assume that the input size of the development task is 3.

In your log file, if you see a RECV:Dev action, then looking at the data in that row, you should typically expect to see the following:

  • The pending set for the Dev task should remain the same as in the row above.
  • The processing set for Dev should be reduced by 3 from the row above.
  • The pending set for the Eval task should go up by 3 from the row above.
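These three expectations can be checked mechanically for a RECV:Dev row and the row above it. The sketch below assumes the set sizes have already been extracted into dictionaries keyed by task name. The numbers are invented, and check_recv_dev is a hypothetical helper name.

```python
def check_recv_dev(prev, curr, input_size=3):
    """Compare set sizes in a RECV:Dev row (curr) with the row above (prev)."""
    return {
        "dev_pending_unchanged":
            curr["Dev"]["Pend"] == prev["Dev"]["Pend"],
        "dev_processing_reduced":
            curr["Dev"]["Proc"] == prev["Dev"]["Proc"] - input_size,
        "eval_pending_increased":
            curr["Eval"]["Pend"] == prev["Eval"]["Pend"] + input_size,
    }

# Hypothetical set sizes for two consecutive rows.
row_above = {"Dev": {"Pend": 6, "Proc": 3}, "Eval": {"Pend": 0, "Proc": 0}}
recv_row = {"Dev": {"Pend": 6, "Proc": 0}, "Eval": {"Pend": 3, "Proc": 0}}

result = check_recv_dev(row_above, recv_row)
all_as_expected = all(result.values())
```

Since these are only the typical values, a False entry is a prompt to look more closely at that row rather than proof of a fault.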
