At work I often use Confluence to update our wiki, and just last week I wanted to change the way bar charts look on it. Confluence has a simple mechanism for producing bar and pie charts. Our charts show how the clusters at our university are used by the user community. To generate the numbers, I first need to run a script that pulls the required figures from a database and writes them to files. For example, to create the bar chart shown below, I put the following code into Confluence. You can see how these charts look on our wiki at this page:
https://wikis.nyu.edu/display/NYUHPC/Historical+Usage+Reports
Confluence code for bar chart
{chart:title=CPU Time in Hours|type=bar|width=900}
|| CLUSTER || January || February || March || April || May || June ||
|| USQ | 380033 | 291235 | 356897 | 343902 | 375666 | 445860 ||
|| BOWERY | 572622 | 272514 | 331905 | 355627 | 301544 | 309076 ||
|| CARDIAC | 716065 | 359153 | 329973 | 723314 | 279251 | 617893 ||
|| CUDA | 1 | 0 | 1 | 251 | 54 | 0 ||
{chart}
Confluence code for pie chart
{chart:title=Number of Jobs on BOWERY|width=450}
|| CLUSTER || January || February || March || April || May || June ||
|| BOWERY | 5,571 | 2,722 | 3,445 | 3,670 | 3,800 | 5,909 ||
{chart}
Simple, right? Not really, because I need to do this on the first of every month, and for one reason or another I end up doing it somewhere in the middle of the month instead. Which is not cool. So I decided to set up a cron job. I wanted this cron job to email me exactly the text I need to paste into Confluence to generate charts like the ones shown here.
Normally I log on to a specific machine and run a command that generates the numbers and writes them to a file as well. Let me show you what I do:
[manchu@hpc-metrics ~]$ /usr/local/bin/metrics-analysis.py monthly
To retrieve a monthly report for the previous month.
Getting HPC Metrics Statistics.
For the jobs ended on and after 2011-06-01 before 2011-07-01.
Getting the total counts from usq...
Getting the total counts from bowery...
Getting the total counts from cardiac...
Getting the total counts from cuda...
Getting the total counts from usq...
Getting the total counts from bowery...
Getting the total counts from cardiac...
Getting the total counts from cuda...
**************** Starting the sum task 1: grouped by ALL *************
Getting the total sum from usq...
Getting the total sum from bowery...
Getting the total sum from cardiac...
Getting the total sum from cuda...
----------------------
Summary of this period, sorted by ALL
For jobs ended on and after 2011-06-01 and before 2011-07-01
Name usq bowery cardiac cuda All
Jobs number 72,671 5,909 32,444 107 111,131
User number 76 36 9 1 94
CPU time(h) 445,860 309,076 617,893 0 1,372,829
Wall time(h) 4,279,233 115,571 793,340 83 5,188,227
Used time(h) 381,261 39,053 177,488 3 597,805
Requested CPU cores 108,591 115,402 183,247 830 408,070
Avg. used CPU cores 1.17 7.91 3.48 0.00
Where,
Avg. used CPU cores: the average CPU resource consumed by a job, which is "CPU time/Used time"
The results were also stored in the file of hpc_usage_2011-06-01_2011-06-30.txt
[manchu@hpc-metrics ~]$ vi hpc_usage_2011-06-01_2011-06-30.txt
1 HPC Usage Summary
2
3 **************** Starting the sum task 1: grouped by ALL *************
4 Summary of this period, sorted by ALL
5 For jobs ended on and after 2011-06-01 and before 2011-07-01
6 Name usq bowery cardiac cuda All
7 Jobs number 72,671 5,909 32,444 107 111,131
8 User number 76 36 9 1 94
9 CPU time(h) 445,860 309,076 617,893 0 1,372,829
10 Wall time(h) 4,279,233 115,571 793,340 83 5,188,227
11 Used time(h) 381,261 39,053 177,488 3 597,805
12 Requested CPU cores 108,591 115,402 183,247 830 408,070
13 Avg. used CPU cores 1.17 7.91 3.48 0.00
14 Where,
15 Avg. used CPU cores: the average CPU resource consumed by a job, which is "CPU time/Used time"
16
17
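As a quick sanity check, the Avg. used CPU cores row really is just CPU time divided by Used time. Reproducing the usq figure from the June table above:

```shell
# Avg. used CPU cores = CPU time / Used time (usq column, June 2011)
awk 'BEGIN { printf "%.2f\n", 445860 / 381261 }'
# prints: 1.17
```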
I need to take the numbers from lines 7 to 13 and assemble them into the specific format the Confluence chart macro expects. Like I said, I could do it by cutting and pasting the numbers into an Excel file and then copying them from there into Confluence, but believe me, that is painful. So I decided to write a script that does all this work, with a cron job delivering the results to me by email. I just copy everything from the email and paste it into Confluence. That's it; done this way it isn't even a one-minute job. Moreover, the cron email reminds me that I need to do this on the first of every month. Cool, huh?
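At its core, the extraction is just pulling a line range out of the summary file and stripping the thousands separators. A minimal sketch using a stand-in file (in the real hpc_usage_*.txt files the numbers sit on lines 7 to 13):

```shell
# Stand-in two-line summary file; the real files carry the data on lines 7-13
printf 'HPC Usage Summary\nCPU time(h) 445,860 309,076\n' > /tmp/hpc_demo.txt
# sed -n 'Np' prints only line N; the second sed drops the thousands commas
sed -n '2p' /tmp/hpc_demo.txt | sed 's/,//g'
# prints: CPU time(h) 445860 309076
```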
Crontab
My cron job runs the script at 00:05 on the 1st of each month. The line to put in your crontab is:
[manchu@hpc-metrics ~]$ crontab -e
5 0 1 * * /home/manchu/hpc-metrics-cronjob.sh > /dev/null 2>&1
crontab: installing new crontab
[manchu@hpc-metrics ~]$
Cron Job Script
Here is my cron job script:
[manchu@hpc-metrics ~]$ more hpc-metrics-cronjob.sh
#!/bin/bash
# This is a cron job script for delivering the hpc metrics at the beginning of each month. Written by Sreedhar Manchu.
/home/manchu/metrics.sh | mail -s "HPC Metrics Details for `date -d\"1 month ago\" +%B`" my_email@domain.com
[manchu@hpc-metrics ~]$
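The backticked date command in the mail subject expands to last month's name, so an email arriving on the 1st is labeled with the month it actually reports on. With GNU date you can see the same arithmetic against a fixed anchor day (2011-07-01 here is just for illustration, so the output is reproducible):

```shell
# "1 month ago" applied relative to a fixed date (GNU date syntax)
date -d "2011-07-01 1 month ago" +%B
# prints: June
```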
Shell script to generate the required Confluence code
Here is the script I wrote to generate the confluence code:
[manchu@hpc-metrics ~]$ more metrics.sh
#!/bin/bash
/usr/local/bin/metrics-analysis.py monthly
echo "----------------------------------------------------------------------------------------"
echo
echo "###################################### EXCEL VALUES ###################################"
echo
echo "----------------------------------------------------------------------------------------"
echo
filename=hpc_usage_`date -d "last month" "+%F"`_`date -d "yesterday" "+%F"`.txt
for ((i=-5; i<=-2; i++))
do
echo -n "| *`date -d "last month" "+%b %Y"`* |"
for j in {7..12}
do
echo -n "`head -$j $filename | tail -1 | perl -lane 'print " $F['$i'] |"'`"
done
echo -n "`head -13 $filename | tail -1 | perl -lane 'print " $F['$(($i+1))'] |"'`"
echo
done
months=`date -d "last month" "+%-m"`
clusters=4
categories=7
for ((i=0; i<$months; i++))
do
filename=hpc_usage_`date -d "$(($months-$i)) months ago" "+%F"`_`date -d "yesterday $(($months-$i-1)) months ago" "+%F"`.txt
for ((j=0; j<$clusters; j++))
do
for ((k=0; k<$categories; k++))
do
if [ $k -ne 6 ]
then
value[$(($i+$j*$months+$clusters*$months*$k))]="`head -$(($k+7)) $filename | tail -1 | perl -lane 'print "$F['$((-5+$j))']"' | sed 's/,//g'`"
else
value[$(($i+$j*$months+$clusters*$months*$k))]="`head -13 $filename | tail -1 | perl -lane 'print "$F['$((-4+$j))']"' | sed 's/,//g'`"
fi
done
done
done
title_string=("Number of Jobs" "Number of Users" "CPU Time in Hours" "Walltime in Hours" "Used Time in Hours" "Total Requested CPU Cores" "Avg. CPU Cores Used Per Job")
cluster=("USQ" "BOWERY" "CARDIAC" "CUDA")
echo
echo "----------------------------------------------------------------------------------------"
echo
echo "####################################### BAR CHARTS ####################################"
echo
echo "----------------------------------------------------------------------------------------"
echo
echo " "
for ((k=0; k<$categories; k++))
do
echo "{chart:title=${title_string[$k]}|type=bar|width=900}"
echo -n "|| CLUSTER ||"
for ((n=0; n<$months; n++))
do
echo -n " `date -d "$(($months-$n)) month ago" "+%B"` ||"
done
echo
for ((j=0; j<$clusters; j++))
do
echo -n "|| ${cluster[$j]} |"
for ((i=0; i<$months; i++))
do
echo -n " ${value[$(($k*$clusters*$months+$j*$months+$i))]} |"
done
echo "|"
done
echo "{chart}"
echo " "
echo
done
echo "----------------------------------------------------------------------------------------"
echo
input_start_date=`date -d "$months months ago" "+%F"`
input_end_date="`date "+%F"`"
/usr/local/bin/metrics-analysis.py << EOF
$input_start_date
$input_end_date
EOF
filename="hpc_usage_`date -d "$months months ago" "+%F"`_`date -d "yesterday" "+%F"`.txt"
for ((j=0; j<$clusters; j++))
do
for ((k=0; k<$categories; k++))
do
if [ $k -ne 6 ]
then
value[$(($k*$clusters+$j))]="`head -$(($k+7)) $filename | tail -1 | perl -lane 'print "$F['$((-5+$j))']"' | sed 's/,//g'`"
else
value[$(($k*$clusters+$j))]="`head -13 $filename | tail -1 | perl -lane 'print "$F['$((-4+$j))']"' | sed 's/,//g'`"
fi
done
done
echo "----------------------------------------------------------------------------------------"
echo
echo "####################################### PIE CHARTS ####################################"
echo
echo "----------------------------------------------------------------------------------------"
echo
echo " "
for ((k=0; k<$categories; k++))
do
echo "{chart:title=${title_string[$k]} over the last `date -d"last month" "+%-m"` months|width=450}"
echo -n "|| CLUSTER ||"
for ((j=0; j<=3; j++))
do
echo -n " ${cluster[$j]} ||"
done
echo
echo -n "|| category |"
for ((j=0; j<$clusters; j++))
do
echo -n " ${value[$(($k*$clusters+$j))]} |"
done
echo -n "|"
echo
echo "{chart}"
echo " "
echo
done
echo "----------------------------------------------------------------------------------------"
echo
[manchu@hpc-metrics ~]$
Output I get when I run the script
Here is the output I get when I run this script:
[manchu@hpc-metrics ~]$ ./metrics.sh
To retrieve a monthly report for the previous month.
Getting HPC Metrics Statistics.
For the jobs ended on and after 2011-06-01 before 2011-07-01.
Getting the total counts from usq...
Getting the total counts from bowery...
Getting the total counts from cardiac...
Getting the total counts from cuda...
Getting the total counts from usq...
Getting the total counts from bowery...
Getting the total counts from cardiac...
Getting the total counts from cuda...
**************** Starting the sum task 1: grouped by ALL *************
Getting the total sum from usq...
Getting the total sum from bowery...
Getting the total sum from cardiac...
Getting the total sum from cuda...
----------------------
Summary of this period, sorted by ALL
For jobs ended on and after 2011-06-01 and before 2011-07-01
Name usq bowery cardiac cuda All
Jobs number 72,671 5,909 32,444 107 111,131
User number 76 36 9 1 94
CPU time(h) 445,860 309,076 617,893 0 1,372,829
Wall time(h) 4,279,233 115,571 793,340 83 5,188,227
Used time(h) 381,261 39,053 177,488 3 597,805
Requested CPU cores 108,591 115,402 183,247 830 408,070
Avg. used CPU cores 1.17 7.91 3.48 0.00
Where,
Avg. used CPU cores: the average CPU resource consumed by a job, which is "CPU time/Used time"
The results were also stored in the file of hpc_usage_2011-06-01_2011-06-30.txt
----------------------------------------------------------------------------------------
###################################### EXCEL VALUES ###################################
----------------------------------------------------------------------------------------
| *Jun 2011* | 72,671 | 76 | 445,860 | 4,279,233 | 381,261 | 108,591 | 1.17 |
| *Jun 2011* | 5,909 | 36 | 309,076 | 115,571 | 39,053 | 115,402 | 7.91 |
| *Jun 2011* | 32,444 | 9 | 617,893 | 793,340 | 177,488 | 183,247 | 3.48 |
| *Jun 2011* | 107 | 1 | 0 | 83 | 3 | 830 | 0.00 |
----------------------------------------------------------------------------------------
####################################### BAR CHARTS ####################################
----------------------------------------------------------------------------------------
{chart:title=Number of Jobs|type=bar|width=900}
|| CLUSTER || January || February || March || April || May || June ||
|| USQ | 205227 | 341853 | 433496 | 494724 | 323675 | 72671 ||
|| BOWERY | 5571 | 2722 | 3445 | 3670 | 3800 | 5909 ||
|| CARDIAC | 42578 | 98160 | 27774 | 32225 | 53133 | 32444 ||
|| CUDA | 29 | 1 | 114 | 218 | 131 | 107 ||
{chart}
{chart:title=Number of Users|type=bar|width=900}
|| CLUSTER || January || February || March || April || May || June ||
|| USQ | 70 | 87 | 96 | 89 | 85 | 76 ||
|| BOWERY | 27 | 32 | 33 | 35 | 38 | 36 ||
|| CARDIAC | 10 | 9 | 10 | 7 | 12 | 9 ||
|| CUDA | 4 | 1 | 4 | 6 | 3 | 1 ||
{chart}
{chart:title=CPU Time in Hours|type=bar|width=900}
|| CLUSTER || January || February || March || April || May || June ||
|| USQ | 380033 | 291235 | 356897 | 343902 | 375666 | 445860 ||
|| BOWERY | 572622 | 272514 | 331905 | 355627 | 301544 | 309076 ||
|| CARDIAC | 716065 | 359153 | 329973 | 723314 | 279251 | 617893 ||
|| CUDA | 1 | 0 | 1 | 251 | 54 | 0 ||
{chart}
{chart:title=Walltime in Hours|type=bar|width=900}
|| CLUSTER || January || February || March || April || May || June ||
|| USQ | 5245803 | 13446775 | 20903317 | 20559816 | 40612407 | 4279233 ||
|| BOWERY | 80722 | 43649 | 60817 | 63988 | 67950 | 115571 ||
|| CARDIAC | 284755 | 494757 | 320250 | 78273 | 1067320 | 793340 ||
|| CUDA | 116 | 4 | 224 | 996 | 158 | 83 ||
{chart}
{chart:title=Used Time in Hours|type=bar|width=900}
|| CLUSTER || January || February || March || April || May || June ||
|| USQ | 291717 | 214256 | 282014 | 295398 | 312894 | 381261 ||
|| BOWERY | 30980 | 21816 | 29132 | 38124 | 36421 | 39053 ||
|| CARDIAC | 153014 | 150655 | 117527 | 23588 | 256442 | 177488 ||
|| CUDA | 71 | 0 | 21 | 300 | 91 | 3 ||
{chart}
{chart:title=Total Requested CPU Cores|type=bar|width=900}
|| CLUSTER || January || February || March || April || May || June ||
|| USQ | 428382 | 780907 | 838015 | 1357449 | 359100 | 108591 ||
|| BOWERY | 175895 | 103060 | 97843 | 107121 | 96763 | 115402 ||
|| CARDIAC | 151891 | 277135 | 66307 | 87260 | 91421 | 183247 ||
|| CUDA | 92 | 4 | 307 | 1623 | 1019 | 830 ||
{chart}
{chart:title=Avg. CPU Cores Used Per Job|type=bar|width=900}
|| CLUSTER || January || February || March || April || May || June ||
|| USQ | 1.30 | 1.36 | 1.27 | 1.16 | 1.20 | 1.17 ||
|| BOWERY | 18.48 | 12.49 | 11.39 | 9.33 | 8.28 | 7.91 ||
|| CARDIAC | 4.68 | 2.38 | 2.81 | 30.66 | 1.09 | 3.48 ||
|| CUDA | 0.01 | 0.02 | 0.04 | 0.84 | 0.59 | 0.00 ||
{chart}
----------------------------------------------------------------------------------------
To analyze the hpc-metrics database
Please input the start date, such as 2010-11-01: Please input the day after the end date, such as 2010-12-01: Getting HPC Metrics Statistics.
For the jobs ended on and after 2011-01-01 before 2011-07-01.
Getting the total counts from usq...
Getting the total counts from bowery...
Getting the total counts from cardiac...
Getting the total counts from cuda...
Getting the total counts from usq...
Getting the total counts from bowery...
Getting the total counts from cardiac...
Getting the total counts from cuda...
**************** Starting the sum task 1: grouped by ALL *************
Getting the total sum from usq...
Getting the total sum from bowery...
Getting the total sum from cardiac...
Getting the total sum from cuda...
----------------------
Summary of this period, sorted by ALL
For jobs ended on and after 2011-01-01 and before 2011-07-01
Name usq bowery cardiac cuda All
Jobs number 1,871,646 25,117 286,314 600 2,183,677
User number 153 66 22 11 172
CPU time(h) 2,193,594 2,143,289 3,025,648 307 7,362,837
Wall time(h) 105,047,351 432,696 3,038,695 1,580 108,520,322
Used time(h) 1,777,541 195,527 878,715 486 2,852,269
Requested CPU cores 3,872,444 696,084 857,261 3,875 5,429,664
Avg. used CPU cores 1.23 10.96 3.44 0.63
Where,
Avg. used CPU cores: the average CPU resource consumed by a job, which is "CPU time/Used time"
The results were also stored in the file of hpc_usage_2011-01-01_2011-06-30.txt
----------------------------------------------------------------------------------------
####################################### PIE CHARTS ####################################
----------------------------------------------------------------------------------------
{chart:title=Number of Jobs over the last 6 months|width=450}
|| CLUSTER || USQ || BOWERY || CARDIAC || CUDA ||
|| category | 1871646 | 25117 | 286314 | 600 ||
{chart}
{chart:title=Number of Users over the last 6 months|width=450}
|| CLUSTER || USQ || BOWERY || CARDIAC || CUDA ||
|| category | 153 | 66 | 22 | 11 ||
{chart}
{chart:title=CPU Time in Hours over the last 6 months|width=450}
|| CLUSTER || USQ || BOWERY || CARDIAC || CUDA ||
|| category | 2193594 | 2143289 | 3025648 | 307 ||
{chart}
{chart:title=Walltime in Hours over the last 6 months|width=450}
|| CLUSTER || USQ || BOWERY || CARDIAC || CUDA ||
|| category | 105047351 | 432696 | 3038695 | 1580 ||
{chart}
{chart:title=Used Time in Hours over the last 6 months|width=450}
|| CLUSTER || USQ || BOWERY || CARDIAC || CUDA ||
|| category | 1777541 | 195527 | 878715 | 486 ||
{chart}
{chart:title=Total Requested CPU Cores over the last 6 months|width=450}
|| CLUSTER || USQ || BOWERY || CARDIAC || CUDA ||
|| category | 3872444 | 696084 | 857261 | 3875 ||
{chart}
{chart:title=Avg. CPU Cores Used Per Job over the last 6 months|width=450}
|| CLUSTER || USQ || BOWERY || CARDIAC || CUDA ||
|| category | 1.23 | 10.96 | 3.44 | 0.63 ||
{chart}
----------------------------------------------------------------------------------------
[manchu@hpc-metrics ~]$