Heartbeat: a progress monitor for long-running processes

On a couple of projects I’ve needed a monitoring system to record the progress of a long-running transaction. Enter Heartbeat, a library which does just that. It’s a project of mine on GitHub: Heartbeat progress monitor.

The typical use case is monitoring a lengthy batch load, where Heartbeat records progress at set intervals. The intervals can be time-based, progress-based or both, so an overnight process can write a status update every 15 minutes and also for every 10,000th item it processes. Alternatively, a Windows Service can have a Heartbeat pulse every hour, so you know it’s still in a working state and how much work it has done since starting.

Heartbeat is thread-safe, so it suits asynchronous processing that spawns worker threads or uses Task parallelism in .NET 4.0.
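
To illustrate what thread-safe counting involves, here is a minimal sketch of a count-based pulse source. It is not Heartbeat's actual implementation: the class CountPulseSketch and its members are hypothetical names, and it only counts pulses rather than writing a log.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a count-based pulse source; not Heartbeat's real code.
public class CountPulseSketch
{
    private readonly long _countInterval;
    private long _count;
    private long _pulses;

    public CountPulseSketch(long countInterval)
    {
        if (countInterval <= 0) throw new ArgumentOutOfRangeException("countInterval");
        _countInterval = countInterval;
    }

    public long Count { get { return Interlocked.Read(ref _count); } }
    public long Pulses { get { return Interlocked.Read(ref _pulses); } }

    // Safe to call from many threads: Interlocked.Increment hands each caller
    // a distinct value, so exactly one caller sees each interval boundary.
    public void IncrementCount()
    {
        long current = Interlocked.Increment(ref _count);
        if (current % _countInterval == 0)
        {
            Interlocked.Increment(ref _pulses); // a real monitor would write a log entry here
        }
    }

    // Increments the counter from parallel workers and returns the sketch for inspection.
    public static CountPulseSketch RunDemo(int items, long countInterval)
    {
        var sketch = new CountPulseSketch(countInterval);
        Parallel.For(0, items, i => sketch.IncrementCount());
        return sketch;
    }
}
```

Calling CountPulseSketch.RunDemo(25000, 7000) leaves Count at 25000 and Pulses at 3, however the increments are interleaved across threads.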

Usage

Usage is straightforward:

1. Initialise Heartbeat

Create a Heartbeat instance, passing in the object to be monitored, and the interval values for each type of pulse, and then start the timer:

long countInterval = 7000;
double timerInterval = 300;  // 0.3 seconds
var heartbeat = new Heartbeat(this, countInterval, timerInterval);
heartbeat.Start();

2. Integrate Heartbeat with the working process

If you only want timed pulses, you don’t need to do anything else – the Heartbeat instance will pulse at the set interval and write a log to the database. If you want additional pulses when your count interval is reached, then you need to increment the count when each item is processed:

//do work...
heartbeat.IncrementCount();

You can also subscribe to the OnPulse event, and on each pulse you can stop the log being written, or add your own custom message to the log:

heartbeat.OnPulse += new Heartbeat.OnPulseEventHanlder(RunTimer_OnPulse);
heartbeat.Start("RunTimer started, timerInterval: {0}, runTime: {1}".FormatWith(timerInterval, runTime));
…
void RunTimer_OnPulse(PulseEventSource source, ref bool writeLog, ref string logText)
{
    writeLog = true;  // set to false to stop the log being written for this pulse
    logText = "RunTimer_OnPulse, source: {0}, text: {1}"
                .FormatWith(source, RandomValueGenerator.GetRandomString());
}
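
The handler signature above passes writeLog and logText by reference, which is what lets a subscriber veto or reword a log entry. The sketch below shows how the raising side of such an event might look. It is an assumption about the general pattern, not Heartbeat's source: PulseEventSketch, PulseHandler and the string-typed source parameter are all made-up names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of an event whose handlers can suppress or reword a log entry.
public class PulseEventSketch
{
    // Handlers receive the flag and the text by reference so they can change both.
    public delegate void PulseHandler(string source, ref bool writeLog, ref string logText);
    public event PulseHandler OnPulse;

    // In-memory stand-in for the log table.
    public readonly List<string> Log = new List<string>();

    public void Pulse(string source)
    {
        bool writeLog = true;            // default: every pulse is logged
        string logText = string.Empty;   // default: no custom message
        PulseHandler handler = OnPulse;  // copy first, in case a subscriber detaches concurrently
        if (handler != null)
        {
            handler(source, ref writeLog, ref logText);
        }
        if (writeLog)
        {
            Log.Add(source + ": " + logText);
        }
    }
}
```

A subscriber that sets writeLog to false for timer pulses and sets logText for count pulses would leave only the count entries in Log, each carrying the custom text.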

3. Tell Heartbeat when you’re finished

Call SetComplete or SetFailed to log that the work is finished:

var heartbeat = new hb.Heartbeat(this, countInterval, 0);
heartbeat.Start("RunCount_NoHandler, countInterval: {0}, countTo: {1}".FormatWith(countInterval, countTo));
try
{
    for (int i = 0; i < countTo; i++)
    {
        heartbeat.IncrementCount();
    }
    // force a DivideByZeroException, so this example ends in SetFailed
    // and the SetComplete call below is never reached
    var zero = 0;
    var dbz = 1 / zero;
    heartbeat.SetComplete("RunCount_NoHandler finished");
}
catch (Exception ex)
{
    heartbeat.SetFailed("RunCount_NoHandler failed, message: {0}".FormatWith(ex.FullMessage()));
}

If you don’t make either call, then Heartbeat will write an UNKNOWN status log when the Heartbeat object is disposed.
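
The fallback to UNKNOWN on disposal can be pictured with a small IDisposable sketch. This is an illustration of the pattern only, under assumptions: StatusOnDisposeSketch is a hypothetical class, the log is an in-memory list, and the "FAILED" status text is a guess (the post only shows START, WORKING, SUCCEED and UNKNOWN).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; class and member names are illustrative, not Heartbeat's.
public class StatusOnDisposeSketch : IDisposable
{
    private readonly List<string> _log;
    private bool _finished;
    private bool _disposed;

    public StatusOnDisposeSketch(List<string> log)
    {
        _log = log;
        _log.Add("START");
    }

    public void SetComplete() { Finish("SUCCEED"); }

    // "FAILED" is an assumed status text; the real status code may differ.
    public void SetFailed() { Finish("FAILED"); }

    // Writes a final status once; later calls are ignored.
    private void Finish(string status)
    {
        if (_finished) return;
        _finished = true;
        _log.Add(status);
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        Finish("UNKNOWN"); // no-op if SetComplete or SetFailed already ran
    }
}
```

Wrapping the instance in a using block with no final call leaves START then UNKNOWN in the log, while calling SetComplete inside the block leaves START then SUCCEED.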

Configuration

Rather than specify the pulse intervals for each Heartbeat instance you use, you can set default values at application level in your config file:

<!-- set heartbeat defaults to pulse every 10 minutes & every 3,000 increments -->
<sixeyed.heartbeat enabled="true"
                   defaultPulseTimerInterval="600000"
                   defaultPulseCountInterval="3000" />

The config section is optional, but if you have no config and don’t set at least one of the pulse intervals, your Heartbeat will never fire.
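
The timer values in this post are in milliseconds (300 for 0.3 seconds, 600000 for 10 minutes). One way to avoid unit slips is to derive them from a TimeSpan; the helper below is a hypothetical convenience, not part of Heartbeat.

```csharp
using System;

// Hypothetical helper; not part of the Heartbeat library.
public static class PulseIntervals
{
    // Converts a readable TimeSpan into the millisecond value used for timer intervals.
    public static double ToTimerInterval(TimeSpan interval)
    {
        return interval.TotalMilliseconds;
    }
}
```

PulseIntervals.ToTimerInterval(TimeSpan.FromMinutes(10)) gives 600000, the value used for defaultPulseTimerInterval in the config above.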

The Heartbeat Database

Heartbeat pulses are written to a simple database, consisting of a log table and two reference data tables.

To create the database (with C:\ as the default file location), run CREATE-Heartbeat.sql in the Database folder. In a working system I’d expect the Heartbeat tables to be added to an existing system database, so the script is just to get you started.

Each instance of a job has a unique ID, and the log records the assembly-qualified CLR type name of the object being monitored, the pulse intervals being used, the log time and the status. The log also records the pulse number for each type of pulse, and the pulse statistics (elapsed milliseconds for timed pulses, number of items for count pulses). In raw form, the results of a timer- and count-based Heartbeat session look like this:

| HeartbeatLogId | HeartbeatInstanceId | ComponentTypeName | StatusCode | PulseTimerInterval | PulseCountInterval | LogDate | LogText | TimerPulseNumber | CountPulseNumber | TimerMilliseconds | CountNumber |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 376 | BF341091-D0EB-498F-803A-B623EAD5BF16 | Sixeyed.Heartbeat.Tests.HeartbeatTest, Sixeyed.Heartbeat.Tests, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null | START | 800 | 1000 | 2010-09-30 20:03:37.990 | RunCountAndTimer_NoHandler, countInterval: 1000, countTo: 5692, timerInterval: 800, runTime: 4532 | 0 | 0 | 0 | 0 |
| 377 | BF341091-D0EB-498F-803A-B623EAD5BF16 | Sixeyed.Heartbeat.Tests.HeartbeatTest, Sixeyed.Heartbeat.Tests, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null | WORKING | 800 | NULL | 2010-09-30 20:03:38.803 | | 1 | NULL | 811.2014 | NULL |
| 378 | BF341091-D0EB-498F-803A-B623EAD5BF16 | Sixeyed.Heartbeat.Tests.HeartbeatTest, Sixeyed.Heartbeat.Tests, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null | WORKING | 800 | NULL | 2010-09-30 20:03:39.613 | | 2 | NULL | 1622.4029 | NULL |
| 411 | BF341091-D0EB-498F-803A-B623EAD5BF16 | Sixeyed.Heartbeat.Tests.HeartbeatTest, Sixeyed.Heartbeat.Tests, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null | SUCCEED | 800 | 1000 | 2010-09-30 20:04:01.610 | RunCountAndTimer_NoHandler finished | 29 | 5 | 23618.4415 | 5692 |

Storing the results in SQL Server provides the opportunity for simple SSRS dashboards, facilitated by the reference tables which give friendly names for statuses and allow you to store friendly names for applications (in HeartbeatApplications – xxx can be aliased to “User Load” for querying).

Implementation Notes

Heartbeat and Task.Factory.StartNew go very nicely together, but you need to be careful with how state is passed to the tasks. As Heartbeat implements IDisposable, if your worker class is also IDisposable then the Heartbeat instance could go out of scope and be disposed before your tasks are run. The safest pattern is to pass the task method a local copy of the Heartbeat reference and of any other variables it needs:

for (long i = 0; i < countTo; i++)
{
    var heartbeat = _heartbeat; //don't pass the instance directly
    var taskIndex = i;          //copy the loop variable so each task sees its own value
    tasks[i] = Task.Factory.StartNew(() => DoWork(heartbeat, taskIndex, finalTaskIndex));
}
…
private void DoWork(Heartbeat heartbeat, long taskIndex, long finalTaskIndex)
{
    heartbeat.IncrementCount();
    //do work...
    if (taskIndex == finalTaskIndex)
    {
        heartbeat.SetComplete("StubComponent.Process finished");
    }
}

Also note the check to see if this is the final task – if so, the Heartbeat is set to complete. The task which is started last may not be the final task to complete, so the end time is not guaranteed to be accurate, but for processes running for several hours, the discrepancy is likely to be minimal.
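
If the end time needs to be exact, one alternative (not the approach used in the code above) is to mark completion from a continuation that runs only after every task has finished. The sketch below uses Task.Factory.ContinueWhenAll, which is available in .NET 4.0. CompleteWhenAllSketch and its parameters are hypothetical names, and the two Action delegates stand in for the real work and the completion call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; an alternative to checking for the final task index.
public static class CompleteWhenAllSketch
{
    // Runs taskCount work items, then invokes onAllFinished exactly once,
    // after every task has completed, whatever order they finish in.
    // taskCount must be at least 1: ContinueWhenAll rejects an empty array.
    public static void Run(int taskCount, Action doWork, Action onAllFinished)
    {
        var tasks = new Task[taskCount];
        for (int i = 0; i < taskCount; i++)
        {
            tasks[i] = Task.Factory.StartNew(doWork);
        }

        // The continuation is scheduled only when all antecedent tasks are done.
        Task completion = Task.Factory.ContinueWhenAll(tasks, finished => onAllFinished());
        completion.Wait();
    }
}
```

With Heartbeat, doWork would call heartbeat.IncrementCount() and onAllFinished would call heartbeat.SetComplete with a suitable message. The continuation also runs when tasks have faulted, so a fuller version would inspect the finished tasks and call SetFailed where appropriate.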

Still to come…

  • Fancy SSRS reports showing a breakdown of jobs by completion status (failed, succeeded), average duration, historical run stats etc.
  • Options for using Heartbeat in distributed systems like NServiceBus and BizTalk.
This article is part of the GWB Archives. Original Author: Elton Stoneman
