A Discussion on Heart Rate Monitors - Optimum Performance Training

Why do people use heart rate monitors when exercising?

I asked ChatGPT this exact question.

The top 2 reasons Mr. GPT gave were 1) Monitoring Intensity and 2) Optimizing Workouts.

Let’s focus on the 1st answer given by ChatGPT, Monitoring Intensity. Since the 2nd answer is entirely reliant on the 1st answer here, we can get two birds stoned at once!

Meaning, if heart rate monitors aren’t “great” for monitoring exercise intensity how could they possibly be great for optimizing workouts?

What we are really trying to understand here is why heart rate monitors may not be great at monitoring intensity. Note, I didn’t say useless or terrible, I said not great and should not be your only line of reasoning.

Cardiac output is what you are really trying to approximate when discussing heart rate. Cardiac output is made up of stroke volume and heart rate (Cardiac Output = Stroke Volume x Heart Rate) and is typically measured in liters of blood pumped by the heart per minute.
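To make the formula concrete, here is a minimal Python sketch. The stroke volume and heart rate figures are illustrative assumptions of mine, not values from the references. The only point is that the same heart rate can sit on top of quite different cardiac outputs when stroke volume differs.

```python
def cardiac_output_l_per_min(stroke_volume_ml: float, heart_rate_bpm: float) -> float:
    """Cardiac Output = Stroke Volume x Heart Rate, converted from mL/min to L/min."""
    return stroke_volume_ml * heart_rate_bpm / 1000.0


# Hypothetical values chosen only for illustration.
same_heart_rate = 150  # beats per minute
for stroke_volume in (100, 130):  # mL per beat
    co = cardiac_output_l_per_min(stroke_volume, same_heart_rate)
    print(f"SV {stroke_volume} mL x HR {same_heart_rate} bpm = {co:.1f} L/min")
```

A heart rate monitor only sees the 150 in this example. It has no view of the stroke volume term.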

Cardiac output is tightly linked with VO2. VO2 Max being the maximum amount of oxygen that can be delivered by the circulating blood to be utilized by the muscles during activity. In highly trained individuals the VO2 Max is generally limited by the amount of utilization that occurs at the muscular level.

So, if heart rate is tightly linked with cardiac output, we might not have anything to really worry about here. Meaning, if heart rate is giving us a great approximation of exercise intensity, then what am I going on about?

The problem is that heart rate and cardiac output are not tightly linked.

“At exercise intensities with a substantial amount of anaerobic metabolism, HR overestimates the VO2 and energy expenditure as well. HR increases in hot and decreases in cold environmental conditions for a given workload. The exercise induced dehydration elevates the HR by 7 beats·min⁻¹ with each loss of body weight of 1%. Mental stress substantially increases HR at rest as well as during exercise in addition to a day-to-day variation of about 6 beats·min⁻¹.” (1)
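As a rough illustration of the dehydration figure in that quote, the sketch below applies the 7 beats per minute per 1% of body weight lost to a fixed workload. The 140 bpm baseline is a hypothetical number of mine, and the straight-line relationship is a simplifying assumption for the example.

```python
HR_RISE_PER_PERCENT_BODY_WEIGHT_LOST = 7  # beats per minute, taken from the quote above


def hr_with_dehydration(baseline_hr_bpm: float, body_weight_loss_pct: float) -> float:
    """Estimate heart rate at an unchanged workload after fluid loss (linear assumption)."""
    return baseline_hr_bpm + HR_RISE_PER_PERCENT_BODY_WEIGHT_LOST * body_weight_loss_pct


# Hypothetical athlete: 140 bpm at a fixed workload when fully hydrated.
for loss_pct in (0, 1, 2):
    hr = hr_with_dehydration(140, loss_pct)
    print(f"{loss_pct}% body weight lost -> about {hr:.0f} bpm at the same workload")
```

The workload never changes in this example, yet the number on the watch does.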

In a study of high-level cross-country runners, researchers measured oxygen consumption (VO2) and heart rate (they also measured local oxygen utilization via NIRS) during a hilly cross-country course. What they observed was that during the uphill sections, oxygen intake increased, and during the downhill sections oxygen intake decreased. This makes intuitive sense. However, as the authors state “HR remained stable throughout the trail run in contrast to VO2 that increased within the uphill and decreased within the downhill sections” (1). Meaning, heart rate was not a good representation of actual work intensity. It was not responsive to changes in elevation (i.e. workload). But VO2 and NIRS were.

[Image Taken from Running in Hilly Terrain: NIRS is More Accurate to Monitor Intensity than Heart Rate(1)]

Burr et al. performed a unique study where they measured oxygen consumption and heart rate for competitive downhill mountain bikers. What they found was that riders had a heart rate on average approximately 80% of their max heart rate during the downhill rides, while on average they were working at about 52% of their VO2 Max (2). In many cases, VO2 and heart rate go together quite well. Think, lower intensity (i.e. very sustainable) and constant workloads. However, there are many examples when they diverge drastically.

I hope, by this point in the article, I have convinced you that heart rate monitors, although useful, are not a great representation of cardiac output/VO2.

But, for argument’s sake, let’s say there are circumstances and scenarios where heart rate is a good indicator of cardiac output and VO2 Max. Because, as mentioned, these circumstances do exist.

In these scenarios, wouldn’t a heart rate monitor be a great way of monitoring intensity?

Let’s use an example. Let’s say individual number 1 is working at a power output corresponding to 80% of their VO2 max while cycling on the bike. Let’s say individual number 2 is also working at a power output corresponding to 80% of the VO2 Max cycling on the bike. Seems comparable, right? Maybe, but maybe not.

If individual 1 has a critical power (limit of sustainability) on the bike that occurs at 75% of the VO2 max, then at 80% VO2 Max they are performing unsustainable work, and in due course will fail to maintain the current pace (in a matter of 5-15 minutes likely). Not only will task failure occur, but over time their 80% VO2 max will move closer to 100% VO2 Max without any increase in the actual workload…as is the nature of unsustainable work when taken to its logical end.

Conversely, if individual number 2 has a critical power on the bike that occurs at 85% of the VO2 max, then they will be able to maintain the current pace they are at for an extended period of time if working at 80% VO2 Max. If this individual is fuelled and trained for this, we are talking about holding this pace for hours potentially WITHOUT a significant rise in VO2.
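The two-individual example above can be sketched in a few lines of Python. The percentages come straight from the example. The function name and the simple "at or below critical power counts as sustainable" rule are my own simplifications for illustration.

```python
def is_sustainable(work_pct_vo2max: float, critical_power_pct_vo2max: float) -> bool:
    """Work at or below critical power is treated as sustainable in this sketch."""
    return work_pct_vo2max <= critical_power_pct_vo2max


work_intensity = 80  # % of VO2 max, the same for both individuals
critical_power = {"individual 1": 75, "individual 2": 85}  # % of VO2 max, from the example

for name, cp in critical_power.items():
    label = "sustainable" if is_sustainable(work_intensity, cp) else "unsustainable"
    print(f"{name}: {work_intensity}% VO2 max vs critical power at {cp}% -> {label}")
```

Same percentage of VO2 max, opposite outcomes. The difference is entirely in where each person's critical power sits.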

Critical power can be considered a threshold (although this is not technically correct we can forget about that for now). Lactate threshold or gas exchange threshold are also thresholds, as their names indicate. These 2 thresholds are pretty much synonymous (occurring at approximately the same work intensity, although they are determined using different methods). Every individual will have their own critical power and lactate threshold for each modality (the work intensity at which these “thresholds” may be found).

“Threshold-based exercise bouts are, by design, individually modulated to reach the physiological thresholds of each individual rather than a homogenous homeostatic perturbation, inducing heterogeneous responses. Physiological thresholds have a considerable range across individuals where, for example, critical power can be as low as ~50% in the untrained, to ~95% VO2 max in elite athletes.” (3).

My example is just one example of many that exist in the research literature. But let’s go through one more just to finish this off.

Meyler et al. (4) took a group of 10 recreationally trained individuals and put them through two scenarios of testing (cycling). In each scenario, the authors set 3 intensity prescriptions for the individuals to perform. Namely, 1 – “moderate”, 2 – “heavy” and 3 – “severe”.

In one scenario, they anchored these intensity prescriptions at percentages of VO2 Max as recommended by the American College of Sports Medicine. They referred to this group as the Traditional group. Notably, these recommendations for percent VO2 max or percent heart rate would not be too dissimilar from what you may read from your Garmin watch (5 zone model) if you have not adjusted the numbers and have left it to base assumption. Meaning, they picked generic %’s of VO2 Max that “should” correspond with these intensities.

Conversely, the other scenario anchored intensity based on a threshold model of performance, anchoring intensity based on gas exchange threshold and critical power. They referred to this group as the threshold group.

So what happened? Let’s just take one of the intensity prescriptions and elaborate. Let’s use “severe” intensity. To the layman, severe intensity means unsustainable work (greater than Critical Power), or in the study it is referred to as high intensity interval training.

This aspect of the study required the individuals to perform a set work:rest format for 5 sets. When working above critical power, intensity prescription on the modality is of major concern. But, if you are using a percentage VO2 max model (or heart rate) to prescribe intensity, you will be completely oblivious to this fact.

“Completion rates for HIIT Threshold and HIIT Traditional were 100% and 20%, respectively. In HIIT Traditional, two subjects completed all five intervals, four completed four intervals, three completed three intervals, and one individual completed one interval” (4).

To reiterate, in the threshold group, all 10 individuals were able to complete the prescribed work:rest scenario. Which, if you are a coach, this constitutes a good training design. However, in the traditional group, only two subjects completed all five intervals. Again, if you are a coach, this is a bad training design if only 20% of your clients are able to complete your training design.
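The 20% figure can be checked directly from the counts in the quote. This is just arithmetic on the numbers reported by Meyler et al. (4); the variable names are mine.

```python
# Intervals completed (out of 5) -> number of subjects, as quoted from Meyler et al. (4).
traditional = {5: 2, 4: 4, 3: 3, 1: 1}
PRESCRIBED_INTERVALS = 5

total_subjects = sum(traditional.values())
finishers = traditional.get(PRESCRIBED_INTERVALS, 0)
completion_rate = finishers / total_subjects

print(f"{finishers} of {total_subjects} subjects finished: {completion_rate:.0%}")
```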

They continue, “this demonstrates the large variability in the exercise stimulus elicited when exercising at a work rate corresponding with 85% VO2 Max compared to that of 110% CP. Compared to all individuals exercising at 110% CP in HIIT Threshold, work rates ranged between 115% and 156% CP in HIIT Traditional, explaining the variability in time to task failure” (4). Meaning, percent VO2 max was not a great way to prescribe intensity.
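To see how one generic prescription can land at very different intensities relative to critical power, here is a small sketch. The riders and their wattages are hypothetical numbers I made up for illustration. They are not data from Meyler et al. (4).

```python
def work_rate_as_pct_of_cp(work_rate_w: float, critical_power_w: float) -> float:
    """Express a cycling work rate as a percentage of critical power (CP)."""
    return 100.0 * work_rate_w / critical_power_w


# Hypothetical riders: (work rate at 85% VO2 max in watts, critical power in watts).
riders = {
    "rider A": (250, 215),
    "rider B": (230, 170),
    "rider C": (250, 160),
}

for name, (work_rate_w, cp_w) in riders.items():
    pct = work_rate_as_pct_of_cp(work_rate_w, cp_w)
    print(f"{name}: {work_rate_w} W at 85% VO2 max is {pct:.0f}% of CP ({cp_w} W)")
```

A threshold-based prescription would instead start from each rider's CP and multiply by 1.10, so everyone ends up at the same relative intensity by construction.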

What we have here is the realization that optimal intensity prescription needs to be based on a threshold concept, as opposed to a VO2 max concept (or heart rate). More clearly stated, exercise prescription needs to be based on the individual.

For the everyday exerciser, these recommendations may be impractical. The population at large will not be able to access a metabolic cart with which to establish their gas exchange threshold, or a lactate analyzer to establish lactate threshold, and they are not going to know how to calculate their critical power or even know what the hell that means.

Which takes us right back to why people probably use Heart rate monitors most often. They are easily accessible and generally low-cost. Yet, that still does not mean they are a good way to measure intensity.

Lehtonen et al. (3) recommend 2 very practical and very accessible options for monitoring intensity and optimizing prescription. They of course would also include using known performance metrics as well (i.e. pace per km for running or watts for cycling), but their article recommends (plus the additional number 3 from me):

1) RPE

2) Talk Test

3) Power Output

(CR10 RPE Scale – Exact Wording May Vary)

The reason I like using these first two methods (RPE and Talk Test) to prescribe training intensity is because every individual has unlimited access to it. There is no subscription fee, only your awareness and your attention are required.

Plenty of research has shown that RPE of 2-4 out of 10 would correspond with the upper end of the moderate intensity domain, meaning below the gas exchange threshold or lactate threshold (zone 2 in today’s social media landscape). The wording for RPE 2 to 4 would likely be described as “easy”, “moderate” or, from some coaches, “comfortably uncomfortable”.

RPE 2 to 4 would also correspond with the “last positive” stage of a talk test. Meaning this is the highest intensity an individual can exercise and still be able to complete the talk test.

The Talk Test “requires exercising individuals to recite a 10-15 second text at the end of an exercise stage of an incremental protocol, following which they are asked the following question: ‘can you talk comfortably?’. They can answer yes, not sure and no” (3).

For example, while you are running, can you recite the alphabet out loud without it feeling too difficult? I mean, it’s not gonna be as easy as if you’re sitting down and doing nothing, but is it doable? If it is, answering YES, you’re probably still below the gas exchange threshold. If your answer is NOT SURE or NO, then you probably are not. I use this method as a way of trying to modulate my intensity during easy runs.

“The reasoning of this test is grounded in the competitive requirements of ventilation for the gas exchange and metabolism of exercise versus speech, as the first inflection point of ventilation that begins to rise exponentially, VT (basically the same intensity as the gas exchange threshold), represents the point where speech becomes uncomfortable and difficult due to the increased ventilatory demand from exercise.(3)”.

A study (5) using elite road cyclists confirmed that the talk test was a valid surrogate for the gas exchange threshold and the respiratory compensation point, the latter being in somewhat close proximity to the critical power. What they found was that the measured gas exchange threshold and the point at which the individual’s talk test would indicate this threshold occurring were nearly identical (3.7 Watts per kg vs 3.6 Watts per kg). The same was found for the second stage of the talk test and the respiratory compensation point (both occurring at 4.3 Watts per kg).

Intensities above critical power would be signified by a negative ability to perform the talk test and RPE ratings of 7 and up (rising). This would not happen right away, but after only a short time this would indeed be the case. Meaning the activity was hard moving towards maximal.
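Putting RPE and the talk test together, the rules of thumb above could be sketched as a small Python function. The function name and the wording of the labels are mine. The RPE 2 to 4 and RPE 7 and up cut-offs come from the text, while the in-between band is my own assumption for anything the text does not explicitly cover.

```python
def estimate_intensity(rpe: int, talk_test: str) -> str:
    """Rough intensity estimate from a CR10 RPE rating and a talk test answer.

    talk_test is the answer to "can you talk comfortably?": "yes", "not sure" or "no".
    """
    if talk_test not in ("yes", "not sure", "no"):
        raise ValueError("talk_test must be 'yes', 'not sure' or 'no'")
    if talk_test == "yes" and rpe <= 4:
        return "likely below the gas exchange threshold"
    if talk_test == "no" and rpe >= 7:
        return "likely above critical power"
    # Assumption: everything else is treated as the middle ground.
    return "likely between the gas exchange threshold and critical power"


print(estimate_intensity(3, "yes"))
print(estimate_intensity(5, "not sure"))
print(estimate_intensity(8, "no"))
```

Treat this as a prompt for paying attention, not as a diagnostic. Both inputs are self-reported and, as noted above, the ratings rise over time when the work is unsustainable.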

To quickly summarize, I am not suggesting heart rate monitors are completely useless. They definitely have their uses. I have used them for many years myself. What I am recommending is that you use RPE and breathing rate/comfort as a better gauge of intensity than your heart rate monitor. A heart rate monitor can easily be used in conjunction with RPE and Talk Test, but should not be used alone.

Another topic I have not made a point of addressing in this article, but will briefly touch on now, is using chest strap heart rate monitors versus a watch heart rate monitor. Chest strap heart rate monitors that are connected to a watch or an external device are considered the most valid and representative measures of heart rate. Wrist watch heart rate monitors are notoriously less reliable.

How much less reliable you ask? “For heart rate, measurement error also varied by device brand. Apple Watch was within ±3% 71% (35/49) of the time, while Fitbit wearables were within ±3% 51% (36/71) of the time and Garmin wearables were within ±3% 49% (23/47) of the time. Despite similar ±3% measurement error rates, Fitbit appeared to underestimate heart rate more than Apple Watch and Garmin.” (6). In this study, the best watch heart rate monitor was within 3% of the actual heart rate 71% of the time, and the other watches were less than this.
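The percentages in that quote follow from the counts in parentheses. A quick Python check, using only the fractions quoted from reference (6), also shows the flip side: how often each brand fell outside ±3%.

```python
# (count within +/-3%, total count) per brand, as quoted from reference (6).
within_3_pct = {
    "Apple Watch": (35, 49),
    "Fitbit": (36, 71),
    "Garmin": (23, 47),
}

for brand, (hits, total) in within_3_pct.items():
    share = hits / total
    print(f"{brand}: {hits}/{total} = {share:.0%} within 3%, {1 - share:.0%} outside")
```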

“The only thing worse than no data is bad data”.

Imagine making your way through the wilderness using a broken compass, without understanding that it is broken or that it is not in complete alignment with reality, and so is not providing you with good guidance. Would it be better to have the broken compass, knowing it is not entirely accurate, or to not have access to the compass at all and be forced to rely on other information to navigate?

Now, what we’re talking about here is not nearly as dire as what I have portrayed above. Yet, having bad data means you have to knowingly overlook it, to forget about it, to not allow yourself to be influenced by it. You must not allow the bad data to impact your decision-making. This is much easier said than done.

This ends our discussion on heart rate monitors.

Michael

References:

(1) Running in Hilly Terrain: NIRS is More Accurate to Monitor Intensity than Heart Rate

(2) Physiological Demands of Downhill Mountain Biking

(3) Hierarchical Framework to Improve Individualized Exercise Prescription in Adults: A Critical Review

(4) Variability in Exercise Tolerance and Physiological Responses to Exercise Prescribed Relative to Physiological Thresholds and to Maximum Oxygen Uptake

(5) Relationship Between The Talk Test and Ventilatory Thresholds In Well-Trained Cyclists

(6) Reliability and Validity of Commercially Available Wearable Devices for Measuring Steps, Energy Expenditure, and Heart Rate: Systematic Review