Capacity Equals Performance... Doesn’t It?

By Phil Stevens, MEd, CPO, FAAOP

Most readers will be familiar with the parable of the turkeys that wanted to fly. Watching the sky and seeing the other birds soaring on the winds, they became increasingly envious. "We have wings and feathers, why shouldn't we fly?" So they talked to the eagle, one of the greatest fliers in the skies, and asked to be taught the secrets of flight. The eagle agreed and invited them to his home. Following the skilled instruction of their teacher, the eager turkeys learned quickly, lifting off from the ground and landing gently upon their return. By the end of the day, they were able to fly, soaring on the winds as they had seen the other birds do. The grateful turkeys eagerly thanked their instructor…and then walked home.


The story is striking because it plays on an assumption most of us hold: that capacity and performance are strongly related, and that improving one's abilities ultimately improves one's work or eventual accomplishments. For example, estimates suggest that the United States spends roughly 7 percent of its gross domestic product on youth education, buoyed by the assumption of a meaningful correlation between capacity and performance: that teaching and training our youth will ultimately result in sustained positive contributions to society in adulthood. While there are occasional exceptions, the investment in our youth generally pays off, as their increased capacity allows them to lead productive, successful lives.

However, in the realm of rehabilitation, recent clinical trials on occupational therapy to improve upper-limb function following stroke and prosthetic usage patterns observed in individuals with unilateral upper-limb amputations have begun to challenge the assumption that capacity is immediately related to performance.1,2 This article provides an examination of these clinical trials and their disruptive findings and considers their implications in the broader space of rehabilitative medicine. 

The Trial

A cohort of 78 individuals with unilateral hemiplegia secondary to ischemic or hemorrhagic stroke was recruited into the randomized trial. These individuals were all at least six months removed from their stroke event and presented with unilateral upper-limb weakness, characterized by mild to moderate functional motor capacity. Specifically, all subjects had some ability to open their affected hands and perform some basic grasping and lifting tasks on the standardized Action Research Arm Test (ARAT). (See sidebar below.)

The individuals were enrolled in a clinical trial examining various dosages of upper-limb task-specific practice in an occupational therapy setting and the ultimate impact of each dosage on the functional capacity of the paretic upper limb. The lowest tested dosage was 3,200 repetitions: tasks were performed 100 times per session, four sessions per week, for eight weeks, with a median of 13.6 hours of active practice (n = 19). Higher dosages of 6,400 and 9,600 repetitions were tested in separate cohorts, with the repetitions doubled and tripled (n = 21 and 21, respectively). A final cohort performed 300 repetitions per session, four sessions per week, but extended their training beyond the eight-week course of the trial, reaching an average of just under 33 hours of practice (n = 17).
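The dosage arithmetic above can be checked with a minimal sketch. The function name is hypothetical, and the 200-repetition session size for the middle cohort is an inference from the description of the doubled dosage rather than a figure stated in the article.

```python
def planned_repetitions(reps_per_session, sessions_per_week, weeks):
    """Total planned repetitions for a fixed-length training schedule."""
    return reps_per_session * sessions_per_week * weeks


# Four sessions per week for eight weeks, as described for the trial.
# 100 reps per session is stated; 200 and 300 are the doubled and tripled doses.
doses = {reps: planned_repetitions(reps, sessions_per_week=4, weeks=8)
         for reps in (100, 200, 300)}

print(doses)  # {100: 3200, 200: 6400, 300: 9600}
```

The fourth cohort, which trained beyond eight weeks, has no fixed total under this schedule and is left out of the sketch.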

Action Research Arm Test (ARAT)

The Action Research Arm Test (ARAT) is an evaluation measure specifically designed to assess upper-limb function among individuals with neurologic hemiplegia. It is well established, first described in 1981.1 The test assesses an individual's ability to manipulate objects of differing sizes, weights, and shapes. More specifically, the ARAT consists of 19 performance items grouped into the four subscales of grasp, grip, pinch, and gross movement.

Within each subscale, the items are ordered according to difficulty. Under the assumption that successful completion of the most difficult task indicates a subject would perform the easier tasks with similar efficiency, participants are asked to perform the most difficult task in each subscale first. If they obtain the maximal score for this item, the maximal score for the entire subscale is awarded and the evaluator moves on to the next subscale. Should participants fail to obtain the maximal score, they must perform the easiest task. If they fail at this task, the intermediate items are not tested. Rather, the entire subscale is scored as a zero. Partial completion of the easiest task allows participants to move on to the remaining items in the subscale.

Each item is scored on completion and timing as follows:
0 = cannot perform any part of the test
1 = performs the test partially
2 = completes the test, but takes an abnormally long time
3 = performs the test normally

The total score on the ARAT ranges from 0 to 57, with higher scores indicating better performance. The time required depends upon proficiency, with many subjects tested in 7-10 minutes but some requiring as long as 20 minutes.
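The shortcut rules and the 0-3 item scale described in this sidebar can be expressed as a short sketch. The function name, argument names, and the example participant are hypothetical. The per-subscale item counts (6, 4, 6, and 3) come from the standard ARAT and are not listed in this sidebar. This is an illustration of the rules as described here, not a clinical scoring tool.

```python
MAX_ITEM_SCORE = 3

# Items per subscale in the standard ARAT (an addition; not stated in the sidebar).
SUBSCALE_ITEMS = {"grasp": 6, "grip": 4, "pinch": 6, "gross movement": 3}


def score_subscale(n_items, hardest, easiest=None, intermediate=()):
    """Score one subscale using the shortcut rules described in the sidebar."""
    if hardest == MAX_ITEM_SCORE:
        # Full marks on the hardest item: the whole subscale gets full marks.
        return MAX_ITEM_SCORE * n_items
    if easiest is None:
        raise ValueError("the easiest item must be tested when the hardest scores below 3")
    if easiest == 0:
        # Failing the easiest item: intermediate items are skipped, subscale is zero.
        return 0
    if len(intermediate) != n_items - 2:
        raise ValueError("all intermediate items must be scored")
    return hardest + easiest + sum(intermediate)


# Hypothetical participant, for illustration only.
scores = {
    "grasp": score_subscale(SUBSCALE_ITEMS["grasp"], hardest=3),
    "grip": score_subscale(SUBSCALE_ITEMS["grip"], hardest=1, easiest=3,
                           intermediate=(2, 2)),
    "pinch": score_subscale(SUBSCALE_ITEMS["pinch"], hardest=0, easiest=0),
    "gross movement": score_subscale(SUBSCALE_ITEMS["gross movement"], hardest=2,
                                     easiest=3, intermediate=(3,)),
}
total = sum(scores.values())
max_total = MAX_ITEM_SCORE * sum(SUBSCALE_ITEMS.values())

print(scores)            # {'grasp': 18, 'grip': 8, 'pinch': 0, 'gross movement': 8}
print(total, max_total)  # 34 57
```

With 19 items each scored from 0 to 3, the maximum of 57 follows directly.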

1. Lyle, R. C. 1981. A performance test for assessment of upper limb function in physical rehabilitation treatment and research. International Journal of Rehabilitation Research 4:483-92.

The Outcomes

Outcomes in this trial can be presented in the two separate constructs of capacity and ultimate performance. To evaluate capacity, the ARAT was administered. Performance was measured with bilateral, wrist-worn accelerometers, recording the accelerations of both upper limbs along three axes and recording discrete activity counts. These accelerometers were worn once per week for 26 hours throughout the clinical trial, at the conclusion of the trial, and at a two-month follow-up.

The first two hours of the 26-hour period were discarded, removing the monitored activity recorded during the treatment sessions and the return trip home. Of interest was the 24-hour period once the subjects got home. The waterproof design of the accelerometers allowed them to be worn through the entire 24-hour period.

A range of variables were collected. For example, the total hours of use with the paretic arm divided by the total hours of use of the sound arm was represented by the use ratio, which quantified the contribution of the affected arm relative to the sound limb. Among neurologically intact adults, this use ratio was found to be 0.95, suggesting nearly equal amounts of use for both upper limbs throughout the 24-hour period. Numbers less than one would indicate increased use of the sound arm relative to the paretic arm. 
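As a rough illustration of the use ratio, the sketch below drops the first two hours of a 26-hour recording and divides paretic-arm hours of use by sound-arm hours of use. The article does not describe how hours of use were derived from the raw accelerometer counts, so the per-hour "active seconds" input, the function names, and the sample values are all assumptions made for the example.

```python
def hours_of_use(active_seconds_per_hour, discard_hours=2):
    """Sum active time after discarding the first hours of the recording."""
    kept = active_seconds_per_hour[discard_hours:]
    return sum(kept) / 3600.0


def use_ratio(paretic_hours, sound_hours):
    """Hours of paretic-arm use divided by hours of sound-arm use."""
    if sound_hours <= 0:
        raise ValueError("sound-arm hours of use must be positive")
    return paretic_hours / sound_hours


# Hypothetical 26-hour recordings: seconds of activity in each hour.
paretic = [1800] * 2 + [600] * 24
sound = [1800] * 2 + [1200] * 24

paretic_hours = hours_of_use(paretic)  # 4.0
sound_hours = hours_of_use(sound)      # 8.0
print(use_ratio(paretic_hours, sound_hours))  # 0.5
```

A ratio of 0.5 in this made-up case would mean the paretic arm was active for half as long as the sound arm, well below the 0.95 reported for neurologically intact adults.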

The Observations

With regard to capacity, ARAT scores improved with training, from an average of 32.4 at baseline to 36.9 at the trial's conclusion. Notably, these improvements were largely retained, with average ARAT scores of 35.9 recorded at the post-intervention follow-up. Unfortunately, measured activity values did not reflect these improvements in capacity. The average use ratio of 0.66 observed at baseline didn't change appreciably, nor did the measured hours of use for the paretic limb, initially reported at 4.73 hours. More complex accelerometry variables accounting for movement intensity, accelerations, and magnitudes all told similar stories. While capacity increased with training, measured performance was unchanged.

To better illustrate this observation, the authors of the trial created density plots for each individual in which the y-axis represented movement intensity and the x-axis indicated the relative contributions of the two extremities. Equivalent contributions were plotted at x = 0, with negative values reflecting dominant activity of the sound extremity and positive values reflecting dominant activity of the paretic hand.

For neurologically sound controls, these plots would appear as symmetrical Christmas trees centered at the x = 0 line. For study subjects, the trees were decidedly skewed, with fuller, disproportionately offset plots to the left and comparatively sparse plots on the right. 

The authors shared these plots for three individuals representing measured extremes. One subject demonstrated an atypical increase of 18 points in his ARAT score over the course of treatment. Another demonstrated a more representative improvement of ten points while the ARAT scores of the final subject increased by a mere three points. Despite these disparate changes in capacity, the density plots of measured activity at baseline, conclusion, and follow-up were largely unchanged, a collection of lopsided Christmas trees, all skewed to the left.1 

The Implications

With this data in hand, the authors concluded simply that "upper-limb task-specific training, designed to improve upper-limb capacity in the clinic, may be unable to improve upper-limb performance in daily life."1 This is in stark contrast with the reasonable assumption that improvements in capacity would translate to improvements in performance. In other words, despite extensive efforts in flight training, the students continued to walk home. As the authors would later clarify, "it is striking that not one person changed upper-limb performance after this carefully delivered intervention."1

With the increased proliferation of wearable activity monitors, the findings of this trial suggest that those involved in rehabilitation may soon find their reasonable assumptions disrupted by data measured in day-to-day existence, activity, and participation. 

Similar Findings in Upper-limb Prosthetics

Somewhat similar observations were recently published in a cross-sectional analysis within the field of upper-limb prosthetics.2 In this study, 20 individuals with unilateral upper-limb absence at the transradial level, all of whom used single-degree-of-freedom myoelectric hands, were recruited. This was a reasonably experienced cohort, with an average of 25 years since their amputations and an average of 20 years since their initial prescriptions of myoelectric prostheses.

As with the stroke trial, these individuals were asked to wear tri-axial accelerometers on both wrists, but only over a single seven-day period, with the resultant data used to characterize their relative reliance on their sound limbs over their prostheses. Unique to this study was the additional measurement of prosthetic wear time.

In this case, capacity was assessed only once, using a very rigorous analysis of a single task. The cylinder task required the user to grasp a cylinder, lift and rotate it 90 degrees to the horizontal, and place it inside a horizontal tube. Performance on this task was assessed using a range of sensors, including an electronic goniometer attached to the proximal knuckle of the index finger of the prosthetic hand to measure hand aperture, an inertial measurement unit (IMU) to measure forearm motion, and a head-mounted eye tracker to capture gaze behavior during the activity.

Capacity was reported in terms of task success, or the number of times prosthesis users successfully completed the task out of ten attempts; task duration, or the mean duration of successful attempts (measured in seconds); "delay plateau" and "reach plateau," characterizing the timing and extent of hand movements; and "gaze patterns," or assessments of how much of the users' gaze was directed at their prosthetic hands compared to looking ahead to plan the next portion of the task.
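The first two capacity measures above are simple to compute, as the sketch below shows: task success is a count of successful attempts out of ten, and task duration is the mean time across successful attempts only. The attempt data and function names are hypothetical and are not drawn from the study.

```python
from statistics import fmean

# Hypothetical results for ten attempts: (succeeded, duration in seconds).
attempts = [
    (True, 4.0), (True, 5.0), (False, 9.5), (True, 6.0), (True, 5.0),
    (True, 4.0), (False, 11.0), (True, 6.0), (True, 5.0), (True, 5.0),
]


def task_success(attempts):
    """Number of attempts completed successfully."""
    return sum(1 for succeeded, _ in attempts if succeeded)


def task_duration(attempts):
    """Mean duration of successful attempts, or None if none succeeded."""
    durations = [seconds for succeeded, seconds in attempts if succeeded]
    if not durations:
        return None
    return fmean(durations)


print(task_success(attempts))   # 8
print(task_duration(attempts))  # 5.0
```

Failed attempts are excluded from the duration, so a slow, abandoned attempt lowers the success count without inflating the mean time.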

With respect to measured performance, as with the stroke subjects, prosthesis users demonstrated high reliance on their sound-side limbs. While the color-based plotting construct used to present this data was entirely distinct from that in the stroke study, the results were similar, with plots skewed toward reliance on the sound-side limb. As with the stroke subjects, no correlations were observed between users' proficiency on the cylinder task and the measures of everyday activity. Echoing the authors of the stroke study, the authors of this second study concluded, "A simple assumption might be that the better a prosthesis user performs [in] the lab, the more likely they are to use the prosthesis to perform everyday tasks. However, we found no evidence linking our measures of user performance with usage." Restated, they observed no correlation between measured capacity and measured performance.

Of some interest, no correlation was observed between prosthetic wear time and measured usage, a finding that suggests that the perceived value of a prosthesis is not confined to its measured use, perhaps reflecting constructs of aesthetic value along with body symmetry and wholeness. Also unique to this analysis was a consideration of prosthetic experience and reliance on the sound-side limb. Notably, a negative correlation was observed between the two, meaning those individuals who had worn their prostheses longer were less reliant on their sound-side limbs.

What's Next?

The rehabilitation community will need to consider this emerging relationship between what individuals are able to do and what they choose to do. Performance, defined in this case as the utilization of a paretic limb or limb prosthesis in one's home environment, is unlikely to occur in the absence of capacity, or the basic competence to engage the target extremity in meaningful activity. But it appears that capacity may not be enough. Additional constructs of behavioral interventions or environmental modifications may need to be explored to ultimately influence performance in a more meaningful way. 

Phil Stevens, MEd, CPO, FAAOP, is a director with Hanger Clinic's Department of Clinical and Scientific Affairs. He can be contacted at 


1. Waddell, K. J., M. J. Strube, and R. R. Bailey, et al. 2017. Does task-specific training improve upper limb performance in daily life post-stroke? Neurorehabilitation and Neural Repair 31(3):290-300.

2. Chadwell, A., L. Kenney, and M. H. Granat, et al. 2018. Upper limb activity in myoelectric prosthesis users is biased towards the intact limb and appears unrelated to goal-directed task performance. Scientific Reports 8(1):11084.