Journal of Rehabilitation Research & Development (JRRD)

Volume 51 Number 3, 2014
   Pages 391–400

Anomia treatment platform as behavioral engine for use in research on physiological adjuvants to neurorehabilitation

Diane Kendall, PhD;1–2* Anastasia Raymer, PhD;2–3 Miranda Rose, PhD;4–5 JoEllen Gilbert, MS;2,6 Leslie J. Gonzalez Rothi, PhD2,7

1Department of Speech and Hearing Sciences, University of Washington, Seattle, WA; 2Brain Rehabilitation and Research Center, Malcom Randall Department of Veterans Affairs Medical Center, Gainesville, FL; 3Department of Communication Disorders and Special Education, Old Dominion University, Norfolk, VA; 4School of Human Communication Sciences, La Trobe University, Melbourne, Australia; 5Clinical Centre for Research Excellence in Aphasia Rehabilitation, University of Queensland, Queensland, Australia; 6Department of Neurology Research, University of Florida/Shands Neuroscience Center, Jacksonville, FL; 7The Bob Paul Family Professor of Neurology, University of Florida, Gainesville, FL

Abstract: The purpose of this study was to create a "behavioral treatment engine" for future use in research on physiological adjuvants in aphasia rehabilitation. We chose anomia as the behavioral target because it is a feature displayed by many persons who have aphasia. Further, we wished to saturate the treatment approach with many strategies and cues that have been empirically reported to have a positive influence on aphasia outcome, with the goal being to optimize the potential for positive response in most participants. A single-subject multiple baseline design with replication across eight participants was employed. Four men and four women, with an average age of 62 yr and an average of 63.13 mo poststroke onset, served as participants. Word-retrieval treatment was administered 3 d/wk, 1 h/d for a total of 20 treatment hours (6–7 wk). Positive acquisition effects were evident in all eight participants (d effect size [ES] = 5.40). Treatment effects were maintained 3 mo after treatment termination for five participants (d ES = 2.94). Generalization within and across semantic categories was minimal (d ES = 0.43 within and 1.09 across). This study demonstrates that this behavioral treatment engine provides a solid platform on which to base future studies in which various treatment conditions are manipulated and pharmacologic support is added.

Key words: adjuvant, anomia, aphasia, behavioral treatment engine, language, neurorehabilitation, pharmacology, rehabilitation, speech-language pathology, stroke.

Abbreviations: AQ = Aphasia Quotient, BNT = Boston Naming Test, CIU = correct information unit, df = degrees of freedom, ES = effect size, ICC = intraclass correlation, RR&D = Rehabilitation Research and Development, SAQOL = Stroke and Aphasia Quality of Life (scale), SD = standard deviation, VA = Department of Veterans Affairs, WAB = Western Aphasia Battery.
*Address all correspondence to Diane Kendall, PhD; University of Washington, Speech and Hearing Sciences, 1417 42nd St NE, Seattle, WA 98105; 206-685-7482; fax: 206-543-1093. Email:

Nadeau and Wu suggest that when considering the combination of drug and behavioral treatment in neurorehabilitation, the rationale for use of a drug is to promote reactive plasticity in the mature central nervous system, thereby promoting normal learning mechanisms [1]. In contrast, they suggest that the behavioral therapy in these combinations might be referred to as a "behavioral engine" designed to provide the substantive experience with the knowledge to be learned. An optimal behavioral engine must produce a notable effect, yield that effect in a broad spectrum of potential participants, and be replicable across sites and therapists with reasonable fidelity. With a behavioral engine as a base, the effect of an additional adjunctive agent such as a drug might be more explicitly identified.

Language therapy has been noted to effect substantial gains in those living with chronic aphasia [2], especially when therapy is delivered as frequently as three times per week [3] and for a total of at least 8 h [4]. Because of the prevalence of word-retrieval impairments (anomia) among individuals with aphasia, much treatment research has centered on identifying effective treatments for anomia, usually in the context of picture-naming paradigms. Effects are usually strong for improved naming of trained words [5–6], with some maintenance of training effects lasting several months after treatment completion. In naming treatments for aphasic word-retrieval deficits, however, there is conflicting evidence concerning the degree of generalization to untreated items and contexts [7–10]. Some recent studies have shown that generalization may be seen to untreated items within trained semantic categories [11–12].

Several studies of aphasia resulting from stroke have endorsed that clinical outcomes might also be enhanced by various pharmacotherapies [13–17]. Nadeau and Wu (2006) state that "pairing a physiological agent with a behavioral therapy will be a key paradigm in neurorehabilitation . . .," noting that pairing the two ". . . might accelerate the acquisition of knowledge during therapy" [1, p. 108]. Thus, the purpose of this study was to create an anomia "behavioral engine" for future use in research on physiological adjuvants in aphasia rehabilitation.

Our strategy was to choose a behavioral target that was a feature displayed by most persons with aphasia of any type or severity, in this case, anomia. Further, we wished to saturate the treatment approach with many strategies and cues that have been empirically reported to positively influence aphasia outcome, with the goal being to optimize the potential for positive response in most participants. These evidence-based strategies include semantic and phonologic cues [18–19], orthographic labels [20], repetition [21], and delayed recall/spaced retrieval training [22]. In this study, we were concerned with response to treatment not only in highly constrained experimental naming tasks, but also in more ecologically valid measures, such as conversation and the participants' evaluation of possible effects of the treatment on their quality of life. We describe the specifics of this approach and the initial safety and indications of effects in a phase one study of eight participants. The following research questions were asked: (1) Is this treatment able to improve word retrieval in individuals with aphasia? (2) Is the treatment effect maintained after treatment termination? and (3) Does treatment generalize to untreated stimuli and untreated contexts?

Methods

Participants were recruited through the Department of Veterans Affairs (VA) Rehabilitation Research and Development (RR&D) Brain Rehabilitation and Research Center, Gainesville, Florida. Four men and four women, with an average age of 62 yr (standard deviation [SD] 9.65) and an average of 63.13 (SD 44.31) mo post stroke onset, served as participants. All participants had experienced a single left hemisphere stroke (documented with either computed tomography or magnetic resonance imaging) and were 6 mo or more poststroke, right-handed, and monolingual English speaking. Exclusion criteria included significant apraxia of speech; self-reported history of depression or other psychiatric illness (unless successfully treated); or history of degenerative neurological illnesses, chronic medical illness, or substantial impairment in vision or hearing. Table 1 lists the relevant participant demographic information.

To determine appropriateness for this study, participants demonstrated aphasia (Western Aphasia Battery [WAB] quotient <93.8) [23], word-retrieval deficits as determined by a score of <45 on the Boston Naming Test (BNT) [24], and no more than mild-moderate apraxia of speech as documented by the Apraxia Battery for Adults [25]. The reading subtest on the WAB was administered to quantify the nature and presence of alexia. The Psycholinguistic Assessments of Language Processing in Aphasia-53 [26] was administered to determine the presence of a possible predominant semantic versus phonologic level impairment underlying the word-retrieval deficit. Working memory was assessed using digits forward and backward. Nonverbal problem solving was assessed using Raven's Progressive Matrices [27]. Participants also completed the Stroke and Aphasia Quality of Life (SAQOL) scale [28].

Treatment Procedures

Treatment was administered 3 d/wk, 1 h/d for a total of 20 treatment hours (6–7 wk). A treatment incorporating semantic, phonologic, repetition, and orthographic cues was constructed, with the addition of a delayed-recall step. Treatment procedures were as follows. The picture was shown to the participant, who was prompted to name it (e.g., for a picture of a blouse, "What is this called?"). Whether or not the picture was correctly named, each subsequent step was completed. The picture was then shown with the written name and the participant was asked to name the picture while keeping the written word in view (e.g., "Now can you tell me what this is called?"). The therapist then said the name of the picture and the participant was asked to repeat the name (e.g., "Right, it’s a blouse. Say blouse."). Following a 3 s delay, the participant was asked to say the name again (e.g., "Keep it in mind for a few seconds. What is it called?"). Semantic features of the picture were then provided by the therapist and the participant was prompted once again to name the picture (e.g., "It has a collar and lace. You can button it. A woman wears it. What is it?"). The therapist then said the number of syllables in the word and the initial phonemes and the participant named it (e.g., "It has one syllable and starts with /bl/. What is it?"). The therapist then said the name and the participant repeated it (e.g., "Right. It’s a blouse. Say blouse."). Following a 3 s delay, the participant was asked to say the name again (without repetition from the therapist) (e.g., "One more time, what is this called?"). The therapist then moved on to the next item, following the same procedure.
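The sequence of steps described above can be written down as an ordered list, which may help when checking treatment fidelity across therapists. The following is only a sketch: the cue-type labels, the `STEPS` name, and the `run_item` and `respond` functions are illustrative assumptions and are not part of the published protocol; the prompts are the "blouse" examples from the text.

```python
# Ordered cueing steps for one picture, using the "blouse" prompts from the text.
# The cue-type labels are our own shorthand (an assumption), not the authors' terms.
STEPS = [
    ("confrontation naming", "What is this called?"),
    ("orthographic cue", "Now can you tell me what this is called?"),
    ("repetition", "Right, it's a blouse. Say blouse."),
    ("delayed recall (3 s)", "Keep it in mind for a few seconds. What is it called?"),
    ("semantic cue", "It has a collar and lace. You can button it. "
                     "A woman wears it. What is it?"),
    ("phonologic cue", "It has one syllable and starts with /bl/. What is it?"),
    ("repetition", "Right. It's a blouse. Say blouse."),
    ("delayed recall (3 s)", "One more time, what is this called?"),
]


def run_item(respond):
    """Administer every step in order, regardless of earlier accuracy.

    `respond` is a hypothetical callback that takes a prompt and returns
    True if the participant produced the target name.
    """
    return [(cue, respond(prompt)) for cue, prompt in STEPS]


# Hypothetical participant who succeeds only when asked to repeat the word.
log = run_item(lambda prompt: "Say blouse" in prompt)
for cue, correct in log:
    print(f"{cue}: {'correct' if correct else 'incorrect'}")
```

Because every step is completed whether or not the picture is named, the log always has one entry per step.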

Treatment Stimuli and Probe Task

The daily probe task included picture naming of 80 words that participants were unable to name in preliminary testing. Stimuli were black and white line drawings selected from 150 nouns distributed across six semantic categories (clothing, body parts, household items, animals, transportation, and school). The MRC Psycholinguistic Database was used to determine Kucera-Francis written frequencies, Thorndike-Lorge written frequencies, imageability, concreteness, and age of acquisition ratings of each noun. Semantic relationships were selected from the University of South Florida Word Association Norms and the Edinburgh Associative Thesaurus.

To determine the treatment stimuli for each participant, individuals were initially asked to name all 150 pictures. Responses were scored as correct or incorrect, and 80 pictures from four categories were chosen (20 items in each of three categories for training and 20 items in one untrained control category, with psycholinguistic characteristics balanced across categories). Within each trained category, 15 items were administered in training, and the other five words served as untrained within-category generalization probes. During the treatment phases, probe data were collected during each of the daily treatment sessions on two of the four lists, rotating lists 1 and 3 and lists 2 and 4. During training, stimuli in list 1 were treated first to criterion (90% accuracy over three treatment sessions), followed by lists 2 and 3 in succession. List 4 included the untreated control stimuli.
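The criterion rule for moving from one list to the next can be illustrated with a short check. This is a minimal sketch, assuming that "over three treatment sessions" means three consecutive sessions at or above 90% accuracy; the function name and the sample scores are hypothetical.

```python
def reached_criterion(accuracies, threshold=0.90, run_length=3):
    """Return the 1-based session at which criterion is first met, or None.

    Assumption: criterion means `run_length` consecutive sessions with a
    proportion correct at or above `threshold`.
    """
    streak = 0
    for session, accuracy in enumerate(accuracies, start=1):
        streak = streak + 1 if accuracy >= threshold else 0
        if streak == run_length:
            return session
    return None


# Hypothetical proportion-correct scores on the 15 trained items of one list.
example = [0.40, 0.67, 0.93, 0.87, 0.93, 0.93, 1.00]
print(reached_criterion(example))  # 7: sessions 5, 6, and 7 meet the threshold
```

A dip below threshold (session 4 in the example) resets the count under this reading of the rule.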

Experimental Design

A single-subject multiple baseline design with replication across eight participants was employed to allow for careful individual analysis of treatment response in the daily probe task as well as group response to other outcome measures. During the baseline phase, all 80 items from lists 1, 2, and 3 (treatment lists) and 4 (control) were probed 10 times to establish a stable baseline level of performance. During the course of the 20 treatment sessions, 10 probes of each list were taken (40 of the 80 total probe words per day). Probing of untreated items allowed analysis of possible generalization effects during the treatment phase. Where generalization did not occur and probe items remained stable, experimental control was demonstrated; that is, treatment specific effects could be demonstrated rather than effects from general stimulation or extraneous factors. The treatment phase was followed by four sessions of posttesting in which the repeated probes were administered. Follow-up testing occurred at 3 mo after treatment termination, and probes were administered four times.
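The probe counts given above follow from the rotation of list pairs, as the sketch below shows. It assumes the two pairs (lists 1 and 3, lists 2 and 4) strictly alternate from session to session, starting with lists 1 and 3; the constant and function names are illustrative.

```python
from collections import Counter

LIST_SIZE = 20                 # items per list (three trained lists, one control)
ROTATION = [(1, 3), (2, 4)]    # list pairs probed on alternating sessions


def probe_schedule(n_sessions=20):
    """Return the pair of lists probed at each treatment session.

    Assumption: the two pairs simply alternate, starting with lists 1 and 3.
    """
    return [ROTATION[i % len(ROTATION)] for i in range(n_sessions)]


schedule = probe_schedule()
probes_per_list = Counter(lst for pair in schedule for lst in pair)
words_per_day = len(ROTATION[0]) * LIST_SIZE

print(dict(probes_per_list))   # each list is probed 10 times over 20 sessions
print(words_per_day)           # 40 of the 80 probe words per session
```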

Outcome Measures

The primary outcome measure was the daily picture-naming probe task. Several standardized aphasia tests and communication measures were also included. All picture-naming probes and standardized assessments were audiotaped using a digital recorder. The examiner conducted the scoring online during the session, and this scoring was also later judged by a trained rater blind to the time of testing. The picture-naming probe data were scored incorrect if productions included semantic or phonologic substitutions. Speech distortion errors were scored as correct. Intra- and interrater reliability was assessed using intraclass correlations (ICCs) computed for 20 percent of the repeated probe data. The percentage of the participants’ correct responses was graphed for analysis. The data were then analyzed visually and statistically.

Visual Analysis

Visual analysis of picture-naming probe data was completed by three judges, all speech-language pathologists with at least 3 yr of experience judging data via visual inspection, who had no knowledge of the purpose of the study or the nature of the treatment. Each independently judged the stability of the baseline phases for each participant and then considered the relative slope and height of the data displays during the treatment phase.

Statistical Analysis

Repeated probe data were analyzed in terms of effect sizes (ESs) [29], comparing mean scores in the four posttreatment probes to mean scores at baseline relative to baseline SDs as follows: $\mathrm{ES} = (\mathrm{Mean}_{\text{posttreatment}} - \mathrm{Mean}_{\text{baseline}})/\mathrm{SD}_{\text{baseline}}$. Where a baseline had an SD of 0, a pooled ES was calculated using the following formula: $d_2 = (\mathrm{Mean}_{\text{posttreatment}} - \mathrm{Mean}_{\text{baseline}})/\mathrm{SD}_{\text{pooled}}$. ESs >2.6 were considered positive, and those >5.8 were considered large [5]. A group-weighted ES was calculated using the procedures described by Beeson and Robey [5].
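The individual effect size calculation can be sketched as follows. The text does not define the pooled SD, so the version here assumes it combines the baseline and posttreatment sample variances weighted by their degrees of freedom; the function names and sample data are hypothetical, and the group-weighted ES of Beeson and Robey is not reproduced.

```python
from statistics import mean, stdev


def effect_size(baseline, post):
    """d = (mean_post - mean_baseline) / SD_baseline.

    Falls back to a pooled SD when the baseline SD is 0. Assumption: the
    pooled SD combines the two sample variances weighted by their degrees
    of freedom; the article does not spell out its formula.
    """
    diff = mean(post) - mean(baseline)
    sd = stdev(baseline)
    if sd == 0:
        n1, n2 = len(baseline), len(post)
        pooled_var = ((n1 - 1) * sd**2 + (n2 - 1) * stdev(post)**2) / (n1 + n2 - 2)
        sd = pooled_var ** 0.5
    if sd == 0:
        raise ValueError("no variability in either phase; d is undefined")
    return diff / sd


def classify(d):
    """Apply the benchmarks quoted in the text (>2.6 positive, >5.8 large)."""
    if d > 5.8:
        return "large"
    if d > 2.6:
        return "positive"
    return "below benchmark"


# Hypothetical numbers of trained items named correctly (not study data).
d = effect_size(baseline=[1, 2, 3], post=[6, 7, 8, 7])
print(d, classify(d))  # 5.0 positive
```

Dividing by the baseline SD makes d very sensitive to baseline stability, which is one reason a flat baseline (SD of 0) needs the separate pooled formula.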

Standardized Aphasia Tests and Communication Measures

Standardized tests (WAB, BNT, SAQOL) were readministered at treatment completion and again at 3 mo after treatment completion (Table 2). Changes in performance on the tests from pretreatment to posttreatment and maintenance were examined relative to the standard error of measurement of each test.

Table 2. 
* Pre1 = before treatment initiation, Post1 = immediately after treatment termination, Post 3 = 3 mo after treatment termination.

To determine effects of treatment generalization to untrained linguistic contexts, discourse production was collected through a standard set of interview questions, picture description [30], and Cinderella story retell. The discourse samples were transcribed and randomized for coding of parameters related to word retrieval. Two examiners who were blind to treatment conditions (baseline, posttreatment, maintenance) analyzed each sample, first removing extraneous words and repairs using the rules of the Quantitative Production Analysis [31]. Transcriptions were then coded for the presence of several parameters, including (1) correct information units (CIUs) [30], which refer to words in the sample that are appropriate to the topic and informative to the context; and (2) nouns, pronouns, vague nouns [32] (nominals that convey little concrete information, e.g., thing, kinds), and specific nouns (substantive, concrete nouns). Discrepancies in coding were resolved by consensus through consultation with a third examiner. For each parameter, we calculated the proportion of instances relative to the total number of words in the sample. We used paired samples t-tests to statistically analyze changes in the group from pretreatment to posttreatment in the standardized measures and the discourse samples. In addition, we evaluated changes made by each participant individually to determine which changes were greater than the standard error of measurement for that instrument. Finally, we analyzed relationships among variables.
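The paired-samples comparison described above reduces to a t statistic computed on each participant's pre-to-post difference. The sketch below uses only the Python standard library and returns the statistic and its degrees of freedom; the function name and the sample proportions are hypothetical, and the p value would still need to be looked up in a t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test on post minus pre."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [b - a for a, b in zip(pre, post)]
    se = stdev(diffs) / sqrt(len(diffs))   # standard error of the mean difference
    if se == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / se, len(diffs) - 1


# Hypothetical proportions of vague nouns for eight participants (not study data).
pre = [0.08, 0.06, 0.10, 0.07, 0.09, 0.05, 0.11, 0.08]
post = [0.05, 0.05, 0.07, 0.06, 0.06, 0.04, 0.08, 0.07]
t, df = paired_t(pre, post)
print(round(t, 2), df)
```

With eight participants the test has 7 degrees of freedom, and a negative t here would indicate fewer vague nouns after treatment.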

Results

Reliability of scoring for the daily picture-naming probe measures was acceptable. ICC assessing intrarater reliability was 0.988 and assessing interrater reliability was 0.931.

Primary Outcome Results

Research question 1 (treatment effects) was addressed by analysis of confrontation naming performance on 15 trained stimuli per list immediately following treatment termination (acquisition). Results are shown in Appendixes 1–3 (available online only). Table 3 summarizes the treatment outcomes. Results show that positive acquisition effects were evident in all eight participants, including three with large ESs. Weighted d ES average for the group was 5.40 (SD 2.20), representing a medium-large effect for acquisition. Visual inspection analysis results showed evidence of acquisition in 16/18 total lists trained across the eight participants.

Research question 2 (maintenance) was addressed by analysis of responses to the maintenance phase probes. All eight participants had a positive maintenance effect for picture naming (Table 3). Weighted d ES average for the group was 2.94 (SD 1.49), representing a small-moderate maintenance of training effects. Visual inspection showed evidence of maintenance in 11/18 total lists trained across all eight individuals.

Secondary Outcome Results

The secondary aim of this study investigated effects of treatment generalization on untreated stimuli within and across semantic categories (Table 3). Regarding within semantic category generalization, the weighted d ES average for the group was 0.43 (SD 1.44) at acquisition and 4.66 (SD 8.10) at maintenance. Two individuals who did not show an immediate generalization effect eventually showed that effect at the 3-month maintenance probe. Visual inspection showed evidence of within semantic category generalization in only 1/18 total lists trained across all eight individuals.

Regarding across semantic category generalization, the weighted d ES average for the group was 1.09 (SD 0.84) (maintenance). Only one individual demonstrated a small across category generalization effect. Likewise, visual inspection showed evidence of across semantic category generalization in only 1/8 total lists trained across the eight individuals.

Standardized Pre- and Posttest Results

The results of standardized testing are displayed in Tables 2 and 4. We performed paired samples t-tests to evaluate changes in performance from pretreatment to immediate posttreatment and from pretreatment to 3 mo posttreatment to determine whether changes were significant and whether those changes were lasting. The only significant improvements identified were for scores on the WAB. The mean Aphasia Quotient (AQ) pretreatment was 74.45 and significantly improved to a mean of 79.35 immediately following treatment, t = 3.09, degrees of freedom (df) = 7, p = 0.02. Examining scores relative to the standard error of measurement of the WAB, four participants (S004, S005, S006, S010) demonstrated improvement from pretreatment to immediate posttreatment. The WAB improvement from the pretreatment score (74.45) to the mean at 3 mo after treatment completion (77.89) also represented a significant difference, t = 3.67, df = 7, p = 0.008, indicating treatment changes were maintained. Only two individuals (S002, S009) had scores that demonstrated improvement at 3 mo posttreatment relative to baseline. There were no significant differences on the WAB from treatment completion to 3 mo after treatment completion, t = 0.52, df = 7, p = 0.52. No significant changes were evident for the group on the other standardized measures (BNT, SAQOL), however.

The results of discourse production total words for the group across the total interview, picture description, and Cinderella story are shown in Table 5. A significant difference from pretreatment to immediate posttreatment was evident only for proportion of vague nouns, t = 4.854, df = 7, p = 0.002, d = 0.74, because the participants used fewer vague nouns following treatment. This effect did not last to the 3 mo posttreatment observation, however, t = 0.282, df = 7, p = 0.79. No other significant changes in discourse measures were noted for CIUs, pronouns, or specific nouns.

Discussion

The purpose of this study was to create an anomia therapy behavioral engine for future use in research on physiological adjuvants or behavioral treatment combinations in aphasia rehabilitation. This treatment engine could also serve as a basis for future work focused on systematic variation of a treatment package that is typical of phase II treatment research platforms. Regarding safety and feasibility, all eight participants who entered into this protocol finished without incident. We effectively ran this study to completion. There was evidence of treatment effect (i.e., proof of concept) because behavioral adaptation occurred as a direct result of the treatment in all seven individuals with mild-moderate aphasia and five participants were able to maintain these effects 3 mo after treatment termination. As one of the reviewers of this article noted:

. . . The limited therapeutic effectiveness. . . is actually an advantage for use of the protocol as a "behavioral engine." The intervention produces a result that rises above the minimum threshold of a treatment effect. However, it is small enough to avoid problems with ceiling effects; there is a great deal of room left for improvement through the use of an effective additional treatment component. Oppositely, if a treatment component were to have an adverse effect it could be detected by a decrement in treatment improvement. In short, in order for a "behavioral engine" to be of value it would have to have a "Goldilocks" effect—not too large and not too small. The treatment protocol presented here has that characteristic.

Because there were only eight participants in this study, who were distributed over a range of deficits, the results need to be interpreted with recognition of their limited power. The participant with severe aphasia (S002: AQ 36.7) was an outlier in the group and showed only minimal benefit of the treatment. An important benefit of the single-participant research design incorporated in this experiment, especially at this stage of research development of the word-retrieval treatment behavioral engine, is that data allow researchers to determine not only who benefits from a treatment, but also who does not benefit, as would be the case with S002. As the evolution of treatment research moves forward toward rigorous group designs examining the effects of this word-retrieval behavioral engine, individuals with severe aphasia may be excluded from participation, because they may warrant a different treatment.

Little generalized naming improvement was evident for untrained items. Only three individuals generalized within (S007 and S008) or across (S004) semantic categories. If generalization were to occur, we predicted it would occur within a category. Further, the greatest generalization evident in the aphasia treatment literature has been shown when training atypical category exemplars [11–12,33], as compared with more typical category examples. Because we did not systematically control the type of relationship training items had to untrained items within the category, it is not surprising that within category generalization was limited: our category items may have represented words that were the most typical of the various categories incorporated in the experiment.

Regarding the single subject who showed across semantic category generalization (S004), he appeared to be the most cognitively intact participant, with a Raven’s Progressive Matrices score of 35–36/36. The importance of executive functions to treatment response has been underscored in recent work. Hinckley et al. (2001) found that the lower the score on the Raven’s and Wisconsin Card Sorting Test, the longer it took patients to achieve performance criterion for therapy [34]. Also, Fillingham et al. (2006) noted that their participants with more intact executive functions had better anomia treatment outcomes [35]. That is, cognitive status seems to be an important variable influencing aphasia treatment effects and this may have been a factor underlying the positive across category generalization shown by S004 and the minimal response to treatment by S002, the participant with severe aphasia.

In addition to the strong treatment effects for retrieval of trained words across most participants, modest improvements were noted in standardized aphasia test measures, in particular the WAB, and in some aspects of discourse, such as a reduction in use of vague nouns. These effects were not well maintained out to 3 mo after treatment completion, suggesting that the generalized effects of this word-retrieval training paradigm are mostly item specific to the vocabulary incorporated in training.

Conclusions

Positive treatment effects were evident for all participants with mild-moderate aphasia and a small treatment effect was noted for the most severely impaired individual (S002) in this study. Treatment effects may be mitigated in individuals with more severe naming impairments [36]. Nevertheless, the results of this experiment suggest that the treatment that we implemented appears to be a suitable behavioral engine to be used in future aphasia treatment studies to evaluate conditions that would amplify behavioral treatment effects in individuals with mild-moderate aphasia, whether through modifications of the conditions/experiences associated with the treatment or in conjunction with pharmacologic intervention. Furthermore, the individuals in this study had chronic aphasia, and screening cognition is imperative. For individuals with severe aphasia, further work is needed to replicate this treatment to determine whether it is an appropriate treatment option or whether other treatments prove more effective for improving word-retrieval skills.

Author Contributions:
Study concept/design: D. Kendall, A. Raymer, M. Rose, L. J. Gonzalez Rothi.
Data analysis: D. Kendall.
Writing: D. Kendall, A. Raymer.
Editing: M. Rose, L. J. Gonzalez Rothi.
Treatment delivery: J. Gilbert.
Financial Disclosures: The authors have declared that no competing interests exist.
Funding/Support: This material was based on work supported by the VA RR&D Brain Rehabilitation and Research Center Award (grant B6793C).
Additional Contributions: The authors would like to acknowledge Sarah DeChristoforo, Ashley Miller, Katelyn Linski, and Amanda Eanes for their contribution to the discourse analysis and to the participants and their caregivers who devoted their time to this study. Dr. Kendall is also affiliated with the VA Puget Sound Health Care System, Seattle, Washington.
Institutional Review: This project was approved by the University of Florida Institutional Review Board and all participants gave written informed consent prior to participation.
Participant Follow-Up: The authors do not plan to inform participants of the publication of this study because contact information is unavailable.
This article and any supplementary material should be cited as follows:
Kendall D, Raymer A, Rose M, Gilbert J, Gonzalez Rothi LJ. Anomia treatment platform as behavioral engine for use in research on physiological adjuvants to neurorehabilitation. J Rehabil Res Dev. 2014;51(3):391–400.

Last Reviewed or Updated  Thursday, June 5, 2014 12:01 PM