Productivity. Interviews with managers and employees
were conducted to determine the productivity measures
collected, the degree to which they were contaminated or
deficient as criteria, and the extent to which they were
used to evaluate effectiveness. Interviews indicated that the
measures most carefully collected and closely monitored
tracked the amount of work received each week from the
supported subterritory that the group had not
finished. That is, the groups’ goals were
not to reach the highest productivity per se, but to
complete all the work that came in each week. Most
territories did not even record the amount of work
completed, but they did record most of these six measures
related to unfinished work per week: (1) New Work
Unfinished--number of new pieces of work not finished, (2)
Percentage of New Work Unfinished--amount of new work
unfinished as a percentage of new work received, (3)
Revisions Unfinished--number of revisions to existing
pieces of work not finished, (4) Percentage of Revisions
Unfinished--number of revisions unfinished as a percentage
of revisions received, (5) Calls Not Answered--number of
phone calls to members of the group not answered, and
(6) Percentage of Calls Not Answered--number of calls not
answered as a percentage of calls received.
Each piece of work required the same set of tasks (e.g.,
coding, computer keying, and quality checking). Although
pieces of work varied somewhat in difficulty, distribution of
difficulty was considered equivalent across groups in a
given territory. Group size was used to adjust for
differences in workload generated by the subterritories and
in skills among employees. Groups with higher workloads or
fewer trained employees were assigned more employees.
Group size did not change frequently because workload
was fairly stable. Thus, groups were comparable within a
territory, even though they differed in number of
employees, and there was no need to standardize
productivity data based on group size. There were
differences across territories, however, such as complexity
of the work and average group size. Therefore,
productivity measures were standardized across territories
using z-scores.
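As an illustration only (the article does not give a formula), this within-territory standardization can be sketched as z-scoring each productivity measure among the groups of its own territory; the array layout below is a hypothetical assumption, not the authors' data structure:

```python
import numpy as np

def standardize_within_territory(values, territory_ids):
    """z-score each group's value within its territory, so that
    groups remain comparable despite territory differences in
    work complexity and average group size."""
    values = np.asarray(values, dtype=float)
    territory_ids = np.asarray(territory_ids)
    z = np.empty_like(values)
    for t in np.unique(territory_ids):
        mask = territory_ids == t
        # sample mean and SD computed only over groups in territory t
        z[mask] = (values[mask] - values[mask].mean()) / values[mask].std(ddof=1)
    return z
```

Within each territory the standardized values then have mean 0 and SD 1, so a group's score reflects its standing relative to comparable groups only.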
Although productivity is often stable (e.g., Deadrick &
Madigan, 1990), the range of jobs studied has been
limited. Thus, productivity data were collected and
aggregated for each group over a long period (M = 27.89
weeks per group, SD = 3.88). To avoid temporal
influences, the time period was the same for each group,
from 3 months before to 3 months after the collection of
the characteristics data. Intraclass correlations were used
to assess reliability, or the degree of variance in
productivity across weeks within a group compared to
between groups. They can be interpreted as the
correlations between the mean of these 30 weeks of
productivity and the mean of another (hypothetical) 30
weeks. Average intraclass correlations ranged from .77 to
.95 (p < .05), thus suggesting substantial reliability.
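The reliability index described here corresponds to a one-way intraclass correlation for the mean of k weekly observations, often written ICC(1,k). A minimal sketch, assuming a complete groups-by-weeks matrix (the article does not specify the estimation procedure actually used):

```python
import numpy as np

def icc_1k(data):
    """One-way random-effects ICC for the mean of k observations:
    reliability of a group's mean over its observed weeks.
    data: 2-D array, rows = groups, columns = weeks."""
    n, k = data.shape
    grand_mean = data.mean()
    group_means = data.mean(axis=1)
    # between-group mean square
    msb = k * ((group_means - grand_mean) ** 2).sum() / (n - 1)
    # within-group (week-to-week) mean square
    msw = ((data - group_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / msb
```

When week-to-week variance within a group is small relative to differences between groups, the index approaches 1, matching the interpretation above as the correlation between two hypothetical sets of weekly means.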
The six measures were intercorrelated, so they were
averaged into a composite (M = .00, SD = .42, internal
consistency = .74). Not all measures were available for all
groups (ns ranged from 46 to 79), so the composite was based
on the available data for each group. Analyses with
measures having the least missing data were similar, so
only data for the composite are presented.2 The signs on
the correlations were reversed so that positive numbers
indicate relationships with higher productivity (i.e., less
work not finished).
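The composite construction described above (averaging whichever z-scored measures a group has, then reversing the sign so that higher values indicate higher productivity) can be sketched as follows; representing unrecorded measures as NaN is an assumption for illustration, not the authors' procedure:

```python
import numpy as np

def unfinished_work_composite(z_measures):
    """z_measures: 2-D array, rows = groups, columns = the six
    z-scored unfinished-work measures, with NaN where a territory
    did not record a measure."""
    # average only the measures available for each group
    composite = np.nanmean(z_measures, axis=1)
    # reverse the sign: higher composite = less unfinished work,
    # i.e., higher productivity
    return -composite
```

A group missing some measures thus still receives a composite score based on the measures it does have, which matches the paper's handling of the 46-79 group coverage per measure.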
Employee satisfaction. To avoid common method
variance, the organization’s opinion survey was used as
the measure of satisfaction rather than adding a scale to
the questionnaire. That is, it was administered at a
different point in time (3 months earlier) and for an
unrelated purpose, thus mitigating any consistency or
priming effects. Data were obtained from all employees
(total n = 1,175), not just the five who provided other
measures. This gave the maximum data for each group (M
= 14.87 employees per group, SD = 5.52), enhanced
interrater reliability, and further reduced common method
variance because satisfaction data were included from
many additional employees who did not provide
characteristics data.
The aggregate data from all employees in each group were
used as the satisfaction measure. Such aggregation of
satisfaction data is common, and may be somewhat
justified by the definition of morale as referring to either the
individual or group (Webster’s, 1965), even though the
practice is not without criticism (Roberts et al., 1978).
The survey consisted of 71 items on a range of topics.
Five-point response formats were used, usually ranging
from 5 = "very satisfied" or "strongly agree" to 1 = "very
dissatisfied" or "strongly disagree." A principal components
analysis revealed 12 factors explaining 61% of total
variance: supervision, job, quality of servi