The Emergence of Modern Program Evaluation: 1964–1972

Although the developments discussed so far were not sufficient in themselves to create a strong and enduring evaluation movement, each helped create a context that would give birth to such a movement. Conditions were right for accelerated conceptual and methodological development in evaluation, and the catalyst was found in the War on Poverty and the Great Society, the legislative centerpieces of the administration of U.S. President Lyndon Johnson. The underlying social agenda of his administration was an effort to equalize and enhance opportunities for all citizens in virtually every sector of society. Millions of dollars were poured into programs in education, health, housing, criminal justice, unemployment, urban renewal, and many other areas.

Unlike the private sector, where accountants, management consultants, and R & D departments had long existed to provide feedback on corporate programs’ productivity and profitability, these huge, new social investments had no similar mechanism in place to examine their progress. There were government employees with some relevant competence—social scientists and technical specialists in the various federal departments, particularly in the General Accounting Office (GAO)2—but they were too few and not sufficiently well organized to deal even marginally with determining the effectiveness of these vast government innovations. To complicate matters, many inquiry methodologies and management techniques that worked on smaller programs proved inadequate or unwieldy with programs of the size and scope of these sweeping social reforms.

For a time it appeared that another concept developed and practiced successfully in business and industry might be adapted for evaluating these federal programs: the Planning, Programming, and Budgeting System (PPBS). PPBS was part of the systems approach used in the Ford Motor Company—and later brought to the U.S. Department of Defense (DOD) by Robert McNamara when he became Kennedy’s secretary of defense. The PPBS was a variant of the systems approaches that were being used by many large aerospace, communications, and automotive industries. It was aimed at improving system efficiency, effectiveness, and budget allocation decisions by defining organizational objectives and linking them to system outputs and budgets. Many thought the PPBS would be ideally suited for the federal agencies charged with administering the War on Poverty programs, but few of the bureaucrats heading those agencies were eager to embrace it. However, PPBS was a precursor to the evaluation systems the federal government has mandated in recent years with the Government Performance and Results Act (GPRA) and the Program Assessment Rating Tool (PART).

2This was the original name of the GAO. In 2004, its name was changed to the Government Accountability Office.

PPBS, with its focus on monitoring, outputs, and outcomes, did not succeed. Instead, the beginning of modern evaluation in the United States, Canada, and Germany was inspired by a desire to improve programs through learning from experimentation on social interventions. Ray Rist, in his research with the Working Group on Policy and Program Evaluation, which was created by the International Institute on Administrative Sciences (IIAS) to study differences in evaluation across countries, placed the United States, Canada, Germany, and Sweden among what they called “first wave” countries (Rist, 1999). These were countries that began modern evaluation in the 1960s and 1970s with the goal of improving social programs and interventions. Evaluations were often part of program planning, and evaluators were located close to the programs they were evaluating. As we will discuss later in the chapter, evaluation in the early part of the twenty-first century is more akin to the earlier PPBS systems than to its first-wave origins.

The stage for serious evaluation in the United States was set by several factors. Administrators and managers in the federal government were new to managing such large programs and felt they needed help to make them work. Managers and policymakers in government, along with social scientists, were interested in learning more about what was working. They wanted to use the energy and funds appropriated for evaluation to begin to learn how to solve social problems. Congress was concerned with holding state and local recipients of program grants accountable for expending funds as prescribed. The first efforts to add an evaluative element to any of these programs were small, consisting of congressionally mandated evaluations of a federal juvenile delinquency program in 1962 (Weiss, 1987) and a federal manpower development and training program enacted that same year (Wholey, 1986). It matters little which was first, however, because neither had any lasting impact on the development of evaluation. Three more years would pass before Robert F. Kennedy would trigger the event that would send a shock wave through the U.S. education system, awakening both policymakers and practitioners to the importance of systematic evaluation.

The Elementary and Secondary Education Act. The one event that is most responsible for the emergence of contemporary program evaluation is the passage of the Elementary and Secondary Education Act (ESEA) of 1965. This bill proposed a huge increase in federal funding for education, with tens of thousands of federal grants to local schools, state and regional agencies, and universities. The largest single component of the bill was Title I (later renamed Chapter 1), destined to be the most costly federal education program in U.S. history. Wholey and White (1973) called Title I the “grand-daddy of them all” among the array of legislation that influenced evaluation at the time.

When Congress began its deliberations on the proposed ESEA, concerns began to be expressed, especially on the Senate floor, that no convincing evidence existed that any federal funding for education had ever resulted in any real educational improvements. Indeed, there were some in Congress who believed federal funds allocated to education prior to ESEA had sunk like stones into the morass of educational programs with scarcely an observable ripple to mark their passage. Robert F. Kennedy was the most persuasive voice insisting that the ESEA require each grant recipient to file an evaluation report showing what had resulted from the expenditure of the federal funds. This congressional evaluation mandate was ultimately approved for Title I (compensatory education) and Title III (innovative educational projects). The requirements, while dated today, “reflected the state-of-the-art in program evaluation at that time” (Stufflebeam, Madaus, & Kellaghan, 2000, p. 13). These requirements, which reflected an astonishing amount of micromanagement at the congressional level but also the serious congressional concerns regarding accountability, included using standardized tests to demonstrate student learning and linking outcomes to learning objectives.

Growth of Evaluation in Other Areas. Similar trends can be observed in other areas as the Great Society developed programs in job training, urban development, housing, and other anti-poverty programs. Federal government spending on anti-poverty and other social programs increased by 600% after inflation from 1950 to 1979 (Bell, 1983). As in education, people wanted to know more about how these programs were working. Managers and policymakers wanted to know how to improve the programs and which strategies worked best to achieve their ambitious goals. Congress wanted information on the types of programs to continue funding. Increasingly, evaluations were mandated. In 1969, federal spending on grants and contracts for evaluation was $17 million. By 1972, it had expanded to $100 million (Shadish, Cook, & Leviton, 1991). The federal government expanded greatly to oversee the new social programs but, just as in education, the managers, political scientists, economists, and sociologists working with them were new to managing and evaluating such programs. Clearly, new evaluation approaches, methods, and strategies were needed, as well as professionals with a somewhat different training and orientation to apply them. (See interviews with Lois-Ellin Datta and Carol Weiss cited in the “Suggested Readings” at the end of this chapter to learn more about their early involvement in evaluation studies with the federal government at that time. They convey the excitement, the expectations, and the rapid learning curve required to begin this new endeavor of studying government programs to improve the programs themselves.)

Theoretical and methodological work related directly to evaluation did not exist. Evaluators were left to draw what they could from theories in cognate disciplines and to glean what they could from better-developed methodologies, such as experimental design, psychometrics, survey research, and ethnography. In response to the need for more specific writing on evaluation, important books and articles emerged. Suchman (1967) published a text reviewing different evaluation methods and Campbell (1969b) argued for more social experimentation to examine program effectiveness. Campbell and Stanley’s book (1966) on experimental and quasi-experimental designs was quite influential. Scriven (1967), Stake (1967), and Stufflebeam (1968) began to write articles about evaluation practice and theories. At the Urban Institute, Wholey and White (1973) recognized the political aspects of evaluation being conducted within organizations. Carol Weiss’s influential text (1972) was published and books of evaluation readings emerged (Caro, 1971; Worthen & Sanders, 1973). Articles about evaluation began to appear with increasing frequency in professional journals. Together, these publications resulted in a number of new evaluation models to respond to the needs of specific types of evaluation (e.g., ESEA Title III evaluations or evaluations of mental health programs).

Some milestone evaluation studies that have received significant attention occurred at this time. These included not only the evaluations of Title I, but evaluations of Head Start and the television series Sesame Street. The evaluations of Sesame Street demonstrated some of the first uses of formative evaluation, as portions of the program were examined to provide feedback to program developers for improvement. The evaluations of Great Society programs and other programs in the late 1960s and early 1970s were inspired by the sense of social experimentation and the large goals of the Great Society programs. Donald Campbell, the influential research methodologist who trained quite a few leaders in evaluation, wrote of the “experimenting society” in his article “Reforms as Experiments,” urging managers to use data collection and “experiments” to learn how to develop good programs (Campbell, 1969b). He argued that managers should advocate not for their program, but for a solution to the problem their program was designed to address. By advocating for solutions and the testing of them, managers could make policymakers, citizens, and other stakeholders more patient with the difficult process of developing programs to effectively reduce tough social problems such as crime, unemployment, and illiteracy. In an interview describing his postgraduate fellowship learning experiences with Don Campbell and Tom Cook, William Shadish discusses the excitement that fueled the beginning of modern evaluation at that time, noting, “There was this incredible enthusiasm and energy for social problem solving. [We wanted to know] How does social change occur and how does evaluation contribute to that?” (Shadish & Miller, 2003, p. 266).

Graduate Programs in Evaluation Emerge. The need for specialists to conduct useful evaluations was sudden and acute, and the market responded. Congress provided funding for universities to launch new graduate training programs in educational research and evaluation, including fellowship stipends for graduate study in those specializations. Several universities began graduate programs aimed at training educational or social science evaluators. In related fields, schools of public administration grew from political science to train administrators to manage and oversee government programs, and policy analysis emerged as a growing new area. Graduate education in the social sciences ballooned. The number of people completing doctoral degrees in economics, education, political science, psychology, and sociology grew from 2,845 in 1960 to 9,463 in 1970, more than a threefold increase (Shadish et al., 1991). Many of these graduates pursued careers evaluating programs in the public and nonprofit sectors. The stage for modern program evaluation was set by the three factors we have described: a burgeoning economy in the United States after World War II, dramatic growth in the role of the federal government in education and other policy areas during the 1960s, and, finally, an increase in the number of social science graduates with interests in evaluation and policy analysis (Shadish et al., 1991).
