Faculty will soon be faced with claims based on, and the consequences of the use of, a device purported to measure the amount of learning produced by CUNY undergraduate programs. The device, the methodology for its application, and the conclusions drawn from the data it produces are, however, quite controversial. Perhaps the two most significant concerns or criticisms of the use of the device are these:
1. The device does not provide what the public and CUNY may want to know, or may report that it knows, namely, "We need to know what students know when they enter CUNY and what they know when they graduate." The device is only a measure of rudimentary cognitive skills.
2. Use of the device will not provide CUNY with the curriculum-specific information needed to improve teaching and learning.
CUNY has decided to end the use of the CPE. CUNY acknowledges the need for some evidence of the value added by an education at CUNY, and so it formed a task force to make recommendations for measures and processes for the assessment of learning. The CUNY Task Force (TF) on System-Wide Assessment was charged to do something more than, and other than, provide a replacement for the CPE.
The TF moved rather directly to recommend the use of the Collegiate Learning Assessment (CLA) produced by the Council for Aid to Education (CAE). (FULL DISCLOSURE: the chairman of the Board of Trustees of the CAE, Benno Schmidt, is also chairman of the CUNY Board of Trustees, and the CUNY Vice Chancellor of Community Colleges, Eduardo J. Marti, is also on the CAE board of trustees.) See http://www.cuny.edu/about/administration/offices/ue/cue/AssessmentTaskForceReport2011.pdf for the Task Force Report (TFR) itself.
Here are just some of the concerns:
· The CUNY Assessment Task Force (TF) recommends the use of the Collegiate Learning Assessment (CLA) for CUNY-wide assessment of learning, yet the TF reports that the CLA has problems providing what CUNY charged the Task Force to recommend. The Task Force Report (TFR) indicates that the CLA is not, as CUNY's charge requires, an instrument that will:
- measure other learning outcomes associated with general education at CUNY;
- benchmark learning gains against those of comparable institutions outside CUNY;
- use the results to improve teaching and learning throughout CUNY.
· Such a discrepancy supports the contention that the purpose of the recommended test is not to measure learning outcomes but to provide data that can support a public relations presentation showcasing CUNY's effectiveness, and to do so through dubious methodological means.
· The use of the Collegiate Learning Assessment (CLA) provides for the examination of the skills of two disparate groups selected using a cross-sectional rather than a longitudinal design. By design or by accident, the random selection method chosen, which examines one group of students entering programs of study and another group of those leaving, permits the selection of people with quite different levels of entering skills and background knowledge. Comparisons of results are likely to show a higher level of skills in those graduating. Those entering, particularly at the community colleges, are likely to include ESL students and students in need of remediation in more than one area; those exiting are not likely to include similar numbers of such students. Uncritical reliance on vendor-supplied rhetoric about psychometrics to displace critiques of the methodology does not allay concerns over the merits of using the CLA.
· There is no recommendation or provision for comparing the results of CUNY's use of the CLA with results from people of ages similar to the two groups of CUNY students but who have spent that time outside formal instruction and who, through the maturation of brain structures and operations and growing experience in handling information acquired through interaction with their social and informational environments, have developed the cognitive skills that the CLA assesses.
· Whatever the CLA provides, it is, by the admission of the TF and of the CLA's developers, not the curriculum-specific information CUNY would need to improve teaching and learning.
· There is concern that the TFR contains more cautions against using the CLA than recommendations for its use.
· There are indications of weaknesses in the basic design and in the use of the results of the CLA, and cautions about how to interpret the results.
· The inferences drawn from the results of the CLA are dubious, as the subject groups are heterogeneous in age, prior academic preparation, interests, motivations, programs of study, and transfers between programs. The TFR gives no indication that alternative explanations for the results of administering the CLA were considered.
· The CLA is not an assessment of knowledge at all, and its vendor acknowledges this. It does not provide what CUNY may want to know, or may report that it knows, namely, "We need to know what students know when they enter CUNY and what they know when they graduate."
· The TF gave no consideration to the latest findings of neuroscience and developmental psychology.
· There is no indication of a literature review of criticisms of the devices examined.
· The CLA is not normed against the general population in the age range of the typical test subjects.
· The CLA testing (sampling) model is severely flawed.
· Test subjects' performance can be severely compromised by a range of motivations.
· The test-subject selection process can easily be compromised or manipulated to produce desired results (gaming the system).
The Charge to the Task Force
The CUNY Assessment Task Force recommendations do not meet the Task Force's charge, and use of the Collegiate Learning Assessment (CLA) instrument cannot, and thus will not, meet the desired goals expressed by the Chancellery.
The TFR offers this significant background:
After extensive deliberations, the CPE Task Force recommended that CUNY discontinue the use of the CPE (CUNY Proficiency Examination Task Force, 2010). As a certification exam, the CPE had become redundant. Nearly every student who was eligible to take the exam— by completing 45 credits with a 2.0 GPA or better— passed the exam. Further, given that the CPE was designed by CUNY and administered only within CUNY, it could not be used to benchmark achievements of CUNY students against those of students at comparable institutions. Because it was administered only at a single point in time, the CPE also did not measure learning gains over time. Finally, the development and administration of the test had become prohibitively expensive, projected at $5 million per year going forward. The Board of Trustees took action to discontinue the CPE in November 2010.
----Task Force Report (TFR), 4-5
In January 2011, the CUNY Task Force on System-Wide Assessment of
Undergraduate Learning Gains (Assessment Task Force) was convened by
Executive Vice Chancellor Alexandra Logue and charged as follows:
The Chancellery wishes to identify and adopt a standardized assessment instrument to measure learning gains at all of CUNY's undergraduate institutions. The instrument should be designed to assess the ability to read and think critically, communicate effectively in writing, and measure other learning outcomes associated with general education at CUNY. It must be possible for each college and the University to benchmark learning gains against those of comparable institutions outside CUNY. It is the responsibility of the Task Force to identify the most appropriate instrument and to advise the Chancellery on how best to administer the assessment and make use of the results.
The Task Force is charged with the following specific
responsibilities:
1. Taking into account psychometric quality, the alignment of the domain of the instrument with broad learning objectives at CUNY colleges, cost, facility of obtaining and using results, and the ability to benchmark results externally, select an assessment instrument from among those commercially available at this time.
2. Develop recommendations for the chancellery on how the assessment should best be administered so as to
a. represent each college's undergraduate student body;
b. generate a valid assessment of learning;
c. facilitate comparisons across CUNY colleges and between CUNY and other postsecondary institutions.
3. Develop recommendations on how the colleges and the chancellery can best use the results to improve teaching and learning throughout CUNY.
----Task Force Report (TFR) Executive Summary and Introduction
As CUNY seeks to address the real and appropriate concern for public accountability and wants to provide some assurance to the public that there is real value in a CUNY education, some measure is needed to provide that assurance. University and college officials often state that there is a need to know what a student knows when entering CUNY and what the student knows when graduating. The TFR indicates that the Collegiate Learning Assessment (CLA) is not an instrument that will:
· measure other learning outcomes associated with general education at CUNY;
· benchmark learning gains against those of comparable institutions outside CUNY;
· use the results to improve teaching and learning throughout CUNY.
Instead, the TFR recommends use of a device that does not measure what a student knows at all but only, and perhaps in a dubious manner, provides some indication of what a student can do in terms of basic cognitive skills.
Consider the following limitations acknowledged in the TFR:
The Task Force emphasizes that the CLA assesses a limited domain
and should not be regarded as a comprehensive measure of general
education outcomes defined by CUNY colleges. The test is not
intended to evaluate all aspects of institutional effectiveness and
is not designed to assess individual student or faculty performance.
—(TFR,3)
The Task Force does not, however, endorse the CLA for all
purposes. CLA results are intended for use in evaluating learning
outcomes only at the institutional level and primarily as a
“signaling tool to highlight differences in programs that can lead
to improvements in teaching and learning” (from the introduction to
the sample 2009-2010 CLA Institutional Report). —(TFR,16)
As indicated earlier, the CLA assesses learning in a limited
domain and cannot be regarded as a comprehensive measure of general
education outcomes as currently defined by CUNY colleges or as may
be defined by the Pathways initiative. —(TFR,16)
and again here:
Given the impossibility of capturing all outcomes with a single
instrument, the Task Force identified the core learning outcomes
common across CUNY: reading, critical thinking, written
communication, quantitative reasoning and information literacy. The
Task Force acknowledges that these competencies do not represent the
full range of learning outcomes deemed essential
by CUNY colleges and institutions across the country (see Liberal
Education and America’s Promise, 2007). Nor do they adequately
represent discipline-specific knowledge and competencies. The
assessment instrument best aligned with this restricted domain must
therefore be seen as one component of a more comprehensive
assessment system comprised of the many formative and summative
measures tailored to assess general education learning outcomes. —(TFR,6-7)
The concern is that CUNY as a whole and its units may use the CLA
results to make claims about the effectiveness of curricula and of
the General Education Core in particular when the instrument cannot
support such claims. Nor can the results be used to advance teaching
and learning as they are non-specific to programs of instruction.
The TFR notes that:
The Task Force discussed the methodological issues associated with assessing learning gains, and this report contains some initial recommendations for administering the test. However, these questions merited additional deliberation, and more detailed recommendations will be presented in a supplementary report. —(TFR,6)
That supplementary report has not been produced, and this is a major issue, as administering the CLA poses numerous challenges. Indeed, it may prove most formidable to develop processes and protocols for selecting student groups that would be considered acceptable according to standard criteria for the conduct of such assessments.
The importance of proper test administration is noted again here:
Administering the test to properly oriented students under
standard and secure conditions is essential for gathering quality
data. —(TFR,12)
Yet there is still no description of the processes or protocols needed to secure quality data.
Cautions are included as well:
noting the need to conduct research on the validity of the
electronic scoring methodology to be fully implemented soon by the
Council for Aid to Education (CAE), the organization that develops
and scores the CLA.—(TFR,2)
CUNY may be able to learn from other institutions how best to
motivate randomly selected students to demonstrate their true
ability on the assessment. —(TFR,2)
Finally, the Task Force calls attention to the fact that the
national sample of colleges that have administered the CLA differs
in important respects from the CUNY student body, and that only a
handful of community colleges have administered the community
college version of the CLA to date. This lack of comparability may
initially hamper CUNY’s ability to interpret its learning gains with
reference to national averages. All of the other candidate tests are
characterized by this important constraint.—(TFR,3)
The Council for Aid to Education (CAE) produces the CLA, and the TFR notes that:
However, because the CAE has recently implemented machine scoring
for all of its unstructured response tests, the Task Force
recommends that the University obtain more information about the
validity of the scoring process and consider the possible
implications for the interpretation of test scores. —(TFR,14)
There is no indication of any effort by the CAE to provide the information on the validity of the machine-scoring process, and there are open questions concerning such an important matter.
The Task Force also urges caution with
respect to interpreting the available benchmarking data. In its
standard report to participating colleges, the CAE provides data
comparing the learning gains at each college to gains measured in
the national sample. The validity of these comparisons may be
affected by the extent to which the colleges comprising the
benchmark sample resemble CUNY and the degree to which the sample of
tested students in the benchmark colleges reflects the total
population of undergraduates in those colleges. —(TFR,16)
The test is not intended to evaluate all
aspects of institutional effectiveness and is not designed to assess
individual student or faculty performance. —(TFR,16)
Questionable claims are made relating to basic concepts of testing and measurement:
The CLA will be administered to samples of students who are just
beginning their undergraduate studies and to students who are
nearing the end of their undergraduate career. The sampling must be
done randomly to produce representative results; yet random sampling
will pose logistical challenges. —(TFR,2)
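The weight of that requirement can be shown with a small sketch, offered only as an illustration: every number in it is an assumption, not CLA, CAE, or CUNY data. It shows how departing from random sampling, whether through self-selected volunteers or hand-picked test takers (the "gaming the system" concern noted among the bullets above and in Dean Savage's comment below), inflates a cross-sectional freshman-to-senior difference even when the underlying populations have not changed.

# Illustrative sketch only: all figures are assumptions, not CLA or CUNY data.
# It shows how a non-random (hand-picked) senior sample inflates the measured
# freshman-to-senior score gap relative to the true population difference.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical score populations with a true average difference of 40 points.
freshmen = rng.normal(1000, 150, 10_000)
seniors = rng.normal(1040, 150, 10_000)

def sample_mean(pop, n=100, bias_toward_top=False):
    """Mean score of a tested sample: drawn at random, or only from the stronger half."""
    if bias_toward_top:
        pop = np.sort(pop)[pop.size // 2:]  # "game the system": test only stronger students
    return rng.choice(pop, size=n, replace=False).mean()

random_gap = sample_mean(seniors) - sample_mean(freshmen)
gamed_gap = sample_mean(seniors, bias_toward_top=True) - sample_mean(freshmen)

print("True population difference:    40 points")
print(f"Gap with random samples:      {random_gap:4.0f} points")
print(f"Gap with hand-picked seniors: {gamed_gap:4.0f} points")

Under these invented numbers the hand-picked senior sample inflates the apparent difference several-fold; the point is not the particular figures but that the reported results depend heavily on how the samples are drawn.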
The TFR indicates that:
This report contains the Task Force’s
recommendations for a test instrument. A supplementary report will
provide guidance on test administration and use of test results by
faculty and academic administrators.—(TFR,16)
Criticism of the cross-sectional method for sampling:
My understanding is that the CLA assessment is based on a cross-sectional design in which a group of freshmen at an institution are compared to a group of seniors at the same institution. The groups are equated through adjustments based on the observed influences of covariates (e.g., SAT). This approach seems reasonable if you are trying to measure a relatively uniform treatment (all students receive very similar training during the 4 years), but I have not seen any discussion of the much more complicated environment at CUNY.
A plurality of CUNY students are transfer students. Their educational experience is influenced by two or more institutions and they frequently take a great deal of time to graduate. The challenges to a cross-sectional design are considerable.
First, there is the obvious problem of identifying the effects that each institution had on a student even when one is reasonably certain that the seniors began their education with similar skill levels to the freshmen to which they are being compared.
Second, there are a number of threats to the validity of the cross-sectional design that make it very difficult to be certain that confounds have not created differences that can be attributed to the proper source.
One problem is that students self-select, and the student who transfers from BCC to Lehman may be different in many ways from a BCC student who transfers to Baruch. It seems quite possible that these students could differ in area of interest, motivation, or other factors that are not captured by the standard covariates that are used to equate the groups.
Another problem is that admission criteria differ significantly at the senior colleges, and students who may be admissible to Lehman or Queens may not be admissible to Baruch. This difference constitutes a kind of selection bias because students will begin their education at a senior college with systematically different skills.
Moreover, the transfer admission criteria are not typically included as covariates to the CLA process, and the students who have had the most significant gains at a 2yr school are exactly those students who will be disproportionately represented at a senior college with tougher transfer admission criteria.
--Professor Kevin Sailor (Psychology, Lehman)
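To make these concerns concrete, here is a second small sketch, again purely illustrative: all of the numbers are assumptions, and the adjustment shown is a simplified stand-in for, not a reproduction of, the covariate adjustment described above. The simulated instructional effect is deliberately set to zero, yet because the senior group is a selected group, the SAT-like covariate is only a noisy proxy for the unmeasured differences between the groups, and natural maturation continues regardless of schooling, the adjusted comparison still reports an apparent "gain."

# Illustrative sketch only: invented numbers, not CLA, CAE, or CUNY data, and a
# simplified covariate adjustment rather than the CAE's actual procedure.
# The simulated instructional effect is zero, yet an apparent "gain" emerges from
# selection (seniors are a stronger surviving group), noise in the covariate,
# and natural maturation over four or more years.
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # students per tested group

# Latent ability drives both the SAT-like covariate and the CLA-like score.
ability_freshmen = rng.normal(0.0, 1.0, n)
ability_seniors = rng.normal(0.4, 0.9, n)   # a selected, stronger group

# Observed covariate: a noisy proxy for ability.
sat_freshmen = 500 + 80 * ability_freshmen + rng.normal(0, 60, n)
sat_seniors = 500 + 80 * ability_seniors + rng.normal(0, 60, n)

maturation = 0.3          # gain from simply growing older, in ability units
instruction_effect = 0.0  # deliberately zero: the college adds nothing here

cla_freshmen = 1000 + 100 * ability_freshmen + rng.normal(0, 50, n)
cla_seniors = (1000 + 100 * ability_seniors
               + 100 * (maturation + instruction_effect)
               + rng.normal(0, 50, n))

# Simplified cross-sectional adjustment: predict senior scores from the freshman
# score-on-SAT regression and call any excess the "learning gain."
slope, intercept = np.polyfit(sat_freshmen, cla_freshmen, 1)
predicted_seniors = slope * sat_seniors + intercept
apparent_gain = cla_seniors.mean() - predicted_seniors.mean()

print(f"Raw senior-minus-freshman difference: {cla_seniors.mean() - cla_freshmen.mean():.0f} points")
print(f"Apparent 'gain' after SAT adjustment: {apparent_gain:.0f} points")
print("True instructional effect in this simulation: 0 points")

Nothing in this sketch shows that CUNY's measured gains would be spurious; it shows only that a covariate-adjusted cross-sectional comparison cannot, by itself, separate instruction from selection and maturation, which is precisely the concern raised above.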
“A major problem has to do with implementation. If a college were
to choose a weak group of freshmen and an outstanding group of
seniors to take the test, a college would appear to be doing very
well. Would anyone anywhere try to game the system in this way?
Hmmm.”-- Dean Savage (Sociology, Queens College)
Lisa A. Ellis (Library, Baruch College), who
served on the Task Force, has replied:
This is not correct for a number of reasons. First it is not
“implementation” but “sampling” which is a problem. The report
notes the differences between cross-sectional and longitudinal
studies and gives the testing of freshmen and seniors as an example
of how these particular study methodologies differ. This is not to
be read as a statement of how sampling will be done if such a test
were to be administered on CUNY campuses. On page 17, it reads,
“However, both designs present challenges associated with the
treatment of drop-outs and transfer students, and solutions to these
issues must be standardized if the measurement of gains is to be
benchmarked across institutions.” The Task Force not only
recommends a cross-sectional design but also, “recommends testing
students at the beginning of their academic career [freshmen], at roughly
the 60th credit [upper sophomores or those nearing completion of
Associate’s degrees], and for students pursuing the bachelor’s
degree, when approaching the 120th credit [presumably seniors, if
they have not taken an excess of credits].”
Absent the implementation instructions the TFR itself indicates are needed, the reply of Dr. Ellis does not suffice to dismiss the concerns raised by others concerning the use of the cross-sectional approach.
The absence of that supplementary report, which it is claimed will be available in December 2011, leaves open significant questions concerning how it would be possible to administer the CLA in CUNY in a manner that would be valid and would stand up to scrutiny under the strict standards employed by the social sciences.
The TFR acknowledges some of the challenges in meeting those
standards:
To measure learning gains, CUNY must choose either a
cross-sectional or a longitudinal design. In a cross-sectional
study, random samples of freshmen and seniors are drawn during the
school year— freshmen in the fall and seniors in the spring. In a
longitudinal study, a group of freshmen is tested in their first
year, and then again as seniors. In theory, the two designs should
yield equivalent results. However, both designs present challenges
associated with the treatment of drop-outs and transfer students,
and solutions to these issues must be standardized if the
measurement of gains is to be benchmarked across institutions.
Because of the multi-year period required to execute a longitudinal
design, the Task Force endorses a cross-sectional design. Moreover,
because CUNY wishes to use the same instrument to test learning
outcomes at all of its colleges—community and senior—the Task Force
recommends testing students at the beginning of their academic
career, at roughly the 60th credit, and for students pursuing the
bachelors degree, when approaching the 120th credit. Finally, in
developing a sampling scheme, analysts must take into account the
numbers of ESL and remedial students, and the appropriateness of
including them in the college’s representative sample. Both groups
may face special challenges in a timed testing situation.
The methodological issues of sampling will have a direct effect
not only on assessments of learning at the institutional level, but also on calculations of
learning gains and subsequent derivations of the learning gains to be ascribed to the college
rather than to natural maturation.
A further complication to measuring learning gains is determining
the nature and significance of any gain. The assessment of learning gains must take into account
changes in performance from one point in time to the next, as well as gain relative to
specific standards. With both methodological and substantive complexities in play, the Task
Force recommends caution in the
initial administrations of the test and the use of multiple
alternative measures to help in the interpretation of results. —(TFR,16)
Issues with other factors contributing to the results:
Again the TFR notes that:
The methodological issues of sampling will have a direct effect
not only on assessments of learning at the institutional level, but also on calculations of
learning gains and subsequent derivations of the learning gains to be ascribed to the college
rather than to natural maturation. —(TFR,16)
Neither the CAE nor the TFR appears to treat as fundamentally significant the failure to account for natural maturation in comparing groups that range from 17 to 20 years of age with groups that will be four or more years older.
Recent studies indicate that the higher cognitive processes in the human brain develop through that period of time regardless of formal education:
Reyna, Valerie F. and Farley, Frank. Risk and Rationality in Adolescent Decision Making: Implications for Theory, Practice and Public Policy. Psychological Science in the Public Interest, Volume 7, No. 1, September 2006.
Sowell, Elizabeth R., Thompson, Paul M., Holmes, Colin J., Jernigan, Terry L., and Toga, Arthur W. In vivo evidence for post-adolescent brain maturation in frontal and striatal regions. Nature Neuroscience, Volume 2, No. 10, October 1999.
Blakemore, Sarah-Jayne and Choudhury, Suparna. Development of the adolescent brain: implications for executive function and social cognition. Journal of Child Psychology and Psychiatry, Vol. 47, No. 3/4, 2006, pp. 296-312.
Dahl, Ronald E. Adolescent Brain Development: A Period of Vulnerabilities and Opportunities. Annals of the New York Academy of Sciences, 1021, 2004, 1-22.
Criticism of the failure to account for other factors contributing to the results of the CLA:
…the outside influence on a student's intellectual skills is quite likely to be much more variable for transfer students than it is for native matriculants. For example, many students work and it seems quite likely that they pick up some skills through their work experience. Consider a student who works in a white collar environment (an accounting department at a large firm) versus one who works in a blue collar environment (night security). It seems likely that the student in a white collar environment will be exposed to information and tasks that would have a greater impact on skills assessed by the CLA. If this exposure lasts for four or five years then the contributions might be significant.
Thus, some gains may be due to the outside environment rather than the school environment. Moreover, this possible influence would undermine comparisons across schools or programs unless these kinds of experiences are distributed equally across the populations who transfer from each 2 year school to every 4 year school.
--Professor Kevin Sailor (Psychology, Lehman)
Lisa A. Ellis (Library, Baruch College), who
served on the Assessment Task Force, has written:
“…In truth, there are a number of factors that may impact learning
gains which may or may not include teaching.”
Use of CLA in CUNY
Some, like Dean Savage (Sociology, Queens College), caution:
“It's very likely that Central will use the CLA results to evaluate
college effectiveness, so everyone on the campuses should be aware
of the test's limitations, and be prepared to evaluate the results
accordingly.”
Lisa A. Ellis (Library, Baruch College), who
served on the Assessment Task Force, has written the following:
“We had numerous
discussions on the committee about how the test results will be used
(i.e PPM, cross campus comparisons, eliminate programs, etc.) and as
a group were firmly opposed to such use. Minutes were taken and
this appears numerous times during the minutes as a caution to what
information the test results can provide each campus in terms of
what actions can be reasonably taken to improve or change learning
gains depending on the results received. In truth, there are a
number of factors that may impact learning gains which may or may
not include teaching.”
Need for other measures and devices to be employed, and for implementation guidelines
Indeed the TFR includes:
The Task Force identified sampling design, motivation of
students, and involvement of faculty as keys to the successful
implementation of the CLA. Sampling must be conducted carefully so
that the test results accurately reflect the level of learning and
unique demographics at each CUNY institution. Because the test is
not high stakes, CUNY must devise a strategy for encouraging test
takers to demonstrate their true abilities on the test. Finally,
unless faculty believe that the test is a valuable tool for
assessing the learning goals they are attempting to advance in their
own classrooms, the information generated by the assessment will not
become a resource for improving learning outcomes of undergraduate
students. —(TFR,16)
It needs to
be emphasized that the TFR specifically cautions:
…. With both methodological and substantive complexities in play,
the Task Force recommends caution in the initial administrations of the test and the use of multiple
alternative measures to help in the interpretation of results. —(TFR,16)
Yet no multiple measures have been described or provided thus far, and the concern is that none will be, and that the results of the CLA, however administered, may be interpreted in a manner to suit the interests of those who would require its use for purposes other than improving pedagogy.
Although the TFR reports that "The assessment instrument best aligned with this restricted domain must therefore be seen as one component of a more comprehensive assessment system comprised of the many formative and summative measures tailored to assess general education learning outcomes" (TFR,6-7) and "recommends caution in the initial administrations of the test and the use of multiple alternative measures to help in the interpretation of results" (TFR,16),
it is now reported by the Chancellery that
“the Task Force and the Chancellery
share the strong view that responsibility for assessment of all
kinds rests with the colleges and especially with the faculty. It
is the purview of the faculty to define learning goals and outcomes,
identify appropriate measures and evidence to assess progress toward
those goals, and use the results of many strands of evidence for
improvement. “
--David Crook
So the CLA is to be just one component of a more comprehensive
assessment system comprised of the many formative and summative
measures tailored to assess general education learning outcomes that
the colleges will devise or acquire and use. The use of multiple alternative measures to help in the interpretation of results will depend on the colleges developing or acquiring those measures. The Chancellery will supply no such measures. This will
leave the CLA results to be the only assessment of “learning” across
CUNY that will be used to report on the efficacy of the curricula of
the University: an assessment of rudimentary skills using a
problematic instrument.
###################################
Works Cited by CUNY Task Force
American
Educational Research Association, American Psychological
Association, and National Council on Measurement in Education.
(1999). Standards for Educational and Psychological Testing.
Washington, D.C.: American Educational Research Association.
Arum, R. and J. Roksa (2011). Academically Adrift: Limited Learning on College Campuses. University of Chicago Press.
CUNY
Proficiency Examination Task Force. (2010). Report of the CUNY
Proficiency Examination Task Force. New York: City University of New
York.
Ewell, P. (2009). Assessment, Accountability and Improvement:
Revisiting the Tension. University of Illinois and University of
Indiana: National Institute for Learning Outcomes Assessment.
Hutchings, P. (2010). Opening Doors to Faculty Involvement in Assessment. University of Illinois at Urbana-Champaign: National Institute for Learning Outcomes Assessment.
Liberal Education and America's Promise. (2007). Liberal Education
and America's Promise (LEAP) - Essential Learning Outcomes.
Retrieved June 21, 2011, from AAC&U - Association of American
Colleges and Universities: http://www.aacu.org/leap/vision.cfm
National Institute for Learning Outcomes Assessment. (2011). Tool
Kit: Tests. Retrieved July 13, 2011, from http://www.learningoutcomesassessment.org/tests.htm
Rhodes, T. (Ed.). (2010). Assessing Outcomes and Improving Achievement: Tips and Tools for Using Rubrics. Washington, D.C.: Association of American Colleges and Universities.
VALUE: Valid Assessment of Learning in Undergraduate Education
Project. (2007). VALUE: Valid Assessment of Learning in
Undergraduate Education Overview. Retrieved June 21, 2011, from AAC&U
Association of American Colleges and Universities: http://www.aacu.org/value/index.cfm
Voluntary System of Accountability (2007). About VSA. Retrieved June
21, 2011, from
Voluntary System of Accountability: http://www.voluntarysystem.org
###################################
OTHER RELEVANT RESOURCES
AAC&U. (2005).
Liberal education outcomes. Washington, DC: Association of
American Colleges and Universities.
AASCU. (Spring
2006). Value-added Assessment. Perspectives. Washington, DC:
American Association of State Colleges and Universities.
Arum, R., Roksa, J., & Velez, M. (2008).
Learning to reason and communicate in college: Initial report of
findings from the CLA longitudinal study.
Brooklyn, NY: The Social Science Research Council.
Banta, T. W.,
and G. R. Pike. 2007. Revisiting the blind alley of value-added.
Assessment Update 19 (1), 1,2,14,15.
Benjamin, R. and M. Chun (2003). A new field of dreams: The Collegiate Learning Assessment project. Peer Review, 5(4), 26-29. (http://www.cae.org/content/pro_collegiate_reports_publications.htm; 5/23/08).
Benjamin, R., Chun, M., & Shavelson, R. (2007). Holistic Tests in a sub-score world: The diagnostic logic of the Collegiate Learning Assessment. New York, NY: Council for Aid to Education. Found at (11/24/07): http://www.cae.org/content/pdf/WhitePaperHolisticTests.pdf
Benjamin, R., & Chun, M. (2009). Returning to learning in an age of assessment: A synopsis of the argument. New York, NY: Council for Aid to Education monograph.
Blakemore, Sarah-Jayne and Choudhury, Suparna. Development of the adolescent brain: implications for executive function and social cognition. Journal of Child Psychology and Psychiatry, Vol. 47, No. 3/4, 2006, pp. 296-312.
Braun, H.J. (2005). Using student progress to evaluate teachers: A primer on value-added models. New Jersey: Educational Testing Service.
Case, R. (1996). Changing views of knowledge and their impact on educational research and practice. In D.R. Olson & N. Torrance (Eds.), Handbook of human development in education: New models of learning, teaching, and schooling. Oxford: Blackwell.
Brennan, R.L.
(1995). The conventional wisdom about group mean scores. Journal of
Educational Measurement, 32(4), 385-396.
CLA (2006)
Sample Institutional Report.
www.cae.org/cla
Carpenter, Andrew N. and Bach, Craig. Learning Assessment: Hyperbolic Doubts Versus Deflated Critiques. http://ellis.academia.edu/AndrewCarpenter/Papers/172152/Learning_Assessment_Hyperbolic_Doubts_Versus_Deflated_Critiques
Carroll, J.B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press.
Cronbach, L.J. (1990). Essentials of psychological and educational testing. 5th edition. New York: Harper Collins.
Council for
Aid to Education. (Fall, 2006). CLA Interim Institutional Report.
New York, NY:Council for Aid to Education.
Council for Aid to Education (2006)
Collegiate Learning Assessment.
New York, NY: Council for Aid to Education.
Council for Aid to Education. (2008).
CLA
Interim Institutional Report.
New York, NY: Council for Aid to Education.
Dahl, Ronald E. Adolescent Brain Development: A Period of Vulnerabilities and Opportunities. Annals of the New York Academy of Sciences, 1021, 2004, 1-22.
Dwyer, C. A., Millett, C. M., & Payne, D. G. (2006). A Culture of Evidence: Postsecondary assessment and learning outcomes. Princeton, N.J.: Educational Testing Service.
Ekman, R., & Pelletier, S. (2008). Assessing student learning: A
work in progress.
Change: The Magazine of Higher Learning, 40(4),
14-19.
Erwin, D., &
Sebrell, K.W. (2003). Assessment of critical thinking: ETS’s tasks
in critical thinking. The Journal of General Education, 52(1),
50-70.
Ewell, P. T.
(1994). A policy guide for assessment: Making good use of the
Tasks in
Critical Thinking.
Princeton, NJ: Educational Testing Service.
Garrett, J. (2009). English composition report. Los Angeles, CA: California State University.
Glenn, David. Scholar Raises Doubts about the Value of a Test of
Student Learning. The Chronicle of Higher Education, June 2, 2010.
Graff, G., &
Birkenstein, C. (May/June 2008). A Progressive Case for Educational
Standardization: How not to respond to the Spellings report. Academe
Online,
http://www.aaup.org/AAUP/pubsres/academe/2008/MJ/Feat/graf.htm
(May 20, 2008).
Hafner, A. (2010). NSSE 2009 findings: Comparisons between CSU students and far West peers and trends over time. Los Angeles, CA: California State University.
Hardison, C. M., & Vilamovska, A. (2009).
The
Collegiate Learning Assessment: Setting standards for performance at
a college or university.
Santa Monica, CA: RAND Education.
Hardison, C.M., & Vilamovska, A-M. (2008). Critical thinking performance tasks: Setting and applying standards for college-level performance. PM-2487-CAE. Santa Monica, CA: Rand.
Hosch, Braden
J. Time on Test, Student Motivation, and Performance on the
Collegiate Learning Assessment: Implications for Institutional
Accountability, Association for Institutional Research Annual Forum,
Chicago, IL, June, 2010.
Klein, S., Kuh,
G., Chun, M., Hamilton, L., & Shavelson, R. (2005). An approach to
measuring cognitive outcomes across higher-education institutions.
Research in
Higher Education, 46,
3, 251-276.
Klein, S. & Bolus, R. (1982). An analysis of the relationship between clinical skills and bar examination results. Report prepared for the Committee of Bar Examiners of the State Bar of California and the National Conference of Bar Examiners.
Klein, S. (1983). Relationship of bar examinations to performance tests of lawyering skills. Paper presented to the American Educational Research Association, Montreal, April. (Reprinted in Professional Education Researcher Notes, 1982, 4, 10-11.)
Klein, S. (2008). Characteristics of hand and machine-assigned scores to college students' answers to open-ended tasks. In Festschrift for David Freedman, D. Nolan and T. Speed, editors. Beachwood, OH: Institute for Mathematical Statistics, 2008.
Klein, S., Benjamin, R., Shavelson, R., & Bolus, R. (2007). The
Collegiate Learning Assessment: Facts and fantasies.
Evaluation Review, 31(5),
415-439.
Klein, S., Freedman, D., Shavelson, R., & Bolus, R. (2008).
Assessing school effectiveness.
Evaluation Review, 32(6),
511-525.
Klein, S. P., Kuh, G. D., Chun, M., Hamilton, L., & Shavelson, R.
(2003, April).
The
search for value-added: Assessing and validating selected higher
education outcomes.
Paper presented at the 88th
American Educational Research Association (AERA).
Klein, S. P., Kuh, G. D., Chun, M., Hamilton, L., & Shavelson, R.
(2005). An approach to measuring cognitive outcomes across higher
education institutions.
Research in Higher Education, 46(3),
251-276.
Klein, S., Liu, O. L., Sconing, J., Bolus, R., Bridgeman, B.,
Kugelmass, H., Nemeth, A., Robbins, S., & Steedle, J. (2009).
Test validity study report.
Retrieved March 31, 2010,
from
the Web:http://www.voluntarysystem.org/docs/reports/TVSReport_Final.pdf
Kuh, G.
(2006). Director’s Message in: Engaged Learning: Fostering Success
for All Students. Bloomington, Indiana: National Survey of Student
Engagement.
Landgraf, K.
(2005). Cover letter accompanying the distribution of Braun (2005)
report.
McClelland, D.C. (1973). Testing for competence rather than for "intelligence." American Psychologist, 28(1), 1-14.
Powers, D.,
Burstein, J., Chodorow, M., Fowles, M., & Kukich, K. (2000).
Comparing
the validity of automated and human essay scoring
(GRE No. 98-08a, ETS
RR-00-10).
Princeton, NJ: Educational Testing Service.
Powers, D., Burstein, J., Chodorow, M., Fowles, M., & Kukich, K. (2001). Stumping e-rater: Challenging the validity of automated scoring. (GRE No. 98-08Pb, ETS RR-01-03). Princeton, NJ: Educational Testing Service.
Reyna, Valerie
F. and Farley, Frank. Risk and Rationality in Adolescent Decision
Making: Implications for Theory, Practice and Public Policy.
Psychological Science in the Public Interest, Volume 7, No. 1,
September 2006
Sackett, P.R.,
Borneman, M.J., & Connelly, B.S. (2008). High-stakes testing in
higher education and employment. American Psychologist, 63(4),
215-227.
Shavelson,
R.J. (2007a). Assessing student learning responsibly: From history
to an audacious proposal. Change. January/February, 2007.
Shavelson,
R.J. (2007b). Student learning assessment: From history to an
audacious proposal. AAC&U.
Shavelson, R., & Huang, L. (2005).
CLA
conceptual framework.
New York: Council for Aid to Education.
Shavelson,
R.J. (2007). A brief history of student learning: How we got where
we are and a proposal for where to go next. Washington, DC:
Association of American Colleges and Universities’ The Academy in
Transition.
Shavelson, R.
(2007 January/February). Assessing student learning responsibly:
From history to an audacious proposal. Change, 26-33.
Shavelson,
R.J. (2008a,b): Aspen Paper and Wingspread Paper
Shermis, M. D.
(2008). The Collegiate Learning Assessment: A critical perspective.
Assessment Update, 20(2), 10-12.
Sowell, Elizabeth R., Thompson, Paul M., Holmes, Colin J., Jernigan, Terry L., and Toga, Arthur W. In vivo evidence for post-adolescent brain maturation in frontal and striatal regions. Nature Neuroscience, Volume 2, No. 10, October 1999.
Steedle, J.
(2009). Advancing institutional value-added score estimation. New
York: Council for Aid to Education
Taylor, K.L.,
& Dionne, J-P. (2000). Accessing Problem-Solving Strategy Knowledge:
The Complementary Use of Concurrent Verbal Protocols and
Retrospective Debriefing. Journal of Educational Psychology, 2000,
92(3), 413-425.
U.S.
Department of Education (2006). A test of leadership: Charting the
Future of U.S. Higher Education. Washington, D.C.