"PISA" redirects here. For other uses, see Pisa (disambiguation).
- Purpose: Comparison of education attainment across the world
- Membership: 59 government education departments
- Leader: Head of the Early Childhood and Schools Division
- Governing body: PISA Governing Body (Chair: Lorna Bertrand, England)
The Programme for International Student Assessment (PISA) is a worldwide study by the Organisation for Economic Co-operation and Development (OECD) in member and non-member nations intended to evaluate educational systems by measuring 15-year-old school pupils' scholastic performance in mathematics, science, and reading. It was first conducted in 2000 and has been repeated every three years since. Its aim is to provide comparable data with a view to enabling countries to improve their education policies and outcomes. It measures problem solving and cognition in daily life.
The 2015 version of the test was published on 6 December 2016.
Influence and impact
PISA and similar international standardised assessments of educational attainment are increasingly used in the process of education policymaking at both national and international levels.
PISA was conceived to set in a wider context the information provided by national monitoring of education system performance through regular assessments within a common, internationally agreed framework; by investigating relationships between student learning and other factors they can "offer insights into sources of variation in performances within and between countries".
Until the 1990s, few European countries used national tests. In the 1990s, ten countries / regions introduced standardised assessment, and since the early 2000s, ten more followed suit. By 2009, only five European education systems had no national student assessments.
The impact of these international standardised assessments in the field of educational policy has been significant, in terms of the creation of new knowledge, changes in assessment policy, and external influence over national educational policy more broadly.
Creation of new knowledge
Data from international standardised assessments can be useful in research on causal factors within or across education systems. Mons notes that the databases generated by large-scale international assessments have made possible the carrying out, on an unprecedented scale, of inventories and comparisons of education systems in more than 40 countries and on themes ranging from the conditions for learning in mathematics and reading, to institutional autonomy and admissions policies. They allow typologies to be developed that can be used for comparative statistical analyses of education performance indicators, thereby identifying the consequences of different policy choices. They have generated new knowledge about education: PISA findings have challenged deeply embedded educational practices, such as the early tracking of students into vocational or academic pathways.
Barroso and de Carvalho find that PISA provides a common reference connecting academic research in education and the political realm of public policy, operating as a mediator between different strands of knowledge from the realms of education and public policy. However, although the key findings from comparative assessments are widely shared in the research community, the knowledge they create does not necessarily fit with government reform agendas; this leads to some inappropriate uses of assessment data.
Changes in national assessment policy
Emerging research suggests that international standardised assessments are affecting national assessment policy and practice. PISA is being integrated into national policies and practices on assessment, evaluation, curriculum standards and performance targets; its assessment frameworks and instruments are being used as best-practice models for improving national assessments; many countries have explicitly incorporated and emphasised PISA-like competencies in revised national standards and curricula; others use PISA data to complement national data and validate national results against an international benchmark.
External influence over national educational policy
More important than its influence on countries' student assessment policies is the range of ways in which PISA influences countries' broader education policy choices.
Policy-makers in most participating countries see PISA as an important indicator of system performance; PISA reports can define policy problems and set the agenda for national policy debate; policymakers seem to accept PISA as a valid and reliable instrument for internationally benchmarking system performance and changes over time; most countries, irrespective of whether they performed above, at, or below the average PISA score, have begun policy reforms in response to PISA reports.
Against this, it should be noted that impact on national education systems varies markedly. For example, in Germany, the results of the first PISA assessment caused the so-called 'PISA shock': a questioning of previously accepted educational policies; in a state marked by jealously guarded regional policy differences, it led ultimately to an agreement by all Länder to introduce common national standards and even an institutionalised structure to ensure that they were observed. In Hungary, by comparison, which shared similar conditions to Germany, PISA results have not led to significant changes in educational policy.
Because many countries have set national performance targets based on their relative rank or absolute PISA score, PISA assessments have increased the influence of their (non-elected) commissioning body, the OECD, as an international education monitor and policy actor, which implies an important degree of 'policy transfer' from the international to the national level; PISA in particular is having "an influential normative effect on the direction of national education policies". Thus, it is argued that the use of international standardised assessments has led to a shift towards international, external accountability for national system performance; Rey contends that PISA surveys, portrayed as objective, third-party diagnoses of education systems, actually serve to promote specific orientations on educational issues.
National policy actors refer to high-performing PISA countries to "help legitimise and justify their intended reform agenda within contested national policy debates". PISA data are "used to fuel long-standing debates around pre-existing conflicts or rivalries between different policy options, such as in the French Community of Belgium". In such instances, PISA assessment data are used selectively: in public discourse, governments often use only superficial features of PISA surveys, such as country rankings, and not the more detailed analyses. Rey (2010:145, citing Greger, 2008) notes that the real results of PISA assessments are often ignored as policymakers selectively refer to data in order to legitimise policies introduced for other reasons.
In addition, PISA's international comparisons can be used to justify reforms with which the data themselves have no connection. In Portugal, for example, PISA data were used to justify new arrangements for teacher assessment, based on inferences the assessments and data did not support; they also fed the government's discourse on pupils repeating a year, a practice which, according to research, fails to improve student results. In Finland, the country's PISA results, deemed excellent by other countries, were used by ministers to promote new policies for 'gifted' students. Such uses and interpretations often assume causal relationships that cannot legitimately be based on PISA data; establishing causation would normally require fuller investigation through qualitative in-depth studies and longitudinal surveys based on mixed quantitative and qualitative methods, which politicians are often reluctant to fund.
Recent decades have witnessed an expansion in the uses to which PISA and similar assessments are put, from assessing students' learning, to connecting "the educational realm (their traditional remit) with the political realm". This raises the question whether PISA data are sufficiently robust to bear the weight of the major policy decisions that are being based upon them, for, according to Breakspear, PISA data have "come to increasingly shape, define and evaluate the key goals of the national / federal education system". This implies that those who set the PISA tests – e.g. in choosing the content to be assessed and not assessed – are in a position of considerable power to set the terms of the education debate, and to orient educational reform in many countries around the globe.
PISA stands in a tradition of international school studies, undertaken since the late 1950s by the International Association for the Evaluation of Educational Achievement (IEA). Much of PISA's methodology follows the example of the Trends in International Mathematics and Science Study (TIMSS, started in 1995), which in turn was much influenced by the U.S. National Assessment of Educational Progress (NAEP). The reading component of PISA is inspired by the IEA's Progress in International Reading Literacy Study (PIRLS).
PISA aims to test literacy in three competence fields: reading, mathematics, and science, each reported on a 1000-point scale.
The PISA mathematics literacy test asks students to apply their mathematical knowledge to solve problems set in real-world contexts. To solve the problems students must activate a number of mathematical competencies as well as a broad range of mathematical content knowledge. TIMSS, on the other hand, measures more traditional classroom content such as an understanding of fractions and decimals and the relationship between them (curriculum attainment). PISA claims to measure education's application to real-life problems and lifelong learning (workforce knowledge).
In the reading test, "OECD/PISA does not measure the extent to which 15-year-old students are fluent readers or how competent they are at word recognition tasks or spelling." Instead, students should be able to "construct, extend and reflect on the meaning of what they have read across a wide range of continuous and non-continuous texts."
PISA is sponsored, governed, and coordinated by the OECD, but paid for by participating countries.
Method of testing
The students tested by PISA are aged between 15 years and 3 months and 16 years and 2 months at the beginning of the assessment period. The school year pupils are in is not taken into consideration. Only students at school are tested, not home-schoolers. In PISA 2006, however, several countries also used a grade-based sample of students. This made it possible to study how age and school year interact.
To fulfill OECD requirements, each country must draw a sample of at least 5,000 students. In small countries like Iceland and Luxembourg, where there are fewer than 5,000 students per year, an entire age cohort is tested. Some countries used much larger samples than required to allow comparisons between regions.
Each student takes a two-hour handwritten test. Part of the test is multiple-choice and part involves fuller answers. There are six and a half hours of assessment material, but each student is not tested on all of it. Following the cognitive test, participating students spend nearly one more hour answering a questionnaire on their background, including learning habits, motivation, and family. School directors fill in a questionnaire describing school demographics, funding, etc. In 2012, participants were, for the first time in the history of large-scale testing and assessment, offered a new type of problem: interactive (complex) problems requiring the exploration of a novel virtual device.
In selected countries, PISA started experimentation with computer adaptive testing.
Countries are allowed to combine PISA with complementary national tests.
Germany does this in a very extensive way: On the day following the international test, students take a national test called PISA-E (E=Ergänzung=complement). Test items of PISA-E are closer to TIMSS than to PISA. While only about 5,000 German students participate in the international and the national test, another 45,000 take only the latter. This large sample is needed to allow an analysis by federal states. Following a clash about the interpretation of 2006 results, the OECD warned Germany that it might withdraw the right to use the "PISA" label for national tests.
From the beginning, PISA has been designed with one particular method of data analysis in mind. Since students work on different test booklets, raw scores must be 'scaled' to allow meaningful comparisons. Scores are scaled so that the OECD average in each domain (mathematics, reading and science) is 500 and the standard deviation is 100. This holds only for the initial PISA cycle, when the scale was first introduced; subsequent cycles are linked to previous cycles through IRT scale-linking methods.
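The initial standardisation step can be illustrated as follows. This is a minimal sketch with hypothetical proficiency estimates; the actual PISA procedure applies this kind of transformation to IRT-based estimates over the full OECD sample, not to raw sums.

```python
# Minimal sketch of the first-cycle scaling: linearly transform
# proficiency estimates so that the mean is 500 and the standard
# deviation is 100. The input values here are hypothetical.
def scale_to_pisa(estimates, target_mean=500.0, target_sd=100.0):
    n = len(estimates)
    mean = sum(estimates) / n
    sd = (sum((x - mean) ** 2 for x in estimates) / n) ** 0.5
    return [target_mean + target_sd * (x - mean) / sd for x in estimates]

scaled = scale_to_pisa([0.4, -1.2, 0.9, 0.1, -0.3])
# After scaling, the mean of `scaled` is 500 and its standard deviation is 100.
```

Later cycles do not re-standardise in this way; they are equated back to the established scale so that scores remain comparable over time.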
This generation of proficiency estimates is done using a latent regression extension of the Rasch model, a model of item response theory (IRT), also known as the conditioning model or population model. The proficiency estimates are provided in the form of so-called plausible values, which allow unbiased estimates of differences between groups. The latent regression, together with the use of a Gaussian prior probability distribution of student competencies, allows estimation of the proficiency distributions of groups of participating students. The scaling and conditioning procedures are described in nearly identical terms in the Technical Reports of PISA 2000, 2003 and 2006. NAEP and TIMSS use similar scaling methods.
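The core of the Rasch model mentioned above can be sketched briefly; the latent regression on background variables and the plausible-value machinery of the full conditioning model are omitted here.

```python
import math

# Basic Rasch model: the probability that a student with proficiency
# `theta` answers an item of difficulty `b` correctly is a logistic
# function of the difference theta - b.
def rasch_p_correct(theta: float, b: float) -> float:
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When proficiency equals item difficulty, the probability is exactly 0.5;
# it rises towards 1 as theta exceeds b.
p_equal = rasch_p_correct(theta=0.8, b=0.8)   # → 0.5
p_easier = rasch_p_correct(theta=2.0, b=0.0)  # ≈ 0.88
```

Because students sit different booklets, a model of this kind is what allows their responses to be placed on a single proficiency scale despite answering different items.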
All PISA results are tabulated by country; recent PISA cycles include separate provincial or regional results for some countries. Most public attention concentrates on just one outcome: countries' mean scores and the rankings they produce. In the official reports, however, country-by-country rankings are given not as simple league tables but as cross tables indicating, for each pair of countries, whether or not the difference in mean scores is statistically significant (i.e. unlikely to be due to random fluctuations in student sampling or in item functioning). In favorable cases, a difference of 9 points is sufficient to be considered significant.
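The pairwise comparison behind those cross tables can be sketched like this. The standard errors shown are illustrative values, not actual PISA figures, and the real analysis also accounts for the complex sampling design.

```python
import math

# Two country means differ significantly at roughly the 5% level when
# the gap exceeds 1.96 standard errors of the difference.
def significant_difference(mean_a, se_a, mean_b, se_b, z_crit=1.96):
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) > z_crit * se_diff

# With standard errors of about 2.3 points each, the threshold is roughly
# 1.96 * sqrt(2) * 2.3 ≈ 6.4 points, so a 9-point gap is significant:
significant_difference(512.0, 2.3, 503.0, 2.3)  # → True
```

This is why a gap that looks large in a league table may still be reported as statistically indistinguishable when the countries' standard errors are wide.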
PISA never combines mathematics, science and reading domain scores into an overall score. However, commentators have sometimes combined test results from all three domains into an overall country ranking. Such meta-analysis is not endorsed by the OECD, although official summaries sometimes use scores from a testing cycle's principal domain as a proxy for overall student ability.
PISA 2015 was presented on 6 December 2016, with results for around 540,000 participating students in 72 countries, with Singapore emerging as the top performer in all categories.
Welcome back to Wiki Wednesday! As part of the Digital Rhetoric Collaborative’s current focus on activism in online spaces, we’re dedicating a series of Wiki Wednesday posts to interrogating Wikipedia as a site for making, sharing, and circulating meaning. We’ve already shared a few posts that work toward this focus. Heather Lang’s reflection on some of the obstacles she faced as a female graduate student trying to adapt to the culture of Wikipedia kicked off our series. I followed up a few weeks ago with a post about how Wikipedia’s adherence to print culture limits the types of knowledge it can represent. More recently, we invited Eryk Salvaggio, Communications Associate for the Wiki Education Foundation, to share some information about all of the support the Foundation offers for instructors interested in getting their students involved in Wikipedia writing projects. His post also included some rich discussion of the potential of the Wikipedia Education Program to help make the encyclopedia a more diverse and inclusive environment, focusing especially on the gender gap. Because a large majority of Wikipedia editors are male, Wikipedia’s coverage often lacks articles on topics about or of interest to women. In the most recent post in this series, I discussed a Wikipedia writing project designed to remediate the encyclopedia’s gender gap, to expand its coverage and representation of articles of interest to and about women and lgbtq identities. A few weeks ago, that project – which I collaborated on with the amazing Sarah Einstein – was just getting underway. This week, I’m coming back to it to:
- highlight the work students were able to accomplish in Wikipedia,
- discuss the major challenges of the project, especially concerning how some students and their edits were met with distrust and immediate censure by other Wikipedia editors, and finally,
- acknowledge the risks associated with this type of critical digital praxis.
What We Accomplished
Because this project engaged students across three sections of the same class – a junior composition course focused on gender and writing – we were able to engage multiple articles across topics dealing with women’s and lgbtq issues and representation. Accordingly, students worked on and created articles on a variety of subjects, from yoga pants, women’s sports teams, and biographies of under-represented women to Feminism in Norway, Women in Nursing, Kate Bornstein, Belle Knox, and multiple others.
In total, 55 students edited 57 existing articles and created 14 new ones.
At this point, it’s difficult to assess their lasting impact. Students in one section are still finishing up. Some students’ edits, as discussed below, were edited down or reverted by other Wikipedians. However, I do believe it’s safe to say we made a sizable impact on the representation of women and lgbtq issues and identities, despite how far we still have to go with this type of work.
For full access to student-edited articles, see the course page dashboards for each section, which also detail other statistics, such as total edits, characters added, and article views:
What’s a Deletionist? Anti-LGBTQ Sentiments in Wikipedia
Although many students experienced very little or no resistance to their edits, those who did (either through another editor reverting their revisions or tagging their newly created articles as candidates for “speedy deletion”) were usually adding LGBTQ content. One of the most disturbing instances of student/LGBTQ censure was the immediate tagging of the Black AIDS Institute article with a “speedy deletion” recommendation. This recommendation, which can be accessed and viewed in an older version of the article, was applied 12 seconds after a student had created the article and a lead section (intro) as a placeholder from which to make more edits. 12 seconds.
Rather than providing a rationale for deletion, such tagging places the onus on the creator of the article to prove its value and/or notability through a “contest this speedy deletion” process. For a novice editor, someone participating in the Wiki Education Program for the first time, such a process is, of course, completely foreign and intimidating. The fact that this student was creating the article with the intent to provide more content through additional edits is additionally troubling. The editor responsible for the tag, user:Finngall, apparently could not wait longer than 12 seconds to nominate it for deletion. Look more deeply into his contributions, and we find that this particular editor specializes in this kind of work; he spends a lot of time specifically looking for new articles that might qualify for removal. Finngall is a deletionist.
But if a project like the Wikipedia Education Program is to work, and if it is to enable the remediation of the encyclopedia’s gender gap and create a more inclusive encyclopedia, we must re-think and challenge such deletionism. We cannot afford to invite novice student-editors into the community and reward them for their efforts with what amounts to hasty and inconsiderate erasure of both their identities as contributors and the already marginalized content of their contributions. With help from instructors, this student was able to contest the deletion and finish the article contribution, but we must wonder how likely it is that this student will want to make additional contributions after this negative first experience.
Critical Digital Praxis: A Risky Endeavor
In part 1 of this feature, I wrote about how this kind of pedagogical model, which engages students in meaningful public writing, might be better understood as a type of critical digital praxis,
“a model for making writing interventions in public digital cultures in order to both better understand the writing activities of those cultures and make meaningful impressions with/in them. I invoke praxis in the tradition of Paulo Freire and Hannah Arendt, to indicate a socially meaningful and rhetorically conscious method of active response to and within actual social cultures, one that bases such action on careful reflection of the ways in which writing mediates social realities and hierarchies. Praxis is the understanding and enacted practice of writing to effect social action, to establish relationships, to construct our selves and others in the world. It is what makes us human, and what makes us capable of making responsible, critical and reflective meaning in our daily lives. In coming to a new understanding of the cultural politics of representation in Wikipedia, students engage in critical digital praxis to become more capable citizens of our digital world.”
Now, on the other side of the project, I continue to believe that assignments like this one can embody such an optimistic articulation of the potential of writing pedagogy. Furthermore, I believe projects like these can change students and can help them change public culture – that they can make students acknowledge their own capability for cultural participation. Yet I am also worried about the risks associated with students working in (digital) public spaces. How can we create collaborative projects that mitigate students’ risk of censure or marginalization but still accomplish public work? Do the risks outweigh the more positive outcomes? How do we balance the motivation to allow students to accomplish public praxis while still protecting them as novice writers?