Computerized Adaptive Testing Environment for Clients

Emīls Kālis, State Probation Service of Latvia

September 3, 2018, Helsinki

Evaluation of programs

Why is it so hard to answer: does a program reduce recidivism?

Six steps for evaluation of programs

Step | What | Why | How
1 | Content | To be sure that the program is theoretically well grounded. | Analysis of content / expertise
2 | Process of implementation | To check whether organizational aspects in practice match the program requirements. | Analysis of practice
3 | Inclusion of participants | To see to what extent the principle of responsivity is taken into account. | Analysis of selection principles and the instruments applied
4 | Effectiveness of program | To be sure that the program produces the intended changes in participants. | Analysis of changes

From evaluation to evolution of programs

Step 5

Calibration of program (interest of the program developer)

  • are there variables that make some participants experience more change than others?
  • are these changes stable? what means are necessary to make these changes sustainable?

Results of such analysis can lead to changes in the content of the program or in the criteria for inclusion of participants.

Does the program help reduce recidivism?

Interest of society

Step 6 - Are the changes initiated by the program related to a reduction of recidivism?

This question can be answered only after we have positive answers to the previous five steps!

Step 4 (analysis of changes) is the most important step on the way to step 6. Yet in studies we often see the effectiveness of programs measured by recidivism rates alone, ignoring the changes related to the specific program. In that situation it is almost impossible to replicate results, because too many variables are in play, including variables not related to the program, and because of questionable research designs.

Obstacles for measuring changes

  • Neglecting importance of analysis of change, instead preferring analysis of recidivism rate
  • Lack of instruments measuring changes
  • Use of inappropriate instruments, e.g., self-report measures

Problems with self-report measures

  • long lists of tiresome questions - even people with high intellectual abilities will sometimes find some questions hard to answer.
  • questionnaires work better in studies where participants have little interest in deceiving. For example, how confident would you feel about a negative answer from your client (sentenced for a violent offence) to the question: How likely is it that you are going to hit the guy who insulted you?
  • hard to develop a multicultural measure

  • very hard to achieve sufficient reliability of a measure. One can reach a high level of reliability by asking one question many times in different words. But when many different questions are used to capture the whole construct of interest, substantial item-specific noise appears.
  • almost impossible to develop parallel forms of a questionnaire that are empirically proven to give a stable measure.

In other words, differences between the first and the second application of a questionnaire may be barely related to the real changes of interest.
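The two reliability points above can be illustrated with standard classical test theory formulas. The sketch below is not taken from the slides: the function names and the example numbers are illustrative assumptions, and the difference-score formula assumes equal score variances at both time points.

```python
def lengthened_reliability(item_reliability, n_items):
    """Spearman-Brown formula: reliability of a scale built from
    n_items parallel rewordings of one question."""
    r = item_reliability
    return n_items * r / (1.0 + (n_items - 1) * r)


def difference_score_reliability(rel_first, rel_second, corr_between):
    """Reliability of (second score - first score), assuming equal
    variances at both time points (classical test theory)."""
    if corr_between >= 1.0:
        raise ValueError("corr_between must be below 1")
    return ((rel_first + rel_second) / 2.0 - corr_between) / (1.0 - corr_between)


# Hypothetical numbers, chosen only for illustration.
print(round(lengthened_reliability(0.3, 10), 2))
print(round(difference_score_reliability(0.8, 0.8, 0.7), 2))
```

With these assumed inputs, ten rewordings of a weak question (reliability 0.3) give a scale reliability of about 0.81, while two applications that each have reliability 0.8 and correlate 0.7 give a change score with reliability of only about 0.33.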

Different approach to the problem

Principles for developing new measures

  • piece by piece - measuring each specific target of the program separately
  • suitable for clients with a wide range of intellectual abilities
  • applicable in a multicultural context
  • insensitive to deception
  • picture-oriented items (questions) with a few standard question types, e.g., "Please select the picture where ….."

Different approach to the problem


  • Item response theory
    • an approach that makes it possible to test measurement equivalence across different test forms and time points
  • Computerized adaptive testing
    • a mechanism for reducing the number of items (questions). A test taker is not exposed to all test items (e.g., 100) but only to the appropriate items (e.g., 30) that match his/her knowledge level. The number of items presented to a test taker depends on the testing process, in which the algorithm strives to obtain a reliable measure for that particular person.

If one can solve 3+4 but cannot solve 38+23, the algorithm will find one's border of ability by giving tasks between 3+4 and 38+23.
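The adaptive idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CATEC implementation (which relies on R packages): it assumes a Rasch (one-parameter) model, a standard normal prior with a grid-based posterior-mean estimate, a hypothetical bank of 25 tasks with made-up difficulties, and a simulated test taker who solves every task easier than 1.0.

```python
import math


def p_correct(theta, difficulty):
    """Rasch model: chance that a person with ability theta solves the item."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def estimate_ability(responses, grid):
    """Posterior mean and SD of ability on a grid, standard normal prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalised prior
        for difficulty, correct in responses:
            p = p_correct(theta, difficulty)
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def run_cat(item_bank, answer, max_items=30, se_target=0.5):
    """Give items one at a time until the estimate is precise enough."""
    grid = [-4.0 + 0.05 * i for i in range(161)]
    remaining = dict(item_bank)
    responses, administered = [], []
    theta, se = estimate_ability(responses, grid)  # prior only
    while remaining and len(administered) < max_items and se > se_target:
        # Under the Rasch model the most informative item is the one
        # whose difficulty is closest to the current ability estimate.
        name = min(remaining, key=lambda k: abs(remaining[k] - theta))
        difficulty = remaining.pop(name)
        responses.append((difficulty, answer(name)))
        administered.append(name)
        theta, se = estimate_ability(responses, grid)
    return theta, se, administered


# Hypothetical item bank: 25 tasks with difficulties from -3 to 3.
bank = {f"task_{i:02d}": -3.0 + 0.25 * i for i in range(25)}
# Simulated test taker (an assumption): solves every task easier than 1.0.
theta, se, used = run_cat(bank, lambda name: bank[name] < 1.0)
print(f"ability estimate {theta:.2f}, SE {se:.2f}, items used {len(used)}")
```

The loop starts with a task of medium difficulty and then moves toward the region between the hardest task solved and the easiest task failed, which is the "border of ability" described above. The stopping rule (the target standard error and the item cap) is also an assumption chosen for the example.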

What is CATEC?

Computerized Adaptive Testing Environment for Clients (CATEC) is

  • an independent user interface application providing a dynamic view of test items (questions);
  • integrated with the case management system (PLUS) in order to verify the client and to relate test results to the particular client's case;
  • content and forms of questions are administered from open-source software: R, a language and environment for statistical computing;
  • the computerized adaptive testing process is handled by the R packages catR and catIrt.

Process of development of test

  • defining a target - ability/knowledge/attitude (the most important expected outcome from a program);
  • creation of ideas - trying to find out how the target could be measured by visual stimuli (gathering together the most creative and the most experienced workers);
  • creation of items (questions) - artistic work, embodying ideas;

  • setting up the test - technical work;
  • collection of data - run the test in practice;
  • analysis of data and development of measurement models;
  • applying measurement models in practice;
  • continuous evaluation of program, providing objective data for calibration of program and good basis for measuring recidivism.

Where are we and where are we going?

  • 2016-2017 - development of Computerized Adaptive Testing Environment for Clients (CATEC)
  • 2018-2019 - development of pilot tests
  • 2020…. - integrating tests into probation practice
  • …….. - evaluation of programs

Pilot tests

This year we plan to launch CATEC with pilot tests:

  • intellectual ability: abstract reasoning - non-verbal estimate of fluid intelligence;
  • ability to recognize emotions.

Example of CATEC: officer login

Example of CATEC: relating client with the main system

Example of CATEC: verification of client
