Seda: Evaluation and Analysis

**Practicality**

 * 1) Did the test take as long as expected to design? **Yes. But some parts needed modifications, which required a little more time.**
 * 2) Was the test easy to administer? (arrangement of seating, distribution of the test among the learners, supervision, necessary equipment, timing, etc.) **Supervision, distribution, and timing were easy, as only three students took the test.**
 * 3) Were the instructions clear and unambiguous, and examples for each task type provided, so that the test-takers knew exactly what to do? **Yes. Everything necessary was provided to them, so they had no difficulty.**
 * 4) Was it easy to mark and score? **Not that easy, actually, because in the first part some answers differed from what I expected. Even so, the answers were not truly distinct, since the students all had to use the text to answer; only some of the wording varied.**

**Reliability**

 * 1) Does the test provide consistent and credible measurement of language ability? **Since it focuses only on reading, it cannot measure general language ability. But as for reading, it gave some credible results and also helped me to see the students' progress in grammar.**
 * 2) Was the variety of tasks and items enough to provide a representative sample of language ability? **As I said before, it does not include all the aspects of language; rather, it focuses on a specific aim, which is reading comprehension. A vocabulary task could be added, but it would require a richer text.**
 * 3) Did the answer key provide an objective scoring of test items? **Yes, it was helpful, but only to some extent, as some answers were unexpected.**
 * 4) Did the format of the test reflect the format of the activities in the classroom, in order to ensure that the learners are familiar with the tasks and rubrics? **The three students told me that they were familiar with the format, but even so I gave a short explanation to be sure.**
 * 5) Were the scores affected by intra-marker variables? For example, the test-marker's personal knowledge of a student, sequence of marking (a good test following a poor test gets graded higher), order of marking (fatigue: the test marked last may be graded differently than the first test marked). **No, I tried to be as objective as I could.**

NOTE: Practicality and reliability are more significant in norm-referenced tests (e.g. placement and proficiency). In criterion-referenced testing validity is more important. The multiple-choice format is the best for both norm- and criterion-referenced testing, but it is restricted to assessing receptive skills only. Tests of speaking and writing have high validity, but do pose reliability and practicality problems.

**Validity**

 * 1) What was the face validity, i.e. did the test appear to test what it is supposed to test? **In my opinion, it did.**
 * 2) What was the content validity, i.e. did the test really measure what it is supposed to measure, and nothing else? (e.g. summarizing a text heard from tape not only checks writing, but also listening comprehension and the ability to select, extract, and condense the most essential information; general knowledge, intelligence-testing, and culturally-loaded questions do not test linguistic competence but the extralinguistic knowledge or analytical skills of the testee.) **It measures what it is supposed to measure.**
 * 3) What was the construct validity, i.e. did the test reflect the relative importance of the elements specified in the test construct? **Yes.**

**Authenticity**

 * 1) Was the test as realistic as possible and closely related to the situations in which the examinees will perform in real life?**Not really.**

**Washback**

 * 1) Did the test have any positive or negative washback effect on the teaching programme?
 * 2) Did the test reinforce the relative importance of skills and language focus in the classroom?