Jindrich KLUFA
Prague University of Economics Czech Republic [email protected]
Abstract
Entrance exams at Prague University of Economics are analysed in the present paper. The admission procedure at the university consists of tests in English and mathematics. Two different tests in mathematics are currently used at Prague University of Economics. We compare these tests (the distributions of the number of points in the test in mathematics) from a probability point of view. The entrance tests in mathematics are multiple choice question tests: Test 1 has fifteen questions, Test 2 has twelve questions, and each question in both tests has five answers. The questions in Test 1 and Test 2 are independent. Therefore the binomial distribution was used as the probability model for comparing the distributions of the number of points in the test in mathematics. The results of the paper can be used to improve the admission process at Prague University of Economics.
Keywords: Entrance examination, math tests, probability distribution.
Introduction
The math tests used in the entrance examination at Prague University of Economics are multiple choice question tests. These tests are prepared by the Department of Mathematics of the Faculty of Informatics and Statistics. A statistical analysis of the math tests can be found in Klůfa (2016). Multiple choice question tests are suitable for university entrance examinations: they are objective, and the results can be evaluated easily for a large number of students. On the other hand, a student can obtain a certain number of points in the test purely by guessing the right answers. This problem is addressed e.g. in Zhao (2005), Premadasa (1993) and Zhao (2006). This article studies the probability of obtaining a certain score by pure guesswork and introduces a conversion scheme which converts raw test scores into standard percentage marks (the probabilistic analysis shows that the optimum number of choices of answers is four). In Kaspříková and Klůfa (2012) it was shown that the risk that students with lower performance levels succeed in the entrance examination at Prague University of Economics is negligible.
The admission process is analysed e.g. in the following educational research. The relation between the results of the entrance exam test and university study results at Charles University (Faculty of Mathematics and Physics) is studied in Zvára and Anděl (2001). The same problem at the Czech University of Life Sciences is studied in Poláčková and Svatošová (2013). A statistical comparison of the ways of accepting students at Prague University of Economics is given in Klůfa (2015b). An analysis of the entrance exams at the Czech University of Life Sciences (the relation between the results of the entrance exam test and university study results) can be found in Kučera, Svatošová and Pelikán (2015). An analysis of the entrance tests at Comenius University in Bratislava can be found in Kohanová (2012). Similar problems are solved in Kubanová and Linda (2012), Hrubý (2016), and Bartoška, Brožová, Šubrt and Rydval (2013). The dependence of the results of entrance examinations on test variants is analysed in Klůfa (2015a). In this paper we shall study the entrance examination in mathematics from a probability point of view.
The Study
Two different tests in mathematics are currently applied in the admission process at Prague University of Economics:
Test 1. The test has 10 questions for 5 points and 5 questions for 10 points (100 points total). Questions are independent. Each question has 5 answers (one answer is correct), wrong answer is not penalized.
Test 2. The test has 8 questions for 6 points and 4 questions for 13 points (100 points total). Questions are independent. Each question has 5 answers (one answer is correct), wrong answer is not penalized.
Test 1 is applied in the admission process at the Faculty of Informatics and Statistics, at the Faculty of Business Administration and at the Faculty of Finance and Accounting. Test 2 is applied in the admission process at the Faculty of International Relations. We shall compare these tests (the distributions of the number of points in the test in mathematics) from a probability point of view.
The tests in mathematics correspond to the following general model: Let us consider n independent random trials having two possible outcomes, say “success” (right answer) and “failure” (wrong answer), with probabilities p and (1-p) respectively. The probability p of answering a question correctly (under the assumption that each of the m answers to a particular question is chosen with the same probability and just one answer is correct) is p = 1/m.
Let X denote the number of successes (right answers) that occur in n independent random trials. X is a discrete random variable distributed according to the binomial law with parameters n and p. The probability that the number of successes is k (k = 0, 1, 2, …, n) is (see e.g. Rao (1973))
$$P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n-k} \qquad (1)$$
The expected value and the standard deviation of a random variable X distributed according to the binomial law with parameters n and p are
$$E(X) = np, \qquad \sigma(X) = \sqrt{D(X)} = \sqrt{np(1 - p)} \qquad (2)$$
where D(X) is the dispersion (variance) of the random variable X. The mode of the random variable X is its most probable value x̂. The distribution function F of the random variable X is a real function of one real variable defined for x in the interval (-∞, ∞) by the formula
F(x) = P(X ≤ x),
i.e. F(x) is the probability that the number of correct answers is less than or equal to x. In our case, the distribution function F of a random variable X distributed according to the binomial law with parameters n and p is
$$F(x) = 0 \ \text{ for } x < 0, \qquad F(x) = \sum_{k=0}^{[x]} \binom{n}{k} p^{k} (1 - p)^{n-k} \ \text{ for } x \ge 0 \qquad (3)$$
where [x] is integer part of x.
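Formulas (1) to (3) can be evaluated directly. The following is a minimal Python sketch, not the Mathematica code used in the paper. The function names `binom_pmf`, `binom_mean_sd` and `binom_cdf` are illustrative, and the example values n = 15 and m = 5 (the number of correctly guessed questions out of fifteen) are assumptions chosen only for demonstration.

```python
from math import comb, floor, sqrt


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p), formula (1)."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Expected value and standard deviation, formula (2)."""
    return n * p, sqrt(n * p * (1 - p))


def binom_cdf(x: float, n: int, p: float) -> float:
    """F(x) = P(X <= x), formula (3); floor(x) plays the role of [x]."""
    if x < 0:
        return 0.0
    upper = min(floor(x), n)
    return sum(binom_pmf(k, n, p) for k in range(upper + 1))


# Illustrative values: n questions, m equally likely answers per question.
n, m = 15, 5
p = 1 / m
mean, sd = binom_mean_sd(n, p)
print(f"E(X) = {mean:.3f}, sigma(X) = {sd:.3f}")
print(f"P(X = 3) = {binom_pmf(3, n, p):.6f}")
print(f"F(3.7) = {binom_cdf(3.7, n, p):.6f}")
```

Because F is a step function, `binom_cdf(3.7, n, p)` and `binom_cdf(3, n, p)` return the same value.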
Findings
Distribution of the number of points in the test 1 in mathematics
Now we shall study the distribution of the number of points in Test 1 in mathematics (the test has 10 questions for 5 points and 5 questions for 10 points). The discrete random variable
Y1 = number of points in Test 1 in mathematics
can take the values
0, 5, 10, 15, 20, 25, …, 95, 100.
To determine the distribution of the random variable Y1 we must find the probability P(Y1 = k) for k = 0, 5, 10, 15, 20, 25, …, 95, 100. For example, we shall find the probability that the number of points in Test 1 in mathematics is 25. Let us denote
X1 = number of right answers in the first 10 questions,
X2 = number of right answers in the following 5 questions.
It holds
P(Y1 = 25) = P[ (X1 = 1 ∩ X2 = 2) ∪ (X1 = 3 ∩ X2 = 1) ∪ (X1 = 5 ∩ X2 = 0) ] =
= P(X1 = 1 ∩ X2 = 2) + P(X1 = 3 ∩ X2 = 1) + P(X1 = 5 ∩ X2 = 0).
The random variables X1, X2 are independent, therefore we have
P(Y1 = 25) = P(X1 = 1) P(X2 = 2) + P(X1 = 3) P(X2 = 1) + P(X1 = 5) P(X2 = 0).
The random variable X1 has the binomial distribution with parameters n = 10 and p = 0.2. The random variable X2 has the binomial distribution with parameters n = 5 and p = 0.2. According to (1) we obtain
$$P(Y_1 = 25) = \binom{10}{1} 0.2^{1}\, 0.8^{9} \binom{5}{2} 0.2^{2}\, 0.8^{3} + \binom{10}{3} 0.2^{3}\, 0.8^{7} \binom{5}{1} 0.2^{1}\, 0.8^{4} + \binom{10}{5} 0.2^{5}\, 0.8^{5} \binom{5}{0} 0.8^{5} = 0.146098$$
Analogously, we can calculate the probability P(Y1=k) for other k=0, 5, 10, 15, ... , 95, 100 (see Table 1 and Figure 1). For this calculation we used software Mathematica (Statistics ‘DiscreteDistributions’) – see Wolfram (1991).
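The calculation for k = 25 extends to every possible score by enumerating all pairs (X1, X2). The paper used Mathematica; the sketch below is a Python counterpart under the same assumptions (independent questions, p = 0.2). The names `test1_distribution` and `P_GUESS` are illustrative and do not come from the paper.

```python
from collections import defaultdict
from math import comb

P_GUESS = 0.2  # one correct answer out of five, chosen at random


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p), formula (1)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def test1_distribution() -> dict[int, float]:
    """Return {points: probability} for Y1 = 5*X1 + 10*X2."""
    dist = defaultdict(float)
    for x1 in range(11):      # X1 ~ Bin(10, 0.2): questions worth 5 points
        for x2 in range(6):   # X2 ~ Bin(5, 0.2): questions worth 10 points
            prob = binom_pmf(x1, 10, P_GUESS) * binom_pmf(x2, 5, P_GUESS)
            dist[5 * x1 + 10 * x2] += prob
    return dict(sorted(dist.items()))


dist1 = test1_distribution()
for points, prob in dist1.items():
    print(f"{points:3d}  {prob:.6f}")
```

Several pairs (X1, X2) can give the same score, which is why the probabilities are accumulated per score rather than stored per pair.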
Table 1: Distribution of number of points in the test 1 in mathematics
Points in test Probability Points in test Probability
0 0.035184 55 0.002890
5 0.087961 60 0.000957
10 0.142937 65 0.000275
15 0.175922 70 0.000067
20 0.174547 75 0.000014
25 0.146098 80 0.000002
30 0.105227 85 3 × 10^-7
35 0.066057 90 2 × 10^-8
40 0.036467 95 1 × 10^-9
45 0.017761 100 3 × 10^-11
50 0.007634 Sum 1
Figure 1: Distribution of number of points in the test 1 in mathematics (polygon)
Now we shall find the distribution function F1 of the random variable Y1 (number of points in Test 1 in mathematics). For example, we shall find the probability that the number of points in Test 1 is less than or equal to 30, i.e. the function value F1(30). We have
F1(30) = P(Y1 ≤ 30) = P[ (Y1 = 0) ∪ (Y1 = 5) ∪ (Y1 = 10) ∪ (Y1 = 15) ∪ (Y1 = 20) ∪ (Y1 = 25) ∪ (Y1 = 30) ].
The random events (Y1 = 0), (Y1 = 5), (Y1 = 10), (Y1 = 15), (Y1 = 20), (Y1 = 25), (Y1 = 30) are disjoint (i.e. these random events cannot occur simultaneously), therefore
P(Y1 ≤ 30) = P(Y1 = 0) + P(Y1 = 5) + P(Y1 = 10) + P(Y1 = 15) + P(Y1 = 20) + P(Y1 = 25) + P(Y1 = 30).
Finally, from Table 1 we obtain
F1(30) = P(Y1 ≤ 30) = 0.867876,
i.e. under the assumption of random choice of answers approximately 86.8% of students get a test score less than or equal to 30. Similarly we can find other values of the distribution function F1 (see Table 2).
Table 2: Distribution function of number of points in test 1 in mathematics
Interval of values y F1(y) Interval of values y F1(y)
(-∞, 0) 0 [50, 55) 0.995795
[0, 5) 0.035184 [55, 60) 0.998685
[5, 10) 0.123145 [60, 65) 0.999642
[10, 15) 0.266082 [65, 70) 0.999917
[15 ,20) 0.442004 [70, 75) 0.999984
[20, 25) 0.616551 [75, 80) 0.999998
[25, 30) 0.762649 [80, 85) 1.000000
[30, 35) 0.867876 [85, 90) 1.000000
[35, 40) 0.933933 [90, 95) 1.000000
[40, 45) 0.970400 [95, 100) 1.000000
[45, 50) 0.988161 [100, ∞) 1
Figure 2: Distribution function of number of points in the test 1 in mathematics
Finally we shall find basic descriptive statistics of the distribution of the number of points in Test 1 in mathematics. Since Y1 = 5 X1 + 10 X2, the expected number of points is
E(Y1) = E(5 X1 + 10 X2) = 5 E(X1) + 10 E(X2).
According to (2) we have E(X1) = 10 · 0.2 = 2 and E(X2) = 5 · 0.2 = 1, so
E(Y1) = 5 · 2 + 10 · 1 = 20.
The expected number of points in Test 1 in mathematics is 20. The mode is the most probable number of points in Test 1 in mathematics. From Table 1 we have
ŷ1 = 15.
The random variables X1, X2 are independent, therefore the dispersion of the number of points in Test 1 in mathematics D(Y1) is (see e.g. Rao (1973))
D(Y1) = D(5 X1 + 10 X2) = 5^2 D(X1) + 10^2 D(X2).
Since (see formula (2))
D(X1) = 10 · 0.2 · 0.8 = 1.6, D(X2) = 5 · 0.2 · 0.8 = 0.8,
the dispersion of the number of points in Test 1 in mathematics is
D(Y1) = 25 · 1.6 + 100 · 0.8 = 120
and the standard deviation of the number of points in Test 1 in mathematics is
σ1 = 10.954.
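The values of E(Y1), D(Y1), σ1 and the mode obtained from formula (2) can be cross-checked by computing them directly from the distribution of Y1. The sketch below is only such a numerical check, written in Python under the paper's assumptions (independent questions, p = 0.2). The variable names `mean1`, `var1` and `mode1` are illustrative.

```python
from itertools import product
from math import comb, sqrt


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p), formula (1)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Distribution of Y1 = 5*X1 + 10*X2 with X1 ~ Bin(10, 0.2), X2 ~ Bin(5, 0.2).
dist1: dict[int, float] = {}
for x1, x2 in product(range(11), range(6)):
    score = 5 * x1 + 10 * x2
    prob = binom_pmf(x1, 10, 0.2) * binom_pmf(x2, 5, 0.2)
    dist1[score] = dist1.get(score, 0.0) + prob

# Moments computed from the definition, not from formula (2).
mean1 = sum(y * pr for y, pr in dist1.items())
var1 = sum((y - mean1) ** 2 * pr for y, pr in dist1.items())
mode1 = max(dist1, key=dist1.get)  # score with the largest probability

print(f"E(Y1) = {mean1:.3f}")
print(f"D(Y1) = {var1:.3f}, sigma1 = {sqrt(var1):.3f}")
print(f"mode  = {mode1}")
```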
Distribution of the number of points in the test 2 in mathematics
Now we shall study the distribution of the number of points in Test 2 in mathematics (the test has 8 questions for 6 points and 4 questions for 13 points). The discrete random variable
Y2 = number of points in Test 2 in mathematics
can take the values (see Table 3)
0, 6, 12, 13, 18, 19, 24, 25, 26, 30, 31, 32, 36, 37, 38, 39, 42, 43, 44, 45, ... , 88, 94, 100.
To determine the distribution of the random variable Y2 we must find the probability P(Y2 = k) for k = 0, 6, 12, 13, 18, 19, …, 94, 100. For example, we shall find the probability that the number of points in Test 2 in mathematics is 25. Let us denote
T1 = number of right answers in the first 8 questions,
T2 = number of right answers in the following 4 questions.
The only way to obtain 25 points is 25 = 6 · 2 + 13 · 1. The random variables T1, T2 are independent, therefore
P(Y2 = 25) = P[ (T1 = 2) ∩ (T2 = 1) ] = P(T1 = 2) P(T2 = 1).
The random variable T1 has the binomial distribution with parameters n = 8 and p = 0.2. The random variable T2 has the binomial distribution with parameters n = 4 and p = 0.2. According to (1) we obtain
$$P(Y_2 = 25) = \binom{8}{2} 0.2^{2}\, 0.8^{6} \binom{4}{1} 0.2^{1}\, 0.8^{3} = 0.120259.$$
Analogously, we can calculate the probability P(Y2 = k) for the other values k = 0, 6, 12, 13, 18, ... , 94, 100 (see Table 3 and Figure 3). For this calculation we used the software Mathematica (Statistics ‘DiscreteDistributions’) – see Wolfram (1991).
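The distribution of Y2 = 6 T1 + 13 T2 is the convolution of the distributions of the two independent parts 6 T1 and 13 T2. A possible Python sketch of this view follows. It is not the Mathematica code used in the paper, and the helper names `weighted_binomial` and `convolve` are illustrative.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p), formula (1)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def weighted_binomial(n: int, points: int, p: float) -> dict[int, float]:
    """Distribution of points * X for X ~ Bin(n, p), as {score: probability}."""
    return {points * k: binom_pmf(k, n, p) for k in range(n + 1)}


def convolve(a: dict[int, float], b: dict[int, float]) -> dict[int, float]:
    """Distribution of the sum of two independent discrete random variables."""
    out: dict[int, float] = {}
    for score_a, prob_a in a.items():
        for score_b, prob_b in b.items():
            total = score_a + score_b
            out[total] = out.get(total, 0.0) + prob_a * prob_b
    return dict(sorted(out.items()))


# Test 2: 8 questions for 6 points and 4 questions for 13 points, p = 0.2.
dist2 = convolve(weighted_binomial(8, 6, 0.2), weighted_binomial(4, 13, 0.2))
print(f"number of possible scores: {len(dist2)}")
print(f"P(Y2 = 25) = {dist2[25]:.6f}")
```

Since 6 and 13 have no common factor and T2 takes at most the value 4, every pair (T1, T2) gives a different score, so Y2 has 9 · 5 = 45 possible values.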
Table 3: Distribution of number of points in the test 2 in mathematics
Points in test Probability Points in test Probability
0 0.068719 51 0.007516
6 0.137439 52 0.000268
12 0.120259 55 0.000034
13 0.068719 56 0.001409
18 0.060130 57 0.003758
19 0.137439 58 0.000537
24 0.018790 61 0.000001
25 0.120259 62 0.000176
26 0.025770 63 0.001174
30 0.003758 64 0.000470
31 0.060130 68 0.000013
32 0.051540 69 0.000235
36 0.000470 70 0.000235
37 0.018790 74 4 × 10^-7
38 0.045097 75 0.000029
39 0.004295 76 0.000073
42 0.000034 81 0.000002
43 0.003758 82 0.000015
44 0.022549 87 7 × 10^-8
45 0.008590 88 0.000002
48 0.000001 94 1 × 10^-7
49 0.000470 100 4 × 10^-9
50 0.007046 Sum 1
Figure 3: Distribution of number of points in the test 2 in mathematics (polygon)
Now we shall find the distribution function F2 of the random variable Y2 (number of points in Test 2 in mathematics). For example, we shall find the probability that the number of points in Test 2 is less than or equal to 30, i.e. the function value F2(30). We have
F2(30) = P(Y2 ≤ 30) =
= P[ (Y2 = 0) ∪ (Y2 = 6) ∪ (Y2 = 12) ∪ (Y2 = 13) ∪ (Y2 = 18) ∪ (Y2 = 19) ∪ (Y2 = 24) ∪ (Y2 = 25) ∪ (Y2 = 26) ∪ (Y2 = 30) ].
The random events (Y2 = 0), (Y2 = 6), (Y2 = 12), (Y2 = 13), (Y2 = 18), (Y2 = 19), (Y2 = 24), (Y2 = 25), (Y2 = 26), (Y2 = 30) are disjoint (i.e. these random events cannot occur simultaneously), therefore
F2(30) = P(Y2 = 0) + P(Y2 = 6) + P(Y2 = 12) + P(Y2 = 13) + P(Y2 = 18) + P(Y2 = 19) + P(Y2 = 24) + P(Y2 = 25) + P(Y2 = 26) + P(Y2 = 30).
Finally, from Table 3 we obtain
F2(30) = P(Y2 ≤ 30) = 0.761282,
i.e. under the assumption of random choice of answers approximately 76.1% of students get a test score less than or equal to 30. Similarly we can find other values of the distribution function F2 (see Table 4).
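The distribution function is a right-continuous step function, so it can be evaluated for any real y by a binary search in the sorted list of possible scores. The following Python sketch illustrates this under the paper's assumptions (p = 0.2, independent questions). The names `support`, `cumulative` and `F2` are illustrative. Because the sketch sums unrounded probabilities, its result may differ in the last printed digit from a value obtained by adding the rounded entries of Table 3.

```python
from bisect import bisect_right
from itertools import accumulate
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p), formula (1)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Distribution of Y2 = 6*T1 + 13*T2 with T1 ~ Bin(8, 0.2), T2 ~ Bin(4, 0.2).
dist2: dict[int, float] = {}
for t1 in range(9):
    for t2 in range(5):
        score = 6 * t1 + 13 * t2
        prob = binom_pmf(t1, 8, 0.2) * binom_pmf(t2, 4, 0.2)
        dist2[score] = dist2.get(score, 0.0) + prob

support = sorted(dist2)                                   # possible scores
cumulative = list(accumulate(dist2[y] for y in support))  # running sums


def F2(y: float) -> float:
    """F2(y) = P(Y2 <= y) for any real y."""
    i = bisect_right(support, y)  # number of possible scores <= y
    return cumulative[i - 1] if i > 0 else 0.0


print(f"F2(30)   = {F2(30):.6f}")
print(f"F2(12.5) = {F2(12.5):.6f}")
```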
Table 4: Distribution function of number of points in the test 2 in mathematics
Interval of values y F2(y) Interval of values y F2(y)
(-∞, 0) 0 [50, 51) 0.984052
[0, 6) 0.068719 [51, 52) 0.991568
[6, 12) 0.206158 [52, 55) 0.991836
[12, 13) 0.326417 [55, 56) 0.991870
[13, 18) 0.395136 [56, 57) 0.993279
[18, 19) 0.455266 [57, 58) 0.997037
[19, 24) 0.592705 [58, 61) 0.997574
[24, 25) 0.611495 [61, 62) 0.997575
[25, 26) 0.731754 [62, 63) 0.997751
[26, 30) 0.757524 [63, 64) 0.998925
[30, 31) 0.761282 [64, 68) 0.999395
[31, 32) 0.821412 [68, 69) 0.999408
[32, 36) 0.872952 [69, 70) 0.999643
[36, 37) 0.873422 [70, 74) 0.999878
[37, 38) 0.892212 [74, 75) 0.999878
[38, 39) 0.937309 [75, 76) 0.999907
[39, 42) 0.941604 [76, 81) 0.999980
[42, 43) 0.941638 [81, 82) 0.999982
[43, 44) 0.945369 [82, 87) 0.999997
[44, 45) 0.967945 [87, 88) 0.999997
[45, 48) 0.976535 [88, 94) 0.999999
[48, 49) 0.976536 [94, 100) 1.000000
[49, 50) 0.977006 [100, ∞ ) 1
Finally we shall find basic descriptive statistics of the distribution of the number of points in Test 2 in mathematics. Since Y2 = 6 T1 + 13 T2, the expected number of points is
E(Y2) = E(6 T1 + 13 T2) = 6 E(T1) + 13 E(T2).
According to (2) we have E(T1) = 8 · 0.2 = 1.6 and E(T2) = 4 · 0.2 = 0.8, so
E(Y2) = 6 · 1.6 + 13 · 0.8 = 20.
The random variables T1 and T2 are independent, therefore the dispersion of the number of points in Test 2 in mathematics D(Y2) is (see e.g. Feller (1970))
D(Y2) = D(6 T1 + 13 T2) = 6^2 D(T1) + 13^2 D(T2).
Since (see formula (2))
D(T1) = 8 · 0.2 · 0.8 = 1.28, D(T2) = 4 · 0.2 · 0.8 = 0.64,
the dispersion of the number of points in Test 2 in mathematics is
D(Y2) = 36 · 1.28 + 169 · 0.64 = 154.24
and the standard deviation of the number of points in Test 2 in mathematics is
σ2 = 12.419.
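As an informal sanity check of E(Y2) = 20 and σ2 = 12.419, random guessing in Test 2 can also be simulated. This is not a method used in the paper, only a Monte Carlo sketch in Python. The seed, the sample size and the function name `simulate_test2` are arbitrary choices made for illustration.

```python
import random
from statistics import mean, pstdev


def simulate_test2(rng: random.Random) -> int:
    """Score of one Test 2 answered purely by guessing (p = 0.2 per question)."""
    t1 = sum(rng.random() < 0.2 for _ in range(8))  # questions worth 6 points
    t2 = sum(rng.random() < 0.2 for _ in range(4))  # questions worth 13 points
    return 6 * t1 + 13 * t2


rng = random.Random(2024)  # fixed seed, chosen arbitrarily for reproducibility
scores = [simulate_test2(rng) for _ in range(100_000)]

print(f"simulated mean:               {mean(scores):.2f}")
print(f"simulated standard deviation: {pstdev(scores):.2f}")
```

The simulated values fluctuate around the exact ones, and the deviation shrinks as the number of simulated tests grows.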
A comparison of Test 1 and Test 2 in mathematics is given in Table 5. For example, the probability of a test score of more than 30 points in Test 1 is
P(Y1 > 30) = 1 - F1(30) = 1 - 0.867876 = 0.132124.
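The tail probabilities P(Y > t) for both tests can be computed with one generic routine. The Python sketch below assumes that a test is described as a list of (number of questions, points per question) blocks. This representation and the names `score_distribution` and `tail` are illustrative, not taken from the paper.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p), formula (1)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def score_distribution(blocks, p: float = 0.2) -> dict[int, float]:
    """Distribution of the total score; blocks = [(questions, points), ...]."""
    dist = {0: 1.0}
    for n, points in blocks:
        new: dict[int, float] = {}
        for score, prob in dist.items():
            for k in range(n + 1):  # k right answers in this block
                total = score + points * k
                new[total] = new.get(total, 0.0) + prob * binom_pmf(k, n, p)
        dist = new
    return dist


def tail(dist: dict[int, float], threshold: float) -> float:
    """P(Y > threshold) = 1 - F(threshold)."""
    return sum(prob for score, prob in dist.items() if score > threshold)


test1 = score_distribution([(10, 5), (5, 10)])
test2 = score_distribution([(8, 6), (4, 13)])
for t in (10, 20, 30, 40, 50):
    print(f"more than {t} points: "
          f"Test 1 {100 * tail(test1, t):.2f}%, Test 2 {100 * tail(test2, t):.2f}%")
```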
Table 5: Comparison of the tests in mathematics (random choice of answers)
Entrance examination in mathematics    Test 1               Test 2
Expected number of points in test      20                   20
Standard deviation                     10.954               12.419
Test score more than 10 points         73.4% of students    79.4% of students
Test score more than 20 points         38.3% of students    40.7% of students
Test score more than 30 points         13.2% of students    23.9% of students
Test score more than 40 points         2.96% of students    5.84% of students
Test score more than 50 points         0.42% of students    1.59% of students
Conclusion
The number of questions in the test in mathematics was reduced from 15 to 12 to shorten the test run time. Distribution 1 (the distribution of the number of points in Test 1 in mathematics) and Distribution 2 (the distribution of the number of points in Test 2 in mathematics) have the same expected value (see Table 5). The standard deviation of Distribution 2 is greater than the standard deviation of Distribution 1. Due to the greater variability of Distribution 2, e.g. the probability that the number of points in Test 2 exceeds 40 is approximately two times greater (see Table 5) than the probability that the number of points in Test 1 exceeds 40 (both probabilities are close to 0).
From the results of this paper it seems that Test 1 in mathematics is better than Test 2 in mathematics from a probability point of view. However, the differences between these tests are not significant. A shorter math test can also be used in the admission process at Prague University of Economics.
Acknowledgment
The paper was prepared with the contribution of long-term support of scientific work at the Faculty of Informatics and Statistics, University of Economics, Prague (IP 400040).
References
Bartoška, J., Brožová, H., Šubrt, T., Rydval, J. (2013) ‘Incorporating practitioners’ expectations to project management teaching’, Efficiency and Responsibility in Education, Proceedings of the 10th International Conference, Prague, pp. 16–23.
Feller, W. (1970) An Introduction to Probability Theory and Its Applications, New York: John Wiley.
Hrubý, M. (2016) ‘Feedback improvement of question objects’, International Journal of Continuing Engineering Education and Lifelong Learning, vol. 26, no. 2, pp. 183–195. DOI: 10.1504/IJCEELL.2016.076010
Kaspříková, N., Klůfa, J. (2012) ‘Multiple Choice Question Tests for Entrance Examinations – A Probabilistic Approach’, Journal on Efficiency and Responsibility in Education and Science, vol. 5, no. 4, pp. 195–202. http://dx.doi.org/10.7160/eriesj.2012.050402
Klůfa, J. (2015a) ‘Dependence of the Results of Entrance Examinations on Test Variants‘, Procedia - Social and Behavioural Sciences, vol. 174, pp. 3565–3571. http://dx.doi.org/10.1016/j.sbspro.2015.01.1073
Klůfa, J. (2015b) ‘Analysis of entrance examinations’, Efficiency and Responsibility in Education, Proceedings of the 12th International Conference, Prague, pp. 250–256.
Klůfa, J. (2016) ‘Statistical analysis of the test variants in admission process‘, The 10th International Days of Statistics and Economics, Prague, pp. 852–860.
Kubanová, J., Linda, B. (2012) ‘Relation between results of the learning potential tests and study results’, Journal on Efficiency and Responsibility in Education and Science, vol. 5, no. 3, pp. 125–134. http://dx.doi.org/10.7160/eriesj.2012.050302
Kučera, P., Svatošová, L., Pelikán, M. (2015) ‘University study results as related to the admission exam results’, Efficiency and Responsibility in Education, Proceedings of the 12th International Conference, Prague, pp. 318–324.
Kohanová, I. (2012) ‘Analysis of university entrance test from mathematics’, Acta Didactica Universitatis Comenianae Mathematics, vol. 12, pp.31-46.
Poláčková, J., Svatošová, L. (2013) ‘Analysis of success in university study as connected to admission exam results’, Efficiency and Responsibility in Education, Proceedings of the 10th International Conference, Prague, pp. 503–509.
Premadasa, I. (1993) ‘A reappraisal of the use of multiple-choice questions’, Medical Teacher, vol. 15, no. 2-3, pp. 237-242.
Rao, C.R. (1973) Linear Statistical Inference and Its Applications, New York: John Wiley.
Wolfram, S. (1991) Mathematica. Addison-Wesley.
Zhao, Y. (2005) ‘Algorithms for converting raw scores of multiple-choice question tests to conventional percentage marks’, International Journal of Engineering Education, vol. 21, no. 6, pp. 1189–1194.
Zhao, Y. (2006) ‘How to design and interpret a multiple-choice-question test: A probabilistic approach’, International Journal of Engineering Education, vol. 22, no. 6, pp. 1281-1286.
Zvára, K., Anděl, J. (2001) ‘Souvislost výsledku přijímacího řízení s úspěšností studia na MFF’, Pokroky matematiky, fyziky a astronomie, vol. 46, no. 6, pp. 304-312.