Adaptive clustering of codes for assessment in introductory programming courses

Academic year: 2019

(1)

Adaptive clustering of codes for assessment in introductory programming courses

Alexandre de A. Barbosa – UFAL / UFCG

Evandro de B. Costa – UFAL / UFCG

(2)

Adaptive clustering of codes for assessment in introductory programming courses

Topics

Context and problem

Related work

Adaptive Clustering of codes

Background

The clustering approach

Evaluation of the clustering approach

Results

(3)

Context and problem

Programming is one of the basic competences in computer science

courses on algorithms and introductory programming

it is easy to find unmotivated students who have doubts and do not understand basic programming concepts

students who pass do not have the necessary competencies for the course and professional life [1]

Many factors are described in the scientific literature

individualized help for each student can mitigate some of these factors

(4)

Context and problem

Practical coding activities are typically adopted in programming courses

Assessing the proposed solutions is quite difficult

no. of students × no. of exercises × no. of solutions (code/submissions)

A large number of parameters can be observed

inputs/outputs

code structure

efficiency

Evaluating code solutions is time-consuming and subject to the bias and errors of each evaluator.

(5)

Related work

Online judges [10]

a set of tests determines the success or failure of the solution

Analysis of similarities [11, 12, 13]

explores code similarities for different purposes

Clustering or classification of codes [14, 15, 16]

each study uses a different set of techniques

the same set of criteria is adopted by each evaluator

(6)

Adaptive Clustering of codes

Background

K-means clustering algorithm [7]

K centroids

Set of data from each element

Distance between elements

Software metrics [18]

Properties extracted from codes
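
As a rough illustration of the K-means ingredients listed on this slide (K centroids, one data vector per element, a distance between elements), the following Python sketch implements the standard Lloyd iteration. It is not the authors' implementation: the function names, the seeded random initialization and the iteration cap are assumptions made for the example.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def kmeans(points, k, max_iterations=100, seed=0):
    """Cluster `points` (a list of numeric vectors) into k groups.

    Returns (labels, centroids), where labels[i] is the cluster of points[i].
    Initial centroids are k distinct elements picked at random (an assumption).
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each element joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical metric vectors forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (100, 100), (100, 101), (101, 100)]
labels, centroids = kmeans(points, k=2)
```

In the setting of this talk, each point would be the vector of software metrics extracted from one student submission.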

(7)

Adaptive Clustering of codes

Background

Euclidean distance

distance between two points in an n-dimensional space

Cohen’s Kappa

degree of agreement between two lists of classifications, beyond what would be expected by chance
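
Both measures on this slide are short enough to write out. The sketch below uses the textbook definitions: the Euclidean distance is the square root of the summed squared coordinate differences, and Cohen's Kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The unweighted form of Kappa is an assumption; the slides do not say which variant was used.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Distance between two points given as equally long coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two equally long lists of labels."""
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one single identical label
    return (observed - expected) / (1 - expected)


print(euclidean_distance([0, 0], [3, 4]))        # 5.0
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5
```

In the second call the raters agree on 3 of 4 items (p_o = 0.75) while chance alone would give p_e = 0.5, hence a Kappa of 0.5.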

(8)

Adaptive Clustering of codes

The clustering approach

The main idea:

1. Select one element to represent the cluster

2. Evaluate the element (grade + text)

3. Generalize the

(9)

Adaptive Clustering of codes

The clustering approach

The steps:

(1) Code metrics extraction

(2) Identification of the criteria adopted by the specialist

(3) Clustering generation

(4) Evaluation

(10)

Adaptive Clustering of codes

The clustering approach

The steps:

(1) code metrics extraction

Each code has a vector of properties

Some metrics are restricted to the code itself (e.g., number of operators)

Similarity metrics consider the relation of a code and a
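
To make the idea of a per-code property vector concrete, here is a minimal sketch of extracting a few code-restricted metrics. It assumes the submissions are written in Python (the slides do not name the language) and uses the standard `tokenize` module; the metric names and the choice of metrics are hypothetical, and similarity metrics are not covered.

```python
import io
import tokenize


def extract_metrics(source):
    """Return a small dictionary of metrics for one Python submission.

    Note: tokenize.OP covers punctuation such as parentheses and commas
    as well as arithmetic operators, and keywords are counted as names.
    """
    counts = {"operators": 0, "names": 0, "numbers": 0}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP:
            counts["operators"] += 1
        elif tok.type == tokenize.NAME:
            counts["names"] += 1
        elif tok.type == tokenize.NUMBER:
            counts["numbers"] += 1
    # Count non-blank source lines.
    counts["lines"] = sum(1 for line in source.splitlines() if line.strip())
    return counts


submission = "x = 1 + 2\nprint(x)\n"
print(extract_metrics(submission))
# {'operators': 4, 'names': 3, 'numbers': 2, 'lines': 2}
```

The four operator tokens here are `=`, `+`, `(` and `)`; a real metric suite would likely separate punctuation from true operators.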

(11)

Adaptive Clustering of codes

The clustering approach

The steps:

(2) identification of the criteria adopted by the specialist

Generate all possible combinations of metrics (used to create all possible sets of clusters)

The specialist grades 10 codes (used to select one set of clusters)

Brute-force method (not efficient)
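
The brute-force enumeration of metric combinations can be sketched with `itertools.combinations`. The helper names and the metric names below are hypothetical. With n metrics there are 2^n - 1 non-empty subsets, which is why the method does not scale.

```python
from itertools import combinations


def metric_subsets(metric_names):
    """Yield every non-empty combination of the given metric names."""
    for size in range(1, len(metric_names) + 1):
        yield from combinations(metric_names, size)


def project(vectors, subset):
    """Keep only the metrics in `subset` from each metric dictionary."""
    return [[vector[name] for name in subset] for vector in vectors]


names = ["operators", "names", "lines"]
subsets = list(metric_subsets(names))
print(len(subsets))  # 7, i.e. 2**3 - 1

vectors = [{"operators": 4, "names": 3, "lines": 2}]
print(project(vectors, ("operators", "lines")))  # [[4, 2]]
```

Each projected list of vectors would then be clustered separately, giving one candidate set of clusters per combination.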

(12)

Adaptive Clustering of codes

The clustering approach

The steps: (3) Clustering generation

Using K-means, with K = 10, based on a set of software metrics*

* all possible combinations

All possible sets of clusters are

(13)

Adaptive Clustering of codes

The clustering approach

The steps: (4) Evaluation

a specialist assigns 10 grades

one set of clusters is selected

for each cluster in the set, the grades already given are generalized to all the other elements of the cluster
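
The generalization step can be sketched as a simple label propagation. The function below is an illustration, not the authors' code: it assumes the cluster assignment is already available as one cluster index per submission and that the specialist's grades are given as a mapping from submission index to grade. How the winning set of clusters is selected is not covered here.

```python
def generalize_grades(labels, graded):
    """Propagate specialist grades to every member of each cluster.

    labels: cluster index for each submission (position = submission index).
    graded: {submission index: grade} for the codes the specialist graded.
    Returns one grade per submission, or None when a cluster contains
    no graded code (an assumption about how to handle that case).
    """
    cluster_grade = {}
    for index, grade in graded.items():
        # If a cluster holds several graded codes, the first one is used
        # for its ungraded members; graded codes keep their own grade.
        cluster_grade.setdefault(labels[index], grade)
    return [
        graded.get(index, cluster_grade.get(label))
        for index, label in enumerate(labels)
    ]


labels = [0, 0, 1, 1, 2]
graded = {0: 10, 2: 6}
print(generalize_grades(labels, graded))  # [10, 10, 6, 6, None]
```

With K = 10 clusters and 10 graded codes, as in the slides, the ideal case is one graded representative per cluster, so that no submission is left without a grade.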

(14)

Adaptive Clustering of codes

Evaluation of the clustering approach

The dataset

set of programming problems

set of codes

(15)

Adaptive Clustering of codes

Evaluation of the clustering approach

The dataset: set of programming problems (exercises)

‘salary bonus’ and ‘points distance’ (basic problems)

‘student situation’ and ‘elections’ (decision problems)

‘odd loop’ and ‘divisible by 3’ (loop problems)

(16)

Adaptive Clustering of codes

Evaluation of the clustering approach

The dataset: set of codes submitted by students as solutions to the exercises

‘salary bonus’ (32 submissions)

‘points distance’ (23 submissions)

‘student situation’ (43 submissions)

‘elections’ (40 submissions)

‘odd loop’ (41 submissions)

(17)

Adaptive Clustering of codes

Evaluation of the clustering approach

The dataset: set of evaluations (grades ranging from 0 to 10) provided by specialists (teachers and teaching assistants)

(18)

Adaptive Clustering of codes

Results

Compute Cohen’s Kappa

Specialist list vs. specialist list

Specialist list vs. cluster generated list

Compute Euclidean distance

Specialist list vs. specialist list

(19)

Adaptive Clustering of codes

Results

Cohen’s Kappa

Mean of 0.76 - Strong Agreement

“The specialists have a strong agreement with the cluster-generated list of grades”

(20)

Adaptive Clustering of codes

Results

Euclidean distance*

Mean of 5.95

“Interpreting the list of grades as points coordinates in a

(21)

Adaptive Clustering of codes

Adaptive clustering of codes for assessment in introductory programming courses – ITS 2018 Alexandre A. Barbosa

(22)

Conclusions

We have proposed the use of a clustering algorithm to minimize the effort expended in the evaluation of codes in introductory courses

The results suggest that it is possible to minimize the evaluation effort (strong agreement between the specialists and the cluster approach)

This research is ongoing work; much investigation is still necessary

comparison of different clustering techniques

(23)

References

[1] McCracken et al.: “A multi-national, multi-institutional study of assessment of programming skills of first-year CS students.” ITiCSE 2001

[2] Stegeman, M., Barendsen, E., Smetsers, S.: “Towards an empirically validated model for assessment of code quality.” International Conference on Computing Education Research 2014

[10] Yulianto, S.V., Liem, I.: “Automatic grader for programming assignment using source code analyzer.” ICODSE 2014

[11] Rego, M.G., Dantas, A., Dalton Serey Guerrero: “Can Computers Compare Student Code Solutions As Well As Teachers?” Symposium on Computer Science Education 2014

[12] Biggers, L.R., Kraft, N.A.: “Quantifying the similarities between source code lexicons.” ACM-SE 2011

(24)

References

[13] Li, S., Xiao, X., Bassett, B., Xie, T., Tillmann, N.: “Measuring code behavioral similarity for programming and software engineering education.” ICSE 2016

[14] Srikant, S., Aggarwal, V.: “A system to grade computer programming skills using machine learning.” ACM SIGKDD 2014

[15] Choudhury, R.R., Yin, H., Moghadam, J., Chen, A., Fox, A.: “Autostyle: Scale-driven hint generation for coding style.” ITS 2016

[16] Yin, H., Moghadam, J., Fox, A.: “Clustering student programming
