
6. CONCLUSIONS AND FUTURE WORK

6.2 Final Considerations

In this dissertation we extended the mapping study conducted by Silva et al. (2012), covering the number of replications published in 2011 and 2012, the researchers and organizations that carried out these replications, the SWEBOK chapter each replication was associated with, the research methods of the replications, the sets of replications and their original studies, the time elapsed between publication of the original study and its replication, and whether or not the original studies were confirmed.

The results show that some research questions yielded the same answers as in the first mapping and that the main changes are among external replications; further extensions are needed to determine whether this trend will hold.

The discussions throughout the text make evident the need for clearer definitions of replication and for guidelines on how to execute replications and write their reports. The results point to a need to run more replications, which would consequently increase the number and size of replication sets, the number of replications per original study, the execution of mixed replication sets, the coverage of the Software Engineering areas with replications and, above all, the consolidation of knowledge in the field.

REFERENCES

ALMQVIST, J. P. F. Replication of Controlled Experiments in Empirical Software Engineering — A Survey. [s.l.] Lund University, 2006.

ARKSEY, H.; O’MALLEY, L. Scoping studies: towards a methodological framework. International Journal of Social Research Methodology: Theory & Practice, v. 8, n. 1, p. 19–32, Feb. 2005.

BAHR, H. M.; CAPLOW, T.; CHADWICK, B. A. Middletown III: Problems of Replication, Longitudinal Measurement, and Triangulation. Annual Review of Sociology, v. 9, p. 243–264, 1983.

BASILI, V. R.; SHULL, F.; LANUBILE, F. Building Knowledge through Families of Experiments. IEEE Transactions on Software Engineering, v. 25, n. 4, p. 456–473, 1999.

BROOKS, A. et al. Replication of Experimental Results in Software Engineering. IEEE Transactions on Software Engineering, 1995.

BROOKS, A. et al. Replication’s role in software engineering. In: Guide to Advanced Empirical Software Engineering. [s.l: s.n.]. p. 365–379.

CARVER, J. C. Towards Reporting Guidelines for Experimental Replications: A Proposal. International Workshop on Replication in Empirical Software Engineering Research. Proceedings... Cape Town: 2010.

DALY, J. et al. Verification of Results in Software Maintenance Through External Replication. IEEE International Conference on Software Maintenance. Proceedings... Glasgow: 1994.

EASTERBROOK, S. et al. Selecting Empirical Methods for Software Engineering Research. In: SHULL, F.; SINGER, J.; SJØBERG, D. I. K. (Eds.). Guide to Advanced Empirical Software Engineering. London: [s.n.]. p. 285–311.

GÓMEZ, O. S.; JURISTO, N.; VEGAS, S. Replications Types in Experimental Disciplines. International Symposium on Empirical Software Engineering and Measurement. Proceedings... Bolzano-Bozen: 2010a.

GÓMEZ, O. S.; JURISTO, N.; VEGAS, S. Replication, Reproduction and Re-analysis: Three ways for verifying experimental findings. 2010b.

GOULÃO, M.; BRITO E ABREU, F. Modeling the Experimental Software Engineering Process. International Conference on the Quality of Information and Communications Technology. Proceedings... Sep. 2007.

GOULD, J.; KOLB, W. L. A dictionary of the social sciences. London: Tavistock Publications, 1964.

JURISTO, N.; GÓMEZ, O. S. Replication of Software Engineering Experiments. Empirical Software Engineering and Verification, 2012.

JURISTO, N.; VEGAS, S. Using Differences among Replications of Software Engineering Experiments to Gain Knowledge. International Symposium on Empirical Software Engineering and Measurement, p. 356–366, 2009.

KITCHENHAM, B. Procedures for Performing Systematic Reviews. [s.l: s.n.].

KITCHENHAM, B. The role of replications in empirical software engineering—a word of warning. Empirical Software Engineering, v. 13, n. 2, p. 219–221, 29 jan. 2008.

KITCHENHAM, B. A.; DYBA, T.; JORGENSEN, M. Evidence-Based Software Engineering. International Conference on Software Engineering. Proceedings... Washington, DC, USA: 2004.

KITCHENHAM, B.; CHARTERS, S. Guidelines for performing Systematic Literature Reviews in Software Engineering. Keele University and University of Durham: [s.n.].

KREIN, J. L.; KNUTSON, C. D. A Case for Replication: Synthesizing Research Methodologies in Software Engineering. International Workshop on Replication in Empirical Software Engineering Research. Proceedings... Cape Town: 2010.

LA SORTE, M. A. Replication as a Verification Technique in Survey Research: A Paradigm. Sociological Quarterly, v. 13, n. 2, p. 218–227, 1972.

LINDSAY, R. M.; EHRENBERG, A. S. C. The Design of Replicated Studies. The American Statistician, v. 47, n. 3, p. 217–228, 1993.

LUNG, J. et al. On the difficulty of replicating human subjects studies in software engineering. Proceedings of the 13th international conference on Software engineering - ICSE ’08, p. 191, 2008.

MÄNTYLÄ, M. V.; LASSENIUS, C.; VANHANEN, J. Rethinking Replication in Software Engineering: Can We See the Forest for the Trees? International Workshop on Replication in Empirical Software Engineering Research. Proceedings... Cape Town: 2010.

MENDONÇA, M. G. et al. A Framework for Software Engineering Experimental Replications. International Conference on Engineering of Complex Computer Systems. Proceedings... Dublin: Mar. 2008.

MILLER, J. Replicating software engineering experiments: a poisoned chalice or the Holy Grail. Information and Software Technology, v. 47, n. 4, p. 233–244, mar. 2005.

MOCKUS, A.; ANDA, B.; SJØBERG, D. Experiences from Replicating a Case Study to Investigate Reproducibility of Software Development. International Workshop on Replication in Empirical Software Engineering Research. Proceedings... Cape Town: 2010.

PETTICREW, M.; ROBERTS, H. Systematic Reviews in the Social Sciences. 1. ed. [s.l.] Blackwell Publishing, 2006. p. 1–354

POPPER, K. The Logic of Scientific Discovery. [s.l.] Taylor & Francis e-Library, 1959. p. 1–545

SCATALON, L. P.; GARCIA, R. E.; CORREIA, R. C. M. Packaging Controlled Experiments Using an Evolutionary Approach Based on Ontology. International Conference on Software Engineering and Knowledge Engineering. Proceedings... Miami: 2011.

SCHMIDT, S. Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology, v. 13, n. 2, p. 90–100, 2009.

SHULL, F. et al. Knowledge-Sharing Issues in Experimental Software Engineering. Empirical Software Engineering, v. 9, n. 1/2, p. 111–137, mar. 2004.

SHULL, F. J. et al. The role of replications in Empirical Software Engineering. Empirical Software Engineering, v. 13, n. 2, p. 211–218, 29 jan. 2008.

SILVA, F. et al. Replication of Empirical Studies in Software Engineering: Preliminary Findings from a Systematic Mapping Study. International Workshop on Replication in Empirical Software Engineering Research. Proceedings... Washington: 2011.

SILVA, F. Q. B. et al. Replication of empirical studies in software engineering research: a systematic mapping study. Empirical Software Engineering, p. 1–57, 1 Sep. 2012.

SINGH, K.; ANG, S. H.; LEONG, S. M. Increasing Replication for Knowledge Accumulation in Strategy Research. Journal of Management, v. 29, n. 4, p. 533–549, Aug. 2003.

SJØBERG, D. I. K. et al. A Survey of Controlled Experiments in Software Engineering. IEEE Transactions on Software Engineering, v. 31, n. 9, p. 733–753, 2005.

SWEBOK. Guide to the Software Engineering Body of Knowledge - SWEBOK. 4. ed. USA: Angela Burgess, 2004. p. 174

VEGAS, S. et al. Analysis of the Influence of Communication between Researchers on Experiment Replication. International Symposium on Empirical Software Engineering. Proceedings... Rio de Janeiro: 2006.

YIN, R. K. Case Study Research: Design and Methods. 4. ed. London: Sage Publications, 2009. p. 240

APPENDIX A - References of the Theoretical Works on Replication

[ABO001FE] JURISTO, N.; VEGAS, S. The role of non-exact replications in software engineering experiments. Empirical Software Engineering, v. 16, n. 3, p. 295–324, 2011.

[ABO002FE] BURTON, S. H. et al. Design Team Perception of Development Team Composition: Implications for Conway’s Law. Proceedings of the 2011 Second International Workshop on Replication in Empirical Software Engineering Research. Washington, DC, USA: IEEE Computer Society, 2011.

[ABO003FE] WEYUKER, E. J.; BELL, R. M.; OSTRAND, T. J. Replicate, Replicate, Replicate. Proceedings of the 2011 Second International Workshop on Replication in Empirical Software Engineering Research. Washington, DC, USA: IEEE Computer Society, 2011. Available at: <http://dx.doi.org/10.1109/RESER.2011.15>

[ABO004FE] JURISTO, N.; GÓMEZ, O. S. Replication of Software Engineering Experiments. Empirical Software Engineering and Verification, 2012.

[ABO005FE] SCATALON, L. P.; GARCIA, R. E.; CORREIA, R. C. M. Packaging Controlled Experiments Using an Evolutionary Approach Based on Ontology. International Conference on Software Engineering and Knowledge Engineering. Proceedings... Miami: 2011.

[ABO006FE] TERUEL, M. A. et al. Analyzing the understandability of Requirements Engineering languages for CSCW systems: A family of experiments. Information and Software Technology, v. 54, n. 11, p. 1215–1228, 2012.

[ABO007FE] BERNARD, B.; FRANÇA, N. DE; TRAVASSOS, G. H. Reporting Guidelines for Simulation-Based Studies in Software Engineering. p. 156–160, 2012.

[ABO008FE] SILVA, F. Q. B. et al. Replication of empirical studies in software engineering research: a systematic mapping study. Empirical Software Engineering, p. 1–57, 1 Sep. 2012.

[ABO009FE] GONZALO, E.; GALLARDO, E. Using Configuration Management and Product Experimentation Process in Software Engineering. 2011.

[ABO010FE] KOZAK, C.; SQUIRE, M. A secondary data archive for code-level debian metrics. Proceedings - 2011 2nd International Workshop on Replication in Empirical Software Engineering Research, RESER 2011. 2012.

APPENDIX B - References of the Replications

[REP001FE] SCANNIELLO, G. On the Effectiveness of the UML Object Diagrams: A Replicated experiment. p. 76–85, 2011.

[REP002FE] CRUZ-LEMUS, J. A. et al. Assessing the influence of stereotypes on the comprehension of UML sequence diagrams: A family of experiments. v. 53, p. 1391–1403, 2011.

[REP003FE] LAUKKANEN, E. I. Survey Reproduction of Defect Reporting in Industrial Software Development. 2011.

[REP004FE] PREMRAJ, R.; HERZIG, K. Network versus Code Metrics to Predict Defects: A Replication Study. 2011.

[REP005FE] PRECHELT, L.; LIESENBERG, M. Design Patterns in Software Maintenance: An Experiment Replication at Freie Universität Berlin. 2011.

[REP006FE] JURISTO, N.; VEGAS, S. Design Patterns in Software Maintenance: An Experiment Replication at UPM - Experiences with the RESER’11 Joint Replication Project. p. 7–14, 2012.

[REP007FE] NANTHAAMORNPHONG, A.; CARVER, J. C. Design Patterns in Software Maintenance: An Experiment Replication at University of Alabama. 2012.

[REP008FE] KREIN, J. L. et al. Design Patterns in Software Maintenance: An Experiment Replication at Brigham Young University. 2012.

[REP009FE] POW-SANG, J. A.; IMBERT, R.; MORENO, A. M. A Replicated Experiment with Undergraduate Students to Evaluate the Applicability of a Use Case Precedence Diagram Based Approach in Software Projects Use Case Precedence Diagrams and the Construction. p. 169–179, 2011.

[REP010FE] REIJERS, H. A.; MENDLING, J. A Study Into the Factors That Influence the Understandability of Business Process Models. v. 41, n. 3, p. 449–462, 2011.

[REP012FE] GÖDE, N.; HARDER, J. Clone Stability. 2011.

[REP013FE] GRAVINO, C. et al. Does the Documentation of Design Pattern Instances Impact on Source Code Comprehension? Results from Two Controlled Experiments. p. 67–76, 2011.

[REP015FE] BOWES, D. et al. Program Slicing-Based Cohesion Measurement: The Challenges of Replicating Studies Using Metrics. p. 75–80, 2011.

[REP016FE] FERRUCCI, F.; GRAVINO, C.; SARRO, F. A case study on the conversion of Function Points into COSMIC. p. 2–5, 2011.

[REP018FE] ARDITO, C. et al. Usability evaluation: a survey of software development organizations. 2011.

[REP019FE] REGGIO, G. et al. A Precise Style for Business Process Modeling: Results from Two Controlled Experiments. p. 1–15, 2011.

[REP020FE] KIM, S.; LI, S.; YI, J. S. Investigating the Efficacy of Crowdsourcing on Evaluating Visual Decision Supporting System. p. 1090–1094, 2011.

[REP021FE] AMASAKI, S.; ENGINEERING, S. Performance Evaluation of Windowing Approach on Effort Estimation by Analogy. 2011.

[REP022FE] MARTINO, S. DI et al. Using Web Objects for Development Effort Estimation of Web Applications: A Replicated Study. p. 186–201, 2011.

[REP023FE] GENERO, V. D. C. M.; PIATTNI, E. M. M. Empirical study to assess whether the use of routes facilitates the navigability of web information systems. n. May 2010, p. 1–16, 2011.

[REP024FE] BIEGEL, B. et al. Comparison of Similarity Metrics for Refactoring Detection Categories and Subject Descriptors. n. i, 2011.

[REP025FE] FERRARI, F. C.; GARCIA, A. Development of Auxiliary Functions: Should You Be Agile? An Empirical Assessment of Pair Programming and Test-First Programming. p. 529–539, 2012.

[REP026FE] BERGERSEN, G. R.; SJØBERG, D. I. K. Evaluating Methods and Technologies in Software Engineering with Respect to Developers’ Skill Level. n. 0316, p. 101–110, 2012.

[REP027FE] WNUK, K.; HÖST, M.; REGNELL, B. Replication of an experiment on linguistic tool support for consolidation of requirements from multiple sources. p. 305–344, 2012.

[REP028FE] KUSUMO, D. S. et al. Risks of Off-The-Shelf-based Software Acquisition and Development: A Systematic Mapping Study and. p. 233–242, 2012.

[REP029FE] CALEFATO, F. et al. Assessing the Impact of Real-Time Machine Translation on Requirements Meetings: A Replicated Experiment. p. 251–260, 2012.

[REP030FE] GRBAC, T. G.; RUNESON, P.; MEMBER, S. A Second Replicated Quantitative Analysis of Fault Distributions in Complex Software Systems. p. 1–15, 2012.

[REP032FE] JURISTO, N. et al. Comparing the Effectiveness of Equivalence Partitioning, Branch Testing and Code Reading by Stepwise Abstraction Applied by Subjects. 2012.

[REP033FE] GRAVINO, C. et al. Do Professional Developers Benefit from Design Pattern Documentation? A Replication in the Context of Source Code Comprehension. p. 185–201, 2012.

[REP034FE] ALI, S.; YUE, T.; BRIAND, L. C. Does aspect-oriented modeling help improve the readability of UML state machines? [s.l: s.n.].

[REP035FE] SALLEH, N.; MENDES, E.; GRUNDY, J. Investigating the effects of personality traits on pair programming in a higher education setting through a family of experiments. [s.l: s.n.].

[REP036FE] ALBAYRAK, Ö.; CARVER, J. C. Investigation of individual factors impacting the effectiveness of requirements inspections: a replicated experiment. 2012.

[REP038FE] SILVA, F. Q. B. et al. Team building criteria in software projects: A mix-method replicated study. Information and Software Technology, 2012.

[REP039FE] SHARIF, B.; FALCONE, M.; MALETIC, J. I. An Eye-tracking Study on the Role of Scan Time in Finding Source Code Defects. p. 381–384, 2012.

[REP040FE] ISSABAYEVA, A.; NUGROHO, A.; VISSER, J. Issue Handling Performance in Proprietary Software Projects. 2012.

[REP041FE] FIGUEIREDO, E. et al. On the Impact of Crosscutting Concern Projection on Code Measurement. p. 81–92, 2011.

[REP042FE] RUNESON, P.; HEED, P.; WESTRUP, A. A Factorial Experimental Evaluation of Automated Test Input Generation – Java Platform Testing in Embedded Devices. p. 217–231, 2011.

[REP045FE] CECCATO, M.; MARCHETTO, A.; MARIANI, L. An Empirical Study about the Effectiveness of Debugging When Random Test Cases Are Used. p. 452–462, 2012.

[REP046FE] LAMKANFI, A.; DEMEYER, S. Filtering Bug Reports for Fix-Time Analysis. p. 379–384, 2012.

APPENDIX C - References of the Original Studies

[ORI001FE] TORCHIANO, M. Empirical assessment of UML static object diagrams. Program Comprehension, 2004. Proceedings. 12th IEEE International Workshop on. 2004.

[ORI003FE] BETTENBURG, N. et al. What makes a good bug report? Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering - SIGSOFT ’08/FSE-16, p. 308, 2008.

[ORI004FE] ZIMMERMANN, T.; NAGAPPAN, N. Predicting defects using network analysis on dependency graphs. Proceedings of the 13th international conference on Software engineering - ICSE ’08, p. 531, 2008.

[ORI009FE] POW-SANG, J. A. et al. An Approach to Determine Software Requirement Construction Sequences Based on Use Cases. 2008 Advanced Software Engineering and Its Applications, p. 17–22, Dec. 2008.

[ORI012FE] KRINKE, J. Is Cloned Code More Stable than Non-cloned Code? Source Code Analysis and Manipulation, 2008 Eighth IEEE International Working Conference on. Proceedings... 2008.

[ORI015FE] MEYERS, T. M.; BINKLEY, D. An empirical study of slice-based cohesion and coupling metrics. ACM Trans. Softw. Eng. Methodol., v. 17, n. 1, p. 2:1–2:27, 2007.

[ORI016FE] CUADRADO-GALLEGO, J. J. et al. An Experimental Study on the Conversion Between IFPUG and COSMIC Functional Size Measurement Units. Inf. Softw. Technol., v. 52, n. 3, p. 347–357, 2010.

[ORI018FE] BAK, J. O.; RISGAARD, P.; STAGE, J. Obstacles to Usability Evaluation in Practice : A Survey of Software Development Organizations. p. 23–32, 2008.

[ORI019FE] CERBO, F. DI et al. Precise vs. Ultra-light Activity Diagrams - An Experimental Assessment in the Context of Business Process Modelling. p. 1–15, 2011.

[ORI020FE] HUR, I. et al. A Comparative Study of Three Sorting Techniques in Performing Cognitive Tasks on a Tabular Representation. v. 1, p. 1–26, 2012.

[ORI021] FENTON, N. E.; OHLSSON, N. Quantitative Analysis of Faults and Failures in a Complex Software System. IEEE Trans. Softw. Eng., v. 26, n. 8, p. 797–814, 2000.

[ORI021FE] LOKAN, C.; MENDES, E. Applying moving windows to software effort estimation. Empirical Software Engineering and Measurement, 2009. ESEM 2009. 3rd International Symposium on. Proceedings... 2009.

[ORI022FE] RUHE, M.; JEFFERY, R.; WIECZOREK, I. Cost estimation for web applications. Proceedings of the 25th International Conference on Software Engineering. Washington, DC, USA: IEEE Computer Society, 2003.

[ORI024FE] WEIßGERBER, P.; DIEHL, S. Identifying Refactorings from Source-Code Changes. v. 2006, n. Ase, 2006.

[ORI026FE] BENANDER, A. C.; BENANDER, B. A.; SANG, J. An empirical analysis of debugging performance — differences between iterative and recursive constructs. J. Syst. Softw., v. 54, n. 1, p. 17–28, 2000.

[ORI027FE] NATT OCH DAG, J.; THELIN, T.; REGNELL, B. An experiment on linguistic tool support for consolidation of requirements from multiple sources in market-driven product development. Empirical Softw. Engg., v. 11, n. 2, p. 303–329, 2006.

[ORI028FE] LI, J.; SOCIETY, I. C.; CONRADI, R. A State-of-the-Practice Survey of Risk Management in Development with Off-the-Shelf Software Components. v. 34, n. 2, p. 271–286, 2008.

[ORI029FE] CALEFATO, F.; LANUBILE, F.; PRIKLADNICKI, R. A Controlled Experiment on the Effects of Machine Translation in Multilingual Requirements Meetings. Global Software Engineering (ICGSE), 2011 6th IEEE International Conference on. Proceedings... 2011.

[ORI032] PRECHELT, L. et al. A Controlled Experiment in Maintenance Comparing Design Patterns to Simpler Solutions. v. 27, n. 12, p. 1134–1144, 2001.

[ORI036FE] CARVER, J. C.; NAGAPPAN, N.; PAGE, A. The Impact of Educational Background on the Effectiveness of Requirements Inspections: An Empirical Study. Software Engineering, IEEE Transactions on, v. 34, n. 6, p. 800–812, 2008.

[ORI039FE] UWANO, H. et al. Analyzing Individual Performance of Source Code Review Using Reviewers’ Eye Movement. 2004.

[ORI040FE] LUIJTEN, B.; VISSER, J. Faster Defect Resolution with Higher Technical Quality of Software. 2010.

[ORI046FE] GIGER, E.; PINZGER, M.; GALL, H. Predicting the fix time of bugs. Proceedings of the 2nd International Workshop on Recommendation Systems for Software Engineering. New York, NY, USA: ACM, 2010.

[ORI085] BASILI, V. R.; SELBY, R. W. Comparing the Effectiveness of Software Testing Strategies. n. 12, p. 1278–1296, 1987.

APPENDIX D - Quality Assessment

In this systematic mapping we also assessed the quality of each article. As stated in Section 3.8, this quality assessment was not an exclusion criterion for the studies but only a way to compare the studies found. The criteria used to assess the articles are described in Table 1 and Table 2. Each article was assessed by two researchers, and in case of conflict a third researcher resolved the disagreements.

According to the established assessment criteria, Table 17 presents the quality index of each replication article assessed, divided by quartiles of the quality index and by type of publication: internal single-report, for internal replications published together with the original study; internal multi-report, for internal replications published separately from the original study; and external, for external replications. Chart 10 shows the number of replication articles in each quartile of the quality index versus the type of publication: internal single-report, internal multi-report, or external.

Analyzing the data in Table 17 and their grouping in Chart 10, we observe that single-report replication publications have the highest quality indices, with 70% of the articles scoring between 92% and 100% (quartile Q4), whereas internal multi-report and external publications have 76% of their articles scoring between 0% and 69%, in the quartile with the lowest quality index, Q1.

Table 17: Quartiles of the Quality Index

Columns: Quartile of the Quality Index; Replication Type (Internal Single-report, Internal Multi-report, External)

Q4 [92%, 100%]: REP025FE (100%), REP038FE (100%), REP045FE (97%), REP013FE (95%), REP023FE (95%), REP035FE (95%), REP041FE (95%)

Q3 [79%, 91%]: REP034FE (91%), REP033FE (88%), REP036FE (84%), REP042FE (91%), REP001FE (84%)

Q2 [70%, 78%]: REP010FE (77%), REP030FE (78%), REP039FE (78%), REP003FE (75%), REP006FE (72%)

Q1 [0%, 69%]: REP029FE (69%), REP004FE (69%), REP009FE (66%), REP007FE (69%), REP018FE (66%), REP026FE (69%), REP002FE (63%), REP027FE (69%), REP005FE (56%), REP022FE (66%), REP019FE (53%), REP016FE (63%), REP024FE (47%), REP032FE (63%), REP020FE (25%), REP008FE (59%), REP012FE (53%), REP021FE (50%), REP015FE (41%), REP028FE (41%), REP046FE (25%), REP040FE (19%)

Chart 10: Grouping by Quartiles of the Quality Index
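The two shares quoted in the text (70% and 76%) can be rechecked from the entry counts. The sketch below is only an illustration: the helper `share` and the variable names are ours, and it assumes, as the text implies, that the seven Q4 entries are the single-report articles and that the 22 Q1 entries are internal multi-report or external articles.

```python
def share(part: int, total: int) -> int:
    """Return part/total as a percentage rounded to the nearest integer."""
    return round(100 * part / total)

# Article totals per publication type for 2011-2012 (10 single-report,
# 10 internal multi-report, 19 external).
single_report_total = 10
separate_total = 10 + 19

# Assumption: counts of entries listed under each quartile in the table.
single_report_in_q4 = 7   # entries under Q4 [92%, 100%]
separate_in_q1 = 22       # entries under Q1 [0%, 69%]

q4_share = share(single_report_in_q4, single_report_total)
q1_share = share(separate_in_q1, separate_total)

print(f"single-report articles in Q4: {q4_share}%")
print(f"multi-report and external articles in Q1: {q1_share}%")
```

Under these assumptions the counts reproduce the 70% and 76% reported in the text.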

Tables 18 and 19 show the mean quality index of the articles for each type of publication. Table 18 covers the period from 1994 to 2010, and Table 19 covers this extension, the period from 2011 to 2012.

As presented in Table 1 and Table 2, we used two sets of criteria to assess the quality of the articles, so it was not possible to compute an overall mean across all articles, since they were assessed under different criteria.

In Table 18, single-report articles have the highest mean quality for the period by replication type, 88%. Articles that publish replications separately from the original study (internal or external) have a mean of 73%, that is, 15 percentage points lower on average than single-report replications.

In Table 19, internal single-report replications again have the highest mean quality by replication type, 94%. Articles that published replications without the original study (internal or external) again scored lower, with a mean of 61%, a difference of 33 percentage points from single-report replications.

Table 18: Mean Quality Index for the period 1994 to 2010

Replication type / Number of articles / Mean

Internal single-report 36 88%

Internal multi-report 29 74%

External 31 72%

Internal multi-report and External 60 73%

Table 19: Mean Quality Index for the period 2011 to 2012

Replication type / Number of articles / Mean

Internal single-report 10 94%

Internal multi-report 10 62%

External 19 60%

Internal multi-report and External 29 61%
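The combined rows of the two tables are consistent with a mean weighted by the number of articles. The following sketch is an assumption about how those rows can be reproduced, not the authors' documented procedure: `pooled_mean` is a hypothetical helper, and the inputs are the rounded percentages printed in the tables, so the results are approximate.

```python
def pooled_mean(groups):
    """Weighted mean of group means; groups is a list of (n_articles, mean_percent)."""
    total = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total

# (number of articles, mean quality index in %) as printed in the two tables
period_1994_2010 = {"single": (36, 88), "multi": (29, 74), "external": (31, 72)}
period_2011_2012 = {"single": (10, 94), "multi": (10, 62), "external": (19, 60)}

results = {}
for label, table in (("1994-2010", period_1994_2010), ("2011-2012", period_2011_2012)):
    # Pool the two kinds of articles published without the original study.
    separate = round(pooled_mean([table["multi"], table["external"]]))
    gap = table["single"][1] - separate  # difference in percentage points
    results[label] = (separate, gap)
    print(f"{label}: separate-report mean {separate}%, gap {gap} percentage points")
```

With these inputs the pooled means come out at 73% and 61%, matching the last row of each table, and the gaps to the single-report mean are 15 and 33 percentage points.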

From Tables 18 and 19 we see a large difference between the mean quality of articles that publish only the replication, whether internal or external, and that of articles that publish the replication together with the original study, that is, the single-report articles. This difference is due to the different assessment criteria defined in Table 1 and Table 2 for each type of replication article.

Internal multi-report and external replications have, in their quality assessment, specific criteria (CE) that assess the information about their original study. On average, the specific criteria had the lowest scores, as shown in Table 20.

The lower quality index on the specific criteria may have several causes, as raised by Silva et al. (2012) in the first mapping: page limits in some venues do not allow a complete description of the original study and its replication; researchers did not know that these descriptions were necessary, since in the area of replication of empirical studies in Software Engineering the first work to provide guidelines for reporting replications is Carver (2010), a recent work that may not yet be known to all researchers who performed replications in 2011 and 2012; and information about the original study may not have been available, or the replication researchers did not find it. In this mapping we add that the information in the original studies may have been hard to interpret, or the replication researchers did not interpret it correctly. We believe that disseminating the guidelines for reporting replications defined by Carver (2010), together with repositories that store data and information about original studies and replications, can help with these issues.

Table 20: Quality Assessment Index by Criterion for Articles That Do Not Include the Original Study

Study phase / Criterion type / Quality criterion (This article clearly...) / Index (%)

Design

CG ...defined the objectives? 79%

CE ...described the objective of the original study? 45%

CG ...described the research method? 90%

CE ...described the research method of the original study? 63%

CE ...described which research parameters (variations) differ from the original study? 62%

Execution

CG ...described the environment in which it was carried out? 90%

CE ...described the environment in which the original study was carried out? 57%

Analysis

CG ...described the data analysis method? 79%
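As a rough check of the claim that the specific criteria (CE) score lower than the other criteria (CG), the rows visible in the table can be averaged by criterion type. This is a sketch under stated assumptions: it uses an unweighted mean over criteria, covers only the eight rows shown here (the table may continue beyond this excerpt), and the variable names are ours.

```python
from statistics import mean

# (criterion type, index in %) for the rows visible in the table, in order
rows = [
    ("CG", 79), ("CE", 45), ("CG", 90), ("CE", 63), ("CE", 62),  # Design
    ("CG", 90), ("CE", 57),                                      # Execution
    ("CG", 79),                                                  # Analysis
]

# Group the indices by criterion type.
by_type = {}
for kind, index in rows:
    by_type.setdefault(kind, []).append(index)

cg_mean = mean(by_type["CG"])
ce_mean = mean(by_type["CE"])

print(f"CG mean: {cg_mean}%")
print(f"CE mean: {ce_mean}%")
print(f"gap: {cg_mean - ce_mean} percentage points")
```

On these rows the CG criteria average 84.5% and the CE criteria 56.75%, which is in line with the observation that descriptions of the original study are the weakest part of the reports.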

In the document Replicação de estudos empíricos em engenharia de software: extensão de um mapeamento sistemático (pages 83-101)