3.2 Research Design
3.2.2 Data Collection
Methods for data collection can be classified as direct or indirect (Runeson and Höst, 2009). Direct data collection methods are first degree methods in which the researchers gather data by interacting with the study subjects, for example by interviewing them. Indirect data collection methods are second and third degree methods. In second degree methods, data is collected through some kind of automated mechanism. The most indirect are the third degree methods, where the data already exists prior to the study and is collected from previously stored documents or from other systems such as organizational repositories. Figure 3.2 (decision point 7) shows the data collection methods used for the thesis studies, with interviews being the main first degree collection method along with a single survey as a second degree method.
Interviews are used to elicit information and detailed views from study subjects with the help of a set of questions presented either locally or remotely (Wohlin and Aurum, 2015). Questions in interviews are generally either open questions, which allow more room for the answers, or closed questions, where the subjects are encouraged to answer in a particular form or structure, e.g. by choosing an answer from a set of possible answers (Runeson and Höst, 2009). Unstructured interviews tend to follow the natural flow of discourse, having mostly open questions in the interview protocol (Runeson and Höst, 2009; Wohlin and Aurum, 2015). As opposed to unstructured interviews, structured interviews favor closed questions, whereas semi-structured interviews try to strike a balance between open and closed questions (Runeson and Höst, 2009; Wohlin and Aurum, 2015).
An interview can be divided into phases and sections: an introductory phase traditionally begins the interview, followed by background questions and the main questions for the interview (Runeson and Höst, 2009). Each question asked in an interview can also be classified according to its level of enquiry in a five-level classification (Yin, 2014). Level 1 questions concern the interviewee as an individual and the attitudes of the individual. Level 2 questions concern the case being studied and are thus the most common in case studies. Questions at levels beyond Level 2 are broader, addressing not only the single case but also more general matters such as cross-case findings.
Interviews for the 33 cases in the thesis were carried out during the years 2014 and 2015. Themes for the interviews varied by study, so three distinct interview question sets were used. The studies reported in Publication I and Publication II used the same interview protocol. In these two studies, case companies were approached with the idea that it would be helpful if the people who were interviewed were from software development teams and projects that were in a relatively advanced state. This way, the concrete cases could represent not only the typical case in a company but also reflect the best software development practices and processes available in the context of the company.
A single interview session was used in all of the 19 cases that were part of the first set of interviews. The people who were interviewed worked mostly as developers, architects or team leads in small software development teams developing a particular product. Around two hours were reserved for each interview. Usually, two researchers and one or two members of the software development team were present in the interview. The roles between researchers were generally divided so that one researcher asked the questions following the interview guideline while the other researcher took notes and asked clarifying questions when necessary. A digital voice recorder was used to record the interviews with the consent of the interviewees. The interviews followed a semi-structured pattern with both open and closed questions. Background questions about the company and the interviewee were asked in the first sections. Most questions in the interview were Level 2 questions about the concrete case, the project or product, with which the interviewees were working. These questions were aimed at gaining insight into the practical ways of software development, testing, and deployment of software releases in the setting of project or product developer teams. A small number of personal Level 1 questions about hypothetical situations and individual perceptions of software development were included in the interview protocol. For instance, the interviewees were asked what benefits or challenges they could see with a release model where releases and deployment to production environments were more frequent. Besides open and closed questions that could be answered verbally, the interview protocol included interactive parts. The interviewees were asked to depict their software development process from start to finish with a freeform process diagram on a whiteboard or another similar surface. Notes taken in the interview about specifics of the development process were sent back to the interviewees for verification after the interview. In all but two cases the process descriptions were verified.
The second set of 10 case interviews, reported in Publication IV, had a design similar to that of the first set. With one or two researchers present, the semi-structured interviews on refactoring were recorded and transcribed. The people who were interviewed were senior software developers and architects from a wide range of Finnish companies engaged in software development. Background questions were used to contextualize the cases and characterize the individuals being interviewed. A number of questions could be classified as Level 1 questions, since the interest was also to gather information about how senior developers and architects understand and define refactoring. Although the primary unit of analysis and case was the project and the development process used in the project, individual development team members were the embedded units of analysis in the company's organizational context. Questions in the second interview set were mostly open questions, but several closed questions were also asked in the course of the interview. The closed questions included, for instance, estimating the maturity of a project using a five-step software development maturity model as reference (Olsson et al., 2012).
Interviews in the third interview set of the case study reported in Publication V touched on the topic of DevOps. Three cases from three companies were selected for the study. The data was collected in three separate interview sessions where questions regarding the development and testing practices of the companies were combined with reflective questions about the presumed benefits and challenges of DevOps. The interviews lasted around two hours and followed a semi-structured format. Two of the interviews were conducted on-site and one of the interviews was conducted remotely. All interviews were recorded and transcribed later.
Data collection for the longitudinal survey study reported in Publication III was done using an online questionnaire, which can be classified as an indirect second degree method of collecting data. The survey measured project-level maturity of software development practices in areas such as test automation, quality, build and deployment, running and monitoring, and the lead time to release new versions over a two-year time period in a Finnish company. Projects were included through self-selection, as project representatives were free to choose whether to answer the survey or not. Information about the survey was sent via e-mail to all active projects in the company. The survey was first conducted in 2015 and replicated in 2017 with minor changes to the questionnaire. There were responses from 43 projects in 2017, which is comparable to the 35 project responses in 2015. Since the company was involved in developing customer projects, the projects changed from time to time. Out of the 43 projects in 2017, 10 were the same as in 2015. Thus, the bounding of the case is twofold. The context is the company itself, and the units of analysis are the organization as well as the individual projects that were active both in 2015 and in 2017.
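As an illustration only, the pairing of projects across the two survey rounds can be sketched as a set intersection over project identifiers. The identifiers and the helper name paired_projects below are hypothetical and are chosen solely so that the counts match the reported figures (35 responses in 2015, 43 in 2017, 10 in both). The sketch does not reproduce the actual survey data or the procedure used in the study.

```python
def paired_projects(first_round, second_round):
    """Return the project identifiers that responded in both survey rounds."""
    return set(first_round) & set(second_round)


# Hypothetical identifiers, constructed only to match the reported counts.
responses_2015 = {f"P{i:03d}" for i in range(1, 36)}   # 35 projects
responses_2017 = {f"P{i:03d}" for i in range(26, 69)}  # 43 projects

both_rounds = paired_projects(responses_2015, responses_2017)
print(len(responses_2015), len(responses_2017), len(both_rounds))
```

Only the projects in the intersection can be compared longitudinally at the project level. The remaining responses contribute to the organization-level view of each survey round.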