Understanding Development Process of Machine Learning Systems: Challenges and Solutions

Elizamary De Souza Nascimento, Iftekhar Ahmed, Edson Oliveira, Márcio Piedade Palheta, Igor Steinmacher, Tayana Conte

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

56 Scopus citations

Abstract

Background: The number of Machine Learning (ML) systems developed in industry is increasing rapidly. ML systems differ from traditional systems, and these differences are clearly visible in the activities of the ML systems software development process. They make Software Engineering (SE) activities more challenging for ML systems because both the behavior of the system and its requirements are data dependent. In such a scenario, how can Software Engineering better support the development of ML systems? Aim: Our objective is twofold. First, to better understand the process that developers use to build ML systems. Second, to identify the main challenges that developers face and propose ways to overcome them. Method: We conducted interviews with seven developers from three small software companies that develop ML systems. Based on the challenges uncovered, we proposed a set of checklists to support the developers. We assessed the checklists using a focus group. Results: We found that ML systems development follows a 4-stage process in these companies. The stages are: understanding the problem, data handling, model building, and model monitoring. The main challenges faced by the developers are: identifying the clients' business metrics, the lack of a defined development process, and designing the database structure. The focus group indicated that our proposed checklists supported the identification of the clients' business metrics and increased the visibility of the progress of project tasks. Conclusions: Our research is an initial step towards supporting the development of ML systems. It suggests checklists that support developers in essential development tasks and that can serve as a basis for future research in the area.

Original language: English (US)
Title of host publication: Proceedings - 13th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2019
Publisher: IEEE Computer Society
ISBN (Electronic): 9781728129686
State: Published - Sep 2019
Event: 13th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2019 - Porto de Galinhas, Pernambuco, Brazil
Duration: Sep 19 2019 to Sep 20 2019

Publication series

Name: International Symposium on Empirical Software Engineering and Measurement
Volume: 2019-September
ISSN (Print): 1949-3770
ISSN (Electronic): 1949-3789

Conference

Conference: 13th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2019
Country/Territory: Brazil
City: Porto de Galinhas, Pernambuco
Period: 9/19/19 to 9/20/19

Keywords

  • Machine Learning Systems
  • Software Engineering
  • challenges
  • data handling
  • software development

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
