Looking for related posts on GitHub discussions

Marcia Lima, Igor Steinmacher, Denae Ford, Evangeline Liu, Grace Vorreuter, Tayana Conte, Bruno Gadelha

Research output: Contribution to journal › Article › peer-review

Abstract

Software teams increasingly adopt different tools and communication channels to support collaborative software development and coordinate tasks. Among such resources, software development forums have become widely used by developers. Such environments enable developers to get and share technical information quickly. In line with this trend, GitHub announced GitHub Discussions, a native forum to facilitate collaborative discussions between users and members of communities hosted on the platform. Since GitHub Discussions is a software development forum, it faces challenges similar to those faced by other systems used for asynchronous communication, including the problems caused by related posts (duplicated and near-duplicated posts). These related posts can add noise to the platform and compromise project knowledge sharing. Hence, this article addresses the problem of detecting related posts on GitHub Discussions. To achieve this, we propose the RD-Detector, an approach based on a pre-trained, general-purpose Sentence-BERT model. We evaluated RD-Detector using data from three communities hosted on GitHub. Our dataset comprises 16,048 discussion posts. Three maintainers and three Software Engineering (SE) researchers manually evaluated the RD-Detector results, which achieved 77–100% precision and 66% recall. In addition, maintainers pointed out practical applications of the approach, such as providing knowledge to support merging discussion posts and converting posts to comments on other related posts. Maintainers can benefit from RD-Detector to address the labor-intensive task of manually detecting related posts.

Original language: English (US)
Article number: e1567
Journal: PeerJ Computer Science
Volume: 9
State: Published - 2023
Externally published: Yes

Keywords

  • Communication tool
  • GitHub Discussions
  • Knowledge sharing
  • Related posts
  • Sentence-BERT
  • Software teams interaction

ASJC Scopus subject areas

  • Computer Science (all)
