Apache POI pre-processed data for the first DocGen challenge at DySDoc 3

  • Martin P. Robillard (Contributor)
  • Andrian Marcus (Contributor)
  • Christoph Treude (Contributor)
  • Michele Lanza (Contributor)
  • Oscar Chaparro (Contributor)
  • James Clause (Contributor)
  • Neil A. Ernst (Contributor)
  • Marco Gerosa (Contributor)
  • Hideaki Hata (Contributor)
  • Shinpei Hayashi (Contributor)
  • Sarah Nadi (Contributor)



Apache POI pre-processed data for the first DocGen challenge

The pre-processed data for the First Software Documentation Generation Challenge (DocGen), hosted at the Third International Workshop on Dynamic Software Documentation (DySDoc 3), includes the following datasets for Apache POI 3.17:

Call graph between methods and classes
File: call-graph-poi-3.17-all.zip
A CSV file with the call graph between methods and between classes. Class A calls class B if there exists a call between a method of class A and a method of class B. The call graph was produced by the tool java-callgraph. The CSV file contains the following columns:

  • call_type: call between (C)lasses or (M)ethods
  • caller: the Fully Qualified Name (FQN) of the caller
  • method_call_type: the type of method call:
      • M for invokevirtual calls
      • I for invokeinterface calls
      • O for invokespecial calls
      • S for invokestatic calls
      • D for invokedynamic calls
  • callee: the FQN of the callee

For more details about the format and each type of method call, check the tool README.
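
As a rough illustration of how the call graph columns might be consumed, the following Python sketch groups class-level calls by caller and counts method-level calls per invocation type. It assumes a comma-delimited file with a header row that uses the column names listed above, and that method_call_type is empty on class-level rows. The sample rows, the FQN strings, and the function name load_call_graph are hypothetical; the actual file layout inside the ZIP may differ.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample rows; the real file may use a different
# delimiter, header, or FQN notation.
SAMPLE = """call_type,caller,method_call_type,callee
C,org.example.A,,org.example.B
M,org.example.A:run(),M,org.example.B:read()
M,org.example.A:run(),S,org.example.C:util()
C,org.example.A,,org.example.C
"""

def load_call_graph(lines):
    """Split rows into class-level edges and per-type method call counts."""
    class_calls = defaultdict(set)   # caller class -> set of callee classes
    invoke_counts = Counter()        # method_call_type -> number of rows
    for row in csv.DictReader(lines):
        if row["call_type"] == "C":
            class_calls[row["caller"]].add(row["callee"])
        elif row["call_type"] == "M":
            invoke_counts[row["method_call_type"]] += 1
    return dict(class_calls), invoke_counts

class_calls, invoke_counts = load_call_graph(io.StringIO(SAMPLE))
print(sorted(class_calls["org.example.A"]))
print(invoke_counts["S"])
```

To read the real data, an open text file handle for the extracted CSV could be passed in place of the io.StringIO sample.
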

Inheritance hierarchy
File: poi-3.17-inheritance.zip
A CSV file with the inheritance hierarchy of POI, which was extracted using bcel 6.2. The CSV file contains the following columns:

  • record_id: sequential number
  • parent_class: the parent class
  • child_class: the child class
  • relationship_type: the type of relationship between classes, i.e., the child class 'extends' or 'implements' the parent class

Issues
File: bugzilla-poi-dump.zip
CSV file with the list of issues of Apache POI (timestamp: Tue Feb 27, 2018, 18.41.40 UTC). The CSV file contains the following columns:

  • record_id: sequential number
  • issue_id: the ID that identifies the issue in the issue tracker
  • issue_url: the URL of the issue in the issue tracker
  • issue_title: the title of the issue
  • xml_path: the path to the XML of the issue, which contains all the issue information provided by the issue tracker

All the issues in XML format can be found in the "poi" folder in the ZIP file.

Commits
File: poi-commits.zip
A JSON file with commit information for POI 3.17 (until revision 219dff00e6, on Sept. 8, 2017). The information was extracted using the tools Historage and Kataribe. For each commit, we provide:

  • Commit hash
  • Parent commit hash (if exists)
  • Commit message
  • Commit time
  • Committer name
  • Method-level changes (addition/deletion/modification/renaming and method FQN). The FQN contains information about the class (CN) and method (MT) or constructor (CS)

StackOverflow posts
File: apache-poi-SO.zip
JSON file with all 6,299 Stack Overflow threads with the apache-poi tag,

Date made available: Mar 24 2018
