Go to:
Logótipo
Comuta visibilidade da coluna esquerda
Você está em: Start > Publications > View > Mining Idioms in the Wild
Publication

Publications

Mining Idioms in the Wild

Title
Mining Idioms in the Wild
Type
Article in International Conference Proceedings Book
Year
2022
Authors
Sivaraman A.
(Author)
Other
The person does not belong to the institution. The person does not belong to the institution. The person does not belong to the institution. Without AUTHENTICUS Without ORCID
Rui Abreu
(Author)
FEUP
View Personal Page You do not have permissions to view the institutional email. Search for Participant Publications View Authenticus page View ORCID page
Scott A.
(Author)
Other
The person does not belong to the institution. The person does not belong to the institution. The person does not belong to the institution. Without AUTHENTICUS Without ORCID
Akomolede T.
(Author)
Other
The person does not belong to the institution. The person does not belong to the institution. The person does not belong to the institution. Without AUTHENTICUS Without ORCID
Chandra S.
(Author)
Other
The person does not belong to the institution. The person does not belong to the institution. The person does not belong to the institution. Without AUTHENTICUS Without ORCID
Conference proceedings International
Pages: 187-196
44th ACM/IEEE International Conference on Software Engineering: Companion, ICSE-Companion 2022
22 May 2022 through 27 May 2022
Indexing
Other information
Authenticus ID: P-017-DKP
Abstract (EN): Existing code repositories contain numerous instances of code patterns that are idiomatic ways of accomplishing a particular programming task. Sometimes, the programming language in use supports specific operators or APIs that can express the same idiomatic imperative code much more succinctly. However, those code patterns linger in repositories because the developers may be unaware of the new APIs or have not gotten around to them. Detection of idiomatic code can also point to the need for new APIs. We share our experiences in mining imperative idiomatic patterns from the Hack repo at Facebook. We found that existing techniques either cannot identify meaningful patterns from syntax trees or require test-suite-based dynamic analysis to incorporate semantic properties to mine useful patterns. The key insight of the approach proposed in this paper - Jezero - is that semantic idioms from a large codebase can be learned from canonicalized dataflow trees. We propose a scalable, lightweight static analysis-based approach to construct such a tree that is well suited to mine semantic idioms using nonparametric Bayesian methods. Our experiments with Jezero on Hack code show a clear advantage of adding canonicalized dataflow information to ASTs: Jezero was significantly more effective in finding new refactoring opportunities from unannotated legacy code than a baseline that did not have the dataflow augmentation.
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 9
Documents
We could not find any documents associated to the publication.
Recommend this page Top
Copyright 1996-2025 © Faculdade de Direito da Universidade do Porto  I Terms and Conditions  I Acessibility  I Index A-Z
Page created on: 2025-08-28 at 19:20:18 | Privacy Policy | Personal Data Protection Policy | Whistleblowing