ASTToken2Vec: An Embedding Method for Neural Code Completion

Abstract:

Code completion systems help programmers write code more efficiently and reduce typographical errors by automatically suggesting the code fragment that the programmer is likely to write next. This work aims to improve the prediction performance of an LSTM-based code completion system proposed by Chang Liu et al. by introducing a new embedding method (a vector representation) for AST nodes. The method, called ASTToken2Vec, is similar to Word2Vec: it trains a neural network on context information to produce a vector representation of each AST node. We integrate our embedding method with an LSTM model and evaluate its prediction performance on a JavaScript AST dataset generated from open-source programs containing a total of 150,000 JavaScript files.
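To make the phrase "using context information" concrete, the following is a minimal sketch of how Word2Vec-style (target, context) training pairs could be collected from an AST. It is not the paper's implementation: the abstract does not say how context is defined, so this sketch assumes that the context of a node is its parent and its direct children. The helper `context_pairs` and the toy tree `toy_ast` are hypothetical names introduced only for illustration.

```python
from typing import Dict, List, Optional, Tuple


def context_pairs(node: Dict, parent_type: Optional[str] = None) -> List[Tuple[str, str]]:
    """Collect (target, context) pairs from a toy AST.

    Assumption (not from the paper): the context of a node is its
    parent and its direct children.
    """
    pairs: List[Tuple[str, str]] = []
    target = node["type"]
    # Pair the node with its parent, if it has one.
    if parent_type is not None:
        pairs.append((target, parent_type))
    # Pair the node with each direct child, then recurse into that child.
    for child in node.get("children", []):
        pairs.append((target, child["type"]))
        pairs.extend(context_pairs(child, target))
    return pairs


# Hypothetical toy AST for a statement such as `f(1);`
toy_ast = {
    "type": "Program",
    "children": [
        {
            "type": "ExpressionStatement",
            "children": [
                {
                    "type": "CallExpression",
                    "children": [{"type": "Identifier"}, {"type": "Literal"}],
                }
            ],
        }
    ],
}

pairs = context_pairs(toy_ast)
for target, context in pairs:
    print(target, "->", context)
```

Pairs of this kind could then be fed to a skip-gram-style model so that nodes appearing in similar contexts receive similar vectors; the actual context definition and training setup used by ASTToken2Vec are described in the paper itself.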

Reference:

ASTToken2Vec: An Embedding Method for Neural Code Completion (Dongfang Li and Hidehiko Masuhara), In Proceedings of the 36th JSSST Annual Conference (Kei Ito, ed.), 2019. (The Student Research Award and the Best Presentation Award of the conference.)

Bibtex Entry:

@inproceedings{li2019jssst,
  organization = {{J}apan Society for Software Science and Technology
		  ({JSSST})},
  month = aug,
  location = {Shibaura Institute of Technology, Tokyo, Japan},
  editor = {Kei Ito},
  year = 2019,
  booktitle = {Proceedings of the 36th JSSST Annual Conference},
  author = {Dongfang Li and Hidehiko Masuhara},
  title = {{ASTToken2Vec}: An Embedding Method for Neural Code Completion},
  pages = {No.~13-L},
  date = {2019-08-27},
  note = {\href{https://jssst2019.wordpress.com/}{The Student Research Award and the Best Presentation Award of the conference.}},
  review = {false},
  keywords = {JavaScript, LSTM},
  pdf = {jssst2019completion.pdf},
  abstract = {Code completion systems help programmers write code more efficiently and reduce typographical errors by automatically suggesting the code fragment that the programmer is likely to write next. This work aims to improve the prediction performance of an LSTM-based code completion system proposed by Chang Liu et al. by introducing a new embedding method (a vector representation) for AST nodes. The method, called ASTToken2Vec, is similar to Word2Vec: it trains a neural network on context information to produce a vector representation of each AST node. We integrate our embedding method with an LSTM model and evaluate its prediction performance on a JavaScript AST dataset generated from open-source programs containing a total of 150,000 JavaScript files.}
}