A constrained recursion algorithm for tree-structured LSTM with mini-batch SGD

  • Ruo Ando, National Institute of Informatics
  • Yoshiyasu Takefuji, Musashino University
Keywords: Constrained recursion, hyperparameter tuning, tree-structured LSTM, mini-batch SGD


Tree-structured LSTM is a promising approach for modeling long-distance interactions over hierarchies with syntactic information. Compared with chain-structured LSTM, it also offers better modularity of the learning process. However, hyperparameter tuning in tree-structured LSTM remains a challenge. In particular, the hyperparameters of mini-batch SGD (Stochastic Gradient Descent) are among the most important factors determining the prediction quality of an LSTM. To support more sophisticated hyperparameter tuning of mini-batch SGD, we propose a constrained recursion algorithm for tree-structured LSTM. Our algorithm enables the program to generate an LSTM tree for each batch. This allows us to evaluate the effect of the mini-batch size more accurately than with chain-structured LSTM. In addition, our constrained recursion algorithm can traverse the LSTM trees and update the weights across several trees using breadth-first search. In the experiments, we measured the validation loss and elapsed time while varying the mini-batch size. We succeeded in measuring the stability of the learning process with small batch sizes and the instability associated with overfitting with large batch sizes more precisely than with chain-structured LSTM.
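The two structural ideas in the abstract, one tree per mini-batch and a breadth-first pass over several trees, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the constraint, so the sketch assumes it is a cap on recursion depth (`max_depth`), and the names `Node`, `build_tree`, `batches`, and `bfs_order` are hypothetical. The LSTM cell computation and weight update are omitted; the sketch only shows the order in which nodes would be visited.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """One tree node; in a real model it would hold an LSTM cell state."""
    items: list
    depth: int
    children: list = field(default_factory=list)


def build_tree(items, max_depth, depth=0):
    """Recursively split `items` in half, stopping at `max_depth`.

    Assumption: the recursion constraint is a depth cap. The source
    does not specify the actual constraint.
    """
    node = Node(items=list(items), depth=depth)
    if depth < max_depth and len(items) > 1:
        mid = len(items) // 2
        node.children = [
            build_tree(items[:mid], max_depth, depth + 1),
            build_tree(items[mid:], max_depth, depth + 1),
        ]
    return node


def batches(data, batch_size):
    """Yield consecutive mini-batches of `data`."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


def bfs_order(roots):
    """Visit the nodes of several trees level by level (breadth-first)."""
    queue = deque(roots)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)  # a per-node weight update would go here
        queue.extend(node.children)
    return order


data = list(range(8))
trees = [build_tree(batch, max_depth=2) for batch in batches(data, batch_size=4)]
visited = bfs_order(trees)
print(len(trees), len(visited))
```

Changing `batch_size` changes both the number of trees and the number of items under each root, which is the quantity whose effect on validation loss and elapsed time the experiments measure.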


LSTM implementation for IJSCAI https://github.com/RuoAndo/LSTM-IJSCAI

