Overcoming Order in Autoregressive Graph Generation for Molecule Generation

E. Cohen-Karlik, E. Rozenberg, and D. Freedman

Transactions on Machine Learning Research (TMLR), 2024

Graph generation is a fundamental problem in various domains and is of particular interest in chemistry, where graphs may be used to represent molecules. Recent work has shown that molecular graph generation using recurrent neural networks (RNNs) is advantageous compared to traditional generative approaches that require converting continuous latent representations into graphs. One issue that arises when treating graph generation as sequential generation is the arbitrary order of the sequence, which results from a particular choice of graph flattening method: in the chemistry setting, a molecular graph commonly has multiple SMILES strings corresponding to the same molecule. Inspired by the use case of molecular graph generation, we propose using RNNs while taking into account the non-sequential nature of graphs by adding an Orderless Regularization (OLR) term that encourages the hidden state of the recurrent model to be invariant to the different valid orderings present under the training distribution. We demonstrate that sequential molecular graph generation models benefit from our proposed regularization scheme, especially when data is scarce. Our findings contribute to the growing body of research on graph generation and provide a valuable tool for various applications requiring the synthesis of realistic and diverse graph structures.
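
To make the idea of an order-invariance penalty concrete, the following is a minimal sketch in plain Python. It is not the formulation used in the paper: the abstract does not specify the exact form of the OLR term, so the squared L2 distance between final hidden states, the tiny tanh RNN, the names `rnn_final_state`, `orderless_penalty`, and `lambda_olr`, and the random weights are all assumptions made for illustration. The example relies only on the fact that `CCO` and `OCC` are two valid SMILES strings for ethanol.

```python
import math
import random


def rnn_final_state(tokens, embed, w_h, w_x):
    """Run a minimal tanh RNN over a token sequence; return the final hidden state."""
    hidden = [0.0] * len(w_h)
    for tok in tokens:
        x = embed[tok]
        hidden = [
            math.tanh(
                sum(w_h[i][j] * hidden[j] for j in range(len(hidden)))
                + sum(w_x[i][k] * x[k] for k in range(len(x)))
            )
            for i in range(len(w_h))
        ]
    return hidden


def orderless_penalty(seq_a, seq_b, embed, w_h, w_x):
    """Assumed penalty: squared L2 distance between the final hidden states
    produced by two orderings of the same graph."""
    h_a = rnn_final_state(seq_a, embed, w_h, w_x)
    h_b = rnn_final_state(seq_b, embed, w_h, w_x)
    return sum((a - b) ** 2 for a, b in zip(h_a, h_b))


# Hypothetical toy model: random embeddings and weights, fixed seed.
rng = random.Random(0)
hidden_dim, embed_dim = 4, 3
vocab = ["C", "O", "N"]
embed = {tok: [rng.uniform(-1, 1) for _ in range(embed_dim)] for tok in vocab}
w_h = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(hidden_dim)]
w_x = [[rng.uniform(-0.5, 0.5) for _ in range(embed_dim)] for _ in range(hidden_dim)]

# Two valid SMILES orderings of the same molecule (ethanol).
smiles_a, smiles_b = list("CCO"), list("OCC")
penalty = orderless_penalty(smiles_a, smiles_b, embed, w_h, w_x)

# In training, a term like this would be added to the usual generation loss:
#   total_loss = generation_loss + lambda_olr * penalty
lambda_olr = 0.1  # hypothetical regularization weight
print(f"orderless penalty: {penalty:.6f}")
```

The penalty is zero when both orderings lead to the same hidden state and grows as the states diverge, so minimizing it alongside the generation loss pushes the recurrent model toward order-invariant representations.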

© 2026 by Daniel Freedman / Research Scientist
