CrystalFlow: a flow-based generative model for crystalline materials - Nature Communications



In this study, we present CrystalFlow, an advanced generative model for crystalline materials that addresses key challenges in the rapidly evolving field of crystal generative modeling. Existing approaches, such as diffusion-based models, often require a large number of integration steps, leading to significant computational inefficiency, while string-based language models struggle to capture the intrinsic symmetries of crystals. To overcome these limitations, CrystalFlow employs Continuous Normalizing Flows (CNFs) [55] within the Conditional Flow Matching (CFM) framework [56,57], effectively transforming a simple prior density into a complex data distribution that captures the structural and compositional intricacies of crystalline material databases. This approach simultaneously generates lattice parameters, fractional coordinates, and atom types for crystalline systems, while establishing a symmetry-aware design through recent advancements in graph-based equivariant message-passing networks. By explicitly incorporating the fundamental periodic-E(3) symmetries of crystalline systems, CrystalFlow enables data-efficient learning, high-quality sampling, and flexible conditional generation. Our evaluation demonstrates that CrystalFlow achieves performance comparable to or surpassing state-of-the-art models across standard generation metrics when trained on benchmark datasets. Additionally, when trained with appropriately labeled data, CrystalFlow can generate structures optimized for specific external pressures or material properties, underscoring its versatility and effectiveness in addressing realistic and application-driven challenges in crystal structure prediction (CSP).

The architecture of CrystalFlow is schematically illustrated in Fig. 1, with a detailed description of the methodology provided in the Methods section. Following established conventions in the crystal generative modeling community, a unit cell of a crystalline structure containing N atoms is represented as M = (A, F, L). In this representation, A ∈ ℝ^(a×N) encodes the chemical composition, where each atom type is mapped to a unique a-dimensional categorical vector. The fractional atomic coordinates within the unit cell are denoted by F ∈ [0, 1)^(3×N), and the lattice structure is described by the lattice matrix L ∈ ℝ^(3×3).
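For concreteness, a toy two-atom cell in this (A, F, L) representation can be written with plain NumPy arrays. This is a hypothetical illustration, not the authors' code; the one-hot dimension a and the atomic numbers chosen here are free choices:

```python
import numpy as np

# Hypothetical 2-atom unit cell in the (A, F, L) representation.
a_dim = 100                      # size of the atom-type one-hot space (a free choice)
Z = [11, 17]                     # atomic numbers, e.g. Na and Cl
N = len(Z)

A = np.zeros((a_dim, N))         # one-hot atom types, shape (a, N)
for j, z in enumerate(Z):
    A[z - 1, j] = 1.0

F = np.array([[0.0, 0.5],        # fractional coordinates in [0, 1), shape (3, N)
              [0.0, 0.5],
              [0.0, 0.5]])

L = 5.64 * np.eye(3)             # lattice matrix in Angstrom, shape (3, 3)

print(A.shape, F.shape, L.shape)
```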

To ensure rotational invariance in the lattice representation, an alternative parameterization of L is adopted using a rotation-invariant vector k ∈ ℝ^6, derived via polar decomposition as L = Q exp(∑ᵢ kᵢ Bᵢ). Here, Q is an orthogonal matrix representing rotational degrees of freedom, exp(·) denotes the matrix exponential, and {Bᵢ} forms a standard basis of 3 × 3 symmetric matrices. This parameterization effectively decouples rotational and structural information, providing a compact and symmetry-preserving representation of the lattice.
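A minimal NumPy sketch of this decomposition (our own illustration via an eigendecomposition, not the authors' implementation; the function name is hypothetical). Because a global rotation L → R L leaves LᵀL unchanged, the extracted vector k is rotation invariant:

```python
import numpy as np

def lattice_to_invariant(L):
    """Split L into a rotation Q and a rotation-invariant symmetric log S
    via the polar decomposition L = Q @ expm(S); a sketch of the idea,
    not the paper's implementation."""
    M = L.T @ L                               # symmetric positive-definite
    w, V = np.linalg.eigh(M)
    P = V @ np.diag(np.sqrt(w)) @ V.T         # principal square root, so L = Q P
    Q = L @ np.linalg.inv(P)                  # orthogonal factor (rotation part)
    S = V @ np.diag(0.5 * np.log(w)) @ V.T    # matrix logarithm of P
    k = S[np.triu_indices(3)]                 # the 6 independent entries of S
    return Q, S, k

L = np.array([[6.0, 0.5, 0.0],
              [0.0, 5.0, 0.3],
              [0.0, 0.0, 4.0]])
Q, S, k = lattice_to_invariant(L)
# Q is orthogonal, and exp(S) reconstructs the rotation-free shape part of L.
```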

In the context of CSP, the primary objective is to predict the stable structure for a given chemical composition A under specific external conditions, such as pressure P. To achieve this, we propose a generative model that learns the conditional probability distribution over stable or metastable crystal configurations, denoted as p(x∣y). Here, x = (F, L) represents the structural parameters, while y = (A, P) serves as the conditioning variables. In cases where the chemical composition A (i.e., atom types) is not pre-specified, a task referred to as de novo generation (DNG), the model extends its scope to jointly predict not only the structural parameters (F, L) but also the atom types A.

The proposed framework, CrystalFlow, models the conditional probability distribution over crystal structures using a CNF, trained with CFM techniques. This generative modeling technique establishes a mapping between the data distribution q(x) and a simple prior distribution p₀(x), such as a Gaussian, through continuous and invertible transformations. This formulation enables efficient sampling and exploration of complex, high-dimensional data spaces. The architecture employs an equivariant geometric graph neural network (GNN) to parameterize time-dependent vector fields for the lattice L, fractional atomic coordinates F, and atom types A; these fields collectively define the flow transformations and explicitly preserve the intrinsic periodic-E(3) symmetries of crystals, including permutation, rotation, and periodic translation invariance.
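The CFM training objective with linear probability paths can be sketched in a few lines. This is a generic toy version with a stand-in linear "model"; the paper's actual loss, equivariant network, and conditioning are more involved:

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_loss(v_theta, x1_batch):
    """One step of the (linear-path) conditional flow matching objective:
    draw x0 ~ N(0, I) and t ~ U(0, 1), form x_t = (1 - t) x0 + t x1, and
    regress the model field onto the target velocity x1 - x0.
    A generic sketch of the training objective, not the paper's exact loss."""
    x0 = rng.standard_normal(x1_batch.shape)       # prior samples
    t = rng.uniform(size=(x1_batch.shape[0], 1))   # one time per sample
    xt = (1.0 - t) * x0 + t * x1_batch             # point on the probability path
    u = x1_batch - x0                              # target velocity
    return np.mean((v_theta(xt, t) - u) ** 2)

# Toy stand-in for the equivariant GNN: a fixed random linear field.
W = rng.standard_normal((4, 4)) * 0.1
v_theta = lambda x, t: x @ W + t                   # t broadcasts over features

x1 = rng.standard_normal((32, 4))                  # "data" batch
loss = cfm_loss(v_theta, x1)
print(loss)                                        # non-negative scalar
```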

During inference, random initial structures are sampled from simple prior distributions and evolved toward realistic crystal configurations through a learned conditional probability path. The model employs numerical ordinary differential equation (ODE) solvers to generate crystal structures, with adjustable integration steps to balance computational efficiency and sample quality. This capability allows for the efficient generation of stable and metastable crystal configurations, enabling the exploration of new materials and their properties.
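The adjustable-step generation described above can be illustrated with a fixed-step Euler solver (a sketch of the generic mechanism; CrystalFlow's actual solver and step schedule may differ):

```python
import numpy as np

def integrate(v, x0, S):
    """Fixed-step Euler ODE solver: evolve x from t = 0 to t = 1 in S steps.
    More steps give a more accurate trajectory at higher cost, which is
    the efficiency/quality trade-off discussed in the text."""
    x, dt = x0.copy(), 1.0 / S
    for s in range(S):
        t = s * dt
        x = x + v(x, t) * dt
    return x

# For the linear contraction v(x, t) = -x, the exact solution at t = 1
# is x0 * exp(-1); the Euler result approaches it as S grows.
x0 = np.array([1.0, 2.0])
for S in (10, 100, 1000):
    print(S, integrate(lambda x, t: -x, x0, S))
```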

We evaluate the performance of CrystalFlow on a diverse set of crystal generation tasks using datasets that span a broad range of compositional and structural diversity. The model's effectiveness is systematically benchmarked against existing crystal generation methods using standard evaluation metrics. Furthermore, the quality of the generated structures is thoroughly analyzed through detailed density functional theory (DFT) calculations.

We begin by evaluating the performance of CrystalFlow using two widely recognized benchmark datasets, MP-20 and MPTS-52. The MP-20 dataset comprises 45,231 stable or metastable crystalline materials sourced from the Materials Project (MP), encompassing the majority of experimentally reported materials in the ICSD database with up to 20 atoms per unit cell. In contrast, MPTS-52 represents a more challenging extension of MP-20, containing 40,476 crystal structures with up to 52 atoms per unit cell, organized chronologically based on their earliest reported appearance in the literature. The datasets are divided into training, validation, and test subsets in a manner consistent with previous studies.

In accordance with standard practice, the predictive performance of the model is evaluated by calculating its match rate (MR) and the root mean squared error (RMSE) on the test set. Specifically, for each structure in the test set, k candidate structures are generated using CrystalFlow, and it is determined whether any of the predicted structures match the ground truth structure. The MR is defined as the fraction of structures in the test set that are successfully predicted. To match structures and evaluate their similarity, we employ the StructureMatcher function from the Pymatgen library. For consistency with prior studies, the same threshold parameters are used: ltol=0.3, stol=0.5, angle_tol=10 (see Methods for details). For matched structures, the RMSE between the positions of matched atom pairs is calculated and normalized by (V/N)^(1/3), where V is the volume derived from the average lattice parameters. It is important to note that, while this benchmarking approach assumes known compositions, real-world CSP is often more general and challenging, as the precise stoichiometry may be unknown or only partially specified.
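The MR/RMSE aggregation can be sketched as follows. The inputs are hypothetical per-structure match results standing in for actual StructureMatcher output (with ltol=0.3, stol=0.5, angle_tol=10), and for simplicity a single (V, N) is used for all structures; the function name is our own:

```python
import numpy as np

def match_rate_and_rmse(results, volume, n_atoms):
    """Aggregate CSP benchmark metrics from per-structure match results.
    `results` holds, per test structure, either None (no candidate matched)
    or the raw RMSE (Angstrom) between matched atom pairs.
    Simplification: one (volume, n_atoms) pair is applied to all structures."""
    matched = [r for r in results if r is not None]
    mr = len(matched) / len(results)
    # Normalize by (V/N)^(1/3), the average length scale per atom.
    norm = (volume / n_atoms) ** (1.0 / 3.0)
    rmse = float(np.mean([r / norm for r in matched])) if matched else float("nan")
    return mr, rmse

results = [0.12, None, 0.30, 0.05]   # toy per-structure raw RMSEs
mr, rmse = match_rate_and_rmse(results, volume=160.0, n_atoms=8)
print(mr, rmse)
```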

The MR and RMSE values for CrystalFlow at k = 1, 20, and 100 on the MP-20 and MPTS-52 datasets are presented in Table 1, alongside comparisons with other crystal generative models, including CDVAE, DiffCSP, and FlowMM. While k = 1 is conventionally used for benchmarking, practical CSP requires generating multiple candidate structures per composition to effectively explore complex energy landscapes and capture both stable and metastable (polymorphic) structures. Therefore, we report results at different k for a more realistic and comprehensive evaluation. The results indicate that CrystalFlow achieves performance that is comparable to or exceeds that of state-of-the-art models. On the MP-20 dataset, CrystalFlow demonstrates comparable MR and RMSE values to FlowMM, outperforming CDVAE and DiffCSP. On the more challenging MPTS-52 dataset, CrystalFlow achieves the best performance among all four models, highlighting its superior predictive capability. MR and RMSE values calculated using more stringent threshold parameters, as well as RMSE computed over all predictions (RMSE-all, including both matched and unmatched pairs), are presented in Supplementary S1. As anticipated, the application of stricter (i.e., smaller) tolerance values leads to a reduction in both match rate and RMSE, as only more closely matching structures are considered equivalent. Notably, the relative performance ranking among the evaluated methods remains largely unchanged under varying criteria.

A direct comparison of inference times for CrystalFlow and other state-of-the-art models, all benchmarked on the same GPU device (NVIDIA A800), is provided in Table 1. These results clearly show that CrystalFlow is approximately an order of magnitude faster than the diffusion-based model DiffCSP, while maintaining comparable or superior generation quality. This substantial efficiency gain is primarily due to the significantly fewer integration steps required by flow-based models such as CrystalFlow and FlowMM. Here, the integration step count refers to the number of discrete numerical steps taken by the ODE solver (in flow-based models) or the stochastic differential equation (SDE) solver (in diffusion-based models) during the generation process. Fewer integration steps not only accelerate sample generation but also reduce computational cost, making the model more practical for large-scale applications.

We further evaluate the performance of CrystalFlow using the extensive MP-CALYPSO-60 dataset described in ref. . This dataset is constructed by integrating two sources: (1) ambient-pressure crystal structures obtained from the MP database, and (2) crystal structures generated from previous CALYPSO CSP studies conducted over a wide pressure range, with the majority of structures corresponding to pressures between 0 and 300 GPa. See Supplementary S2 and ref. for more details about the CALYPSO dataset. In contrast to the original dataset in ref. , structures containing more than 60 atoms per unit cell have been excluded. The resulting dataset comprises 657,377 crystal structures, spanning 86 elements and 79,884 unique chemical compositions. CrystalFlow, trained on this dataset, is conditioned on both chemical composition and external pressure, allowing it to generate crystal structures across a variety of pressure conditions. This capability is particularly important for simulating realistic conditions that materials may encounter in practical structure prediction applications.

We randomly selected 500 pairs of chemical compositions and pressures from the test set to serve as conditional inputs. For each conditional input, one structure was generated using CrystalFlow, as well as the previously developed Cond-CDVAE model, for comparative analysis. For CrystalFlow, integration steps S = 100, 1000, and 5000 were employed, whereas, for Cond-CDVAE, a diffusion-based model, a larger integration step of S = 5000 was used. To assess the quality of the generated structures, all samples were subjected to DFT single-point calculations and local optimizations at the corresponding target pressure using the VASP package (see Sec. IV G for computational details). The relationship between the DFT-computed lattice stresses for the initial structures and the target pressures specified as generation conditions for every model is illustrated as scatter plots in Fig. 2a, while Fig. 2b presents the distributions of enthalpy differences between the structures generated by the two models, both before and after optimization.

As observed in Fig. 2a, with the exception of a few outliers, the structures generated by CrystalFlow across all integration steps exhibit significantly improved alignment with the target pressures compared to those generated by Cond-CDVAE. This underscores CrystalFlow's superior ability to learn and incorporate the effects of pressure within the generative modeling process. Consequently, the majority of CrystalFlow-generated structures, even with integration steps as low as S = 100, exhibit lower enthalpy than those produced by Cond-CDVAE (Fig. 2b), suggesting that CrystalFlow generates more physically plausible lattice and geometric configurations. After optimization, the distribution of enthalpy differences between the two models narrows, as the optimization process mitigates the discrepancies between the initial structures. This suggests that both models are effective in learning and incorporating essential structural information from the dataset. An additional comparison with the known lowest-enthalpy reference structures in the dataset is provided in Supplementary S3. The results are consistent with those presented in Fig. 2b.

In practice, neither CrystalFlow nor other state-of-the-art generative models can guarantee that the initially generated structures correspond to true energy minima. Consequently, further quantum-mechanical geometry optimization (e.g., using DFT) is an essential step to ensure the local stability of these structures. Notably, this optimization step is typically the most computationally demanding part of the structure prediction workflow. To assess the computational efficiency and practical utility of generative models, two key metrics are considered: the convergence rate and the number of ionic steps required during local optimization. The convergence rate reflects the percentage of generated structures that successfully reach a local energy minimum, while the number of ionic steps indicates the number of iterations needed to relax atomic positions and minimize the total energy. As shown in Table 2, structures generated by CrystalFlow generally achieve a higher convergence rate compared to those generated by Cond-CDVAE. Additionally, the number of ionic steps required for CrystalFlow structures decreases with increasing integration steps, suggesting that higher integration steps improve the quality of the generated samples. At an integration step of S = 5000, CrystalFlow requires 39.82 average ionic steps, which is lower than the 45.91 steps needed for Cond-CDVAE, indicating a 13.3% reduction in computational cost.

Further analysis reveals that ~55.2% of structures generated by CrystalFlow (after local optimization), as determined by StructureMatcher, are new, i.e., not present in the training set. For comparison, the MatterGen model achieves a newness rate of 61% in the DNG task. Although differences in training sets and evaluation tasks preclude a direct comparison, these results demonstrate that CrystalFlow achieves a structural newness rate comparable to state-of-the-art generative models.

In the previous test, we generated one sample for each randomly selected chemical system from the test set. To further evaluate the model's performance on a specific system, we conducted a case study using SiO2, a material known for its significant structural polymorphism. We generated 200 SiO2 structures at 0 GPa, each containing three formula units per unit cell (9 atoms), using CrystalFlow with integration steps S = 100, 1000, and 5000, and Cond-CDVAE with S = 5000. The average energy curves, calculated relative to the corresponding local minimum for each structure during the optimization process, are presented in Fig. 2c. These curves provide a quantitative measure of both the initial deviation of the generated structures from their local minima and the efficiency of their relaxation toward locally stable configurations. As shown in the figure, CrystalFlow consistently yields lower energy curves compared to Cond-CDVAE, indicating that structures generated by CrystalFlow are initially closer to their local minima and require less relaxation to achieve stability. This result is further supported by the faster convergence rates as a function of ionic steps, as shown in Supplementary S4. It is noteworthy that, in the later stages of optimization (after 40-50 steps), Cond-CDVAE exhibits slightly lower average energies and reduced standard deviations. Our analysis reveals that most structures converge within 50 ionic steps, and the remaining uncertainties in the energy curves are primarily due to a small subset of structures that are more challenging to optimize. For these difficult cases, CrystalFlow-generated structures tend to have higher relative energies, resulting in increased standard deviations. The high quality of the structures generated by CrystalFlow is further supported by the energy distribution of structures (with S = 100) before and after optimization shown in Fig. 2d, as well as by the convergence rate and number of ionic steps detailed in Table 2. With an integration step of S = 5000, CrystalFlow requires an average of 31.99 ionic steps, about 27.9% fewer than the 44.36 steps required by Cond-CDVAE.
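The two quoted reductions in ionic steps follow directly from the Table 2 averages at S = 5000:

```python
# Quick check of the ionic-step reductions quoted in the text, using the
# average ionic-step counts reported in Table 2 for S = 5000.
def pct_reduction(baseline, value):
    return 100.0 * (baseline - value) / baseline

mp_calypso = pct_reduction(45.91, 39.82)   # Cond-CDVAE vs CrystalFlow, test set
sio2 = pct_reduction(44.36, 31.99)         # SiO2 case study
print(round(mp_calypso, 1), round(sio2, 1))  # -> 13.3 27.9
```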

Finally, we evaluate the DNG performance of CrystalFlow on the MP-20 dataset to assess its potential for inverse materials design tasks. Initially, we train CrystalFlow on MP-20 without conditioning, and compare its performance with other models using common DNG metrics, including structural and compositional validity, coverage, and property statistics. The structural validity is defined as the percentage of generated structures in which all pairwise atomic distances exceed 0.5 Å. The compositional validity, on the other hand, refers to the percentage of generated structures with an overall neutral charge, calculated using SMACT. Coverage quantifies the structural and compositional similarity between the test set and the generated structures, with a detailed definition given in Sec. IV H. For property statistics, we evaluate the similarity between the test set and the generated structures in terms of density ρ and number of elements N, using Wasserstein distances (wdist). We also measure the model's ability to generate stable, unique, and new materials, denoted as the SUN rate. A structure is deemed stable (S.) if its energy above hull, relative to the Matbench Discovery convex hull, is negative. Among these stable structures, one is considered new (N.) if it does not appear in the training set. Furthermore, among all stable and new structures, a structure is classified as unique (U.) if it is distinct from all others. The SUN rate was calculated after geometric optimization of the sampled structures via pre-relaxation with CHGNet, followed by DFT relaxation using the same settings as prior studies.
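The structural validity criterion can be sketched as a minimum-image distance check over the 27 nearest periodic cells (a simplified illustration in our own code, not the benchmark's reference implementation; it assumes lattice vectors stored as columns of L):

```python
import numpy as np

def structurally_valid(F, L, cutoff=0.5):
    """Return True if all pairwise interatomic distances exceed `cutoff`
    (Angstrom), scanning the nearest periodic images of each atom pair.
    Simplified sketch of the structural validity criterion."""
    N = F.shape[1]
    shifts = np.array([[i, j, k] for i in (-1, 0, 1)
                                 for j in (-1, 0, 1)
                                 for k in (-1, 0, 1)])  # 27 neighbor cells
    for p in range(N):
        for q in range(p + 1, N):
            df = F[:, q] - F[:, p]
            # Cartesian distances to every nearby periodic image of atom q.
            d = np.linalg.norm(L @ (df[:, None] + shifts.T), axis=0)
            if d.min() < cutoff:
                return False
    return True

L = 4.0 * np.eye(3)
F_ok = np.array([[0.0, 0.5], [0.0, 0.5], [0.0, 0.5]])
F_bad = np.array([[0.0, 0.05], [0.0, 0.0], [0.0, 0.0]])  # atoms 0.2 A apart
print(structurally_valid(F_ok, L), structurally_valid(F_bad, L))  # -> True False
```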

The statistical results for 10,000 randomly generated structures are presented in Table 3 and demonstrate that CrystalFlow achieves performance comparable to that of existing models across various metrics. Although its compositional validity is marginally lower than that of other approaches, CrystalFlow achieves the smallest wdist for density, underscoring its capability to generate structures with more physically reasonable lattice parameters. CrystalFlow also achieves competitive stability and SUN rates among state-of-the-art models. For example, the SUN rate of CrystalFlow (3.7%) is higher than that of DiffCSP (3.3%) and FlowMM (2.3%), but lower than FlowLLM (4.7%) and ADiT (4.7%). This comparison highlights both the strengths of our approach and potential avenues for further improvement, such as training on more comprehensive datasets (as in MatterGen), adopting advanced model architectures based on transformers (as in ADiT), or integrating large language models with diffusion/flow-based frameworks.

To demonstrate CrystalFlow's capability to generate structures with targeted properties, the model is further trained on the MP-20 dataset with formation energy (E) as a conditioning label. Subsequently, 10,000 structures are generated for each specified formation energy value, conditioned on E = 0, −1, −2, −3, and −4 eV/atom, respectively. To facilitate efficient evaluation, energy calculations and geometric optimization are performed using CHGNet, a universal and computationally efficient interatomic potential. Notably, after local optimization, ~93.5% of the generated structures were identified as new (i.e., absent from the training set, as determined by StructureMatcher), highlighting CrystalFlow's strong generative capability.

We compared the distributions of E for the generated structures produced by CrystalFlow and Con-CDVAE, a state-of-the-art model for conditional DNG tasks, as shown in Fig. 3. The generated structures exhibit E distributions that are generally shifted in accordance with the specified target values; however, the centers of these distributions tend to be displaced toward higher-energy regions. Following geometric optimization, the distributions become more closely aligned with the target values. This improvement is particularly pronounced for target E values that are well-represented in the training dataset. These results highlight the effectiveness of CrystalFlow in generating structures with targeted properties and its ability to leverage training data to improve the accuracy of conditional generation.

A quantitative evaluation of model performance is presented in Supplementary S5, reporting the mean absolute error (MAE) and root mean square error (RMSE) between target and generated formation energies. While the generated energy distributions generally follow the target values, significant deviations persist, particularly for underrepresented targets in the training data. The MAE ranges from 0.67 to 2.02 eV/atom before optimization and improves to 0.17-0.82 eV/atom after optimization, underscoring the challenge of high-fidelity conditional generation with current crystal generative models. Notably, the Con-CDVAE model achieves superior performance, with formation energy distributions more tightly centered on target values, as shown in Fig. 3. This enhanced accuracy is attributed to both the advanced architectural design of Con-CDVAE and the incorporation of a property predictor, which filters latent variables for structure generation at a ratio of 20,000:200. These findings suggest that incorporating a property predictor could further enhance the conditional generation capability of CrystalFlow in practical material design applications.
