Introduction

In recent years, ultrafast spectroscopic techniques, including pump-probe transient absorption1,2, solid state high-harmonic generation (HHG)3, terahertz spectroscopy4, and time-resolved angle-resolved photoemission spectroscopy (TR-ARPES)5, have emerged as crucial techniques for studying novel non-equilibrium physics in condensed matter systems. Interpreting the complex features observed in these experiments requires a predictive understanding of the roles of electron-electron and electron-phonon interactions and other many-body effects under highly non-equilibrium and strong-field conditions. Electron-electron (and electron-hole) interactions, in particular, play a key role in the coherent femtosecond dynamics of solid state systems, leading to a wide variety of phenomena, including light-induced superconductivity, many-body enhancement of HHG, coherent and incoherent excitonic effects in ARPES, and exciton-enhanced shift currents3,6,7,8,9,10,11,12. In the last decades, there has been an intense effort to simulate these driven non-equilibrium electronic processes from first principles13,14,15,16,17,18. One highly successful approach is the time-dependent adiabatic GW (TD-aGW)8,9,13,19, which is a non-equilibrium quantum master equation approach that simulates the dynamics of the single-electron reduced density matrix and has successfully described diverse phenomena, including transient absorption19, TR-ARPES9, and HHG with exciton effects7. However, a significant challenge lies in the computational demands of these simulations, which are prohibitively slow for large systems and for parameter sweeping, which is necessary to describe different driving-field conditions in ultrafast and strong-field spectroscopic measurements.

There has been growing interest in leveraging data-driven and machine learning techniques to accelerate first-principles many-body simulations, using methods such as low-rank approximations20,21,22,23,24,25,26,27, representation learning28,29,30,31,32,33, equivariant graph neural networks34,35,36 and other deep learning approaches37,38,39,40. For non-equilibrium simulations, data-driven prediction based on extrapolation from an existing time sequence, using methods such as dynamic mode decomposition (DMD)41,42,43 or Gaussian processes44, are commonly employed. However, since these approaches are extrapolative, they fail to describe situations where the external field changes dramatically outside the initially observed time interval, which is crucial for modeling ultrafast spectroscopy. Operator learning neural networks45,46,47,48,49,50 show promise for solving differential equations under more general conditions. However, such operator learning requires extensive training datasets for different driving conditions and still suffers from time-accumulated predictive errors and a black-box process that can make it difficult to extract physical understanding45,51,52. Thus, the development of new data-driven techniques that can accelerate calculations of many-body dynamics under highly non-equilibrium conditions and are robust against both rapid changes in the external field and error accumulation over the simulation time is urgently needed.

Here, we develop a data-driven low-rank approximation for the electron-hole kernel—the key interaction that encodes exciton effects across different many-body approaches—based on singular value decomposition (SVD). It exploits the inherent localization of excitonic effects within valleys in reciprocal space, which gives rise to a low-rank structure for the electron-hole interaction kernel across crystalline materials. In practice, we find that the kernel can be compressed by as much as 95% in most crystalline solids, while retaining information essential for reproducing both optical absorption spectra, for excitons with near-zero momentum, and energy loss spectra for finite momentum excitons. Moreover, we apply the SVD-compressed electron-hole kernel to accelerate TD-aGW calculations, achieving an order of magnitude speedup for calculations of both linear and nonlinear spectra under different driving conditions. This speedup makes repeated calculations under different field conditions computationally tractable, allowing first principles simulations to mirror experimental conditions. In contrast with approaches based on extrapolation from early-time dynamics, our physically interpretable approach exhibits no accumulated predictive error over time and requires no datasets or training, laying the groundwork for high-throughput simulations of nonequilibrium light-driven dynamics in materials.

Results

Two-particle electron-hole kernel in non-equilibrium dynamics

In interacting many-body systems, the equation of motion of the non-equilibrium single-particle Green’s function takes the form:

$$\left({\rm{i}}\frac{{\rm{d}}}{{\rm{d}}t}-H(t)\right)G(t,{t}^{{\prime} })=\delta (t,{t}^{{\prime} })+{\int}_{C}\Sigma (t,\bar{t})G(\bar{t},{t}^{{\prime} }){\rm{d}}\bar{t},$$
(1)

where t, \({t}^{{\prime} }\) lie on the Keldysh contour; \(G(t,{t}^{{\prime} })\) is the two-time Green’s function; Σ is the self-energy matrix in the Keldysh formalism53; and H(t) is a mean-field Hamiltonian. There is an equivalent adjoint equation for the time-evolution over \({t}^{{\prime} }\). Following the Langreth rules, one can derive a set of self-consistent equations from Eq. (1) for the retarded, advanced, lesser, and greater two-time Green’s functions on the real-time axis, known as the Kadanoff-Baym equations (KBE)53,54,55. The self-energy can be computed within conserving schemes, such as the GW approximation56,57,58—i.e., Σ = iGW, where W is the screened Coulomb potential commonly approximated by the random phase approximation (RPA)59,60.

One challenge in solving Eq. (1) is that the time evolution of G over t and \({t}^{{\prime} }\) must be performed simultaneously. In first principles calculations, a common approximation is to decouple the many-body effects in equilibrium from the changes induced by the driving field by splitting the self-energy into an equilibrium piece Σ[G0], plus a correction term δΣ[G, G0] = Σ[G] − Σ[G0], where G0 is the Green’s function in the absence of an external field. Then, G0 is evaluated within the equilibrium GW approximation56,57,58, while the correction is evaluated in the static limit of the GW approximation (i.e., the static Coulomb hole and screened exchange approximation, or static-COHSEX) which ignores retardation effects in ΣGW and thus eliminates the double time dependence13,19,56,57,58. In this approximation, the non-equilibrium dynamics of the system can be described by the equation of motion of the reduced single-particle density matrix, expressed here in a Bloch basis, \({\rho }_{nm{\bf{k}}}=\left\langle \right.n{\bf{k}}| \rho | m{\bf{k}}\left.\right\rangle\)13:

$$\begin{array}{lll}{\rm{i}}\frac{\partial }{\partial t}\rho {(t)}_{nm{\bf{k}}}\;=\;{\left[{H}^{{\rm{QP}}}+\delta {\Sigma }^{GW}(t)-e{\bf{E}}(t)\cdot {\bf{r}},\rho (t)\right]}_{nm,{\bf{k}}}\\ \qquad\qquad\quad=\;\left({\epsilon }_{n{\bf{k}}}^{{\rm{QP}}}-{\epsilon }_{m{\bf{k}}}^{{\rm{QP}}}\right)\rho {(t)}_{nm{\bf{k}}}+{\left[\delta {\Sigma }^{GW}(t),\rho (t)\right]}_{nm{\bf{k}}}-e{\bf{E}}(t){[{\bf{r}},\rho (t)]}_{nm{\bf{k}}},\end{array}$$
(2)

where the quasiparticle Hamiltonian HQP = H0 + ΣGW[G0], and δΣGW is the non-equilibrium change in the static-COHSEX self-energy, in which time-dependent changes to the screened Coulomb interaction are typically neglected13. E(t) is an arbitrary external electric field, which couples to the system in the length gauge through the position operator r61,62. The dipole approximation is made, so there are no finite momentum transitions, and the reduced single-electron density matrix only needs one crystal momentum index k. We refer to Eq. (2) as the time-dependent adiabatic GW (TD-aGW)8,9,13.

In the linear response limit, TD-aGW can be explicitly proven to be equivalent to the GW-Bethe-Salpeter equation (GW-BSE) approach in many-body perturbation theory (Fig. 1a, b)13,63. In GW-BSE, exciton eigenstates are found by diagonalizing an effective Hamiltonian consisting of the non-interacting electron-hole pair Hamiltonian plus the BSE electron-hole interaction kernel, which is derived by taking the functional derivative \(K=\frac{\delta \Sigma }{\delta G}\) of the GW self-energy and then ignoring both the non-equilibrium time evolution of the screened Coulomb interaction W and the frequency dependence of the screened Coulomb interaction63,64,65. We note that these approximations are consistent with the approximations made to the GW self-energy in TD-aGW. Therefore, under the static-COHSEX approximation for the single-electron self-energy in TD-aGW, δΣGW is given by the matrix product of the BSE kernel K and the non-equilibrium lesser Green’s function, i.e.,

$$\delta {\Sigma }_{mn{\bf{k}}}^{GW}=\sum _{{m}^{{\prime} },{n}^{{\prime} },{{\bf{k}}}^{{\prime} }}{K}_{mn{\bf{k}}{m}^{{\prime} }{n}^{{\prime} }{{\bf{k}}}^{{\prime} }}\delta {\rho }_{{m}^{{\prime} }{n}^{{\prime} }{{\bf{k}}}^{{\prime} }}.$$
(3)

The non-equilibrium electron dynamics are therefore completely controlled by the two-particle BSE kernel \({K}_{{n}_{1}{n}_{2}{\bf{k}}{n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }}\) (Fig. 1). When finite momentum electron-hole pairs are created, \({K}_{{n}_{1}{n}_{2}{\bf{k}}{n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }}\) has to be replaced by \({K}_{{n}_{1}{n}_{2}{\bf{k}},{n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }}^{{\bf{Q}}}\), where k and \({{\bf{k}}}^{{\prime} }\) are the momenta of the hole after and before electron-hole interaction, and Q is the center of mass momentum of the electron-hole pair. The kernel is composed of two terms (Fig. 1b)66,67,68: an exchange interaction mediated by the bare Coulomb interaction v:

$$\begin{array}{rcl}&&{K}_{{n}_{1}{n}_{2}{\bf{k}}{n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }}^{{\rm{x}},{\bf{Q}}}:= \langle {n}_{1}{n}_{2}{\bf{k}}{\bf{Q}}| {K}^{{\rm{x}}}| {n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }{\bf{Q}}\rangle \\ &=&\mathop{\sum}\limits _{{\bf{G}}}{M}_{{n}_{1}{n}_{2}}({\bf{k}},{\bf{Q}},{\bf{G}})v({\bf{Q}}+{\bf{G}}){M}_{{n}_{3}{n}_{4}}^{* }({{\bf{k}}}^{{\prime} },{\bf{Q}},{\bf{G}}),\end{array}$$
(4)

and a direct interaction mediated by the screened Coulomb interaction

$$\begin{array}{rcl}&&{K}_{{n}_{1}{n}_{2}{\bf{k}}{n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }}^{{\rm{d}},{\bf{Q}}}:= \langle {n}_{1}{n}_{2}{\bf{k}}{\bf{Q}}| {K}^{{\rm{d}}}| {n}_{3}{n}_{4} {{\bf{k}}}^{{\prime} }{\bf{Q}}\rangle \\ &=&-\mathop{\sum}\limits_{{\bf{G}}{{\bf{G}}}^{{\prime} }}{M}_{{n}_{1}{n}_{3}}({{\bf{k}}}^{{\prime} }+{\bf{Q}},{\bf{q}},{\bf{G}}){W}_{{\bf{G}}{{\bf{G}}}^{{\prime} }}({\bf{q}}){M}_{{n}_{2}{n}_{4}}^{* }({{\bf{k}}}^{{\prime} },{\bf{q}},{{\bf{G}}}^{{\prime} }).\end{array}$$
(5)

Here, the matrices are written in the electron-hole pair basis, where \(| n{n}^{{\prime} }{\bf{k}}{\bf{Q}}\left.\right\rangle ={| n{\bf{k}}+{\bf{Q}}\left.\right\rangle }_{{\rm{electron}}}\otimes {| {n}^{{\prime} }{\bf{k}}\left.\right\rangle }_{{\rm{hole}}}\); v(Q + G) denotes the bare Coulomb interaction, where G is a reciprocal lattice vector representing how much the momentum transfer exceeds the first Brillouin zone; \({W}_{{\bf{G}}{{\bf{G}}}^{{\prime} }}({\bf{q}})\) is the screened Coulomb interaction; \({\bf{q}}={\bf{k}}-{{\bf{k}}}^{{\prime} }\); and \({M}_{n{n}^{{\prime} }}({\bf{k}},{\bf{q}},{\bf{G}}):= \langle n{\bf{k}}+{\bf{q}}| {{\rm{e}}}^{{\rm{i}}({\bf{q}}+{\bf{G}})\cdot {\bf{r}}}| {n}^{{\prime} }{\bf{k}}\rangle\). In the current work (and previous first principles implementations66,69), the electron and hole states are Kohn-Sham (KS) states70 derived from density functional theory (DFT) in a plane-wave basis.

Fig. 1: Feynman diagrams illustrating the relation between TD-aGW and BSE.
figure 1

Double solid black lines are electron or hole quasiparticle propagators. Double blue dashed lines correspond to RPA-screened Coulomb interactions, and single blue dashed lines correspond to the bare Coulomb interaction. A cross means that the two ends connected to the cross are at the same time, corresponding to the calculation of physical observables like the dipole moment. Orange wavy lines correspond to the external driving field. a Linear response of TD-aGW with (top) and without (bottom) the coupling to the external field. Note that the linear response of TD-aGW is a ladder diagram approximation which is equivalent to GW-BSE. b \({K}_{{n}_{1}{n}_{2}{\bf{k}}{n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }}^{{\bf{Q}}}\) is the irreducible BSE kernel in the electron-hole pair basis. It is composed of a screened direct term and an unscreened exchange term that correspond exactly to the ladder diagrams in (a). c Example of a nonlinear diagram of TD-aGW contributing to the nonlinear response of \({\delta}{\rho}_{c_{2}v_{1}}\) involving \({K}_{{c}_{1}{c}_{2}{\bf{k}}{c}_{1}{c}_{2}{{\bf{k}}}^{{\prime} }}\), which will never appear in the equilibrium BSE. d Schematic of the physical process in (c): the external driving first causes occupation of the conduction bands (corresponding to the blue highlighted part of the conduction band), and then an electron-hole pair between two conduction bands form and coherently transitions to another electron-hole pair due to renormalization by \({K}_{{c}_{1}{c}_{2}{\bf{k}}{c}_{1}{c}_{2}{{\bf{k}}}^{{\prime} }}\). The polarization of the latter electron-hole pair (where the electron is on c2 and the hole is on v1) is then given by (c).

We note that the equilibrium BSE is a ladder diagram approximation and can be viewed as a self-energy correction to the free electron-hole pair, and therefore only matrix elements in the form of \({K}_{cv{c}^{{\prime} }{v}^{{\prime} }}\), \({K}_{cv{v}^{{\prime} }{c}^{{\prime} }}\), \({K}_{vc{v}^{{\prime} }{c}^{{\prime} }}\) and \({K}_{vc{c}^{{\prime} }{v}^{{\prime} }}\) appear (indexing convention in Fig. 1b), corresponding to interactive renormalization of the resonant and anti-resonant parts of the free electron-hole pair propagator, where c and v run over the conduction subspace and the valence subspace of the system, respectively64,65. Nonlinear TD-aGW, on the other hand, does not have such constraints, and the whole \({K}_{{n}_{1}{n}_{2}{\bf{k}},{n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }}^{{\bf{Q}}}\) matrix contributes to the time evolution (Fig. 1c). The intuition is that in nonlinear response, the external driving field can first pump electrons to conduction bands and then create electron-hole pairs from the occupied conduction band (Fig. 1d). In non-equilibrium BSE, which can be seen as the incoherent limit of TD-aGW in a pump-probe scheme, these components also appear14.

SVD analysis of electron-hole kernel

The electron-hole kernel matrix has a size of \({N}_{{\rm{b}}}^{4}{N}_{{\bf{k}}}^{2}{N}_{{\bf{Q}}}\), where Nb, Nk and NQ denote the number of bands, the number of k-points, and the number of momenta Q of the electron-hole pairs, respectively. For a typical solid-state semiconductor, Nk is on the order of 103–106 to capture fine features of the exciton wavefunction arising from dispersive bands, while the number of bands Nb is 1–102, for features of interest near the bandedge71,72,73,74. Here, we start by only considering coupling to an external field in the dipole approximation, so NQ = 1, but more generally, when scattering to finite momenta is allowed NQ is of the same order as Nk. The kernel matrix is therefore numerically demanding to evaluate and also poses computational challenges for downstream tasks.

In the equilibrium BSE, the exciton envelope wavefunction is often localized in a low-energy region of the Brillouin Zone (i.e., a valley) so that the BSE can be solved on a nonuniform patch in k-space73,74,75,76,77,78. However, in TD-aGW (Eq. (2)), the external field can drive intraband transitions for the excited electrons and holes, and their motion may no longer be restricted to a single clearly-defined region of k-space, determined by the exciton eigenstate (Fig. 1d). The occurrence of exciton patches nonetheless hints at a low-rank structure of the BSE kernel (SI §2), which is explored in this section.

We start by focusing on the photo-excited electron-hole kernel \({K}_{0}:= {K}_{{n}_{1}{n}_{2}{\bf{k}},{n}_{3}{n}_{4}{{\bf{k}}}^{{\prime} }}^{{\bf{Q}} = 0}\in {{\mathbb{C}}}^{{N}_{{\bf{k}}}{N}_{{\rm{b}}}^{2}\times {N}_{{\bf{k}}}{N}_{{\rm{b}}}^{2}}\) of monolayer MoS2, a prototypical 2D material known for hosting strong exciton effects72 (See Methods for more details). K0 can be separated into contributions from transitions between different bands, which we refer to as channels \({K}_{0}^{{\rm{ch}}}\in {{\mathbb{C}}}^{{N}_{{\bf{k}}}\times {N}_{{\bf{k}}}}\), where the channel, ch, is defined as (n3, n4) → (n1, n2). A detailed analysis of individual channels (see SI §2) reveals that off-diagonal channels \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\), involving transitions between different sets of bands (i.e., n1, n2 ≠ n3, n4), are low-rank. However, diagonal channels \({K}_{0}^{{\rm{diag}}\,\text{-}\,{\rm{ch}}}\) between the same set of bands (i.e., n1, n2 = n3, n4) are close to full-rank because of large diagonal (\({\bf{k}}={{\bf{k}}}^{{\prime} }\)) elements. As a result, the entire K0 has a full-rank structure, making it difficult to achieve significant compression when applying a naive SVD directly to K0.

To address this, we further decompose K0 into parts that are diagonal in k, \({K}_{0}^{{\rm{diag}}}\) (i.e., \({n}_{1},{n}_{2},{\bf{k}}={n}_{3},{n}_{4},{{\bf{k}}}^{{\prime} }\)), and off-diagonal in k, \({K}_{0}^{{\rm{off}}}\) (i.e., \({n}_{1},{n}_{2},{\bf{k}}\ne {n}_{3},{n}_{4},{{\bf{k}}}^{{\prime} }\)). \({K}_{0}^{{\rm{diag}}}\) contributes to most of the non-zero singular values but only accounts for a small portion \(1/{N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}}\) of the entire K0. The primary computational challenge lies in \({K}_{0}^{{\rm{off}}}\), which is low rank and can be effectively compressed by SVD. The final low-rank approximation is achieved by performing SVD on \({K}_{0}^{{\rm{off}}}\) and retaining only the first z largest singular values so that the kernel is approximated as

$${\tilde{K}}_{0}\approx {K}_{0}^{{\rm{diag}}}+{\tilde{U}}_{0}{\tilde{M}}_{0}{\tilde{V}}_{0}^{T}$$
(6)

where \({\tilde{M}}_{0}\in {{\mathbb{C}}}^{z\times z}\), \({\tilde{U}}_{0}\in {{\mathbb{C}}}^{{N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}}\times z}\) and \({\tilde{V}}_{0}^{T}\in {{\mathbb{C}}}^{z\times {N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}}}\) are the truncated singular value and singular vector matrices for \({K}_{0}^{{\rm{off}}}\). This low-rank approximation can achieve a compression rate of \(z/{N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}}\), where z is numerically determined from the computational convergence of K0 reconstructed from \({\tilde{K}}_{0}\). The sorted singular values of both the entire K0 and \({K}_{0}^{{\rm{off}}}\) are shown in Fig. 2a. After an initial rapid decay, the singular values of K0 converge to a non-zero value, preventing accuracy-preserving truncation. The singular values of the \({K}_{0}^{{\rm{off}}}\), on the other hand, rapidly decay to zero, thus supporting our assumption that (6) is an efficient low-rank approximation of K0.

Fig. 2: SVD compression of the BSE kernel.
figure 2

a Decay of sorted singular values for the electron-hole kernel K0 of monolayer MoS2. The red dots represent the singular values of the full K0, and the blue dots represent the singular values of the off-diagonal part \({K}_{0}^{{\rm{off}}}\). b Zoomed in logscale decay of the sorted singular values for each channel \({K}_{0}^{{\rm{ch}}}\). The red dots represent the diagonal channels \({K}_{0}^{{\rm{diag}}\,\text{-}\,{\rm{ch}}}\), and the blue dots represent the off-diagonal channels \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\). c–e Heat maps of the minimum ratio of singular values required to reconstruct each channel with an accuracy of R2 = 0.98 for K0, the direct interaction \({K}_{0}^{{\rm{d}}}\), and the exchange interaction \({K}_{0}^{{\rm{x}}}\). The mean truncation ratios for the SVD of the \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\) are 12%, 14%, and 2% respectively. f The minimum number of z (preserved singular values) for reconstructing K0 starting from different k-grids of 12 × 12 × 1, 24 × 24 × 1, 36 × 36 × 1, and 48 × 48 × 1. The reconstruction accuracy ranges from R2 = 0.80 to 0.98, represented by different colors. The blue labels 31%, 15%, 12%, and 8% denote compression rate with reconstruction accuracy of R2 = 0.98 for different k-grids.

Next, to quantitatively analyze the rank of each individual channel in K0, we conduct channel-wise SVD on each \({K}_{0}^{{\rm{ch}}}\) over k and \({{\bf{k}}}^{{\prime} }\), then preserve the \({z}^{{\prime} }\) largest singular values:

$$\tilde{{K}_{0}^{{\rm{ch}}}}\approx {\tilde{U}}_{0}^{{\rm{ch}}}{\tilde{M}}_{0}^{{\rm{ch}}}{\left({\tilde{V}}_{0}^{{\rm{ch}}}\right)}^{T}$$
(7)

where \({\tilde{M}}_{0}^{{\rm{ch}}}\in {{\mathbb{C}}}^{{z}^{{\prime} }\times {z}^{{\prime} }}\), \({\tilde{U}}_{0}^{{\rm{ch}}}\in {{\mathbb{C}}}^{{N}_{{\bf{k}}}\times {z}^{{\prime} }}\) and \({({\tilde{V}}_{0}^{{\rm{ch}}})}^{T}\in {{\mathbb{C}}}^{{z}^{{\prime} }\times {N}_{{\bf{k}}}}\) are the truncated singular value and singular vector matrices for each \({K}_{0}^{{\rm{ch}}}\). Figure 2b shows the decay of the singular values for each \({K}_{0}^{{\rm{ch}}}\). As previously noted, the singular values of \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\) decay much faster than the \({K}_{0}^{{\rm{diag}}\,\text{-}\,{\rm{ch}}}\). Figure 2c shows the minimum ratio of singular values \({z}^{{\prime} }/{N}_{k}\) required to reconstruct each \({K}_{0}^{{\rm{ch}}}\) with a high accuracy of R2 = 0.98 (\({R}^{2}=1-{\Sigma }_{i}{({y}_{i}-{\hat{y}}_{i})}^{2}/{\Sigma }_{i}{({y}_{i}-{\bar{y}}_{i})}^{2}\), where yi and \({\hat{y}}_{i}\) are real and predicted values). The \({K}_{0}^{{\rm{diag}}\,\text{-}\,{\rm{ch}}}\) requires a high SVD truncation ratio of 95% for reconstruction, while the mean truncation ratio for the \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\) is only 12%. Figure 2d, e respectively show the channel-wise truncation rate of the direct \({K}_{0}^{{\rm{d}}}\) and exchange \({K}_{0}^{{\rm{x}}}\) interaction only. Notably, the exchange interaction is highly compressible, requiring a truncation ratio of only 2%. The direct term alone is less compressible with a mean truncation ratio of 14% for off-diagonal channels.

Eqs. (6) and (7) can both be utilized to compress the BSE kernel and accelerate TD-aGW simulations. As the number of bands increases, the SVD compression scheme described in Eq. (6) poses challenges for both calculations, which scale as \({\mathcal{O}}({({N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}})}^{3})\), and memory. Alternatively, the time complexity of channel-wise SVD (Eq. (7)) is \({\mathcal{O}}({N}_{{\rm{b}}}^{4}{({N}_{{\bf{k}}})}^{3})\), and a high overall SVD compression rate can still be achieved as non-compressible diagonal channels only account for \(1/{N}_{{\rm{b}}}^{2}\) of K0. Moreover, channel-wise SVD is inherently well-suited for parallel implementation and scalability.

Lastly, we investigate the SVD compression for different sizes of the k-grid. Figure 2f shows the average minimum number of z required to reconstruct \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\) for varying k-grid sizes. Notably, for a given accuracy requirement, the number of z tends to converge as the size of the k-grid increases, indicating that additional k-points do not proportionally contribute new information to the K0. This key observation ensures computational feasibility even when an extremely dense k-grid is required for BSE and TD-aGW, where k-grid convergence is a major bottleneck. Moreover, since z remains independent of Nk, the compression ratio \(z/{N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}}\) will decrease with increasing Nk, significantly accelerating calculations with large k-grids. We explain this by the fact that excitonic binding has a limited interaction range (SI §5).

SVD acceleration of TD-aGW

Next, we apply the SVD-compressed kernel to accelerate TD-aGW calculations. In Eq. (2), the time-dependent Hamiltonian H(t) = HQP − eE(t) ⋅ r + δΣ(t) is updated at each time step. The major computational bottleneck lies in evaluating the change in the COHSEX self-energy δΣ(t) (Eq. (3)), which has a complexity of \({\mathcal{O}}({N}_{{\rm{b}}}^{4}{N}_{{\bf{k}}}^{2})\) per time step. Further analysis of the time complexity is given in SI §1. We can reduce this complexity by taking advantage of the low-rank structure of \({K}_{0}^{{\rm{off}}}\). We plug the SVD truncated kernel (Eq. (6)) into the TD-aGW Hamiltonian:

$$\begin{array}{lll}\tilde{H}(t)\;\approx \;{H}^{{\rm{QP}}}-e{\bf{E}}(t)\cdot {\bf{r}}+{K}_{0}^{{\rm{diag}}}\delta \rho (t)+{\tilde{U}}_{0}{\tilde{M}}_{0}{\tilde{V}}_{0}^{T}\delta \rho (t)\\ \qquad\;\;=\;{H}^{{\rm{diag}}}(t)+{\tilde{S}}_{0}\left({\tilde{V}}_{0}^{T}\delta \rho (t)\right)\end{array}$$
(8)

Here, the diagonal part of Hamiltonian \({H}^{{\rm{diag}}}(t)={H}^{{\rm{QP}}}+{K}_{0}^{{\rm{diag}}}\delta \rho (t)-e{\bf{E}}(t)\cdot {\bf{r}}\) is time-dependent but can be evaluated in a negligible \({\mathcal{O}}({N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}})\). \({\tilde{S}}_{0}={\tilde{U}}_{0}{\tilde{M}}_{0}\) is time-independent, so the acceleration is achieved by first projecting the density matrix ρ(t) onto the low-rank subspace spanned by the truncated singular vectors \({\tilde{V}}_{0}^{T}\), then mapping it back to the full Hilbert space via \({\tilde{S}}_{0}\). The time complexity of this term is reduced to \({\mathcal{O}}({N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}}\times 2z)\), accelerating TD-aGW by a factor of \({N}_{{\rm{b}}}^{2}{N}_{{\bf{k}}}/2z\) compared to the full kernel case. We note that with a given accuracy, z is almost independent of Nk in the large Nk limit, and therefore the SVD-compression approach is particularly powerful with a dense k-grid, which is needed for convergence of the BSE kernel72.

As an initial benchmark, we validate the SVD-accelerated approach by comparing the linear optical absorption spectrum obtained from TD-aGW using full K0 to \({\tilde{K}}_{0}\) with different SVD truncation ratios. We conduct TD-aGW simulations for 150 fs with a time step of 0.125 fs, and a narrow Gaussian pulse of width 0.125 fs centered at 0.125 fs is applied to excite the system. The absorption spectrum A(ω) is obtained by a Fourier transformation of the time-dependent polarization P(t)8. Figure 3a shows the linear absorption spectrum of monolayer MoS2 calculated with the full K0 and SVD subspace truncated to include 20%, 5% and 1% of singular values. Notably, the absorption spectrum with a 5% SVD subspace (resulting in a 10× speedup) can accurately reproduce the result from the full calculation with less than 0.04 eV error in the position of the first exciton peak. When the SVD truncation threshold is decreased to 1% of singular values, there is a significant blue shift of the absorption spectrum, which arises from the information loss in the direct Coulomb interaction in the kernel.

Fig. 3: Applying SVD compression to absorption spectra based on TD-aGW calculation of monolayer MoS2.
figure 3

a The optical absorption spectrum of monolayer MoS2 in arbitrary units (a.u.) is calculated in TD-aGW linear response after excitation by a narrow Gaussian electric pulse. The TD-aGW simulations are conducted with the full K0 (dark) and the \({\tilde{K}}_{0}\) that includes 20% (red), 5% (orange), and 1% (blue) of singular values. The (b) second- and (c) third-order response intensity of monolayer MoS2 to incident electric field frequencies from 0.4 to 3.2 eV using full K0 and the \({\tilde{K}}_{0}\). The external field is a sine wave with a field amplitude of 2.36 × 107 V/cm2.

The SVD-accelerated TD-aGW method is robust to different setups and parameters of the external driving field due to the preservation of the main physical interaction. To demonstrate this, we perform a parameter sweeping of the incident field frequency ω. Figure 3b–c shows the second harmonic generation (SHG) and the third harmonic generation (THG) intensity of monolayer MoS2 corresponding to plane-wave incident light with frequencies ranging from 0.4 to 3.2 eV with an interval of 0.04 eV. As we expect, the SVD-accelerated TD-aGW with 5% truncation is still enough to capture both SHG and THG spectrum patterns, while the 1% truncation kernel results in an obvious deviation from the ground truth, in line with the linear response results.

Next, to evaluate the performance of SVD-accelerated TD-aGW in the nonperturbative strong-field regime, we conduct simulations with the external field used in previous nonlinear optical experiments3 and calculations7: \({\bf{E(t)}}={{\bf{E}}}_{{\bf{0}}}\cdot {\sin }^{2}(\pi t/{T}_{{\rm{pulse}}})\cdot \sin (\omega t)\), where the electric field amplitude E0 is 2.36 × 107 V/cm2, Tpulse is 100 fs, and ω is the incident pulse frequency. The simulation length is 100 fs with a time step of dt = 0.1 fs. Figure 4a–c shows the time-dependent occupation ∑kρnnk(t)/Nk of the first conduction band c1 in monolayer MoS2 with excitation field energies in the gap (0.35 eV), resonant with the lowest bright exciton (1.80 eV), and above the continuum (2.90 eV). For the in-gap excitation (Fig. 4a), the maximum excited-state population fills less than 1.5% of the conduction band, while the occupation increases to 25% for excitation above the continuum (Fig. 4c). We note that the external driving field in this scenario is strong, and proliferation of carriers may lead to considerable non-equilibrium renormalization of screening19. Figure 4 serves primarily as a proof of principle that SVD compression is still numerically reliable when the population change in the density matrix is large, and we leave possible dynamic updates to the screening in the non-equilibrium self-energy in TD-aGW to future studies. For below-gap excitations, a SVD truncation ratio of 1% (50× speedup) is sufficient to accurately capture both the occupation and harmonic yield. When the SVD truncation ratio is lowered to 0.1%, a substantial deviation from the ground truth is observed, indicating that the \({K}_{0}^{{\rm{off}}}\) is not negligible even if it is much smaller than the \({K}_{0}^{{\rm{diag}}}\). However, as the excitation energy increases (Fig. 4b, c), the numerical deviation between the 0.1% SVD-accelerated TD-aGW from the ground truth becomes smaller. A possible explanation is that \({K}_{0}^{{\rm{off}}}\) is more important for in-gap excitations, where the small energy of the incident photon can drive transitions between different valence bands or conduction bands, whereas the larger driving fields are resonant with specific valence to conduction band transitions.

Fig. 4: Applying SVD compression to high-harmonic generation spectroscopy of monolayer MoS2.
figure 4

TD-aGW simulations in the strong-field regime with the full K0 (black) vs. SVD subspace with different singular value truncation ratios of 5% (red), 1% (orange), and 0.1% (blue). Time-dependent electronic occupation of the first conduction band c1 in MoS2 with driving field frequencies of (a) 0.35, (b) 1.80, and (c) 2.90 eV. The occupation percentage is computed as \(\frac{1}{{N}_{{\bf{k}}}}{\sum }_{{\bf{k}}}{\rho }_{{c}_{1},{c}_{1},{\bf{k}}}(t)\), where Nk is the number of k points. d–f The high-harmonic generation (HHG) spectrum for the same driving field energies as (a–c), respectively.

Importantly, compared to previous data-driven methods and operator learning approaches for solving the KBE43,44, our SVD-accelerated TD-aGW avoids the time-accumulated errors that quickly cause the prediction to diverge from the ground truth. This arises from two key factors: i) the SVD-accelerated TD-aGW leverages the linear low-rank structure of the kernel matrix to retain most of the physical couplings for non-equilibrium dynamics; and ii) our SVD compression is independent of the external driving field, whereas supervised learning methods may struggle to capture dynamic behaviors driven by different external fields that deviate significantly from the training set.

To demonstrate the effectiveness of SVD-accelerated TD-aGW in the frequency domain, we calculate the high-harmonic generation (HHG) spectrum by \(\,\text{HHG}\,(\omega )={\left\vert \omega \int{\bf{J}}(t){{\rm{e}}}^{-{\rm{i}}\omega t}{\rm{d}}t\right\vert }^{2}\)7, where the time-dependent current is calculated by \({\bf{J}}(t)={\rm{Tr}}(\rho (t){\bf{v}})\) and v is the velocity operator. Figure 4d shows the logarithmic magnitude of the HHG spectrum with an incident pulse frequency of ω = 0.35 eV. The HHG calculation using 5% truncated kernel agrees well with the ground truth. The 0.1% truncation can only reproduce the intensity of the 1st-order response, but the relative strength of the high harmonics are still faithfully reproduced. As shown in Fig. 4e, f, when the incident photon frequencies are increased to 1.8 and 2.9 eV, even the low truncation ratio of 0.1% can still agree well with the ground truth for the high-harmonic generation due to the decreased contribution from the \({K}_{0}^{{\rm{off}}}\) in the high-frequency region of the incident pulse. We note, however, that while the above gap pump is an informative numerical test, it does not reflect physically realistic conditions, as the pump intensity required to produce high harmonics would burn the sample. For the parameter sweeping of the angle-dependent HHG spectrum see SI §3.

SVD compression for GW-BSE within TDA

Up to now, we have used monolayer MoS2 as a testbed for SVD compression. Here, we explore material dependence and demonstrate that the optical absorption can be reliably calculated with \({\tilde{K}}_{0}\) across extended crystalline materials. In Fig. 5a, b, we apply channel-wise SVD to the GW-BSE kernel in the Tamm-Dancoff approximation (TDA) for monolayer MoS2 and black phosphorus, which are both 2D semiconductors with exciton binding energies on the order of several hundred meVs and exciton wavefunctions confined to low-energy valleys in k-space75,76,79,80. Calculation details are given in Methods. As Fig. 5a, b illustrates, electron-hole kernel matrices of MoS2 and black phosphorus exhibit similar low-rank structure, with an average 5% SVD compression of \({K}_{0}^{{\rm{off}}}\) being sufficient to accurately reproduce the linear absorption spectra.

Fig. 5: Comparison between results of SVD compression to absorption spectrum calculation using equilibrium BSE in various materials.
figure 5

The absorption spectrum calculated through the equilibrium GW-BSE formalism within the Tamm-Dancoff approximation of (a) monolayer MoS2, (b) monolayer black phosphorus, and (c) a Benzene molecule. In (a, b) SVD compression is conducted over the momentum space \(| {\bf{k}}\left.\right\rangle\) of \({K}_{0}^{{\rm{ch}}}\) for crystal systems. The BSE is calculated with the full K0(black), \({K}_{0}^{{\rm{diag}}\,\text{-}\,{\rm{ch}}}\)+5%\({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\)(red), and \({K}_{0}^{{\rm{diag}}\,\text{-}\,{\rm{ch}}}\) only (blue). c SVD compression is conducted over the band-to-band transition channels \(| {n}_{1},{n}_{2}\left.\right\rangle\) of the molecule system due to lack of periodicity. The BSE is calculated with the full K0(black), \({K}_{0}^{{\rm{diag}}\,\text{-}\,{\rm{ch}}}\)+50% \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\)(red), +20% \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\)(orange), +5% \({K}_{0}^{{\rm{off}}\,\text{-}\,{\rm{ch}}}\)(blue).

We have studied electron-hole interactions in extended systems, where we perform a channel-wise SVD compression over k-points. Lastly, we consider a benzene molecule in the gas phase, which lacks periodicity but converges slowly with respect to the number of bands81. Here, without the crystal momentum k degree-of-freedom, the SVD can be only applied to the transition channels \(| {n}_{1},{n}_{2}\left.\right\rangle\). However, even with a modest compression rate of 50%, compression over bands introduces unphysical peak splitting, highlighting that while SVD compression works well over k-points, it is fundamentally unsuitable for molecular systems that lack periodicity.

Finally, we note that SVD compression can be trivially generalized to a finite momentum BSE kernel (SI §4), suggesting a route for further accelerating dynamcis calculations simulating scattering between finite momentum states.

Discussion

In summary, we reveal a low-rank structure of the electron-hole kernel, which arises due to the localization of excitonic effects in momentum space in crystalline systems. As a consequence of the low-rank structure, we show that the full kernel matrix can be compressed to at least 5% of its original size, in semiconductors with exciton binding energies on the order of a few hundreds of meV. Notably, we show that the size of the subspace of singular values kept after SVD compression, capturing key information about the electron-hole kernel, tends to converge as the k-grid size increases, ensuring computational tractability even for dense k-grids. Leveraging this general low-rank property, we apply it to accelerate non-equilibrium TD-aGW calculations, achieving a speedup of an order of magnitude in calculating the time-dependent density matrix, angle-resolved HHG in the strong-field regime, and optical absorption spectrum in the weak-field limit for monolayer MoS2. Here, we demonstrate SVD acceleration in the context of zero-momentum excitations in TD-aGW, but our formalism can be trivially generalized to finite momentum BSE calculations, where we show that 5% of SVD kernel data can generate electron loss spectroscopy comparable to the ground truth (see SI §4). Our method avoids time-accumulated prediction errors seen in other data-driven and operator learning methods for dynamic calculation of nonequilibrium many-body effects. It also requires no training data and is agnostic to the choice of crystal system and external driving field, regardless of whether the driving field lies above or below the bandgap or in the weak or strong-field regime. Our approach provides a robust, efficient, and broadly applicable framework for accelerating many-body calculations, offering new possibilities for studying nonequilibrium excitonic and electronic dynamics with high accuracy and reduced computational cost.

Methods

DFT and equilibrium G W-BSE calculations

In this work, all DFT calculations are performed using Quantum ESPRESSO69 with the Perdew-Burke-Ernzerhof (PBE) functional82, and all GW-BSE calculations are performed using BerkeleyGW66,67,83. For zero-momentum GW-BSE calculations of MoS2, the planewave cutoff for the DFT kinetic wavefunctions, dielectric matrix used in GW calculations, and kernel matrix are 95, 20, and 5 Ry respectively. 6000 bands are included in constructing the dielectric matrix. The K0 of MoS2 includes 2 conduction and 2 valence bands, which are sampled on a uniform k-grid of 36 × 36 × 1 over the first Brillouin zone. For black phosphorus, the planewave cutoffs for the DFT kinetic energy, RPA dielectric matrix, and kernel matrix are 95, 20, and 5 Ry respectively. 1000 bands are included in constructing the dielectric matrix. The K0 of black phosphorus includes 4 valence and 4 conduction bands, which are sampled on a k-grid of 20 × 20 × 1. For the benzene molecule, the DFT planewave cutoff energy is 90 Ry. 600 bands are included in evaluating the dielectric matrix and kernel matrix with a planewave cutoff energy of 40 and 10 Ry respectively.

TD-aGW

The theory of TD-aGW has been outlined in the main text. The implementation we use is the same as that in refs. 8,9. The time-integration is performed with a time step of 0.1 fs using a fourth order Runge-Kutta method.

Long range exchange and the definition of response functions

The G = 0 contribution in Eq. (4), which corresponds to the long-range, macroscopic Coulomb interaction, is often zeroed out in actual simulations so that the E field in Eq. (2) is the total macroscopic electric field (including the macroscopic electric field radiated by the electrons in the material) and not just the external electric field. This means the linear and nonlinear response from Eq. (2) are calculated with respect to the total macroscopic electric field, which is consistent with the way material-specific constitutive relations are defined in electrodynamics in media, where D = ϵ0E + P(E) and E is the total electric field, with no distinction between the “external” electric field and the electric field from the material’s response being made. The linear response from Eq. (2) (which is equivalent to BSE; see the discussion in the main text) with the long-range Coulomb interaction zeroed out, therefore, directly gives the dielectric function, while the linear response from Eq. (2) with the long-range Coulomb interaction corresponds to the inverse of the dielectric function. This fact is also discussed in SI § 4.