Solving Nonlinear Conservation Laws of Partial Differential Equations Using Graph Neural Networks

Nonlinear Conservation Laws of Partial Differential Equations (PDEs) are widely used in different domains. Solving these types of equations is a significant and challenging task. Graph Neural Networks (GNNs) have recently been established as fast and accurate alternatives to principled solvers when applied to standard equations with regular solutions. There have been few investigations of GNNs applied to complex PDEs with nonlinear conservation laws. Herein, we explore how GNNs perform on this class of problems.


Introduction
Machine learning methods have been widely used to solve PDEs in science and engineering, for example in aerodynamics [16,3], electromagnetism [13], geophysics [17], and weather prediction [1]. According to [18,7,14,10], GNNs have recently been introduced in this area and made much progress, offering faster runtimes than principled solvers. Compared to grid-based convolutional neural networks (CNNs) [20,21], GNNs have demonstrated better adaptivity to the simulation scenarios [22,2,5]. However, most currently published papers use GNNs to resolve PDEs with regular solutions, such as the wave equation, Poisson's equation, and the Navier-Stokes equations. These tests have demonstrated that GNNs can resolve such PDEs with high efficiency and precision. How would GNNs perform if we use them in the context of nonlinear conservation laws?
In this paper, we implement GNN variants to solve PDEs whose nonlinear flux function f(u, β) appears in general scalar nonlinear conservation laws. We restrict to the one-dimensional case given by

u_t + f(u, β)_x = 0,  (1)

where u = u(x, t) is the main variable and β is a parameter of a physical phenomenon. We train GNNs based on observation data u(x_j, t_i, β_k) on a spatial grid x_j, j = 1, ..., N_x, at specified times t_i, i = 1, ..., N_obs, with some specific parameters β_k, k = 1, ..., N_β. Using the learned GNNs, we predict the solutions of Eq. (1) given new values of the parameter β and a new initial state u_0. Compared to equations with regular solutions, Eq. (1) has some characteristics that make it more challenging to solve. Typically, Eq. (1) generates shock wave solutions u(x, t) in finite time, i.e., solutions that contain one or more discontinuities expressed as a jump (u_L, u_R) with u_L ≠ u_R, even though the initial value u_0(x) is smooth [8,6]. In particular, the specific form of f(u) in the interval [min(u_L, u_R), max(u_L, u_R)] is not used in the construction of the entropy solution, only the slope of the corresponding convex or concave envelope. As jumps arise and disappear in the solution over the period for which observation data is collected, the data may lack information about f(u). This situation is illustrated in Fig. 1.
In the left panel, we plot the flux function f(u) = u²/(u² + (1 − u)²). In the right panel, the entropy solution is shown after a time T = 0.5. At time t = 0, the initial data u_0(x) involves one jump at x = 0 and another at x = 1. The initial jump at x = 0 is instantly transformed into a solution that is a combination of a continuous wave (rarefaction wave) and a discontinuous wave (u_L, u_R) ≈ (0.3, 1.0), as dictated by the lower convex envelope shown in the left panel (green curve) [8]. Similarly, the initial jump at x = 1 is transformed into a combination of a continuous wave (rarefaction wave) and a discontinuous wave (u_L, u_R) ≈ (0.7, 0), as dictated by the upper concave envelope illustrated in the left panel (brown curve) [8]. From this example, we see that we have no observation data that can directly reveal the shape of f(u) in the interval u ∈ [0.3, 0.7] (approximately).
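As a concrete check on these numbers, the flux from Fig. 1 and the standard Rankine-Hugoniot shock speed s = (f(u_R) − f(u_L))/(u_R − u_L) can be evaluated directly. The sketch below is illustrative and not part of the paper's pipeline; the function names are ours:

```python
# Flux function from Fig. 1: f(u) = u^2 / (u^2 + (1 - u)^2)
def flux(u):
    return u**2 / (u**2 + (1.0 - u)**2)

def shock_speed(u_left, u_right):
    """Rankine-Hugoniot speed of a jump (u_L, u_R): s = (f(u_R) - f(u_L)) / (u_R - u_L)."""
    return (flux(u_right) - flux(u_left)) / (u_right - u_left)

# Speed of the discontinuous wave (u_L, u_R) ~ (0.3, 1.0) from the left initial jump
s = shock_speed(0.3, 1.0)
```

Only flux values at the two jump states enter this formula, which is exactly why the observation data carries no information about f(u) strictly between them.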

Related Works
Several machine learning approaches have been proposed to solve PDEs. Raissi et al. [15] introduced physics-informed neural networks (PINNs) for solving PDEs and learning their parameters. However, neural network methods struggle to learn the nonlinear hyperbolic PDE that governs two-phase transport in porous media [4]. The authors experimentally indicated that this shortcoming of PINNs for hyperbolic PDEs is not due to the specific architecture or to the choice of hyperparameters, but rather to the lack of regularity in the solution. Long et al. [11,12] proposed PDE-Net, which combines numerical approximations of differential operators with a symbolic multi-layer neural network. They employed convolutions to approximate differential operators and deep networks to approximate the nonlinear response. However, in our case, f(u)_x cannot be written as f'(u)u_x, since f(u) is in general not a differentiable function in our problem. Recently, more and more GNN methods are being used to solve PDE problems. Gao et al. presented a novel discrete PINN framework based on Graph Convolutional Networks (GCNs) and the variational structure of PDEs in [5]. This framework can solve forward and inverse PDEs in a unified manner. Zhao et al.
used GNNs in conjunction with autodecoder-style priors to tackle PDE-constrained inverse problems [22]. In [2], the authors combined GNNs with classical numerical methods, such as finite differences, finite volumes, and WENO schemes, to solve PDE problems. They experimentally showed that this method is fast, stable, and accurate across different domain topologies on various fluid-related flow problems. However, these current works mainly use GNNs on typical equations with regular solutions. Several complex equations with discontinuous solutions still need to be researched. In a recent work [9], we introduced a framework coined ConsLaw-Net that combines a symbolic multi-layer neural network and an entropy-satisfying discrete scheme to learn the nonlinear, unknown flux function f(u) without the parameter β.

Figure 2: The pipeline of the approach. For each node c_i at time point t_k, the last K solutions (u(x_i, t_{k−K+1}), ..., u(x_i, t_{k−1}), u(x_i, t_k)), the position x_i, the current time t_k, and the equation parameters β_PDE are fed into the GNN. After the Encoder, Processor, and Decoder operators, we get the information increment of node c_i at the next K time points: (d(x_i, t_{k+1}), d(x_i, t_{k+2}), ..., d(x_i, t_{k+K})). Finally, we obtain the solutions at the next K time points via the Readout step.

Graph Neural Networks (GNNs)
In this paper, we leverage GNNs with an encoder-processor-decoder structure to learn a fast forward model that predicts solutions of PDEs. We model the domain X as a graph G = (V, E) with nodes c_i ∈ V and edges l_ij ∈ E. The features of node c_i are denoted by f_i ∈ R^c, and the edges l_ij define local neighborhoods. An overview of our approach is shown in Fig. 2.
Encoder For each node c_i, the Encoder ε maps the last K solution values (u(x_i, t_{k−K+1}), ..., u(x_i, t_{k−1}), u(x_i, t_k)), the node position x_i, the current time t_k, and the equation parameters β_PDE to a node embedding vector.

Processor The Processor contains M layers with intermediate graphs G_m, m = 1, 2, ..., M. Eq. (3) and Eq. (4) give the message passing along the edges and the information update of node c_i, respectively.
• edge c_j → c_i message (Eq. (3)): an MLP φ applied to the solution difference u_i − u_j, the relative position x_i − x_j, and the node embeddings;
• node c_i update (Eq. (4)): an MLP ψ applied to the current embedding of c_i and the aggregated messages from its neighbors,

where N(i) holds the neighbors of node c_i, and φ and ψ are multilayer perceptrons (MLPs). Using relative positions x_i − x_j is justified by the translational symmetry of the PDEs we consider. Solution differences u_i − u_j make sense if we think of the message passing as a local difference operator, akin to a numerical derivative operator.
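A minimal numpy sketch of one such Processor layer, with the MLPs φ and ψ replaced by toy closures. All shapes and the closures themselves are illustrative assumptions, not the paper's trained architecture:

```python
import numpy as np

def message_passing_layer(u, x, h, edges, phi, psi):
    """One Processor layer: every edge c_j -> c_i carries a message built
    from the solution difference u_i - u_j, the relative position x_i - x_j,
    and the neighbor embedding h_j; each node then updates its embedding
    from the sum of its incoming messages."""
    msg_sum = np.zeros_like(h)
    for (i, j) in edges:                       # message c_j -> c_i
        msg_sum[i] += phi(u[i] - u[j], x[i] - x[j], h[j])
    return np.array([psi(h[i], msg_sum[i]) for i in range(len(h))])

# Toy stand-ins for the MLPs phi and psi (illustrative, not trained weights)
phi = lambda du, dx, h_j: h_j + du + dx
psi = lambda h_i, m: 0.5 * (h_i + m)
```

In the actual model, φ and ψ would be learned MLPs and the aggregation would run over the neighborhood N(i) defined by the graph.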
Decoder After the Processor, we use a shallow convolutional network to output the information increments for the next K timesteps at each grid point.

Readout After the GNN with its encoder-processor-decoder structure, we obtain the solutions at the next K time points from the increments d(x_i, t_{k+l}), where 1 ≤ l ≤ K.
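The exact readout formula is not reproduced here; one minimal assumption consistent with the text is a residual update, u(x_i, t_{k+l}) = u(x_i, t_k) + d(x_i, t_{k+l}). A sketch of that assumed form, at a single grid point:

```python
def predict_next_window(u_last, increments):
    """Assumed residual readout: u(x_i, t_{k+l}) = u(x_i, t_k) + d(x_i, t_{k+l})
    for l = 1..K, applied at one grid point. The increments come from the Decoder."""
    return [u_last + d for d in increments]
```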

Data Generator
We investigate a discretization of the spatial domain [0, L] in terms of {x_i}, i = 0, ..., N_x − 1, where x_i = (1/2 + i)Δx and Δx = L/N_x. Furthermore, we consider time lines {t^n}, n = 0, ..., N_t, with N_t Δt = T. The discretization of Eq. (1) is based on the Rusanov scheme [8], applied for j = 2, ..., N_x − 1 together with the Rusanov numerical flux. The Courant-Friedrichs-Lewy (CFL) condition [8] determines the magnitude of Δt for a given Δx; we detail the CFL condition in Algorithm 1. We illustrate how to generate the solution U = {u(x_j, t^n)} of the discrete conservation law Eq. (6) in Algorithm 2.
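The Rusanov (local Lax-Friedrichs) flux has a standard closed form, F_{j+1/2} = ½(f(u_j) + f(u_{j+1})) − ½ λ (u_{j+1} − u_j), with λ an upper bound on |f'(u)|. A minimal sketch (the argument names are ours):

```python
def rusanov_flux(f, u_left, u_right, lam):
    """Rusanov (local Lax-Friedrichs) numerical flux:
    F = 0.5*(f(u_L) + f(u_R)) - 0.5*lam*(u_R - u_L),
    where lam is an upper bound on |f'(u)| over the relevant range of u."""
    return 0.5 * (f(u_left) + f(u_right)) - 0.5 * lam * (u_right - u_left)
```

For linear advection f(u) = u with λ = 1, this reduces to simple upwinding, which is a quick sanity check on the formula.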

Experiment Setup
In this section, we study a class of nonlinear conservation laws that arise naturally from problems where one fluid is displaced by another fluid in a vertical domain. The displacement process involves a balance between buoyancy and viscous forces, where the parameter β ∈ (−400, 400) represents the balance between gravity (buoyancy) and viscous forces. We study the solutions of Eq. (1) for such flux functions. We use Algorithm 2 to generate 500 samples with different values of β and initial states. The values of β are randomly selected from (0, 300), and the initial states are from [0.0, 1.0]. We consider the observations in terms of x-dependent data at fixed times {t*_i}, i = 1, ..., N_obs, extracted from the solution U as a subset U_sub, where j = 1, ..., N_x, N_x = 400 with Δx = 0.025, and N_obs = 250 with Δt_obs = 0.008.
For training, we split U_sub into five parts of length 50 in terms of time. The values of u(x) at the first 25 time points are used to predict u(x) at the following 25 time points, i.e., K = 25 in Eq. (5). In testing, we use the first stage, which contains 25 time points of u(x), to predict the solutions of the second stage, and then predict u(x) in the third stage based on the predicted values of the second stage. This process is repeated until T = 2.
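The staged test-time procedure can be sketched as a simple autoregressive loop, where each predicted K-point block is fed back in as input for the next stage. The `model` callable below is a placeholder for the trained GNN:

```python
def rollout(model, first_window, n_stages):
    """Autoregressive prediction: starting from the first K observed time
    points, repeatedly map the current K-point window to the next one.
    `model` is a placeholder for the trained GNN (assumed window-to-window API)."""
    blocks = []
    window = first_window
    for _ in range(n_stages):
        window = model(window)   # predictions become inputs for the next stage
        blocks.append(window)
    return blocks
```

With K = 25 time points per stage and Δt_obs = 0.008, four predicted stages reach T = 2 as described above. Note that this feedback loop is exactly why prediction errors accumulate with time.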
All the examples show that the error between the exact and predicted values of u(x) increases with time. Notably, the model cannot predict the underlying trend at t = 1.6 and t = 2.0 in the case selected from G_1, while it performs very well on G_2 and G_3. This result is also reflected in Table 1. Fig. 5 shows another set of samples from G_1, G_2, G_3 with x ∈ [0, 10) and t ∈ [0.4, 2.0]. When the solution fluctuates, the model is prone to large prediction errors. This phenomenon is also reflected in Fig. 4 and is probably related to the discontinuity of the solutions of Eq. (1).
Experimental results show that the model predicts more accurately when β is within a specific range. However, when β deviates too much from the values used to train the model, the model's predictive power is significantly reduced.

Conclusion
In this work, we apply GNNs to solve PDEs that are unique in that their solutions usually contain shocks. Our experiments demonstrate that the GNN models can predict accurately when the parameter β is within a specific range. However, when the parameter deviates far from the values used for training, the model's predictive performance drops sharply. Furthermore, the model is not very good at predicting the discontinuities of the solutions: numerous small fluctuations appear around each discontinuity. One of our future directions is to explore how to improve the behavior of GNNs when handling discontinuous solutions, or to develop new methods to deal with PDEs containing shocks.

Algorithm 1 :
CFL
Input: L: length of the spatial domain; N_x: the number of spatial grid cells; f(u): the nonlinear flux function; T: computational time period.
Output: Δt: local time interval.
Function CFL(L, N_x, f(u), T):
    Δx = L/N_x
    M = max_u |f'(u)|
    dt = (3/4 Δx)/(M + 0.0001)
    n_time = ⌊T/dt⌋
    Δt = T/n_time
    return Δt
End Function
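Algorithm 1 translates directly into Python. The 0.0001 guard against division by zero and the 3/4 safety factor are taken verbatim from the algorithm; the final division by n_time rounds Δt so that T is an integer number of steps:

```python
import math

def cfl_time_step(L, n_x, df_max, T):
    """Algorithm 1 (CFL): choose the time step dt from the CFL condition,
    adjusted so that T is an integer multiple of dt.
    df_max stands in for max_u |f'(u)|."""
    dx = L / n_x
    dt_raw = (0.75 * dx) / (df_max + 0.0001)   # 3/4 safety factor from Alg. 1
    n_time = math.floor(T / dt_raw)
    return T / n_time
```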

Algorithm 2 :
DataGenerator
Input: T: computational time period; N_x: the number of spatial grid cells; L: length of the spatial domain; u_0 = {u_0(x_j)}, j = 1, ..., N_x: initial state of dimension N_x; f(u): the flux function.
Output: U = {u^n_j}: the solution based on the initial state u_0.
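A sketch of the time-marching loop behind Algorithm 2, combining the CFL step of Algorithm 1 with the Rusanov flux. The paper's exact boundary treatment is not shown, so the fixed boundary cells below are an illustrative assumption:

```python
import numpy as np

def generate_solution(u0, f, lam, L, T):
    """Algorithm 2 sketch: march the Rusanov scheme from u0 to time T.
    lam stands in for max_u |f'(u)|; boundary cells are held fixed
    (assumed treatment, for illustration only)."""
    n_x = len(u0)
    dx = L / n_x
    dt = (0.75 * dx) / (lam + 0.0001)          # CFL step from Algorithm 1
    n_t = int(np.floor(T / dt))
    dt = T / n_t                               # make T an integer number of steps
    U = [np.asarray(u0, dtype=float)]
    for _ in range(n_t):
        u = U[-1]
        # Rusanov flux at each interior cell interface
        F = 0.5 * (f(u[:-1]) + f(u[1:])) - 0.5 * lam * (u[1:] - u[:-1])
        u_new = u.copy()
        u_new[1:-1] -= (dt / dx) * (F[1:] - F[:-1])
        U.append(u_new)
    return np.array(U)
```

A quick sanity check: a constant initial state must be preserved exactly, since all interface fluxes are then equal.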

Figure 4: Solutions of Eq. (1) based on different initial states and values of β at time points t = 0.8, 1.2, 1.6, and 2.0, respectively. In each subplot, the solid red line is the true solution obtained by Algorithm 2, and the blue dashed line is the solution predicted by the trained GNNs.

Figure 5: Solutions of Eq. (1) based on different initial states and values of β at times t ∈ [0.0, 2.0]. The left subplot shows the real solutions of Eq. (1), the middle subplot shows the solutions generated by the GNNs, and the right subplot shows the error between the real and predicted u(x, t).

Table 1:
The value of the MSE for each group. We calculate the MSE between the true and predicted u(x, t) over all samples with x ∈ [0, 10] and t ∈ [0.4, 2.0] in each group.