Grid functions

Other than series (time and frequency), grid functions are probably the other most important quantity that we extract from simulations with the Einstein Toolkit. In this page, we describe how to use kuibit to handle grids. (Reference on kuibit.grid_data). The main object that we use to represent grid function is the HierarchicalGridData. This represents data defined on a grid with multiple refinement levels. On each level, data is represented as UniformGridData. While you will likely never initialize these objects directly, it is useful to be aware of what they are and what they can do. If you want to know how to read data into kuibit, jump to the second half of this page.

UniformGrid

The most basic concept that we need to work with grid function is the concept of UniformGrid, which represents an uniform cell-centered Cartesian grid in arbitrary number of dimensions. An object of type UniformGrid is immutable and is defined by the location of the origin of the grid, by the number of points along each dimension, and by either the opposite corner or the spacing. UniformGrid are important because they are are the building blocks of grids with refinement levels and because they are the most natural grid to plot.

Let’s see how to define a UniformGrid:

import kuibit.grid_data as gd

box = gd.UniformGrid(
            [101, 201],  # shape: 101 (201) points along the x (y) direction,
            [0, 0],  # origin, at 0, 0 (cell-centered)
            x1=[10, 20]  # other corner, at (10, 20)
            )

This is a two dimensional grid where the bottom left corner is (0, 0), and the top right one is (10, 20). There are 101 points on the x direction and 201 on the y. The grid is cell-centered, so the coordinates of the points will be the integers. Instead of specifying the other corner with respect to the origin, you can specify the size of each cell by providing the dx parameter.

box = gd.UniformGrid(
            [101, 201],  # shape: 101 (201) points along the x (y) direction,
            [0, 0],  # origin, at 0, 0 (cell-centered)
            dx=[1, 1]  # cell size
            )

UniformGrid are used as part of grids with refinement levels, so they can house additional information, like time, num_ghost, ref_level. In most cases, it is not necessary to work directly with these quantities.

Some useful attributes to know about UniformGrid are: - x0 and x1 return the two corners of the grid, - dx or delta return the cell size, - dv returns the volume of a cell, and volume returns the total volume, - num_dimensions returns how many dimensions are in the grid, num_extended_dimensions returns how many dimensions are in the grid with more than one grid point.

When you initialize a grid with a flat dimension, you must specify x0 and dx (you cannot do it by specifying x0 and x1, because there is no x1!). In general, prefer providing x0 and dx instead of x0 and x1.

You can use the in operator to see if a coordinate is inside the grid. The operation considers the size of the cell, for example

box = gd.UniformGrid([101, 201], [0, 0], delta=[1, 1])

[5, 5] in box  # True
[-1, 2] in box  # False

The contains() is syntactic sugar for the same operation.

To obtain all the coordinates in the grid, you can use the coordinates() method. This can be used in three different ways. When called with no arguments, the output is a list of 1D arrays. Each of these arrays contains the coordinates along a fixed axis. For example, for the 2D grid, the first array will be the x coordinates, the second the y. Finally, with as_meshgrid=True, the return value will be a NumPy meshgrid. This is useful for plotting. When as_same_shape=True the return value is a list of coordinates with the same shape of the grid itself, each element of this list is the value of that coordinate over the grid. This last one is the most useful way to do computations that involve the coordinates. You can obtained the coordinate as a list of coordinates along each direction also with the method coordinates_1d().

To obtain a coordinate from a multidimensional index, just use the bracket operator (box[i, j]).

UniformGrid may have dimensions that are only one point (e.g., when simulating a plane). We call extended_dimensions those that have more than one grid point. You can return a new UniformGrid with removed all the dimensions that are not extended using the method flat_dimensions_removed.

You return a new UniformGrid with coordinates shifted with shifted(). You can also remove the ghost zones with ghost_zones_removed(). This will return a new UniformGrid with no ghost zones.

You can also print a UniformGrid object to have a full overview of the properties of the grid.

The functions coordinates_to_indices() and indices_to_coordiantes() can be used to convert from indices to coordinates for the considered grid. You can pass single points, or collection of points. If you provide coordinates, the returned indices will be those of the closest grid points.

UniformGridData

Once we have a grid, we can define data on it. UniformGridData packs together a UniformGrid and data defined on it. This is the most basic form of a grid function. There are two ways to define UniformGridData, first from a UniformGrid and a NumPy array with matching shape, or from the details of the grid along with the data (again, as a NumPy array with matching shape):

box = gd.UniformGrid([101, 201], x0=[0, 0], delta=[1, 1])

data = np.array([i * np.linspace(1, 5, 201) for i in range(101)])

# First way
ug_data1 = gd.UniformGridData(box, data)

# Second way
ug_data2 = gd.from_grid_structure(data, x0=[0, 0], delta=[1, 1])

UniformGridData shares the same basic infrastructure as the classes TimeSeries and FrequencySeries (they are derived from the same abstract class BaseNumerical). This means that all the mathematical operations are defined, such as, adding two UniformGridData, or taking the exponential with np.exp.

ug_data3 = np.exp(ug_data1) / ug_data2

Mathematical operations are performed only if the two UniformGridData have the same underlying grid structure. UniformGridData also support N-dimensional Fourier transforms with the fourier_transform() method.

UniformGridData can be sliced to lower dimensional UniformGridData. To do this, use the meth:~.slice method. This function takes a cut paramter which is a list of the same lenght as the dimension of the data. The elements of cut are None for the dimensions you want to keep and are the coordinate of where you want to slice. For example, if you have 3D data and you want to only look at the line with x=1 and y=2, then, cut has to be [1, 2, None]. You can cut in arbitrary places and optionally enable the resample option to obtain the values with a multilinear interpolation instead of approximating the point with the closest available.

As TimeSeries, UniformGridData can be represented as splines (constant or linear). This means that the objects can be resampled or can be called as normal functions. Computing splines is an expensive operation that can take several seconds if the grid have thousands of points.

Splines allow you to use the UniformGridData as a normal function. Suppose rho is a grid function. You can either use the bracket operator to find the value of rho corresponding to specific indices (rho[i, j]), or you can call rho with the coordinate where you want to evalue it (rho(x)). When there are flat dimensions, the only possible splines are with nearest neighbors. You can use a multilinear interpolation on the extended by removing the flat dimensions with flat_dimensions_remove().

Some basic useful functions are mean(), integral(), norm1(), or norm2(). In general, there’s a norm_p(), computed as

\[\| u \|_p = \left( \Delta v \sum \|u \| \right)^{(1/p)}\]

with \(\Delta v\) being the volume of a cell.

UniformGridData can be differentiated along a direction with partial_differentiated(), or the gradient can be :py:calculated with meth:~.grid_data.UnfiromGridData.gradient. In both cases, :py:the order of the derivative can be specified. The derivative are numerical :py:with finite difference. Derivative are second order accurate everywhere.

A convenient function is sample_function(). This takes a multivariate function (e.g., \(sin(x + y)\)) and returns a UniformGridData sampling that function. If you already have the grid structure, you can use sample_function_from_uniformgrid().

Another useful function is histogram(), which can be used to compute histograms of UniformGridData with weights or without. Similarly, one can compute percentiles with percentiles(). The input of this function can either be relative (percentuals, as 0.01, 0.5, or so, if you enable relative=True), or the actual number of points.

You can resample the data to a new grid using the function resampled(), which takes as input a UniformGrid and returns a new UniformGridData resampled on the new grid. If the new grid is outside the old one, you can either raise an error, of fill the points outside with zeros. This behavior is controlled by the flag ext. When ext=1, zeros are returned, when it is 2, ValueError is raised. By default, resampled() uses a multilinear interpolation, but you can force to use a piecewise constant interpolation with the nearest neighbors by setting piecewise_constant=True.

Another useful feature is to dx_changed(), which can be used to return a new UniformGridData with different grid spacing. The new grid spacing has to be an integer multiple or an integer factor of the old one. With this function you can upsample or downsample data. This is especially useful when dealing with refinement levels, which typically have spacing related by factors of 2. dx_changed() takes an optional argument piecewise_constant to prescribe how the resampling should be done.

Often, it is useful to save a UniformGridData and read it later. UniformGridData can be saved as ASCII files with the save() method, which takes a path and writes an ASCII file to that destination. The file contains a header that specifies the grid information. The data is always saved as as 1D array (due to the limitations of the backend). These files can be read with the load_UniformGridData() function. For large datasets, it is convinent to compress the file. To do this, just provide a file extension that is compressed (e.g., .dat.gz).

To access the data (ie, for plotting), you can simply use .data. This is a standard numpy array. Alternatively, you can use the .data_xyz attribute, which swaps rows and columns (.data_xyz is coordinates-indexed, .data is matrix-indexed).

Warning

Arrays are stored row-first, so if you want to use .data, to have a natural mapping between coordinates and indices you have to transpose the data! (See, this blog post for an explanation.)

When working with 1D grid data, you can transform the UniformGridData into a GridSeries. This is different type of object that shares the same properties and infrastructure as TimeSeries and FrequencySeries. GridSeries are typically simpler and more direct to use (for example, you can plot them directly with plt.plot(rho)).

With UniformGridData, it is possible to undo reflection symmetry across a given dimension. The relevant method is reflection_symmetry_undone() (which takes the parity of the data as optional argument). This will only work if the grid crosses 0 along the given dimension and if the resulting grid is a valid equispaced grid (so 0 has to be part of the coordinates, or the data has to be such that it yields a uniform grid). Similarly, it is possible to undo rotational symmetries (pi-symmetry) using the rotation180_symmetry_undone() method. This function takes two mandatory arguments, the first is the dimension for which the data has to be filled (in Carpet it is always the x axis, so 0), and the second is the plane over which the rotation happens specified by providing the two dimensions involved (in Carpet rotation is always about the z axis, so this parameters should always be (0, 1)). The sign of the filled data can be flipped by passing -1 to the parity keyword (this is useful for vectors and tensors).

HierarchicalGridData

A HierarchicalGridData represents data defined on a mesh-refined grid. In practice, this is a collection of UniformGridData, roughly one per level. You can work directly with the UniformGridData on the different levels using the brackets operator. As for UniformGridData supports all the mathematical operations.

In many cases, one works with a nested series of refinement levels, with a domain that is split in multiple patches. Hence, the output data will also be in multiple chunks. When initializing an HierarchicalGridData, kuibit will make an effort to put all the different patches back together. If the provided components cover an entire grid, kuibit will merge them. In doing this, all the ghost zone information is discarded. If kuibit finds that the provided components do not cover a regular grid, then it will leave them untouched. This is the case when one has multiple refinement centers (for example in binary simulations). HierarchicalGridData is essentially a dictionary that maps refinement level to lists of UniformGridData that represent the different patches. In case kuibit manages to combine all the patches, then the list will have only one element. You can print a HierarchicalGridData to see what the structure looks like:

print(rho)

# The output will look like
#
# Available refinement levels (components):
# 0 (1)
# 1 (3)
# 2 (2)
# 3 (2)
# Spacing at coarsest level (0): [640. 640.]
# Spacing at finest level (3): [0.01 0.01]

You can access the relative level using the bracket operator (e.g. rho[0][0] is rho on the coarsest level on the 0th patch, which could be the only one). The two level of brackets are (in order): refinement level, then component. In many cases, the grid structure is simple and there are no multiple refinement centers, so one can access the level with :py:meth:~.get_ref_level. This method will work only if there’s a single component.

As for UniformGridData, HierarchicalGridData are callable and splines are used to interpolate to the requested points. This operation can be expensive, especially for 3D grids with many points. The way calling works is the following: we find the finest refinement level that contains the requested point, and we use the multilinear interpolation on that level (and component, if there are multiple components).

Using splines, we can also combine the various refinement levels to obtain a UniformGridData. This is often handy when plotting. The method refinement_levels_merged() does exactly that. By default, refinement_levels_merged() does not resample the data, but simply uses the values on the grid. If the argument resample is set to True, the data is resampled with a multilinear interpolation. One can also specify what grid (as UniformGridData) to merge the data on by calling the method to_UniformGridData() or to_UniformGridData_from_grid(). This is especially useful when resampling on smaller grids, because it drastically reduces the computation time.

Warning

Operations that involve resampling can be very expensive and require a lot of memory!

Another useful method is the coordinates(), which returns a list of HierarchicalGridData with the same structure as the one in consideration but with values the various coordinates at the points. This is useful for computations that involve the coordinates.

As it is the case for UniformGridData, also HierarchicalGridData can be differentiated along a direction with partial_differentiated(), or the gradient can be calculated with gradient(). In both cases, the order of the derivative can be specified. The derivative are numerical with finite difference. The result is a HierarchicalGridData or a list of HierarchicalGridData (for each direction).

Both HierarchicalGridData and UniformGridData have several useful methods. For instance, coordinates_at_maximum() can be used to find what is the coordinate where the data has its maximum.

Reading data

So far, we have discussed how grid functions are represented in kuibit. In this section, we discuss how to read the output data from simulations as HierarchicalGridData or UniformGridData.

At the moment, kuibit fully support reading HDF5 files of any dimension (1D, 2D, and 3D). kuibit can also read ASCII files, but the interface is less robus and not as well-tested.

Warning

kuibit works better with HDF5 data. In general, reading and parsing HDF5 is orders of magnitude faster than ASCII data. kuibit can read one iteration at the time in HDF5 data, but has to read the entire content of all the files when the data is ASCII. This can take a long time. HDF5 are also much more storage-efficient and contain metadata that can be used to better interpret the data (e.g., the number of ghost zones). For these reasons, we strongly recommend using HDF5 files.

Warning

The ASCII reader should be considered experimental. If reads the files line by line and will likely not fail if the data is not exactly in the format that the reader expect. You may find unexpected results. If you use the ASCII reader, make sure to test it!

Warning

The ASCII reader works by scanning all the files line by line. This can take an extremely long time if you have many files with a lot of iterations. If you want to speed up the process, consider isolating the files you are interested in working with in a separate directory, and run SimDir in that folder.

From SimDir

The easiest way to access grid data is from SimDir. SimDir objects contain an overview of the entire data content of a directory. For more information about SimDir, read Getting started with SimDir.

Assuming sim is a SimDir, the access point to grid functions is in sim.gf or sim.grid_functions. You can find all the available variables just by printing this object

print(sim.gf)

# The output will look like
#
# Available grid data of dimension 1D (x):
# ['P', 'rho', 'rho_star', 'vz', 'Bz', 'By', 'vx', 'rho_b', 'vy', 'Bx']
#
# ... and so on ...

sim.gf is an object of type GridFunctionsDir. The main role of this class is to organize the available files depending on their dimensions. So, from GridFunctionsDir you can specify what dimensions you are interested in. You can do this in two ways, as a dictionary call, or via an attribute. For example, if you are interested in 2D data on the xy plane:

# All these methods are equivalent
data2d = sim.gf.xy
data2d = sim.gf['xy']
data2d = sim.gf[(0, 1)]

In case you want a lower dimensional cut (say, you want only the y axis and you have the xy data), you can always look at higher-dimensional data and slice it to your liking, as described in the above sections. HierachicalGridData can be sliced in the same way as UniformGridData.

Once you selected the data you are interested in, you will be working with a AllGridFunctions object. This is a dictionary-like object that organizes all the variables available for the requested dimensions. You can access the variables using the bracket operator of looking in the fields attribute. In case a variable is available as HDF5 file and as ASCII file, the HDF5 representation is preferred.

# These methods are equivalent
rho = sim.gf.xy['rho']
rho = sim.gf.xy.fields.rho

In case you are reading an ASCII file, you have to set the correct number of ghost zones. The simplest way to do this is to set the num_ghost() attribute. If the output does not contain ghost zones, set them to zero.

# If rho_star is from an ASCII file, we want to set num_ghost before
# reading it
ASCII_reader = sim.gf.xy
ASCII_reader.num_ghost = (3, 3)
rho_star = ASCII_reader.rho_star

num_ghost() has to be a tuple or a list with the same number of entries as the dimensionality of the grid: each entry is the number of ghost zones along a direction.

Warning

ASCII files do not have information about how many ghost zones are in the data, so we will assume that there are none. This can lead to imperfect results in the regions of overlap between two grid patches. In the future, we will try to read this value from the parameter file.

Finally, once you selected the variable, you will have a OneGridFunctionH5 or OneGridFunctionASCII object. These are derived from the same base class OneGridFunctionBase and share the interface. The main difference is how files are read (which justifies why we need to different classes). These objects are certainly the most interesting ones and the ones you will deal with most of the time.

At first level, OneGridFunctionH5 (we will consider this for definiteness, but the most of what said here holds true for OneGridFunctionASCII) is another dictionary-like object. The keys of this class are the various iterations available in the files. Hence, to read some data at a given iteration iteration, you can simply use the bracket operator. Alternatively, you can use the get_iteration() method:

# These methods are equivalent
rho0 = sim.gf.xy.rho[0]
rho0 = sim.gf.xy.rho.get_iteration(0)

You can find what iterations are available with the available_iterations() attribute. Similarly, you can find what times are available with available_times():

print(sim.gf.xy.rho.available_iterations)
print(sim.gf.xy.rho.available_times)

You can read a time instead of a iteration with the method get_time(). You can convert between time and iteration with the methods time_at_iteration() and iteration_at_time().

These methods return a HierarchicalGridData object with all the available data for the requested iteration. If HDF5 files are being read, the correct ghost zone information is being used. In case you want to work with a specific subgrid with uniform spacing, you can use the read_on_grid() method. This will return a UniformGridData object instead, with grid the grid you specify. The grid is specified by passing a UniformGrid object. For example

from kuibit.grid_data import UniformGrid

grid = UniformGrid([100, 100], x0=[0, 0], x1=[2,2])
rho0_center = sim.gf.xy.rho.read_on_grid(0, # iteration
                                         grid)

This method works by reading the entire grid structure and resampling onto the requested UniformGridData, so it may be slow for large 3D data.

Similarly, you can read a chunk of evolution from min_iteration to max_iteration on a specified grid with the method read_evolution_on_grid(). This returns a UniformGridData that has as first dimension the time, and as other dimensions the specified grid. So, this is a “spacetime” UniformGridData. With this function you can evaluate grid data on specific spacetime points with multilinear interpolation in space and time. This can also be used to generate additional time frames between two outputs.

OneGridFunctionH5 objects are iterable: you can loop over all the available iterations by iterating over the object.

Note

OneGridFunctionH5 objects cache information to avoid expensive read operations. This can lead to a growing memory usage in scripts when the same object is used multiple times (for example, to render a video reading multiple iterations). The method clear_cache() can be used to free up memory.

Warning

kuibit supports grid arrays, which are arrays with a fixed size, typically distributed among all the MPI processes. Carpet treats them in the same way as standard grid functions, so in kuibit there is no distinction between the two. This means that kuibit will assign some coordinates (which are read from the output), but these coordinates are meaningless.

Warning

kuibit supports files output with the option one_file_per_group set to yes. However, the maximize the performance it is best to set that option to no. kuibit has to open each single file to understand what variables are inside, and this impacts, especially for several or big files.