Frequently Asked Questions

What is the UserWarning: Unused (redundant) file: warning?

kuibit scans recursively the simulation directory to find what data is available and organizes it. The warning UserWarning: Unused (redundant) file is emitted when multiple different files claim to contain the same data (for instance, two grid data files containing the same iteration). A common example that lead to such scenario is restarting a simulation when the previous checkpoint could not be produced. More specifically, for instance, output-0000 ran successfully and produced a checkpoint, the simulation is restarted from there and data is saved in output-0001. However, the checkpoint could not be created (for example, because of quota issues or other crashes). When the simulation is restarted again, a new folder output-0002 is produced and the run starts from the checkpoint associated to output-0000. In this case, output-0002 will initially contain overlapping iterations as in output-0001. If output-0001 is not removed, kubit will issue a warning.

The default policy in kuibit is to use the most recent data.

How do I access grid data as NumPy arrays?

kuibit defines its own high-level types to keep complexity under control. Sometimes, however, one wants to have full control over how the data is stored and manipulated. Suppose grid_fun is the grid function for which you want to access the data over for the refinement level i and component j as NumPy array. grid_fun[i][j] is a UniformGridData, a higher-level object that contains data, coordinates, and additional metadata. For here, grid_fun[i][j].data is the NumPy array with the actual data.

Some of the attributes in SimDir are slow

When you instantiate a SimDir, kuibit will go through all the files and organize them. Next, when you ask for specialize your request (e.g., you want to work with grid data), kuibit will isolate the relevant files and analyze them more carefully. This can take a long time if you have several thousands of files. In some cases, kuibit has to open each single file to determine its content (e.g., kuibit scans the headers of your files to see what variables are inside). Opening files has some system overhead that cannot be reduced.

To solve this problem, you have to reduce the number of files in your directory. There are two ways to do this:

  1. you create a new folder filled with files obtained by concatenating the corresponding files across different simulations restarts. For example, if your simulation has 10 restarts, each one with the same file rho.x.asc, you can concatenate all these files in a new rho.x.asc in the new folder. This will decrease the file count by a factor of 10 (the number of restarts).

  2. you create new folders only containing the data you want to study. Let us assume that you are only interested in gravitational wave data. You can create a new folder that follows the same structure of your simulation directory, and create links to the files you are interested in. Then, when you use kuibit from this second folder, there will be far fewer files to analyze.

However, if your data is not changing, the best way to deal with this problem is with pickles: you can always load and save SimDir objects to reduce the amount of work that needs to be done (SimDir.save(), load_SimDir()).

refinement_levels_merged() is too slow and/or requires too much memory

When working with grid data, it is tempting to deal with HierarchicalGridData using refinement_levels_merged() so that you do not have to deal with all the complexity of the various refinement levels. However, what refinement_levels_merged() does is to reduce everything to the highest resolution. If you have several refinement levels in a large grid, this would require hundreds of terabytes!

refinement_levels_merged() is provided as a convenience function for small simulations. For larger simulations, you have to work directly with HierarchicalGridData (which fully support all the various mathematical operations and other various methods), until you want to plot the result. When are ready to plot, you should use the method to_UniformGridData() instead of refinement_levels_merged(). With this function you can control the region where you want to focus and the resolution that you want to work with. In this way, you can reduce the number of computations needed and make the problem tractable.

I want to pre-process Psi4 before computing the strain

GravitationalWavesOneDet contains methods to compute quantities from Psi4, but sometimes is desirable to perform some operations on Psi4 first. The easiest way to do so is to create a new GravitationalWavesOneDet with the new data. For instance, to smooth Psi4 (with the Savitzky-Golay filter):

from kuibit.cactus_waves import GravitationalWavesOneDet

data = []

for mult_l, mult_m, ts in wav:
    data.append([mult_l, mult_m, ts.savgol_smoothed(window_size=11)])

new_wav = GravitationalWavesOneDet(wav.dist, data)

where wav is the old GravitationalWavesOneDet, new_wav the new one.

Another common operation is cropping the data (e.g., to remove junk radiation). You can use the same approach, or use directly the ~.:py:meth:crop or ~.:py:meth:cropped methods.