# Masking invalid data

It is often useful to ignore specific pieces of data. For example, it is wise to
exclude the atmosphere when computing the maximum temperature in GRMHD
simulations. For this, `kuibit` inherits from NumPy the concept of masks:
masked data carries along the information of where the data is valid and where
it is not. In `kuibit`, classes derived from `BaseNumerical` (mainly
`TimeSeries`, `FrequencySeries`, `UniformGridData`, and
`HierarchicalGridData`) support masks, meaning that operations like `max()`
will not include the data marked as invalid. This page describes how to work
with masks (see also the reference on `kuibit.masks`).
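Because the mask concept comes from NumPy, the underlying behaviour can be sketched with `numpy.ma` alone. The example below is a minimal illustration with made-up numbers (the arrays `density` and `temperature` and the floor `1e-8` are assumptions for the sake of the example, not `kuibit` objects); it shows how a reduction like `max()` skips the entries marked as invalid.

```python
import numpy as np

# Hypothetical samples: the low-density cells play the role of the atmosphere,
# where the temperature is unphysically large.
density = np.array([1e-12, 1e-3, 1e-12, 1e-4])
temperature = np.array([1e3, 5.0, 2e3, 7.5])

# Mark the temperature as invalid wherever the density is below the floor
masked_temperature = np.ma.masked_where(density < 1e-8, temperature)

plain_max = temperature.max()          # includes the atmosphere cells
masked_max = masked_temperature.max()  # ignores the masked entries
print(plain_max, masked_max)
```

Without the mask the maximum is dominated by the atmosphere; with the mask, only the valid cells contribute.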

## Creating masked objects

Since the interface is the same for all the classes defined in `kuibit`, we
will consider a `TimeSeries` as an example.

To create a masked object, you first need to start from the clean version.
Suppose `ts` is a `TimeSeries`. There are multiple ways to return a new masked
object:

```
# Data is invalid when it is equal to 1
ts1 = ts.masked_equal(1)
# Data is invalid when it is larger than 2
ts2 = ts.masked_greater(2)
# Data is invalid when it is larger than or equal to 3
ts3 = ts.masked_greater_equal(3)
# Data is invalid when it is between 4 and 5
ts4 = ts.masked_inside(4, 5)
# Data is invalid when it is NaN or inf
ts5 = ts.masked_invalid()
# Data is invalid when it is smaller than 6
ts6 = ts.masked_less(6)
# Data is invalid when it is smaller than or equal to 7
ts7 = ts.masked_less_equal(7)
# Data is invalid when it is not equal to 8
ts8 = ts.masked_not_equal(8)
# Data is invalid when it is outside the range (8, 9)
ts9 = ts.masked_outside(8, 9)
```
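The method names above mirror functions of the same name in `numpy.ma`. As a rough guide to which entries each condition marks as invalid, here is a sketch using plain NumPy arrays; whether the `kuibit` methods treat edge cases (such as interval endpoints) in exactly the same way is an assumption.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.5, 6.0])

# True in the mask means "invalid"
less = np.ma.masked_less(data, 3).mask.tolist()           # values below 3
greater = np.ma.masked_greater(data, 3).mask.tolist()     # values above 3
inside = np.ma.masked_inside(data, 4, 5).mask.tolist()    # values within [4, 5]
outside = np.ma.masked_outside(data, 2, 5).mask.tolist()  # values outside [2, 5]

# masked_invalid flags NaN and inf
invalid = np.ma.masked_invalid(np.array([1.0, np.nan, np.inf])).mask.tolist()

print(less, greater, inside, outside, invalid)
```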

All these methods return new objects. Alternatively, it is possible to edit the
object in place using the methods with the imperative form (e.g., `mask_equal`
instead of `masked_equal`).

The second way to create masked objects is by using the functions in `masks`,
which provides versions of mathematical functions that are defined only on a
restricted domain. For instance, if you want to compute the natural logarithm
of some data, you can use the function `masks.log()`, which automatically
applies a mask where the operation is not defined.

```
import kuibit.masks as ma
log_ts = ma.log(ts)
log_ts.is_masked() # => True
```
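The same idea can be seen with NumPy's own masked logarithm, `numpy.ma.log`, which masks the points where the logarithm is undefined. This is only a sketch on a plain array with illustrative values; it is not a statement about how `kuibit.masks.log` is implemented.

```python
import numpy as np

values = np.array([-1.0, 0.0, 1.0, np.e])

# The logarithm is undefined for values <= 0, so those entries get masked
log_values = np.ma.log(values)

has_mask = np.ma.is_masked(log_values)
print(log_values, has_mask)
```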

The method `is_masked()` checks whether the object is masked or not.
When objects are masked, some methods become unavailable. For example, it is not
possible to compute splines or perform interpolations. For `TimeSeries` and
`FrequencySeries`, you can work around this limitation by removing the invalid
points with the methods `mask_remove()` (in place) or `mask_removed()`
(returning a new object). This is not possible with grid data because we
assume that the data is defined on regular grids.

Suppose you want to mask the atmosphere. You have your density variable `rho`,
you want to mask everything that is below `1e-8`, and you want to plot the
pressure `press`. For that, you would first construct a masked grid data with
`rho`, and then apply its mask to `press`:

```
masked_rho = rho.masked_less(1e-8)
masked_press = press.mask_applied(masked_rho.mask)
```
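In terms of plain NumPy masked arrays, the two steps above amount to building a mask from one array and attaching it to another of the same shape. The following sketch uses small hypothetical 2D arrays in place of real grid data:

```python
import numpy as np

# Hypothetical 2x2 grids; the corner cells are atmosphere (density below 1e-8)
rho = np.array([[1e-12, 2e-3], [5e-4, 1e-10]])
press = np.array([[9.0, 0.2], [0.1, 8.0]])

# Step 1: mask the density below the floor
masked_rho = np.ma.masked_less(rho, 1e-8)

# Step 2: reuse the density mask for the pressure
masked_press = np.ma.array(press, mask=masked_rho.mask)

max_press = masked_press.max()  # only non-atmosphere cells contribute
print(masked_press, max_press)
```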

If you want to plot this with `visualize_matplotlib`, keep in mind that
resampling erases the mask information. Therefore, if you want the mask to
appear in the plot, you have to pass directly the `UniformGridData` you want
to plot (and not a `HierarchicalGridData`).

Warning

We only mask the data, not the independent variable (e.g., the time in
`TimeSeries`). If your computations require this variable to be masked too,
you should extract the mask array (as in `masked_rho.mask`) and manually apply
it.
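As an illustration of what applying the mask manually can look like, the sketch below works on plain NumPy arrays standing in for the times and values of a series (the names `times` and `values` are hypothetical). It shows two options: attaching the same mask to the independent variable, or dropping the invalid points from both arrays.

```python
import numpy as np

times = np.linspace(0.0, 4.0, 5)  # 0, 1, 2, 3, 4
values = np.ma.masked_greater(np.array([1.0, 5.0, 2.0, 7.0, 3.0]), 4.0)

# Boolean array, True where the data is invalid
mask = np.ma.getmaskarray(values)

# Option 1: carry the same mask on the independent variable
masked_times = np.ma.array(times, mask=mask)

# Option 2: drop the invalid points from both arrays
valid_times = times[~mask]
valid_values = values.compressed()

print(masked_times, valid_times, valid_values)
```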

Warning

Some methods will not work with masked data (e.g., splines and interpolation).
Therefore, resampling operations will not carry over the masks, and you have to
apply the masks again afterwards. One instance in which this is important is
plotting with `visualize_matplotlib`. Also, the `save()` method will discard
the mask information.