FAQ

Test Data

If you don't have VLSV data at hand, Vlasiator.jl provides some test data for you to begin with.

using LazyArtifacts

rootpath = artifact"testdata"
files = joinpath.(rootpath, ("bulk.1d.vlsv", "bulk.2d.vlsv", "bulk.amr.vlsv"))

These are also used in the standard test. These will be automatically downloaded from vlsv_data if you run the package test locally.

Precision

For post-processing and data analysis purposes, it makes less sense to stick to double precisions, so we mostly use Float32 in Vlasiator.jl for numerical arrays. Several exceptions are:

physical constants are defined in Float64, since single precision only resolves up to ±3.4E+38, and it may go out of bound in the middle of calculation (e.g. plasma frequency).

Int vs. UInt

Integers but not unsigned integers shall be used for indexing, even though unsigned integers are tempting.

Vlasiator output files can be large. If we have limited memory relative to the file size, Vlasiator.jl provide direct hard disk mapping through mmap in Julia. With this mechanism you never need to worry about unable to process data with small free memory. Besides, we found that proper usage of mmap can also speed up reading and reduce memory comsumption. However, without reinterpret we may encounter the alignment issue. Therefore, we check the page alignment while using mmap.

The memory layout of the field solver grid is a bit tricky when converting from 1D arrays to multidimensional arrays. In pyvlasiator/analysator, the FSgrid output fg_b has size of (nx,ny,nz,3); In Vlasiator.jl, the final output would be size of (3, nx,ny,nz).

Parallelism

The current design choice is to achieve optimal serial performance per file, and apply parallel processing across individual files. In most common cases, the time it takes for post-processing one snapshot is reasonably short, but the number of snapshots are large. Julia's built-in support for all kinds of parallelism paradigm (multithreading, multiprocessing, channel) and external support from packages (MPI.jl, Polyester.jl) can be relatively easily incorported to make the whole workflow parallel.

multi-threading with @threads (recommended when working within one node)
multi-processing with pmap
multi-processing with RemoteChannel
ClusterManagers for multi-node jobs

See more in the examples.

VTK

VLSV is just an uncompressed binary format. If we convert VLSV to VTK through write_vtk, the generated VTK files, even the highest resolution one with every coarse cell mapping to the finest level, can be several times smaller than the original VLSV file.

One drawback of this conversion is that it cannot deal with phase space outputs, i.e. VDFs.

FAQ

Test Data

Performance

Precision

Int vs. UInt

Memory

Parallelism

VTK