Using gkyl¶
We now cover the basics of running gkyl on desktops, clusters, and GPUs. Gkyl can also be used to run any Lua script, as well as the tools provided within gkyl. We also comment on some useful tools provided by ADIOS.
Additional details on the contents of the input files can be found in the Input file basics page and the pages for the Vlasov, Gyrokinetic and Moment Apps.
The installation placed the gkyl executable in
<INSTALL-DIR>/gkylsoft/gkyl/bin/
(where the default <INSTALL-DIR> is your home directory, ~), so one typically needs to call gkyl as <INSTALL-DIR>/gkylsoft/gkyl/bin/gkyl. However, most users create an alias so that one can simply call gkyl. The documentation assumes such an alias unless specified otherwise.
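For example, in bash one could create such an alias by adding the following line to ~/.bashrc (a sketch assuming the default install location in the home directory; adjust the path if you installed elsewhere):
# Make the gkyl executable callable simply as `gkyl`.
alias gkyl="$HOME/gkylsoft/gkyl/bin/gkyl"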
The gkyl
command has a built-in help menu. Access it with
gkyl -h
Run simulations¶
There are three ways of running simulations with gkyl:
Serial: using a single core/process/CPU.
Parallel: running a multi-core simulation (using MPI).
GPUs: using graphical processing units (GPUs).
The input file has to have some knowledge of which of these modes you will use. We provide examples of each below.
Serial simulations¶
Suppose you have the gk-sheath.lua input file for a simple 3D
gyrokinetic calculation with sheaths at the z-boundaries and a source in the middle (along z).
In the Common section of the App declaration (i.e. between plasmaApp = Plasma.App { and electron = Plasma.Species {) there are two variables, decompCuts and parallelizeSpecies. These refer to the number of MPI/NCCL decompositions and whether to parallelize amongst species, respectively.
For serial simulations one can either remove these from the input file, or set parallelizeSpecies to false and decompCuts to a table with as many 1's as there are configuration-space dimensions (three in this case). That's why the input file contains:
plasmaApp = Plasma.App {
...
decompCuts = {1, 1, 1}, -- Parallel decomposition in each position dimension.
parallelizeSpecies = false, -- =True to parallelize amongst species.
...
}
Then one can run the input file in serial with the simple command:
gkyl gk-sheath.lua
When it completes (after about 54 seconds on a 2015 MacBook Pro), this simulation will have produced the following output to screen:
(57-line simulation log; not reproduced here.)
These simulation logs contain the following:
Line 1: start date and time.
Lines 2-3: gkyl repository revision with which this simulation was run (a + at the end indicates it has been modified from the version on GitHub), and the date on which the executable was built.
Line 11: the initial time step number, time, and initial time step size.
Lines 12-21: progress reports every 1% of the simulation (first column). Then, every 10% of the simulation time, the number of time steps taken so far, the elapsed simulation time, and the latest time step size.
Lines 23-54: various metrics regarding the time steps and wall-clock time taken by the simulation, and the time spent on various parts of the calculation.
Line 55: date and time when the simulation finished.
Also, by default gkyl produces a log file named with the format <input-file-name>_0.log.
If you wish to disable this, set logToFile = false in the Common section of the App.
Parallel simulation¶
For large problems, running on a single CPU can lead to impractical runtimes. In those cases one benefits from parallelizing the simulation over many CPUs (or GPUs, see the next section). This is accomplished in gkyl by decomposing (position) space into subdomains and/or decomposing the species into subpopulations, and handing off each subdomain/subpopulation to a separate parallel process (with MPI, or NCCL on GPUs).
Suppose one wishes to run the 3D sheath calculation of the previous section on a node with 16 cores, using 4 MPI processes along \(x\) and 4 along \(z\). In this case one must edit the variable decompCuts in the Common section of the input file to reflect this decomposition:
plasmaApp = Plasma.App {
...
decompCuts = {4, 1, 4}, -- Parallel decomposition in each position dimension.
parallelizeSpecies = false, -- =True to parallelize amongst species.
...
}
Once decompCuts and the rest of the input file are set appropriately, you can run the simulation with the MPI launcher provided by your cluster or MPI implementation (e.g. mpirun, mpiexec, srun, ibrun). For example, with mpirun we would run the simulation as
mpirun -n 16 gkyl gk-sheath.lua
The argument following -n
is the total number of MPI processes to launch, in this case
\(4\times4=16\). This clearly requires that your computer/node/job has access to
at least 16 cores.
Note
The number of decompCuts
in any dimension should not exceed the number of cells in that dimension.
Another way to parallelize a simulation with more than one species is to send each species (or groups of them) to separate MPI processes. For example, we could instead use 2 MPI processes along \(x\) and 4 along \(z\), and further parallelize over species to utilize all 16 cores by setting
plasmaApp = Plasma.App {
...
decompCuts = {2, 1, 4}, -- Parallel decomposition in each position dimension.
parallelizeSpecies = true, -- =True to parallelize amongst species.
...
}
and again launching the executable with
mpirun -n 16 gkyl gk-sheath.lua
In simulations with few (or no) cross-species interactions, species parallelization is particularly efficient.
On many computer clusters one must submit a jobscript in order to run a parallel simulation. The jobscript causes the simulation to be queued so that it runs once resources (i.e. cores, nodes) become available. When resources become available the simulation runs on a compute node (instead of the login node).
Jobscripts for some machines are provided in the gkyl/machines folder and below. Note that the installation instructions point to machine scripts for building gkyl on each of these computers. If you need assistance with setting up gkyl on a new cluster, feel free to contact the developers.
Sample submit scripts:
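As an illustration, a minimal SLURM jobscript for the 16-process run above might look like the following (a sketch only; the module, partition, and account details are placeholders you must adapt to your machine, and the scripts in gkyl/machines should be preferred when available):
#!/bin/bash
#SBATCH --job-name=gk-sheath
#SBATCH --nodes=1
#SBATCH --ntasks=16          # Total MPI processes; must match the decomposition requested.
#SBATCH --time=01:00:00
# module load <your-mpi-module>   # Placeholder: load the modules used to build gkyl.

mpirun -n 16 ~/gkylsoft/gkyl/bin/gkyl gk-sheath.lua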
Running on GPUs¶
Gkyl is also capable of running on graphical processing units (GPUs) with minimal or no modification of an input file that you would use to run on CPUs. Our implementation of GPU capabilities uses CUDA and NCCL. Our model is to launch an MPI process for each GPU available. For this reason one must always launch gkyl with mpirun, mpiexec or the equivalent command provided by the system (e.g. srun). Furthermore, one must provide the -g flag following the gkyl executable. For example, given the input file gk-sheath.lua previously mentioned, we would run it on a single GPU with
mpirun -np 1 gkyl -g gk-sheath.lua
If multiple GPUs are available, one may use those to decompose position space and/or species using decompCuts and parallelizeSpecies in the same way as is done for CPUs (as explained above). We just need to ensure there are enough GPUs for the amount of parallelization requested. For example, if we have 4 nodes with 4 GPUs each, we can again use
plasmaApp = Plasma.App {
...
decompCuts = {2, 1, 4}, -- Parallel decomposition in each position dimension.
parallelizeSpecies = true, -- =True to parallelize amongst species.
...
}
to parallelize along \(x\), \(z\) and species on 16 GPUs, and launch the simulation with
mpirun -n 16 gkyl -g gk-sheath.lua
(multi)GPU runs on clusters
Clusters often have specific instructions for running GPU and multi-GPU jobs. Please refer to your cluster’s documentation/admins to determine the correct/best way to run such simulations on that system.
On clusters it is common to submit scripts that queue the job for running on compute nodes (when the resources become available). In fact this is often preferable to ssh-ing into a node, if that is even possible. Some sample job scripts for running GPU-enabled jobs are in the gkyl/machines folder and below:
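As an illustration, a GPU jobscript resembles the CPU sketch shown earlier but also requests GPUs and adds the -g flag (a sketch only; GPU counts, partitions, and launch syntax vary by machine, so defer to your cluster's documentation and the gkyl/machines scripts):
#!/bin/bash
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=4     # One MPI process per GPU.
#SBATCH --gres=gpu:4            # Placeholder: assumes 4 GPUs per node.
#SBATCH --time=01:00:00

mpirun -n 16 ~/gkylsoft/gkyl/bin/gkyl -g gk-sheath.lua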
Restarts¶
Sometimes a simulation ends prematurely (e.g. your job's wallclock time allocation ran out), or perhaps it ended successfully but now you wish to run it longer. In these cases one can restart the simulation.
The first simulation prints out a number of restart files, those ending in _restart.bp
. In
order to begin a second simulation from where the first left off, check the tEnd
and nFrame
variables in the input file. These are defined as absolute times/number of frames, that is, they
specify the final simulation time and number of output frames from the beginning of the first
simulation, not relative to the previous simulation.
So suppose we run simulation 1 with the following in the App’s Common section:
momentApp = Moments.App {
...
tEnd = 10.0,
nFrame = 100,
...
}
There are two restart scenarios:
1. If the simulation completed successfully, one must increase tEnd and nFrame in order to run the second, restart simulation (e.g. as sketched below). Otherwise it will just initialize, realize it does not need to advance any further, and terminate.
2. The first simulation ended prematurely, so tEnd=10.0 was not reached. One can restart the simulation with the same tEnd and nFrame and it will simply try to get there this second time. Or one can increase tEnd and nFrame so the second simulation goes farther than the first one intended to.
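For instance, to let the restart run twice as long as originally intended, one might edit the Common section of the input file above to (values chosen purely for illustration):
momentApp = Moments.App {
   ...
   tEnd = 20.0,    -- New final time (was 10.0).
   nFrame = 200,   -- Total number of output frames since t=0 (was 100).
   ...
}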
Once you’ve made the appropriate edits to the input file the second, restart simulation is run by simply appending the word restart after the input file, like
gkyl inputFile.lua restart
This second, restart simulation will use the _restart.bp
files of the first simulation to
construct an initial condition. Note that it will look for the restart files in the same
directory in which the restart simulation is being run, so typically we run restarts in the same
directory as the first simulation.
Using the fromFile option¶
The fromFile option can be used to read data from a file on initialization. This can be used for initial conditions, sources, and geometry data. The file to be read must have the same prefix as the input file but can otherwise be named as desired, including the extension (it might be useful to use a different extension, such as .read, to avoid accidentally deleting needed files if one does rm *.bp).
Handy perks¶
Run Lua with gkyl¶
One can use gkyl to run (almost?) any Lua code. Say for example I find code on the interwebs which promises to compute the factors of “Life, the Universe, and Everything” (who wouldn’t want that?). We can take such code, put it in an input file named factors.lua and run it with
gkyl factors.lua
Try it! It’s free!
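For instance, a minimal factors.lua could look like this (an illustrative sketch in plain Lua, with no gkyl-specific features):
-- factors.lua: print the factors of "Life, the Universe, and Everything".
local n = 42
for i = 1, n do
   if n % i == 0 then
      print(i)
   end
end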
gkyl Tools¶
A number of additional tools that users and developers may find useful as part of their (Gkeyll) workflow are shipped as gkyl Tools. One such tool, for example, allows us to compare BP (ADIOS) files.
Suppose you ran the plasma beach simulation with the Moment App, using the momBeach.lua input file which contains a variable
local J0 = 1.0e-12 -- Amps/m^3.
in the collisionless electromagnetic source. Let’s assume you were scanning this variable, so
you may choose to create another input file momBeachS.lua which
increases J0
to
local J0 = 1.0e-10 -- Amps/m^3.
If after running momBeachS you are not sure if the results changed at all, you can use the
comparefiles
tool. For example, compare the electromagnetic fields produced at the end of
both simulations with the following command:
gkyl comparefiles -a momBeach_field_100.bp -b momBeachS_field_100.bp
In this particular example the tool would then print the following to screen:
Checking attr numCells in momBeach_field_100.bp momBeachS_field_100.bp ...
... comparing numCells
Checking attr lowerBounds in momBeach_field_100.bp momBeachS_field_100.bp ...
... comparing lowerBounds
Checking attr upperBounds in momBeach_field_100.bp momBeachS_field_100.bp ...
... comparing upperBounds
Checking attr basisType in momBeach_field_100.bp momBeachS_field_100.bp ...
... comparing basisType
Checking attr polyOrder in momBeach_field_100.bp momBeachS_field_100.bp ...
... comparing polyOrder
Files are different!
So we know that increasing J0 by a factor of 100 did change the simulation.
Additional documentation of these tools is found in the gkyl Tools reference.
ADIOS tools¶
ADIOS has two handy tools that one may use to explore data files produced by a gkyl
simulation. These are bpls
and bpdump
. We give a brief example of each here, and
expanded descriptions of their capabilities can be found in the
ADIOS documentation
, or using the
bpls -h
and bpdump -h
commands.
Note that these tools are complementary to postgkyl’s info command.
bpls¶
bpls
provides a simple view of the structure and contents of a .bp
file. For example,
in the previous section we discussed a 5-moment calculation of the plasma beach problem. That simulation produced the file momBeach_field_1.bp. We can explore this file with
bpls momBeach_field_1.bp
which outputs
double time scalar
integer frame scalar
double CartGridField {400, 8}
It tells us that this file contains three variables: the simulation time
at which this snapshot
was produced, the frame
number, and a Cartesian grid field (CartGridField) for 400 cells which
contains 8 electromagnetic components (3 for electric field, 3 for magnetic field, and the other 2
are used in gkyl’s algorithms). One may dump one of these variables with the additional -d
flag.
So if we wish to know the simulation time of this frame, we would use
bpls momBeach_field_1.bp time -d
and see it output
double time scalar
5.1e-11
Note that for large variables (e.g. CartGridField) dumping can overwhelm the terminal/screen. One can also slice the dataset and only dump part of it; see bpls -h.
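For instance, ADIOS’s bpls typically accepts start/count arguments to dump only a slice; something like the following would print the first three cells of the field (the indices are illustrative; consult bpls -h for the exact syntax of your ADIOS version):
bpls momBeach_field_1.bp CartGridField -d -s "0,0" -c "3,8"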
There are also a number of attributes (smaller pieces of time-constant data), which one can see with
the -a
flag:
ws:dir jill$ bpls momBeach_field_1.bp -a
double time scalar
integer frame scalar
double CartGridField {400, 8}
string changeset attr
string builddate attr
string type attr
string grid attr
integer numCells attr
double lowerBounds attr
double upperBounds attr
string basisType attr
integer polyOrder attr
string inputfile attr
and you can peek at the value of an attribute with bpls <filename> -a <attribute-name> -d.
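For example, following that template, one could peek at the polynomial order stored in this file (attribute name taken from the listing above) with
bpls momBeach_field_1.bp -a polyOrder -d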
bpdump¶
The -d flag in the previous section dumps the values of a variable onto the screen. There is a separate command to do just that, called bpdump. You can dump a specific variable with
bpdump -d <variable-name> <filename>
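For example, following this template, the simulation time in the frame used above would be dumped with
bpdump -d time momBeach_field_1.bp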