Defining Custom Workloads

This page documents the JSON input format that FireSim uses to understand the software workloads that run on the target design. Most of the time, you should not write these files from scratch. Instead, use FireMarshal to build a workload (including Linux kernel images and root filesystems) and use FireMarshal’s install command to generate an initial .json file for FireSim. Once you have generated a base .json with FireMarshal, you can add the options described on this page to control additional files used as inputs to and outputs from simulations.

Workloads in FireSim consist of a series of Jobs that are assigned to be run on individual simulations. Currently, we require that a Workload defines either:

  • A single type of job that is run on as many simulations as specified by the user. These workloads are usually suffixed with -uniform, which indicates that all nodes in the workload run the same job. An example of such a workload is firesim/deploy/workloads/linux-uniform.json.
  • Several different jobs, in which case there must be exactly as many jobs as there are running simulated nodes. An example of such a workload is firesim/deploy/workloads/ping-latency.json.
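
The rule above can be expressed as a small check. This is a minimal sketch, not the manager's actual code: jobs_match_nodes is a hypothetical helper, and it assumes that a uniform workload is one whose JSON has no workloads array, as in the examples later on this page.

```python
def jobs_match_nodes(workload: dict, num_nodes: int) -> bool:
    """Return True if the workload can be mapped onto num_nodes simulated nodes."""
    jobs = workload.get("workloads")
    if jobs is None:
        # Uniform workload: the single job is replicated on every node.
        return num_nodes >= 1
    # Non-uniform workload: exactly one job per simulated node.
    return len(jobs) == num_nodes


uniform = {"benchmark_name": "linux-uniform"}
non_uniform = {
    "benchmark_name": "ping-latency",
    "workloads": [{"name": "pinger"}, {"name": "pingee"}],
}

print(jobs_match_nodes(uniform, 4))      # any positive node count is fine
print(jobs_match_nodes(non_uniform, 2))  # two jobs, two nodes
print(jobs_match_nodes(non_uniform, 3))  # mismatch: two jobs, three nodes
```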

FireSim uses these workload definitions to help the manager deploy your simulations. Historically, there was also a script to build workloads using these JSON files, but this has been replaced with a more powerful tool, FireMarshal. New workloads should always be built with FireMarshal.

In the following subsections, we will go through the two aforementioned example workload configurations, describing inline how the manager uses each part of the JSON file.

The following examples use the default buildroot-based Linux distribution (br-base). To customize Fedora instead, refer to the Running Fedora on FireSim page.

Uniform Workload JSON

firesim/deploy/workloads/linux-uniform.json is an example of a “uniform” style workload, where each simulated node runs the same software configuration.

Let’s take a look at this file:

{
  "benchmark_name"            : "linux-uniform",
  "common_bootbinary"         : "br-base-bin",
  "common_rootfs"             : "br-base.img",
  "common_outputs"            : ["/etc/os-release"],
  "common_simulation_outputs" : ["uartlog", "memory_stats.csv"]
}

There is also a corresponding directory named after this workload/file:

~/firesim/deploy/workloads/linux-uniform$ ls -la
total 4
drwxrwxr-x  2 centos centos   69 Feb  8 00:07 .
drwxrwxr-x 19 centos centos 4096 Feb  8 00:39 ..
lrwxrwxrwx  1 centos centos   47 Feb  7 00:38 br-base-bin -> ../../../sw/firesim-software/images/br-base-bin
lrwxrwxrwx  1 centos centos   53 Feb  8 00:07 br-base-bin-dwarf -> ../../../sw/firesim-software/images/br-base-bin-dwarf
lrwxrwxrwx  1 centos centos   47 Feb  7 00:38 br-base.img -> ../../../sw/firesim-software/images/br-base.img

We will elaborate on this later.

Looking at the JSON file, you’ll notice that this is a relatively simple workload definition.

In this “uniform” case, the manager will name simulations after the benchmark_name field, appending a number for each simulation using the workload (e.g. linux-uniform0, linux-uniform1, and so on). It is standard practice to keep benchmark_name, the JSON filename, and the above directory name the same. In this case, we have set all of them to linux-uniform.
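
As an illustration of this naming scheme, the following sketch generates the simulation names for a uniform workload. The helper simulation_names is hypothetical and only mirrors the pattern described above; it is not taken from the manager.

```python
from typing import List


def simulation_names(benchmark_name: str, num_sims: int) -> List[str]:
    """Append a zero-based index to benchmark_name for each simulation."""
    return [f"{benchmark_name}{i}" for i in range(num_sims)]


names = simulation_names("linux-uniform", 3)
print(names)  # ['linux-uniform0', 'linux-uniform1', 'linux-uniform2']
```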

Next, the common_bootbinary field represents the binary that the simulations in this workload are expected to boot from. The manager will copy this binary for each of the nodes in the simulation (each gets its own copy). The common_bootbinary path is relative to the workload’s directory, in this case firesim/deploy/workloads/linux-uniform. You’ll notice in the above output from ls -la that this is actually just a symlink to br-base-bin that is built by the FireMarshal tool.

Similarly, the common_rootfs field represents the disk image that the simulations in this workload are expected to boot from. The manager will copy this root filesystem image for each of the nodes in the simulation (each gets its own copy). The common_rootfs path is relative to the workload’s directory, in this case firesim/deploy/workloads/linux-uniform. You’ll notice in the above output from ls -la that this is actually just a symlink to br-base.img that is built by the FireMarshal tool.
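
To make the path resolution concrete, here is a minimal sketch of how the two relative paths map onto the workload’s directory. The function resolve_in_workload_dir is a hypothetical helper, and the sketch assumes the workload directory is named after benchmark_name, as recommended above.

```python
from pathlib import PurePosixPath

WORKLOADS_DIR = PurePosixPath("firesim/deploy/workloads")


def resolve_in_workload_dir(workload: dict, field: str) -> PurePosixPath:
    """Resolve a path-valued field relative to the workload's own directory."""
    # Assumption: the directory shares its name with benchmark_name.
    return WORKLOADS_DIR / workload["benchmark_name"] / workload[field]


workload = {
    "benchmark_name": "linux-uniform",
    "common_bootbinary": "br-base-bin",
    "common_rootfs": "br-base.img",
}

bootbinary = resolve_in_workload_dir(workload, "common_bootbinary")
rootfs = resolve_in_workload_dir(workload, "common_rootfs")
print(bootbinary)  # firesim/deploy/workloads/linux-uniform/br-base-bin
print(rootfs)      # firesim/deploy/workloads/linux-uniform/br-base.img
```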

The common_outputs field is a list of outputs that the manager will copy out of the root filesystem image AFTER a simulation completes. In this simple example, when a workload running on a simulated cluster with firesim runworkload completes, /etc/os-release will be copied out from each rootfs and placed in the job’s output directory within the workload’s output directory (see the firesim runworkload section). You can add multiple paths here.

The common_simulation_outputs field is a list of outputs that the manager will copy off of the simulation host machine AFTER a simulation completes. In this example, when a workload running on a simulated cluster with firesim runworkload completes, the uartlog (an automatically generated file that contains the full console output of the simulated system) and memory_stats.csv files will be copied out of the simulation’s base directory on the host instance and placed in the job’s output directory within the workload’s output directory (see the firesim runworkload section). You can add multiple paths here.
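
If you start from a FireMarshal-generated base .json, extending these two lists can be scripted. The following is a minimal sketch under stated assumptions: add_common_outputs is a hypothetical helper, and /root/run.log is a made-up path inside the target’s root filesystem used only for illustration.

```python
import json


def add_common_outputs(workload, rootfs_paths=(), host_files=()):
    """Return a copy of workload with extra entries in the two output lists."""
    updated = dict(workload)
    for key, extra in (("common_outputs", rootfs_paths),
                       ("common_simulation_outputs", host_files)):
        merged = list(workload.get(key, []))
        for item in extra:
            if item not in merged:  # avoid duplicate entries
                merged.append(item)
        updated[key] = merged
    return updated


base = json.loads("""
{
  "benchmark_name": "linux-uniform",
  "common_bootbinary": "br-base-bin",
  "common_rootfs": "br-base.img",
  "common_outputs": ["/etc/os-release"],
  "common_simulation_outputs": ["uartlog", "memory_stats.csv"]
}
""")

# "/root/run.log" is hypothetical; "uartlog" is already listed and is not duplicated.
updated = add_common_outputs(base, rootfs_paths=["/root/run.log"], host_files=["uartlog"])
print(json.dumps(updated, indent=2))
```

The helper returns a new dictionary rather than modifying the loaded one, so the original base definition stays available for comparison.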

Non-uniform Workload JSON (explicit job per simulated node)

Now, we’ll look at the ping-latency workload, which explicitly defines a job per simulated node.

{
  "benchmark_name" : "ping-latency",
  "common_bootbinary" : "bbl-vmlinux",
  "common_outputs" : [],
  "common_simulation_inputs" : [],
  "common_simulation_outputs" : ["uartlog"],
  "no_post_run_hook": "",
  "workloads" : [
    {
      "name": "pinger",
      "simulation_inputs": [],
      "simulation_outputs": [],
      "outputs": []
    },
    {
      "name": "pingee",
      "simulation_inputs": [],
      "simulation_outputs": [],
      "outputs": []
    },
    {
      "name": "idler-1",
      "simulation_inputs": [],
      "simulation_outputs": [],
      "outputs": []
    },
    {
      "name": "idler-2",
      "simulation_inputs": [],
      "simulation_outputs": [],
      "outputs": []
    },
    {
      "name": "idler-3",
      "simulation_inputs": [],
      "simulation_outputs": [],
      "outputs": []
    },
    {
      "name": "idler-4",
      "simulation_inputs": [],
      "simulation_outputs": [],
      "outputs": []
    },
    {
      "name": "idler-5",
      "simulation_inputs": [],
      "simulation_outputs": [],
      "outputs": []
    },
    {
      "name": "idler-6",
      "simulation_inputs": [],
      "simulation_outputs": [],
      "outputs": []
    }
  ]
}

Additionally, let’s take a look at the state of the ping-latency directory AFTER the workload is built (assume that a tool like FireMarshal already created the rootfses and linux images):

~/firesim-new/deploy/workloads/ping-latency$ ls -la
total 15203216
drwxrwxr-x  3 centos centos       4096 May 18 07:45 .
drwxrwxr-x 13 centos centos       4096 May 18 17:14 ..
lrwxrwxrwx  1 centos centos         41 May 17 21:58 bbl-vmlinux -> ../linux-uniform/br-base-bin
-rw-rw-r--  1 centos centos          7 May 17 21:58 .gitignore
-rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-1.ext2
-rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-2.ext2
-rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-3.ext2
-rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-4.ext2
-rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-5.ext2
-rw-r--r--  1 centos centos 1946009600 May 18 07:46 idler-6.ext2
drwxrwxr-x  3 centos centos         16 May 17 21:58 overlay
-rw-r--r--  1 centos centos 1946009600 May 18 07:44 pingee.ext2
-rw-r--r--  1 centos centos 1946009600 May 18 07:44 pinger.ext2
-rw-rw-r--  1 centos centos       2236 May 17 21:58 ping-latency-graph.py

First, let’s identify some of these files:

  • bbl-vmlinux: This workload just uses the default linux binary generated for the linux-uniform workload.
  • .gitignore: This just ignores the generated rootfses, which you probably don’t want to commit to the repo.
  • idler-[1-6].ext2, pingee.ext2, pinger.ext2: These are rootfses that we want to run on different nodes in our simulation. They can be generated with a tool like FireMarshal.

Next, let’s review some of the new fields present in this JSON file:

  • common_simulation_inputs: This is an array of extra files that you would like to supply to the simulator as input. One example is supplying files containing DWARF debugging info for TracerV + Stack Unwinding. See the Modifying your workload description section of the TracerV + Flame Graphs: Profiling Software with Out-of-Band Flame Graph Generation page for an example.

  • no_post_run_hook: This is a placeholder for running a script on your manager automatically once your workload completes. To use this option, rename it to post_run_hook and supply a command to be run. The manager will automatically suffix the command with the path of the workload’s results directory.

  • workloads: This time, you’ll notice that we have this array, which is populated by objects that represent individual jobs. Note the naming discrepancy: from here on out, we will refer to the contents of this array as jobs rather than workloads. Each job has some additional fields:

    • name: In this case, jobs are each assigned a name manually. These names MUST BE UNIQUE within a particular workload.
    • simulation_inputs: Just like common_simulation_inputs, but specific to this job.
    • simulation_outputs: Just like common_simulation_outputs, but specific to this job.
    • outputs: Just like common_outputs, but specific to this job.
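
The no_post_run_hook renaming described above can be sketched as follows. This is illustrative only: both helpers are hypothetical, using ping-latency-graph.py as the hook command is an assumption, and the exact way the manager appends the results path (here, after a single space) is also assumed.

```python
def enable_post_run_hook(workload: dict, command: str) -> dict:
    """Return a copy with the placeholder replaced by a real post_run_hook."""
    updated = {k: v for k, v in workload.items() if k != "no_post_run_hook"}
    updated["post_run_hook"] = command
    return updated


def hook_command_line(workload: dict, results_dir: str) -> str:
    """Compose the command the manager would run once the workload completes."""
    # Assumption: the results directory is appended after a single space.
    return f'{workload["post_run_hook"]} {results_dir}'


workload = {"benchmark_name": "ping-latency", "no_post_run_hook": ""}
# Assumption: the graphing script in the workload directory serves as the hook.
workload = enable_post_run_hook(workload, "python3 ping-latency-graph.py")

command = hook_command_line(workload, "firesim/deploy/results-workload/ping-latency")
print(command)
```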

Because none of these jobs supplies a rootfs field, the manager instead assumes that the rootfs for each job is named after the job, i.e. name.ext2 (for example, pinger.ext2 for the job named pinger). To explicitly supply a rootfs name that is distinct from the job name, add the rootfs field to a job and supply a path relative to the workload’s directory.
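
The fallback rule for the rootfs name can be summarized in a few lines. This is a minimal sketch; job_rootfs is a hypothetical helper, and custom-image.ext2 is a made-up file name used only to show the explicit case.

```python
def job_rootfs(job: dict) -> str:
    """Return the job's rootfs path, relative to the workload's directory."""
    # An explicit "rootfs" field wins; otherwise fall back to <name>.ext2.
    return job.get("rootfs", job["name"] + ".ext2")


default_job = {"name": "pinger", "outputs": []}
# "custom-image.ext2" is a hypothetical rootfs name.
explicit_job = {"name": "pingee", "rootfs": "custom-image.ext2", "outputs": []}

print(job_rootfs(default_job))   # pinger.ext2
print(job_rootfs(explicit_job))  # custom-image.ext2
```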

Once you specify the .json for this workload (and assuming you have built the corresponding rootfses with FireMarshal), you can run it with the manager by setting workload=ping-latency.json in config_runtime.ini. The manager will automatically look for the generated rootfses (based on workload and job names that it reads from the JSON) and distribute work appropriately.

Just like in the uniform case, it will copy back the results that we specify in the JSON file. We’ll end up with a directory in firesim/deploy/results-workload/ named after the workload name, with a subdirectory named after each job in the workload, which will contain the output files we want.
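
The resulting layout can be sketched as below. The helper job_output_dirs is hypothetical and follows only the description on this page; the directory name that the manager actually creates may carry additional components, so treat the paths as illustrative.

```python
from pathlib import PurePosixPath
from typing import List


def job_output_dirs(workload: dict,
                    results_root: str = "firesim/deploy/results-workload") -> List[PurePosixPath]:
    """List one output directory per job, nested under the workload's directory."""
    # Assumption: the workload's results directory is named after benchmark_name.
    base = PurePosixPath(results_root) / workload["benchmark_name"]
    return [base / job["name"] for job in workload["workloads"]]


workload = {
    "benchmark_name": "ping-latency",
    "workloads": [{"name": "pinger"}, {"name": "pingee"}, {"name": "idler-1"}],
}

dirs = job_output_dirs(workload)
for d in dirs:
    print(d)
```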