Test suite JSON representation

This page describes SyntaxGym’s standard format for representing test suites. For more information on test suites and their basic structure, see The SyntaxGym architecture.

Here’s an example of a simple one-item test suite which conforms to our standard, and which measures subject–verb number agreement knowledge:

  {
    "meta": {"name": "Sample subject--verb suite", "metric": "sum"},
    "predictions": [{"type": "formula", "formula": "(2;%mismatch%) > (2;%match%)"}],
    "region_meta": {"1": "Subject NP", "2": "Verb", "3": "Continuation"},
    "items": [
      {
        "item_number": 1,
        "conditions": [
          {"condition_name": "match",
           "regions": [{"region_number": 1, "content": "The woman"},
                       {"region_number": 2, "content": "plays"},
                       {"region_number": 3, "content": "the guitar"}]},
          {"condition_name": "mismatch",
           "regions": [{"region_number": 1, "content": "The woman"},
                       {"region_number": 2, "content": "play"},
                       {"region_number": 3, "content": "the guitar"}]}
        ]
      }
    ]
  }
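As a rough illustration of how this structure nests, the sketch below loads the example suite with Python's standard `json` module and walks items, conditions, and regions. The helper name `describe` is hypothetical and not part of SyntaxGym; the suite is embedded as a string only so that the snippet runs on its own.

```python
import json

# The example suite from above, embedded as a string so the sketch is
# self-contained; in practice it would be read from a .json file.
SUITE_JSON = """
{
  "meta": {"name": "Sample subject--verb suite", "metric": "sum"},
  "predictions": [{"type": "formula", "formula": "(2;%mismatch%) > (2;%match%)"}],
  "region_meta": {"1": "Subject NP", "2": "Verb", "3": "Continuation"},
  "items": [
    {"item_number": 1,
     "conditions": [
       {"condition_name": "match",
        "regions": [{"region_number": 1, "content": "The woman"},
                    {"region_number": 2, "content": "plays"},
                    {"region_number": 3, "content": "the guitar"}]},
       {"condition_name": "mismatch",
        "regions": [{"region_number": 1, "content": "The woman"},
                    {"region_number": 2, "content": "play"},
                    {"region_number": 3, "content": "the guitar"}]}
     ]}
  ]
}
"""

suite = json.loads(SUITE_JSON)


def describe(suite):
    """Return (item number, condition name, region name, content) rows."""
    rows = []
    for item in suite["items"]:
        for condition in item["conditions"]:
            for region in condition["regions"]:
                # region_meta keys are JSON strings, while region_number is
                # an integer, so convert before looking up the region name.
                name = suite["region_meta"][str(region["region_number"])]
                rows.append((item["item_number"], condition["condition_name"],
                             name, region["content"]))
    return rows


for row in describe(suite):
    print(row)
```

Note the string/integer mismatch between `region_meta` keys and `region_number` values; it is a consequence of JSON object keys always being strings.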

In the following sections we’ll describe the components of this JSON standard one by one. A formal JSON schema summary appears at the end of this document.

meta: Test suite metadata

This section contains basic archival facts such as test suite name, description, and paper reference (if relevant).

Required fields:

name

A unique identifying string name for the test suite.

metric

Surprisal statistic over which this test suite specifies predictions. Currently the only supported metric is "sum": when multiple tokens fall in the same region, a model’s surprisals for the individual tokens are summed to form the region-level surprisal statistic.

Optional fields:

reference

A paper reference for the content of the suite.

comment

Any other comments.
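To make the "sum" metric described above concrete, here is a minimal sketch of the aggregation step. The token surprisal values are invented for illustration, and the function name `region_surprisal_sum` is hypothetical rather than a SyntaxGym API.

```python
def region_surprisal_sum(token_surprisals):
    """Aggregate per-token surprisals into one region-level value by summing."""
    return sum(token_surprisals)


# Hypothetical per-token surprisals (in bits) for the region "the guitar".
tokens = [("the", 2.5), ("guitar", 7.25)]
total = region_surprisal_sum(surprisal for _, surprisal in tokens)
print(total)  # 9.75
```

Under this sketch a region with no tokens sums to 0.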

region_meta: Region declarations

This object declares the mapping from region numbers to region names. Region numbers should form a contiguous integer range beginning at 1. These names are only for visualization purposes; they will not be used for reference within the test suite. We will reference regions in other parts of the test suite specification using the region numbers.
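The contiguity requirement can be checked mechanically. The following is a small sketch, not SyntaxGym's own validation code; the function name `region_numbers_are_contiguous` is an assumption made for this example.

```python
def region_numbers_are_contiguous(region_meta):
    """Check that the keys of region_meta are exactly 1, 2, ..., n."""
    # JSON object keys are strings, so convert them to integers first.
    # A non-numeric key raises ValueError here.
    numbers = sorted(int(key) for key in region_meta)
    return numbers == list(range(1, len(numbers) + 1))


print(region_numbers_are_contiguous(
    {"1": "Subject NP", "2": "Verb", "3": "Continuation"}))  # True
print(region_numbers_are_contiguous(
    {"1": "Subject NP", "3": "Continuation"}))               # False
```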

predictions: Prediction declarations

Predictions state expected relations between the surprisal statistics of regions and conditions within each item. We represent predictions as arithmetic formula strings.

Prediction string format

The atomic units of the prediction string are region references, identifying a particular region instance by its region number and condition name using the following format:

(<region_number>;%<condition_name>%)

For example, the prediction in this document’s example test suite references region 2 of the mismatch condition using the string (2;%mismatch%). The surrounding parentheses are necessary for these expressions to be correctly parsed.

The user can then use these region references in symbolic arithmetic relationships to compare their associated surprisal values. Our example offers the following prediction string:

(2;%mismatch%) > (2;%match%)

stating that the total surprisal in region 2 of the mismatch condition should be greater than the total surprisal in region 2 of the match condition.
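As an illustration of the region reference syntax, the sketch below extracts references from a prediction string with a regular expression. The pattern and the helper `find_region_refs` are assumptions made for this example and are not taken from the SyntaxGym parser; the pattern also accepts the asterisk form covered in the formal definition.

```python
import re

# "(" region number or "*", ";", "%" condition name "%", ")"
REGION_REF = re.compile(r"\((\d+|\*);%([^%]+)%\)")


def find_region_refs(formula):
    """Return (region, condition_name) pairs in order of appearance."""
    refs = []
    for number, condition in REGION_REF.findall(formula):
        # Keep "*" as a string; convert ordinary region numbers to int.
        refs.append((number if number == "*" else int(number), condition))
    return refs


print(find_region_refs("(2;%mismatch%) > (2;%match%)"))
# [(2, 'mismatch'), (2, 'match')]
```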

Formal definition

Region numbers must be integers corresponding to regions previously defined in region_meta, or a single asterisk *. An asterisk indicates that the total surprisal of all regions of the sentence in the given condition should be computed.

The following operators are available:

  • Comparison operators: < and > specify hard inequality constraints. = specifies an approximate equality constraint.

  • Arithmetic operators: + and - specify float addition and subtraction, respectively. For example:

    (2;%mismatch%) - (2;%match%) > 0
    
  • Logical operators: & and | specify logical conjunction and disjunction, respectively. These can be used to coordinate multiple comparisons, for example:

    (2;%mismatch%) > (2;%match%) & (2;%match%) < (2;%mismatch%)
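The operators above can be sketched as a small recursive-descent evaluator. This is an illustrative reading of the format, not the SyntaxGym implementation. Several details are assumptions: arithmetic binds tighter than comparison, & binds tighter than |, approximate equality uses an arbitrary tolerance of 1e-6, and surprisals are passed in as a hypothetical mapping from condition name to region number to value.

```python
import math
import re

# A token is a region reference, a numeric literal, or a single operator.
TOKEN = re.compile(r"\s*(\((?:\d+|\*);%[^%]+%\)|\d+(?:\.\d+)?|[<>=&|+-])")
REF = re.compile(r"\((\d+|\*);%([^%]+)%\)")


def tokenize(formula):
    formula = formula.rstrip()
    tokens, pos = [], 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"unexpected text: {formula[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(formula, surprisals, tolerance=1e-6):
    """Evaluate a prediction; surprisals is {condition: {region: value}}."""
    tokens = tokenize(formula)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("formula ended unexpectedly")
        pos += 1
        return tokens[pos - 1]

    def atom():
        token = take()
        ref = REF.fullmatch(token)
        if ref is None:
            return float(token)  # numeric literal such as 0
        number, condition = ref.groups()
        regions = surprisals[condition]
        # "*" means the total over all regions of the condition.
        return sum(regions.values()) if number == "*" else regions[int(number)]

    def arithmetic():
        value = atom()
        while peek() in ("+", "-"):
            value = value + atom() if take() == "+" else value - atom()
        return value

    def comparison():
        left, op, right = arithmetic(), take(), arithmetic()
        if op == "<":
            return left < right
        if op == ">":
            return left > right
        if op == "=":
            return math.isclose(left, right, abs_tol=tolerance)
        raise ValueError(f"expected a comparison operator, got {op!r}")

    def conjunction():
        result = comparison()
        while peek() == "&":
            take()
            result = comparison() and result  # always parse the right side
        return result

    def disjunction():
        result = conjunction()
        while peek() == "|":
            take()
            result = conjunction() or result
        return result

    result = disjunction()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return result


# Hypothetical region-level surprisals for one item.
example = {"match": {1: 3.0, 2: 4.0, 3: 5.0},
           "mismatch": {1: 3.0, 2: 9.0, 3: 5.5}}
print(evaluate("(2;%mismatch%) - (2;%match%) > 0", example))  # True
```

A prediction is evaluated per item, so in practice a function like this would be called once for each item's surprisal table.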
    

items

The bulk of the test suite specification consists of the actual experimental items. These are represented as lists of region instances, nested within lists of conditions, nested within a list of items.

Concretely, each item in the items array contains two properties:

item_number

A unique identifying integer item number.

conditions

A list of condition objects, specified below.

Conditions

Each condition in the conditions array contains two properties:

condition_name

A reference to one of the test suite’s conditions. Each item object should have the same set of condition names.

regions

A list of region contents for this particular item and condition.
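The requirement that every item carries the same set of condition names can be checked with a few lines of Python. This is a sketch under the assumption that the first item is taken as the reference; `condition_name_mismatches` is a hypothetical helper, not a SyntaxGym function.

```python
def condition_name_mismatches(items):
    """Return the item numbers whose condition names differ from item one's."""
    if not items:
        return []
    expected = {c["condition_name"] for c in items[0]["conditions"]}
    return [
        item["item_number"]
        for item in items
        if {c["condition_name"] for c in item["conditions"]} != expected
    ]


items = [
    {"item_number": 1,
     "conditions": [{"condition_name": "match", "regions": []},
                    {"condition_name": "mismatch", "regions": []}]},
    {"item_number": 2,
     "conditions": [{"condition_name": "match", "regions": []}]},
]
print(condition_name_mismatches(items))  # [2]
```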

Regions

Each region in the regions array consists of two properties:

region_number

An integer referencing one of the test suite’s regions.

content

A string containing the region-level text content. This should be formatted as natural language (not pre-tokenized). Some regions may have no content. There should not be leading or trailing spaces in a region’s content.
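Because region content carries no leading or trailing spaces, a full sentence has to be reassembled by the consumer. The sketch below assumes that regions are joined with single spaces and that empty regions are skipped; that joining rule is an assumption of this example rather than something this page specifies, and `condition_sentence` is a hypothetical helper.

```python
def condition_sentence(condition):
    """Join a condition's non-empty region contents in region order."""
    regions = sorted(condition["regions"], key=lambda r: r["region_number"])
    # Skip empty regions so they do not introduce double spaces.
    return " ".join(r["content"] for r in regions if r["content"])


condition = {
    "condition_name": "match",
    "regions": [{"region_number": 1, "content": "The woman"},
                {"region_number": 2, "content": "plays"},
                {"region_number": 3, "content": "the guitar"}],
}
print(condition_sentence(condition))  # The woman plays the guitar
```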

Examples

The format of test suites is perhaps best learned by example. Plenty of example test suites are available on GitHub, in the codebase for a recent ACL paper using SyntaxGym.

SyntaxGym test suite schema

This schema describes the standard representation of a SyntaxGym test suite.

type: object

properties:

  • meta: Suite metadata.
    type: object
    properties:

      • name: A unique identifying name for this test suite.
        type: string

      • metric: The surprisal statistic referenced by the suite’s predictions. Currently the only supported value is "sum".
        type: string

  • predictions: A list of expected relations between surprisal statistics computed on different regions and conditions of this test suite.
    type: array
    default: []
    items:
      type: object
      properties:

        • type: The prediction type. The example suite in this document uses "formula".
          type: string

        • formula: A string representation of the prediction formula.
          type: string
          examples: (2;%mismatch%) > (2;%match%)

  • region_meta: A map from region numbers to region names. Region numbers should form a contiguous integer range beginning at 1.
    type: object
    additionalProperties: True

  • items
    type: array
    items:
      type: object
      properties:

        • item_number: A unique identifying number for this item.
          type: integer

        • conditions
          type: array
          items:
            type: object
            properties:

              • condition_name
                type: string

              • regions: A list of region contents for this particular item and condition.
                type: array
                items:
                  type: object
                  properties:

                    • region_number: Should correspond to a key of region_meta.
                      type: integer

                    • content: The string text content of this region. May be empty. Should not contain leading or trailing spaces.
                      type: string
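For readers who prefer a machine-readable form, the schema listing above can be transcribed into a JSON-Schema-style Python dictionary. The transcription below is a reconstruction from this page, not the official schema file, and the small `type_errors` checker is a hypothetical stand-in that only understands the `type`, `properties`, and `items` keywords. It checks types of properties that are present and does not enforce required fields.

```python
REGION = {"type": "object",
          "properties": {"region_number": {"type": "integer"},
                         "content": {"type": "string"}}}

CONDITION = {"type": "object",
             "properties": {"condition_name": {"type": "string"},
                            "regions": {"type": "array", "items": REGION}}}

SCHEMA = {
    "type": "object",
    "properties": {
        "meta": {"type": "object",
                 "properties": {"name": {"type": "string"},
                                "metric": {"type": "string"}}},
        "predictions": {"type": "array", "default": [],
                        "items": {"type": "object",
                                  "properties": {"type": {"type": "string"},
                                                 "formula": {"type": "string"}}}},
        "region_meta": {"type": "object", "additionalProperties": True},
        "items": {"type": "array",
                  "items": {"type": "object",
                            "properties": {
                                "item_number": {"type": "integer"},
                                "conditions": {"type": "array",
                                               "items": CONDITION}}}},
    },
}

PYTHON_TYPES = {"object": dict, "array": list, "string": str, "integer": int}


def type_errors(value, schema, path="$"):
    """Return a list of type mismatches between value and schema."""
    expected = PYTHON_TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it for "integer".
    if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
        return [f"{path}: expected {schema['type']}"]
    errors = []
    if expected is dict:
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors += type_errors(value[key], subschema, f"{path}.{key}")
    elif expected is list and "items" in schema:
        for index, element in enumerate(value):
            errors += type_errors(element, schema["items"], f"{path}[{index}]")
    return errors
```

A dictionary in this shape could also be handed to a full JSON Schema validator; the hand-written checker is used here only to keep the sketch free of third-party dependencies.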