JSON in Praat

Some time ago, I was asked to set up a web service that received some data on a web request, passed it to a Praat process, and got some data out of it. Particularly since the advent of the so-called “barren” releases of Praat, the first part of that task has become a lot easier to do. But that still leaves us with the problem of importing and exporting data to and from Praat.

Probably the most widespread data serialisation format out there is JSON, and with good reason: it is powerful yet simple, and it is easy to parse because all of the relations between the different data items are expressed explicitly using a relatively small set of characters: [] for lists, {} for maps, "" for strings, : to separate a key from its value, and , to separate items.

Here’s how it looks when pretty-printed:

{
  "family": {
    "parents": [
      "Homer",
      "Marge"
    ],
    "children": [
      "Bart",
      "Lisa",
      "Maggie"
    ]
  }
}

Although there are JSON libraries for most languages out there, Praat is not one of them, making data transport difficult and turning Praat into a lonely kid at the playground¹. So I had to go and write my own.

I give you the appropriately named “json” plugin for Praat:

#!/usr/bin/env praat

# Procedures are in separate files for parsing and producing JSON
include ../../plugin_json/procedures/read.proc
include ../../plugin_json/procedures/write.proc

form Args...
  sentence Path
endform

file$ = readFile$(path$)
@read_json: file$

# ... Do something with the read data

@write_json()

In the above snippet, lines 4 and 5 (the two include statements) “import” the library, which provides the two main procedures of the plugin: @read_json and @write_json. The former is put to use in line 12 to convert a JSON string into a data structure, while the latter is used in line 16 to print the selected data structure to STDOUT.

Below, I’ll go into a little more detail about how this all works.

Representing data structures

One of the first problems had to do with representing data structures in Praat.

Although JSON uses fairly straightforward data structures (lists and maps), neither of these exists as such in Praat. True, there have been so-called indexed variables for some time now, and for a couple of versions these have allowed using strings as keys. However, these still represent sets of unrelated variables, with nothing in common but part of their identifiers.

So there is no way to pass a whole set of indexed variables to a procedure, or to list the keys of a set that uses strings as keys. This means that, for the purposes of what we need here, they are useless.

Instead, the json plugin makes use of two existing Praat objects to represent these data structures: Table objects represent objects (or dictionaries, or hashes, etc), while Strings objects represent lists.

The strutils plugin already includes some procedures to handle Strings objects as arrays, and implements some common methods like @push, which adds an element at the end of the list; @pop, which takes the last element from the list; @slice, which extracts a set of elements from the list; etc.

Similarly, the json plugin ships a hash.proc file with procedures that standardise the handling of Table objects as dictionaries: they can be created with the @hash procedure, and new values can be added with the @keyval and @keyval$ procedures, which add numeric and string values respectively.

Complex data structures

This is all well and good when dealing with flat lists or dictionaries. But even a case like the first example above already breaks this: there is a top-level dictionary with a single key, whose value is another dictionary with two keys, each of which holds a list of strings. So it’s not enough to be able to represent dictionaries and lists: we need to be able to have dictionaries of dictionaries of lists, and so on.

For this, the json plugin also includes procedures that take object IDs and insert them into the Strings and Table objects with a special marker, which tells us that the value is a reference to an existing object. This marker, called a carat, is not unlike the $ marker already used for strings in Praat. In this case, the so-called “id carat” defaults to #, such that a reference to object 32 would be stored as 32#.

This has the benefit that, if converted to a number with number(), the reference turns into the correct value (32). And because the carat is stored in a separate variable that can be redefined at runtime, users can set it to whatever fits their data (if, for example, they need the hash character as a plain character).
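The reference convention can be modelled in a few lines. The sketch below is written in Python rather than Praat, purely as an illustration: ID_CARAT, make_ref, is_ref and numify are hypothetical names, and numify only imitates the behaviour described above (read the leading number, ignore whatever follows). It is not the plugin's own code.

```python
import re

# Hypothetical stand-in for the plugin's configurable "id carat".
ID_CARAT = "#"

def make_ref(object_id, carat=ID_CARAT):
    """Encode an object ID as a reference string, e.g. 32 becomes '32#'."""
    return f"{object_id}{carat}"

def is_ref(value, carat=ID_CARAT):
    """True if value is a run of digits followed by the carat."""
    return re.fullmatch(r"\d+" + re.escape(carat), value) is not None

def numify(value):
    """Read the leading number of a string and ignore any suffix."""
    match = re.match(r"\s*-?\d+(\.\d+)?", value)
    return float(match.group()) if match else None

ref = make_ref(32)
print(ref, is_ref(ref), numify(ref))
```

Passing a different carat argument models the case where the data itself needs the hash character as a plain character.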

There are other variables like this, which are defined in the vars.proc file. They allow the user to control almost all of the finer aspects of the read and write process. Take a look at that file for more information.

Parsing JSON

Parsing is done by the @read_json procedure, which takes a JSON string and processes it character by character, populating the necessary objects along the way, until it comes to the end. This is where the beauty of JSON really shines, because when parsing JSON it is never necessary to back-track: it can be read properly in a single go, which makes the process significantly simpler.
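To see why no back-tracking is needed, here is a minimal sketch of a single-pass parser for a small subset of JSON (maps, lists, and strings without escape sequences). It is written in Python for illustration only and does not reproduce @read_json: the function names are hypothetical, and it builds native dicts and lists instead of Praat objects. The point is that the first character of each value decides what comes next, so the position only ever moves forward.

```python
def parse(text):
    """Parse a JSON subset (maps, lists, plain strings) in one forward pass."""
    pos = 0  # only ever moves forward: no back-tracking

    def skip_space():
        nonlocal pos
        while pos < len(text) and text[pos] in " \t\r\n":
            pos += 1

    def peek():
        """Return the next non-space character without consuming it."""
        skip_space()
        return text[pos] if pos < len(text) else ""

    def expect(char):
        nonlocal pos
        if peek() != char:
            raise ValueError(f"expected {char!r} at position {pos}")
        pos += 1

    def string():
        nonlocal pos
        expect('"')
        start = pos
        while pos < len(text) and text[pos] != '"':  # escapes are not handled here
            pos += 1
        result = text[start:pos]
        expect('"')
        return result

    def items(close, read_item):
        """Read comma-separated items up to the closing character."""
        nonlocal pos
        result = []
        if peek() == close:
            pos += 1
            return result
        while True:
            result.append(read_item())
            if peek() == ",":
                pos += 1
            else:
                expect(close)
                return result

    def pair():
        key = string()
        expect(":")
        return key, value()

    def value():
        nonlocal pos
        char = peek()
        if char == "{":
            pos += 1
            return dict(items("}", pair))
        if char == "[":
            pos += 1
            return items("]", value)
        if char == '"':
            return string()
        raise ValueError(f"unexpected {char!r} at position {pos}")

    result = value()
    if peek() != "":
        raise ValueError(f"trailing data at position {pos}")
    return result

doc = '{"family": {"parents": ["Homer", "Marge"], "children": ["Bart", "Lisa", "Maggie"]}}'
family = parse(doc)
print(family["family"]["children"])
```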

The mapping between JSON data structures and Praat objects is direct, but can at times become rather cumbersome: even the relatively simple example at the top of this post will generate two Table objects and two Strings objects!

To make this management easier, all newly created objects are selected at the end, and the ID of the top-level object (which holds the references to the other objects) is stored in the read_json.return variable.
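The resulting layout can be modelled as follows. This is a hedged Python sketch, not the plugin's implementation: dicts stand in for Table objects, lists for Strings objects, the store dictionary stands in for Praat's object list, and the IDs are simply handed out in order of creation (real Praat object IDs will differ). All names are hypothetical.

```python
from itertools import count

ID_CARAT = "#"  # hypothetical stand-in for the plugin's reference marker

def flatten(data, store, ids):
    """Store data as flat objects in store; return the ID of its top-level object."""
    object_id = next(ids)
    if isinstance(data, dict):
        flat = {}
        store[object_id] = flat
        for key, item in data.items():
            flat[key] = encode(item, store, ids)
    else:
        flat = []
        store[object_id] = flat
        for item in data:
            flat.append(encode(item, store, ids))
    return object_id

def encode(item, store, ids):
    """Leave plain values alone; replace containers with an 'ID#' reference."""
    if isinstance(item, (dict, list)):
        return f"{flatten(item, store, ids)}{ID_CARAT}"
    return item

family = {"family": {"parents": ["Homer", "Marge"],
                     "children": ["Bart", "Lisa", "Maggie"]}}
store = {}
top = flatten(family, store, count(1))
for object_id, flat in store.items():
    print(object_id, type(flat).__name__, flat)
```

For the example at the top of this post, the model ends up with two dictionaries and two lists, and every other object can be reached from the top-level ID by following the references.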

Producing JSON

But the reason the plugin was written in the first place was not so much to read JSON data into Praat as to have a way to reliably write fully compliant JSON strings from Praat data.

This is done with the @write_json procedure, which will act on a single selected object and choose how to proceed based on the type of that object. In particular, @write_json will call a procedure called <TYPE>_to_json, where <TYPE> is the lowercase type of the selected object, which is expected to turn the relevant data in that object into usable JSON.

The “json” plugin already includes two such procedures: @table_to_json and @strings_to_json, which take care of Table and Strings objects respectively. But this approach allows experienced users to write their own procedures that will handle specific objects in the way that best suits their application, and indeed to completely redefine the way Table and Strings objects are serialised, if they so desire.
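The dispatch-by-type idea can be sketched like this. Again, this is Python used only for illustration, under stated assumptions: the serialisers registry, the write_json signature and the pitch_to_json procedure are all hypothetical, and the two default serialisers simply delegate to Python's json module instead of walking Praat objects.

```python
import json

def table_to_json(data):
    """Default serialiser for 'Table' objects, modelled here as dicts."""
    return json.dumps(data)

def strings_to_json(data):
    """Default serialiser for 'Strings' objects, modelled here as lists."""
    return json.dumps(data)

# Registry keyed by '<type>_to_json', with <type> in lowercase.
serialisers = {
    "table_to_json": table_to_json,
    "strings_to_json": strings_to_json,
}

def write_json(object_type, data):
    """Look up '<type>_to_json' for the given object type and call it."""
    name = f"{object_type.lower()}_to_json"
    if name not in serialisers:
        raise LookupError(f"no procedure named {name}")
    return serialisers[name](data)

print(write_json("Strings", ["Bart", "Lisa", "Maggie"]))

# A user can support a new type, or replace a default, by registering a procedure.
serialisers["pitch_to_json"] = lambda data: json.dumps({"f0": data})
print(write_json("Pitch", [110.0, 112.5]))
```

Because the lookup happens by name at call time, replacing the entry for table_to_json or strings_to_json changes how those objects are serialised without touching write_json itself.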

Closing

The solution provided by the “json” plugin is far from ideal. In particular, there are a lot of elements that were necessary (like object references), and that had to be forced into the language by re-using existing features. But it works!

And indeed, it works remarkably well: it passes all of the standard test cases in the JSON test suite, and is currently in use in several production-ready web applications that are now able to pass information to and from Praat in a more standard format.

Eventually, I have no doubt that this feature will be included in Praat using a more standard library, and this plugin will become obsolete. I look forward to that day.

But until then, the “json” plugin allows us to do more with less, and opens new doors to applications that were not possible before. It allows us to be lazy, and imaginative, and daring.

And, if nothing else, it shows us that this can be done, even in the limited and often disliked confines of the Praat scripting language.

What will you build now?

¹ If you can represent your data using a Table object, this can be serialised as a comma or tab-separated list of values, but that only works if your data is flat and matrix-like. What about other, maybe more complex data structures?