JSON in Praat
Some time ago, I was asked to set up a web service that received some data on a web request, passed it to a Praat process, and got some data out of it. Particularly since the advent of the so-called “barren” releases of Praat, the first part of that task has become a lot easier to do. But that still leaves us with the problem of importing and exporting data to and from Praat.
Probably the most widespread data serialisation format out there is JSON, and with good reason: it is powerful and simple, and easy to parse because all of the relations between the different data items are expressed explicitly using a relatively small set of characters: [] for lists, {} for maps, "" for strings, and : for assignments.
Here’s how it looks when pretty-printed:
{
  "family": {
    "parents": [
      "Homer",
      "Marge"
    ],
    "children": [
      "Bart",
      "Lisa",
      "Maggie"
    ]
  }
}
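For comparison, this is roughly what handling that document looks like in a language that does have a JSON library. It is a minimal Python sketch using the standard json module; the variable names are only illustrative.

```python
import json

text = '{"family": {"parents": ["Homer", "Marge"], "children": ["Bart", "Lisa", "Maggie"]}}'

# Maps become dicts and lists become lists, nested to any depth.
data = json.loads(text)
children = data["family"]["children"]

# indent=2 produces a pretty-printed layout with one item per line.
pretty = json.dumps(data, indent=2)
print(pretty)
```

Getting from the string to a nested structure and back takes one call each way, which is the convenience that was missing in Praat.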
Although there are JSON libraries for most languages out there, there was none for Praat, which makes data transport difficult and turns Praat into a lonely kid at the playground [1]. So I had to go and write my own.
I give you the appropriately named “json” plugin for Praat:
#!/usr/bin/env praat

# Procedures are in separate files for parsing and producing JSON
include ../../plugin_json/procedures/read.proc
include ../../plugin_json/procedures/write.proc

form Args...
  sentence Path
endform

file$ = readFile$(path$)
@read_json: file$

# ... Do something with the read data

@write_json()
In the above snippet, lines 4 and 5 (the two include statements) “import” the library, which provides the two main procedures of the plugin: @read_json and @write_json. The first is put to use in line 12, to convert a JSON string into a data structure, while the second is used in line 16, to print the selected data structure to STDOUT.
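On the web service side, a script like this can be driven from any language that can start a process. The following is a hedged Python sketch, not part of the plugin: it assumes a praat binary on the PATH that accepts the --run option, and the script and file names passed to it are hypothetical.

```python
import json
import subprocess


def build_command(script, json_path, praat="praat"):
    # "--run" asks Praat to execute the script without opening the GUI;
    # arguments after the script name fill in its form (here: Path).
    return [praat, "--run", script, json_path]


def run_praat(script, json_path):
    # Assumes the script prints exactly one JSON document to STDOUT,
    # as the snippet above does with @write_json().
    result = subprocess.run(
        build_command(script, json_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

A request handler would write the incoming data to a temporary file, call run_praat with the path to that file, and return the parsed result.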
Below, I’ll go into a little more detail about how this all works.
Representing data structures
One of the first problems had to do with representing data structures in Praat.
Although JSON uses fairly straightforward data structures (lists and maps), neither exists as such in Praat. True: for some time now there have been the so-called indexed variables, and for a couple of versions now these have allowed strings as keys. However, these are still sets of unrelated variables, with nothing in common but part of their identifiers.
So there is no way to pass a whole set of indexed variables to a procedure, or to list the keys of a set that is indexed by strings. For the purpose of what we need here, that makes them useless.
Instead, the json plugin makes use of two existing Praat objects to represent these data structures: Table objects represent objects (or dictionaries, or hashes, etc.), while Strings objects represent lists.
The strutils plugin already includes some procedures to handle Strings objects as arrays, and implements some common methods like @push, which adds an element at the end of the list; @pop, which takes the last element from the list; @slice, which extracts a set of elements from the list; etc.
Similarly, the json plugin includes a hash.proc file with procedures that standardise the handling of Table objects as dictionaries: they can be created with the @hash procedure, and new values can be added with the @keyval and @keyval$ procedures, which add numeric and string values respectively.
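To make the mapping concrete, here is a small Python model of the idea. It is not the plugin's Praat code: the class and method names are hypothetical and only loosely mirror @hash, @keyval, @push, @pop and @slice, and the choice to overwrite an existing key is an assumption of this sketch.

```python
class Table:
    """Models a Praat Table used as a dictionary: one row per key."""

    def __init__(self):
        self.rows = []  # [key, value] pairs; values are kept as strings

    def keyval(self, key, value):
        # Assumption: setting an existing key overwrites its value.
        for row in self.rows:
            if row[0] == key:
                row[1] = str(value)
                return
        self.rows.append([key, str(value)])

    def get(self, key):
        for k, v in self.rows:
            if k == key:
                return v
        raise KeyError(key)

    def keys(self):
        return [k for k, _ in self.rows]


class Strings:
    """Models a Praat Strings object used as a list."""

    def __init__(self):
        self.items = []

    def push(self, value):
        self.items.append(str(value))

    def pop(self):
        return self.items.pop()

    def slice(self, start, end):
        return self.items[start:end]


person = Table()
person.keyval("name", "Lisa")
person.keyval("age", 8)

kids = Strings()
for name in ("Bart", "Lisa", "Maggie"):
    kids.push(name)
```

Unlike a set of indexed variables, each of these is a single object: it can be handed to a procedure as a whole, and its keys or items can be enumerated.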
Complex data structures
This is all well and good when dealing with flat lists or dictionaries. But even a case like the first example above already breaks this: there is a top-level dictionary with a single key, whose value is another dictionary with two keys, each of which holds a list of strings. So it’s not enough to be able to represent dictionaries and lists: we need to be able to have dictionaries of dictionaries of lists, and so on.
For this, the json plugin also includes procedures that take object IDs and insert them into the Strings and Table objects with a special marker, to let us know that the value represents a reference to an existing object.
This marker is not unlike the $ marker (called a carat) that Praat already uses for strings. In this case, the so-called “id carat” defaults to #, such that a reference to object 32 would be stored as 32#.
This has the benefit that, if converted to a number with number(), the reference turns into the correct value (32). And because the carat is stored in a separate variable that can be redefined at runtime, the user can set it to whatever fits their data (if, for example, they need the hash character as a plain character).
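The reference scheme is easy to model outside of Praat. The sketch below is Python, not the plugin's code, and the function names are hypothetical; it only illustrates how an ID and a configurable marker can be joined and split again.

```python
ID_CARAT = "#"  # default marker; the plugin lets the user redefine it


def make_reference(object_id, carat=ID_CARAT):
    return f"{object_id}{carat}"


def is_reference(value, carat=ID_CARAT):
    # A reference is a run of digits followed by the marker.
    body = value[:-len(carat)] if value.endswith(carat) else ""
    return body.isdigit()


def resolve(value, carat=ID_CARAT):
    """Return the object ID that a reference points to."""
    if not is_reference(value, carat):
        raise ValueError(f"not a reference: {value!r}")
    return int(value[:-len(carat)])
```

With the default marker, a plain string that happens to look like 32# cannot be told apart from a reference, which is exactly the situation where redefining the carat helps.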
There are other variables like this, which are defined in the vars.proc file. They allow the user to control almost all of the finer aspects of the read and write process. Take a look at that file for more information.
Parsing JSON
Parsing is done by the @read_json procedure, which takes a JSON string and processes it character by character, populating the necessary objects along the way, until it comes to the end. This is where the beauty of JSON really shines: parsing it never requires back-tracking, so it can be read properly in a single pass, which makes the process significantly simpler.
The mapping between JSON data structures and Praat objects is direct, but can at times become rather cumbersome: even the relatively simple example at the top of this post will generate two Table objects and two Strings objects!
To make this management easier, all newly created objects will be selected at the end, and the ID of the top-level object (that will hold any reference to the other objects) will be stored in the read_json.return variable.
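The single-pass approach and the resulting pile of objects can be illustrated with a simplified model. This is a Python sketch under stated assumptions, not the plugin's implementation: it ignores string escapes, does not validate numbers or literals, expects a map or list at the top level, and uses a simple counter where Praat would assign real object IDs.

```python
ID_CARAT = "#"


def read_json(text):
    """Parse text in one left-to-right pass; return (top_id, objects)."""
    objects = {}  # object ID -> ("Table", rows) or ("Strings", items)
    pos = 0       # index of the next unread character; never moves back

    def skip_space():
        nonlocal pos
        while pos < len(text) and text[pos] in " \t\r\n":
            pos += 1

    def peek():
        skip_space()
        return text[pos] if pos < len(text) else ""

    def expect(char):
        nonlocal pos
        if peek() != char:
            raise ValueError(f"expected {char!r} at position {pos}")
        pos += 1

    def parse_string():
        nonlocal pos
        expect('"')
        start = pos
        while pos < len(text) and text[pos] != '"':  # no escapes here
            pos += 1
        if pos >= len(text):
            raise ValueError("unterminated string")
        pos += 1
        return text[start:pos - 1]

    def parse_scalar():
        nonlocal pos
        skip_space()
        start = pos
        while pos < len(text) and text[pos] not in ",]} \t\r\n":
            pos += 1
        if start == pos:
            raise ValueError(f"unexpected character at position {pos}")
        return text[start:pos]  # numbers and literals kept as raw text

    def parse_container(kind, opener, closer):
        object_id = len(objects) + 1
        cells = []
        objects[object_id] = (kind, cells)
        expect(opener)
        while peek() != closer:
            if cells:
                expect(",")
            if kind == "Table":
                key = parse_string()
                expect(":")
                cells.append((key, parse_value()))
            else:
                cells.append(parse_value())
        expect(closer)
        # The parent stores a reference, not the nested data itself.
        return f"{object_id}{ID_CARAT}"

    def parse_value():
        char = peek()
        if char == "{":
            return parse_container("Table", "{", "}")
        if char == "[":
            return parse_container("Strings", "[", "]")
        if char == '"':
            return parse_string()
        return parse_scalar()

    top = parse_value()
    if peek() != "":
        raise ValueError(f"trailing characters at position {pos}")
    return int(top[:-len(ID_CARAT)]), objects


doc = '{"family": {"parents": ["Homer", "Marge"], "children": ["Bart", "Lisa", "Maggie"]}}'
top_id, objects = read_json(doc)
```

For the family example, this model ends up with four entries: two of type Table and two of type Strings, with the top-level Table holding a reference to the second one. The returned top_id plays the role that read_json.return plays in the plugin.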
Producing JSON
But the reason the plugin was written in the first place was not so much to read JSON data into Praat, but to have a way to reliably write fully compliant JSON strings from Praat data.
This is done with the @write_json procedure, which acts on a single selected object and chooses how to proceed based on the type of that object.
In particular, @write_json will call a procedure called <TYPE>_to_json, where <TYPE> is the lowercase type of the selected object. That procedure is expected to turn the relevant data in the object into usable JSON.
The “json” plugin already includes two such procedures: @table_to_json and @strings_to_json, which take care of Table and Strings objects respectively. But this approach allows experienced users to write their own procedures that will handle specific objects in the way that best suits their application, and indeed to completely redefine the way Table and Strings objects are serialised, if they so desire.
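The dispatch-by-type-name idea can be sketched in a few lines. Again, this is a Python model rather than the plugin's code: the object store is built by hand, the helper cell_to_json is hypothetical, and every non-reference value is written as a JSON string for simplicity.

```python
import json

ID_CARAT = "#"

# A hand-built store: object ID -> (type, contents).
objects = {
    1: ("Table", [("family", "2#")]),
    2: ("Table", [("parents", "3#"), ("children", "4#")]),
    3: ("Strings", ["Homer", "Marge"]),
    4: ("Strings", ["Bart", "Lisa", "Maggie"]),
}


def cell_to_json(cell):
    # Follow references to other objects; emit anything else as a string.
    body = cell[:-len(ID_CARAT)] if cell.endswith(ID_CARAT) else ""
    if body.isdigit():
        return write_json(int(body))
    return json.dumps(cell)


def table_to_json(rows):
    pairs = (f"{json.dumps(key)}: {cell_to_json(value)}" for key, value in rows)
    return "{" + ", ".join(pairs) + "}"


def strings_to_json(items):
    return "[" + ", ".join(cell_to_json(item) for item in items) + "]"


def write_json(object_id):
    kind, contents = objects[object_id]
    # Look the handler up by name, e.g. "Table" -> table_to_json.
    handler = globals().get(f"{kind.lower()}_to_json")
    if handler is None:
        raise TypeError(f"no {kind.lower()}_to_json procedure defined")
    return handler(contents)


result = write_json(1)
print(result)
```

Supporting another object type in this model only requires defining one more function with the matching name, and redefining table_to_json changes how every dictionary is written.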
Closing
The solution provided by the “json” plugin is far from ideal. In particular, there are a lot of elements that were necessary (like object references), and that had to be forced into the language by re-using existing features. But it works!
And indeed, it works remarkably well: it passes all of the standard test cases in the JSON test suite, and is currently in use in several production-ready web applications that are now able to pass information to and from Praat in a more standard format.
Eventually, I have no doubt that this feature will be included in Praat using a more standard library, and this plugin will become obsolete. I look forward to that day.
But until then, the “json” plugin allows us to do more with less, and opens new doors to applications that were not possible before. It allows us to be lazy, and imaginative, and daring.
And, if nothing else, it shows us that this can be done, even in the limited and often disliked confines of the Praat scripting language.
What will you build now?
[1] If you can represent your data using a Table object, it can be serialised as a list of comma- or tab-separated values, but that only works if your data is flat and matrix-like. What about other, maybe more complex data structures?