Tutorial: Greek Syntax Queries using Lowfat and Jupyter Notebooks

This tutorial illustrates some of the kinds of queries that can be done using the Nestle 1904 Lowfat Syntax Trees and Jupyter notebooks. It is aimed at someone who knows Greek fairly well but may not have experience with query languages or programming. It uses the greeksyntax package, written to simplify the task of writing queries for this environment.

This tutorial does not cover installation. It assumes that you have installed BaseX and that the current Nestle 1904 Lowfat Syntax Trees are installed in a database called "nestle1904lowfat". It also assumes that you are running a Jupyter notebook from the labnotes subdirectory in the greek-new-testament repo from biblicalhumanities.org.

Jupyter notebooks allow headings, text, and query results to appear together. This document is a Jupyter notebook. If you have properly installed the software, you can run the queries in this notebook and see the results, or modify the queries to see different results.

Opening the Database

The following code imports the functions we need and opens the database:

from greeksyntax.lowfat import *

q = lowfat("nestle1904lowfat")

Database 'nestle1904lowfat' was opened in 1.7 ms.

Let's make sure that we have successfully opened the database using a simple query:

q.xquery("count(//book)")

'27'

If the query works, you are up and running. Let's get on with the tutorial.

Don't Try to Return the Whole Database

Jupyter limits the amount of data a query can return. Queries can return large results, even entire books, but if your query returns too much data, you will see the following error:

# This query attempts to return every word in the Greek New Testament.  Jupyter returns an error.
q.xquery("//w")

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

The solution is to write a more specific query. You will see how to do that in the following sections.
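One way to keep a result small while you are still exploring is to count first and then ask for only a slice. The sketch below wraps a query in the standard XQuery function subsequence(). The helper name `limit` is hypothetical and is not part of the greeksyntax package; it only builds a query string.

```python
def limit(query, count, start=1):
    """Wrap an XQuery expression so that only `count` items are returned.

    Relies on the standard XQuery function subsequence(). `limit` is a
    hypothetical helper, not something greeksyntax provides.
    """
    return f"subsequence({query}, {start}, {count})"


print(limit("//w", 10))            # first ten words
print(limit("//w", 10, start=11))  # the next ten words

# In the notebook, assuming `q` is the database opened above:
# q.xquery("count(//w)")       # check how large the full result would be
# q.xquery(limit("//w", 10))   # then fetch only the first ten items
```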

Book, Chapter, Verse, Word

Let's start by looking up specific texts using references to books, chapters, verses, and words.

Showing Text with `find()`

The following query returns the sentences in Matthew 5.

If you hover a mouse over a word in the query results, it displays morphological information about the word and a contextualized English gloss.

q.find(milestone("Matt.5"))

Glosses and Morphology using `interlinear()`

You can use the interlinear() function to show a table with contextualized glosses and morphology for a given word or verse:

q.interlinear(milestone("Matt.5.6"))

The `milestone()` function

The above queries use the milestone() function, which generates a query that looks for text structures corresponding to a particular reference. You can execute it by itself to see the query it generates.

milestone("Matt.5.1")

"//sentence[milestone[@id='Matt.5.1']]"

If you use the milestone() function inside q.find(), it finds the sentences specified by this query:

q.find(milestone("Matt.5.1"))
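To make the idea concrete, here is a minimal sketch that reproduces the query string shown above for a verse reference. The function name `verse_query` is hypothetical, and the sketch covers only the verse-level case whose output appears in this tutorial; the real milestone() function also handles books, chapters, and words, and may do so differently.

```python
def verse_query(reference):
    """Build the sentence query shown above for a verse reference.

    Hypothetical sketch: it only mirrors the one output shown in this
    tutorial and is not the greeksyntax implementation.
    """
    return f"//sentence[milestone[@id='{reference}']]"


print(verse_query("Matt.5.1"))
```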

Milestones have the following structure:

Matt - an entire book
Matt.5 - a chapter
Matt.5.36 - a verse
Matt.5.36!4 - a word
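If you want to take a reference apart in your own code, the four forms listed above can be parsed with a short regular expression. This is only a sketch: `Milestone` and `parse_milestone` are hypothetical names, and the pattern assumes book abbreviations made of letters with an optional leading digit (such as 1Cor).

```python
import re
from typing import NamedTuple, Optional


class Milestone(NamedTuple):
    book: str
    chapter: Optional[int] = None
    verse: Optional[int] = None
    word: Optional[int] = None


# book, then optional .chapter, .verse, and !word, each nested in the previous one
_PATTERN = re.compile(
    r"^(?P<book>[0-9]?[A-Za-z]+)"
    r"(?:\.(?P<chapter>\d+)"
    r"(?:\.(?P<verse>\d+)"
    r"(?:!(?P<word>\d+))?)?)?$"
)


def parse_milestone(text):
    """Split a reference such as 'Matt.5.36!4' into its parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a milestone: {text!r}")
    numbers = [
        int(value) if value is not None else None
        for value in (match["chapter"], match["verse"], match["word"])
    ]
    return Milestone(match["book"], *numbers)


for reference in ["Matt", "Matt.5", "Matt.5.36", "Matt.5.36!4"]:
    print(reference, "->", parse_milestone(reference))
```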

If you specify a large result like a chapter, it will be displayed in a scrollable window. If you specify a single word, just that word is shown:

q.find(milestone("Matt.5.6!3"))

Like the previous query, these results have tooltips so you can hover over a word with your mouse and see information about the word.

Words, Lemmas, and Morphology

Many queries are based on the characteristics of individual words. Let's look at the structure of a word in our representation. First, let's look up an individual word the way we did previously:

q.find(milestone("Matt.5.6!1"))

Words and their attributes in XML

In this tutorial, most results are presented as readable text, but words have a rich structure that contains a great deal of information. Let's use the xquery() function to see the raw structure of that same word:

q.xquery(milestone("Matt.5.6!1"))

'<w role="p" class="adj" osisId="Matt.5.6!1" n="400050060010010" lemma="μακάριος" normalized="μακάριοι" strong="3107" number="plural" gender="masculine" case="nominative" head="true" gloss="Blessed [are]">μακάριοι</w>'

If you like color, you can use the pretty() function to make that a little more readable:

pretty(q.xquery(milestone("Matt.5.6!1")))

<w role="p" class="adj" osisId="Matt.5.6!1" n="400050060010010" lemma="μακάριος" normalized="μακάριοι" strong="3107" number="plural" gender="masculine" case="nominative" head="true" gloss="Blessed [are]">μακάριοι</w>
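Because xquery() returns the word as an XML string, you can also inspect it with ordinary Python. The sketch below uses the standard library's xml.etree.ElementTree on the string shown above, so it runs without a database; in the notebook you could pass the result of q.xquery() instead of the literal.

```python
import xml.etree.ElementTree as ET

# The XML string returned by the query above, copied as a literal
raw = (
    '<w role="p" class="adj" osisId="Matt.5.6!1" n="400050060010010" '
    'lemma="μακάριος" normalized="μακάριοι" strong="3107" number="plural" '
    'gender="masculine" case="nominative" head="true" '
    'gloss="Blessed [are]">μακάριοι</w>'
)

word = ET.fromstring(raw)

print(word.tag, word.text)
# List every attribute with its value, one per line
for name, value in word.attrib.items():
    print(f"{name:12} {value}")
```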

Queries on words and their attributes

With XPath and XQuery, the structure of the XML tells you how to write a query. The following list describes most of the attributes of the word shown above and, where useful, gives a query that uses them.

<w> - Each word is wrapped in a w element. You can count the words in the Greek New Testament with this query: count(//w).
class - the part of speech of the word, in this case adj (adjective). A verb has class="verb", so you can count the verbs in the Greek New Testament with this query: count(//w[@class='verb']), which counts the w elements whose class attribute has the value verb.
role - the grammatical role of the word within its clause; here the value p means predicate. Not all words have roles: sometimes the role is given to a group of words rather than to individual words, and some words, such as conjunctions, do not have clausal roles. You can count individual words that occur as predicates using this query: count(//w[@role='p']).
osisId - the milestone for the individual word. You can find this word using the following query: //w[@osisId='Matt.5.6!1'].
n - an integer that can be used to sort words into sentence order.
lemma - the dictionary form of the word. You can look up other instances of this word with this query: //w[@lemma='μακάριος'].
normalized - a "normalized" form of the word that ignores changes in accent due to phonological context such as position in the sentence or the presence of clitics. You can look up other instances of this normalized form with this query: //w[@normalized='μακάριοι'].
strong - a Strong's number.
number, gender, case, etc - morphology of the word. You can look up other adjectives that are plural, masculine, and nominative using this query: //w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative'].
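Since every condition in these queries has the same shape, the query string can be assembled from a table of attribute names and values. The helper below is a hypothetical convenience, not part of greeksyntax, and it assumes the values contain no apostrophes.

```python
def word_query(conditions):
    """Build a //w query from a dict of attribute names and required values.

    Hypothetical helper. Assumes no value contains an apostrophe, since
    values are placed inside single-quoted XPath strings.
    """
    if not conditions:
        return "//w"
    predicate = " and ".join(
        f"@{name}='{value}'" for name, value in conditions.items()
    )
    return f"//w[{predicate}]"


print(word_query({"lemma": "μακάριος"}))
print(word_query({
    "class": "adj",
    "number": "plural",
    "gender": "masculine",
    "case": "nominative",
}))
```

A dict is used rather than keyword arguments because class is a reserved word in Python.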

For more documentation on this format, see the Lowfat documentation.

You can play with the queries shown above by creating new cells with the + button in the menu bar and putting your conditions in a string like this:

query = "//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']"

We can search for all instances by calling q.find() like this:

q.find(query)

Scoping Queries with Milestones

Sometimes you want to narrow the results of a query to a particular book or chapter. To do that, put a call to milestone() in front of the query; the two strings are joined into a single query. Let's look for instances of our query in Matthew 5:

q.find(milestone("Matt.5") + query)
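To see why joining the two strings narrows the search, it can help to print the combined query. The sketch below uses the verse-level string shown earlier for milestone("Matt.5.1"), because that is the only milestone() output reproduced in this tutorial; the string generated for a chapter may look different.

```python
# The string shown earlier as the result of milestone("Matt.5.1")
scope = "//sentence[milestone[@id='Matt.5.1']]"

query = (
    "//w[@class='adj' and @number='plural' "
    "and @gender='masculine' and @case='nominative']"
)

# Plain string concatenation: the second // now means
# "any w element below the sentences selected by the scope"
scoped = scope + query
print(scoped)
```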

Highlighting query results with `highlight()`

The highlight() function gives more useful output for queries like this, showing the result highlighted in context of the original sentence. Let's use highlight() instead of find(), using the same query.

q.highlight(milestone("Matt.5") + query)

Showing matching items separately with `sentence()`

A similar function, sentence(), shows the matching item after the sentence. This can be useful for posting to some online forums that strip formatting.

q.sentence(milestone("Matt.5") + query)

Multiple scopes in a single cell

We can search for results in a set of scopes by specifying each one in the same cell. Let's look for instances of our query in Luke 1 and Acts 1:

q.highlight(milestone("Luke.1") + query)
q.highlight(milestone("Acts.1") + query)
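When the list of scopes grows, a loop saves repetition. The sketch below only builds the query strings; `scoped_queries` and the stand-in `fake_milestone` are hypothetical names, and the stand-in exists only so the block runs without a database. The commented lines assume that highlight() displays its own output, as the two-call cell above suggests.

```python
def scoped_queries(references, query, milestone):
    """Return one scoped query string per reference.

    `milestone` is any function that turns a reference into a query
    prefix; in the notebook, pass greeksyntax's milestone().
    """
    return [milestone(reference) + query for reference in references]


def fake_milestone(reference):
    # Stand-in for illustration only; it does NOT produce real XPath.
    return f"<scope {reference}>"


query = "//w[@class='adj']"
for xq in scoped_queries(["Luke.1", "Acts.1"], query, fake_milestone):
    print(xq)

# In the notebook, with the real function and an open database `q`:
# for xq in scoped_queries(["Luke.1", "Acts.1"], query, milestone):
#     q.highlight(xq)
```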

Syntax

Word groups in XML

Syntax is largely about exploring relationships within a clause. The @role attribute identifies these relationships. Clauses can contain other clauses and phrases in complex recursive structures.

Groups of words are found in <wg> elements ("word group"). A clause is identified by the attribute class='cl'. Like words, word groups can have role attributes that identify their role in a clause.

A simple syntax query

Let's look for clauses that function as objects of other clauses.

q.highlight(milestone("Matt.1") + "//wg[@class='cl' and @role='o']")

Combining conditions on words and word groups

Queries can combine conditions on individual words and conditions on word groups. Let's modify that query to show only clauses that contain participles and function as objects of other clauses. We will use role='v' rather than class='verb' so that we find only clauses in which the participle governs the clause.

q.highlight(milestone("Acts") + "//wg[@class='cl' and @role='o' and w[@role='v' and @mood='participle']]")
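The difference between testing role='v' and testing class='verb' is easier to see on a tiny example. The tree below is hand-built with placeholder words; it is not taken from the Lowfat trees, and the id attributes are added only to label the two clauses. The function restates the query's logic in plain Python using the standard library.

```python
import xml.etree.ElementTree as ET

# Hand-built toy tree with placeholder words (NOT real Lowfat data).
# Clause "a": the participle is the verb of the clause (role="v").
# Clause "b": the participle is the subject; the clause verb is indicative.
sample = ET.fromstring("""
<sentence>
  <wg class="cl">
    <w role="v" class="verb" mood="indicative">V1</w>
    <wg class="cl" role="o" id="a">
      <w role="v" class="verb" mood="participle">P1</w>
      <w role="o" class="noun">N1</w>
    </wg>
    <wg class="cl" role="o" id="b">
      <w role="v" class="verb" mood="indicative">V2</w>
      <w role="s" class="verb" mood="participle">P2</w>
    </wg>
  </wg>
</sentence>
""")


def object_clauses_with_participle(root, attribute, value):
    """Ids of object clauses with a child <w> that is a participle
    and whose `attribute` equals `value`."""
    found = []
    for group in root.iter("wg"):
        if group.get("class") == "cl" and group.get("role") == "o":
            # findall("w") looks at direct children only, like w[...] in the query
            if any(
                w.get(attribute) == value and w.get("mood") == "participle"
                for w in group.findall("w")
            ):
                found.append(group.get("id"))
    return found


print(object_clauses_with_participle(sample, "role", "v"))      # ['a']
print(object_clauses_with_participle(sample, "class", "verb"))  # ['a', 'b']
```

Testing class='verb' also picks up clause b, where the participle is present but does not govern the clause.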

Phrases

Word groups can also represent phrases of various kinds (see this documentation).

Let's look for prepositional phrases that contain the word πίστις:

q.highlight(milestone("Acts") + "//wg[@class='pp' and .//w[@lemma='πίστις']]")

Let's narrow that to prepositional phrases where the preposition is ἐν. Let's also broaden the scope, looking for all instances in the Greek New Testament instead of specifying a milestone.

q.highlight("//wg[@class='pp' and w[@lemma='ἐν'] and .//w[@lemma='πίστις']]")

Now let's narrow these results further, showing only phrases where πίστις occurs in the same word group as ἐν or the word group immediately below it.

q.highlight("//wg[@class='pp' and w[@lemma='ἐν'] and (w, wg/w)[@lemma='πίστις']]")
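The three path steps used in these queries differ only in how deep they look: w is a direct child, wg/w is one group further down, and .//w is any depth. The sketch below checks this on three hand-built phrases using the standard library's ElementTree. The groupings are simplified for illustration and are not the actual Lowfat analysis of any verse.

```python
import xml.etree.ElementTree as ET

# Three hand-built prepositional phrases (simplified, NOT real Lowfat data):
# πίστις as a child, as a grandchild, and deeper still.
sample = ET.fromstring("""
<phrases>
  <wg class="pp">
    <w lemma="ἐν">ἐν</w>
    <w lemma="πίστις">πίστει</w>
  </wg>
  <wg class="pp">
    <w lemma="ἐν">ἐν</w>
    <wg class="np">
      <w lemma="ὁ">τῇ</w>
      <w lemma="πίστις">πίστει</w>
    </wg>
  </wg>
  <wg class="pp">
    <w lemma="ἐν">ἐν</w>
    <wg class="np">
      <w lemma="ὁ">τῷ</w>
      <w lemma="λόγος">λόγῳ</w>
      <wg class="np">
        <w lemma="ὁ">τῆς</w>
        <w lemma="πίστις">πίστεως</w>
      </wg>
    </wg>
  </wg>
</phrases>
""")


def text_of(group):
    """The words of a group in document order, joined by spaces."""
    return " ".join(w.text for w in group.iter("w"))


def near(pp):
    # Same idea as (w, wg/w)[@lemma='πίστις']: child or grandchild only
    return bool(
        pp.findall("w[@lemma='πίστις']") or pp.findall("wg/w[@lemma='πίστις']")
    )


def anywhere(pp):
    # Same idea as .//w[@lemma='πίστις']: any depth below the phrase
    return bool(pp.findall(".//w[@lemma='πίστις']"))


phrases = sample.findall("wg[@class='pp']")
print([text_of(pp) for pp in phrases if anywhere(pp)])
print([text_of(pp) for pp in phrases if near(pp)])
```

The first list contains all three phrases; the second drops the one in which πίστις sits two groups below the preposition.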

Next Steps

This is only an introductory tutorial showing a small number of queries. It is meant to whet your appetite, to inspire you to think of queries that will teach you about aspects of biblical Greek you are interested in.

I plan to follow this up with more Jupyter notebooks, illustrating specific questions I would like to explore. I also expect to add more resources to the greeksyntax package. If you want to follow this work, I encourage you to follow my blog.

Output of q.interlinear(milestone("Matt.5.6")):

Word	Gloss	Class : lemma and morphology
μακάριοι	Blessed [are]	adj : μακάριος plural masculine nominative
οἱ	those	det : ὁ plural masculine nominative
πεινῶντες	hungering	verb : πεινάω plural masculine nominative present active participle
καὶ	and	conj : καί
διψῶντες	thirsting for	verb : διψάω plural masculine nominative present active participle
τὴν	-	det : ὁ singular feminine accusative
δικαιοσύνην,	righteousness	noun : δικαιοσύνη singular feminine accusative
ὅτι	for	conj : ὅτι
αὐτοὶ	they	pron : αὐτός plural masculine nominative
χορτασθήσονται.	will be filled	verb : χορτάζω plural future passive indicative