**Important**: If you are reading this in GitHub the results are not shown. Please view it using this link: Tutorial: Greek Syntax Queries using Lowfat and Jupyter Notebooks.
This tutorial illustrates some of the kinds of queries that can be done using the Nestle 1904 Lowfat Syntax Trees and Jupyter notebooks. It is aimed at someone who knows Greek fairly well but may not have experience with query languages or programming. It uses the greeksyntax package, written to simplify the task of writing queries for this environment.
This tutorial does not cover installation. It assumes that you have installed BaseX and that the current Nestle 1904 Lowfat Syntax Trees are installed in a database called "nestle1904lowfat". It also assumes that you are running a Jupyter notebook from the labnotes subdirectory in the greek-new-testament repo from biblicalhumanities.org.
Jupyter notebooks allow headings, text, and query results to appear together. This document is a Jupyter notebook. If you have properly installed the software, you can run the queries in this notebook and see the results, or modify the queries to see different results.
The following code imports the functions we need and opens the database:
from greeksyntax.lowfat import *
q = lowfat("nestle1904lowfat")
Let's make sure that we have successfully opened the database using a simple query:
q.xquery("count(//book)")
If the query works, you are up and running. Let's get on with the tutorial.
You should be aware that Jupyter limits the amount of data a query can return. Queries can return large results, even entire books, but if your query returns too much data, you will see the following error:
# This query attempts to return every word in the Greek New Testament. Jupyter returns an error.
q.xquery("//w")
The solution is to write a more specific query. You will see how to do that in the following sections.
Let's start by looking up specific texts using references to books, chapters, verses, and words.
find()
The following query returns the sentences in Matthew 5.
If you hover a mouse over a word in the query results, it displays morphological information about the word and a contextualized English gloss.
q.find(milestone("Matt.5"))
interlinear()
You can use the interlinear() function to show a table with contextualized glosses and morphology for a given word or verse:
q.interlinear(milestone("Matt.5.6"))
milestone() function
The above queries use the milestone() function, which generates a query that looks for text structures corresponding to a particular reference. You can execute it by itself to see the query it generates.
milestone("Matt.5.1")
If you use the milestone() function inside of q.find(), it finds the sentences specified by this query:
q.find(milestone("Matt.5.1"))
Milestones have the following structure:
Matt - an entire book
Matt.5 - a chapter
Matt.5.36 - a verse
Matt.5.36!4 - a word
If you specify a large result like a chapter, it will be displayed in a scrollable window. If you specify a single word, just that word is shown:
q.find(milestone("Matt.5.6!3"))
Like the previous query, these results have tooltips so you can hover over a word with your mouse and see information about the word.
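The milestone forms listed above follow a simple pattern: a book name, then an optional chapter, verse, and word number. Here is a minimal sketch of that pattern in plain Python. The names Milestone, parse_milestone, and level are hypothetical helpers written for this illustration. They are not part of the greeksyntax package and they do not query the database.

```python
import re
from typing import NamedTuple, Optional


class Milestone(NamedTuple):
    book: str
    chapter: Optional[int] = None
    verse: Optional[int] = None
    word: Optional[int] = None


# Book name, then optional ".chapter", ".verse", and "!word" parts, in that order.
_MILESTONE = re.compile(r"([^.!]+)(?:\.(\d+)(?:\.(\d+)(?:!(\d+))?)?)?")


def parse_milestone(ref: str) -> Milestone:
    """Split a reference such as 'Matt.5.36!4' into its parts (hypothetical helper)."""
    match = _MILESTONE.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a milestone reference: {ref!r}")
    book, *numbers = match.groups()
    return Milestone(book, *(None if n is None else int(n) for n in numbers))


def level(ref: str) -> str:
    """Name the most specific unit the reference points at."""
    parsed = parse_milestone(ref)
    if parsed.word is not None:
        return "word"
    if parsed.verse is not None:
        return "verse"
    if parsed.chapter is not None:
        return "chapter"
    return "book"


for ref in ["Matt", "Matt.5", "Matt.5.36", "Matt.5.36!4"]:
    print(ref, "->", level(ref))
```

A reference that skips a level, such as Matt.5!4, is rejected by this sketch. Whether milestone() itself accepts such a form is not something this tutorial states.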
Many queries are based on the characteristics of individual words. Let's look at the structure of a word in our representation. First, let's look up an individual word the way we did previously:
q.find(milestone("Matt.5.6!1"))
In this tutorial, most results are presented as readable text, but words have a rich structure that contains a great deal of information. Let's use the xquery() function to see the raw structure of that same word:
q.xquery(milestone("Matt.5.6!1"))
If you like color, you can use the pretty() function to make that a little more readable:
pretty(q.xquery(milestone("Matt.5.6!1")))
With XPath and XQuery, the structure of the XML tells you how to write a query. The following list shows each attribute associated with a word and provides an equivalent query.
<w> - Each word is wrapped in a w element. You can count the words in the Greek New Testament with this query: count(//w).
class="verb" - this word is a verb. You can count the verbs in the Greek New Testament with this query: count(//w[@class='verb']), which counts the w elements that have class attributes with the value verb.
role - the grammatical role of the word within its clause, in this case p means predicate. Not all words have roles - sometimes the role is given to a group of words rather than individual words, and some words like conjunctions do not have clausal roles. You can count individual words that occur as predicates using this query: count(//w[@role='p']).
osisId - the milestone for the individual word. You can find this word using the following query: //w[@osisId='Matt.5.6!1'].
n - an integer that can be used to sort words into sentence order.
lemma - the dictionary form of the word. You can look up other instances of this word with this query: //w[@lemma='μακάριος'].
normalized - a "normalized" form of the word that ignores changes in accent due to phonological context such as position in the sentence or the presence of clitics. You can look up other instances of this normalized form with this query: //w[@normalized='μακάριοι'].
strong - a Strong's number.
number, gender, case, etc. - morphology of the word. You can look up other adjectives that are plural, masculine, and nominative using this query: //w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative'].
For more documentation on this format, see the Lowfat documentation.
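If you want to see how attributes turn into query conditions without touching the database, here is a small sketch using Python's built-in xml.etree.ElementTree. The sample XML is hand-written for this illustration and is not database output: the attribute values, and especially the n values, are placeholders. ElementTree supports only a subset of XPath and does not understand "and", so the sketch chains predicates instead, which selects the same elements for attribute tests like these. The word_path helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real database output. Attribute values are illustrative.
SAMPLE = """
<sentence>
  <w class="verb" role="v" osisId="Matt.5.6!3" n="3" lemma="πεινάω"
     mood="participle" number="plural" gender="masculine"
     case="nominative">πεινῶντες</w>
  <w class="adj" role="p" osisId="Matt.5.6!1" n="1" lemma="μακάριος"
     normalized="μακάριοι" number="plural" gender="masculine"
     case="nominative">μακάριοι</w>
  <w class="det" osisId="Matt.5.6!2" n="2" lemma="ὁ"
     number="plural" gender="masculine" case="nominative">οἱ</w>
</sentence>
"""
root = ET.fromstring(SAMPLE.strip())


def word_path(conditions: dict) -> str:
    """Build an ElementTree path matching w elements that have all given attributes."""
    predicates = "".join(f"[@{name}='{value}']" for name, value in conditions.items())
    return ".//w" + predicates


# Roughly: //w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']
path = word_path({"class": "adj", "number": "plural",
                  "gender": "masculine", "case": "nominative"})
adjectives = root.findall(path)
print(path)
print([w.text for w in adjectives])

# The n attribute sorts words into sentence order; compare as integers, not strings.
in_order = sorted(root.iter("w"), key=lambda w: int(w.get("n")))
print(" ".join(w.text for w in in_order))
```

In the notebook itself you would pass the full XPath string, with "and", to q.find() or q.xquery(), since BaseX supports the complete language.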
You can play with the queries shown above by creating new cells with the + button in the menu bar and putting your conditions in a string like this:
query = "//w[@class='adj' and @number='plural' and @gender='masculine' and @case='nominative']"
We can search for all instances by calling q.find() like this:
q.find(query)
Sometimes you want to narrow the results of a query to a particular book or chapter. To search for instances in a given scope, we can use the milestone() function to specify the scope like this:
q.find(milestone("Matt.5") + query)
Let's look for instances of this in Matthew 5.
q.find(milestone("Matt.5") + query)
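Scoping works by putting one path in front of another: the first path selects a region and the second path, starting with //, searches only inside it. The sketch below shows that idea on a toy document with Python's xml.etree.ElementTree. The element names, the id attribute, and the scope string are invented for this illustration. They do not reflect the real Lowfat structure or the query that milestone() actually generates.

```python
import xml.etree.ElementTree as ET

# Toy document with made-up element names, used only to show how scoping works.
TOY = """
<book id="Matt">
  <chapter id="Matt.5">
    <w class="adj">μακάριοι</w>
    <w class="verb">πεινῶντες</w>
  </chapter>
  <chapter id="Matt.6">
    <w class="adj">πονηρὸς</w>
  </chapter>
</book>
"""
root = ET.fromstring(TOY.strip())

query = "//w[@class='adj']"            # what to look for
scope = ".//chapter[@id='Matt.5']"     # where to look (hypothetical scope path)

# Without a scope, the query searches the whole document.
everywhere = root.findall("." + query)

# With the scope in front, only matches inside that chapter are returned.
in_scope = root.findall(scope + query)

print(len(everywhere), [w.text for w in everywhere])
print(len(in_scope), [w.text for w in in_scope])
```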
highlight()
The highlight() function gives more useful output for queries like this, showing the result highlighted in context of the original sentence. Let's use highlight() instead of find(), using the same query.
q.highlight(milestone("Matt.5") + query)
sentence()
A similar function, sentence(), shows the matching item after the sentence. This can be useful for posting to some online forums that strip formatting.
q.sentence(milestone("Matt.5") + query)
We can search for results in a set of scopes by specifying each one in the same cell. Let's look for instances of our query in Luke 1 and Acts 1:
q.highlight(milestone("Luke.1") + query)
q.highlight(milestone("Acts.1") + query)
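If the list of scopes grows, it may be easier to build the scoped queries in a loop. The sketch below assumes only what the cells above show, namely that a scope and a query are strings joined with +. The scoped_queries helper is hypothetical, and the stand-in scope function exists only so the example runs without a database.

```python
from typing import Callable, Iterable, List


def scoped_queries(refs: Iterable[str], query: str,
                   make_scope: Callable[[str], str]) -> List[str]:
    """Return one scoped query string per reference (hypothetical helper)."""
    return [make_scope(ref) + query for ref in refs]


# Stand-in for milestone(), used here only so the sketch runs on its own.
def fake_scope(ref: str) -> str:
    return f"<scope for {ref}>"


query = "//w[@class='adj']"
for xq in scoped_queries(["Luke.1", "Acts.1"], query, fake_scope):
    print(xq)
```

In the notebook you would pass milestone in place of fake_scope and call q.highlight(xq) inside the loop. Whether each call displays its result when run from a loop is an assumption worth checking in your own environment.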
Syntax is largely about exploring relationships within a clause. The @role attribute identifies these relationships. Clauses can contain other clauses and phrases in complex recursive structures.
Groups of words are found in <wg> elements ("word group"). A clause is identified by the attribute class='cl'. Like words, word groups can have role attributes that identify their role in a clause.
Let's look for clauses that function as objects of other clauses.
q.highlight(milestone("Matt.1") + "//wg[@class='cl' and @role='o']")
Queries can combine conditions on individual words and conditions on word groups. Let's modify that query to show only clauses that contain participles and function as objects of other clauses. We will use role='v' rather than class='verb' so that we find only clauses in which the participle governs the clause.
q.highlight(milestone("Acts") + "//wg[@class='cl' and @role='o' and w[@role='v' and @mood='participle']]")
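The condition w[@role='v' and @mood='participle'] looks only at words that are direct children of the clause, which is why a participle buried in a subordinate clause does not count. Here is a sketch of that distinction on a toy tree, using Python's xml.etree.ElementTree. The tree is hand-written and the id attributes are invented labels for this example. ElementTree cannot express the nested condition directly, so the check on the child words is written in Python.

```python
import xml.etree.ElementTree as ET

# Hand-written toy tree; the id attributes are labels invented for this example.
TOY = """
<wg class="cl" id="main">
  <w role="v" class="verb" mood="indicative"/>
  <wg class="cl" role="o" id="a">
    <w role="v" class="verb" mood="participle"/>
  </wg>
  <wg class="cl" role="o" id="b">
    <w role="v" class="verb" mood="indicative"/>
    <wg class="cl" role="adv" id="c">
      <w role="v" class="verb" mood="participle"/>
    </wg>
  </wg>
</wg>
"""
root = ET.fromstring(TOY.strip())


def governed_by_participle(clause) -> bool:
    """True if a direct child word is the clause's verb and is a participle."""
    return any(w.get("role") == "v" and w.get("mood") == "participle"
               for w in clause.findall("w"))      # "w" means direct children only


# Clauses that function as objects: //wg[@class='cl' and @role='o']
object_clauses = root.findall(".//wg[@class='cl'][@role='o']")
matches = [c.get("id") for c in object_clauses if governed_by_participle(c)]

print([c.get("id") for c in object_clauses])
print(matches)
```

Clause b contains a participle, but only inside the adverbial clause c, so it is not governed by that participle and is left out.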
Word groups can also represent phrases of various kinds (see this documentation).
Let's look for prepositional phrases that contain the word πίστις:
q.highlight(milestone("Acts") + "//wg[@class='pp' and .//w[@lemma='πίστις']]")
And let's narrow that to prepositional phrases where the preposition is ἐν. But let's also broaden the scope, looking for all instances in the Greek New Testament instead of specifying a milestone.
q.highlight("//wg[@class='pp' and w[@lemma='ἐν'] and .//w[@lemma='πίστις']]")
Now let's narrow these results further, showing only phrases where πίστις occurs in the same word group as ἐν or the word group immediately below it.
q.highlight("//wg[@class='pp' and w[@lemma='ἐν'] and (w, wg/w)[@lemma='πίστις']]")
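The difference between the last two queries is how deep they look: .//w reaches every word below the phrase, while (w, wg/w) reaches only the words in the phrase itself and in word groups directly inside it. The sketch below illustrates that on a hand-written toy tree with Python's xml.etree.ElementTree. The id attributes and the helper names are invented for this example, and the tree is not taken from the database.

```python
import xml.etree.ElementTree as ET

EN, PISTIS = "ἐν", "πίστις"

# Hand-written toy tree; ids are labels invented for this example.
TOY = f"""
<sample>
  <wg class="pp" id="same-group">
    <w lemma="{EN}"/><w lemma="{PISTIS}"/>
  </wg>
  <wg class="pp" id="one-below">
    <w lemma="{EN}"/>
    <wg class="np"><w lemma="ὁ"/><w lemma="{PISTIS}"/></wg>
  </wg>
  <wg class="pp" id="deeper">
    <w lemma="{EN}"/>
    <wg class="np"><w lemma="ὁ"/><wg class="np"><w lemma="{PISTIS}"/></wg></wg>
  </wg>
</sample>
"""
root = ET.fromstring(TOY.strip())


def has_lemma(words, lemma) -> bool:
    return any(w.get("lemma") == lemma for w in words)


def broad(pp) -> bool:
    # Like: w[@lemma='ἐν'] and .//w[@lemma='πίστις']
    return has_lemma(pp.findall("w"), EN) and has_lemma(pp.iter("w"), PISTIS)


def narrow(pp) -> bool:
    # Like: w[@lemma='ἐν'] and (w, wg/w)[@lemma='πίστις']
    nearby = pp.findall("w") + pp.findall("wg/w")
    return has_lemma(pp.findall("w"), EN) and has_lemma(nearby, PISTIS)


phrases = root.findall(".//wg[@class='pp']")
broad_ids = [pp.get("id") for pp in phrases if broad(pp)]
narrow_ids = [pp.get("id") for pp in phrases if narrow(pp)]
print(broad_ids)
print(narrow_ids)
```

The broad test accepts all three phrases, while the narrow test drops the one where πίστις sits two levels down.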
This is only an introductory tutorial showing a small number of queries. It is meant to whet your appetite, to inspire you to think of queries that will teach you about aspects of biblical Greek you are interested in.
I plan to follow this up with more Jupyter notebooks, illustrating specific questions I would like to explore. I also expect to add more resources to the greeksyntax package. If you want to follow this work, I encourage you to follow my blog.