FRDB Archives

Freethought & Rationalism Archive

Old 11-29-2006, 12:33 AM   #1
Veteran Member
 
Join Date: Jul 2001
Location: the reliquary of Ockham's razor
Posts: 4,035
Information Architecture and History

I have been thinking about Early Latin Writings and how specifically to design the website. I have also been thinking more broadly, however, about information architecture and history.

As I see it, there are at least three "layers of history":

On the surface, we have the historians and the books of history. (There are even edifices built on this surface, tertiary works such as encyclopedias and journalistic prose, or even legitimately useful things like bibliographies.)

In the middle of the history strata, we have the evidence of history, the various manuscripts and archaeological finds and (in cases) oral history.

At the bottom layer, we have the people and the events and the texts, which history is ultimately about.

Here is what the bottom layer is composed of in my vision:

Code:
DATE

PLACE

EVENT
+------> DATE
+------> PLACE
+------> Optional Info
             +-------> Actors (PERSON)
             +-------> Causes (EVENT)
             +-------> Effects (EVENT)

PERSON
+------> Birth (EVENT)
+------> Death (EVENT)
+------> Optional Info
             +-------> Deeds (EVENT)
             +-------> Works (WRITING)

WRITING
+------> Author (PERSON)
+------> Provenance (EVENT)
+------> Optional Info
             +-------> Allusions and Citations (WRITING)
             +-------> Recorded Persons (PERSON)
             +-------> Recorded Events (EVENT)
             +-------> Recorded Places (PLACE)
I won't draw the structure of the middle and top layers just yet; I first want to be able to think that I've set out a good foundation.

Think of all these arrows as being like hyperlinks. When you encounter a WRITING, you may be linked to an Author (PERSON), who has a Death (EVENT), which has a DATE and a PLACE. For example, you may encounter the Gallic Wars and be linked to Gaius Julius Caesar, and his death by assassination, which has a date of 15 March 44 BC and a place of Rome. But instead of going for the date of the event "death by assassination", you might look at the other events listed as causes, or you might want to find out who exactly "Brutus" was in that famous line. And so it goes.
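The arrows in the diagram can be sketched as ordinary object references. This is only a minimal illustration, assuming Python dataclasses; the class and field names are taken from the diagram, DATE and PLACE are kept as plain strings for brevity, and nothing here is meant as the final schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    name: str
    date: Optional[str] = None    # DATE, plain text in this sketch
    place: Optional[str] = None   # PLACE, plain text in this sketch
    actors: List["Person"] = field(default_factory=list)
    causes: List["Event"] = field(default_factory=list)
    effects: List["Event"] = field(default_factory=list)


@dataclass
class Person:
    name: str
    birth: Optional[Event] = None
    death: Optional[Event] = None
    deeds: List[Event] = field(default_factory=list)
    works: List["Writing"] = field(default_factory=list)


@dataclass
class Writing:
    title: str
    author: Optional[Person] = None
    provenance: Optional[Event] = None
    citations: List["Writing"] = field(default_factory=list)
    recorded_persons: List[Person] = field(default_factory=list)
    recorded_events: List[Event] = field(default_factory=list)
    recorded_places: List[str] = field(default_factory=list)


# Follow the "hyperlinks": WRITING -> Author -> Death -> DATE / PLACE / Actors.
brutus = Person("Marcus Junius Brutus")
caesar = Person("Gaius Julius Caesar")
caesar.death = Event("death by assassination", date="15 March 44 BC",
                     place="Rome", actors=[brutus])
gallic_wars = Writing("Gallic Wars", author=caesar)
caesar.works.append(gallic_wars)

death = gallic_wars.author.death
print(death.date, "|", death.place, "|", [p.name for p in death.actors])
```

Because the records point at each other, printing a whole object can recurse; the sketch only prints individual fields.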

Okay, so, who would like to correct and/or add to what I've said about information architecture and history?

Who has any ideas for me on how I can make a useful resource with knowledge of this kind of structure? (Such as a book or, more likely, a website?)

For example, we will want some way to sort "EVENTS" other than by Date, Place, and Actor(s)...won't we want to know whether this was an event of a certain category, like a war or a birth or an intellectual creation? How are those kinds of tags to be included with the data?
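One possible answer to the category question, sketched under assumptions: each EVENT carries a free set of category tags, and a small index maps each tag to the events that have it. The class name `EventCatalogue` and the tag vocabulary are made up for illustration.

```python
from collections import defaultdict


class EventCatalogue:
    """Maps category tags to the names of events that carry them."""

    def __init__(self):
        self._by_category = defaultdict(set)

    def add(self, event_name, categories):
        for category in categories:
            self._by_category[category].add(event_name)

    def find(self, *categories):
        """Return the events tagged with every one of the given categories."""
        if not categories:
            return set()
        matches = [self._by_category.get(c, set()) for c in categories]
        return set.intersection(*matches)


catalogue = EventCatalogue()
catalogue.add("Assassination of Julius Caesar", {"death", "politics"})
catalogue.add("Battle of Alesia", {"war"})
catalogue.add("Composition of the Gallic Wars", {"intellectual creation"})

print(catalogue.find("death"))         # events tagged "death"
print(catalogue.find("death", "war"))  # events tagged with both
```

Keeping the tags outside the EVENT record itself means new categories can be added later without touching the Date/Place/Actor structure.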

regards,
Peter Kirby
Peter Kirby is online now
Old 11-29-2006, 08:37 PM   #2
Veteran Member
 
Join Date: Jul 2001
Location: the reliquary of Ockham's razor
Posts: 4,035

ZzzZzzz...bored yet? Ha! Information architecture is boring. Add history and you've got boring + boring. Well...

Did you know that these techniques will reveal to us whether Jesus existed? Read on!

I am now thinking about how this data will be entered into the database. I don't think it would be entirely acceptable to build it up from a bunch of forms where one manually enters an "Event" with a Place and a Date and then moves on to the next Event to enter, ad nauseam...real historians don't do database construction and data entry. They write history.

Rather, I think we need a kind of markup language for historical writing, one that allows the computer to extract from it the information it needs to build the database. I would call it "Historical Text Markup Language" but the acronym is taken. How about PDML, "Place-Date Markup Language," since the fundamental units are the Place and the Date? (A person is a combination of a birthdate, birthplace, deathdate, and deathplace, along with a bunch of optional dates and places where that person did something.)

Picking a book off my shelf and a passage off that book arbitrarily, I get the following entry from the Oxford Dictionary of the Bible, 55:

Quote:
Caiaphas Son-in-law and successor of Annas, high priest in Jerusalem. He held office from 18 to 37 CE, but the statement in John (18:13) that he was high priest 'that year', combined with the reference to a preliminary investigation by Annas (John 18:13-24) has suggested the possibility of the high priesthood being held for one year only at a time. But, more probably, John's meaning is that Caiaphas was the high priest 'in that memorable year of the crucifixion'. An ossuary with an Aramaic inscription thought to mean 'Joseph, son of Caiaphas' was found in 1990 in the Caiaphas family tomb in Jerusalem; but the translation is by no means certain.
How could we mark this up so that a computer could take advantage of some of the information that is in text? Perhaps something like this...

Quote:
<person>Caiaphas</person> Son-in-law and successor of <person>Annas</person>, high priest in <place>Jerusalem</place>. He held office from <date>18 to 37 CE</date>, but the statement in <cite>John (18:13)</cite> that he was high priest 'that year', combined with the reference to a preliminary investigation by Annas <cite>(John 18:13-24)</cite> has suggested the possibility of the high priesthood being held for one year only at a time. But, more probably, John's meaning is that Caiaphas was the high priest 'in that memorable year of the crucifixion'. <artifact>An ossuary with an Aramaic inscription thought to mean 'Joseph, son of Caiaphas' was found in 1990 in the Caiaphas family tomb in Jerusalem; but the translation is by no means certain.</artifact>
Then the historian would be given the opportunity to enter the persons into the database, or match up the persons in the article with the persons in the database, whichever the appropriate action would be. And the article would be found under those persons. The cites would hopefully be parsed and the historian could simply confirm that those are the cites. The date likewise would be parsed and the historian would confirm that this is the date in question. Then the historian would be asked to enter the event associated with the date, namely, "Caiaphas high priest in Jerusalem", correlating this event with the given date, the given place, and the given person. Lastly, the historian would select the artifact from the database that his article was mentioning.

To the extent that it is possible, we would want to go from history-book form to database form with as little human intervention, and with as little error (computer or human), as possible. I would suggest doing this by having the historian (or the person entering the history books' information) simply enter the text without markup of any kind at first. Then the computer attempts to mark up the passage by matching likely citations, likely dates, and likely names of people and places against things that the computer already knows. In the final step, the person is asked to enter any additional people, places, dates, artifacts, etc. that the computer did not recognize.
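The automatic first pass might look something like the following. It is a rough sketch, assuming a small dictionary of already-known names and two deliberately narrow regular expressions; the `KNOWN` table, the function names, and the patterns are all illustrative, and real citation and date formats are far more varied.

```python
import re

# Hypothetical stand-in for "things that the computer already knows".
KNOWN = {
    "person": ["Caiaphas", "Annas"],
    "place": ["Jerusalem"],
}


def build_pattern(known):
    """Combine citation, date, and known-name patterns into one regex."""
    parts = [
        # Only the two citation shapes seen in the sample passage.
        r"(?P<cite>\(John \d+:\d+(?:-\d+)?\)|John \(\d+:\d+(?:-\d+)?\))",
        # Years or year ranges with an explicit era; a bare "1990" is missed.
        r"(?P<date>\b\d{1,4}(?: to \d{1,4})? (?:BCE|CE|BC|AD)\b)",
    ]
    for tag, names in known.items():
        alternatives = "|".join(
            re.escape(n) for n in sorted(names, key=len, reverse=True)
        )
        parts.append(rf"(?P<{tag}>\b(?:{alternatives})\b)")
    return re.compile("|".join(parts))


def auto_markup(text, known=KNOWN):
    """Wrap every recognised span in a tag named after what it matched."""
    def wrap(match):
        tag = match.lastgroup
        return f"<{tag}>{match.group(0)}</{tag}>"
    return build_pattern(known).sub(wrap, text)


sample = ("Caiaphas was high priest in Jerusalem from 18 to 37 CE; "
          "see John (18:13).")
print(auto_markup(sample))
```

Anything the patterns miss (the find date of the ossuary, say, or an unfamiliar name) is exactly what would be left for the human to add in the final step.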

This would achieve a couple things:

Passages of history would be indexed according to their subject matter in terms of the people, places, dates, artifacts, texts, etc. that they talk about. This would make it orders of magnitude easier for the future researcher to find the passages of history of interest.

The data itself could be manipulated and analysed and displayed. You could do an interactive timeline showing where and when people were writing the extant texts preserved to us, for example. Or if you were including the events of wars, an interactive map showing what battles occurred at what places and when, linked to the history books that talk about each battle.

Has anything like this been done in history or in other domains?

regards,
Peter Kirby

PS- OK, I lied. These techniques won't reveal to us whether Jesus existed. Is that all you care about?
Peter Kirby is online now
Old 11-29-2006, 09:39 PM   #3
Contributor
 
Join Date: Jun 2000
Location: Los Angeles area
Posts: 40,549

I think you need more:

<person>Caiaphas</person> <familial_relation>Son-in-law</familial_relation> and <political_relation>successor</political_relation> of <person>Annas</person>, <office>high priest</office> in <place>Jerusalem</place>. He held office from <date_of_office>18 to 37 CE</date_of_office>, but the statement in <cite>John (18:13)</cite> that he was high priest 'that year', combined with the reference to a preliminary investigation by Annas <cite>(John 18:13-24)</cite> has <speculation>suggested the possibility of the high priesthood being held for one year only at a time. But, more probably, John's meaning is that Caiaphas was the high priest 'in that memorable year of the crucifixion'.</speculation> <artifact>An ossuary with an Aramaic inscription thought to mean 'Joseph, son of Caiaphas' was found in 1990 in the Caiaphas family tomb in Jerusalem; but the translation is by no means certain.</artifact>

This looks like a lot of work. Will it actually save time or lead to better analysis?
Toto is offline  
Old 11-29-2006, 10:28 PM   #4
Contributor
 
Join Date: Mar 2006
Location: Falls Creek, Oz.
Posts: 11,192

Quote:
Originally Posted by Peter Kirby
I have been thinking about Early Latin Writings and how specifically to design the website. I have also been thinking more broadly, however, about information architecture and history.

As I see it, there are at least three "layers of history":

On the surface, we have the historians and the books of history. (There are even edifices built on this surface, tertiary works such as encyclopedias and journalistic prose, or even legitimately useful things like bibliographies.)

In the middle of the history strata, we have the evidence of history, the various manuscripts and archaeological finds and (in cases) oral history.

At the bottom layer, we have the people and the events and the texts, which history is ultimately about.

Here is what the bottom layer is composed of in my vision:

Code:
DATE

PLACE

EVENT
+------> DATE
+------> PLACE
+------> Optional Info
             +-------> Actors (PERSON)
             +-------> Causes (EVENT)
             +-------> Effects (EVENT)

PERSON
+------> Birth (EVENT)
+------> Death (EVENT)
+------> Optional Info
             +-------> Deeds (EVENT)
             +-------> Works (WRITING)

WRITING
+------> Author (PERSON)
+------> Provenance (EVENT)
+------> Optional Info
             +-------> Allusions and Citations (WRITING)
             +-------> Recorded Persons (PERSON)
             +-------> Recorded Events (EVENT)
             +-------> Recorded Places (PLACE)
I won't draw the structure of the middle and top layers just yet; I first want to be able to think that I've set out a good foundation.

Think of all these arrows as being like hyperlinks. When you encounter a WRITING, you may be linked to an Author (PERSON), who has a Death (EVENT), which has a DATE and a PLACE. For example, you may encounter the Gallic Wars and be linked to Gaius Julius Caesar, and his death by assassination, which has a date of 15 March 44 BC and a place of Rome. But instead of going for the date of the event "death by assassination", you might look at the other events listed as causes, or you might want to find out who exactly "Brutus" was in that famous line. And so it goes.

Okay, so, who would like to correct and/or add to what I've said about information architecture and history?

Who has any ideas for me on how I can make a useful resource with knowledge of this kind of structure? (Such as a book or, more likely, a website?)

For example, we will want some way to sort "EVENTS" other than by Date, Place, and Actor(s)...won't we want to know whether this was an event of a certain category, like a war or a birth or an intellectual creation? How are those kinds of tags to be included with the data?

regards,
Peter Kirby

Some comments:

1) Good start on the understanding that you are classifying
a writing as an event, which has its own date and place.

2) You will need to define an "INTERPOLATION" event. That
is, the system will need to tell you not only that Josephus
wrote "Antiquities" around 93 CE (or AD, etc, etc), but also that a
certain slab of text within this WRITING, known as the TF,
is often thought to have been written in the fourth century,
and sometimes that its author was Eusebius. This can get
tricky, but by thinking it through, and incorporating certain
conventions in your database schema, you will eventually
be able to handle this type of "writing".

3) CATS: Category Codes will become all important. At the
global level, if you were to dump all your texts (christian,
hebrew, latin, greek, coptic, syriac, etc) into one
database with some form of "tribal language" code, then
they could all co-exist in the same schema, and simply
be differentiated by this one code.

Other category codes will be useful against WRITINGS
and AUTHORS to differentiate various groupings, as
mentioned above.

4) MANY-TO-MANY relationships: are difficult. For example,
one WRITING might have many names, and there may be many
different ideas of when it was written (the EVENT of authorship).
The author, too, may be one of many purported/attributed
authors, or the author may be unknown.

5) CITATIONS: Thus the way WRITINGS reference WRITINGS
(i.e. who mentions whom, etc) will be important. Possibly the
best arrangement of this type of system that I have recently
seen is the academic citation databases, such as LANL for
physics publications (I am unaware whether there are
similar citation databases for BC&H papers, etc). Authors
and writings and events can be cited. It is the citations
which provide the referential integrity of the WRITINGS
and the AUTHORS of various groups and clusterings by
category --- perhaps even inter-category.

6) FRAUD: You are going to have to make provision for
fraudulent works. For example:
* the correspondence between Seneca and Paul
* Acts of Pilate - ref by Justin, then Tertullian, then Eusebius
* Pilate's conversion, etc
* Emperor Marcus Aurelius Antoninus' "Thundering Legion" report.

7) SCHEMA CHANGE: As you have done, keep things simple.
No matter what, change will mandate that you add
new fields (perhaps new categories and/or flags to service
various optional data classifications). Don't be worried about
this; as you already know, it is not difficult. Nothing is
set in cement. It changes.

8) EVENTS as we know them are usually going to be sourced
to a specific TEXT or WRITING. So unless you are defining the
text level to be associated with events, I see the text level
as the foundation, not the events. It will be challenge
enough to create a database with all the different texts of
antiquity and late antiquity (and their authors). Out of this
the events should emerge, because it is only via the texts
that --- in many cases --- we learn of ancient events.

9) DATE UNCERTAINTIES: Big problem with some authors
and some texts, varying by 300 years in some cases. I
have been confronted with this problem. Somehow one
needs to associate with any given date (event, text,
writing-date, birth, etc) an ERROR-BAR (plus or minus
2, 20 or 200 years, for example). This could be an additional
field associated with ALL DATES.
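Points 4 and 9 can be sketched together: a date that carries its own error bar, and a WRITING that holds several competing attributions rather than a single author and date. This is a minimal sketch under assumptions; the names `UncertainDate`, `Attribution`, and `plus_minus` are invented here, years are plain integers (negative for BCE), and the sample data is made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class UncertainDate:
    year: int            # best estimate; negative values stand for BCE
    plus_minus: int = 0  # the ERROR-BAR, in years

    @property
    def earliest(self):
        return self.year - self.plus_minus

    @property
    def latest(self):
        return self.year + self.plus_minus

    def overlaps(self, start, end):
        """True if the possible range intersects the span start..end."""
        return self.earliest <= end and start <= self.latest


@dataclass
class Attribution:
    author: Optional[str]   # None when the author is unknown
    date: UncertainDate
    note: str = ""


@dataclass
class Writing:
    titles: List[str]       # one WRITING may have many names
    attributions: List[Attribution] = field(default_factory=list)

    def possibly_written_between(self, start, end):
        return [a for a in self.attributions if a.date.overlaps(start, end)]


# Entirely made-up example data.
text = Writing(
    titles=["Example Letter", "Epistle X"],
    attributions=[
        Attribution("Author A", UncertainDate(150, 20), "traditional view"),
        Attribution(None, UncertainDate(300, 200), "later anonymous work"),
    ],
)

for a in text.possibly_written_between(100, 140):
    print(a.author, a.date.earliest, a.date.latest)
```

A range query then naturally returns every attribution whose error bar reaches into the period asked about, instead of silently picking one date.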

That's about it for now.
Best wishes,


Pete
mountainman is offline  
Old 11-29-2006, 11:28 PM   #5
Veteran Member
 
Join Date: Jul 2001
Location: the reliquary of Ockham's razor
Posts: 4,035

Quote:
Originally Posted by Toto
I think you need more:

<person>Caiaphas</person> <familial_relation>Son-in-law</familial_relation> and <political_relation>successor</political_relation> of <person>Annas</person>, <office>high priest</office> in <place>Jerusalem</place>. He held office from <date_of_office>18 to 37 CE</date_of_office>, but the statement in <cite>John (18:13)</cite> that he was high priest 'that year', combined with the reference to a preliminary investigation by Annas <cite>(John 18:13-24)</cite> has <speculation>suggested the possibility of the high priesthood being held for one year only at a time. But, more probably, John's meaning is that Caiaphas was the high priest 'in that memorable year of the crucifixion'.</speculation> <artifact>An ossuary with an Aramaic inscription thought to mean 'Joseph, son of Caiaphas' was found in 1990 in the Caiaphas family tomb in Jerusalem; but the translation is by no means certain.</artifact>

This looks like a lot of work. Will it actually save time or lead to better analysis?
Yes, it would be a lot of work. Please scratch the idea of a markup language per se, as being something the human does to prepare a text for the computer.

What I would prefer is for a computer to scan a text for the following:
  • Citations of texts
  • Names of people
  • Names of places
  • Dates

And to store the metadata about the history-book passage somewhere centralized, with a pointer to that passage of history-book material.

Someone who is entering the history book passage could add additional references from database items (text cites, people, places, and dates) to the given history-book material, if the computer did not pick it up automatically.

The database (which is needed before any recognition of text cites, people, or places can be made) would be created ahead of time and continually added to.

The end result is that you can make a search on a given textual passage (such as a verse of the Bible or a section of Tacitus), a certain ancient person, a certain place, or a certain range of dates, and get all the history-book pages that relate to that. (We could add names of events too, by the way, like 'the destruction of the Temple'.) Further, you can mix and match your criteria for search with good old-fashioned keyword searching. You could also use logical NOTs, ANDs, and ORs. ("Find me everything written about a Jesus, but NOT Jesus of Nazareth, kk thx.")
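The search described here could be sketched with plain set operations over the stored metadata. This is only an illustration under assumptions: the class `PassageIndex`, the `(kind, value)` keys, and the passage ids `p1` to `p3` are invented for the example.

```python
from collections import defaultdict


class PassageIndex:
    """Maps each known entity to the history-book passages that mention it."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._all = set()

    def add(self, passage_id, entities):
        self._all.add(passage_id)
        for entity in entities:
            self._postings[entity].add(passage_id)

    def having(self, entity):
        return set(self._postings.get(entity, set()))

    def lacking(self, entity):
        # Logical NOT: every passage except those that mention the entity.
        return self._all - self.having(entity)


index = PassageIndex()
index.add("p1", {("name", "Jesus"), ("person", "Jesus of Nazareth")})
index.add("p2", {("name", "Jesus"), ("person", "Jesus ben Ananias")})
index.add("p3", {("person", "Caiaphas"), ("place", "Jerusalem")})

# "Everything written about a Jesus, but NOT Jesus of Nazareth":
hits = index.having(("name", "Jesus")) & index.lacking(("person", "Jesus of Nazareth"))
print(sorted(hits))

# OR is set union; AND is set intersection.
either = index.having(("person", "Caiaphas")) | index.having(("person", "Jesus ben Ananias"))
print(sorted(either))
```

The NOT query only works because the operator has already disambiguated which "Jesus" each passage means; a keyword search alone could not separate them.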

The work involved includes creating the history books, but people already do that.

The extra work involved on a continuing basis is to upload these history books to the database (this could be done more easily if it weren't for copyright, but there are non-protected history books out there) and allow the computer to find the names of places, names of people, date ranges, etc. The work on the human operator side would consist primarily of quality control and disambiguation: determining whether the person named Pliny is the Younger or the Elder, for example.

I would estimate that the operator disambiguation adds about 20% to the work of digitizing the books, and perhaps 5% to the work of writing history books from scratch.

However, it could improve our ability to access these history books by a good factor, multiplied by the great number of searches that would be performed.

A full text search on history books would itself be an improvement, but adding metadata would make it easier still to find information. And it would make it possible, where it would otherwise be impossible, to create automatically generated timelines, maps, and other aids to study.

regards,
Peter Kirby
Peter Kirby is online now
Old 11-30-2006, 01:08 AM   #6
Senior Member
 
Join Date: Mar 2005
Location: Darwin, Australia
Posts: 874

Quote:
Originally Posted by Peter Kirby
As I see it, there are at least three "layers of history":

On the surface, we have the historians and the books of history. (There are even edifices built on this surface, tertiary works such as encyclopedias and journalistic prose, or even legitimately useful things like bibliographies.)

In the middle of the history strata, we have the evidence of history, the various manuscripts and archaeological finds and (in cases) oral history.

At the bottom layer, we have the people and the events and the texts, which history is ultimately about.
2 points:

1.
As I understand it what you are intending to design is known in the Semantic Web world as a Topic Map.

2.
Your 3 layers would appear to give the bottom layer a reality comparable to the upper two. Many historians I suspect would argue that one cannot tease out your bottom layer to have an entity independent of your middle layer. Rather, several philosophies of history would argue it is more valid to say that what you have posited as a bottom layer would more justifiably be placed above your surface layer. Your bottom layer really only exists in the culture of those who read or are otherwise influenced by the historians.

Further, the example you cite, Josephus, is not an archaeological artifact of events in Judea, and it is invalid to equate the literature of another historian with archaeological finds. The very nature of the knowledge each yields to an historian is completely different.

And forget oral history: once it is written history it is by nature no longer oral history; and anything else that presents itself as such is first and last a construct of your surface layer.

Not that we have to give up the search for truth and start looking for a good fantasy as someone said, but your architecture presents a model that most historians I think would regard as misleading in the way it appears to be capable of being used as a means of getting to that "truth".
neilgodfrey is offline  
Old 11-30-2006, 02:49 AM   #7
Contributor
 
Join Date: Mar 2006
Location: Falls Creek, Oz.
Posts: 11,192

Quote:
Originally Posted by neilgodfrey
Your 3 layers would appear to give the bottom layer a reality comparable to the upper two. Many historians I suspect would argue that one cannot tease out your bottom layer to have an entity independent of your middle layer. Rather, several philosophies of history would argue it is more valid to say that what you have posited as a bottom layer would more justifiably be placed above your surface layer. Your bottom layer really only exists in the culture of those who read or are otherwise influenced by the historians.
Actually, I agree that the model can be simplified
to the consideration of only 2 layers as follows:

Layer 1: Historical commentary (on records) of various modes. (surface)
Layer 2: Historical records of various types. (foundation/core)
(See below)

Quote:
Further, the example you cite, Josephus, is not an archaeological artifact of events in Judea, and it is invalid to equate the literature of another historian with archaeological finds. The very nature of the knowledge each yields to an historian is completely different.
Historical records or artifacts should be capable of being stored
next to each other, and separate according to a record-type
or artifact type, only one of which is text:

* texts
* architecture (buildings, etc)
* statues
* inscriptions
* archeological relics
* art
* coins
* carbon dating citations

Another aspect that is needed is an extension of the database
schema described up until now, in order to provide for the
detailed specification of the textual transmission history, as is
often outlined by Roger Pearse.

Roger I am sure would have had moments in which he has
contemplated a database structure for all the necessary
references, and "trees of reference" involved in the step
by step reconstruction of the textual transmission.

In fact, I see this as your "middle layer" if one insists on
having a trinity of layers as a concept. This, with respect
to the * TEXT record type, forms the chain of citations
which link the texts we have with a place back in antiquity.

Really though, it is not a middle layer, but simply additional
fields whereby some specific original text item (eg: the letter
of Jesus to King Abgar) can be tracked to its current state,
if extant in some University collection of letters, mss, etc.
All texts need to have these "transmission history" sequence
fields associated with them at the end of the day.
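As a rough illustration of such "transmission history" fields, each surviving or lost copy could simply point at the copy it was made from, so that the chain back towards antiquity can be walked. The names `Witness`, `copied_from`, and `lineage`, and the sample chain, are assumptions made up for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Witness:
    label: str
    held_at: Optional[str] = None            # current location, if extant
    copied_from: Optional["Witness"] = None  # the exemplar this copy was made from


def lineage(witness):
    """Walk from a copy back through its exemplars to the oldest known form."""
    chain = []
    while witness is not None:
        chain.append(witness.label)
        witness = witness.copied_from
    return chain


# Hypothetical chain: a lost original, a lost early copy, an extant manuscript.
original = Witness("original letter (lost)")
early_copy = Witness("early copy (lost)", copied_from=original)
manuscript = Witness("medieval manuscript", held_at="a university collection",
                     copied_from=early_copy)

print(" <- ".join(lineage(manuscript)))
```

A real schema would need to allow one copy to have several exemplars and to record uncertain links, which a single `copied_from` field cannot express.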

Finally, probably the best method of kick-starting such a massive
project is to determine whether there are any other collections
already collated in the public domain, primarily for texts and
authors, but also for all of the other artifact types (eg: coins).

The stubs of many items could be shipped in if they can be
"harvested" from extant (public domain) rich data sources.



Pete
mountainman is offline  
Old 12-01-2006, 10:15 AM   #8
Veteran Member
 
Join Date: Dec 2005
Location: Scotland
Posts: 1,549

I am not an expert at all in the matters that you are discussing, but I was interested in the idea of self-organising collections of documents, and about ten years ago I even went to the extent of developing a proof-of-concept website. It organises a set of poems that I like by keyword, title, and author, but it does so automatically. If I want to add a new keyword, I run a little script which searches all of the poems and turns each occurrence of the keyword into a tag, and then if you click on the tag an index of all of the poems containing the keyword is generated. If I want to add another poem, the keywords in the poem are automatically made hot. If I remove the poem, any keywords unique to that poem are removed. If I remove a keyword, it is edited out of all the poems that contain the word.

What makes the approach powerful is that you don't have to go through every added text and make the keywords hot by hand. So if electronic versions of the texts of interest are available, the whole database more or less assembles itself. I set up the website for fun. This whole posting may be irrelevant, a waste of everybody's time, but if anyone wants to play and see how the documents organise themselves, visit

http://ccgi.houseofdeer.plus.com/connexion/

Obviously a more comprehensive system of searching would be necessary for a system that was more than a toy, and boolean combinations of search terms, and hierarchical searching of documents would be necessary for serious use, but the principle might be of interest (or again it might not). In any case it's a quick way to make a complicated website.
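The self-organising behaviour described above can be approximated in a few lines. This is not the code behind the site, just a minimal sketch assuming an in-memory collection whose index is rebuilt on demand; the class `KeywordCollection` and the two sample texts are invented for illustration.

```python
import re
from collections import defaultdict


class KeywordCollection:
    """Documents plus a keyword list; the index is derived, never hand-edited."""

    def __init__(self):
        self.docs = {}         # title -> text
        self.keywords = set()  # lower-cased keywords

    def add_doc(self, title, text):
        self.docs[title] = text

    def remove_doc(self, title):
        self.docs.pop(title, None)

    def add_keyword(self, word):
        self.keywords.add(word.lower())

    def remove_keyword(self, word):
        self.keywords.discard(word.lower())

    def index(self):
        """Rebuild keyword -> titles from scratch, so it always matches the data."""
        result = defaultdict(set)
        for title, text in self.docs.items():
            words = set(re.findall(r"[a-z']+", text.lower()))
            for keyword in self.keywords & words:
                result[keyword].add(title)
        return dict(result)


poems = KeywordCollection()
poems.add_doc("Poem A", "The sea and the moon")
poems.add_doc("Poem B", "The moon over the hill")
poems.add_keyword("sea")
poems.add_keyword("moon")
print(poems.index())

poems.remove_doc("Poem A")  # "sea" now matches nothing and drops out of the index
print(poems.index())
```

Because the index is recomputed rather than stored, adding or removing a text or a keyword never leaves stale links behind, which is the property the post describes.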

johno
johno is offline  
Old 12-01-2006, 05:11 PM   #9
Senior Member
 
Join Date: Mar 2005
Location: Darwin, Australia
Posts: 874

Quote:
Originally Posted by mountainman

Historical records or artifacts should be capable of being stored
next to each other, and separate according to a record-type
or artifact type, only one of which is text:

* texts
* architecture (buildings, etc)
* statues
* inscriptions
* archeological relics
* art
* coins
* carbon dating citations
If the texts are contemporaneous with the other archaeological evidence and are treated as evidence of the same culture or events as evidenced by that contemporary archaeological evidence, then Yes, they should be treated as data supporting the same kind of knowledge. Such texts would be things like store lists or trade records or legal documents or official letters or various scribal treatises or pupil exercises or literary collections or court chronologies -- texts that help flesh out the evidence that we are piecing together from the coins and building remains.

But if the text is an interpretative history (especially one that depends in part on unknown sources and a debatable agenda and quality of memory) and when it is writing of another generation not directly evidenced by the archaeological evidence with which it is contemporaneous, then on what grounds can we justify distinguishing that text from the surface layer of other historians in Peter's model? It can only be treated on the same level as archaeological evidence insofar as we study it for what light it can throw on its own context. The stories it tells of the past must be used to inform the historian what that cultural product reveals about the time of the author for it to be treated on the same level as stones and metal finds.

Further, texts that are treated as oral history would be treated more coherently as a written genre purporting to be a recording of oral history rather than "oral history" as such. (Especially if Peter's "topic map" is down the track going to be interoperable with other history TMs, some of which will certainly refer to works of current historians of oral history.)

Neil
http://vridar.wordpress.com
neilgodfrey is offline  
Old 12-02-2006, 02:31 PM   #10
Contributor
 
Join Date: Mar 2006
Location: Falls Creek, Oz.
Posts: 11,192

Quote:
Originally Posted by neilgodfrey
If the texts are contemporaneous with the other archaeological evidence and are treated as evidence of the same culture or events as evidenced by that contemporary archaeological evidence, then Yes, they should be treated as data supporting the same kind of knowledge. Such texts would be things like store lists or trade records or legal documents or official letters or various scribal treatises or pupil exercises or literary collections or court chronologies -- texts that help flesh out the evidence that we are piecing together from the coins and building remains.
Agreed.

Quote:
But if the text is an interpretative history (especially one that depends in part on unknown sources and a debatable agenda and quality of memory) and when it is writing of another generation not directly evidenced by the archaeological evidence with which it is contemporaneous, then on what grounds can we justify distinguishing that text from the surface layer of other historians in Peter's model? It can only be treated on the same level as archaeological evidence insofar as we study it for what light it can throw on its own context. The stories it tells of the past must be used to inform the historian what that cultural product reveals about the time of the author for it to be treated on the same level as stones and metal finds.
This presupposes that the texts in question are not fictions.
For example, consider the "Historia Augusta".
We may assume (quite naively) that the text is descriptive
of historical events, but it may be baloney.

Therefore, at the detail level, prior to any assessment,
all we have is a text and a date. It then needs to be
categorised, and this is a separate step up from the detail,
the first in a series of steps which provide commentaries,
which themselves are texts, with dates (from some chronology).

Best wishes,



Pete
mountainman is offline  
 
