Freethought & Rationalism Archive. The archives are read only.
09-05-2005, 11:02 AM | #1
Veteran Member
Join Date: May 2005
Location: Midwest
Posts: 4,787
Inquiry on synoptic statistics.
For any who can assist....
A statistical approach to the synoptic problem has been available online for some time now, and I am wondering what a mathematical mind might make of the process described there. I myself know very little about statistics. Once the discussion arrives at this point (Luke group: 012, 002, 112...), I feel at least somewhat qualified to evaluate the results. But most of it is a blur until then. So my question is for statistics buffs. Is the process sound (mathematically)? Is there any way of dumbing it down for the likes of me? (If not, I understand....)

Thanks.

Ben.
09-05-2005, 11:23 AM | #2
Veteran Member
Join Date: Jul 2001
Location: the reliquary of Ockham's razor
Posts: 4,035
|
David Gentile divides the words in the Synoptics into groups (following an existing concordance/synopsis).

I haven't actually tinkered with his algorithm, but at the root of it, I suppose, is word frequency.

kind thoughts,
Peter Kirby
09-05-2005, 08:19 PM | #3 |
Veteran Member
Join Date: May 2005
Location: Midwest
Posts: 4,787
|
Ah, I must have been imprecise. I understand the 0-1-2 system that he uses to distinguish parts of the synoptic tradition. What I was after was more the math of it (such as what a Poisson gamma test does).
Thanks. Ben. |
09-05-2005, 08:30 PM | #4 |
Veteran Member
Join Date: Jul 2001
Location: the reliquary of Ockham's razor
Posts: 4,035
|
Ha! I think you were precise enough, but that I most fully answered the part of the question for which I had the most readily available information.

Like I said, I have not scrutinized his math. I probably should do so in preparation for the formal discussion of the synoptic problem. In the meantime, have you considered contacting him? (There is also an old thread about this lying around on IIDB somewhere.) I do believe that he bases it somehow on lexical frequency rather than combinations of letters, combinations of parts of speech, or some other stylometric measure.

kind thoughts,
Peter Kirby
09-05-2005, 08:53 PM | #5 |
Contributor
Join Date: Jun 2000
Location: Los Angeles area
Posts: 40,549
|
Prior thread: Statistical Approach to the Synoptic Problem
09-05-2005, 09:40 PM | #6 |
Junior Member
Join Date: Aug 2003
Location: Illinois
Posts: 70
|
The math is pretty much straight out of an econometrics textbook. There is an example in it that uses the maximum likelihood method with a Poisson distribution, and the math is right from that example. As for making it simpler, I tried my best...

One thing I might add is that the process is very similar to regression. But regression assumes a normal distribution, and word frequencies are better described by a Poisson distribution. The procedure is even closer to something known as logistic regression, which is used all the time in credit risk. It's only a very slight change from that. If you have specific questions, I could try to explain more.

Dave Gentile
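To make the pointer to maximum likelihood concrete, here is a minimal sketch of Poisson MLE in Python. The counts and function names are hypothetical illustrations, not Gentile's actual code or data; the one fact it leans on is that the maximum-likelihood estimate of a single Poisson rate is simply the sample mean.

```python
import math

def poisson_log_likelihood(counts, rate):
    """Log-likelihood of observed word counts under a Poisson(rate) model."""
    return sum(c * math.log(rate) - rate - math.lgamma(c + 1) for c in counts)

# Toy data: occurrences of one word across equal-sized blocks of text.
counts = [3, 1, 0, 2, 4, 1]

# For a single Poisson rate, the maximum-likelihood estimate is the mean.
mle_rate = sum(counts) / len(counts)

print(f"MLE rate: {mle_rate:.3f}")
print(f"log-likelihood at MLE: {poisson_log_likelihood(counts, mle_rate):.3f}")
```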
09-06-2005, 11:35 AM | #7
Veteran Member
Join Date: May 2005
Location: Midwest
Posts: 4,787
|
Nice to meet you, Dave! Thanks for dropping in.
I asked if the math could be dumbed down a bit, and the impression I am getting is that no, it is already as simple as it is going to get. Let me ask one very specific question and see if the answer still eludes me.

Suppose a certain Greek word is found 10 times in 020 material. Does your process try to predict, so to speak, that this same word should be found 10 times in 002 material too, and then, failing that, draw the conclusion that 020 and 002 material must not be very closely related (after, of course, 800 other words have been similarly tried)?

Also, if the above is even slightly accurate as a description of your process, does it take into account that the 020 material might be more or less lengthy than the 002 material?

BTW, I have been drawn of late to the 3SH (in some form or other) for reasons that have nothing to do with statistics, word counts, or vocabulary comparisons. Which is why I asked about your site in the first place.

Thanks.

Ben.
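For what it is worth, here is a small arithmetic sketch of the length adjustment Ben is asking about, using hypothetical corpus sizes (whether Gentile's procedure does exactly this is the open question):

```python
# Hypothetical sizes: 10 occurrences in a 5,000-word 020 corpus,
# and a 2,000-word 002 corpus to predict into.
count_020, words_020 = 10, 5000
words_002 = 2000

# Length-adjusted prediction: scale the per-word rate, not the raw count.
rate_per_word = count_020 / words_020      # 0.002 occurrences per word
expected_002 = rate_per_word * words_002   # Poisson mean for the 002 corpus

print(f"Expected occurrences in 002: {expected_002:.1f}")  # 4.0, not 10
```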
09-06-2005, 11:46 AM | #8 |
Veteran Member
Join Date: May 2005
Location: Midwest
Posts: 4,787
|
Is there any way, I wonder, of making distinctions within each category (such as 200)? Streeter once claimed that much of the M material is parasitic on Mark (for example, Peter walking on the water depends on Jesus walking on the water, and the exchange between John the Baptist and Jesus in the Jordan depends on there being a baptismal scene). I wonder what would happen if we took that kind of material and compared it with other parts of M that are not modifying Marcan pericopes (one example would be the speech about fasting, prayer, and alms in the Sermon on the Mount).
Ben. |
09-06-2005, 12:41 PM | #9
Veteran Member
Join Date: Jan 2005
Location: USA
Posts: 1,307
|
As best as I can tell, Dave Gentile investigates, e.g., whether the distribution of words in the 002 material is a better predictor of the distribution of words in the 020 material than the word distribution averaged across the synoptics (think of this as his control). Although I have not studied it in great detail, the particular estimators Gentile used (i.e., a Poisson) appear reasonable for the nature of the problem he is investigating, and they do certainly take into account the relative sizes of the different bodies of material (e.g., 002 and 020).

At any rate, the key things to understand about Gentile's approach are:

1. He is using a prima facie reasonable way to evaluate how well one corpus predicts the distribution of words in another corpus.
2. He is also using a "control" corpus to estimate the distribution of words in that other corpus.
3. He then compares the first prediction with the "control" prediction to assess whether the difference is statistically significant.

Issues of Gentile's work to explore include:

a. Is his estimator appropriate?
b. Is his control appropriate?
c. Is his comparison of the estimator and the control appropriate?
d. Is his interpretation of the statistically significant comparisons appropriate?

I have not studied it in sufficient detail to obtain answers I would be satisfied with, but, on the other hand, nothing blatantly flawed is obvious either. In other words, further investigation along Gentile's approach is not likely to be futile, though the level of competence in statistics required may preclude most people actually interested in the synoptic problem from attempting to follow up on Gentile's work.

Stephen
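Stephen's three-step summary can be sketched in a few lines of Python. Everything here is hypothetical (made-up counts, corpus sizes, and a bare log-likelihood comparison standing in for whatever significance test Gentile actually runs), but it shows the shape of the procedure: score the source-derived prediction and the control-derived prediction against the same target counts.

```python
import math

def poisson_log_lik(observed, expected):
    """Total Poisson log-likelihood of observed counts given expected means."""
    return sum(o * math.log(e) - e - math.lgamma(o + 1)
               for o, e in zip(observed, expected))

def expected_counts(source_counts, source_words, target_words):
    """Scale per-word rates from one corpus to another corpus's length."""
    return [max(c / source_words * target_words, 1e-9) for c in source_counts]

# Hypothetical counts for five words in each body of material.
target  = [4, 0, 2, 7, 1]        # e.g. 020 material (1,000 words)
source  = [9, 1, 3, 15, 2]       # e.g. 002 material (2,000 words)
control = [20, 8, 10, 22, 14]    # pooled synoptic control (6,000 words)

ll_source  = poisson_log_lik(target, expected_counts(source, 2000, 1000))
ll_control = poisson_log_lik(target, expected_counts(control, 6000, 1000))

# The higher log-likelihood predicts the target better; a formal test
# would then ask whether the gap is statistically significant.
print(f"source model:  {ll_source:.2f}")
print(f"control model: {ll_control:.2f}")
```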
09-06-2005, 01:22 PM | #10
Veteran Member
Join Date: May 2005
Location: Midwest
Posts: 4,787
|
A follow-up.... From your description it appears that if both 020 and 002 (just my running example) lacked a particular word that happened to appear somewhat frequently in other parts of the synoptic tradition, the correlation between 020 and 002 would be improved, as it were. Is that correct?

Also, it appears to me that this method really deals only with the broadest level of synoptic relations. For example, it does not appear capable of identifying any given pericope within 002 that may actually more closely resemble 200 than 020. Correct?

Thanks.

Ben.
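On the shared-absence question, a toy calculation under the Poisson setup sketched above (hypothetical numbers again, not Gentile's code) suggests the answer would be yes: a word missing from both bodies of material penalizes the control, which expects to see it, and so improves the relative fit of the 002-based prediction.

```python
import math

def poisson_log_lik(observed, expected):
    """Poisson log-likelihood of a single observed count."""
    return observed * math.log(expected) - expected - math.lgamma(observed + 1)

observed_020 = 0            # the word never appears in 020
from_002     = 1e-9         # ~0 expected: absent from 002 as well
from_control = 3.0          # common elsewhere in the synoptics

print(poisson_log_lik(observed_020, from_002))     # ~0.0  (near-perfect fit)
print(poisson_log_lik(observed_020, from_control)) # -3.0  (penalized)
```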