Freethought & Rationalism Archive | The archives are read only. |
03-29-2003, 04:10 PM | #31 |
Veteran Member
Join Date: Mar 2002
Location: anywhere
Posts: 1,976
|
You know, on a second reading, maybe what DNAunion is really after is some pat on the back for taking the time to write some Visual FoxPro and C++ code, even though it is based on a flawed model. After all, he has taken great pains in his last post to explain his code to me, as if to suggest that I cannot read code. But, assumptions seemed to be the entire basis of that post.
Anyway, by all means, let's not deprive him of that accolade: <pat pat> |
03-29-2003, 04:34 PM | #32 | |
Veteran Member
Join Date: Mar 2002
Location: anywhere
Posts: 1,976
|
Quote:
P(getting at least 3 matches) = P(getting exactly 3 or 4 matches) = (5C3 * 4C1 + 5C4 * 4C0)/9C4 = 5/14

Now the question here is whether this demonstration is sufficient to satisfy DNAunion. Or will he just keep peppering me with academic questions because he himself doesn't know of a more elegant way of doing combinatorial analysis?

NB: there is ambiguity in DNAunion's language. He asked for "getting 3 matches" when I calculated how to get "at least 3 matches." I assume the SIPF sequence is the one being matched (all in keeping with Rode's paper). Anyway, let's see where this goes. |
|
03-29-2003, 06:07 PM | #33 | ||
Veteran Member
Join Date: Mar 2002
Location: anywhere
Posts: 1,976
|
Well, DNA pretty much asked for a critique of his code. So I shall oblige.
First off, a quick comment to the following: Quote:
Quote:
1) The code checks each case on an individual basis. That is to say, it only calculates P(n exact matches). Yet all of the discussion is around P(at least n matches), not to mention that Rode is talking about cumulative probabilities. Gee, not very useful.

2) The code gives no sense of how many significant figures to believe in the final result. In fact, it reports as many as the OS permits. So, suppose I get the result 38.0542%: what is the actual significant result? 0.4? 0.38? 0.381? 0.3805? Yet the only way to get this information is to run multiple lIterations to get a sense of the significant digits. (Of course, there is a theoretical answer to this matter, but why would DNAunion believe that?) In any case, this brings us to the complexity of the code.

3) In order to accommodate running multiple scenarios quickly (i.e. to have efficient code), one needs to minimize bloated code. Let me give just one example of how DNAunion's code is not efficient:

Code:
long GetRandomNumber(int nMin, int nMax)
{
    long lRandomNumber;
    lRandomNumber = rand();
    while (lRandomNumber < nMin || lRandomNumber > nMax)
    {
        lRandomNumber = rand();
    }
    return lRandomNumber;
}

4) In any case, I'd like to see what other applications this is useful for. Suppose the SIPF sequences also have equalities. In other words, suppose an amino acid is not listed in the archae dipeptides but has equivalent SIPF yields with another amino acid that is listed with the target archae dipeptides. Can this program handle that scenario? Nope, I don't believe so. Let's check another. I mentioned checking for duplicates between trials. Does the program do that? Nope. (In fact this error "sneaked in somewhere" into DNAunion's result.) This all goes to show:

5) The model as presented by Rode is only good for a quick back-of-the-envelope calculation, but further elaborating it won't produce any more significant results. See my first post for what I mean. The probabilistic analysis, though sloppy, does not discredit Rode's work by itself. Neither, for that matter, do I see how DNAunion's 1e-7 discredits Rode's work.

6) DNAunion speaks as if his code ought to be used by others. He talks about "empowering" people and "reusability" and so forth. Having evaluated the code, I can only be thankful that there are many programmers in the world, as well as mathematicians. |
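To make points 2 and 3 concrete, here is a rough sketch of what an alternative could look like. It is not taken from anyone's posted code. The GetRandomNumber quoted above throws away every `rand()` value outside [nMin, nMax], so with `rand()` spread over 0..RAND_MAX (at least 32767) a narrow range costs on the order of (RAND_MAX + 1) / (range size) calls per number. The hypothetical `GetRandomNumberDirect` below draws in the range directly using the standard `<random>` facilities. The second function, `StandardError`, is the usual binomial standard error sqrt(p(1-p)/N) for an estimated proportion, which is presumably the kind of "theoretical answer" point 2 alludes to; that link is an assumption.

```cpp
#include <cmath>
#include <random>

// Hypothetical alternative to the quoted GetRandomNumber: one draw per call,
// uniformly distributed over the closed range [nMin, nMax].
long GetRandomNumberDirect(int nMin, int nMax) {
    static std::mt19937 engine{std::random_device{}()};  // seeded once
    std::uniform_int_distribution<long> dist(nMin, nMax);
    return dist(engine);
}

// Standard error of a proportion p estimated from `trials` independent
// trials. Digits smaller than this value are not meaningful.
double StandardError(double p, long trials) {
    return std::sqrt(p * (1.0 - p) / static_cast<double>(trials));
}
```

As an illustration with made-up numbers: for an estimate near 0.38 from 10,000 iterations, `StandardError(0.38, 10000)` is roughly 0.005, so only the first two digits of a figure like 38.0542% would be trustworthy. Gaining one more digit takes about 100 times as many iterations, which is where per-draw efficiency starts to matter.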
||
03-30-2003, 07:12 AM | #34 | |
Junior Member
Join Date: Nov 2002
Location: USA
Posts: 29
|
Hi DNAunion (and others),
One minor quibble about the opening post. DNAunion remarked (among other things): Quote:
Having scanned the review by Rode but once, I am spurred to wonder just how many mechanisms for the generation of nonrandom dipeptide frequencies can be thought of (or have been established). This consideration has significant bearing on the comparison. |
|
03-30-2003, 08:41 AM | #35 | |||||||
Veteran Member
Join Date: Mar 2002
Location: anywhere
Posts: 1,976
|
Quote:
Quote:
Quote:
PS: Some interesting prebiotic peptide synthesis articles from other groups: Quote:
Quote:
Quote:
Quote:
|
|||||||
03-30-2003, 10:01 AM | #36 | ||
Veteran Member
Join Date: Jan 2001
Location: USA
Posts: 1,072
|
Quote:
Quote:
First point is that proteins – and even dipeptides – have an intrinsic directionality: one end is the C-terminus and the other the N-terminus. By convention the numbering of amino acid residues – and thus the order of the amino acids – starts at the N-terminus and ends at the C-terminus. Thus, A-B and B-A linkages are different.

Here's an explanatory example. Looking at the Archaebacteria entries for Ala we can construct the following, where we are looking at Ala being the third amino acid in a chain of five (with x denoting any amino acid).

A-B
N-x-x-Ala-Ala-x-C
N-x-x-Ala-Asp-x-C
N-x-x-Ala-Glu-x-C
N-x-x-Ala-Gly-x-C
N-x-x-Ala-Leu-x-C

B-A
N-x-Ala-Ala-x-x-C
N-x-Glu-Ala-x-x-C
N-x-Val-Ala-x-x-C
N-x-Leu-Ala-x-x-C

So we can see that the A-B entries are looking at a given amino acid (here Ala) and showing which amino acids are most likely to FOLLOW it in the examined proteins from archaebacteria, whereas the B-A entries show which amino acids are most likely to PRECEDE it. The fact that Ala is most likely to FOLLOW Ala and that it is most likely to PRECEDE Ala are two separate facts that just happen to coincide.

To further show that A-B and B-A are different facts, note that the entries for archaebacteria for Ala are not symmetric.

1) The A-B amino acids listed do not include Val, whereas the B-A list does.
2) The B-A amino acids listed do not include Gly, whereas the A-B list does.
3) The B-A amino acids listed do not include Asp, whereas the A-B list does.

So Ala-Ala needed to be included in both the A-B and B-A linkages.

What Principia may mean – his statement is a bit too vague for me to be absolutely sure what he is saying – is that the total number of Ala-Ala occurrences should be "split" between A-B and B-A because they are two separate facts; and that this would throw Ala further down in the rankings. If that is what he means, then he should have stated so...clearly. But more importantly, it wouldn’t make a difference for Ala-Ala.
Even if Ala-Ala values are halved in the two tables (table 6 and table 7), Ala would still be one of the top 4 for A-B and for B-A, so the number of coincidences would not change: no amino acids would drop out of the top 4 and none would be added. |
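The asymmetry between the two Ala lists can be checked mechanically. The following is only a small sketch using the residues listed in this post; the names `kFollowsAla`, `kPrecedesAla` and `OnlyIn` are invented for the illustration and do not come from Rode's paper or from any posted program.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Residues listed above as following Ala (A-B) and preceding Ala (B-A).
const std::set<std::string> kFollowsAla  = {"Ala", "Asp", "Glu", "Gly", "Leu"};
const std::set<std::string> kPrecedesAla = {"Ala", "Glu", "Val", "Leu"};

// Elements of `a` that are absent from `b`, in sorted order.
std::vector<std::string> OnlyIn(const std::set<std::string>& a,
                                const std::set<std::string>& b) {
    std::vector<std::string> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
    return out;
}
```

`OnlyIn(kFollowsAla, kPrecedesAla)` gives Asp and Gly, and `OnlyIn(kPrecedesAla, kFollowsAla)` gives Val, matching the three differences listed above. Keeping the two lists as separate sets mirrors the point that A-B and B-A are separate facts.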
||
03-30-2003, 10:19 AM | #37 | |||
Veteran Member
Join Date: Jan 2001
Location: USA
Posts: 1,072
|
Quote:
Let’s look at my statement IN CONTEXT, shall we? Quote:
What I found erroneous was his claim that the match between the SIPF dipeptides and the ‘primitive’ proteins had a probability of occurring by chance of only 1 in 10^18. And I was correct: his claim was erroneous - even Principia noted this… Quote:
|
|||
03-30-2003, 10:34 AM | #38 | |||
Veteran Member
Join Date: Jan 2001
Location: USA
Posts: 1,072
|
Quote:
Quote:
My only other statement about statistical significance was as follows: Quote:
Now, last time I checked, 1 in 10^7 falls somewhere in between 1 in 5 and 1 in 10^18. Therefore, it does not meet or exceed either of the probabilities to which I attached any statistical significance. PS: As I will point out in my next post, the probability is likely much larger than 1 in 10^7 because of a problem with Rode’s calculation I pointed out but could not eliminate in my calculations due to Rode's incomplete information. |
|||
03-30-2003, 10:39 AM | #39 | |
Veteran Member
Join Date: Mar 2002
Location: anywhere
Posts: 1,976
|
Quote:
EDIT: Note that DNAunion is saying here that Ala joining Ala via the N-terminus of the second Ala is a different process from Ala joining Ala via the C-terminus of the second Ala. |
|
03-30-2003, 10:43 AM | #40 | ||
Veteran Member
Join Date: Mar 2002
Location: anywhere
Posts: 1,976
|
Quote:
Quote:
|
||