I’ve written before about Shawn Carlson’s “A Double-blind Test of Astrology”, published in the journal Nature in 1985. To recap: 116 people completed California Personality Inventory (CPI) surveys and provided their natal data (date, time and place of birth). One natal chart and the results of three CPI surveys (one of which was for the same person as the natal chart) were given to an astrologer, who was to interpret the natal chart and determine which of the three CPI results belonged to the subject whose natal chart it was. In only 40 of the 116 cases did the astrologers choose the correct CPI, almost exactly the one-in-three success rate expected by random chance. The conclusion: the test gave us no reason to suppose astrology works.
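(For anyone who wants to check the arithmetic, here is a minimal sketch, in Python, of that chance calculation. The counts are the ones quoted above; the binomial test itself is my illustration, not anything from Carlson’s paper.)

```python
from scipy.stats import binomtest

# Under the null hypothesis (astrology conveys no information),
# each of the 116 matchings is a 1-in-3 guess.
n, k, p_chance = 116, 40, 1 / 3

print(f"Expected correct by chance: {n * p_chance:.1f}")   # ~38.7
print(f"Observed: {k}, two-sided p = {binomtest(k, n, p_chance).pvalue:.2f}")
# The p-value is nowhere near significance: 40/116 is what guessing looks like.
```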
That was 24 years ago. As far as I am aware, no serious challenge had been made to Carlson’s conclusions until, perhaps, recently. I have now been informed that Suitbert Ertel, professor of psychology at Göttingen University, claims to have found serious flaws in Carlson’s paper. Ertel’s paper is apparently not available online, although I have read a summary of it by Ken McRitchie: Reappraisal of 1985 Carlson study finds support for astrology.
Normally, articles in Nature, or any scientific journal, are peer reviewed before publication. The peer review process subjects scientific beliefs and claims of fact to critical analysis by qualified experts. Yet, even though the Carlson study makes claims of scientific fact, it had not been peer reviewed. Nature had published the article as editorial content in the Commentary section, a detail that undoubtedly has been overlooked by countless authors who have cited the study and regarded it as definitive.
I found this a strange if interesting claim – some Nature articles are not peer reviewed? The “commentary” section is somehow less “definitive” than if it had been published elsewhere in the journal? I emailed Shawn Carlson to see what he had to say about this. His reply began:
In general I stop reading an article when I come across the first absurdity. Ken McRitchie states that my paper was not peer reviewed. Not so. It survived a rigorous peer review that included a famous psychologist whom I will reveal in a later publication. To the best of my knowledge, Nature never publishes research articles that have not been properly reviewed, and I doubt that any serious scientist thinks otherwise. I talked personally with then Editor-In-Chief John Maddox about whether or not the "Commentary" section was the appropriate venue and he assured me that they often published original results there that they believed were likely to be of general public interest.
So it was peer reviewed, and the commentary section is not reserved for non-definitive reporting.
I have to say, I found McRitchie’s criticisms here to be laughable, especially considering that Ertel’s “reappraisal” is published in the so-called Journal of Scientific Exploration – a journal that includes papers by Dean Radin and Ian Stevenson among others. Ertel’s paper is rather laughingly referred to as “peer-reviewed.” Well yes, but if his “peers” are the likes of Radin and Stevenson, I’m unimpressed. According to Wikipedia, the journal is not indexed in Web of Science, an indexing service that covers over 10,000 of the highest-impact scientific journals worldwide. Of course, that doesn’t make Ertel’s paper wrong, but it shows this criticism of Carlson’s paper to be absurd at best.
Instead of presenting the astrologer participants with pair choices, which is the normal format for such tests, and the format followed in an earlier well-known astrological study by Vernon Clark (1961), Carlson presented a three-choice format, consisting of one genuine object and two selected at random. This three-choice format, Ertel notes, is less powerful than a two-choice format.
Maybe, but that doesn’t mean the one in three choice is invalid. If astrology were real, it should have been possible for the astrologers to pick out the correct one with greater odds than you would get with just guessing. They didn’t. If you were to pay money to an astrologer, would you be happy to be given advice based on an interpretation of your horoscope that, while not accurate for you, was accurate for the second closest person? I doubt it, and this certainly isn’t what astrologers claim.
Ertel is also critical of Carlson’s piecemeal analysis of the sampled data, in which only sub-samples are examined instead of the total effects. The correct analysis for a three-choice format, Ertel asserts, is to calculate the proportion of combined first and second choices according to the normally accepted protocol. Carlson initially states his intention to do this but then disregards the protocol for no given reason. Re-analysis shows that the astrologers correctly matched CPI profiles to natal charts better than would be expected by chance with marginal significance (p = .054). This positive result, Ertel found, was replicable with even better results (p = .04) for the astrologers’ ten-point rating of profiles fit to birth charts, a procedure that Carlson requested of the astrologers but ignored in the end, again without giving reasons.
Firstly – Carlson didn’t ignore the “ten-point rating of profiles.” He used the ten-point rating results and concluded (page 424):
Next we took the weights into account, by a method established before studying the data.
[Snip]
The scientific hypothesis predicts that 1/3 of the choices at any weight should be correct choices. Figure 4 shows the percentage correct for each weight with the appropriate error bars, and the best linear fit with slope -0.01 +/- 0.02. The slope is consistent with the scientific prediction of zero slope.
In other words, the astrologers’ confidence that they were right (on a scale of 1 to 10) bore no relation to whether their choices were actually correct: high-confidence choices were no more accurate than low-confidence ones.
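(To make that test concrete, here is a minimal sketch of the same idea. The per-weight percentages below are invented purely for illustration – Carlson’s actual data are in his Figure 4 – but the procedure, fitting a line to fraction-correct versus confidence weight and checking the slope against zero, is the one the quoted passage describes.)

```python
import numpy as np

# Hypothetical fraction-correct at each confidence weight (1-10).
# These values are made up for illustration; Carlson's real numbers
# are plotted in Figure 4 of the Nature paper.
weights = np.arange(1, 11)
frac_correct = np.array([0.36, 0.31, 0.35, 0.33, 0.30,
                         0.37, 0.32, 0.34, 0.31, 0.35])

slope, intercept = np.polyfit(weights, frac_correct, 1)
print(f"slope = {slope:+.3f}")
# If the astrologers' confidence tracked their accuracy, the slope would be
# clearly positive; chance predicts a slope consistent with zero.
```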
Regarding the correctness of the second choice options, the Nature paper (page 424) states:
The correct CPI was chosen as the second place choice at the 0.40 +/- 0.044 rate which is also consistent with [random chance].
It looks to me as though the astrologers' first and second choices were consistent with guessing, rather than with any information provided by astrology. In addition, Carlson did use the ten-point rating scale (despite what Ertel claims), and I see nowhere in the Nature paper where Carlson initially states his intention to calculate the proportion of combined first and second choices but then changes his mind, as Ertel also claims. (If anyone can see where he says this, I’d be interested. I’ve read the paper several times, and I can’t find it.)
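(For completeness, here is a rough sketch of the combined first-plus-second-choice test Ertel describes, using the rates quoted above as approximate inputs. Treating the 0.40 second-place rate as applying to all 116 trials is my simplifying assumption, not something either paper states, so take the output as illustrative only. Under pure guessing, a hit in either of the top two places occurs with probability 2/3.)

```python
from scipy.stats import binomtest

n = 116
first_correct = 40                 # from Carlson's paper
second_correct = round(0.40 * n)   # the 0.40 second-place rate quoted above
combined = first_correct + second_correct

# Under guessing, P(correct CPI in first OR second place) = 2/3.
result = binomtest(combined, n, 2 / 3, alternative="greater")
print(f"{combined}/{n} in the top two, one-sided p = {result.pvalue:.3f}")
# This lands around the marginal values Ertel reports - exactly the kind of
# borderline, after-the-fact number discussed below.
```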
The eminent psychologist Hans Eysenck, late author of the book Astrology: Science or Superstition, argued that the CPI explicitly states that it should be interpreted only by trained and experienced users, and the astrologers lacked the necessary qualification. Other critics questioned whether the CPI and astrology evaluate personality in the same ways, and whether there was enough common ground for astrologers to make valid matches.
This one really had me rolling my eyes. McRitchie really needs to make his mind up. Is the CPI a valid way of testing astrology or not? Because if it isn’t, then Carlson’s whole test is invalid regardless of what any “reappraisal” of the statistics tell us, which means that Ertel can’t claim that Carlson’s test now supports astrology. He can’t have it both ways. Which is it?
Of course, if the CPI is not a valid way of testing astrology, then all this means is that there is still no evidence astrology works. Remember, proponents of astrology are the ones making the claim, and so they are the ones with the thing to prove. Null hypothesis, anyone?
Furthermore, if the CPI is not a valid way of testing astrology, and if asking the subject which natal chart interpretation is correct (Carlson’s test #1) is not valid (as Carlson concluded), then how should astrology be tested? It’s a bit rich for proponents of astrology to criticize a method of testing astrology to see if it works, if they don’t have a valid way of testing it themselves. As I wrote in January 2007 in Testing Astrology – Again, if none of these methods are acceptable, and therefore if astrology can’t be tested, then this means that astrology is almost certainly bogus. For one, astrology’s doubtful provenance (no known method by which it is supposed to work, no known way its rules were derived, its absurd premises) means we need extraordinary evidence that it works. By this I mean better evidence than we demand for many other things. And we certainly don’t have this extraordinary evidence. And for two, if astrology can’t be tested, then clearly no one would ever have been able to work out all the detailed rules astrologers use. How would they have been able to work out the rules if there is no way of ever testing them to see if they were right? On what basis do astrologers claim that astrology works? How do they know?
Proponents of astrology want to have it both ways – they claim astrology can’t be tested, and yet they also claim they know it works. And you can see this mindset in all its absurdity with this from Ertel:
“The results are regarded as insufficient to deem astrology as empirically verified,” Ertel warns, “but they are sufficient to regard Carlson’s negative verdict on astrology as untenable.”
- which is an epic FAIL. If the results of Ertel's study are insufficient to deem astrology as empirically verified, then astrology failed the test. The null hypothesis is that astrology doesn't work. Come on, this is basic experimental design. Ertel, with this comment, shows himself to be a pseudo-scientist who is clueless about how real science works. This is further confirmed by the fact that in Ertel’s “reappraisal” methodology, he decided to change the things that he thinks the test should have been looking for in the first place. This is a strict no-no in experimental design. In a well designed test, you state your objectives clearly before you do the test, and you compare the results with what you stated at the beginning you would be looking for. What you should never do is run the test, decide the results were not what you wanted, and then go looking for something else that might show the result you did want. It’s well known that this technique is likely to return false positives, which is why it is never done by real scientists. Or as Carlson wrote in his email:
I could go on at length about Dr. Ertel's flawed analysis. I will take up that effort at a later time. For now, let me just point out that a scientist never figures out how to analyze his data after the fact with all the data in view. This leads to selection biases that can skew the results to favor the experimenter's hopes. Apparently Dr. Ertel has done just that and from many possible approaches he has selected a particular method of analysis that gives him a result that he says is "marginally significant" in favor of the astrologers. That won't surprise anyone who understands statistics and knows how certain subtle pitfalls often turn a "great discovery" into a fool's errand (and sometimes vice versa).
To get there however, Dr. Ertel ignores the direct and extremely significant null result that I obtained using data analysis methods that were perfectly reasonable and selected before the fact. The astrologers failed directly--thereby refuting the notion that astrology is so effective as a discipline that a selection of astrologers who are held in high esteem by their peers can gain access to vital information that is not available to ordinary psychologists.
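(The selection bias Carlson describes is easy to demonstrate. Here is a toy simulation, my illustration rather than anything from either paper: each simulated “experiment” is pure 1-in-3 guessing, but we let the analyst try several after-the-fact analyses and keep whichever gives the smallest p-value.)

```python
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(0)
n_experiments, n_trials, n_posthoc_looks = 1000, 116, 5
false_positives = 0

for _ in range(n_experiments):
    # Null data: five independent looks at pure 1-in-3 guessing.
    hits = rng.binomial(n_trials, 1 / 3, size=n_posthoc_looks)
    # Analysis shopping: report only the most "significant" look.
    best_p = min(binomtest(int(h), n_trials, 1 / 3).pvalue for h in hits)
    false_positives += best_p < 0.05

print(f"'Significant' findings on pure noise: {false_positives / n_experiments:.0%}")
# Far above the nominal 5% - which is why the analysis must be fixed in advance.
```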
This is the best astrology proponents can do after 24 years – no new information, no new tests that show astrology can do anything, just some dishonest and scientifically dodgy sniping at a comprehensive test that all involved (including the astrologers) agreed, before the test was done, was a valid test. Proponents of astrology are the ones who need to show evidence that their magic fortune telling system is real, and yet even after 24 years, and after the efforts of a university professor with a history of writing papers supportive of astrology, the best they can do is say, and I quote again:
The results are regarded as insufficient to deem astrology as empirically verified
If Ertel, or anyone else, believes that astrology works in any measurable way, then they need to stop sniping at Carlson’s well designed and genuinely peer reviewed test, and they need to design and perform their own experiment, have it peer reviewed and published in Nature or another real scientific journal. As Carlson notes, such a successful test would undoubtedly win its author the Nobel Prize. If astrology is real, it should be possible to design and conduct such an unequivocal test. Why don’t they?
> For now, let me just point out that a scientist never figures out how to analyze his data after the fact with all the data in view.
This is not entirely true, I think, and depends on the area of scientific research. There are studies where you generate a lot of data first, and the best way of organizing and presenting them depends on what you find. In evolutionary biology, for example, you might want to explore the relationships of different species with DNA sequence data. Once you have generated sequence data, you might find either that there are a lot of differences between sequences and sequences are generally unique to individual species, or you might find few differences and incomplete lineage sorting. In the first case, a cladistic approach would be appropriate, in the second, a network approach. Of course, this is not really comparable to a simple is-this-significant-yes-or-no question, but it is still science. It can even be hypothesis testing, if you had a hypothesis, for example based on morphology, of what the relationships were.
Posted by: Mintman | August 03, 2009 at 02:28 AM
Excellent piece; thanks for posting this.
I just wanted to relate a recent encounter with an astrologer, who has studied the subject for many years, and regularly produces birth charts for friends and colleagues. I wrote what I thought was a devastating critique of astrology with the usual searching questions, such as "Why 12 signs? How were the initial correlations between signs and individual characteristics arrived at? By what mechanism could the constellations have an effect on individual humans?" etc, and gave it to her to read. She later conceded that there was a lot of sense in what I had written, did not contest any of my arguments, but stated, with absolute conviction "But you would be sceptical, wouldn't you? After all you ARE a sagittarius!" Ever feel that you're wasting your time on this rationality thing?
Posted by: derek hudson | August 03, 2009 at 06:26 AM
Great piece as usual. It did get me thinking about one thing, though. We continually say that "extraordinary claims require extraordinary evidence", and I'm not sure why. That is, if we can produce clear evidence that something is true (or false) then it is clear evidence, regardless of how extraordinary the claim might be. The phrase implies that it is acceptable to say a mundane claim can be supported by questionable evidence, which seems wrong to me. Either you have the goods or you don't, no matter how far out the claim might be.
Perhaps it should not be how extraordinary the claim is, but how significant. When it is something with a really big impact - like, say, homeopathy, where validation would require throwing out quite a lot of physics and chemistry - we need to be certain about the evidence before we start tossing out possibly baby-inhabited bathwater; while we can accept or reject marginal evidence about trivial claims because it doesn't matter anyway.
Not a particularly important observation - just something that strikes me every time I hear the phrase.
Posted by: Yojimbo | August 03, 2009 at 08:27 AM
"Ever feel that you're wasting your time on this rationality thing?"
LOL, please note the name of my blog
Posted by: TechSkeptic | August 03, 2009 at 09:12 AM
Yojimbo
That’s a good question. Strictly speaking, all claims require exactly the same amount of evidence, it’s just that most "ordinary" claims are already backed by extraordinary evidence that you don’t think about. When we say “extraordinary claims”, what we actually mean are claims that do not already have evidence supporting them, or sometimes claims that have extraordinary evidence against them.
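(One way to make this precise, in Bayesian terms – my gloss, not part of the original post: the same strength of evidence moves a well-supported claim and an unsupported one to very different conclusions.)

```python
# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
def posterior_prob(prior, likelihood_ratio):
    odds = (prior / (1 - prior)) * likelihood_ratio
    return odds / (1 + odds)

evidence_strength = 20  # the same, fairly strong, evidence in both cases

# A mundane claim, already consistent with everything we know:
print(f"prior 0.50  -> posterior {posterior_prob(0.50, evidence_strength):.2f}")   # ~0.95
# An "extraordinary" claim with no prior support:
print(f"prior 0.001 -> posterior {posterior_prob(0.001, evidence_strength):.3f}")  # ~0.020
```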
I wrote more about this here: Extraordinary Claims Require Extraordinary Evidence.
Posted by: Skeptico | August 03, 2009 at 10:22 AM
Ah! As the window washer said, it all becomes clear - the link is a great explanation. It makes sense, but as a stand-alone phrase it isn't immediately clear - though I have no idea how it could be improved.
Posted by: Yojimbo | August 03, 2009 at 02:29 PM
I get so sick and tired of "skeptics" disparaging the Journal of Scientific Exploration. Do you have any idea how rigorously peer-reviewed it is? Have you even taken a look at the scientists on the editorial board? Don't dismiss the journal simply because it publishes papers on "fringe" science. And are you even the slightest bit aware of how much of a rigorous experimentalist Dean Radin is? You're shooting yourself in the foot by pointing out that he publishes in it and concluding that the journal is therefore a joke.
Posted by: MrEvidential | August 03, 2009 at 06:58 PM
Perhaps you'd like to give us an example of his alleged rigor to back up these assertions.
Posted by: Bronze Dog | August 03, 2009 at 07:15 PM
I get so sick and tired of "skeptics" disparaging the Journal of Scientific Exploration.
No one is forcing you to read skeptics. Really, they are just doing their thing. Why do you have a problem with that?
Do you have any idea how rigorously peer-reviewed it is?
A brief impression: a commenter on another thread recommended this study as a model of excellence.
It has a note from the editor about what the reviewers didn't like about the study at the end of the paper, and that indicates the standard of the "peer review".
The researcher had gathered data on "personality" or behavioural traits of puppies, and then analysed it according to astrology. The reviewers suggested that the behavioural data should have been gathered by a neutral person as well as by dog owners to avoid subjectivity, and wanted better justification that human astrology is valid for dogs. They also suggested using older dogs as it may be that (as with humans) younger subjects might not have fully developed their true astrological traits.
They didn't have any problem with astrology being used as a valid research tool. They didn't have a problem with the fact that any "hit" was automatically ascribed to astrology without reference to probability. They didn't complain about the conclusion that basically two hits out of six was seen as a positive (or in some way worthwhile result). The study is so badly designed that at best it could be called a complete and utter waste of time and at worst, deceitful and fraudulent.
Further, the reviewers had no trouble at all with this statement either:
The similarity of observations between dogs and human astrological descriptions can only explained by the existence of a physical causal effect, so far unknown. This eliminates the arguments frequently advanced to "explain" this astrological tool; the fact that the human mother, knowing the birth chart of their children, influences her child in the "right" direction. Clearly no such cultural factor can occur in dogs. It is also difficult to evoke a factor of hereditary nature. For such a factor to be effective, all pups of a given litter should be borne under the same planet position, which is not the case due to the duration of whelping. Indeed, pups coming from the same litter have different behaviours and different sky positions.

Thus it must be supposed that a causal physical influence exists. It is worth recalling here various studies on the perception of waves emanating from sky elements, particularly the Sun and Jupiter....
That kind of utter stupidity and pig ignorance does not belong anywhere near any kind of journal containing the word science. The reviewers who passed it and the editor who published it are complete fruitcakes.
Have you even taken a look at the scientists on the editorial board?
Appeal to authority. Just because someone has a university post doesn't mean that he doesn't have a whole heap of rationalisations about why his work is important. There's more to "scientific method" than "I'm a scientist and this is my method". And I think those scientists need to have a look at themselves for associating themselves with such a stupid publication.
Don't dismiss the journal simply because it publishes papers on "fringe" science.
I dismiss the journal as absurd deceitful crap and bullshit purely on the strength of the above quote. I'm sure there are plenty more where that came from, but that's enough for me. And Ertel criticises Nature as deceptive?????
Sorry Mr Evidential, but the evidence is not only against you, it's pulling funny faces at you.
Posted by: yakaru | August 04, 2009 at 04:59 AM
Isn't Mr Evidential joking? Somehow I immediately got the impression that he was. If he was not: EPIC FAIL!
Posted by: Valhar2000 | August 04, 2009 at 05:16 AM
Valhar2000:
If he's the same MrEvidential you can find with a quick Google search then no, he doesn't appear to have been joking. Unless he's a Poe.
He's another one of the "If you don't believe what I do then you aren't a true skeptic" brigade it seems.
One only has to look at this page to see how serious this journal is.
From looking through some of these you'll get an idea of the standard of peer review and research. For instance, 'Common Knowledge of the Loch Ness monster'. Well worth studying I think you'll agree, or how about 'A Case of Severe Birth Defects Possibly Due to Cursing'.
Clearly we are looking at the work of 'True Skeptics'.
Posted by: Jimmy_Blue | August 04, 2009 at 08:38 AM
In fact, the more one looks at the research available on the journal's website, the more laughable the claim that it is rigorously peer reviewed becomes.
We have papers on HIV written by AIDS denialist and self-confessed homophobe Henry H. Bauer. Not to mention the rigorous peer reviewing that goes into his work on the existence of the Loch Ness Monster (in which he believes).
How about the paper advocating Intelligent Design and comparing it to the theory of continental drift?
How about the paper on the evolution of bipedalism that appears to simply accept the existence of sasquatch as a proven fact and references the Patterson-Gimlin film?
As to the "rigorous experimentalist Dean Radin" - does that include his simply assuming that psi information can and does exist everywhere and can be accessed at all times, in order for him to say that a dog in someone else's study is psychic? Check out his "A Dog That Seems To Know When His Owner Is Coming Home".
How does assuming the existence of that which you hope to prove when looking at someone else's research relate to the journal's advice to authors that papers should "conform to the rigorous standards of observational techniques and logical argument"?
That journal is just a bad joke.
Posted by: Jimmy_Blue | August 04, 2009 at 10:39 AM
Didn't the Carlson study also find that the test subjects themselves couldn't identify their own CPI? It seems to me, then, that what the study really showed is that the CPI isn't a good way to distinguish people's personalities in a test like this.
Posted by: AvalonXQ | August 04, 2009 at 01:47 PM
"Proponents of astrology want to have it both ways – they claim astrology can’t be tested, and yet they also claim they know it works."
Of course, pretty much all woo proponents operate this way. Reiki practitioners, for example, openly admit that ki/qi cannot be detected by any scientific instruments, yet simultaneously claim to be able to manipulate it with their bare hands.
Posted by: Chayanov | August 04, 2009 at 03:03 PM
AvalonXQ: Even if it is true, it probably has something to do with the fact that people lie to themselves a lot. A better test of the CPI might be asking people familiar with the person, and perhaps familiar with how the CPI works, to identify the CPI.
Posted by: King of Ferrets | August 04, 2009 at 03:20 PM
Valhar2000 and Yakaru:
Did I say anywhere that I believed in astrology? No, I didn't. The point was that the JSE has legitimate peer-review. And Jimmy Blue, you're dismissing the dog study simply because it suggests an anomaly. You should go back and read the history of science. You guys dismiss the journal simply because it published an article about the Loch Ness Monster and without even delving into the evidence presented in the paper. Wow... real skeptical.
Posted by: MrEvidential | August 04, 2009 at 05:32 PM
Personally, I'm perfectly fine with not bothering to look at the Loch Ness Monster paper before dismissing it. Seeing how, y'know, we've checked the entire goddamn loch and found absolutely nothing!
Posted by: King of Ferrets | August 04, 2009 at 10:49 PM
I read through "The Case for the Loch Ness 'Monster'". The only interesting part was at the end:
"Since the author is also Editor of the Journal, no truly disinterested mode of having the piece refereed seemed available. Consequently it is published not as a Research Article but as an Essay."
How rigorous is that peer-review process, again?
I am a little curious about the article entitled "How to Reject Any Scientific Manuscript" though.
Posted by: Chayanov | August 04, 2009 at 11:39 PM
Did I say anywhere that I believed in astrology? No, I didn't.
Did I say you believed in astrology? No I didn't. Everything I wrote above was about their peer review process. Their reviewers accepted the sentence:
"The similarity of observations between dogs and human astrological descriptions can only explained by the existence of a physical causal effect, so far unknown."
Regardless of whether astrology is true or not, that sentence is utter bollocks, isn't it? It indicates how low the bar has been set by the reviewers and the editorial board. I would not accept that statement from an eight year old. The editor and the reviewers not only pass it, but allow the researchers to claim that their study demonstrates that astrology is a stronger influence than both genetics and culture.
Notice, I am not just laughing at the idea of astrology, I am saying that the researchers have not justified their claims, and the reviewers haven't responded to their glaring and laughable errors.
That is why this journal is a joke and its editors and reviewers are weak-minded. That is my claim, and there, above, is my evidence.
The point was that the JSE has legitimate peer-review.
Wrong. They have illegitimate peer review, as shown above.
Rather than just repeating your assertion that we are dismissing things out of hand, how about dealing with the issues we have raised?
Do you find the reasoning in the passage I quoted acceptable?
Do you expect anyone not to dismiss out of hand the idea that a curse can "possibly" cause serious human deformities?
Posted by: yakaru | August 04, 2009 at 11:45 PM
MrEvidential:
Where did I say that was the reason I dismissed it, or are you just parroting the usual pseudo-scientist's defensive response?
Oh go on then I'll bite, why should I and how do you know I am not already familiar with this?
So, you know for a fact that I didn't read the paper then? How do you know this? You aren't just parroting the usual pseudo-scientist's assumptions, are you?
Unlike you obviously, who makes statements based purely on their own bias and assumptions.
Posted by: Jimmy Blue | August 05, 2009 at 06:20 AM
MrEvidential, summarized: "La-la-la! I'm not listening to your actual complaints! I'm making up my own lame excuses for you so that I can inflate my ego and act superior!"
Here's a hint, MrEvidential: If someone links to a web page on a logical fallacy to make a point, he's not "dismissing it just because it has an anomaly."
Additionally, you seem to be missing the whole point behind peer review. It's Cargo Cult Science all over again.
Note on the Loch Ness monster: Something that big living in the lake with no biological impact and able to evade all attempts at detection other than tourists with fuzzy cameras? Seriously. If someone could get quality evidence for something like that, it'd be plastered all over every TV Network as real scientists look it over.
Posted by: Bronze Dog | August 05, 2009 at 07:11 AM
Skeptico, thanks, this post is just what I was asking for in my original comment. While I think some criticisms of your criticism of the criticism of Carlson can still be made, at least when people search with Google and the appropriate keywords, something will be there, whereas before there was nothing at all but multiple copies of the "press release" of (not Ertel, but) some astrology PR institute.
Yakaru, you say that "a commenter on another thread recommended [the dog] study as a model of excellence." I don't know who or what you are referring to, since the only reference to that study I am aware of was by me, and I didn't recommend it in that way. In fact I said I was "prima facie skeptical", and indicated that I thought a main problem was that they provided no evidence that they did adequate blinding.
You criticized the study originally (and apparently again this time) by suggesting that the statistical design was seriously flawed.
You said "Six behavioural traits were correlated with ten astrological traits. All they found was some kind of "strong" association between the angle of Jupiter and the sun, and extroversion. The other measures failed. Of course, they don't put it like that in the article."
But of course, a strong association (correlation) of that sort is a positive result, and apparently it was a statistically significant one. You suggest that the lack of other significant correlations renders this positive correlation irrelevant, worth emphasizing with a taunting "fail" in boldface, but as far as I can tell, given the statistical design of the experiment, this is not true at all. It is just a matter of multiple hypothesis testing, where techniques such as the Bonferroni or Hochberg corrections are employed to adjust the p values required for global significance, and indeed they seem to have employed such techniques. That is "the way they put it in the article". Why would they put it the way you accuse them of not putting it? Their way appears to be the correct way.
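(For readers unfamiliar with the corrections mentioned here, a minimal sketch with made-up p-values, not the dog study's: Bonferroni simply divides the significance threshold by the number of tests, while Hochberg steps through the ordered p-values and is somewhat less conservative.)

```python
# Illustrative p-values from a hypothetical batch of 6 correlation tests;
# these are NOT the dog study's numbers.
p_values = [0.004, 0.02, 0.11, 0.30, 0.47, 0.81]
alpha, m = 0.05, len(p_values)

# Bonferroni: compare every p-value to alpha / m.
bonferroni = [p <= alpha / m for p in p_values]

# Hochberg step-up: find the largest k (1-indexed) with
# p_(k) <= alpha / (m - k + 1); that test and all smaller p-values pass.
sorted_p = sorted(p_values)
k_max = max((k for k in range(1, m + 1)
             if sorted_p[k - 1] <= alpha / (m - k + 1)), default=0)
hochberg = [k_max > 0 and p <= sorted_p[k_max - 1] for p in p_values]

print("Bonferroni:", bonferroni)  # only p = 0.004 survives here
print("Hochberg:  ", hochberg)    # agrees here; with other inputs it can pass more
```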
None of this is to say the positive result has to be accepted at face value, or that their speculation about it does not go far beyond its face value. They certainly will not have satisfied everyday skeptics that they have adequately ruled out wishful thinking as a way to account for the results, through inadequate blinding. And given the strange nature of the astrological aspects considered, one might also suspect that they may have gone through many data sets before they hit upon the published one, without applying the multiple testing corrections to these.
Posted by: Benson Bear | August 05, 2009 at 11:55 AM
Benson, firstly, an apology; yes, I was referring to the comment you made here, and as anyone who reads it will see, you did not offer it as a "model of excellence", nor anything of the sort.
Wrong. Lazy. Fail. Apology and unconditional retraction.
I really only wanted to mention the study because I find it a classic example of obfuscation and deceitful reasoning.
I did (and still do) find your position on this a little hard to pin down, and maybe a bit overly accommodating. I still disagree that their results can be termed positive, because the probability of a chance "hit" is ignored. More importantly, given the way astrologers regard their accuracy, and the authors' own conclusion that astrology is stronger than environment and heredity, the results should be much clearer. And of course, the possibility that there is no astrological influence whatsoever is blotted out completely. A clearly anti-scientific attitude.
That's why I (still) see it as a failure. (The bold was supposed to be referring to the study, and wasn't meant as a taunt.)
As you note: "And given the strange nature of the astrological aspects considered, one might also suspect that they may have gone through many data sets before they hit upon the published one, without applying the multiple testing corrections to these."
It seems to me like a classical smoke and mirrors trick, giving the impression that unless you understand the astrological theory you're not qualified to object to any part of it at all.
If astrology was as accurate as they and so many routinely claim, it would be quite easy to test. Instead, people like Ertel are still grizzling about experiments from 1985.
Thanks for your insightful comments, here and elsewhere, Benson.
Posted by: yakaru | August 05, 2009 at 01:21 PM
Simply digging up a couple of papers which appear to demonstrate at least occasional flaws in the peer-review process of the journal will not suffice to demonstrate that its peer review is below mainstream scientific standards. I emailed Dr. Stephen Braude, editor-in-chief of the JSE, and he commented:
"I don't know of any journal for which the peer review process is flawless. Peer review never guarantees that only worthy papers and books are published. If that were the case, we'd see far fewer publications across the board."
I'm sure if you guys would really delve into the many papers published you would actually agree that the overall rigorousness of peer-review is pretty standard.
Yakaru --
NO, the idea of "cursing" should not be dismissed out of hand. If evidence is produced for a claim, it should be objectively evaluated and considered. PERIOD. See Skepticism 101.
JimmyBlue said:
"MrEvidential, summarized: "La-la-la! I'm not listening to your actual complaints! I'm making up my own lame excuses for you so that I can inflate my ego and act superior!"
Wow, Jimmy, that's quite a leap of faith isn't it? To assume that "I'm making up my own lame excuses for you so that I can inflate my ego and act superior!" Do you have any good evidence that my so-called "excuse making" (which it isn't, see above) is to "inflate my ego and act superior"? Have you eliminated other alternative explanations for my comments, or are you just being an asshole?
JimmyBlue said:
"As to the "rigorous experimentalist Dean Radin" - does that include his simply assuming that psi information can and does exist everywhere and can be accessed at all times, in order for him to say that a dog in someone else's study is psychic? Check out his "A Dog That Seems To Know When His Owner Is Coming Home"."
What an odd thing to say... aren't you aware that other scientists often revisit each others' studies and attempt to look for factors that might have helped produce the outcome? This is also a clear case of confirmation bias. Why don't you look up a study in which Radin conducts an actual experiment with his own subjects - not one where he attempts to provide evidence for an effect in another researcher's study. Just take a look at how rigorously well-controlled his experiments are.
Posted by: MrEvidential | August 05, 2009 at 08:38 PM
Please don't take the Stephen Braude quote as a simple appeal to authority. The peer-review process is in no way flawless! That's quite obvious from taking a look at just about any other journal. I realize the study is about astrology, but the objective of the SSE is to evaluate alleged scientific anomalies. Realize the occasional flaws of peer-review and avoid bias when thinking about the fact that it is an astrological study.
Posted by: MrEvidential | August 05, 2009 at 08:48 PM
It occurs to me that the JSE could be very strictly peer-reviewed.
It's just that those peers believe in just as much crap as the people who do the studies.
Posted by: King of Ferrets | August 05, 2009 at 08:52 PM
Those allegedly occasional flaws are very basic errors.
Not recognizing posting names:
You laid out plenty of obvious straw men, and I'm just using my pattern recognition with other cookie-cutter woos to infer your motives.
These straw men you blurted out are very standard ones that are parroted by woos instead of dealing with what skeptics actually say. These characterizations often make it into entertainment media where horror movie goers can have a laugh at, for example, the folly of a straw skeptic that exists only in their stereotypical fiction. The result is usually that they feel superior to this mockery.
So, anyway, how about you show us what you would consider a good example of their peer review, so that we won't be able to cherry pick?
Posted by: Bronze Dog | August 05, 2009 at 09:12 PM
King of Ferrets --
According to your (null) hypothesis, if the JSE were indeed strictly peer-reviewed (meaning it wouldn't matter whether the "peers" believed in it, just whether the research was of good quality) the so-called "crap" would not get published. Yet the anomalous stuff does get published. So what you said doesn't make sense.
Posted by: MrEvidential | August 05, 2009 at 09:18 PM
Bronze Dog --
What an epic fail in attempting to defend your massive leap of faith in labeling me an egotistical person. Well, I believe I have sufficient evidence to conclude that you're an ass because of the above and that you call people names like "woo" because you disagree with them. Oh yes, go ahead Bronze, label anybody who happens to be compelled by the large body of evidence for psi a "woo." Real civil. And go ahead and label just about anything a "straw man." Your criteria are very loose, indeed.
Bronze Dog just take a look at any experiment by Dean Radin published in JSE.
If you wish to indulge yourself in this go ahead. I really don't want to waste too much time going back and forth on this.
Posted by: MrEvidential | August 05, 2009 at 09:36 PM
Anomalous is a fun word. It's often used to describe a lot of stuff I find downright boring, easily explainable, and/or expected.
...And it seems I don't have it on the Doggerel index, yet. Guess I should pencil that in.
Posted by: Bronze Dog | August 05, 2009 at 09:40 PM
How about Radin's recently published randomized triple-blind replication of the effects of distant intention on water crystal formation. If you're not satisfied with the level of statistical significance of the study, just pick another. Good luck with that...
Posted by: MrEvidential | August 05, 2009 at 09:43 PM
*snicker*
Do go on.
Posted by: Skemono | August 05, 2009 at 09:44 PM
How about this study by Dean Radin?
Says it all about Radin, really.

Posted by: Skeptico | August 05, 2009 at 09:46 PM
Pattern recognition. If you want me to think of you as otherwise, try acting against stereotype.
Smells like projection. I call 'em woos because they rely on logical fallacies.
Evidence, such as, specifically?
And why haven't they applied for the JREF million?
If you say someone believes something they don't, that's called a straw man fallacy. What's so hard to understand about that?
Oh, I know this old trick: If I pick one out myself and eviscerate it, you'll just claim I was aiming for a low hanging fruit. How about you pick the highest hanging fruit you can think of and save us both some time, instead of setting up mobile goal posts.
Posted by: Bronze Dog | August 05, 2009 at 09:50 PM
Since this came up while I was typing:
...How exactly do you "triple blind" a study? Last time I heard that term, it involved a study that couldn't even be called single-blind, much less double-blind: A medium talking, yes, talking, to subjects with just a thin wall between them.
Posted by: Bronze Dog | August 05, 2009 at 09:53 PM
Skemono --
Go buy Entangled Minds by Dean Radin. Take a look at the numerous meta-analyses which demonstrate no significant correlation between methodological flaws and outcome. Also, take a look at the well replicated EEG correlation and presentiment studies. Knock yourself out. And yes, it is a large body of evidence. Why don't you try informing yourself eh?
Posted by: MrEvidential | August 05, 2009 at 09:57 PM
Hey Bronze Dog, yet another assumption! Big surprise!
And what evidence? Take a look at the above response to Skemeno. Radin has already demonstrated that it would take more than a million dollars to carry out the presentiment experiments over the period of time it takes for the many trials. It's towards the end of the video during the FAQ period:
http://www.youtube.com/watch?v=qw_O9Qiwqew
Posted by: MrEvidential | August 05, 2009 at 10:02 PM
Haha lol the experiment is of such good quality Bronze Dog can't even fathom the possibility of the rigorous controls being enforced.
Posted by: MrEvidential | August 05, 2009 at 10:04 PM
Read the goddamn paper, then you'll find out. Cut the sarcastic bullshit.
Posted by: MrEvidential | August 05, 2009 at 10:05 PM
Skeptico --
Read the many rigorously controlled studies by Radin! Then you'll get the real message!
Posted by: MrEvidential | August 05, 2009 at 10:08 PM
MrEvidential
Why would I waste my time? The study I reviewed shows Radin is a pseudo-scientist who will do anything, including unblinding an experiment, until he can find something that he likes. Come on - extraordinary claims require extraordinary evidence. Radin has shown he is credulous and can't be trusted. Whatever experiment Radin performs, it would need to be replicated by somebody independent with integrity who knows what he is doing. Without independent replication, Radin's "experiments" are not worth the paper they are written on. That review I wrote on that Radin paper - that's the real message about Radin. I don't have to waste any more of my time poring over any more of his experiments just to please you.
Posted by: Skeptico | August 05, 2009 at 10:20 PM
Are you that goddamn uninformed?! Radin's presentiment experiments have been replicated a dozen times!!! Go to Radin's blog entry here:
http://deanradin.blogspot.com/2009/08/combat-intuition.html
There are some references in the comments.
And calling Radin a pseudoscientist because he did a little fuzzy post hoc analysis is absolutely ridiculous. How about you email Radin and have a chat with him? Have you done that? Have you asked him what his justification was? I suggest you do it.
Posted by: MrEvidential | August 05, 2009 at 10:28 PM
Dean Radin:
"This line of experiments has been successfully replicated by a growing number of independent investigators, and of the 20 or so studies I'm aware of, nearly all have shown effects in the predicted direction. About half of those studies report statistically significant outcomes."
Posted by: MrEvidential | August 05, 2009 at 10:30 PM
If you're talking about my characterization of you, it's not an assumption, but a tentative conclusion based on observation of people who talk like you. Get the difference straight.
Translation: "I'm not even going to bother answering an honest question about a definition, so instead I'll make something up from nowhere about Bronze Dog."
What does "triple blind" mean? That's what the question was. Your response leads me to think that you don't even know. So, please, prove me wrong while educating me.
The side note I included was only done to demonstrate the people I usually hear from use the term don't know what blinding even means.
If you can't explain what triple blind means, I can't exactly have confidence that you'd know what good evidence would be and thus what would be satisfactory to genuine skeptics, who pretty much require double-blinding.
So, while that YouTube video's loading, how about you give me a timestamp for the relevant part(s), since I'd rather not watch an hour and 34 minutes for one specific thing.
Posted by: Bronze Dog | August 05, 2009 at 10:31 PM
Dang. Messed up the tags. Please fix if you can, Skeptico.
Posted by: Bronze Dog | August 05, 2009 at 10:32 PM
Skeptico, earlier:
And thank you for bringing that example forth. Unblinding an experiment has got to be one of the most egregious sins against the scientific method.
Posted by: Bronze Dog | August 05, 2009 at 10:35 PM
You moron. Replicated as reported on his blog? Come on. I said get it replicated by a real independent body. Wake up. Get a clue.
“a little fuzzy post hoc analysis”? You didn’t read my post did you? It was total pseudoscience:
Why the hell should I? I don’t have the responsibility to investigate every piece of crap from this guy. I've read enough of his stuff from his absurd 9/11 global consciousness garbage to the latest study. He’s a clueless twit and I have better things to do. Anyway, it's already been done: An evening with Dean Radin. If you think his work is worth following after that then you are a credulous fool.
Posted by: Skeptico | August 06, 2009 at 12:26 AM
...why do woos have a tendency to post like 5 times in a row?
Posted by: King of Ferrets | August 06, 2009 at 01:03 AM
Uh-huh.
Posted by: Skemono | August 06, 2009 at 01:24 AM
Mr Evidential:
Simply digging up a couple of papers which appear to demonstrate at least occasional flaws in the peer-review process of the journal will not suffice to demonstrate that its peer review is below mainstream scientific standards.
When the mistake is as idiotic as the one I quoted, it does. See if you can find such a blithering pig ignorant statement as that published in any proper journal.
Yakaru -- NO, the idea of "cursing" should not be dismissed out of hand. If evidence is produced for a claim, it should be objectively evaluated and considered. PERIOD. See Skepticism 101.
See history of science 101.
Evidence for cursing is still produced regularly with horrifying consequences. After a couple of hundred years of scientific research, we have not only every reason, but also the moral duty to dismiss it out of hand and oppose its spread.
There is an enormous body of established science on the causes of human deformities and there is absolutely nothing in it whatsoever that suggests a possibility that cursing might be one of them. In ignoring this, you are dismissing out of hand 300 years or more of painstaking scientific research and advancement.
But of course, you don't need to provide any evidence for this act, do you?
Posted by: yakaru | August 06, 2009 at 02:53 AM
Well said, yakaru. Puts it in perspective with that link.
And Skeptico, thanks for reminding me about that 9/11 Global Consciousness thing. Forgot Radin was associated with it. I have a few fond memories of a JREF forum thread I enjoyed reading:
1. Seems they refused to release their original raw data, instead only releasing some chunks that went through a completely undescribed process.
2. The manufacturers of the "egg" random number generators themselves said they weren't truly random, just standard computer pseudorandom (see the sketch after this list).
3. They changed the time frames where they looked at the data however they liked, after the data was collected.
4. Subtle point I thought was interesting: how do you calibrate the RNG so that it's random in the absence of global consciousness, if you can't separate it from said consciousness?
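(On point 2, a quick illustration of what "pseudorandom" implies; this sketch is mine, not anything from that thread: a seeded software generator is fully deterministic, so there is nothing in it for a "global consciousness" to deflect.)

```python
import random

# "Pseudorandom" means deterministic: the same seed always yields
# exactly the same sequence of numbers.
random.seed(42)
first_run = [random.random() for _ in range(3)]

random.seed(42)
second_run = [random.random() for _ in range(3)]

assert first_run == second_run  # identical every run, no randomness to "skew"
print(first_run)
```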
I left a real groaner of a comment in that thread, "So, you're saying the chickens wouldn't let anyone look at the eggs until after they've been cooked?"
And reading over the first part of that review from the Skeptics' Dictionary Skemono linked to, I see a particularly cynical appeal to motive from Radin: Why would we be embarrassed to admit psi existed if there was good evidence? That'd be frelling AWESOME! (Why must woos be so horribly, horribly cynical?)
Instead, all we get is people ignoring basic principles of science.
Posted by: Bronze Dog | August 06, 2009 at 06:47 AM
Yakaru,
Thanks, but your mischaracterization of me doesn't bother me very much; my main concern is to get the central core of the dog experiment described fairly. My position is "accommodating" because in developing a criticism of any point of view, one wants to be as fair as possible to what might be good in that view, and also one wants to have the strongest possible criticism of the good part, one that is as immune as possible to being "overturned on appeal". But this is what still seems to me to be an entirely incorrect description of the core of the experiment. It seems you are still rehashing your original criticism, in which you complained that they only found 2 statistically significant positive correlations out of 60 possible correlations through which they searched. But as I understand it, and said twice already, the fear of a "chance hit" (i.e., a Type I error or "false positive") was dealt with by modifying the individual alphas according to the aforementioned Bonferroni and Hochberg corrections. Of course, they could have made an error in their attempts to do this. I don't really trust the statistical analyses of anyone doing any of this stuff but I am not qualified to definitively pronounce upon it one way or the other.
I am not too interested in the interpretations and speculations that the experimenters offer at the end. There do seem to be errors one can pick out there. Such as, for example, their claim that "the similarity of observations between dogs and human astrological descriptions can only be explained by the existence of a physical causal effect, so far unknown." Even granting them that the similarity exists and requires a causal effect, why must it be "physical"? Rather, my main concern, again, is with the core of the experiment, which is where all the serious meat is (rancid or not) and which I believe your criticism does not address. Once again, before folks jump on me, this does not mean the study really shows anything. Perhaps there is an error in the statistics for example, just not the obvious ones that have been claimed so far. Or, again, I suggest problems with either blinding or illegitimate and unmentioned data-mining in picking the actual data-set which then in turn was data-mined in a statistically legitimate manner. I wrote the second listed author about this (the first listed author is dead (no seance jokes, please, that would not be nice!)) but I have received no answer from him.
Posted by: Benson Bear | August 06, 2009 at 12:30 PM
Skemono:
Thanks for posting that review of Entangled Minds – it demonstrates why we have no need to take Radin seriously, and why I certainly won’t bother reading any more of his drivel. This paragraph sums up how I feel about Radin:
Although I did mention Radin in my post, he wasn’t the main thrust of what I was writing about, and so I’m done with it now. If others want to continue to beat this dead horse then be my guest, but I’m out for now. My point in criticizing the Journal of Scientific Exploration was merely to point out the absurdity of someone who publishes in the JSE complaining about Nature not being peer reviewed. I mean, come on – you’re trying to say that the JSE is more tightly peer reviewed than Nature? If you believe that then you deserve to be laughed right out of the room.
Posted by: Skeptico | August 06, 2009 at 12:39 PM
And yet he didn't link to a single one.
/beating dead horse
Posted by: TechSkeptic | August 06, 2009 at 01:48 PM
Skeptico --
Those are almost all of the goddamn papers! Read them! Look anywhere for a list of independent replications of presentiment and you'll see a similar list! Those are independent labs replicating the effect! Go seek the evidence for a f**king change! Here's another list with almost all of the papers except a couple by Patrizio Tressoldi in the Journal of Parapsychology:
http://publicparapsychology.blogspot.com/2007/11/brain-response-to-future-event.html
Skemono --
That review of Entangled Minds is an absolute piece of garbage. He spends a good portion of it faulting Radin for a one-sentence line in the book about the Mitchell experiment. How moronic. He doesn't even bother to delve into the meta-analyses -- which demonstrate no significant correlation between methodological flaws and outcome. That demonstrates the effects are real, HELLO! BTW, I am the one who told him to write that review and he said he would do it. He claimed he had misplaced his copy of the book. Practically the next day or two or three, he sent the review to me. He wrote that trash up in a day or two and it shows.
Bronze Dog --
You are wrong. Gary Schwartz and Julie Beischel published a triple blind study in 2007, and it was actually triple-blind. You are apparently referring to the earlier flawed single-blind studies which no one claims were triple-blind. His latest study allows no sensory leakage whatsoever.
Triple blind in this study meant that the people giving intentions, the raters, the photographers and the researchers didn't know anything.
And BTW the reason they triple-blind these protocols is because asshole pseudo-skeptics like you make nonsensical criticisms. Yes, that is why parapsychology is the most rigorous of all sciences. 85% of parapsychological studies are single or double-blind!!! 4.9% of psychological studies are blind, 24.2% of medical studies are blind, 0.08% of biological studies are blind, and 0.0% of physical studies are blind.
http://www.sheldrake.org/Articles&Papers/papers/experimenter/blind.html
Posted by: MrEvidential | August 06, 2009 at 05:39 PM
Ummm, forgive me if I'm wrong, but isn't that what double-blind means? That everyone involved is blinded?
By the way, is it just me or did you give two completely different statistics for blinding in parapsychological studies, one of which would show that you're full of shit about parapsychology being the most rigorous of all sciences?
Incidentally, I'm pretty sure that it's idiotic to blind physical studies... it's not like you shouldn't know whether the electron you're looking at is electron A or electron B or whatever.
Also, the link you gave doesn't indicate how many were single blind and how many were double blind, and the sample size for the parapsychological journals was much smaller than the other, plus the parapsychology articles were from a different time period than all of the others. This seems somewhat shady to me.
Oh, and the statistics are ten years out of date.
Posted by: King of Ferrets | August 06, 2009 at 06:33 PM
The question is: are they blind at all? Parapsychology blinds the most, by far. And if you would read the studies you'd see they are usually double-blind.
Posted by: MrEvidential | August 06, 2009 at 06:52 PM
No, double-blind means only the subjects and experimenters are blind.
No, it wouldn't be idiotic to blind physical studies. Many times it's been proven that someone's results were due to experimenter bias.
Posted by: MrEvidential | August 06, 2009 at 06:57 PM
And no, I did not give to completely different statistics on blind. The question is: are they blind? I simply pointed out that that includes single or double-blind.
Posted by: MrEvidential | August 06, 2009 at 07:00 PM
*two completely, I meant.
Posted by: MrEvidential | August 06, 2009 at 07:00 PM
I'm pretty sure you're wrong about that, MrEvidential. Studies of acupuncture where the acupuncturists administering the treatment were blind to whether they were using real or retractable needles were considered double-blind, for example, but the acupuncturists were not the experimenters.
Got any evidence that any physical studies had corrupt results due to experimenter bias that would actually be solved by blinding?
Yes, you did give two completely different statistics on the blinding of parapsychological studies. Lemme quote you:
I think those are two rather different statistics, don't you? Especially if triple blinding actually exists. (Which I'm pretty sure it doesn't.)
Posted by: King of Ferrets | August 06, 2009 at 07:25 PM
MrEvidential:
Well you'd be wrong. I only mentioned a couple of papers because they are the ones I remembered from my couple of hours casually browsing the PDFs available online. They were not exceptions.
No-one has argued that peer review is faultless, so why imply that someone has?
Rule number one of debate - make sure that you are actually responding to the right person and the things they actually said. I didn't say that, Bronze Dog did.
Do you read the JSE with such a keen eye for the details?
For someone who keeps implying that we are cherry picking our data this is a bit rich. I note that instead of focusing on the fact that I point out Radin's reliance on logical fallacies and unscientific assumptions, you focus on the minor point I was making that the paper he wrote was not based on his own research. Why would you ignore the important point and focus on the secondary one, I wonder?
Of course I am aware that other scientists refer to other experiments in their papers. How many of them, described as rigorous experimentalists, base entire papers solely on other people's work however? This isn't the important point though - Radin assumes in the dog paper that what he hopes to prove already exists and is part of the proof - grossly unscientific and reliant on a logical fallacy. Would you like me to quote where he baldly admits this? Oh what the hell, I will anyway:
And you claim that this went through rigorous peer review? No expert and impartial referee could possibly have left that unchallenged. First - he assumes that psi exists in the face of overwhelming counter evidence. Second - he assumes that Sheldrakes's experiment further proves the existence of psi without explaining why or why other possibilities can be discounted. Third - he admits that he basically pretends the means for psi to happen are present without providing any evidence whatsoever.
How rigorous a scientist do you think someone would be if they wrote "When people look at things they sometimes catch fire and this demonstrates strong evidence that people can fire lasers from their eyes. I just finesse the mystery of how people shoot lasers out of their eyes by assuming superpowers and the ability to do so."
How rigorous would a journal be if they uncritically published a paper that began with this claim?
I've seen enough to see that the JSE is full of crap like this.
How so? It is one of several on the list I picked randomly, and it was not exceptional.
There is nothing rigorous in concluding that which you hope to prove exists already exists and then assigning any result you like in subsequent experiments to that.
If you can't understand this, there is no point in continuing to speak to you.
No one said it was - however you cannot show that the crap we have found is down to mistakes and not simply indicative of piss poor peer review.
You are simply engaging in ad hoc hypothesising. First you say the journal is rigorously peer reviewed, then when shown examples that clearly aren't you switch to "Well peer review isn't flawless." Or "Oh it's just your own bias." Or "Oh you're just picking the examples you want to." Prove that the examples given are down to flaws in the system and not examples of how bad the peer review process for this journal is or you are just making excuses for the obvious.
And you, of course, are bias free. Right?
Want another example of Radin's rigorous experimentalism and the rigorous peer review it goes through? Let's look very briefly at the paper Exploring Relationships between Random Physical Events and Mass Human Attention: Asking for Whom the Bell Tolls (http://www.scientificexploration.org/journal/jse_16_4_radin_1.pdf).
Here Radin claims that the data shown in figure 2 on page 6 shows a large variance on one day in September that no others came close to - yet you can clearly see there are two other peaks in the graph shown that are of similar variance.
On page 6, beneath figure 2, Radin writes:
So, if his conclusion (and it is his conclusion) is that focused, mass public attention affects random number generators - why is this peak one hour before anyone was focusing on the 9/11 attacks? How does he explain the way this fits into his conclusion? Oh that's right. He doesn't.
He goes on to write:
Oh ok. Obviously everyone in the world stopped thinking about the 9/11 attacks about 8 hours after the hour before the first plane hit. No wait...
How does Radin explain how this fits into his conclusion? Oh that's right. He doesn't.
He does mention that a drop of 6.5 z variance zones in his graph is unique throughout 2001. He doesn't mention that on the graph you can clearly see two other large variances, one of 5 zones at about 712 in figure 2 and another one of 5 at about 803. Why doesn't he mention these dates I wonder? The variance is almost as big, what happened on those days? The paper's referees did ask this right? They did ask why Radin ignores them, didn't they? Radin does explain why he ignores them, doesn't he?
The data that Radin is using does show that there was a large variance continuing for weeks after the attacks because so much attention was focused on one specific event, right? But it doesn't, does it? The referees did ask Radin about this, didn't they? And obviously Radin explains this doesn't he? Oh...
But not to worry, because Radin points out in his conclusion that in the 20th century
Oh good, who were these investigators so we can verify if Radin is working correctly?
Wait, he references himself as independent proof? The referees did ask about this, didn't they?
Yes, very rigorous. I'm convinced....
Posted by: Jimmy_Blue | August 06, 2009 at 07:40 PM
No, read the paper. I was pointing out that the studies include single-blind or double-blind at the start. Those are not two different statistics!
Wikipedia:
"In a double-blind experiment, neither the individuals nor the researchers know who belongs to the control group and the experimental group."
Of course triple-blinding exists!! It's even on the medical dictionary website! Here's the definition:
"Pertaining to an experiment in which neither the subject nor the person administering the treatment nor the person evaluating the response to treatment knows which treatment any particular subject is receiving."
Typically in physics it takes a lot of analysts working together to extract results from datasets, and they want to make accurate systematic-error estimates. This can be difficult to do if there is observer bias. Eliminating observer bias can help guarantee an accurate result.
Posted by: MrEvidential | August 06, 2009 at 08:01 PM
Oh, Wikipedia? I've found 2 Wikipedia articles that involve blinding. Neither of them properly cites its sources. See how this one doesn't link to a source for the double-blind part and only even mentions triple-blinding in the introduction, without a section for it, and how this one even has the citation-needed tags for single, double, and triple-blinding. Not trusting that.
I have no idea how trustworthy The Free Dictionary is, so I don't think I'll trust that for the moment either. By the way, nice Googling. It shows you really know about triple-blinding, now doesn't it?
Oh, and I just realized that I misread psychological as parapsychological in your statistics, sorry. Their parapsychology stats still look fishy, though.
Posted by: King of Ferrets | August 06, 2009 at 08:19 PM
I just looked up The Free Dictionary's definition of homeopathy. I don't think it's particularly trustworthy as a medical resource anymore.
Posted by: King of Ferrets | August 06, 2009 at 08:45 PM
Sorry for the triple-post, but something just occurred to me.
My understanding is that the point of blinding is to prevent the results being corrupted by accidentally revealing to the patient which treatment they are on, or by the subjects of the experiment otherwise skewing the outcome. So, single blinding is not telling the patients what they're on. But sometimes a doctor might accidentally reveal to a patient what treatment they're on if they know. So they invented double blinding, which blinds the doctors too.
This means that using blinding in physical science is probably pointless, because a piece of earth isn't exactly going to change its composition if the geologist tells said earth what he's looking for. In physical science, a change in data based on experimenter bias would probably be called making shit up.
Additionally, it means that if triple blinding actually existed, double blinding would be completely pointless in any experiment that involved people other than the subjects and the experimenters. Even if the experimenters were blind, if there were unblinded people in the presence of the subjects, the results could be corrupted. Not only that, but a double blind study, if we're using MrEvidential's definition, would actually have to go out of its way to unblind the others involved while keeping experimenters and subjects blind! This indicates, to me, that MrEvidential is full of shit about triple-blinding and triple-blinding probably doesn't actually exist.
Oh, and I'd like an answer to the criticism of the parapsychology data used in that comparison you linked to. Mind giving me one?
Posted by: King of Ferrets | August 06, 2009 at 09:09 PM
Very interesting post Skeptico.
With respect to alternative approaches to testing astrology, about a year ago I participated in a thread on the JREF forums related to testing astrology:
http://forums.randi.org/showpost.php?p=3928499&postcount=3
That post was one of many within the thread (which grew to 360+ posts). You might be interested in browsing the rest of the thread a bit, for some amusement maybe :)
In summary, I think using astroloGERs' interpretations and skill (or lack of) to test astroloGY is an inadequate approach. Some fundamental premises of astrology could be debunked in and of themselves after all, as I explain in that post.
Posted by: Raul Saavedra | August 06, 2009 at 09:18 PM
Did I see MrEvidential bring up Schwartz? He's the guy I was thinking of who called a medium experiment "triple-blind" and yet the only information transfer he filtered out between medium and reading subject was visual. Yeah, impressive way to prevent information flow and the resulting cold reading. I guess triple blinding means 3 ply of wood.
Posted by: Bronze Dog | August 06, 2009 at 10:19 PM
Mr. Evidential:
Linking to Sheldrake? Really? You go from one massive woo-phile to another woo? Gimme a break.
Posted by: TechSkeptic | August 07, 2009 at 05:41 PM
I think MrEvidential stopped playing since he saw how easy it was to take almost any random study from his journal and start poking holes in it.
If he didn't stop playing, after all, he'd have to re-examine his own beliefs instead of smugly assuming everyone who doesn't agree with him is wrong.
Posted by: Jimmy_Blue | August 07, 2009 at 07:47 PM
Yeah, though given the typical woo's complete lack of self-awareness and introspection, I imagine there's a strong chance of him showing back up and not addressing the big pile of criticism.
Anyway... Would triple-blinding be something like this?
Posted by: Bronze Dog | August 07, 2009 at 09:21 PM
Yup, totally.
If the guys behind the glass were blinded too, it would be quadruple-blinded!
Posted by: King of Ferrets | August 07, 2009 at 10:40 PM
As I understand it, and I'm pretty sure it's legit, a triple-blind study is one in which the subjects are blinded (single-blind), the researchers are blinded (double-blind), and the people analyzing the data are also blinded. It's meant to rule out cherry-picking and sharpshooter fallacy-type errors.
Of course, if the blinds are done incompetently, then it doesn't really matter.
Posted by: Tom Foss | August 08, 2009 at 01:24 AM
Huh. Why isn't it in wider use then?
Posted by: King of Ferrets | August 08, 2009 at 02:38 AM
Triple blinding is having vertical slats on three windows.
Posted by: Big Al | August 08, 2009 at 02:44 AM
From the Wikipedia article on Randomized controlled trials:
I imagine part of the situation is that in some studies, the statistical analysis is done by the researchers, who were already blinded.
Posted by: Tom Foss | August 08, 2009 at 07:27 AM
Nah, that's a horrible Wiki article. It even says [citation needed] right at the end of that section. The one on blinding trials is horrible too and needs someone to stick a few [citation needed]s in it.
Posted by: King of Ferrets | August 08, 2009 at 02:34 PM
KoF,
double blinding isn't just about preventing the researcher from leaking information. It's also about keeping him protected from his own biases.
As for this triple-blind stuff: in a double-blind study, when we say the researcher is blinded, it usually means the researcher and all his staff, including the people evaluating the data. I don't think there really is a triple blind.
Posted by: TechSkeptic | August 09, 2009 at 07:08 PM
Ah. Sorry, my information on the subject isn't as complete as I'd like it to be.
Posted by: King of Ferrets | August 09, 2009 at 09:08 PM
Yikes - everyone's so cross. . .
Posted by: mogs160 | September 03, 2009 at 12:09 PM
Don't confuse "desire for accuracy and demonstration of claims" with "being cross".
We just don't like swindlers, whether they knowingly scam people or not.
Posted by: TechSkeptic | September 03, 2009 at 12:59 PM
I'm not at all confused. There are an awful lot of cross posts here. On both sides of the fence.
Posted by: mogs160 | September 05, 2009 at 09:23 AM
I'm generally cross with woos for good reason: Irrational superstition often hurts people and wastes resources.
Of course, I realize that "he did it first" isn't a terribly effective excuse, but, in my experience, I'm better able to remain calm if my adversaries start out with a measured, thoughtful response.
Posted by: Bronze Dog | September 05, 2009 at 09:55 AM
Seems there are an awful lot of people who spend too much time worrying about it all. Those who believe the bullshit and those who worry about those who believe the bullshit. . . never the twain. . .
Posted by: mogs160 | September 05, 2009 at 03:08 PM
The problem is that those who believe the bullshit most of the time don't care about the harm they can cause (and sometimes they do it willingly in the name of gods, aliens, or whatever excuse they use). And if you are thinking "what's the harm", google just that, "what's the harm":
there's a website with lots of articles about times something went wrong because of superstition, religion, new age beliefs, fake medicine, vaccine-phobia, and so on.
Posted by: Pelger | September 05, 2009 at 05:09 PM
What was Ertel trying to say when "combining first and second choices", and how did he get that p=.054 or .04?
It makes sense that p(success) = 1/3
and p(failure) = 2/3.
That's how they test in statistics: the outcome you test for is called a "success", and "failure" is used for when the outcome being tested did not happen. In this case, success means the astrologer got it right; failure means he didn't.
Just read any book on statistics, under the binomial distribution.
Suppose you flip a coin and test for heads. Heads would be success and tails would be failure.
Posted by: Nico | November 09, 2009 at 07:03 PM
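[A minimal sketch in Python of the binomial arithmetic Nico describes, using the 40-correct-of-116 first-choice figure reported for Carlson's test. This is just the textbook one-sided binomial tail, not Ertel's actual calculation.]

```python
from math import comb

def binom_tail(k, n, p):
    """One-sided p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Carlson's first-choice test: 40 correct out of 116, chance = 1/3.
print(binom_tail(40, 116, 1/3))  # roughly 0.4 -- consistent with chance
```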
Nico, Ertel does not think Carlson's tests provided sufficient evidence for his claim. One objection, actually from the astrologers, was that the randomly selected charts from the student participants produced selections that were unfairly difficult to differentiate.
Ertel is critical of the test design, arguing that Carlson should have supplied only two charts per astrologer (instead of three) and the charts in each pair should have been distinctly different. This is what had been done in the successful Vernon Clark study that preceded Carlson's study.
There is no good reason to supply three charts and this is unusual. But there are textbook rules for three-choice tests, which Ertel cites. The rule is to evaluate the proportion of first and second choices taken together against the third choice. This would have allowed the astrologers to decide, for example, "I think it's either A or B (which unfortunately have similar features), but I don't think it's C."
If Carlson had followed the textbook, he would have obtained the near-significant results that Ertel calculates from the published data.
Thank you for your question.
Posted by: Ken McRitchie | November 23, 2009 at 04:26 PM
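[For what it's worth, the combined test Ken describes is simple to set up. A minimal sketch in Python, where the count of "correct chart ranked first or second" is a hypothetical placeholder, since Ertel's actual tallies aren't given in this thread; under the null hypothesis a top-two pick hits the correct chart 2 times in 3.]

```python
from math import comb

def binom_tail(k, n, p):
    """One-sided p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 116       # trials, as in Carlson's first-choice test
k_top2 = 85   # HYPOTHETICAL: correct charts ranked 1st or 2nd
print(binom_tail(k_top2, n, 2/3))  # pooled first+second choices vs. chance 2/3
```

Operationally, that is all "combining first and second choices" means: pool the top-two hits and test them against the 2/3 chance rate instead of testing first choices alone against 1/3.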
Even so, if conditions A and B were similar, but both distinct from C, I would expect at least agreement over (A+B) or (C) answers as appropriate. Mixed (A+C) or (B+C) answers would count against the claim.
So a quick meta-study could still glean useful data.
Posted by: Big Al | November 24, 2009 at 06:28 AM
Al, One would need to understand astrological interpretation to tell if chart A is similar to chart B, so I don't think it would be easy to validate Carlson's claim that way.
If I understand Ertel's main conviction correctly--which he has tried to get across to both sides of the controversy--it's that the data is more sensitive, and more objective, if it is ranked. A sports champion with citations in all five volumes should not be regarded the same as a sports champion with only one citation. The eminence (ranking) effect is where the astrological factor under consideration increases with rank.
If I understand the textbook method that Ertel applies to Carlson, it's a way to rank the astrologers' confidence in their choices while allowing that they might have been given similar charts. He calls Carlson's analysis of that test "piecemeal" because it ignores rank by confidence. In another test, where the astrologers were asked to rank their confidence on a scale of 10, Ertel finds the most significant result.
Posted by: Ken McRitchie | November 25, 2009 at 11:00 AM
But the astrologers, possessing the gift of astrological interpretation, didn't complain that the charts were similar? Was this explanation invoked only after the event, when the results were non-optimal?
Discrimination between either (A+B) or (C) results would still be valid, whether or not the similarity was too subtle to spot.
I understood that astrology was meant to be some kind of proto-science, using immutable universal laws. What's all this about confidence?
Do astrologers ever rank their horoscopes by how confident they feel about their accuracy?
Posted by: Big Al | November 26, 2009 at 12:14 AM
Big Al, Astrology is something studied and learned, and there are many teachers. It's not a gift, in the way that, say, psychics usually describe themselves.
Ertel mentions that Carlson ignored the protests of astrologers who received similar charts. At least one astrologer dropped out of the study because of this. Ertel thinks the design that allowed this was unfair.
You would have to read Ertel's reassessment yourself concerning your argument of (A+B) or (C). I'm sure he'd be glad to hear from readers of his article, or even from those who haven't read it. He's approachable.
As I understand it, astrology is based on observation and no one is suggesting that astrology has immutable universal laws governing behavior.
Ranking improves choice tests, such as Carlson's. I think it is only reasonable to agree with Ertel that this makes sense.
Thank you for your comments.
Posted by: Ken McRitchie | November 26, 2009 at 07:56 AM
I think my major issue with astrology is along the following lines:
Whenever it is tested, the tests seem to be inconclusive, very slightly favourable, or negative. When inconclusive or negative, there seem to be complaints that the tests are biased, flawed or otherwise not fair on the claimants.
If there is any correlation between natal star sign and personality/destiny, it has to be very, very slight, or it would be a slam-dunk undeniable, verifiable phenomenon in the internet age where the sample pool is worldwide.
But if it is so subtle and hard to assess, even in the modern global age, how did early astrologers attached to small villages and towns (when the average life expectancy was only 30-35) ever latch on to the phenomenon in the first place? Especially when there seems to be such deep, difficult and arcane calculation involved?
Posted by: Big Al | November 26, 2009 at 11:49 AM
Ken,
Al was asking about the yawning gap between what astrologers routinely claim regarding their accuracy, compared to this sudden talk about "confidence".
And astrologers do claim immutable laws - not in the sense that your behavior will be forced on you, but in the sense that the planets influence our lives in an objective way. That influence should be the same regardless of whether or not the astrologer is sitting in a scientist's office or in his own, shouldn't it?
So what is it? Do you tell your customers that the information they are paying for is "possibly slightly more accurate" than what they would get if they randomly received someone else's reading?
Also, Skeptico's article showed you to be factually wrong on a few points. Care to address that or post a retraction?
Posted by: yakaru | November 26, 2009 at 12:01 PM
Ken McRitchie wrote:
I’m sorry Mr. McRitchie, but that comment demonstrates that you don’t really understand this subject. Carlson was not making any claim here, and in fact there is no obligation on Carlson or any skeptics of astrology to claim or demonstrate anything. The astrologers were the ones making the claims – namely that they could select the correct CPI results using astrology. The astrologers designed a test that they claimed would show they could do this with a greater probability than pure chance. They failed.
The burden of proof is upon those claiming that astrology works, not upon skeptics to show it doesn’t. The null hypothesis is that astrology doesn’t work, and the null is the position to take until it is shown that the null is incorrect. And considering astrology’s doubtful provenance – no history of how its rules were derived, no known means by which all the detailed and highly specific rules could possibly work – we need extraordinary evidence that it works before we can reject the null.
But we don’t get that. Instead we get this rather blatant attempt to rewrite history – to suggest that some other test should have been done rather than the detailed peer reviewed test that was agreed in advance by all concerned. Anyone who has been involved in experimental design should know that you don’t change what you said you were looking for after the test has been done simply because the actual agreed test didn’t give you the results you wanted. That’s just a basic no-no, for reasons that should be obvious.
And yet even with this dodgy approach, Ertel admits that his data mining gives results that are still (in his exact words) “insufficient to deem astrology as empirically verified.” IOW, Ertel specifically states that the null should not be rejected. So what you or he are hoping to achieve by flogging this dead horse is beyond me.
If Ertel, or you, or anyone else thinks that a better experiment should have been done, then you should design the experiment that you want to do and then carry it out. Then get it published in a real peer reviewed journal like Nature. Then have it replicated by independent testers several times. If you can do that, perhaps skeptics would begin to discount the heretofore failings of astrology, and its absurd premises. As of now, you are nowhere near that point.
Posted by: Skeptico | November 29, 2009 at 03:37 PM
Skeptico, Carlson claims that his study argues a "surprisingly strong case against natal astrology" (425). The question is whether his study actually supports his claim.
Big Al, One only has to look at the subject matter to realize that astrology and personality are very complex areas of study. Measurement is not easy even in psychology, which is what Carlson and the participants were trying to compare astrology with. There is a lot of bad research in astrology on both sides of the controversy. This is why it is important to discover flaws and suggest improvements, which is what science is supposed to do. I believe this is where Ertel tries to position himself.
This is a global era, but data with accurate times for definitive events and strong traits, which astrology likes to discuss, are extremely difficult to gather. You reflect on how hard it is to imagine how astrology could have begun. I imagine the first astrologers just observed the planets and stars without calculation. But here you and I are just imagining, aren't we? Maybe these thoughts are not particularly relevant or objective.
yakaru, It's not clear to me what you mean by the accuracy you say astrologers routinely claim. There is a gap between the advice, suggestions, warnings, and example possibilities published in astrology columns, and the sort of scientific claim Carlson makes in his article. I think confidence measures, properly applied, are appropriate in a study like Carlson's.
I think it is agreed that astrology does not claim to force behavior, but neither should one force the claim of "immutable laws" on astrology, because astrologers themselves do not agree to this. If you think you see laws in astrology that you could elucidate, please go ahead, but astrologers talk only about principles, properties, and characteristics.
The Skeptico article is mainly about peer review. Ertel does not say anything about peer review in his article so it's a moot point. I had discussed peer review with Ertel when I tried to understand why he became interested in the Carlson study. Nature had not published any letters on the article, and subsequent critique, along with refereeing, is a normal part of peer review in the broader sense, where errors should be detected and corrected. He was offering his critique.
Also I knew that inquiries had been made with Nature concerning the long delay in getting the article published and why it was published in the "Commentary" section, which seemed unusual. I'll quote from a third source on this:
"In a phone conversation with Robert Pool, news editor, Washington D.C. editorial office of Nature, he indicated that he was not familiar with the Carlson article. Pool said that the length of delay was unusual, and probably caused by problems with the original article or difficulty obtaining referee acceptance. He did indicate that articles published as Commentary are not subjected to the same peer review process as the regular articles (and even some letters!); rather they are published at the discretion of the editor-in-chief."
Be that as it may, the criticism from disinterested parties tends to go along the following lines: The CPI manual specifically states that the scales 1) must be interpreted by a person trained in their use and 2) the gender of the person must be known because the scales are interpreted differently for males and females.
Carlson's design violated both of these assumptions. Hence, Nature's editorial process is flawed because it accepted Carlson's design.
Skeptico says that JSE is not among the 10,000 most influential journals. I have no problem with that.
As to the reassessment, I have tried to convey the substance of the article in a few words in my article, but one really needs to read Ertel's article itself to fully appreciate the arguments and supporting facts. I cannot adequately do that here. Ertel does not try to stand the study on its head and show the reverse is true, but he points out flaws, inconsistencies, and the use of incorrect procedures - for example, the failure to use Lowry's procedure, which is the accepted practice for combining data in three-choice formats.
Carlson states in the Nature article that he intended, "before the data had been analyzed," to combine them: "We had decided to test to see if the astrologers could select the correct CPI profile as either their first or second choice at a higher than expected rate..." (425). Ertel argues that Carlson ignored his own protocol without giving reasons.
I'll quote a part of Ertel's article regarding this:
"Carlson does not explain the statistical procedure of his analysis. He
uses standard deviation as a term to denote both, standard deviations of original
distributions and of normalized or Z-distributions, the latter with M = 0 and
SD = 1. Readers will be confused by an uncommented use of two different word
meanings. For some analyses Carlson seems to calculate confidence intervals
of proportions, which differ, however, from the confidence intervals obtained
by ordinary procedures. A standard procedure for calculating the confidence
interval of proportions dates back to Wilson (1927), of which an account can be
found in Newcombe (1998), and in Lowry’s VassarStats online (Lowry, 2008)."
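[For readers wondering what the Wilson procedure mentioned above actually is: it's a standard textbook formula, sketched minimally in Python below with Carlson's 40-of-116 first-choice counts plugged in. The formula is Wilson's; the application here is only an illustration, not Ertel's analysis.]

```python
from math import sqrt

def wilson_interval(k, n, z=1.96):
    """Wilson (1927) score interval for a binomial proportion k/n (95% when z=1.96)."""
    p_hat = k / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# 40 correct first choices out of 116 (p-hat ~ 0.345); chance level is 1/3.
print(wilson_interval(40, 116))  # the interval comfortably contains 1/3
```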
Regarding the 1-10 ranking scale of the CPI results:
"In addition to ranking each birth chart into first, second, and third fit categories for their CPI cases, each astrologer “also rated each CPI on a 1–10 scale (10 being highest) as to how closely its description of the subject’s personality matched the personality description derived from the natal chart” (Carlson, 1985: 420). Again, Carlson analyzed these ratings piecemeal for each of the three choice categories separately, and found his result “consistent with the scientific prediction of zero slope” (Carlson, 1985: 424). Yet the ratings should have been analyzed together, across choice categories, because they had been made independently of the three ranking choices."
One really needs to read the article.
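[To illustrate the "analyzed together" point: a minimal sketch in Python of pooling the 1-10 ratings across all three choice categories and checking whether higher ratings go with correct matches. The data here is an invented placeholder, not Carlson's or Ertel's.]

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation; with a 0/1 second variable this is the point-biserial r."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx)**2 for x in xs) * sum((y - my)**2 for y in ys))

# HYPOTHETICAL pooled data: (rating on the 1-10 scale, 1 if that CPI was the true match).
pooled = [(8, 1), (7, 0), (3, 0), (9, 1), (5, 0), (6, 1), (2, 0), (7, 1), (4, 0), (6, 0)]
ratings = [r for r, hit in pooled]
hits = [hit for r, hit in pooled]
print(pearson(ratings, hits))  # Carlson's "zero slope" prediction says this should be ~0
```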
Here's a quote from another author, who is critical of the Carlson and similar astrological studies.
"It is my contention that these articles indicate that astrologers are not the only ones who do not understand the difficulty in designing solid experiments to test astrological claims. Many scientists believe
that their training in one discipline (say, Astronomy) qualifies them to evaluate claims in other "scientific" areas. This is the primary problem with the Carlson study: A physicist was using psychological tools."
Posted by: Ken McRitchie | November 29, 2009 at 09:58 PM
Ken, you wrote:
The astrologers I've known and read always said (or clearly implied) a chart is in principle 100% accurate (given accurate data). Apparent anomalies, they told me, disappear when one looks more deeply at other aspects. A square with Saturn complicates an otherwise straightforward characteristic, for example. Or maybe the influence of a personal planet gets trumped by the activity of a generational planet causing a war.
Or a more concrete example: my birth was induced. Inductions were always performed on Tuesdays at 10 am because the doctors always played golf on Tuesday afternoons. Yet, astrologers always told me that this didn't mean I had the wrong birth chart, rather my birth happened exactly as it "should have". That is, my birth time was guided by (or happened under the governance of, to use your kind of language) astrological influences. So that is a claim of 100% accuracy, isn't it? Surely my birth time can't be 30 or 50% right. Or can it?
That is the kind of accuracy I was referring to.
I accept that the test is indeed a different situation from a one-to-one reading with a client, so maybe the comparison seems unfair. But to me it looks duplicitous: astrology is presented to clients without any question about its accuracy, yet here we see you and Ertel arguing for the possibility of a "slightly better than chance" success rate, especially after the standard for success was initially set by the astrologers themselves at 50%.
So, do you think astrology is accurate, or is it merely a game by which you can retrospectively fit anyone's chart to anyone's personality and life events?
By "laws" I was referring exactly to those "principles, properties, and characteristics" that the study was examining. I used the word laws. What do "principles, properties and characteristics" do if they are not consistently active or present (i.e. act like laws)?
Furthermore, I could quote from virtually any of the astrology books I have ever read. A common explanation is the way the moon affects the tides, and plant growth, and likening that to the way the planets affect our lives in a similar tangible way. Cosmic rays are often suggested these days, too.
You seem to want to avoid making any clear statement of astrology's function or efficacy here, while on your blog, and no doubt to your clients, you present it rather differently - that the laws (or principles) of astrology are bound up with the physical universe, that they influence events, and that you can predict or determine the extent and nature of this influence. Instead of equivocating and backing off, how about answering some concrete questions, clearly and consistently, exactly as you would to your clients --
Are the principles of astrology consistently applicable to reality or not?
Do you think the position of the planets exerts an influence and that you can predict or determine it?
If so, just how accurate do you think an individual reading is? (i.e. can it be used for making important decisions, or is it only useful for retrospective interpretation?)
If you think it is more accurate than random chance, why? On what grounds?
Posted by: yakaru | November 30, 2009 at 07:49 AM
And, if it's such an exquisitely subtle effect, how did anyone ever stumble across it in the first place, in the dim and distant past?
Posted by: Big Al | November 30, 2009 at 08:35 AM
The Carlson study was fatally flawed because of its reliance on the CPI. What is not usually reported is that the subjects themselves were not able to identify their own CPI profiles, either. This is evidence that, even if a non-psychologist has a working knowledge of a personality, he still can't successfully match it to a CPI. For that reason alone, the Carlson study says much more about the CPI than it does about astrology.
Posted by: AvalonXQ | December 02, 2009 at 09:59 AM