Reader Kate sent me a link to the HuffPo article by Srinivasan Pillay, The Science of Distant Healing, that everyone’s been talking about this week. Apparently a study showed that remote “intention” could act as a therapeutic intervention. I originally wasn’t going to bother with it: the article was, in my view, confused and poorly written, and several skeptics in the comments were already doing a pretty good job of taking it apart. Then on Friday both Orac and Steven Novella wrote posts critical of the article. But when I got hold of the full study, a little light went on in my head telling me I had something to add even to what those two luminaries had written.
First, I’ll do what Pillay didn’t do, and link to the abstract. I managed to click on the Elsevier link in the abstract and obtain a temporary log-on ID to read the study. It’s entitled “Compassionate Intention As a Therapeutic Intervention by Partners of Cancer Patients: Effects of Distant Intention on the Patients' Autonomic Nervous System”. An odd title, since the authors state plainly in the study, “we did not test for distant healing” (more on that below).
The study supposedly measured the effects of intention on the autonomic nervous systems of a human “sender” and a distant “receiver”. Well, not really. What they actually measured were changes in skin conductance level, or as Pillay wrote, “a measure of the ability of sweat to conduct electricity”. PAL called that “measurements from glorified Scientology E-meters”. Ouch! No illnesses being cured then (as the authors admitted – see above). The paired senders and receivers were divided into three groups:
- Trained in directing intention, one person in each pair had cancer
- Untrained in directing intention, one person in each pair had cancer
- Untrained in directing intention, neither person in each pair had cancer
In group 1 and 2, the healthy person directed intention at the sick person. In group 3, a healthy person directed intention at another healthy person. Members of group 3 were not randomly selected – they were (obviously) non-randomly allocated to the group with no cancer. And yet, group 3 was claimed to be the “control group”. However, all three groups were instructed to direct intention – ie, even the “control group” directed intention. This is important when you consider the hypothesis being tested, which was:
The principal hypothesis was that the sender's DHI [distant healing intention] directed toward the distant, isolated receiver would cause the receiver's autonomic nervous system to become activated. A secondary analysis explored whether the factors of motivation and training modulated the postulated effect.
To test the principal hypothesis you obviously need a control group that is not sending intention, to compare against the groups that are. Otherwise, how do you know whether the intention had any effect? But there was no group without directed intention, which means there was no control group capable of testing the very hypothesis the authors specifically said they were testing. So what were the results? Did the receivers’ autonomic nervous systems become activated, and did training and motivation make a difference? Take a look at Figure 6 from the study, and the note under it, and see what you think:
Figure 6 Comparison of sender and receiver effect sizes (per epoch) measured at stimulus offset (with ±2 standard error confidence intervals) for all sessions, motivated sessions (trained group and wait group combined), and trained, wait, and control groups separately. EDA, electrodermal activity.
You’ll note there is no significant difference between the receivers in the different groups. (The senders differ, but then they knew they were sending.) The receivers all register an effect, but since there is no control group to compare these results with, these data tell you nothing about the principal hypothesis. Again I say, you need a control group to test this hypothesis, and they didn’t have one. There was no significant difference between the trained / untrained groups or between the motivated (ie including sick people) / unmotivated groups. So the secondary hypothesis failed.
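To make the control-group point concrete, here is a toy simulation of my own (in Python – it uses none of the study’s actual numbers, just invented noise): three groups of “receivers” whose skin conductance readings are pure measurement noise, with no intention effect whatsoever.

```python
import random
import statistics

random.seed(42)

def simulate_receiver_eda(n_epochs=100):
    """One receiver's skin conductance deviations under the null
    hypothesis: pure measurement noise, no 'intention' effect."""
    return [random.gauss(0.0, 1.0) for _ in range(n_epochs)]

# All three groups were told intention WAS being sent -- as in the study.
groups = {name: simulate_receiver_eda()
          for name in ("trained", "wait", "control")}

for name, eda in groups.items():
    print(f"{name}: mean deviation {statistics.mean(eda):+.3f}")

# Every group 'registers an effect' (a nonzero mean) from noise alone,
# and the groups barely differ from one another. Without an arm in
# which NO intention is sent, there is no baseline against which to
# say whether any of these deviations have anything to do with intention.
```

That is exactly the shape of Figure 6: all groups show something, none differ, and nothing can be concluded.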
So, end of story. Study failed, yes? Write it off, study something new next time? Well, no of course not. Not with woo. The study authors weren’t satisfied with that. Here, Steven Novella noticed something I initially missed. It was this, from the abstract:
Planned differences in skin conductance among the three groups were not significant, but a post hoc analysis showed that peak deviations were largest and most sustained in the trained group, followed by more moderate effects in the wait group, and still smaller effects in the control group
Translation: the study didn’t show what we wanted it to show (clearly – see Figure 6), so we data mined it to find something we could say was an effect. So what did they find with this bit of post hoc activity? They produced several other graphs, of which (to keep it simple) I will reproduce just Figure 7:
Figure 7 Normalized comparison of receiver skin conductance levels in the three groups. EDA, electrodermal activity.
What they want you to look at is the difference between the three groups during the “intention” period. Group 1 (the “trained” group) showed the largest increase during the ten-second burst of “intention”. (See the timescale on the bottom – seconds 0 to 10 are when the intention is being directed.) OK, but what I want you to notice is the five seconds before the intention (-5 to 0 on the bottom axis). The normalized EDA is actually higher for one group (group 2 – the “wait” group) when no intention is being directed at all! So for those not trained, it appears distant healing effects are stronger when the sender does nothing. Even in group 1 (“trained”), the “doing nothing” period registers higher than roughly half of the “sending intention” period. Then it struck me what we are missing: readings from all the other “doing nothing” periods. Since the intention sessions were all 10 seconds long, and the non-intention sessions ran from five to 40 seconds, those periods account for probably 60-75% of the total time. Was the EDA higher or lower during those periods? Were there other peaks in EDA then? They don’t say.
And I’m pretty sure they didn’t even look. Tucked away just before the “results” section of the report, they state this:
To avoid multiple testing problems, the preplanned hypothesis examined the normalized deviation only at stimulus offset.
I’ve read that section about ten times now, and the only sensible interpretation of that sentence is that they only looked at EDA changes during the intention-sending sessions – they didn’t look at them during the non-intention periods (unintentional periods?). This sounds like the sharpshooter fallacy – firing a load of bullets at the side of a barn and then painting a target around where most of them landed – except here they also ignored the larger clusters of bullets fired at other times.
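You can watch the sharpshooter effect happen in noise alone. Below is another toy simulation of mine (the 300-second record, 10-second windows, and drift size are all my invented numbers, merely echoing the study’s 10-second epochs): a drifting signal-free trace, scored only at one preplanned “intention” window, while all the ignored “doing nothing” windows sit unexamined.

```python
import random

random.seed(1)

def noisy_trace(n=300, step_sd=0.2):
    """A drifting EDA-like record containing no signal at all."""
    level, out = 0.0, []
    for _ in range(n):
        level += random.gauss(0.0, step_sd)
        out.append(level)
    return out

def window_mean(trace, start, width=10):
    return sum(trace[start:start + width]) / width

# Over many simulated sessions, ask: how often does some ignored
# 'doing nothing' window deviate MORE than the one preplanned
# 10-second 'intention' window the analysis confined itself to?
hits, n_runs = 0, 200
for _ in range(n_runs):
    trace = noisy_trace()
    intention = abs(window_mean(trace, 120))           # the painted target
    idle = [abs(window_mean(trace, s))                 # the ignored barn wall
            for s in range(0, 290, 10) if s != 120]
    if max(idle) > intention:
        hits += 1

print(f"{hits} of {n_runs} noise-only runs have a bigger peak somewhere else")
```

In almost every run of pure noise, some unexamined window out-deviates the chosen one – which is exactly why readings from the “doing nothing” periods matter.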
Why This Is Significant
The study’s lead author is Dean Radin. Radin has a history of fitting statistical anomalies to temporal events while ignoring identical anomalies that occur at other times – the ones he doesn’t want you to know about. An example would be Radin’s interpretation of the now defunct (correction – it’s still going) Global Consciousness Project’s (GCP) output from a series of random number generators – data that supposedly showed global consciousness spiking at certain major global events. If you want to see how credulous Radin can be, and/or how determined he is to find a correlation whether one exists or not (you decide), read this account by Claus Larsen, who attended a talk by Radin in 2002:
Radin gave several examples of how GCP had detected "global consciousness". One was the day O.J. Simpson was acquitted of double-murder. We were shown a graph where - no doubt about that - the data formed a nice ascending curve in the minutes after the pre-show started, with cameras basically waiting for the verdict to be read. And yes, there was a nice, ascending curve in the minutes after the verdict was read.
However, about half an hour before the verdict, there was a similar curve ascending for no apparent reason. Radin's quick explanation before moving on to the next slide?
"I don't know what happened there."
It was not to be the last time we heard that answer.
Does that remind you a little of figure 7 above, and does it make you ask what happened during the “no intention” periods? It should.
And then there was 9/11:
Another serious problem with the September 11 result was that during the days before the attacks, there were several instances of the [random number generators] picking up data that showed the same fluctuation as on September 11th. When I asked Radin what had happened on those days, the answer was:
"I don't know."
I then asked him - and I'll admit that I was a bit flabbergasted - why on earth he hadn't gone back to see if similar "global events" had happened there since he got the same fluctuations. He answered that it would be "shoe-horning" - fitting the data to the result.
Checking your hypothesis against seemingly contradictory data is "shoe-horning"?
For once, I was speechless.
Did Radin check to see if there were similar fluctuations in the data in the “down” periods of this recent study? I don’t know, but we know for a fact from the above that Radin has selected data to fit his hypothesis in the past, and so I’m not going to trust him not to have done it this time. We know he performed some additional manipulation on the data, as Orac also noticed from the study:
To reduce the potential biasing effects of movement artifacts, all data were visually inspected, and SCL epochs with artifacts were eliminated from further consideration (artifacts were identified by [Dean Radin], who was not blind to each epoch's underlying condition).
So Radin admits he un-blinded the study and eliminated data he didn’t like.
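How much damage can an unblinded “artifact” pass do? Here is one last toy simulation of mine (again, invented numbers, not the study’s data): start with epochs of pure noise, then discard the 10% of epochs that most contradict the hypothesis, as an unblinded screener might do without ever consciously cheating.

```python
import random
import statistics

random.seed(7)

# 200 'epochs' of receiver EDA change with NO real effect in them.
epochs = [random.gauss(0.0, 1.0) for _ in range(200)]

# An unblinded 'artifact' pass that, even unconsciously, tends to
# discard epochs running against the hypothesis: here, drop the 10%
# most negative values as 'movement artifacts'.
kept = sorted(epochs)[len(epochs) // 10:]

print(f"all epochs: mean {statistics.mean(epochs):+.3f}")
print(f"after cull: mean {statistics.mean(kept):+.3f}")

# The culled data now show a spurious positive 'effect' manufactured
# entirely by the selection, not by any real signal.
```

This is why artifact rejection is normally done blind: even a mild, well-intentioned bias in which epochs get tossed is enough to conjure an “effect” out of nothing.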
Throughout this post I have avoided personal attacks on Radin’s (or Pillay’s) credibility, and concentrated instead on the actual study. However, when considering a study that claims a statistical effect like this (and the authors admit the observed effects were very small), on such frankly dubious grounds, it is relevant to ask whether the author has ignored contradictory data when forming conclusions in the past. Clearly he has, and he may well have done so here. The most generous conclusion I can draw is that the study would need to be replicated by independent experimenters before I would even consider that there might be some basis to its claims. (Randi’s $1 million test, anyone?) A more realistic interpretation is that Radin has been known to select data that fit his hypothesis and ignore data that don’t, and there’s no reason to think that hasn’t happened here. Radin even admits he un-blinded the study to eliminate some data he didn’t like. Add the fact that there was no control group, that the null hypotheses were not even rejected, and that the only interesting thing they found required some (admitted by the authors) post hoc rationalization, and there really isn’t much left worth looking at.
The study ends with the words “This study is dedicated to Elisabeth Targ.” That would be the Elisabeth Targ whose study of intercessory prayer was also fraudulently un-blinded so it could report a success when in reality it had failed. And this study is dedicated to her? I couldn’t have put it better myself.
Excellent write up, Skeptico, excellent.
I note Srinivasan didn't mention in his article that the study was done by Dean Radin. Omitting that name is already data mining.
Posted by: yakaru | March 28, 2009 at 12:11 PM
Wow. You demolished that study so resoundingly that I predict there will be a bump in the global consciousness today.
Posted by: kate | March 28, 2009 at 12:35 PM
no control....*shudders*
Posted by: deep | March 28, 2009 at 01:44 PM
Clarifications in agreement with skeptico:
1. i agree that no illnesses were cured in this study.
2. i think that the flaws that you point out in this study are valid and entirely accurate. this is probably why it is not published in nature or science. however, the medical literature offers a wide variety of studies to contemplate. perhaps this article would have been better suited to a journal like “medical hypotheses” which is a well recognized journal for exploratory work. i do think that exploratory work deserves a voice. my ambivalence about calling this purely exploratory, is that for studies of its kind, the double-blind procedure was unusual and therefore differentiated it from other similar studies. i am ambivalent about this.
3. i agree that more control groups would have made the findings much more robust. further studies are needed to confirm these findings.
4. i think your “eda” questions are valid. i would have included this and the other points you made in an explicit limitations section. i think that most studies have limitations.
Slight differences of opinion:
1. while skin conductance is by no means histology, in the measurement of peripheral autonomic responses, it is a fairly standard tool used in medical research. my own opinion is that as a stand alone test, it is a very broad and general statement about peripheral autonomic function and it is best used in conjunction with other measures to increase certainty. here are some examples of recent articles in prominent scientific journals in the last year that used skin conductance:
1. McTeague, L.M., et al., Fearful imagery in social phobia: generalization, comorbidity, and physiological reactivity. Biol Psychiatry, 2009. 65(5): p. 374-82.
2. Ramachandra, V., N. Depalma, and S. Lisiewski, The role of mirror neurons in processing vocal emotions: evidence from psychophysiological data. Int J Neurosci, 2009. 119(5): p. 681-90.
3. Robinson, J.L. and H.A. Demaree, Experiencing and regulating sadness: Physiological and cognitive effects. Brain Cogn, 2009.
4. Sokol-Hessner, P., et al., Thinking like a trader selectively reduces individuals' loss aversion. Proc Natl Acad Sci U S A, 2009.
i think that what this illustrates is that radin did not depart from standard traditions in measuring peripheral autonomic functions.
2. one of the benefits that this kind of exposure provides, I think, is that an exploratory study like this invites criticisms from mindful readers that can help to improve future studies. if dean radin reads this, it might be helpful to him. if other people want to do these studies, it might be helpful to them too. i, for one, am grateful for your insightful analysis.
3. post hoc analyses are a standard statistical procedure. i think that it was admirable that the authors called it this and did not pretend that this was part of the initial analyses. while I agree that there is a certain amount of fishing in post hoc analyses, “post hoc” reveals the fishing, so I prefer for these to be reported than not, in general. can they be said to be the same as analyses based on a priori hypotheses: no. are they interesting: yes…to me, and to some others, anyway.
Clarifications in my defense:
1. although I did not initially link to the abstract, I provided the reference and continued the discussion. i think that the temperament of my writing was one of sharing the interesting results of a study that i had come across. it was not to write a scientific review article or to proclaim some strong feeling about data that i thought met the highest levels of scientific rigor. however, i take the points of many readers and will do my best to refer to authors when I write material of this nature in the future.
2. i find the responses to my post to be ignoring of the fact that i advised caution against just accepting this kind of finding. also, i pointed out cases in which this did not work and mentioned that this can also have adverse effects. i think this substantiates my actual position on this subject, which is that i find it interesting (as did many people, clearly-even if they disagreed with it) and inconclusive. the many paranoid responses that I got wondering if i was trying to surreptitiously represent a stronger point of view than I actually had does not resonate with me, except that the inaccuracy of this supposition makes me a bit defensive.
Objections to the reactions:
1. for people who are so accurate and interested in representing the “truth”, the eagerness to jump to conclusions that are certain about me and my beliefs is quite remarkable given the limited data that are available. simply, they are quite inaccurate and overlook many of my intentions.
2. it does not behoove a liberal newspaper to be attacking or discouraging of differences in opinion. the way in which president obama chose his colleagues should be evidence enough of this. we cannot transform our society if we only preach to the converted. to me scientific fanaticism is as dangerous as religious fanaticism. i feel very strongly about the uselessness of empty attack. most people who would want to disagree with you would not feel free to. when we lose the voice of any group of dissenters, we lose the voice of the total people.
Conclusions:
to that extent, I am very grateful for the evolution in my conversation with skeptico, and took the time to reply to his blog because i value conversation and I think that many of his objections are valid within the context that i presented them here. our conversation, skeptico, has been challenging, difficult, but transforming in a helpful way to me. it is this kind of transformation that argument is meant to achieve, I think – not repetitive defensiveness without any insight. thank you for that.
Posted by: srini pillay | March 28, 2009 at 07:12 PM
to srini pillay:
just one thing, you say "more control groups"....... I would say AT LEAST ONE control group since there was not a single one for reference.
if you are going to test a chemical reaction between two substances, you leave one test tube with a solution of water and substance A, and you don't add substance B to that tube; you add it to another tube. that way, if the mixture changes color or a solid precipitates to the bottom, you can be sure it was the combination of substances A and B that caused the reaction, and not just substance A in water.
(also, having a test tube with substance B in water, and maybe one with just water.)
sorry if my example is not rigorous enough, and for the bad English,
but I think that the lack of a control group, single-handedly invalidates the experiment.
Posted by: Pelger | March 28, 2009 at 07:29 PM
Pillay: "2. it does not behoove a liberal newspaper to be attacking or discouraging of differences in opinion. the way in which president obama chose his colleagues should be evidence enough of this. we cannot transform our society if we only preach to the converted. to me scientific fanaticism is as dangerous as religious fanaticism. i feel very strongly about the uselessness of empty attack. most people who would want to disagree with you would not feel free to. when we lose the voice of any group of dissenters, we lose the voice of the total people."
This is bullshit for four reasons: 1) Liberal is right there in the quote. You're sorta diametrically opposed to quite a few of its ideas in the first place. 2) It's not an opinion; it's claiming that an idea with little support, no plausibility, and evidence against it is fact, based on flawed research. 3) Scientific fanaticism? This is us pointing out that something isn't supported by science. Saying "hey, there's no evidence this crap is true" and explaining why isn't really fanaticism. 4) It's not an empty attack; it's a logical argument bringing up valid points against a flawed paper.
Posted by: King of Ferrets | March 28, 2009 at 11:32 PM
Sorry, but lack of a control group means the study is worthless and poorly designed. Lack of readings for the "doing nothing time" (as Skeptico notes) means that the study is so poorly designed that Radin was either incompetent or outright deceitful.
And for this paragraph in your article, I also call you deceitful:
1. The study did not show that, as you acknowledge in your comment here. So why did you tell your readers it did?
2. Which "other studies" have shown distant healing can heal tumors? Do these studies really exist? Are they as shonky as this one? It always stuns me how frivolously people like you throw around the claim you've got a cure for cancer.
Also, from your comment here:
There was no control group in the study. The "control groups" were there to test "how much" not "if at all". Big fail.
What? Those are glaring and stupid errors. And why didn't you share this opinion with your readers, instead of suggesting it's a cure for cancer?
Your whole article was based on the assumption that distant healing is a proven phenomenon. Show even one sentence containing the possibility it has absolutely no effect. (The first sentence already leapfrogs the idea.) The temperament of your writing is the same as the temperament of your comment. Deceitful, trying to hold the door open long enough to let as much self-serving esoteric bullshit through as possible.
More bullshit. You said that there are some situations where distant healing doesn't work, already predicating that there are situations where it does. Slippery, deceitful logic.
Here, you use the word "inconclusive", as if you are undecided whether or not distant healing works. But that is not what you wrote in your article. The only "inconclusive" there was in which situations it works. Equivocation. Two-faced deceitfulness.
The objections have completely demolished the credibility of the study, and therefore of your article too. You failed to acknowledge any of the glaring, fundamental errors, and chose instead to talk of curing cancer.
And how about the references for those studies which
Posted by: yakaru | March 29, 2009 at 04:19 AM
Good for Dr. Pillay. He came into unfriendly territory to defend himself. Kudos.
However:
Lucky for me that I find being called a hateful, nihilistic, fanatical scoffer more funny than intimidating.
Love the word "shonky", yakaru.
Posted by: kate | March 29, 2009 at 10:16 AM
If you could heal someone in this way, it would be logically possible also to harm someone in the same way--in fact there is already a name for that: cursing (as with a magic spell). But if it were really possible to do that, and the technique has been known as long as there have been written records (which cursing has been)--don't you think it would have been put to military use by now? That it would have been developed into a science by now?
(Sorry for the reductio ad absurdum, but the whole thing is so ridiculous--goodbye laws of physics!)
Posted by: Helena Constantine | March 30, 2009 at 09:08 PM
What does it even mean for the autonomic nervous system "to become activated"? I'm pretty sure mine is active right now...
Posted by: Dunc | March 31, 2009 at 02:15 AM
That's because someone is healing you right now, Dunc! Feel the power through your body! SEND ME MONEY!
Posted by: Valhar2000 | April 01, 2009 at 02:19 AM
I would suggest an "EXPLORE: The Journal of Science and Healing" tag for this post, and tracking and/or reviewing "Explore." From what I can tell, its very purpose is to misrepresent woo as verified by a "peer-reviewed" journal.
Posted by: Anyrandomfool | April 03, 2009 at 02:55 PM