In our information-centered world, where answers are a click away, the term “evidence-based” carries a lot of weight. This is especially true in the sleep training world, where the stakes feel high, and there are a lot of very strong opinions. Today, everyone is or can be an expert. But when someone lays down the big “research-based” card, people tend to take notice.
Research is cited a lot in expert sleep training advice in books and on social media. Experts warn parents that research has found that sleep problems are common and serious and will not resolve on their own. Failure to start early could make change impossible. (No wonder parents are freaked out.) The most evidence-based approach, they say, is extinction (some form of crying it out), which has been shown to be quick (3-7 days), extremely effective, and without any negative side effects, even for very young babies.
A lot of parents have issues with some of this: they’re not seeing “quick” results, there’s a lot of crying that doesn’t reduce in a day or two, they don’t want to do extinction at all, they’re not sure about doing something like this with their newborn.
So, what does the research say in practical terms about extinction? You might be really surprised.
Before we dive into this thorny topic, let me first emphatically say that I am not making a case against crying it out in all its forms.
No.
I’m not at all saying it’s not a perfectly reasonable drama-less intervention for many families. For anyone who used it and it was successful and not a nightmare, I say "huzzah." That's awesome.
There are others who strongly dislike extinction, won’t ever use it, or tried it without success and won’t use it again. In my research with parents, more than 30% said extinction “didn’t work at all” and nearly that many said that there was “significantly more crying than they were told to expect” (Gordon, 2020). While some parents do have quick, easy traction, it’s clear that not everyone does.
To read the research and the advice, you would think this is a settled question: “The only standard recommendation for effectiveness is extinction. . . Extinction has the strongest empirical support” (Weissbluth, 2015, p. 189).
It’s not settled...at all. There are lots of questions about extinction that have simply not been asked or answered.
“Research-based” doesn’t always mean it’s objectively true.
It’s important for me to give you a little peek inside how research gets done. Most reasonable adults know that there are ways to cheat with statistics. We know that there are ways to fudge numbers or maybe bias a sample or just plain lie. But research is way more easily biased than that, in ways that may not be so obvious or easy to label as dishonest.
Research is rooted in questions. How you ask those questions and the words you choose to use can affect the whole direction of a study.
How you choose to measure things can dramatically affect the outcome. A particularly sticky issue is when researchers need to measure something that is abstract…like whether sleep training is “effective.” It may sound like a concrete thing, but what would "effective" actually look like in measurable terms? This is called operationalizing.
Say you want to measure the effect of social media consumption on empathy. You probably could easily measure social media consumption by putting some kind of tracker on the phone or laptop…no problem. How do you measure empathy? There are no manuals for this. Researchers need to decide how to measure it. A researcher could use a valid scale or measure that’s been tested to make sure it’s measuring empathy. But researchers could also say, “We measured empathy by asking the subjects’ partners how empathetic they were.” Ummm. You might say, “Is that accurately measuring empathy?”
And you’d be right. But as long as researchers devise a plausible way to measure a variable and are upfront about it, they technically can measure it however they want.
What does “effective” really mean?
When parents read that “extinction is effective,” they think that means, “sleeping through the night.” This is not what research has found.
In some studies, “effective” meant a longer longest stretch (Hall et al., 2015) or shorter (not fewer) awakenings, or a shorter time to fall asleep (Mindell & Durand, 1993). It never meant “slept through the night without waking” … not once. Across all studies, children were still waking at the end of the study. Remember, research isn’t interested in the resolution of sleep challenges, just improvement.
In some sleep training research, they didn’t always even measure sleep.
“What?” you say. “How can they do that?” I know.
They may have just asked the parents, “Did your child sleep better? (yes/no)” (Adachi et al., 2009) or “In the last week, how many times did your child wake up?” (Eckerberg, 2002). You and I know that those aren’t really objective measures. Those are subjective impressions, and you’re asking parents to try to remember what happened (Matthey & Speyer, 2008). It’s not a very reliable way to measure anything. But research has used this as hard data to document that the sleep intervention “significantly improved nighttime sleep.”
Meta-analyses (these are compilations of a body of studies) found that even though parents reported better sleep, the infants’ sleep didn’t actually change. In many studies, extinction had no impact on nightwakings (Kempler et al., 2016), which is perhaps parents’ biggest concern.
In several studies, the impact on sleep in practical terms was really small.
One meta-analysis found that, across studies, extinction only resulted in about 11.3 minutes more nighttime sleep (Fangupo et al., 2021).
That’s it. ELEVEN MINUTES. It’s unclear how many parents would put themselves through extinction for only eleven minutes more sleep.
Here’s a major kicker: Books seem to suggest that, if you do it right, extinction works 100% of the time. Across studies, extinction did not work for between 16% and 50% of the subjects. That’s a lot of families for whom it didn’t work. Parents rarely get the memo that this intervention might not work for them.
Extinction is not as fast as they say.
For a subset of kiddos, yes, it can be relatively quick and drama-free. But, in research, extinction took THREE WEEKS or more before sleeping through the night happened (Chadez & Nurius, 1987; France & Hudson, 1990; Rickert & Johnson, 1988). Parents are not hearing that part of it.
I’m going to bum you out a little more.
The investigation of side effects is over-reliant on averages.
Whether extinction (crying it out) is completely without negative side effects for every amount of crying, at every age, is a question that’s never been researched.
Let me reiterate that I'm not here to prove or suggest that extinction is always problematic. It’s not. However, it’s not accurate to say it never is. Bear with me…
Here’s what you have to know: research deals in averages. So, if, on average, there were no effects, they are able to report “no effects.” Once you start asking about specific subsets of babies, the evidence isn’t quite so definitive.
First, there are no studies on side effects for infants under six months. So, even though books say, “There is no evidence that crying it out harms babies ever,” it’s never been tested on younger babies. Plus, most of these studies don’t say how much crying happened. So, we don’t know how much crying is safe and okay. Is 15 minutes without harm? 20? What about 90 straight minutes for a three-month-old? We literally don’t know the answer to this.
There is a seminal study that everyone cites (France, 1992), and it’s one of the very few that did report how much crying happened. The study concluded that there were “no negative emotional side effects” ON AVERAGE.
On average, there were no negative effects, but it doesn’t mean there weren’t any.
Are you following this? Because it’s super important.
In that study, there were a couple of babies (toddlers really) who only cried a total of about 130 minutes over three weeks. They never cried more than 20 minutes, and they were sleeping through the night in under a week. On the other hand, there was a six-month-old who cried over 900 minutes total and cried for 50 minutes on Day 16. This baby was crying A LOT. Did this baby have “no effects”? We absolutely don’t know.
Research doesn’t have to say, “On average they were okay, but the babies who cried more than X had the following effects ….” Researchers are not required to do that.
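To make the "on average" problem concrete, here is a minimal sketch in Python. Every number in it is invented for illustration: the crying totals loosely echo the range described above, but the "change" scores and the 500-minute cutoff are assumptions of mine, not results from France (1992) or any other study. It only shows the arithmetic of how a group average can sit at zero while one subgroup looks very different.

```python
from statistics import mean

# Each tuple: (total crying in minutes, change on some outcome score).
# ALL values are hypothetical; 0 means "no change" on the outcome.
babies = [(130, 1), (130, 2), (200, 1), (250, 0), (300, 1), (900, -5)]

CRY_THRESHOLD = 500  # hypothetical cutoff for "cried a lot"


def subgroup_means(records, threshold):
    """Return the overall mean change and the mean for each crying subgroup."""
    overall = mean(change for _, change in records)
    low = mean(change for minutes, change in records if minutes <= threshold)
    high = mean(change for minutes, change in records if minutes > threshold)
    return overall, low, high


overall, low, high = subgroup_means(babies, CRY_THRESHOLD)
print(f"overall mean change:        {overall}")
print(f"cried <= {CRY_THRESHOLD} min (n=5): {low}")
print(f"cried >  {CRY_THRESHOLD} min (n=1): {high}")
```

With these made-up numbers, the overall average is zero, so a paper could honestly report "no effect on average," even though the one high-crying record sits well below the rest. Whether anything like this happens with real babies is exactly the question that has not been tested.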
The other problem with this study is that they used a terrible, terrible scale to measure “infant security,” and it’s often been misconstrued as measuring attachment security. This scale does not measure attachment. The Flint Scale was written in 1974 as a very early attempt to assess “security,” an early formulation of something like attachment, but not attachment itself. Questions on it are filled out by the parent about the infant and include: “Likes rough play,” “Accepts new foods readily,” “Sits patiently for food to be served,” “Allows a stranger to put them to bed.”
I’m sorry, WHAT?
This is measuring what exactly? To me, this sounds like aspects of temperament or something else entirely. In fact, this scale has never been published in a journal or tested for what’s called construct validity, meaning: does it measure what it says it does? I don’t understand how this is being used to assess the emotional outcomes of anything, but it totally, totally is.
Researchers are allowed to use scales that are “reasonable,” but I don’t think anyone at a journal takes a look at them when reviewing the study. This is a truly terrible measure of infant outcomes, and it’s been used multiple times across several pivotal studies on side effects (Eckerberg, 2004; France, 1992; France et al., 1991).
Another study cited as evidence that extinction doesn’t impact attachment utilized the Disinhibited Attachment Interview (I had to look this one up too. I had never heard of disinhibited attachment) (Price et al., 2012). This scale was used to assess Romanian orphans who had had such traumatic infancies that they would indiscriminately go off with strangers. So, this study found that extinction doesn’t cause severe attachment disruption? Um…okayyyy.
Another study used the Strange Situation (a validated measure of attachment) but did not assess attachment at the beginning of the study…only at the end (Gradisar et al., 2016). So, while it’s good to know that the sleep trained children didn’t disproportionately have insecure attachment, we have no idea where these children started. This just doesn’t give us the whole picture.
Many studies just ask parents, “Do you think this affected your relationship with your child?” Well, what are they going to say? “Actually, I do think my baby looks depressed now.” This is called social desirability bias. There’s no parent who’s going to be able to report that something they just did to their child was harmful. PLUS, presumably, they are now sleeping better, and the whole world looks better to them.
The business of research
Extinction is “the most evidence-based” merely because it’s the most researched, not because it’s better. Head-to-head, no intervention has ever been found to work “better” than another. In fact, a pilot study led by Sarah Blunden in Australia found that, compared to extinction, a more responsive approach was just as effective, with less stress for both mother and child (Blunden et al., 2022).
Why, then, does it seem like extinction is the best and only option?
Here’s a dirty little secret of research: it’s more attractive to jump on a body of research that’s already established than to buck the system and ask new questions. Once a bunch of researchers are asking the same incomplete questions and measuring outcomes in ways that don’t map to parents’ reality, that methodology just gets replicated, and suddenly, you have a “large body of evidence” that’s all built on the same shaky ground.
Research is also a business. Researchers/university faculty need to be published to survive. You need to do work that’s publishable.
Here’s how it often goes in Researchland:
Researchers conduct a study and get positive “significant” results.
Other researchers would rather build on that than start something from scratch.
Now, you have a bunch of studies, and (surprise) they all find the same outcomes.
The batch of similar studies gets compiled into systematic literature reviews and meta-analyses. These further carve the intervention into research stone because now, it’s a body of research.
Then, it gets crafted into a hierarchy of evidence so that extinction is considered an “empirically-supported treatment for infant insomnia” (Mindell et al., 2006).
This gets crafted into practice recommendations for pediatricians and other professionals and becomes part of a policy statement for big sleep medicine associations (American Academy of Sleep Medicine, n.d.; Morgenthaler et al., 2006).
And, boom, it’s the only game in town, and it’s unassailable because of the evidence base that got built up. In fact, it’s been used to dismiss alternatives. One group of researchers says that “immediate-responding approaches do not meet the required level of empirical backing. . . and would be considered experimental” (Crnčec et al., 2010).
It’s a racket.
No one is policing book authors or other experts to make sure that they’re citing research correctly. One book author I won’t mention says, “Sleep problems cause ADHD.”
No, they don’t.
Those two things are related, but no researcher on this planet would say one caused the other. Correlation does not equal causation. It could be that sensory sensitivity underlies both sleep problems and ADHD. Sleep problems don’t cause ADHD…or obesity…or behavioral problems.
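If you like seeing this kind of thing in action, here is a toy Python simulation of that "one thing underlies both" idea. It is a sketch under loud assumptions: the variable names, the noise levels, and the relationships are all invented, and nothing here is real data about sleep, sensitivity, or ADHD. In this made-up world, a shared trait drives two outcomes that have no direct link to each other.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical shared trait; both outcomes below depend on it,
# but neither outcome depends on the other.
sensitivity = [rng.gauss(0, 1) for _ in range(n)]
sleep_problems = [s + rng.gauss(0, 0.5) for s in sensitivity]
adhd_traits = [s + rng.gauss(0, 0.5) for s in sensitivity]

raw = pearson(sleep_problems, adhd_traits)

# Subtract the shared trait from each outcome and correlate what is left.
sleep_left = [x - s for x, s in zip(sleep_problems, sensitivity)]
adhd_left = [y - s for y, s in zip(adhd_traits, sensitivity)]
adjusted = pearson(sleep_left, adhd_left)

print(f"correlation, ignoring the shared trait:     {raw:.2f}")
print(f"correlation, after removing the shared trait: {adjusted:.2f}")
```

The two outcomes look strongly related until the shared trait is accounted for, at which point the relationship mostly disappears. That is the pattern a hidden common cause produces, and it is why a correlation alone can't tell you that one thing caused the other.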
So, what can you do?
Be really, really skeptical. If someone is trying to sell you something or persuade you to use a method that doesn’t sound right and research is the crowbar they’re using, use Google Scholar and look some stuff up yourself.
Better yet, listen to your gut. If you don’t want to use crying it out, you don’t have to. If you tried it and it really didn’t work, it’s not your fault. You didn’t do it wrong. It’s that the book or expert painted way too rosy a picture of it. You should also know that parents at home are trying to do all of this on their own, while parents in studies have a ton of evaluation, information, and follow-up support. One group of researchers suggests that parents in the real world aren’t having the same level of success with extinction simply because they’re forced to do it without all of that backup (Loutzenhiser et al., 2014).
Remember that all of this “research-based” rhetoric is made to push you off your center and away from your OWN knowledge — because if you feel like you don’t know what you’re doing…or you’re worried you’ll screw up your child, you will buy products and you will turn to experts for answers.
When it comes to research, don't let it bully you into making a choice. That “evidence base” can actually be shakier than it sounds.
Macall Gordon has a B.S. from Stanford in Human Biology and an M.A. from Antioch University, Seattle in Applied Psychology, where she is currently a Sr. Lecturer in the mental health counseling department. She is a researcher looking at the relationship between temperament and sleep, the gap between research and parenting advice on sleep training, and the effect of the oversupply of expert advice on sleep. She is a certified pediatric sleep consultant working with parents of alert, non-sleeping children in private practice, as well as on the women’s telehealth platform, Maven Clinic. She comes to this work because she had two sensitive, intense children, and she didn’t sleep for 18 years.
She recently presented this work at the World Infant Mental Health conference in Dublin, Ireland.
References
Adachi, Y., Sato, C., Nishino, N., Ohryoji, F., Hayama, J., & Yamagami, T. (2009). A brief parental education for shaping sleep habits in 4-month-old infants. Clinical Medicine & Research, 7(3), 85–92. https://doi.org/10.3121/cmr.2009.814
American Academy of Sleep Medicine. (n.d.). Practice parameters. https://aasm.org/clinical-resources/practice-standards/practice-guidelines/
Blunden, S., Osborne, J., & King, Y. (2022). Do responsive sleep interventions impact mental health in mother/infant dyads compared to extinction interventions? A pilot study. Archives of Women’s Mental Health, 25(3), 621–631.
Chadez, L. H., & Nurius, P. S. (1987). Stopping bedtime crying: Treating the child and the parents. Journal of Clinical Child Psychology, 16(3), 212-217.
Crnčec, R., Matthey, S., & Nemeth, D. (2010). Infant sleep problems and emotional health: A review of two behavioural approaches. Journal of Reproductive and Infant Psychology, 28(1), 44–54.
Eckerberg, B. (2002). Treatment of sleep problems in families with small children: is written information enough? Acta Paediatrica, 91, 952-959.
Fangupo, L. J., Haszard, J. J., Reynolds, A. N., Lucas, A. W., McIntosh, D. R., Richards, R., Camp, J., Galland, B. C., Smith, C., & Taylor, R. W. (2021). Do sleep interventions change sleep duration in children aged 0-5 years? A systematic review and meta-analysis of randomised controlled trials. Sleep Medicine Reviews, 59, 101498.
Flint, B. M. (1974). The Flint Infant Security Scale for infants aged 3 to 24 months. University of Toronto Press.
France, K. G. (1992). Behavior characteristics and security in sleep-disturbed infants treated with extinction. Journal of Pediatric Psychology, 17, 467–475.
France, K. G., Blampied, N. M., & Wilkinson, P. (1991). Treatment of infant sleep disturbance by trimeprazine in combination with extinction. Journal of Developmental & Behavioral Pediatrics, 12(5), 308-314.
France, K. G., & Hudson, S. M. (1990). Behavior management of infant sleep disturbance. Journal of Applied Behavior Analysis, 23(1), 91-98.
Gordon, M. D. (2020, October). The effect of difficult temperament on experiences with infant sleep and sleep training: A survey of parents. Poster presented at the Occasional Temperament Conference. University of Virginia (Virtual).
Gradisar, M., Jackson, K., Spurrier, N. J., Gibson, J., Whitham, J., Williams, A. S., Dolby, R., & Kennaway, D. J. (2016). Behavioral interventions for infant sleep problems: A randomized controlled trial. Pediatrics, 137(6).
Kempler, L., Sharpe, L., Miller, C. B., & Bartlett, D. J. (2016). Do psychosocial sleep interventions improve infant sleep or maternal mood in the postnatal period? A systematic review and meta-analysis of randomised controlled trials. Sleep Medicine Reviews, 29, 15–22.
Loutzenhiser, L., Hoffman, J., & Beatch, J. (2014). Parental perceptions of the effectiveness of graduated extinction in reducing infant night-wakings. Journal of Reproductive and Infant Psychology, 32(3), 282–291.
Matthey, S., & Speyer, J. (2008). Changes in unsettled infant sleep and maternal mood following admission to a parentcraft residential unit. Early Human Development, 84(9), 623-629.
Mindell, J. A., & Durand, V. M. (1993). Treatment of childhood sleep disorders: Generalization across disorders and effects on family members. Journal of Pediatric Psychology, 18(6), 731-750.
Mindell, J. A., Kuhn, B., Lewin, D. S., Meltzer, L. J., & Sadeh, A. (2006). Behavioral treatment of bedtime problems and night wakings in infants and young children. Sleep, 29(10), 1263–1276.
Morgenthaler, T. I., Owens, J., Alessi, C., Boehlecke, B., Brown, T. M., Coleman, J., Friedman, L., Kapur, V. K., Lee-Chiong, T., Pancer, J., Swick, T. J., & American Academy of Sleep Medicine. (2006). Practice parameters for behavioral treatment of bedtime problems and night wakings in infants and young children. Sleep, 29(10), 1277–1281.
Price, A. M. H., Wake, M., Ukoumunne, O. C., & Hiscock, H. (2012). Five-year follow-up of harms and benefits of behavioral infant sleep intervention: Randomized trial. Pediatrics, 130(4), 643–651.
Rickert, V. I., & Johnson, C. M. (1988). Reducing nocturnal awakening and crying episodes in infants and young children: A comparison between scheduled awakenings and systematic ignoring. Pediatrics, 81(2), 203-213.
Weissbluth, M. (2015). Healthy sleep habits, happy child (4th ed.). Ballantine.