This document was reviewed and approved by the Board of Governors of the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) in Nov 2020.
Rebecca C. Dirks¹, MD, MS; Geoffrey P. Kohn², MBBS, MSurg; Bethany Slater³, MD, MBA; Jake Whiteside¹, BS; Noe A. Rodriguez⁴, MD; Salvatore Docimo⁵, DO, MS; Aurora Pryor⁵, MD, MBA; Dimitrios Stefanidis¹, MD, PhD; on behalf of the SAGES Guidelines Committee
1. Department of Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
2. Department of Surgery, Monash University Eastern Health Clinical School, Melbourne, VIC, Australia
3. Division of Pediatric Surgery, University of Chicago, Chicago, IL, USA
4. Bariatric and Metabolic Institute, Cleveland Clinic, Cleveland, OH, USA
5. Department of Surgery, Stony Brook Medicine, Stony Brook, NY, USA
Key words: Esophageal Achalasia, POEM, Heller Myotomy, Pneumatic Dilation, Systematic Review
Background: Achalasia is a rare, chronic and morbid condition with evolving treatment. Peroral endoscopic myotomy (POEM) has gained considerable popularity, but its comparative effectiveness is uncertain. We aim to evaluate the literature comparing POEM to Heller myotomy (HM) and pneumatic dilation (PD) for the treatment of achalasia.
Methods: We conducted a systematic review of comparative studies between POEM and HM or PD. A priori outcomes pertained to efficacy, perioperative metrics, and safety. Internal validity of observational studies and randomized trials (RCTs) was judged using the Newcastle Ottawa Scale and the Cochrane Risk of Bias 2.0 tool, respectively.
Results: From 1379 unique literature citations, we included 28 studies comparing POEM and HM (n=21) or PD (n=8), with only 1 RCT addressing each. Aside from two 4-year observational studies, POEM follow-up averaged ≤ 2 years. While POEM had similar efficacy to HM, POEM treated dysphagia better than PD both in an RCT (treatment “success” RR 1.71, 95% CI 1.34 to 2.17; 126 patients) and in observational studies (Eckardt score MD -0.43, 95% CI -0.71 to -0.16; 5 studies; I² 21%; 405 patients). POEM needed reintervention less often than PD in an RCT (RR 0.19, 95% CI 0.08 to 0.47; 126 patients) and less often than HM in an observational study (RR 0.33, 95% CI 0.16 to 0.68; 98 patients). Though patient-reported reflux at 6-12 months was worse after POEM than after PD in 3 observational studies (RR 2.67, 95% CI 1.02 to 7.00; I² 0%; 164 patients), post-intervention reflux was inconsistently measured and not statistically different in measures ≥ 1 year. POEM had similar safety outcomes to both HM and PD, including treatment-related serious adverse events.
Conclusions: POEM has similar outcomes to HM and greater efficacy than PD. Reflux remains a critical outcome with unknown long-term clinical significance due to insufficient data and inconsistent reporting.
Achalasia is a rare cause of dysphagia resulting from failed lower esophageal sphincter (LES) relaxation and esophageal dysmotility. While its incidence is < 2 per 100,000 and prevalence ranges from approximately 2 to 13 per 100,000 [1, 2], achalasia causes a substantial decrease in quality of life and productivity due to its chronic nature. Treatments such as esophageal pneumatic dilation and botulinum toxin injection have relapse rates over 50%, with botulinum toxin relapse often occurring within 6-12 months [4, 5]. The 2015 SAGES guideline on achalasia suggests targeting the LES mechanism more permanently via division of the muscle fibers of the LES.
The Heller myotomy, first reported by Ernst Heller in 1914, was the first surgical approach to divide the LES muscle. The “peroral endoscopic myotomy” (POEM) reported by Inoue et al. has gained popularity and become a viable first-line therapy. As a natural orifice approach, POEM has the potential for less morbidity, quicker recovery, better cosmesis, and a longer myotomy, and it avoids the need to dissect the thoracic and abdominal esophagus.
Nevertheless, demonstration of POEM’s safety profile and effectiveness relative to alternative interventions is paramount for its wide acceptance by physicians, patients, and payers. Clinical outcomes of POEM must be compared to traditional treatments for achalasia, namely endoscopic pneumatic dilation (PD) and Heller myotomy (HM). The aim of this systematic review was to identify, critically appraise, and meta-analyze the available evidence on POEM’s effectiveness and safety in comparison to laparoscopic HM and PD to aid in developing evidence-based guidelines for physicians involved in the treatment of achalasia.
We conducted a systematic review using methodological approaches outlined in the Cochrane Handbook for Systematic Reviews of Interventions and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) criteria. Only published de-identified data were used, negating the need for IRB approval.
Working Group & Key Questions
A working group was selected from within the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) Guidelines Committee to systematically review the evidence on POEM. Two key questions were drafted using the PICO format (patient, intervention, comparator, outcome). The full PICO questions are detailed in Supplement 1.
Key Question 1: Should peroral endoscopic myotomy vs. Heller myotomy be used for achalasia in adults and children?
Key Question 2: Should peroral endoscopic myotomy vs. pneumatic dilation be used for achalasia in adults and children?
A literature search was performed on 9/3/18 using Medline (Ovid, PubMed), Embase, Cochrane Library, Clinicaltrials.gov, and TRIP to identify studies on POEM and either laparoscopic Heller myotomy or pneumatic dilation. A second search of PubMed was performed in December 2019 because the original search would otherwise no longer have fallen within 12 months of publication, as recommended by the Cochrane Handbook. The PubMed search string is listed in Supplement 2. Given the suspected limited nature of the available literature, a wide selection of study designs was deemed acceptable, including randomized controlled trials (RCTs) and comparative non-randomized studies. The bibliographies of all included studies, excluded systematic reviews, and background studies were hand-searched to identify articles missed by the original search strategy. Animal studies, non-English studies, and non-comparative observational studies were excluded. Studies were limited to 2010 onwards as POEM was first described in 2009. Outcomes of interest were decided before the literature search and are listed in Supplement 1.
All citations were uploaded onto Covidence, where two reviewers independently screened each title and abstract for relevance to at least one key question. Exclusion criteria were achalasia not objectively established by esophageal manometry; secondary esophageal motility disorders, such as dysmotility secondary to esophagogastric cancer; large hiatal hernia (>3 cm); post-radiotherapy; and non-comparative studies on POEM and other techniques. Next, full text screening was performed on all relevant citations by two independent, blinded reviewers from the working group. Any disagreements in abstract or full text screening were discussed and, if needed, arbitrated by the working group lead (GK). For new studies identified in December 2019, a single reviewer (RD) performed screening and full text review, with judgements checked by a nonblinded co-author.
Study characteristics, sponsorship source, and population baseline characteristics were extracted into Covidence along with prescribed outcomes of interest. In addition to the primary outcomes of dysphagia rate, reflux symptoms, and pain, additional outcomes included distal esophageal pH exposure; abnormal DeMeester composite score (≥ 14.7); quality of life; cost; major adverse events such as reoperation for postoperative complications, unexpected ICU admission, death, postoperative leak, or blood transfusion; and perforation rate. Reflux symptoms were defined as heartburn symptoms or regurgitation, in addition to reflux symptoms not otherwise specified, and pain was defined as postoperative pain or chest pain. When multiple papers reported on the same study population, data extraction was still performed on all papers, but outcomes from an earlier paper were only included in meta-analysis if they were not also reported in the more recent publication.
Internal Validity Assessment
Concurrent with data extraction, internal validity of included studies was assessed. For observational studies, we used a modified Newcastle-Ottawa Scale. This assessment included selection bias, incomplete outcome data bias, selective outcome reporting bias, performance bias, and detection bias. Randomized controlled trials (RCTs) were rated at the outcome level based on the Cochrane Risk of Bias (RoB) tool 2.0; this tool consists of five domains of judgement: bias from the randomization process, deviation from intended interventions, missing outcome data, measurement of outcome(s), and selection of reported result(s). Two independent, blinded reviewers performed data extraction and quality assessment, except for new studies found during creation of this manuscript and RCTs, for which a nonblinded co-author checked the extraction and assessments of the initial reviewer (RD).
Data from RCTs and observational studies were analyzed separately. For meta-analysis, RevMan was used to calculate mean differences (MD) for continuous outcomes using the same scale and standardized mean differences (SMD) for those with different scales, both using the inverse variance random-effects model. RevMan was used to create risk ratios (RR) for dichotomous outcomes with a Mantel-Haenszel random-effects model. Heterogeneity was explored using both χ² and I² values. For χ² values, p < 0.05 was considered statistically significant. For I², the rule of thumb in the GRADE (Grading of Recommendations Assessment, Development and Evaluation) handbook was used, where < 40% may be low and 75-100% may be considerable heterogeneity. If heterogeneity was present, we investigated it using subgroup and sensitivity analyses based on study characteristics such as quality, length of follow-up or publication year, or population characteristics such as age or country of origin.
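As an illustration of the inverse variance random-effects pooling and the χ² and I² statistics described above, the following is a minimal Python sketch. It assumes the DerSimonian-Laird estimator of between-study variance, a common random-effects choice that is not stated explicitly in the text; the function name `pool_random_effects` and the input values in the usage note are hypothetical and are not data from the included studies.

```python
import math


def pool_random_effects(effects, variances):
    """Pool per-study effects (e.g., mean differences) with inverse-variance
    random-effects weights. Assumes the DerSimonian-Laird tau^2 estimator."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q (the chi-squared heterogeneity statistic) and its degrees of freedom
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each study's own variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I^2: percentage of total variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

For example, `pool_random_effects([0.0, 2.0], [1.0, 1.0])` pools two hypothetical studies; the returned `i2` value can then be read against the GRADE rule of thumb quoted above.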
From 1379 citations identified from the literature search, we included 28 studies (2 RCTs and 26 observational studies) as shown in Figure 1 and Table 1. Studies excluded after full text review are listed in Supplement 3, and full study disclosures are listed in Supplement 4. Most comparative studies on POEM included Heller myotomy (n=21), with a minority on POEM versus PD (n=8); one study included all three interventions. All were published after 2012, though data from HM and PD often preceded 2010. Due to its relatively new introduction, POEM often had shorter follow-up. Aside from two 4-year observational studies [19, 20], POEM follow-up averaged ≤ 2 years.
There were two pediatric studies, with one each comparing POEM to PD and to HM. Most studies predominantly included patients 40-70 years old, with a few studies having younger adults [19, 23, 24]. Most studies reported on baseline achalasia subtype, the majority of which had either predominantly type 2 and/or type 1 achalasia. One study had predominantly type 3 achalasia. All studies had a small sample size. The vast majority of studies had < 100 total patients, many with < 50 total patients; the largest study contained 221 total patients.
Figure 1. Flow diagram for literature search and screening.
Table 1. Characteristics of included studies.
POEM vs. HM
Risk of Bias
A single randomized study on POEM versus laparoscopic HM was published by Werner et al. All other evidence was taken from observational studies with common limitations including small sample size, low number of events, and notable heterogeneity between studies. The initial methodological quality for all 20 included observational studies is visually summarized in Figure 2, with study level judgments in Supplement 5, demonstrating predominantly low quality across all components of the risk of bias assessment. The only RCT, Werner et al., was rated with low risk of bias in all domains of the Cochrane Risk of Bias Tool 2.0 at the study level. When making judgements on individual outcomes, there was “some concern” for bias for the outcomes of patient-reported reflux, esophagitis, lower esophageal sphincter pressure (LESP), and abnormal pH at 2 years due to substantial missing data from patients not included in these measurements (domain 3), and there was “some concern” for subjective outcomes given it was impossible to blind patients (domain 4). Details for each domain are in Supplement 6.
Figure 2. Study quality for 20 observational studies addressing POEM (peroral endoscopic myotomy) versus HM (Heller myotomy). Solid fill, striped, and no fill (white) patterns represent low, unclear, and high risk of bias, respectively. The bar at the bottom shows what percent of the total 20 observational studies have each type of risk of bias for both the individual domains and for the overall bias.
Randomized Controlled Trial Data
Table 2 summarizes 2-year outcomes from Werner et al. Reduction in dysphagia, as measured by Eckardt score ≤ 3 (Supplement 7), serious adverse events, postoperative relaxation pressure, quality of life improvement, and length of stay (LOS), although better in POEM, were not statistically different. Additionally, patient-reported reflux, grades B-D esophagitis on esophagogastroduodenoscopy (EGD), and use of proton pump inhibitors (PPIs) were worse in POEM but not statistically different. Neither group had any mortality, and both groups had similar rates of severe reflux esophagitis (grades C and D). Quality of life was measured by the Gastrointestinal Quality of Life Index, which was described as the sum of individual item scores covering GI function, emotion, physical function, social function, and medical treatment, each ranging 0-4 points, for a maximum score of 144 points, with higher scores indicating better function. LESP, reflux esophagitis, patient-reported reflux, and percent abnormal pH (the percent of time an ambulatory pH study demonstrates elevated distal esophageal acid exposure) all have risk of bias due to missing outcomes, as detailed in Supplement 6.
Table 2. Two-year outcomes for POEM versus Heller myotomy reported in Werner et al.
Evidence from observational studies was available for dysphagia, reflux, pain, perforation, return to OR for postoperative complications, unexpected ICU stay, myotomy length, repeat intervention, LOS, and cost. Dysphagia was measured as a binary outcome (Figure 3) and as a continuous outcome (Figure 4), the latter defined by the Eckardt scale (Supplement 7). All data showed no statistically significant difference in dysphagia between POEM and HM, despite point estimates favoring POEM, as shown in Figures 3 and 4. The postoperative (≤ 2 weeks) and 6-month data from Bhayani et al. were visual outliers and were based on dysphagia to solids only. In Figure 4, there was significant heterogeneity between the three studies that measured Eckardt score at 6 months to 1 year. The study by Leeds et al. was an outlier without usual causes such as study quality, follow-up length, age/sex, or country of origin. However, Leeds et al. did have predominantly Eckardt stage 3 at baseline whereas the other two had stage 2. When this outlier was removed, the heterogeneity resolved (p = 0.76, I² = 0%) and there was still no statistically significant difference (MD -0.01; 95% CI -0.42, 0.41).
Reflux was reported as a dichotomous outcome in both a subjective and objective manner, without any consistent definition and across a wide range of follow-up, from 2 weeks to 4 years postoperatively. Figure 5 shows estimates from 8 studies, suggesting no difference in subjective reflux as defined by patient-reported symptoms from 6 months to 3 years, but a statistically significant difference in objective reflux as measured by abnormal pH [22, 27, 39], esophagram, and esophagitis, though the last was only grade A on the Los Angeles classification. When a 2-month result from Sanaka et al. and a 3-month result from Wirsching et al. were removed, the heterogeneity (I² = 71.9%) between objective and subjective subgroups resolved (I² = 0%) and the remaining 1-year objective estimates had a risk ratio of 0.99 (95% CI 0.37, 2.70). Sensitivity analyses removing the only pediatric study and a study with severely mixed time frames showed no change in risk ratio. One study could not be included, however, because it mixed subjective measurements for POEM with objective measurements for HM. Another study, Bhayani et al., reported multiple measures for reflux based on symptom definition and follow-up (Supplement 8); only the outcomes for patient-reported “reflux” were included in Figure 5.
There was no difference in postoperative pain, perforation, return to OR for complications, unexpected ICU stay, or LOS (Supplement 9). The LOS showed significant heterogeneity (p = 0.0003, I² = 66%) that did not resolve when stratified by study quality, by pediatric versus adult studies, or by country, and was not explained by date of publication. However, when only studies from the United States were included, there was a small but statistically significant difference in LOS, with mean difference 0.58 days shorter for POEM (95% CI -1.00, -0.16; 8 studies; 575 patients; I² = 53%) [20, 23, 26, 34, 36, 41, 43, 45]. Need for reintervention was also statistically lower for POEM in a 4-year study, with RR 0.33 (95% CI 0.16, 0.68; 98 patients). Myotomy length was greater in POEM compared to HM in four studies, with differences ranging from 0.9 cm to 8 cm [20, 22, 25, 45]. In four of five studies reporting on cost, each with a different definition, POEM was more expensive (Supplement 10).
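For readers less familiar with how a single-study risk ratio and its 95% confidence interval are obtained, a minimal Python sketch follows. It assumes a Wald interval on the log scale and, for studies with no events in one arm, the addition of 0.5 to every cell of the 2x2 table; both are common conventions rather than methods confirmed by the text, and the function name `risk_ratio` and the counts in the usage note are hypothetical.

```python
import math


def risk_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Return (RR, lower, upper) for arm A versus arm B.

    The interval is a Wald interval computed on the log scale.
    """
    if events_a == 0 and events_b == 0:
        raise ValueError("risk ratio is not estimable when neither arm has events")

    a, n_a = float(events_a), float(total_a)
    b, n_b = float(events_b), float(total_b)

    if a == 0 or b == 0:
        # Assumed convention: add 0.5 to every cell of the 2x2 table,
        # which raises each arm's event count by 0.5 and its total by 1.
        a, b = a + 0.5, b + 0.5
        n_a, n_b = n_a + 1.0, n_b + 1.0

    rr = (a / n_a) / (b / n_b)
    # Standard error of log(RR)
    se_log = math.sqrt(1.0 / a - 1.0 / n_a + 1.0 / b - 1.0 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper
```

With hypothetical counts, `risk_ratio(10, 50, 20, 50)` gives a risk ratio of 0.5, meaning the event was half as frequent in arm A; an interval that excludes 1 corresponds to a statistically significant difference at the 5% level.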
Figure 3. Dysphagia as a patient-reported binary outcome, as reported in observational studies on peroral endoscopic myotomy (POEM) and laparoscopic Heller myotomy (HM). For Podboy et al., “clinical failure” is represented, including Eckardt score > 3, reintervention, or hospitalization for achalasia.
Figure 4. Dysphagia by Eckardt score, as reported in observational studies on peroral endoscopic myotomy (POEM) and laparoscopic Heller myotomy (HM).
Figure 5. Reflux as a patient-reported binary outcome reported in observational studies on peroral endoscopic myotomy (POEM) and laparoscopic Heller myotomy (HM). Objective measures included pH studies, esophagitis seen on EGD, or reflux on an esophagram. When 2-3-month measures (Sanaka 2018, Wirsching 2019) are removed, the remaining 1-year objective results have an RR of 0.99 (95% CI 0.37, 2.70; I² = 0%).
POEM vs. PD
Risk of Bias
Few comparative studies were available for POEM vs. PD, with 7 observational studies initially found and a single 2019 RCT by Ponds et al. The initial quality assessment for all 7 included observational studies is visually summarized in Figure 6, with study level judgments in Supplement 11, demonstrating predominantly low quality across all components. Ponds et al. was low risk of bias in all domains of the Cochrane Risk of Bias Tool 2.0 for the primary outcome of 2-year treatment success as measured by Eckardt score but was high risk of bias for all remaining secondary outcomes due to substantial missing outcome data in the PD arm from patients not included in these measurements. Details for each domain are in Supplement 12.
Figure 6. Risk of bias (RoB) for 7 observational studies addressing POEM (peroral endoscopic myotomy) versus PD (pneumatic dilation). Solid fill, striped, and no fill (white) patterns represent low, unclear, and high risk of bias, respectively. The bar at the bottom shows what percent of the total 7 observational studies have each type of risk of bias for both the individual domains and for the overall bias.
Randomized Controlled Trial Data
Table 3 summarizes outcomes from Ponds et al. This study demonstrated greater reduction in dysphagia by POEM, as measured by Eckardt score ≤ 3, and fewer reinterventions required by POEM. The remainder of the secondary outcomes were not significantly different between interventions. Quality of life was measured in multiple ways, including the achalasia disease-specific quality of life (DSQoL) score in Table 3, which ranges from 10 to 33, where a lower score indicates a better quality of life. Other measures of quality of life, including a gastroesophageal reflux disease (GERD) questionnaire and the physical and mental component summary scores of the SF-36 questionnaire, also showed no statistically significant difference between POEM and PD.
Table 3. Two-year outcomes for POEM versus PD in Ponds et al. 
Of the seven observational studies on POEM vs. PD, six reported on short-term Eckardt score, which favored POEM (Figure 7). There was no statistically significant difference in subgroups based on high- vs. low-quality studies or pediatric (Tan et al.) vs. adult studies. One study, Zheng et al., was the cause of statistically significant heterogeneity in all of these stratifications. On further investigation, Zheng et al. used pneumatic dilation only once in the PD arm, explaining why it favored POEM more strongly than studies with repeat interventions for PD patients. When this study was removed, the statistically significant heterogeneity in Figure 7 resolved (p = 0.28, I² = 21%), but the pooled estimate still significantly favored POEM, with a mean difference of -0.43 on the Eckardt scale (95% CI -0.71, -0.16; 5 studies; 405 patients). Of note, the mean scores for both POEM and PD were ≤ 3 in all but one study with a wide confidence interval (Sanaka et al.), indicating both interventions on average achieved clinical remission, though POEM to a greater degree. Two studies also reported binary dysphagia [24, 35], which also significantly favored POEM with combined RR of 0.22 (95% CI 0.12, 0.39) (Supplement 12).
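The remission cut-off used throughout (Eckardt score ≤ 3) can be made concrete with a short Python sketch. It assumes the commonly used four-item form of the Eckardt score, in which dysphagia, regurgitation, retrosternal pain, and weight loss are each graded 0-3 and summed to a total of 0-12; the definition in Supplement 7 should take precedence if it differs, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

# "Eckardt score <= 3" is the success/remission cut-off used in the text
REMISSION_THRESHOLD = 3


@dataclass(frozen=True)
class EckardtItems:
    """Assumed four-item Eckardt score; each item is graded 0 (none) to 3 (worst)."""
    dysphagia: int
    regurgitation: int
    retrosternal_pain: int
    weight_loss: int

    def total(self) -> int:
        items = (self.dysphagia, self.regurgitation,
                 self.retrosternal_pain, self.weight_loss)
        for grade in items:
            if not 0 <= grade <= 3:
                raise ValueError("each Eckardt item must be graded 0-3")
        return sum(items)


def in_remission(items: EckardtItems) -> bool:
    """True when the total score is at or below the remission cut-off."""
    return items.total() <= REMISSION_THRESHOLD
```

Under these assumptions, a hypothetical patient graded 1, 1, 0, and 1 has a total of 3 and counts as a treatment success, whereas one graded 3, 2, 1, and 0 has a total of 6 and does not.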
Three studies demonstrated greater short-term (6 months to 1 year) reflux (Figure 8) in POEM, three studies demonstrated greater risk for reintervention in PD patients (Figure 9), and while most studies reported no perforations in either arm, a single study reported 3 perforations only in the PD arm (RR 0.39; 95% CI 0.02, 7.47) (Supplement 13). In two studies, cost per cure at the first year and hospitalization cost were greater for POEM (Supplement 14). In a single study, the LOS was longer for POEM patients with a mean difference of 4.33 days (95% CI 3.15, 5.51).
Figure 7. Short-term (< 1 year) dysphagia by Eckardt scores for peroral endoscopic myotomy (POEM) versus pneumatic dilation (PD) based on observational studies. Without Zheng 2019, the mean difference is -0.43 (95% CI -0.71, -0.16).
Figure 8. Short-term (< 1 year) patient-reported reflux for peroral endoscopic myotomy (POEM) versus pneumatic dilation (PD) based on observational studies.
Figure 9. Need for planned reintervention for treatment failure for peroral endoscopic myotomy (POEM) versus pneumatic dilation (PD) based on observational studies.
The efficacy of POEM, as measured by patient-reported dysphagia, Eckardt scores and need for reintervention due to treatment failure, appears similar to that of HM but greater than that of PD. In particular, patients who undergo PD are more likely to require repeat intervention and less likely to have self-reported success by Eckardt score than those who undergo POEM.
A 2020 systematic review on PD versus laparoscopic HM (LHM) demonstrated that most studies have historically favored LHM, with 4 recent studies reaching divergent conclusions, presumably because they differed in whether repeated dilation was defined as a failure or as a natural part of the treatment protocol. This consideration pertains to comparisons between POEM and PD as well. While PD requires more interventions due to repeat dilations, these repeat dilations may be interpreted less as a failure and more as an inherent component of pneumatic dilatory therapy. What Ponds et al. demonstrate (Table 3) in particular, however, is that POEM still achieves superior dysphagia resolution up to 2 years post-procedure, even when multiple dilations are used. The safety of all three procedures was comparable. There was no difference in mortality, treatment-related serious adverse events, perforation, reoperation for complication, or unexpected ICU stay between POEM and HM and no difference in treatment-related serious adverse events between POEM and PD.
Reflux is the most frequently voiced concern for POEM; although reflux was found to worsen after POEM, the size of this effect varied greatly. For POEM versus Heller, low-quality observational studies demonstrate no difference in effect for 6-month patient-reported heartburn or reflux symptoms, but worse early postoperative heartburn and more abnormal DeMeester composite scores. RCT data from Werner et al. demonstrated worse reflux esophagitis, patient-reported reflux, and greater use of PPI in the POEM group compared to LHM. However, only PPI use achieved statistical significance, and there were fewer patients with POEM that had severe esophagitis (grades C and D). Ultimately, only PPI use significantly favored HM, and the relevance of this increased use of PPI to physiologic reflux is uncertain, as symptoms of achalasia or other gastrointestinal symptoms can be treated by PPI without proof of abnormal esophageal acidification. Nevertheless, the effect of reflux may be underestimated due to high risk of bias. There was missing outcome data for reflux esophagitis and abnormal pH in the RCT (Werner et al.), and the study that reported no difference in 6-month heartburn outcomes (Bhayani et al.) was rated with substantial risk of bias due to 25-40% attrition. For POEM versus PD, patient-reported reflux symptoms at 6-12 months were significantly worse in POEM patients (Figure 8) in observational studies. These symptoms, however, were described as either “mild heartburn” or as relieved by oral PPI therapy [35, 42], the latter raising the aforementioned concern for poor correlation to objective gastroesophageal reflux. Additionally, evidence from a single RCT showed no significant difference in esophagitis or patient-reported reflux. The long-term effect of POEM on clinically relevant reflux is still unclear.
Part of the heterogeneity in reflux findings may be due to inconsistent measurement and reporting. The literature reports reflux outcomes using a variety of subjective and objective measurements, including both continuous and binary definitions such as patient-reported dichotomous reflux, patient-reported reflux or GERD-related quality of life scales, PPI use, percent abnormal pH, or DeMeester score, making it difficult to pool results. One study even combined subjective measurements for POEM with objective measurements for HM. This variation is a severe obstacle in accurately gauging the effect that POEM versus HM or PD has on reflux. Given that POEM has demonstrated similar safety and equal, if not superior, efficacy compared to HM and PD, reporting standards should be used to better elucidate key side effects such as reflux and their clinical significance. Ideally, one or two consistent measurements should be used in the literature to enable better pooled results in the future. While patient-reported symptoms and PPI use have often been used to reflect the impact of post-myotomy reflux, both have been shown to correlate poorly with physiologic reflux, including after POEM specifically [49, 50]. Although these outcomes should still be included for their role in clinical decision making, they should not be treated as proxies for true physiologic reflux. Instead, both objective measures of reflux and GERD-specific quality of life should be obtained, as important yet separate outcomes.
The majority of HM studies had 100% partial fundoplication rates, predominantly a Dor or Toupet fundoplication, with the exception of Podboy et al., which reported an 84% partial fundoplication rate. Thus, it is important to note that the above variation in reflux was not due to lack of a fundoplication following HM. Both Dor and Toupet after HM have shown similar reflux rates in two recent systematic reviews [51, 52], suggesting both are equally effective. According to Siddaiah-Subramanya et al., while there was greater quality of life in post-Heller Toupet patients, this was based on two studies with a pooled mean difference of only 1.68 points on the GERD-HRQL scale, which ranges from 0 to 50. There is no definitive advantage for Dor vs. Toupet fundoplication after HM. Fundoplication is not without side effects, though, and it is interesting to note that none of the studies included in this review compared symptoms such as bloating, rectal flatulence, and inability to vomit or belch between POEM and HM groups outside of combined quality of life scores.
As with all systematic reviews, the evidence presented here is only as good as the evidence collected. Most comparative studies on POEM are observational, with small sample sizes and low event rates, and were considered to be low quality. Only two RCTs met our inclusion criteria, one each for POEM versus PD and POEM versus HM [37, 44]. While at low risk of bias regarding the primary outcome of dysphagia, even these RCTs have some concern or high risk of bias for secondary outcomes (Supplements 6 and 12). Additionally, most studies had < 2-year follow-up for both intervention arms and had notably shorter follow-up for POEM (Table 1). This study may have also limited its evidence by restricting studies to English, but this effect was likely small. Including non-English studies has been shown to have minimal impact on conclusions drawn from other systematic reviews. Thus, the restrictions in our literature search were unlikely to affect our conclusions. This review included more comparative evidence than prior systematic reviews [55-59], most notably including recent RCT data on POEM versus Heller myotomy or POEM versus PD, which had thus far been missing. Two studies not included in our study, but included in multiple other systematic reviews, were deemed not truly comparative as they compared 6-month POEM outcomes to either 3-year or 5-year HM outcomes.
Despite the predominantly low-quality observational evidence and the limited randomized data, the observational studies and RCTs in this report are consistent with one another, particularly regarding efficacy and safety, and they support POEM as an equal and established alternative to Heller myotomy and as a superior choice to pneumatic dilation. As more robust studies are published, future meta-analyses can elucidate whether the favorable effect of POEM over pneumatic dilation lasts past 2 years and can tease out the true incidence and clinical relevance of post-POEM reflux.
The authors would like to acknowledge Shauna Bostonian for her contribution in performing the literature search for all observational studies and Holly Ann Burt for her follow up literature search for new studies. We would additionally like to acknowledge Jennifer Malinowski for her assistance with data organization and preliminary statistical analyses and Ahmed M. Abou-Setta for his edits on the final drafts of this manuscript.
Funding: No funding was used for this study.
Compliance with ethical standards
Disclosures: Rebecca Dirks has equity interest in Johnson & Johnson unrelated to this project. Bethany Slater is a consultant for Boulder Surgical, which is unrelated to this project. Salvatore Docimo, Jr. is a consultant for Boston Scientific, which is unrelated to this project. Dimitrios Stefanidis has received research support to his institution from ExplORer Surgical Inc. and Bard, which are unrelated to this project. Aurora Pryor is a speaker for Ethicon, Gore, Merck and Stryker. She is a consultant for Obalon. Geoffrey Kohn, Noe Rodriguez and Jake Whiteside have no conflicts of interest or financial ties to disclose.
- O’Neill OM, Johnston BT, Coleman HG (2013) Achalasia: a review of clinical diagnosis, epidemiology, treatment and outcomes. World J Gastroenterol 19:5806-5812
- Sadowski DC, Ackah F, Jiang B, Svenson LW (2010) Achalasia: incidence, prevalence and survival. A population-based study. Neurogastroenterol Motil 22:e256-e261
- Nenshi R, Takata J, Stegienko S, Jacob B, Kortan P, Deitel W, Laporte A, Darling G, Urbach DR (2010) The cost of achalasia: quantifying the effect of symptomatic disease on patient cost burden, treatment time, and work productivity. Surg Innov 17:291-294
- Eckardt VF, Gockel I, Bernhard G (2004) Pneumatic dilation for achalasia: late results of a prospective follow up investigation. Gut 53:629-633
- Lake JM, Wong RK (2006) Review article: the management of achalasia – a comparison of different treatment modalities. Aliment Pharmacol Ther 24:909-918
- Stefanidis D, Richardson W, Farrell TM, Kohn GP, Augenstein V, Fanelli RD, SAGES Guidelines Committee (2012) SAGES guidelines for the surgical treatment of esophageal achalasia. Surg Endosc 26:296-311
- Payne WS (1989) Heller’s contribution to the surgical treatment of achalasia of the esophagus. 1914. Ann Thorac Surg 48:876-881
- Inoue H, Minami H, Kobayashi Y, Sato Y, Kaga M, Suzuki M, Satodate H, Odaka N, Itoh H, Kudo S (2010) Peroral endoscopic myotomy (POEM) for esophageal achalasia. Endoscopy 42:265-271
- Tuason J, Inoue H (2017) Current status of achalasia management: a review on diagnosis and treatment. J Gastroenterol 52:401-406
- Kohn GP, Dirks RC, Ansari MT, Clay J, Dunst CM, Lundell L, Marks JM, Molena D, Rooker C, Saxena P, Swanstrom L, Wong RK, Pryor AD, Stefanidis D (2021) SAGES guidelines for the use of peroral endoscopic myotomy (POEM) for the treatment of achalasia. Surg Endosc 35:1931-1948
- Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.0 (updated July 2019). Cochrane, 2019. Available from http://www.training.cochrane.org/handbook.
- Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ 339:b2535
- Cumpston M, Chandler J. Chapter IV: Updating a review. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.0 (updated August 2019). Cochrane, 2019. Available from http://www.training.cochrane.org/handbook.
- Covidence systematic review software, Veritas Health Innovation, Melbourne, Australia. Available at http://www.covidence.org
- Sterne JAC, Savovic J, Page MJ, Elbers RG, Blencowe NS, Boutron I, Cates CJ, Cheng HY, Corbett MS, Eldridge SM, Emberson JR, Hernan MA, Hopewell S, Hrobjartsson A, Junqueira DR, Juni P, Kirkham JJ, Lasserson T, Li T, McAleenan A, Reeves BC, Shepperd S, Shrier I, Stewart LA, Tilling K, White IR, Whiting PF, Higgins JPT (2019) RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ 366:l4898
- Review Manager (RevMan) (2014), The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen
- Schünemann H, Brożek J, Guyatt G, Oxman A, editors. (2013) GRADE handbook for grading quality of evidence and strength of recommendations. Updated October 2013. The GRADE Working Group, 2013. Available from http://guidelinedevelopment.org/handbook
- Sanaka MR, Hayat U, Thota PN, Jegadeesan R, Ray M, Gabbard SL, Wadhwa N, Lopez R, Baker ME, Murthy S, Raja S (2016) Efficacy of peroral endoscopic myotomy vs other achalasia treatments in improving esophageal function. World J Gastroenterol 22:4918-4925
- Peng L, Tian S, Du C, Yuan Z, Guo M, Lu L (2017) Outcome of peroral endoscopic myotomy (POEM) for treating achalasia compared with laparoscopic Heller myotomy (LHM). Surg Laparosc Endosc Percutan Tech 27:60-64
- Podboy AJ, Hwang JH, Rivas H, Azagury D, Hawn M, Lau J, Kamal A, Friedland S, Triadafilopoulos G, Zikos T, Clarke JO (2021) Long-term outcomes of per-oral endoscopic myotomy compared to laparoscopic Heller myotomy for achalasia: a single-center experience. Surg Endosc 35:792-801
- Tan Y, Zhu H, Li C, Chu Y, Huo J, Liu D (2016) Comparison of peroral endoscopic myotomy and endoscopic balloon dilation for primary treatment of pediatric achalasia. J Pediatr Surg 51:1613-1618
- Caldaro T, Familiari P, Romeo EF, Gigante G, Marchese M, Contini AC, Federici di Abriola G, Cucchiara S, De Angelis P, Torroni F, Dall’Oglio L, Costamagna G (2015) Treatment of esophageal achalasia in children: Today and tomorrow. J Pediatr Surg 50:726-730
- Hungness ES, Teitelbaum EN, Santos BF, Arafat FO, Pandolfino JE, Kahrilas PJ, Soper NJ (2013) Comparison of perioperative outcomes between peroral esophageal myotomy (POEM) and laparoscopic Heller myotomy. J Gastrointest Surg 17:228-235
- Zheng Z, Zhao C, Su S, Fan X, Zhao W, Wang B, Jin H, Zhang L, Wang T, Wang B (2019) Peroral endoscopic myotomy versus pneumatic dilation – result from a retrospective study with 1-year follow-up. Z Gastroenterol 57:304-311
- Kumbhari V, Tieu AH, Onimaru M, El Zein MH, Teitelbaum EN, Ujiki MB, Gitelis ME, Modayil RJ, Hungness ES, Stavropoulos SN, Shiwaku H, Kunda R, Chiu P, Saxena P, Messallam AA, Inoue H, Khashab MA (2015) Peroral endoscopic myotomy (POEM) vs laparoscopic Heller myotomy (LHM) for the treatment of Type III achalasia in 75 patients: a multicenter comparative study. Endosc Int Open 3:E195-201
- Bhayani NH, Kurian AA, Dunst CM, Sharata AM, Rieder E, Swanstrom LL (2014) A comparative study on comprehensive, objective outcomes of laparoscopic Heller myotomy with per-oral endoscopic myotomy (POEM) for achalasia. Ann Surg 259:1098-1103
- de Pascale S, Repici A, Puccetti F, Carlani E, Rosati R, Fumagalli U (2017) Peroral endoscopic myotomy versus surgical myotomy for primary achalasia: single-center, retrospective analysis of 74 patients. Dis Esophagus 30:1-7
- Fumagalli U, Rosati R, De Pascale S, Porta M, Carlani E, Pestalozza A, Repici A (2016) Repeated surgical or endoscopic myotomy for recurrent dysphagia in patients after previous myotomy for achalasia. J Gastrointest Surg 20:494-499
- Greenleaf EK, Winder JS, Hollenbeak CS, Haluck RS, Mathew A, Pauli EM (2018) Cost-effectiveness of per oral endoscopic myotomy relative to laparoscopic Heller myotomy for the treatment of achalasia. Surg Endosc 32:39-45
- Hanna AN, Datta J, Ginzberg S, Dasher K, Ginsberg GG, Dempsey DT (2018) Laparoscopic Heller Myotomy vs per oral endoscopic myotomy: Patient-reported outcomes at a single institution. J Am Coll Surg 226:465-472e461
- Khashab MA, Kumbhari V, Tieu AH, El Zein MH, Ismail A, Ngamruengphong S, Singh VK, Kalloo AN, Clarke JO, Stein EM (2017) Peroral endoscopic myotomy achieves similar clinical response but incurs lesser charges compared to robotic heller myotomy. Saudi J Gastroenterol 23:91-96
- Kim GH, Jung KW, Jung HY, Kim MJ, Na HK, Ahn JY, Lee JH, Kim DH, Choi KD, Song HJ, Lee GH (2019) Superior clinical outcomes of peroral endoscopic myotomy compared with balloon dilation in all achalasia subtypes. J Gastroenterol Hepatol 34:659-665
- Kumagai K, Tsai JA, Thorell A, Lundell L, Hakanson B (2015) Per-oral endoscopic myotomy for achalasia. Are results comparable to laparoscopic Heller myotomy? Scand J Gastroenterol 50:505-512
- Leeds SG, Burdick JS, Ogola GO, Ontiveros E (2017) Comparison of outcomes of laparoscopic Heller myotomy versus per-oral endoscopic myotomy for management of achalasia. Proc (Bayl Univ Med Cent) 30:419-423
- Meng F, Li P, Wang Y, Ji M, Wu Y, Yu L, Niu Y, Lv F, Li W, Li W, Zhai H, Wu S, Zhang S (2017) Peroral endoscopic myotomy compared with pneumatic dilation for newly diagnosed achalasia. Surg Endosc 31:4665-4672
- Miller HJ, Neupane R, Fayezizadeh M, Majumder A, Marks JM (2017) POEM is a cost-effective procedure: cost-utility analysis of endoscopic and surgical treatment options in the management of achalasia. Surg Endosc 31:1636-1642
- Ponds FA, Fockens P, Lei A, Neuhaus H, Beyna T, Kandler J, Frieling T, Chiu PWY, Wu JCY, Wong VWY, Costamagna G, Familiari P, Kahrilas PJ, Pandolfino JE, Smout A, Bredenoord AJ (2019) Effect of peroral endoscopic myotomy vs pneumatic dilation on symptom severity and treatment outcomes among treatment-naive patients with achalasia: a randomized clinical trial. JAMA 322:134-144
- Ramirez M, Zubieta C, Ciotola F, Amenabar A, Badaloni A, Nachman F, Nieponice A (2018) Per oral endoscopic myotomy vs. laparoscopic Heller myotomy, does gastric extension length matter? Surg Endosc 32:282-288
- Sanaka MR, Thota PN, Parikh MP, Hayat U, Gupta NM, Gabbard S, Lopez R, Murthy S, Raja S (2019) Peroral endoscopic myotomy leads to higher rates of abnormal esophageal acid exposure than laparoscopic Heller myotomy in achalasia. Surg Endosc 33:2284-2292
- Schneider AM, Louie BE, Warren HF, Farivar AS, Schembre DB, Aye RW (2016) A matched comparison of per oral endoscopic myotomy to laparoscopic Heller myotomy in the treatment of achalasia. J Gastrointest Surg 20:1789-1796
- Ujiki MB, Yetasook AK, Zapf M, Linn JG, Carbray JM, Denham W (2013) Peroral endoscopic myotomy: A short-term comparison with the standard laparoscopic approach. Surgery 154:893-897; discussion 897-900
- Wang X, Tan Y, Lv L, Zhu H, Chu Y, Li C, Liu D (2016) Peroral endoscopic myotomy versus pneumatic dilation for achalasia in patients aged ≥ 65 years. Rev Esp Enferm Dig 108:637-641
- Ward MA, Gitelis M, Patel L, Vigneswaran Y, Carbray J, Ujiki MB (2017) Outcomes in patients with over 1-year follow-up after peroral endoscopic myotomy (POEM). Surg Endosc 31:1550-1557
- Werner YB, Hakanson B, Martinek J, Repici A, von Rahden BHA, Bredenoord AJ, Bisschops R, Messmann H, Vollberg MC, Noder T, Kersten JF, Mann O, Izbicki J, Pazdro A, Fumagalli U, Rosati R, Germer CT, Schijven MP, Emmermann A, von Renteln D, Fockens P, Boeckxstaens G, Rosch T (2019) Endoscopic or surgical myotomy in patients with idiopathic achalasia. N Engl J Med 381:2219-2229
- Wirsching A, Boshier PR, Klevebro F, Kaplan SJ, Seesing MF, El-Moslimany R, Ross A, Low DE (2019) Comparison of costs and short-term clinical outcomes of per-oral endoscopic myotomy and laparoscopic Heller myotomy. Am J Surg 218:706-711
- Eckardt AJ, Eckardt VF (2011) Treatment and surveillance strategies in achalasia: an update. Nat Rev Gastroenterol Hepatol 8:311-319
- de Heer J, Desai M, Boeckxstaens G, Zaninotto G, Fuchs KH, Sharma P, Schachschal G, Mann O, Rösch T, Werner Y (2021) Pneumatic balloon dilatation versus laparoscopic Heller myotomy for achalasia: a failed attempt at meta-analysis. Surg Endosc 35:602-611
- Januszewicz W, Hartley J, Waldock W, Roberts G, Alias B, Hobson A, Wernisch L, di Pietro M (2019) Endoscopic measurement of gastric pH associates with persistent acid reflux in patients treated with proton-pump inhibitors for gastroesophageal reflux disease. United European Gastroenterol J 7:1389-1398
- Jones EL, Meara MP, Schwartz JS, Hazey JW, Perry KA (2016) Gastroesophageal reflux symptoms do not correlate with objective pH testing after peroral endoscopic myotomy. Surg Endosc 30:947-952
- Kumbhari V, Familiari P, Bjerregaard NC, Pioche M, Jones E, Ko WJ, Hayee B, Cali A, Ngamruengphong S, Mion F, Hernaez R, Roman S, Tieu AH, El Zein M, Ajayi T, Haji A, Cho JY, Hazey J, Perry KA, Ponchon T, Kunda R, Costamagna G, Khashab MA (2017) Gastroesophageal reflux after peroral endoscopic myotomy: a multicenter case-control study. Endoscopy 49:634-642
- Aiolfi A, Tornese S, Bonitta G, Cavalli M, Rausa E, Micheletto G, Campanelli G, Bona D (2020) Dor versus Toupet fundoplication after Laparoscopic Heller Myotomy: Systematic review and Bayesian meta-analysis of randomized controlled trials. Asian J Surg 43:20-28
- Siddaiah-Subramanya M, Yunus RM, Khan S, Memon B, Memon MA (2019) Anterior Dor or Posterior Toupet with Heller Myotomy for Achalasia Cardia: A Systematic Review and Meta-Analysis. World J Surg 43:1563-1570
- Velanovich V (2007) The development of the GERD-HRQL symptom severity instrument. Dis Esophagus 20:130-134
- Hartling L, Featherstone R, Nuspl M, Shave K, Dryden DM, Vandermeer B (2017) Grey literature in systematic reviews: a cross-sectional study of the contribution of non-English reports, unpublished studies and dissertations to the results of meta-analyses in child-relevant reviews. BMC Med Res Methodol 17:64
- Aiolfi A, Bona D, Riva CG, Micheletto G, Rausa E, Campanelli G, Olmo G, Bonitta G, Bonavina L (2020) Systematic Review and Bayesian Network Meta-Analysis Comparing Laparoscopic Heller Myotomy, Pneumatic Dilatation, and Peroral Endoscopic Myotomy for Esophageal Achalasia. J Laparoendosc Adv Surg Tech A 30:147-155
- Awaiz A, Yunus RM, Khan S, Memon B, Memon MA (2017) Systematic Review and Meta-Analysis of Perioperative Outcomes of Peroral Endoscopic Myotomy (POEM) and Laparoscopic Heller Myotomy (LHM) for Achalasia. Surg Laparosc Endosc Percutan Tech 27:123-131
- Evensen H, Kristensen V, Larssen L, Sandstad O, Hauge T, Medhus AW (2019) Outcome of peroral endoscopic myotomy (POEM) in treatment-naive patients. A systematic review. Scand J Gastroenterol 54:1-7
- Martins RK, Ribeiro IB, De Moura DTH, Hathorn KE, Bernardo WM, De Moura EGH (2020) Peroral (POEM) or surgical myotomy for the treatment of achalasia: a systematic review and meta-analysis. Arq Gastroenterol 57:79-86
- Cappell MS, Stavropoulos SN, Friedel D (2020) Updated Systematic Review of Achalasia, with a Focus on POEM Therapy. Dig Dis Sci 65:38-65
- Teitelbaum EN, Rajeswaran S, Zhang R, Sieberg RT, Miller FH, Soper NJ, Hungness ES (2013) Peroral esophageal myotomy (POEM) and laparoscopic Heller myotomy produce a similar short-term anatomic and functional effect. Surgery 154:885-891; discussion 891-892
- Chan SM, Wu JC, Teoh AY, Yip HC, Ng EK, Lau JY, Chiu PW (2016) Comparison of early outcomes and quality of life after laparoscopic Heller’s cardiomyotomy to peroral endoscopic myotomy for treatment of achalasia. Dig Endosc 28:27-32
Guidelines for clinical practice are intended to indicate preferable approaches to medical problems as established by experts in the field. These recommendations will be based on existing data or a consensus of expert opinion when little or no data are available. Guidelines are applicable to all physicians who address the clinical problem(s) without regard to specialty training or interests, and are intended to indicate the preferable, but not necessarily the only acceptable approaches due to the complexity of the healthcare environment. Guidelines are intended to be flexible. Given the wide range of specifics in any health care problem, the surgeon must always choose the course best suited to the individual patient and the variables in existence at the moment of decision.
Guidelines are developed under the auspices of the Society of American Gastrointestinal and Endoscopic Surgeons and its various committees, and approved by the Board of Governors. Each clinical practice guideline has been systematically researched, reviewed and revised by the guidelines committee, and reviewed by an appropriate multidisciplinary team. The recommendations are therefore considered valid at the time of their production based on the data available. Each guideline is scheduled for periodic review to allow incorporation of pertinent new developments in medical research, knowledge, and practice.