Calibrating the taxonomy of a megadiverse insect family: 3000 DNA barcodes from geometrid type specimens (Lepidoptera, Geometridae)

It is essential that any DNA barcode reference library be based upon correctly identified specimens. The Barcode of Life Data Systems (BOLD) requires information such as images, geo-referencing, and details on the museum holding the voucher specimen for each barcode record to aid recognition of potential misidentifications. Nevertheless, there are misidentifications and incomplete identifications (e.g., to a genus or family) on BOLD, mainly for species from tropical regions. Unfortunately, experts are often unavailable to correct taxonomic assignments due to time constraints and the lack of specialists for many groups and regions. However, considerable progress could be made if barcode records were available for all type specimens. As a result of recent improvements in analytical protocols, it is now possible to recover barcode sequences from museum specimens that date to the start of taxonomic work in the 18th century. The present study discusses success in the recovery of DNA barcode sequences from 2805 type specimens of geometrid moths which represent 1965 species, corresponding to about 9% of the 23 000 described species in this family worldwide and including 1875 taxa represented by name-bearing types. Sequencing success was high (73% of specimens), even for specimens that were more than a century old. Several case studies are discussed to show the efficiency, reliability, and sustainability of this approach.


Introduction
A crisis in taxonomy, often termed the taxonomic impediment or taxonomic gap, has repeatedly been identified in recent decades (Wilson 1985; Lipscomb et al. 2003; Scotland et al. 2003; Wheeler 2004; de Carvalho et al. 2005; Crisci 2006; Godfray 2007; Miller 2007; Forum Herbulot 2014; Tahseen 2014). Despite some more optimistic views (e.g., de Carvalho et al. 2007; Tahseen 2014), the taxonomic impediment remains a major concern given the urgent need for comprehensive biodiversity assessments because of the biodiversity crisis: the risk of human activity causing mass extinction (Wilson 1985, 2003; González-Oreja 2008; Dubois 2010). In this paper, we consider the ways in which DNA barcoding can accelerate the process of taxonomic inventory and the proper application of existing species names using one insect family, the Geometridae, as a model.
The Geometridae are one of the largest families in the animal kingdom (Scoble 1999; Scoble and Hausmann 2007). Most of its known species were described between 1880 and 1940, with 20 000 valid species described prior to 1965. By comparison, 3500 species have been described over the past 50 years, just 70 species per year, a rate which is far too slow to complete the registration of all species in this family in a timely manner given the conservative estimate of 40 000 geometrid species (see Miller et al. 2016).
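The arithmetic behind this argument can be made explicit. The following is a minimal sketch that uses only figures quoted in this paper (3500 species in 50 years, about 23 000 described species, a conservative estimate of 40 000 species); the assumption that the description rate stays constant is ours and serves only as an illustration.

```python
# Figures quoted in the text; the constant-rate projection is an illustrative assumption.
described_last_50_years = 3500
years = 50
described_total = 23_000   # described species worldwide (abstract)
estimated_total = 40_000   # conservative estimate cited in the text

rate_per_year = described_last_50_years / years
remaining = estimated_total - described_total
years_needed = remaining / rate_per_year

print(f"{rate_per_year:.0f} species/year, {remaining} species remaining, "
      f"about {years_needed:.0f} years at the current rate")
```

Under these assumptions the remaining species would take well over two centuries to describe.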
Both the "traditional approach" (usually entirely morphological) to taxonomy and the modern integrative (morphological/molecular) approach are impeded not only by the limited number of taxonomists, but also by the difficulty and expense of gaining access to type material, the specimens on which the original description of each species is based. For each taxonomic revision the relevant types need to be checked, but they are usually deposited in many different natural history museums. Whilst more than 90% of geometrid type specimens are deposited in 10 major institutions, the remaining 5%-10% are widely dispersed elsewhere. Despite increasing access to online databases, the identity of a type specimen often cannot be accurately assessed by examining images. Nevertheless, many taxonomic revisions are done without direct examination of type specimens, often leading to erroneous decisions about the proper application of a particular Linnaean binomial.
DNA barcoding of Lepidoptera has proven a powerful tool for identifying (Hebert et al. 2003) and delimiting species (Ratnasingham and Hebert 2013; Hausmann et al. 2013), often revealing greater complexity than previously recognized (Mutanen et al. 2013; Hausmann et al. 2011; Huemer et al. 2014), with only a low percentage of species showing conflicts between molecular and morphological data (Hausmann et al. 2013). Yet, full resolution of the taxonomic implications of such results often requires sequence information from type material, a challenging task when century-old specimens are involved (Zimmermann et al. 2008; Dabney et al. 2013; Hebert et al. 2013). The first taxonomic revision to include sequence information from an old (150 years) type specimen (Hausmann et al. 2009a) involved a time-consuming and costly "primer walking approach". This protocol did recover the entire DNA barcode through Sanger sequencing, but it required six PCRs to recover overlapping sequence fragments and 12 sequencing reactions (forward/reverse for each amplicon; see Lees et al. 2010 for details on the protocol). Several subsequent studies have used the same approach to recover DNA barcodes from old type specimens of Lepidoptera (Hausmann et al. 2009b; Rougerie et al. 2012; Strutzenberger et al. 2012; Mutanen et al. 2015) and other groups (e.g., Puillandre et al. 2011). A recent effort at the Canadian Centre for DNA Barcoding (CCDB) aimed to develop a less-expensive protocol for obtaining sequence information from type specimens. This work examined type specimens of geometrids, mostly from the Natural History Museum (NHM, London) and the Zoologische Staatssammlung München (SNSB-ZSM, Munich). These analyses initially focused on the recovery of short fragments, 164 base pairs (bp) and (or) 94 bp from the middle of the COI barcode fragment, as these allow unambiguous species identification in more than 90% of cases (Meusnier et al. 2008) and reliable connection to full-length (658 bp) DNA barcodes from other individuals of the same species (Hebert et al. 2013). However, work was also directed toward the development of a protocol employing multiplex PCR to generate short amplicons for analysis using next-generation sequencing (NGS) to recover the full 658-bp barcode region (Prosser et al. 2016; see Speidel et al. 2015 for a recent study on North African lasiocampids involving a 194-year-old type specimen). This new approach is particularly valuable in gaining sequence information for the many species which are only known from type material and for resolving cases involving very closely related species. These new protocols are making it feasible to obtain the sequence information needed to establish the identity of type specimens in an objective, non-destructive and cost-effective fashion, eliminating this aspect of the taxonomic impediment at a stroke.
In the present study we use a large dataset of type material of various ages to explore the following issues: rates of success in sequence recovery with different protocols; time and cost of our methods compared to conventional approaches. We also explore the practical consequences of the availability of barcode data from type material, such as the level of unrecognised synonymy in Geometridae and the impact on taxonomy at the species and genus level by examining a case study, the species-rich genus Prasinocyma as currently constituted.

Sampling
Tissue samples were obtained from 3846 type specimens of geometrids. Duplicate tissue samples were taken from 110 of these specimens to provide the opportunity to compare the success in sequence recovery from separate DNA extracts prepared from the same specimen. These specimens belong to 2685 species/subspecies with 2556 taxa represented by name-bearing types, i.e., holo-, lecto-, neo-, and syntypes. Specimens from syntype series were counted as one. The number of type specimens and their type status are shown in Table 1.
The New Guinea types (NHM) were sampled as a basic part of a large project (GONGED: Geometridae of New Guinea Electronic Database; Holloway et al. 2009) to create a digital compilation of data for all geometrid species known from New Guinea (including some undescribed but clearly distinct taxa), with images of external characters of both sexes and characters of genital and abdominal morphology from slide-mounted preparations (Barrows et al. 2009; Holloway et al. 2009; Miller 2014a). When new dissections (preparation of genitalia) were required, we retained the remnant tissue for DNA analysis. We sought to use primary type material to characterize each species, but often needed to select other specimens to represent the opposite sex of the primary type(s), or to represent the type if its abdomen was lost. We have included these specimens here because they are now important voucher specimens representing a described species, and they are usually of the same age and source as the type material, thus providing additional proof of our ability to obtain sequences from old specimens. Although little used in entomology, the terms plesiotype or hypotype are often used for such specimens in other insect groups and in paleontology. Medler applied the term plesiotype in his extensive publications redescribing types of Flatidae, for example in Medler (1993): "If the primary type was a female, or syntype males were not available, then a representative male was selected as a plesiotype for illustration and measurement purposes and a blue plesiotype label attached to the specimen. Although without taxonomic status, the term plesiotype accurately identifies comparative material for future reference." The status of this useful concept as deployed in DNA barcoding could be placed on the agenda of the International Commission for Zoological Nomenclature for possible inclusion in the next revision of the Code of Zoological Nomenclature.

Specimen age
The age of the type specimen was calculated as the difference between the collection date on the label and the year of its sequence analysis (supplementary data, Table S1; Table 2). For 161 specimens in the dataset DS-GEOTYPES, a collection date on the label was lacking. For these specimens, the date of the original description was used in place of the collection date (Table 2), an approach that will underestimate the true age.
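The age rule just described can be written down compactly. This is a minimal sketch, not the tool actually used to build the age table; the function name and its parameters are hypothetical.

```python
from typing import Optional


def specimen_age(sequencing_year: int,
                 collection_year: Optional[int],
                 description_year: Optional[int]) -> Optional[int]:
    """Return the specimen age in years at the time of sequence analysis.

    The collection year on the label is preferred. If it is missing, the year
    of the original description is used instead, which underestimates the
    true age because a specimen is collected before it is described.
    """
    reference_year = collection_year if collection_year is not None else description_year
    if reference_year is None:
        return None  # no usable date at all
    return sequencing_year - reference_year


# Hypothetical examples
print(specimen_age(2013, 1900, 1905))  # label date available
print(specimen_age(2013, None, 1905))  # falls back to the description year
```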

DNA analysis
DNA extracts for the NHM types were derived from abdominal lysates that remained after preparation of specimens for genital dissections (Knölke et al. 2004, modified, see Prosser et al. 2016). The lysates were held frozen at the NHM until they were transferred to the CCDB for DNA extraction and subsequent processing. For the type specimens from all other museums, a single dry leg was removed and sent to the CCDB, where it was subjected to standard procedures for DNA extraction (Ivanova et al. 2006) with the additional precautions of performing all work in a dedicated clean room with dedicated equipment to minimize the risk of external contamination.
Notes to Table 1: *Numbers of lectotypes and syntypes are probably underestimated, while some "holotypes" may prove to be lectotypes or syntypes after thorough study of the relevant literature. †Assessments of type status were based on Scoble (1999) and may require some revision. NHM, Natural History Museum, London; SNSB-ZSM, Staatliche Naturwissenschaftliche Sammlungen Bayerns-Zoologische Staatssammlung München (Bavarian State Collection of Zoology, Munich).
PCR amplification and DNA sequencing were performed at the CCDB using three approaches (Fig. 1):
(1) Most (~670) of the younger specimens, those 0-20 years old, were analyzed via standard high-throughput protocols (Ivanova et al. 2006; deWaard et al. 2008), which can be accessed under http://www.dnabarcoding.ca/pa/ge/research/protocols. Briefly, a single pair of primers (Tables 3, 4) was used to amplify a 658-bp region near the 5′ terminus of the mitochondrial cytochrome c oxidase I (COI) gene which includes the standard barcode region for the animal kingdom (Hebert et al. 2003). Specimens that failed to generate an amplicon were reanalyzed using two pairs of primers (Tables 3, 4), in separate reactions, targeting overlapping amplicons of 407 and 307 bp that jointly yield a 658-bp COI barcode (see Hebert et al. 2013).
(2) Most (~3200) of the specimens more than 20 years old were analyzed using primers targeting a 164-bp amplicon (Tables 3, 4) within the COI barcode region. Specimens that failed to generate an amplicon were reanalyzed using primers targeting a 94-bp region and (or) a 64-bp region (Tables 3, 4).
(3) Full-length barcodes were recovered from more than 200 century-old specimens using Sanger sequence analysis of multiple short amplicons (Tables 3, 4; for details of the protocol see Lees et al. 2010). An NGS-based approach (Prosser et al. 2016) was also used to recover full-length barcodes from 101 century-old Geometridae specimens from the NHM (92) and SNSB-ZSM (9), see Fig. 1. Briefly, multiple short, overlapping DNA fragments were amplified in multiplex PCR reactions (Tables 3, 4) and sequenced on an Ion Torrent PGM (Life Technologies). Multiple samples were sequenced simultaneously by tracking the origin of sequence reads via unique multiplex identifier (MID) tags in the PCR primers. Following sequencing, reference-based assembly was used to generate sequence contigs, which are optimally 658 bp (i.e., a full-length barcode sequence).
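The idea of tracking reads back to their specimen via MID tags can be illustrated with a toy example. This is only a sketch under simplifying assumptions (each read starts with its tag, tags must match exactly, no sequencing errors); it is not the pipeline of Prosser et al. (2016), and the tag sequences, sample names, and reads are invented.

```python
from collections import defaultdict


def demultiplex(reads, mid_to_sample):
    """Assign each read to a sample by its leading MID tag and strip the tag.

    Reads that do not start with any known tag are collected under "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        for mid, sample in mid_to_sample.items():
            if read.startswith(mid):
                by_sample[sample].append(read[len(mid):])
                break
        else:
            by_sample["unassigned"].append(read)
    return dict(by_sample)


# Hypothetical tags and reads, for illustration only
mids = {"ACGTAC": "type_A", "TGCATG": "type_B"}
reads = ["ACGTACTTAGGA", "TGCATGCCATTA", "ACGTACGGATTC", "GGGGGGAAAA"]
result = demultiplex(reads, mids)
print(result)
```

After this step, the reads of each specimen would be assembled against a reference barcode; that assembly is outside the scope of the sketch.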
Because of the very low DNA concentrations in DNA extracts prepared from old specimens, there is a high risk of contamination. Quality control of the Sanger sequencing protocol was done by analyzing one or two duplicate DNA extracts from 110 specimens.
For the geometrid types from NHM London, the quality control of the sequences which were generated in parallel from the same DNA extracts with the Sanger approach and with NGS revealed a 100% match in all cases (Prosser et al. 2016). The same study showed that all DNA sequences from century-old type specimens perfectly matched (100%) sequences from recently collected specimens when these were available in the Barcode of Life Data Systems, BOLD (Ratnasingham and Hebert 2007). DNA extracts are stored at both the CCDB and in the DNA-Bank facility of the SNSB-ZSM (see http://www.zsm.mwn.de/dnabank/). All sequences are deposited in GenBank as well (see Table S1). NGS reads are available in the sequence read archives (SRA) under SRR1867944, SRR1867808, SRR1867811-SRR1867819, SRR1867935-SRR1867937, SRR1867942-SRR1867944, SRR1945335, SRR1945382-SRR1945389, SRR1946575, and SAMN04308941-SAMN04309011. Complete specimen data including images, voucher deposition, GenBank accession numbers, GPS coordinates, sequences, and trace files can easily be accessed in BOLD in the public dataset DS-GEOTYPES.

Rates of success in sequence recovery
Sequence information was recovered from 2805 of the 3846 type specimens (73%), providing coverage for 2071 of the 2685 taxa at species or subspecies level (77%), belonging to 1965 species and including 1615 name-bearing types (see Tables S1, S2; DS-GEOTYPES). The total of 3000 sequences in the dataset DS-GEOTYPES includes 102 vouchers with parallel sequence results reflecting the analysis of two different legs and 101 sequences derived through NGS records, of which 93 possess a corresponding Sanger sequence from the same type specimen.
Sanger analysis generated a COI sequence from approximately 80% of specimens less than 120 years old, but success was considerably lower in older specimens (Figs. 2, 3). By comparison, sequence recovery via NGS (Prosser et al. 2016) was much higher for the 101 old specimens, with 100% success in recovering a sequence and read lengths ranging from 213 to 610 bp (after excluding nucleotide positions labeled as "N"), even when Sanger sequencing failed completely. In total, 94% of the sequences were longer than 400 bp. An analysis of the 2839 specimens with collection dates in the dataset DS-GEOTYPES shows that for Sanger sequences there is a significant trend of decreasing sequence length with increasing age (Fig. 3, R² = 0.26, P << 0.01, zero values excluded) while there was no significant trend for the NGS sequences (R² = 0.007, P = 0.41).
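For readers who wish to repeat this kind of analysis on the public dataset, the following minimal sketch shows an ordinary least-squares fit of sequence length against specimen age, with zero-length records excluded as stated above. The function names are hypothetical and the numbers in the example are invented; they are not values from DS-GEOTYPES, and the sketch does not compute a P value.

```python
def drop_zero_lengths(ages, lengths):
    """Remove records with no recovered sequence (length 0)."""
    kept = [(a, l) for a, l in zip(ages, lengths) if l > 0]
    return [a for a, _ in kept], [l for _, l in kept]


def linear_fit_r2(x, y):
    """Least-squares line y = slope * x + intercept and its R² value."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Invented example values (age in years, sequence length in bp)
ages, lengths = drop_zero_lengths([5, 40, 80, 120, 150], [658, 407, 164, 0, 94])
slope, intercept, r2 = linear_fit_r2(ages, lengths)
print(f"slope = {slope:.2f} bp/year, R² = {r2:.2f}")
```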

Quality control Sanger sequencing
Quality control of the Sanger sequencing results was examined by analyzing a duplicate DNA extract from 110 specimens. Sequence recovery was successful for both extracts in 102 cases. The sequences were identical in 95 cases, while five specimens showed a 1-bp difference that likely reflected an artifact introduced during sequence editing. Contamination was detected in only two of the 220 DNA extracts (0.9%). Additional quality checking is possible for all type sequences by comparing the sequences with those of recent conspecific or congeneric samples.
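The comparison of duplicate extracts can be sketched as a simple pairwise check. This is an illustration only: it assumes the two sequences are already aligned to the same start position, and the classification thresholds (one mismatch tolerated as a possible editing artifact, more than one flagged for inspection) are our assumption, not the criteria applied in this study.

```python
def count_differences(seq_a, seq_b):
    """Count mismatches over the shared positions, ignoring ambiguous 'N' calls."""
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper())
               if a != b and "N" not in (a, b))


def classify_pair(seq_a, seq_b, max_minor=1):
    """Label a pair of duplicate sequences (thresholds are illustrative)."""
    diffs = count_differences(seq_a, seq_b)
    if diffs == 0:
        return "identical"
    if diffs <= max_minor:
        return "minor difference"
    return "possible contamination"


# Invented toy sequences
print(classify_pair("ACGTACGT", "ACGTACGT"))
print(classify_pair("ACGTACGT", "ACGTACGA"))
print(classify_pair("ACGTACGT", "TGCATGCA"))

# Contamination rate reported in the text: two of 220 extracts
print(f"{2 / 220 * 100:.1f}%")
```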

Quality control Sanger sequencing versus NGS
For details of the quality control of the sequences that were generated in parallel from the same DNA extracts with the Sanger sequencing and with NGS, see under Material and methods. There was 100% concordance between sequences generated from the same specimen using NGS and Sanger sequencing for the 20 type specimens examined by Prosser et al. (2016). Our extended data set (93 type specimens) confirms this result. Similarly, the 164-bp sequences recovered in this study were identical to homologous sections of full barcodes from recent specimens. For the performance of minibarcodes in the genera Prasinocyma and Albinospila see below.
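Checking that a short sequence is identical to the homologous section of a full barcode amounts to finding it inside the longer sequence. The sketch below is a simplified illustration with invented toy sequences; the function name is hypothetical, ambiguous "N" positions are treated as compatible with any base (our assumption), and gaps are not handled.

```python
def find_minibarcode(mini, full):
    """Return the start position at which `mini` matches `full`, or None.

    A position matches if both bases are equal or either one is 'N'.
    """
    mini, full = mini.upper(), full.upper()
    for start in range(len(full) - len(mini) + 1):
        window = full[start:start + len(mini)]
        if all(m == f or "N" in (m, f) for m, f in zip(mini, window)):
            return start
    return None


# Invented toy sequences
print(find_minibarcode("GGAT", "ACGGATTC"))
print(find_minibarcode("TTTT", "ACGGATTC"))
```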

Potential synonymy
The barcodes from some types (7.5% of the 2071 taxon names) showed an exact match with the barcodes from the types of other taxa in the current classification (Scoble 1999; Scoble and Hausmann 2007), suggesting the need to re-evaluate these taxa (Table S3). In a few of these cases (e.g., genus Craspedosis) the type status awaits re-examination, the material concerned being potentially "plesiotypic" (without taxonomic/nomenclatural status).
Note to Table 4: All primers are mixed in equimolar ratios. The primer combinations correspond to those in Fig. 1.
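The screening for potential synonymy described in this section can be pictured as grouping taxon names by identical type barcodes. The following is a minimal sketch with invented names and sequences; it assumes sequences of equal length and coverage, whereas a real comparison would have to handle barcodes of different lengths over their overlapping region.

```python
from collections import defaultdict


def shared_barcodes(type_barcodes):
    """Return {sequence: sorted taxon names} for sequences shared by >1 name."""
    names_by_seq = defaultdict(set)
    for name, seq in type_barcodes:
        names_by_seq[seq.upper()].add(name)
    return {seq: sorted(names)
            for seq, names in names_by_seq.items() if len(names) > 1}


# Hypothetical taxon names and toy sequences
records = [
    ("Genus_x alpha", "ACGTAC"),
    ("Genus_x beta", "ACGTAC"),
    ("Genus_y gamma", "TTGCAA"),
]
candidates = shared_barcodes(records)
print(candidates)

# Share of taxon names affected, from the figures given in the Discussion
print(f"{156 / 2071 * 100:.1f}%")
```

Names that share a barcode are only candidates for synonymy; as the text notes, each case still needs morphological re-examination.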

Case study: genus Prasinocyma and allied genera
The closely allied genera Prasinocyma, Albinospila, and Orothalassodes (see Holloway 1996) are represented on BOLD by sequences from 1043 individuals, currently assigned to 110 species with Linnaean names (binomina) and clustering to 212 BINs plus 38 clearly separate lineages (>2%) without a BIN assignment because the sequence records were too short. Reliable calibration (verification of species or subspecies names) could be performed for 56 taxa, through sequencing of type material. For the Ethiopian fauna this covers 27 of the 41 species (66%; see Fig. 4; Hausmann et al. 2016). The tissue samples of another 75 type specimens (NHM, SNSB-ZSM) currently being sequenced will provide a near-complete DNA library for African Prasinocyma. Sequences recovered from 42 of the 105 type specimens of Prasinocyma and Albinospila (secondary types included, see DS-GEOTYPES and Table S1) were shorter than 300 bp. In 12 cases a comparison was possible with full-length barcodes (658 bp) from conspecific vouchers, in five cases from tissue taken from the same specimen. The 12 minibarcodes are nested completely within the equivalent longer sequences in a neighbor-joining analysis (complete deletion, Fig. 5).
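The neighbor-joining analysis in Fig. 5 is stated to use Kimura 2-parameter (K2P) distances. As an illustration of how such a distance is obtained for one pair of aligned sequences, the sketch below implements the standard K2P formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. Whether the >2% lineage threshold mentioned above was measured with this model is our assumption. Sites with gaps or ambiguous bases are skipped for the pair, and the toy sequences are invented.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition in 100 sites
d = k2p_distance("A" * 100, "G" + "A" * 99)
print(f"K2P distance = {d:.4f}; exceeds 2%: {d > 0.02}")
```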

Immediate impact on taxonomy
This study has demonstrated both the feasibility and importance of large-scale efforts to recover DNA barcodes from type specimens. As shown by the present studies on the Geometridae, the assembly of barcode records from type specimens will represent a powerful aid to taxonomy in two ways: (1) Clarification of synonymies: The present barcode results from type material of 1965 species (roughly 9% of the known global fauna) revealed 156 cases of potential synonymy of species or subspecies names.
Although these cases await validation through morphological re-examination, the present results suggest synonymy for at least 7% of currently recognized geometrid taxa in regions that have experienced limited taxonomic work, such as Papua New Guinea. Once a DNA barcode library is available for all type specimens of geometrids, the number of cases of potential synonymy will undoubtedly increase considerably.
(2) Valid reference library for known species: The importance of modern integrative taxonomy for progress in species descriptions is well exemplified by the recent revision of members of the genus Prasinocyma from Ethiopia (Hausmann et al. 2016) and Australasia (J.D. Holloway, unpublished). Prasinocyma is taxonomically complex, one of the very few genera for which the "grand-master" of geometridology, Claude Herbulot (1908-2006), could not assign species names in his collection because of their great similarity and the large number of species. Barcode analysis of specimens from Ethiopia revealed 41 clusters and included 27 barcodes from primary type specimens, mainly those of newly described taxa (Fig. 4; Hausmann et al. 2016). The 14 species clusters lacking barcodes for the type specimens from the NHM London are currently being processed in the framework of the "Afroemeralds project" with the NHM, an effort that will lead to an integrative revision for all 300+ species of Prasinocyma from Africa, of which only 94 were described prior to revision of the Ethiopian fauna (Scoble 1999; Hausmann et al. 2016).
Fig. 4. In total, 27 barcoded type specimens are marked with a star.
Prasinocyma is also an important geometrid genus in Australasia where its species diversity is uncertain and where its generic boundaries need clarification. Holloway (1996) moved several Bornean Prasinocyma to new genera, Albinospila and Orothalassodes, noting that Prasinocyma might be restricted to Africa, with all Australasian species belonging to other genera. Accordingly, Scoble (1999) listed all Australasian and one Oriental species separately under "Prasinocyma". Although COI barcode data alone cannot resolve generic boundaries reliably, there can often be reciprocal illumination of the situation through the interplay of morphological and barcoding approaches at specific and generic levels. A maximum likelihood tree for 577 sequences (>500 bp) representing members of Prasinocyma, Albinospila, and Orothalassodes (Fig. 6) shows that African taxa almost all cluster apart from the Oriental and Australasian species, supporting Holloway's (1996) prediction. According to this analysis, Albinospila makes paraphyletic those Australasian species which are still combined with "Prasinocyma", one of which, P. corulea in Fig. 5, is the type species of the genus Pyrrhaspis (listed as synonym of Prasinocyma in current taxonomy). Another example is a COI cluster which includes Orothalassodes semimacula (Prout, 1925) (one syntype barcoded), O. retaka Holloway, 1996, O. curiosa (Swinhoe, 1902), and an unnamed African "Prasinocyma" species, suggesting the need for revision of the current generic combinations within the large Thalassodes Guenée generic complex (Holloway 1996). It may well prove to be the appropriate placement for all Australasian "Prasinocyma". Complex situations such as this one are considerably clarified when barcode data are available from type specimens.
Fig. 5. Neighbor-joining tree showing 100% concordance between sequences generated from 29 type specimens using NGS and Sanger sequencing (COI-5′ sequence data; complete deletion; Kimura 2 parameter, built with MEGA6, Tamura et al. 2013) representing 12 species of Prasinocyma and Albinospila, with specimen ID, species name, country, type status, and sequence length. In each branch the short sequences (<300 bp) perfectly match their longer equivalents from other, conspecific vouchers (>500 bp). In six cases (marked by asterisk), the tissues were taken from the same specimen, in four cases these were generated by next generation sequencing (NGS, colourized).

Vision
This study has established the feasibility of large-scale programs to generate DNA barcodes from type specimens. Moreover, DNA barcoding should ideally be adopted as a best practice standard for the designation of each new holo-, neo-, and lectotype (see Forum Herbulot 2014). Natural history museums (where name-bearing types are to be deposited according to the recommendation of the International Code of Zoological Nomenclature) should commit to allowing the sequence analysis of type specimens to facilitate taxonomic work by all researchers, including those who cannot afford barcode analysis. A precedent is set by the present study which releases data for 2805 geometrid type specimens representing 1965 species (9% of the global fauna).

Significance
The DNA barcode analysis of type specimens provides an objective, sustainable platform for taxonomy as it helps to detect synonymies and cryptic species as well as providing reliable identification of known species. Moreover, it will save time and costs by avoiding, in many cases, the need for museum visits and related morphological study including dissections, though these should ideally be conducted in parallel.
Fig. 6. Circularized maximum likelihood tree based on COI-5′ sequence data (>500 bp, maximum likelihood, Kimura 2 parameter, partial deletion, built with MEGA6, Tamura et al. 2013) for 577 specimens belonging to three genera (Prasinocyma, Albinospila, and Orothalassodes) of geometrines. The sections in blue colour refer to African taxa while those in red are Oriental and Australasian taxa. O, node of basal divergence of Orothalassodes (but including one unidentified African "Prasinocyma", see text); A, node of basal divergence of Albinospila, all other taxa are combined with Prasinocyma in current classification.
Taxonomic science has seen, in the last few years, a controversial discussion about minimum standards in alpha-taxonomy, largely motivated by the taxonomic impediment (Riedel et al. 2013; Forum Herbulot 2014; Brehm 2015). The "Forum Herbulot statement on accelerated biodiversity assessment" (Forum Herbulot 2014) proposes ways to accelerate biodiversity assessment, defines minimum description standards, and recommends for that purpose, as a key tool, fostering "projects of DNA barcoding of type specimens. National funding agencies and decision-makers should commit themselves quickly to provide substantial support and financial resources to generate DNA barcodes for all type specimens deposited in their national collections". Acceleration of species descriptions does not mean, as sometimes suspected, a return to low-quality descriptions, but instead it aims to overcome the taxonomic impediment by adopting modern technologies. In this way, taxonomy can be done in an integrative, minimalistic fashion, but much better than in the past because it is supported by digital photography of specimens and genital morphology, free online access, and type barcodes as molecular keys.
We should be careful not to give priority automatically to taxonomic decisions based on "traditional approaches", because this position would suggest that the analysis of morphology can "prove" species status while DNA barcodes cannot. Both approaches provide valuable data necessary to establish species hypotheses, but neither is sufficient to "prove" species status or species delimitation; they need to be integrated with other data. Therefore, both approaches should be recognized as complementary, as an "integrated taxonomic approach" (e.g., Teletchea 2010; Padial et al. 2010; Goldstein and deSalle 2011; Hausmann 2011; Kirichenko et al. 2015).
Since DNA barcodes have the potential to assess global biodiversity so much faster than morphological analysis, the related taxonomic descriptions and re-descriptions may be restricted to selected (essentially diagnostic) morphological characters whilst the complete set of morphological traits can be added later. Modern publication systems, linked to global databases like the Biodiversity Data Journal and similar initiatives (Penev et al. 2011; Smith et al. 2013; Hoffmann et al. 2014) provide a suitable informatics support system for this approach.
We conclude that the adoption of modern techniques coupled with free access to the resultant data can overcome major problems in taxonomy such as insufficiently detailed descriptions lacking adequate illustrations or types not easily being accessible and new descriptions based on doubtful traits/differences. Because DNA barcodes can certainly provide substantially more objective, "unique identifiers", they should be globally accessible online along with high-quality photographs (Miller 2014b). Incomplete species resolution due to barcode-sharing (or overlap) has proven a very rare phenomenon (Mutanen et al. 2012; Huemer et al. 2014; Hausmann et al. 2011), particularly so in sympatry (Hausmann et al. 2013). If each of the more than 7000 geometrid species described by Prout, Warren, Walker, Inoue, and Herbulot (Scoble et al. 1995; Scoble and Hausmann 2007) had gained both a DNA barcode and a photograph of its morphology on BOLD at the time of its description, many current taxonomic problems would have been avoided. Conversely, if these researchers and their peers had been required to follow the "modern publication standard" (including wordy introductions, differential diagnoses from too many other taxa, comprehensive keys, long phylogenetic conclusions, extensive illustrations, etc.), knowledge of geometrid biodiversity would be much poorer today. The perfect can be the enemy of the good, and there will always be a "trade-off". No living geometrid taxonomist has published more than 300 species descriptions, and very few have published more than 200. Considering the small number of active taxonomists, several hundred years will be required to fully assess geometrid diversity as there are probably 15 000-20 000 undescribed species. Additionally, if we employ these standards for re-descriptions of inadequately defined, but named species in the framework of revisions, resolution of the 35 000 descriptions of available taxa (species, subspecies, synonyms) will require at least half a millennium. Instead of waiting, we propose the "completion" of all existing descriptions through a major project involving the DNA barcoding of all type specimens with photo-documentation of the voucher, its genitalia (if available), and georeferencing. The cost of acquiring a 658-bp barcode using traditional Sanger sequencing was estimated to be roughly 4-fold higher for old type specimens than for freshly collected specimens (Strutzenberger et al. 2012). However, the rise of affordable NGS platforms with increased sequencing capacities makes it currently feasible to recover 658-bp barcodes from old type specimens for approximately the same cost as analyzing a fresh specimen via Sanger sequencing. Because the cost of sequencing type specimens using NGS will undoubtedly decrease further as the number of sequence reads increases, it is reasonable to expect that it will soon be possible to recover a full barcode record from each type specimen for less than $10. As a consequence, the comprehensive analysis of representative type material from every known species can be completed with modest investment within 20 years as the current study analyzed nearly 10% of all geometrid type specimens in less than 3 years.

Acknowledgements
J.D.H., S.E.M., and D.P. oversaw the tissue-sampling and databasing for specimens in the project "Geometridae of New Guinea Electronic Database" (GONGED) which examined type specimens from New Guinea at the NHM. For New Guinea specimens, imaging and specimen lysis was supported by the US National Institutes of Health through ICBG 5UO1TW006671 granted to University of Utah. Building the Papua New Guinea background library has been supported by the US National Science Foundation (via grants DEB-0211591, 0515678, and others), with assistance from Karolyn Darrow, Lauren Helgen, Margaret Rosati, and others. Michael Trizna helped with sophisticated tools to convert collecting dates to ages (years) for the specimen age table. Grant 2966 from the Gordon and Betty Moore Foundation made possible the sequence analysis and the development of protocols. We thank the Natural History Museum and its staff (particularly Jacqueline Mackenzie-Dodds, Geoff Martin, and John Chainey) for their support and for permitting access to type specimens. Many colleagues at the Centre for Biodiversity Genomics contributed to this study. We are particularly grateful to Evgeny Zakharov and Sujeevan Ratnasingham. John J. Wilson (associate editor), Sarah Adamowicz (editor), Gunnar Brehm, and an anonymous reviewer helped to improve the manuscript by giving numerous, very helpful comments. We thank Feza Can (Hatay), Sven Erlacher (Chemnitz), Andres Exposito (Mostoles), Gabriele Fiumi (Forlì), Claudio Flamigni (Bologna), Egbert Friedrich (Jena), Jörg Gelbrecht (Königs-Wusterhausen), John La Salle (Canberra), Jürgen Lenz (Harare), Antoine Lévêque (Beaugency), Victor Redondo (Zaragoza), Guy Sircoulomb (Paris), Peder Skou (Stenstrup), Manfred Sommerer (Munich), Dieter Stüning (Bonn), and Robert Trusch for allowing sequencing and inclusion of data from type specimens in their collections into our dataset.

Fig. 1. Flow chart showing the processing pipeline of specimens of different age categories processed using various PCR amplification strategies (for the major contributions from SNSB-ZSM Munich and NHM London). For each amplification stage, the number of primers involved and the final sequence length are shown. *A few exceptions were made for century-old specimens by applying a 12-miniprimer approach to recover the whole 658-bp COI barcode fragment.

Fig. 2. Success in recovery of COI sequence from type specimens of Geometridae via Sanger analysis versus age of vouchers.

Fig. 3. Dot plot diagram of sequence lengths versus age of the sequenced type specimens of Geometridae, based on the 2839 specimens of DS-GEOTYPES with collection dates; blue dots: Sanger analysis, with significant trend line, R² = 0.26, P << 0.01 (linear regression); orange dots: NGS analysis with non-significant line of best fit, R² = 0.007, P = 0.41.

Table 1. Number of individuals subjected to tissue sampling, and their type status.

Table 2. Numbers of specimens in each of nine age categories, subjected to tissue sampling.

Table 3. Primers employed in this study. If primers are mixed in a cocktail, the name of the cocktail is shown. Primer cocktails are mixed in equimolar ratios. N/A, not applicable.

Table 4. Primer combinations used in this study.