About the Project
Crustacea is one of the most species-rich and ecologically diverse animal groups, yet it remains among the least represented in genome databases. CrustGP is a community effort to close that gap through coordinated taxon selection, standardised assembly and deposition practices, and collaboration across the research disciplines that depend on crustacean genomic data.
The Problem
Fewer than 35 high-quality genome assemblies exist across Crustacea as of early 2025, and coverage is heavily skewed: nearly half of the 57 recognised orders are absent from multigene phylogenetic analyses, with available transcriptomic data overwhelmingly concentrated in Malacostraca. The majority of crustacean diversity, including many ecologically and economically significant lineages, remains genomically uncharacterised.
Assembly presents genuine technical challenges that compound this underrepresentation. Crustacean genomes are often large, highly repetitive, and heterozygous, properties that reduce contiguity, complicate haplotype resolution, and make annotation difficult. Field preservation conditions frequently degrade high-molecular-weight DNA, restricting access to the long-read data these genomes require. Addressing the gap therefore involves both improved coordination and continued methodological development.
Our Approach
Crustacean genome sequencing has largely been opportunistic, reflecting individual lab interests and funding opportunities rather than systematic coverage of the subphylum. Sequencing costs and assembly tools have improved substantially, but the field still lacks shared criteria for taxon prioritisation, agreed quality thresholds, and infrastructure for connecting researchers across systematics, genomics, ecology, fisheries, conservation, and aquaculture.
CrustGP addresses this by providing a coordination framework: agreed priority taxa, standardised assembly and annotation workflows, voucher-linked specimen requirements, and open data deposition as a baseline expectation. The aim is a comparative genomic resource that accumulates systematically across crustacean diversity, rather than remaining concentrated in a handful of tractable or commercially relevant species.
The Goal