<35
High-quality assemblies as of early 2025
~50%
Of crustacean orders absent from multigene analyses

The Problem

A Striking Gap

Fewer than 35 high-quality genome assemblies exist across Crustacea as of early 2025, and coverage is heavily skewed: nearly half of the 57 recognised orders are absent from multigene phylogenetic analyses, with available transcriptomic data overwhelmingly concentrated in Malacostraca. The majority of crustacean diversity, including many ecologically and economically significant lineages, remains genomically uncharacterised.

Assembly presents genuine technical challenges that compound this underrepresentation. Crustacean genomes are often large, highly repetitive, and heterozygous, properties that reduce assembly contiguity, complicate haplotype resolution, and make annotation difficult. Field preservation conditions frequently degrade high-molecular-weight DNA, restricting access to the long-read data these genomes require. Addressing the gap therefore involves both improved coordination and continued methodological development.

Our Approach

From Opportunistic to Coordinated

Crustacean genome sequencing has largely been opportunistic, reflecting individual lab interests and funding opportunities rather than systematic coverage of the subphylum. Sequencing costs and assembly tools have improved substantially, but the field still lacks shared criteria for taxon prioritisation, agreed quality thresholds, and infrastructure for connecting researchers across systematics, genomics, ecology, fisheries, conservation, and aquaculture.

CrustGP addresses this by providing a coordination framework: agreed priority taxa, standardised assembly and annotation workflows, voucher-linked specimen requirements, and open data deposition as a baseline expectation. The aim is a comparative genomic resource that accumulates systematically across crustacean diversity, rather than remaining concentrated in a handful of tractable or commercially relevant species.

The Goal

Project Priorities

01
Strategic Taxonomic Sampling
Taxon selection is guided by phylogenetic gap-filling across all major crustacean lineages, with priority given to orders and families absent from existing comparative datasets. Commercial relevance alone is not a sufficient criterion.
02
Quality & Open Data Standards
Assemblies are produced to the highest quality achievable for each taxon, with open deposition, complete metadata, voucher-linked specimens, and reproducible workflows as baseline requirements. Resources generated through CrustGP are intended for broad reuse across research communities.
03
Community Coordination
Closing the genomic gap in Crustacea requires input from systematics, genomics, ecology, conservation, fisheries, and aquaculture, disciplines that have largely operated independently. CrustGP provides the organisational structure to connect them around shared priorities and shared data.