Cayman enables large-scale analysis of gut microbiome carbohydrate-active enzyme repertoires
Ducarmon QR, Karcher N, Giri S, Tytgat HLP, Delannoy-Bruno O, Pekel S, Springer F, Worz P, Schudoma C, Typas A, Zeller G.
Nat Microbiol (2026).
Abstract
Carbohydrate-active enzymes (CAZymes) are crucial for digesting glycans, but tools for CAZyme profiling and interpretation of substrate preferences in microbiome data are lacking. Here we develop a CAZyme profiler called Cayman (Carbohydrate Active Enzymes Profiling of Metagenomes) and a hierarchical substrate annotation scheme for use with genomic or shotgun metagenomic datasets. Using these tools, we systematically surveyed CAZymes in human gut microorganisms (n = 107,683 genomes) and identified several putative mucin-foraging bacteria, including Hungatella and Eisenbergiella species, which were confirmed experimentally. We compared CAZymes in gut metagenomes (n = 3,960) from high-income settings versus low- and middle-income settings and found that low- and middle-income setting metagenomes are enriched in fibre-degrading CAZymes, while CAZyme richness is generally higher in high-income setting metagenomes. Additional analysis (n = 1,998) indicated that metagenomes of individuals with colorectal cancer are depleted in fibre-targeting and enriched in glycosaminoglycan-targeting CAZymes. Finally, we inferred CAZyme substrates from genomic co-localization of CAZyme domains. Cayman is broadly applicable and freely available from https://github.com/zellerlab/cayman.