nnUZoo: A Unified Framework and Critical Analysis of CNN vs. Transformer vs. Mamba in Medical Image Segmentation

Software

Link to Source: GitHub, Paper

Summary: Open-source benchmarking framework comparing deep learning architectures for medical image segmentation

nnUZoo is an open-source benchmarking framework built on nnUNet and designed to support fair, rigorous comparisons of deep learning architectures for medical image segmentation. It incorporates CNN, Transformer, and Mamba-based models in a single framework, so that performance claims for different architectures can be evaluated under the same conditions. nnUZoo also includes five novel X2Net architectures that combine features from U2Net, nnUNet, CNNs, Transformers, and Mamba layers.

Graphical Abstract. Source: https://www.jacc.org/doi/10.1016/j.jacadv.2025.102168

The framework has been evaluated on six medical imaging datasets spanning microscopy, ultrasound, CT, MRI, and PET, covering a range of anatomical regions and segmentation tasks. The results indicate that traditional CNN models such as nnUNet and U2Net remain highly competitive in both speed and accuracy. Transformer-based models show promise for certain imaging modalities but require substantial computational resources. The Mamba-based SS2D2Net architecture achieved competitive accuracy with fewer parameters, at the cost of longer training time. Published as an arXiv preprint (2025), nnUZoo gives researchers an evidence base for choosing an architecture for medical image segmentation.