DOI: 10.1101/2023.05.21.541653

PENet: A phenotype encoding network for automatic extraction and representation of morphological discriminative features

Z. Zhao, Y. Lu, Y. Tong, X. Chen, M. Bai
Discriminative traits are important in biodiversity and macroevolution research, but extracting and representing these features from large natural history collections with traditional methods is challenging and time-consuming. To make full use of the collections and their associated metadata, there is an urgent need to improve the efficiency of automatic feature extraction and sample retrieval. We developed the Phenotype Encoding Network (PENet), a deep learning-based model that incorporates hashing methods to automatically extract discriminative features and encode them as hash codes. We tested PENet on six datasets, including a newly constructed beetle dataset of 6566 images from six subfamilies, which covers more than 60% of the genera in the family Scarabaeidae. PENet showed excellent performance in feature extraction and image retrieval. Two visualization methods, t-SNE and Grad-CAM, were used to evaluate the representational ability of the hash codes. Furthermore, a phenetic distance tree was constructed for the beetle dataset from the hash codes generated by PENet. The results indicate that the hash codes can reveal, to a certain extent, the phenetic distances and relationships among categories. PENet provides an automatic and more efficient way to extract and represent discriminative morphological features, and the generated hash codes serve as a low-dimensional carrier of discriminative features and phenotypic distance information, allowing for broader applications in systematics and ecology.
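
To illustrate how hash codes can support both image retrieval and distance-based comparison, the following is a minimal sketch and not the PENet implementation. The abstract does not state how features are binarized or how codes are compared, so sign thresholding and Hamming distance are assumptions here. The function names (`to_hash_code`, `hamming`, `retrieve`, `distance_matrix`) and the toy feature vectors are hypothetical.

```python
from typing import List, Sequence


def to_hash_code(features: Sequence[float]) -> List[int]:
    """Binarize a real-valued feature vector by sign (assumed scheme)."""
    return [1 if x > 0 else 0 for x in features]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def retrieve(query: Sequence[int], database: Sequence[Sequence[int]], k: int = 3) -> List[int]:
    """Return indices of the k database codes closest to the query."""
    ranked = sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
    return ranked[:k]


def distance_matrix(codes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Pairwise Hamming distances between all codes."""
    return [[hamming(a, b) for b in codes] for a in codes]


# Made-up 4-dimensional features standing in for network outputs.
features = [
    [0.9, -0.2, 0.4, -0.7],
    [0.8, -0.1, 0.3, 0.6],
    [-0.5, 0.7, -0.9, 0.2],
]
codes = [to_hash_code(f) for f in features]

# Retrieval: rank stored codes by Hamming distance to a query code.
query = to_hash_code([0.7, -0.3, 0.5, -0.1])
nearest = retrieve(query, codes, k=2)

# Pairwise distances between codes.
dists = distance_matrix(codes)
```

A pairwise distance matrix of this kind is the usual input to a distance-based tree-building or clustering method. Which method was used for the phenetic distance tree is not specified in the abstract.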