Multibranch semantic image segmentation model based on edge optimization and category perception

PLoS One. 2024 Dec 19;19(12):e0315621. doi: 10.1371/journal.pone.0315621. eCollection 2024.

Abstract

In semantic image segmentation tasks, most methods fail to fully exploit features at different scales and levels, instead performing upsampling directly. This can cause useful information to be mistaken for redundant information and discarded, which in turn leads to object segmentation confusion. Moreover, as the convolutional layers deepen, the loss of spatial detail degrades segmentation accuracy at object boundaries. To address these problems, we propose an edge optimization and category-aware multibranch semantic segmentation network (ECMNet). First, an attention-guided multibranch fusion backbone connects features of different resolutions in parallel and performs multiscale information interaction to reduce the loss of spatial detail. Second, a category perception module learns category feature representations and guides the pixel classification process through an attention mechanism to improve segmentation accuracy. Finally, an edge optimization module integrates edge features into the middle and deep supervision layers of the network through an adaptive algorithm, strengthening the network's ability to represent edge features and improving edge segmentation. The experimental results show that the mIoU reaches 79.2% on the Cityscapes dataset and 79.6% on the CamVid dataset, that the parameter count is significantly lower than that of comparable models, and that the proposed method effectively improves semantic image segmentation performance and alleviates the category segmentation confusion problem, indicating practical application potential.
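The paper does not include code, but the category perception mechanism it describes — learning per-category feature representations and using attention between pixel features and those representations to guide classification — can be sketched generically. The following is a minimal NumPy illustration under assumed shapes (N pixels, C feature channels, K categories); the function name, scaling, and residual aggregation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def category_aware_attention(pixel_feats, class_embed):
    """Hypothetical sketch of category-guided attention.

    pixel_feats: (N, C) flattened per-pixel features
    class_embed: (K, C) learned category representations
    Returns per-pixel attention over categories and refined features.
    """
    # scaled similarity between each pixel and each category embedding
    scores = pixel_feats @ class_embed.T / np.sqrt(pixel_feats.shape[-1])
    attn = softmax(scores, axis=-1)           # (N, K), rows sum to 1
    # inject category context back into the pixel features (residual form)
    refined = pixel_feats + attn @ class_embed  # (N, C)
    return attn, refined

rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 8))   # 16 pixels, 8-dim features
embed = rng.standard_normal((4, 8))    # 4 hypothetical categories
attn, refined = category_aware_attention(feats, embed)
```

In a trained network, `class_embed` would be learned parameters and `attn` would serve both as a soft classification signal and as weights for folding category context back into the feature map.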

MeSH terms

  • Algorithms*
  • Humans
  • Image Processing, Computer-Assisted* / methods
  • Neural Networks, Computer
  • Semantics*

Grants and funding

This study was funded by the National Natural Science Foundation of China (Grant No. 62372397) and the Shanxi Province Natural Fund Project (Grant Nos. 202203021221222, 202203021221229). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.