Original Contribution | Volume 49, Issue 5, P1129-1136, May 2023

Real-Time Automated Segmentation of Median Nerve in Dynamic Ultrasonography Using Deep Learning

  • Cheng-Liang Yeh
    Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
  • Chueh-Hung Wu
    Department of Physical Medicine and Rehabilitation, National Taiwan University Hospital, Taipei, Taiwan
    College of Medicine, National Taiwan University, Taipei, Taiwan
  • Ming-Yen Hsiao
    Department of Physical Medicine and Rehabilitation, National Taiwan University Hospital, Taipei, Taiwan
    College of Medicine, National Taiwan University, Taipei, Taiwan
  • Po-Ling Kuo (corresponding author)
    Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
    College of Medicine, National Taiwan University, Taipei, Taiwan
    Electrical Engineering Department, National Taiwan University, Taipei, Taiwan
    Correspondence: Electrical Engineering, Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, MinDa Hall Room 519, No. 1, Section 4, Roosevelt Road, Taipei 10617, Taiwan
Open Access | Published: February 03, 2023 | DOI: https://doi.org/10.1016/j.ultrasmedbio.2022.12.014

      Objective

      The morphological dynamics of the median nerve at the wrist level extracted from dynamic ultrasonography are valuable for the diagnosis and evaluation of carpal tunnel syndrome (CTS), but data extraction requires tremendous labor to manually segment the nerve across the image sequence. Our aim was to provide visually real-time, automated median nerve segmentation and subsequent data extraction in dynamic ultrasonography.

      Methods

      We proposed a deep-learning model modified from SOLOv2 and tailored for median nerve segmentation. Ensemble strategies combining several state-of-the-art models were also employed to examine whether the segmentation accuracy could be improved. Image data were acquired from nine normal participants and 59 patients with idiopathic CTS.

      Discussion

      Our model outperformed several state-of-the-art models with respect to inference speed, whereas the segmentation accuracy was on a par with that achieved by these models. When evaluated on a single 1080Ti GPU card, our model achieved an intersection over union score of 0.855 and Dice coefficient of 0.922 at 28.9 frames/s. The ensemble models slightly improved segmentation accuracy.

      Conclusion

      Our model has great potential for use in the clinical setting, as the real-time, automated extraction of the morphological dynamics of the median nerve allows clinicians to diagnose and treat CTS as the images are acquired.

      Keywords

      Abbreviations:

      MN (median nerve), CTS (carpal tunnel syndrome), CSA (cross-sectional area), DL (deep learning), CNN (convolutional neural network), FPS (frames/s), FPN (feature pyramid network), NMS (non-maximum suppression), IoU (intersection over union), MTS (Multi-training-stage)

      Introduction

      Dynamic ultrasonography is a promising tool for the diagnosis and evaluation of carpal tunnel syndrome (CTS), the entrapment of the median nerve (MN) at the wrist level, particularly in cases in which electrophysiological findings are normal [
      • Al-Hashel JY
      • Rashad HM
      • Nouh MR
      • Amro HA
      • Khuraibet AJ
      • Shamov T
      • et al.
      Sonography in carpal tunnel syndrome with normal nerve conduction studies.
      ,
      • Aseem F
      • Williams JW
      • Walker FO
      • Cartwright MS
      Neuromuscular ultrasound in patients with carpal tunnel syndrome and normal nerve conduction studies.
      ,
      • Roghani RS
      • Holisaz MT
      • Norouzi AAS
      • Delbari A
      • Gohari F
      • Lokk J
      • et al.
      Sensitivity of high-resolution ultrasonography in clinically diagnosed carpal tunnel syndrome patients with hand pain and normal nerve conduction studies.
      ]. Dynamic ultrasonography involves the acquisition of an image sequence as the patient performs a particular maneuver and provides real-time visualization of the movement of soft tissues such as tendons and nerves [
      • Chang YC
      • Wang TG
      • Wu CH
      Sonographic detection of ulnar nerve compression during elbow extension.
      ,
      • Chang KS
      • Cheng YH
      • Wu CH
      • Ozcakar L
      Dynamic ultrasound imaging for the iliotibial band/snapping hip syndrome.
      ,
      • Hsiao MY
      • Shyu SG
      • Wu CH
      • Ozcakar L
      Dynamic ultrasound imaging for type A intrasheath subluxation of the peroneal tendons.
      ,
      • Wu CH
      • Shyu SG
      • Ozcakar L
      • Wang TG
      Dynamic ultrasound imaging for peroneal tendon subluxation.
      ]. In the carpal tunnel, the nerve is located between the flexor retinaculum and flexor digital tendons. As the fingers flex and extend, the nerve is deformed and translated at the transverse plane by the movement of the adjacent flexor digital tendons. Several morphological features of the entrapped nerve, such as centroid displacement, cross-sectional area (CSA) and circularity, exhibit aberrant spatiotemporal patterns as the fingers move, and are relevant to disease severity and the effects of perineural injection [
      • Kuo TT
      • Lee MR
      • Liao YY
      • Chen JP
      • Hsu YW
      • Yeh CK
      Assessment of median nerve mobility by ultrasound dynamic imaging for diagnosing carpal tunnel syndrome.
      ,
      • Wu CH
      • Syu WT
      • Lin MT
      • Yeh CL
      • Boudier-Revéret M
      • Hsiao MY
      • et al.
      Automated segmentation of median nerve in dynamic sonography using deep learning: evaluation of model performance.
      ,
      • Park D
      Ultrasonography of the transverse movement and deformation of the median nerve and its relationships with electrophysiological severity in the early stages of carpal tunnel syndrome.
      ,
      • Roomizadeh P
      • Eftekharsadat B
      • Abedini A
      • Ranjbar-Kiyakaleyeh S
      • Yousefi N
      • Safoora E
      • et al.
      Ultrasonographic assessment of carpal tunnel syndrome severity: a systematic review and meta-analysis.
      ,
      • Wang Y
      • Filius A
      • Zhao C
      • Passe SM
      • Thoreson AR
      • An KN
      • et al.
      Altered median nerve deformation and transverse displacement during wrist movement in patients with carpal tunnel syndrome.
      ]. Of note, an array of evidence indicates that the mobility of the MN during finger motion, as estimated using dynamic ultrasonography, is reduced in patients with CTS [
      • Lin MT
      • Liu IC
      • Chang HP
      • Wu CH
      Impaired median nerve mobility in patients with carpal tunnel syndrome: a systematic review and meta-analysis.
      ]. Dynamic ultrasonography is particularly valuable in detecting unusual etiologies of CTS in symptomatic patients with normal electrophysiological findings and normal MN morphology shown in static ultrasonography, for example, CTS caused by squeezing of the MN between the flexor pollicis longus and finger flexor tendons during finger flexion and extension [
      • Hung CY
      • Lam KHS
      • Wu YT
      Dynamic ultrasound for carpal tunnel syndrome caused by squeezed median nerve between the flexor pollicis longus and flexor digitorum tendons.
      ]. Quantitative measurement of these parameters, however, requires manual segmentation of the entrapped nerve across consecutive images and demands substantial human labor. In addition, the segmentation accuracy is appreciably dependent on the experience level of the examiner, and data reproducibility could be compromised when the data are processed by examiners with varying levels of experience or acquired under highly varied conditions [
      • Impink BG
      • Gagnon D
      • Collinger JL
      • Boninger ML
      Repeatability of ultrasonographic median nerve measures.
      ,
      • Fowler JR
      • Hirsch D
      • Kruse K
      The reliability of ultrasound measurements of the median nerve at the carpal tunnel inlet.
      ]. Low inter-rater and test–retest reliability in terms of intraclass correlation coefficient have been reported for the manual measurement of MN CSA and mobility [
      • Schrier VJMM
      • Evers S
      • Geske JR
      • Kremers WK
      • Villarraga HR
      • Kakar S
      • et al.
      Median nerve transverse mobility and outcome after carpal tunnel release.
      ,
      • Gonzalez-Suarez CB
      • Buenavente MLD
      • Cua RCA
      • Fidel MBC
      • Cabrera JTC
      • Regala CFG
      Inter-rater and intra-rater reliability of sonographic median nerve and wrist measurements.
      ]. This motivates the development of deep learning (DL)-based approaches to automatically segment the entrapped nerve in the image sequence, which minimizes human labor and segmentation inconsistency because the intra-rater reliability is always equal to 1 [
      • Festen RT
      • Schrier VJMM
      • Amadio PC
      Automated segmentation of the median nerve in the carpal tunnel using U-Net.
      ].
      Several studies have reported automated segmentation of nerves in ultrasonography using DL-based approaches [
      • Festen RT
      • Schrier VJMM
      • Amadio PC
      Automated segmentation of the median nerve in the carpal tunnel using U-Net.
      ,
      • Abraham N
      • Illanko K
      • Khan N
      • Androutsos D
      Deep learning for semantic segmentation of brachial plexus nerves in ultrasound images using U-Net and M-Net.
      ,
      • Baby M
      • Jereesh AS
      Automatic nerve segmentation of ultrasound images.
      ,
      • Cosmo MD
      • Chiara Fiorentino M
      • Villani FP
      • Sartini G
      • Smerilli G
      • Filippucci E
      • et al.
      Learning-based median nerve segmentation from ultrasound images for carpal tunnel syndrome evaluation.
      ,

      Hafiane A, Vieyres P, Delbos A. Deep learning with spatiotemporal consistency for nerve segmentation in ultrasound images. arXiv 1706.05870. 2017.

      ,
      • Horng MH
      • Yang CW
      • Sun YN
      • Yang TH
      DeepNerve: a new convolutional neural network for the localization and segmentation of the median nerve in ultrasound image sequences.
      ,
      • Zhao H
      • Sun N
      Improved U-Net model for nerve segmentation.
      ,
      • Smerilli G
      • Cipolletta E
      • Sartini G
      • Moscioni E
      • Di Cosmo M
      • Fiorentino MC
      • et al.
      Development of a convolutional neural network for the identification and the measurement of the median nerve on ultrasound images acquired at carpal tunnel level.
      ,
      • Di Cosmo M
      • Fiorentino MC
      • Villani FP
      • Frontoni E
      • Smerilli G
      • Filippucci E
      • et al.
      A deep learning approach to median nerve evaluation in ultrasound images of carpal tunnel inlet.
      ], mostly those based on U-Net [
      • Ronneberger O
      • Fischer P
      • Brox T
      U-Net: Convolutional networks for biomedical image segmentation.
      ], but the application in dynamic ultrasonography is scarcely addressed. Because peripheral nerves are usually small and their echoic appearance is easily confounded by speckle noise, delineation of nerve peripheries in static ultrasonography is generally challenging. Nerve visualization in dynamic ultrasonography is expected to be more difficult than that in static ultrasonography because of motion noise and the changes in the image features of moving tissues created by tissue anisotropy [
      • Festen RT
      • Schrier VJMM
      • Amadio PC
      Automated segmentation of the median nerve in the carpal tunnel using U-Net.
      ]. The models for MN segmentation in dynamic ultrasonography could be trained on the basis of a single frame or image sequence. Horng et al. [
      • Horng MH
      • Yang CW
      • Sun YN
      • Yang TH
      DeepNerve: a new convolutional neural network for the localization and segmentation of the median nerve in ultrasound image sequences.
      ] trained U-Net-based models with an image sequence manually cropped to contain mainly the MN and found that the model integrating U-Net with two types of recurrent neural network, MaskTrack and convolutional long short-term memory, significantly outperformed simple U-Net models in segmentation accuracy. Festen et al. [
      • Festen RT
      • Schrier VJMM
      • Amadio PC
      Automated segmentation of the median nerve in the carpal tunnel using U-Net.
      ] reported that when there was no manual intervention, the segmentation accuracy for MN achieved by the U-Net-shaped model trained on the basis of a single frame was similar to that trained with the image sequence, in which the spatial information from the previous segmented frame was used by the next frame [
      • Festen RT
      • Schrier VJMM
      • Amadio PC
      Automated segmentation of the median nerve in the carpal tunnel using U-Net.
      ]. We recently investigated the feasibility of automated segmentation of MN in dynamic ultrasonography collected from 52 participants with CTS using several state-of-the-art, end-to-end convolutional neural network (CNN) models, and determined that Mask-R-CNN [
      • He K
      • Gkioxari G
      • Dollár P
      • Girshick R
      Mask R-CNN.
      ] and DeepLabV3+ [

      Chen LC, Zhu Y, Papandreou G, Schroff F, Adam HJ. Encoder–decoder with atrous separable convolution for semantic image segmentation. CoRR Abs/1802.02611. 2018.

      ] marginally outperformed U-Net and semantic FPN [
      • Kirillov A
      • Girshick R
      • He K
      • Dollár P
      Panoptic feature pyramid networks.
      ] with respect to segmentation accuracy when the models were trained on the basis of a single frame [
      • Wu CH
      • Syu WT
      • Lin MT
      • Yeh CL
      • Boudier-Revéret M
      • Hsiao MY
      • et al.
      Automated segmentation of median nerve in dynamic sonography using deep learning: evaluation of model performance.
      ].
      The main obstacle to deploying these approaches in the clinical setting is the long time taken to generate model output, resulting from computational complexity and the need for manual intervention. Compared with other imaging modalities, a major advantage of ultrasonography is its capability to provide information visually in real time. Simultaneous nerve segmentation and image acquisition, or a short delay between these two processes, allows clinicians to immediately make diagnosis and treatment decisions based on the model output, and to repeat data collection when too many segmentation errors occur in the acquired video owing to unfavorable imaging conditions.
      In this work, we propose an instance segmentation approach, SOLOv2-MN, which was modified from the recently released SOLOv2 [
      • Wang X
      • Zhang R
      • Kong T
      • Li L
      • Shen C
      SOLOv2: dynamic and fast instance segmentation.
      ] and aimed at accelerating segmentation to the speed customary in clinical operation, with segmentation accuracy at least on a par with that achieved by our previous approach [
      • Wu CH
      • Syu WT
      • Lin MT
      • Yeh CL
      • Boudier-Revéret M
      • Hsiao MY
      • et al.
      Automated segmentation of median nerve in dynamic sonography using deep learning: evaluation of model performance.
      ]. SOLOv2 is a concise, anchor box-free model that solves instance segmentation as classification tasks by categorizing individual pixels within an instance according to the size and location of that instance, and outperforms an array of state-of-the-art models for instance segmentation in both speed and accuracy. We compared the performance of the proposed model with that achieved by several state-of-the-art models for instance segmentation, including SOLOv2, Mask-R-CNN, a two-stage, region proposal-based model for instance segmentation that had been employed in MN segmentation [
      • Wu CH
      • Syu WT
      • Lin MT
      • Yeh CL
      • Boudier-Revéret M
      • Hsiao MY
      • et al.
      Automated segmentation of median nerve in dynamic sonography using deep learning: evaluation of model performance.
      ,
      • Smerilli G
      • Cipolletta E
      • Sartini G
      • Moscioni E
      • Di Cosmo M
      • Fiorentino MC
      • et al.
      Development of a convolutional neural network for the identification and the measurement of the median nerve on ultrasound images acquired at carpal tunnel level.
      ,
      • Di Cosmo M
      • Fiorentino MC
      • Villani FP
      • Frontoni E
      • Smerilli G
      • Filippucci E
      • et al.
      A deep learning approach to median nerve evaluation in ultrasound images of carpal tunnel inlet.
      ], YOLACT [

      Bolya D, Zhou C, Xiao F, Lee YJ. YOLACT: real-time instance segmentation. arXiv 1904.02689. 2019.

      ], a one-stage, fully convolutional model developed for real-time instance segmentation, and BlendMask [

      Chen H, Sun K, Tian Z, Shen C, Huang Y, Yan Y. BlendMask: top-down meets bottom-up for instance segmentation. arXiv 2001.00309. 2020.

      ], another fast model blending the top-down and bottom-up approaches for instance segmentation. Additionally, we examined whether combining the top-performing models using ensemble learning strategies further improved segmentation accuracy.

      Methods

      Data set of image sequence

      Dynamic ultrasonography of the MN in the carpal tunnel as fingers moved was acquired from recruited participants with approval from the institutional review board (IRB) at National Taiwan University Hospital (IRB No. NTUH-REC 201711014RINA, date of approval: January 17, 2018, title of clinical trial: The volume effect of hydrodissection for injection therapies in patients with carpal tunnel syndrome—evaluation model by shear wave ultrasound elastography and artificial intelligence imaging analysis, clinical trial registration https://clinicaltrials.gov identifier NCT03598322). Written informed consent was obtained from all participants in accordance with the Declaration of Helsinki. Subjects with a bifid MN were excluded because of the limited sample size. A total of 68 participants were enrolled in this study, including 9 normal participants and 59 patients with idiopathic CTS diagnosed by electrophysiological findings. The normal participants were 29.1 ± 2 y old (mean ± SD) and 44% were female; the patients with CTS were 57.7 ± 9.8 y old (mean ± SD) and 81% were female. The electrophysiological study data for the patients with CTS were 32 ± 6.4 and 27.1 ± 5.4 ms for the latency of the finger-to-wrist and palm-to-wrist segments in sensory nerve conduction studies, respectively, and 5.5 ± 1.5 ms for the distal latency in motor nerve conduction studies (mean ± SD). The mean (±SD) Boston Carpal Tunnel Syndrome Questionnaire score for the patients with CTS was 39.7 ± 13.6. All participants were asked to place the examined wrist in a neutral position with the palm facing upward. A 13- to 18-MHz linear transducer (Aplio 500, Canon Medical Systems Europe B.V., Zoetermeer, Netherlands) was placed at the level between the pisiform and scaphoid bones, and the image sequence of the MN in the transverse view was acquired by an experienced physiatrist at 38 frames/s (FPS). The participants were instructed to repeat five cycles of finger motion lasting 10–15 s, with the fingers fully extended in an open-palm posture initially, followed by full flexion in a clenched-fist posture and then back to the open-palm posture. The MN in each cine loop was manually demarcated by another expert using Labelme [

      Wada K. Labelme: image polygonal annotation with Python, <https://github.com/wkentaro/labelme>; 2016 [accessed 13 September 2021].

      ] at 4- to 10-frame intervals. There were a total of 20,294 annotated frames; about 80% (16,034 frames), 10% (2130 frames) and 10% of the total were used for model training, validation and testing, respectively. None of the images from the participants chosen for validation or testing were used for model training. Normal participants were also separated when splitting the data to ensure that the validation and testing data sets each contained data from one normal participant. We removed the subject and vendor information in individual frames and rescaled the image back to the original size of 960 × 720 pixels. RGB values of individual frames were normalized by the mean values of the total images to remove the visual enhancement applied by the machine manufacturer and the operator. We employed data augmentation by randomly cropping the image without removing the MN, flipping the image horizontally and randomly rotating the image clockwise and counterclockwise (<5°). Each augmented image was further rescaled randomly to yield a series of images ranging from 576 × 432 to 1152 × 864 pixels in size to construct an image pyramid rich in features of multiple scales. A schematic summarizing the image normalization and data augmentation process is provided in the left half of Figure 1.
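      A minimal sketch of this augmentation, assuming OpenCV for the geometric operations, is given below; the nerve-preserving random crop and the mean-RGB normalization are omitted for brevity, and the function and parameter names are illustrative rather than taken from the released code.

```python
import random
import cv2
import numpy as np

def augment(image, mask):
    """Illustrative augmentation: horizontal flip, small rotation (<5 degrees)
    and random rescaling, applied jointly to a 960 x 720 frame and its binary
    MN mask (the nerve-preserving random crop is omitted here)."""
    # Random horizontal flip
    if random.random() < 0.5:
        image, mask = cv2.flip(image, 1), cv2.flip(mask, 1)

    # Random rotation within +/- 5 degrees about the image center
    h, w = image.shape[:2]
    angle = random.uniform(-5.0, 5.0)
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    image = cv2.warpAffine(image, rot, (w, h))
    mask = cv2.warpAffine(mask, rot, (w, h), flags=cv2.INTER_NEAREST)

    # Random rescale between 0.6x and 1.2x (576 x 432 to 1152 x 864 for 960 x 720 input)
    scale = random.uniform(576 / 960, 1152 / 960)
    new_size = (int(w * scale), int(h * scale))
    image = cv2.resize(image, new_size, interpolation=cv2.INTER_LINEAR)
    mask = cv2.resize(mask, new_size, interpolation=cv2.INTER_NEAREST)
    return image, mask
```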
      Figure 1
      Figure 1Schematic illustrating the process of data augmentation, the scaling down of the input image in the inference stage and the architecture of SOLOv2-MN.

      Model design and implementation

      The architecture of the proposed SOLOv2-MN is depicted in the right half of Figure 1. The original architecture of SOLOv2 consists of a CNN and a feature pyramid network (FPN) backbone for feature extraction, followed by branches for semantic classification and mask prediction. SOLOv2 seeks to distinguish instances according to object size by assigning objects of different sizes to different levels of the FPN. To distinguish instances by their center locations, SOLOv2 divides the feature map at each level into S × S grids and assigns each instance to the grid cell containing the instance center, assuming that each grid cell belongs to one instance and thus one semantic category. In other words, there are S × S location classes for the instance center, and all the pixels of one instance are assigned to the same location class. After repeated stages of convolution, group normalization and ReLU, the semantic category branch outputs a tensor of S × S × C, where C is the class number and the C-dimensional output indicates the semantic class probability. The mask kernel branch resizes the feature map at one pyramid level of shape H × W × P into S × S × P and employs a series of convolutions to generate a kernel tensor of size S × S × D. The D-dimensional output predicts the convolution kernel weights conditioned on the individual grid cell, where D denotes the number of kernel parameters. The mask feature branch unifies the FPN features P2 to P5 into a single, high-resolution output and then employs 1 × 1 convolution, group normalization and ReLU to generate a mask feature representation of shape H × W × E. For each grid cell, the corresponding kernel predicted by the mask kernel branch is convolved with the mask feature to obtain the instance mask. Duplicated predictions are suppressed by the matrix non-maximum suppression (NMS) algorithm, which updates confidence scores by decay factors calculated from monotonically decreasing functions of prediction overlaps and conducts NMS with parallel matrix operations to improve efficiency.
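      A minimal sketch of the two mechanics described above, dynamic mask prediction and matrix NMS, is given below, assuming 1 × 1 mask kernels (so D = E) and candidates pre-sorted by confidence score; the Gaussian decay shown is one of the decay functions proposed for matrix NMS, and the code is illustrative rather than the released implementation.

```python
import torch

def dynamic_masks(kernels, mask_feat):
    """Sketch of SOLOv2-style dynamic mask prediction with 1x1 kernels.
    kernels:   (N, E)     one predicted kernel per candidate grid cell
    mask_feat: (E, H, W)  unified mask feature built from the FPN levels
    returns:   (N, H, W)  soft instance masks
    """
    # Each predicted kernel acts as a 1x1 convolution over the shared mask feature.
    masks = torch.einsum('ne,ehw->nhw', kernels, mask_feat)
    return masks.sigmoid()

def matrix_nms(masks, scores, sigma=2.0):
    """Simplified matrix NMS: decay each candidate's score according to its
    overlap (IoU) with higher-scoring candidates, using a Gaussian decay.
    Assumes masks/scores are already sorted by descending score."""
    n = masks.shape[0]
    flat = (masks > 0.5).float().reshape(n, -1)
    inter = flat @ flat.T                       # pairwise mask intersections
    areas = flat.sum(dim=1)
    union = areas[:, None] + areas[None, :] - inter
    iou = inter / union.clamp(min=1.0)
    iou = iou.triu(diagonal=1)                  # keep overlaps with higher-ranked masks only
    decay = torch.exp(-sigma * iou ** 2).min(dim=0).values
    return scores * decay                       # updated confidence scores
```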
      Our modification mainly involved simplifying the computation and tailoring the model to be more specific to MN segmentation. We reduced the grid numbers to a quarter of those originally proposed by SOLOv2 and scaled down the input images for inference to three quarters of their original size. Given that one grid cell is assumed to belong to one instance in SOLOv2, a large number of grid cells is required for an image containing many objects. As only one instance needs to be predicted in our images, we reduced the grid numbers corresponding to the FPN features P2 to P6 from the original [40 × 40, 36 × 36, 24 × 24, 16 × 16, 12 × 12] to [10 × 10, 9 × 9, 6 × 6, 4 × 4, 3 × 3]. Note that the mask predicted in the inference stage was scaled back to the original size of the input image, that is, 960 × 720 pixels. Of note, the image pyramid constructed for data augmentation was rich in features of multiple scales and assisted the model in learning precise segmentation in the scaled images. Furthermore, we customized the instance scales at each FPN level in accordance with our data set. The instance scales proposed by SOLOv2 were designed for images containing multiple, varied objects over a wide range of scales, such as those in the MS COCO data set. Comprehensive inspection of the ground truth data in our images revealed that the scale of the MN was about 100–200 pixels, estimated from the square root of nerve height × width. Thus, the instance scales employed in P2 to P6 were 192–2048, 96–480, 48–384, 24–288 and 1–192 pixels. Note that the instance scales overlapped across different FPN levels to increase the number of positive samples and improve learning efficiency across the multi-scaled features.
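      The modifications can be summarized as a configuration contrast; the dictionary below is purely illustrative (the keys are our own naming, not the released code's API), with the grid numbers, instance scales and inference scaling taken from the text above.

```python
# SOLOv2 default grid counts per FPN level P2-P6 versus the SOLOv2-MN settings.
SOLOV2_DEFAULT = {
    "num_grids": [40, 36, 24, 16, 12],
}

SOLOV2_MN = {
    # Roughly one quarter of the original grid counts: only one nerve per frame.
    "num_grids": [10, 9, 6, 4, 3],
    # Overlapping instance-scale ranges (pixels) for P2-P6, tuned to an MN scale of ~100-200 pixels.
    "scale_ranges": [(192, 2048), (96, 480), (48, 384), (24, 288), (1, 192)],
    # Inference-time input scaling: 960 x 720 frames are downscaled to three quarters.
    "inference_scale": 0.75,
}
```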
      The state-of-the-art models chosen for comparison with SOLOv2-MN were Mask-R-CNN, SOLOv2, YOLACT, BlendMask and the frameworks merging these models using ensemble learning principles. Figure 2 illustrates the three ensemble strategies adopted in the present work. The multi-training-stage ensemble used the bagging ensemble principle to minimize model overfitting (Fig. 2a). Taking SOLOv2 with five independent runs of training as an example, samples were drawn independently from the training data set in each run; the first, second, third, fourth and fifth runs stopped at 30, 40, 50, 60 and 70 epochs, respectively, to monitor model overfitting; and the predictions made by the individual runs were combined using majority voting. The learning rate warmup strategy was employed in model training, which increased the learning rate slowly to a pre-determined value and decreased it after a specific epoch. The multi-model ensemble simply combined, using majority voting, the predictions generated by the four models that were well trained with the training data set (Fig. 2b). The third strategy, inspired by the stacking principle and illustrated in Figure 2c, combined the two strategies illustrated in Figure 2a and 2b: the multi-training-stage ensemble was first employed for each of the four models, and the predictions provided by the individual ensembles were then combined using majority voting.
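      The voting step common to all three strategies can be sketched as follows; because the exact fusion rule is not detailed above, pixel-wise majority voting over binary masks is shown as one plausible reading rather than the exact implementation.

```python
import numpy as np

def majority_vote(masks):
    """Pixel-wise majority voting over binary masks predicted for the same
    frame by several models (or several training stages of one model).
    masks: list of (H, W) boolean arrays. Returns the fused (H, W) mask."""
    stack = np.stack([m.astype(np.uint8) for m in masks], axis=0)
    votes = stack.sum(axis=0)
    return votes * 2 > len(masks)   # a pixel is kept if more than half of the masks agree

# Example: fuse the outputs of Mask-R-CNN, YOLACT, SOLOv2 and BlendMask for one frame:
# fused = majority_vote([mask_rcnn_pred, yolact_pred, solov2_pred, blendmask_pred])
```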
      Figure 2
      Figure 2Schematics of the ensemble models. (a) Multi-training-stage (MTS) ensemble using SOLOv2 as the example. (b) Multi-model ensemble. (c) Multi-model ensemble using the multi-training-stage ensemble of the individual model.
      Figure 3
      Figure 3Visualization of median nerve (MN) boundaries predicted by various models in selected frames from an image sequence. Color representation: ground truth (green); YOLACT (yellow); multi-model ensemble using multi-training-stage ensemble of individual model (blue); BlendMask (purple); SOLOv2 (magenta); SOLOv2-MN (red); Mask-R-CNN (orange).
      All models were implemented with ResNet-50-FPN or ResNet-101-FPN pre-trained on ImageNet as the network backbone, and the networks for instance segmentation were initialized with weights pre-trained on the MS COCO data set for the 3× schedule, roughly 37 COCO epochs. Most of the hyperparameters for model training were initially set as follows: batch size of 12, stochastic gradient descent optimizer, warmup epoch of 1, warmup factor of 0.001, learning rate of 0.001, weight decay of 0.00001, momentum of 0.99 and maximal training epoch of 70. The parameters were fine-tuned for individual models according to the training results. Model inference was evaluated using a single Nvidia GeForce GTX 1080Ti GPU card, with the input images scaled down to three quarters of their original size.
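      A minimal sketch of the warmup-then-decay learning-rate schedule implied by these hyperparameters is given below; the base learning rate, warmup factor and warmup length follow the text, whereas the decay epoch and decay factor are illustrative placeholders.

```python
def learning_rate(epoch, base_lr=0.001, warmup_epochs=1, warmup_factor=0.001,
                  decay_epoch=50, decay_factor=0.1):
    """Illustrative warmup schedule: ramp linearly from base_lr * warmup_factor
    to base_lr during warmup, then drop by decay_factor after decay_epoch."""
    if epoch < warmup_epochs:
        alpha = epoch / warmup_epochs
        return base_lr * (warmup_factor * (1 - alpha) + alpha)
    if epoch >= decay_epoch:
        return base_lr * decay_factor
    return base_lr
```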

      Performance evaluation

      We evaluated model performance with the average intersection over union (IoU) value, average precision, average recall, Dice coefficient and inference speed.
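      For a single frame with binary masks, the IoU and Dice coefficient used throughout this work can be computed as in the following illustrative sketch.

```python
import numpy as np

def iou_and_dice(pred, gt):
    """Intersection-over-union and Dice coefficient between a predicted binary
    mask and the ground-truth mask, both given as (H, W) boolean arrays."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()
    iou = inter / union if union else 1.0     # empty-vs-empty counts as perfect agreement
    dice = 2 * inter / total if total else 1.0
    return iou, dice
```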

      Results

      Table 1 summarizes the segmentation performance of the five models with either ResNet50 or ResNet101 as the CNN backbone. SOLOv2-MN outperformed the other models in inference speed on the test data set while keeping segmentation accuracy on a par with that of SOLOv2 and Mask-R-CNN. Given input images of 960 × 720 pixels, SOLOv2-MN segmented the MN at 28.9 FPS on average, which is roughly 50% faster than the second fastest model (BlendMask with ResNet50) and visually real time for the operator. When implemented with ResNet50, the best IoU score was achieved by YOLACT, followed by SOLOv2, SOLOv2-MN and BlendMask, with Mask-R-CNN in last place. The increase in model complexity with use of deeper CNN backbones slightly improved segmentation accuracy in Mask-R-CNN and SOLOv2, but yielded the opposite effect in YOLACT and BlendMask. Note that the proposed SOLOv2-MN was intended to speed up the inference process and was not implemented with the deeper backbone.
      Table 1. Performance of models implemented with various backbones

      Model      | Backbone       | Average precision | Average recall | Average IoU | Dice coefficient | Inference speed (FPS)
      SOLOv2-MN  | ResNet-50-FPN  | 72.672 | 0.759 | 0.855 | 0.922 | 28.9
      Mask-R-CNN | ResNet-50-FPN  | 71.602 | 0.757 | 0.842 | 0.914 | 17.6
      Mask-R-CNN | ResNet-101-FPN | 72.698 | 0.763 | 0.853 | 0.921 | 14.5
      YOLACT     | ResNet-50-FPN  | 73.700 | 0.777 | 0.861 | 0.925 | 13.0
      YOLACT     | ResNet-101-FPN | 73.400 | 0.776 | 0.862 | 0.926 | 10.2
      SOLOv2     | ResNet-50-FPN  | 72.926 | 0.758 | 0.857 | 0.921 | 18.7
      SOLOv2     | ResNet-101-FPN | 73.559 | 0.765 | 0.857 | 0.923 | 14.5
      BlendMask  | ResNet-50-FPN  | 71.767 | 0.757 | 0.852 | 0.920 | 19.2
      BlendMask  | ResNet-101-FPN | 70.375 | 0.744 | 0.845 | 0.916 | 14.6

      FPS, frames/s; IoU, intersection over union.
      Table 2 outlines the segmentation performance of ensemble learning using different strategies. The segmentation accuracies achieved by the various ensemble models were close. The multi-model ensemble and the multi-model ensemble using the multi-training-stage ensemble of the individual models slightly outperformed the others, followed by the multi-training-stage ensemble of YOLACT. The multi-model ensemble outperformed the multi-training-stage ensembles of the various models in terms of IoU score by 0.3%–1.2%, and surpassed the single models implemented with ResNet-101-FPN by 1.2%, 0.3%, 0.8% and 2% of the scores achieved by Mask-R-CNN, YOLACT, SOLOv2 and BlendMask, respectively. In general, multi-training-stage ensemble learning did not improve, or only mildly improved, the segmentation accuracy of each model, as seen by comparing the results in Table 2 with those in Table 1.
      Table 2. Performance of ensemble models using various strategies

      Ensemble strategy                  | Combined models                                                   | Average IoU | Dice coefficient
      Multi-training stage               | Mask-R-CNN with ResNet-101-FPN, stages of 20, 25 and 30 epochs    | 0.855 | 0.922
      Multi-training stage               | YOLACT with ResNet-101-FPN, stages of 22, 29 and 37 epochs        | 0.862 | 0.926
      Multi-training stage               | SOLOv2 with ResNet-101-FPN, stages of 45, 60 and 70 epochs        | 0.856 | 0.923
      Multi-training stage               | BlendMask with ResNet-101-FPN, stages of 60, 70 and 80 epochs     | 0.853 | 0.921
      Multi-model                        | Mask-R-CNN, YOLACT, SOLOv2 and BlendMask, all with ResNet-101-FPN | 0.866 | 0.928
      Multi-training stage + multi-model | Mask-R-CNN, YOLACT, SOLOv2 and BlendMask, all with ResNet-101-FPN | 0.866 | 0.928

      IoU, intersection over union.
      Figure 3 illustrates the visual effects of MN segmentation across an image sequence predicted by the different models. Because there were only subtle changes in MN morphology and position between consecutive images, the images chosen for illustration were 8 frames apart. In general, the disparity between the MN boundaries demarcated by the ground truth (green) and those predicted by the models was smallest for YOLACT (yellow) and the multi-model ensemble using the multi-training-stage ensemble of the individual models (blue); largest for BlendMask (purple); and in between for SOLOv2 (magenta), SOLOv2-MN (red) and Mask-R-CNN (orange). These results are in line with those outlined in Tables 1 and 2.
      Figure 4 illustrates typical image conditions that usually yielded low IoU scores: the original images are displayed in the left column, and the right column shows the nerves demarcated by the ground truth (green) and model prediction (red); the number at the bottom of each panel denotes the IoU score. These examples illustrate frequently encountered challenges in MN segmentation for dynamic ultrasonography using DL. Unlike static ultrasonography, dynamic ultrasonography requires the participant to perform a particular maneuver while the images are acquired. This may cause relative motion or tilting between the probe and the participant's skin, resulting in changes in the anatomical plane and in the echoic appearance of anisotropic structures such as the nerve. The anisotropic nature of the nerve also rendered its echoic pattern vulnerable to change during motion even when the anatomical plane was kept stationary throughout image acquisition. As in the examples illustrated in Figure 4a and 4b, the honeycomb echotexture normally seen in the nerve interior disappeared when the MN was moved by the adjacent tissues. Under that condition, anatomical structures resembling the shape of the MN and having a honeycomb-like appearance, such as the branched vessel in Figure 4a and the muscle belly in Figure 4b, may receive higher confidence scores from the model and be mistaken for the MN. The ambiguously honeycomb-like appearance of these structures resulted primarily from tissue deformation or tissue anisotropy. The branched vessel may haphazardly exhibit a honeycomb-like appearance when the branch transection is oriented on the scanning plane and the branched lumens are crowded together owing to the movement of the adjacent tissues. When the probe is scanned exactly on the transection of the pennate structure or the fascicles of the hand muscle, the muscle may exhibit a honeycomb appearance similar to the nerve interior, with a hyperechoic outline mimicking the epineurium. When the image was not acquired fast enough to capture the moving nerve, the nerve appeared blurred, which made it more difficult for the model to precisely demarcate the nerve peripheries, as in the example in Figure 4c. Furthermore, the nerve in the carpal tunnel is surrounded by flexor tendons, which are also anisotropic structures that resemble the nerve in shape and may be mistaken for the whole nerve or part of the nerve by the model, as in the example in Figure 4d.
      Figure 4
      Figure 4Examples of median nerve (MN) images with low intersection over union (IoU) scores. In the left and right columns are the input image and the image superimposed with the MN peripheries delineated by the ground truth (green) and model prediction (red), respectively. The number at the bottom of the panels in the right column indicates the associated IoU score.

      Discussion

      In this work, we proposed a lightweight instance segmentation model, SOLOv2-MN, tailored for real-time segmentation of the MN in dynamic ultrasonography. When compared with several state-of-the-art instance segmentation models, including Mask-R-CNN, SOLOv2, YOLACT and BlendMask, and with the frameworks merging these models using ensemble learning strategies, SOLOv2-MN far outperformed the other models with respect to inference speed, whereas its segmentation accuracy was close to theirs. When evaluated on a single Nvidia GeForce GTX 1080Ti GPU card, the MN was segmented by SOLOv2-MN at an average speed of 28.9 FPS, which is close to the frame rate customarily used in clinical operation. The best accuracy was gained by the multi-model ensemble, followed by YOLACT, SOLOv2, SOLOv2-MN and Mask-R-CNN, with BlendMask in last place. Our results indicate that SOLOv2-MN segmented the MN at a speed that is visually real time for the operator, without substantial compromise of segmentation accuracy. Table 3 summarizes the model type, data set characteristics and performance of several recent works on MN segmentation for ultrasonography of the carpal tunnel using DL approaches. Note that a quantitative comparison of model performance between these approaches cannot be made, as the reported performances were evaluated on different data sets.
      Table 3. Recent studies addressing DL-based MN segmentation for ultrasonography of the carpal tunnel

      Study                | Main model                   | US type | Participant No.    | Frame No. | F score | IoU   | DC    | IS (FPS)
      Horng et al. 2020    | U-Net + convLSTM + MaskTrack | Dynamic | 4 CTS, 2 normal    | ~10,080   | 0.901   | N/A   | 0.898 | N/A
      Wu et al. 2021       | Mask-R-CNN, DeepLabv3+       | Dynamic | 52 CTS             | 18,625    | N/A     | 0.832 | N/A   | 11.8
      Festen et al. 2021   | U-Net                        | Dynamic | 99 CTS             | 5,560     | N/A     | N/A   | 0.88  | N/A
      Di Cosmo et al. 2021 | Mask-R-CNN                   | Static  | 53 CTS             | 151       | N/A     | N/A   | 0.931 | N/A
      Di Cosmo et al. 2022 | Mask-R-CNN                   | Static  | 22 CTS, 81 not CTS | 246       | N/A     | N/A   | 0.868 | N/A
      Smerilli et al. 2022 | Mask-R-CNN                   | Static  | 22 CTS, 81 not CTS | 246       | N/A     | N/A   | 0.88  | N/A
      This work            | SOLOv2-MN                    | Dynamic | 59 CTS, 9 normal   | 20,294    | N/A     | 0.855 | 0.922 | 28.9

      CTS, carpal tunnel syndrome; DC, Dice coefficient; DL, deep learning; IoU, intersection over union; IS, inference speed; MN, median nerve; N/A, not available; US, ultrasonography.
      The integration of the predictions generated by several top-ranked models using ensemble learning did not markedly improve segmentation accuracy. Ideally, a successful ensemble model is expected to yield better generalization performance after combining several individual models. The bootstrap strategy of bagging reduced the impact of large data variance and prevented model overfitting. The learning rate warmup strategy also prevented the model from being overfitted to the data seen in the early iterations, which would otherwise result from the large weighting induced by the large loss and the small batch size limited by the memory size. The subtle improvement seen in the ensemble models may arise from the fact that most of the individual models had converged to similar local minima, leading to limited differences in the performance achieved by the individual models. Given the trade-off between computational cost and inference speed, our results suggest that additional work is required before ensemble models can be applied in the clinical setting.
      An emerging trend for the non-operative treatment of CTS is MN neurolysis using hydrodissection, which involves mechanical release of the entrapped nerve by perineural injection with dextrose solution [
      • Wu CH
      • Syu WT
      • Lin MT
      • Yeh CL
      • Boudier-Revéret M
      • Hsiao MY
      • et al.
      Automated segmentation of median nerve in dynamic sonography using deep learning: evaluation of model performance.
      ,
      • Lin MT
      • Liao CL
      • Hsiao MY
      • Hsueh HW
      • Chao CC
      • Wu CH
      Volume matters in ultrasound-guided perineural dextrose injection for carpal tunnel syndrome: a randomized, double-blinded, three-arm trial.
      ,
      • Wu YT
      • Ke MJ
      • Ho TY
      • Li TY
      • Shen YP
      • Chen LC
      Randomized double-blinded clinical trial of 5% dextrose versus triamcinolone injection for carpal tunnel syndrome patients.
      ]. The rapid, automated extraction of the morphological characteristics of the MN during finger motion provides clinicians with immediate, objective visualization of the therapeutic effect of such an intervention. The upper panels in Figure 5a and 5b depict the temporal profile of the centroid displacement of the MN of one patient with CTS collected before and 1 wk after the patient received the perineural injection, respectively, as the patient was asked to flex and extend the fingers periodically. The centroid displacement was defined as the spatial deviation of the MN centroid position with respect to that at the beginning of finger flexion. Each data set contained three successive cycles of finger motion, during which the centroid displacement exhibited larger spatial excursions than those observed between consecutive cycles. The beginning and end of each cycle were determined by the sharp change in slope and are represented by dotted and dashed lines, respectively; the lines bounding the first, second and third cycles are highlighted in orange, medium green and azure, respectively. To facilitate data comparison, we normalized the duration of individual cycles to 1 and aligned the normalized data with respect to the same temporal portion of the cycle, as illustrated in the lower panels in Figure 5a and 5b. Before the injection, the centroid appeared to be gradually pushed away from its initial position by the adjacent tendons as the fingers flexed, moved to the farthermost position after one third to one quarter of the cycle had passed, fluctuated at that position for a long while and gradually returned as the fingers extended. In contrast, after the injection, the centroid took less than one-fifth of the cycle to arrive at the farthest position during finger flexion and returned to the initial position during extension. The displacement pattern exhibited a relatively flattened indentation lasting roughly from 30% to 50% of the cycle, which clearly separated the finger extension phase from the flexion phase. In both the before-injection and after-injection conditions, the maximal centroid displacement resulting from the finger motion was similar. In the former condition, however, the centroid traces in all three cycles did not immediately or completely return to the initial position at the end of the cycle, resulting in a residual displacement of approximately 1 mm. This may be attributed to the more resistive or viscous environment surrounding the nerve in that condition. The mean (± SD) IoU score predicted by SOLOv2-MN in the post-injection images was 0.852 ± 0.076 and did not differ significantly from that predicted in the pre-injection images (0.842 ± 0.052). Taken together, these data suggest that the nerve was more mobile as the fingers moved after injection compared with before injection.
      Figure 5
      Figure 5Temporal dynamics of the centroid displacement of the median nerve (MN) of one patient with carpal tunnel syndrome (CTS) acquired (a) before and (b) 1 wk after the perineural injection. Centroid displacement with respect to the initial position was plotted against time (upper panel) or the normalized duration (lower panel). The data consisted of three successive cycles of finger flexion and extension. The beginning and ending of individual cycles in the upper panels were annotated by dotted and dashed lines, respectively, and the lines bounding the first, second and third cycles were highlighted in orange, medium green and azure, respectively. Data for individual cycles in the lower panels were depicted by the same color drawn for the bounding lines and temporally aligned to the same temporal portion of the cycle to facilitate data comparison.
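      A minimal sketch of how the centroid displacement plotted in Figure 5 can be computed from the per-frame masks is given below; the pixel-to-millimetre calibration factor is an assumed input rather than a value reported above, and the first frame is assumed to contain a detection.

```python
import numpy as np

def centroid_displacement(masks, pixel_spacing_mm=1.0):
    """Centroid displacement of each frame relative to the first frame (the
    open-palm posture at the beginning of finger flexion).
    masks: list of (H, W) boolean MN masks for one cine loop.
    pixel_spacing_mm: assumed calibration factor converting pixels to mm."""
    centroids = []
    for m in masks:
        ys, xs = np.nonzero(m)
        if xs.size == 0:                     # frame with no detection: carry the last centroid
            centroids.append(centroids[-1])
            continue
        centroids.append((xs.mean(), ys.mean()))
    centroids = np.asarray(centroids)
    # Euclidean distance from the initial centroid, one value per frame (mm)
    return np.linalg.norm(centroids - centroids[0], axis=1) * pixel_spacing_mm
```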
      There are challenges and limitations in the approaches described. The main challenge was the limited access to well-labeled data sets of musculoskeletal ultrasound for model pre-training. We addressed this obstacle by using a network backbone pre-trained on ImageNet for feature extraction and models pre-trained on the MS COCO data set for instance segmentation. We found that backbone pre-training did reduce the time spent on model convergence, whereas the pre-training for instance segmentation did not substantially decrease learning time. This discrepancy may be attributed to the differences in object appearance and color mode between ultrasound and natural images. Because the proposed framework was designed and trained on our data set, which was collected by one physician using a single probe of one machine, generalization of the proposed model to the clinical setting may require framework revision and additional training using data acquired from diverse sources. However, enlarging the sample size is expected to impose a tremendous annotation burden on experts and physicians, because a thorough analysis of the dynamic maneuver requires as much information as possible from the image sequence. This may be solved by introducing semi- or unsupervised frameworks to facilitate model training. Another limitation of the work is the erroneous segmentation of ambiguous structures, as illustrated in Figure 4. One way to tackle this is to incorporate into the model strategies similar to those frequently employed by human experts, in which the anatomical nature of a confusing structure is determined by inspecting a short image sequence before or after the frame containing it.
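      As one illustration of such a strategy, and not part of the proposed model, per-frame predictions could be smoothed by a pixel-wise majority vote over a short window of neighbouring frames, mimicking how an examiner inspects adjacent frames to resolve ambiguous structures; the sketch below is purely hypothetical post-processing.

```python
import numpy as np

def temporal_majority(mask_seq, window=5):
    """Illustrative post-processing: smooth each frame's binary mask by a
    pixel-wise majority vote over a sliding window of neighbouring frames.
    mask_seq: list of (H, W) boolean masks in temporal order."""
    masks = np.stack([m.astype(np.uint8) for m in mask_seq], axis=0)  # (T, H, W)
    half = window // 2
    smoothed = []
    for t in range(len(mask_seq)):
        lo, hi = max(0, t - half), min(len(mask_seq), t + half + 1)
        votes = masks[lo:hi].sum(axis=0)
        smoothed.append(votes * 2 > (hi - lo))   # strict majority within the window
    return smoothed
```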

      Conclusions

      In summary, we have proposed a simple architecture capable of visually real-time instance segmentation of the MN at the wrist level. The model outperformed several state-of-the-art models in inference speed, whereas segmentation accuracy was on a par with that achieved by these models. The morphological dynamics extracted from the image sequence, such as the nerve centroid displacement, provide clinicians with valuable information regarding disease states immediately after image acquisition. The ensemble models using the bagging and stacking strategies slightly improved segmentation accuracy. Our model has high potential to be incorporated into the clinical setting to assist real-time diagnosis and evaluation of carpal tunnel syndrome using dynamic ultrasonography.

      Conflict of interest

      The authors declare no competing interests.

      Acknowledgments

      The authors are grateful for funding support from the Ministry of Science and Technology in Taiwan under Grants 110-2627-H-028-002, 111-2221-E-002-082, 110-2314-B-002-070 and 111-2314-B-002-164-MY2.

      Data availability statement

      The source codes for the models presented in this article are openly available at https://github.com/ChengLiangYeh/real_time_seg_MN (accessed November 25, 2022). Because of legal and privacy concerns regarding the acquired images, the image data remain confidential and are not shared.

      References

        • Al-Hashel JY
        • Rashad HM
        • Nouh MR
        • Amro HA
        • Khuraibet AJ
        • Shamov T
        • et al.
        Sonography in carpal tunnel syndrome with normal nerve conduction studies.
        Muscle Nerve. 2015; 51: 592-597
        • Aseem F
        • Williams JW
        • Walker FO
        • Cartwright MS
        Neuromuscular ultrasound in patients with carpal tunnel syndrome and normal nerve conduction studies.
        Muscle Nerve. 2017; 55: 913-915
        • Roghani RS
        • Holisaz MT
        • Norouzi AAS
        • Delbari A
        • Gohari F
        • Lokk J
        • et al.
        Sensitivity of high-resolution ultrasonography in clinically diagnosed carpal tunnel syndrome patients with hand pain and normal nerve conduction studies.
        J Pain Res. 2018; 11: 1319-1325
        • Chang YC
        • Wang TG
        • Wu CH
        Sonographic detection of ulnar nerve compression during elbow extension.
        Am J Phys Med Rehabil. 2014; 93: 636-637
        • Chang KS
        • Cheng YH
        • Wu CH
        • Ozcakar L
        Dynamic ultrasound imaging for the iliotibial band/snapping hip syndrome.
        Am J Phys Med Rehabil. 2015; 94: e55-e56
        • Hsiao MY
        • Shyu SG
        • Wu CH
        • Ozcakar L
        Dynamic ultrasound imaging for type A intrasheath subluxation of the peroneal tendons.
        Am J Phys Med Rehabil. 2015; 94: e53-e54
        • Wu CH
        • Shyu SG
        • Ozcakar L
        • Wang TG
        Dynamic ultrasound imaging for peroneal tendon subluxation.
        Am J Phys Med Rehabil. 2015; 94: e57-e58
        • Kuo TT
        • Lee MR
        • Liao YY
        • Chen JP
        • Hsu YW
        • Yeh CK
        Assessment of median nerve mobility by ultrasound dynamic imaging for diagnosing carpal tunnel syndrome.
        PLoS One. 2016;11:e0147051
        • Wu CH
        • Syu WT
        • Lin MT
        • Yeh CL
        • Boudier-Revéret M
        • Hsiao MY
        • et al.
        Automated segmentation of median nerve in dynamic sonography using deep learning: evaluation of model performance.
        Diagnostics (Basel). 2021; 11: 1893
        • Park D
        Ultrasonography of the transverse movement and deformation of the median nerve and its relationships with electrophysiological severity in the early stages of carpal tunnel syndrome.
        PM R. 2017; 9: 1085-1094
        • Roomizadeh P
        • Eftekharsadat B
        • Abedini A
        • Ranjbar-Kiyakaleyeh S
        • Yousefi N
        • Safoora E
        • et al.
        Ultrasonographic assessment of carpal tunnel syndrome severity: a systematic review and meta-analysis.
        Am J Phys Med Rehabil. 2019; 98: 373-381
        • Wang Y
        • Filius A
        • Zhao C
        • Passe SM
        • Thoreson AR
        • An KN
        • et al.
        Altered median nerve deformation and transverse displacement during wrist movement in patients with carpal tunnel syndrome.
        Acad Radiol. 2014; 21: 472-480
        • Lin MT
        • Liu IC
        • Chang HP
        • Wu CH
        Impaired median nerve mobility in patients with carpal tunnel syndrome: a systematic review and meta-analysis.
        Eur Radiol. 2022; (Published online November 17)
        • Hung CY
        • Lam KHS
        • Wu YT
        Dynamic ultrasound for carpal tunnel syndrome caused by squeezed median nerve between the flexor pollicis longus and flexor digitorum tendons.
        Pain Med. 2022; 23: 1343-1345
        • Impink BG
        • Gagnon D
        • Collinger JL
        • Boninger ML
        Repeatability of ultrasonographic median nerve measures.
        Muscle Nerve. 2010; 41: 767-773
        • Fowler JR
        • Hirsch D
        • Kruse K
        The reliability of ultrasound measurements of the median nerve at the carpal tunnel inlet.
        J Hand Surg. 2015; 40: 1992-1995
        • Schrier VJMM
        • Evers S
        • Geske JR
        • Kremers WK
        • Villarraga HR
        • Kakar S
        • et al.
        Median nerve transverse mobility and outcome after carpal tunnel release.
        Ultrasound Med Biol. 2019; 45: 2887-2897
        • Gonzalez-Suarez CB
        • Buenavente MLD
        • Cua RCA
        • Fidel MBC
        • Cabrera JTC
        • Regala CFG
        Inter-rater and intra-rater reliability of sonographic median nerve and wrist measurements.
        J Med Ultrasound. 2018; 26: 14
        • Festen RT
        • Schrier VJMM
        • Amadio PC
        Automated segmentation of the median nerve in the carpal tunnel using U-Net.
        Ultrasound Med Biol. 2021; 47: 1964-1969
        • Abraham N
        • Illanko K
        • Khan N
        • Androutsos D
        Deep learning for semantic segmentation of brachial plexus nerves in ultrasound images using U-Net and M-Net.
        in: Proceedings, Conference on Deep Learning for Semantic Segmentation of Brachial Plexus Nerves in Ultrasound Images Using U-Net and M-Net, New York IEEE, 2019: 85-89
        • Baby M
        • Jereesh AS
        Automatic nerve segmentation of ultrasound images.
        in: Proceedings, International Conference of Electronics, Communication and Aerospace Technology (ICECA), New York IEEE, 2017: 1107-1112
        • Cosmo MD
        • Chiara Fiorentino M
        • Villani FP
        • Sartini G
        • Smerilli G
        • Filippucci E
        • et al.
        Learning-based median nerve segmentation from ultrasound images for carpal tunnel syndrome evaluation.
        Annu Int Conf IEEE Eng Med Biol Soc. 2021; 2021: 3025-3028
      1. Hafiane A, Vieyres P, Delbos A. Deep learning with spatiotemporal consistency for nerve segmentation in ultrasound images. arXiv 1706.05870. 2017.

        • Horng MH
        • Yang CW
        • Sun YN
        • Yang TH
        DeepNerve: a new convolutional neural network for the localization and segmentation of the median nerve in ultrasound image sequences.
        Ultrasound Med Biol. 2020; 46: 2439-2452
        • Zhao H
        • Sun N
        Improved U-Net model for nerve segmentation.
        in: Zhao Y, Kong X, Taubman D, eds. Image and Graphics. ICIG 2017. Lecture Notes in Computer Science, Vol. 10667. Cham: Springer; 2017: 496-504
        • Smerilli G
        • Cipolletta E
        • Sartini G
        • Moscioni E
        • Di Cosmo M
        • Fiorentino MC
        • et al.
        Development of a convolutional neural network for the identification and the measurement of the median nerve on ultrasound images acquired at carpal tunnel level.
        Arthritis Res Ther. 2022; 24: 38
        • Di Cosmo M
        • Fiorentino MC
        • Villani FP
        • Frontoni E
        • Smerilli G
        • Filippucci E
        • et al.
        A deep learning approach to median nerve evaluation in ultrasound images of carpal tunnel inlet.
        Med Biol Eng Comput. 2022; 60: 3255-3264
        • Ronneberger O
        • Fischer P
        • Brox T
        U-Net: Convolutional networks for biomedical image segmentation.
        in: Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015). Cham: Springer; 2015: 234-241
        • He K
        • Gkioxari G
        • Dollár P
        • Girshick R
        Mask R-CNN.
        in: Proceedings, 2017 IEEE International Conference on Computer Vision (ICCV), 22–29 October 2017. Piscataway, NJ: IEEE; 2017: 2980-2988
      2. Chen LC, Zhu Y, Papandreou G, Schroff F, Adam HJ. Encoder–decoder with atrous separable convolution for semantic image segmentation. CoRR Abs/1802.02611. 2018.

        • Kirillov A
        • Girshick R
        • He K
        • Dollár P
        Panoptic feature pyramid networks.
        in: Proceedings, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: IEEE; 2019: 6399-6408
        • Wang X
        • Zhang R
        • Kong T
        • Li L
        • Shen C
        SOLOv2: dynamic and fast instance segmentation.
        in: Advances in Neural Information Processing Systems 33 (NeurIPS 2020). 2020: 17721-17732
      3. Bolya D, Zhou C, Xiao F, Lee YJ. YOLACT: real-time instance segmentation. arXiv 1904.02689. 2019.

      4. Chen H, Sun K, Tian Z, Shen C, Huang Y, Yan Y. BlendMask: top-down meets bottom-up for instance segmentation. arXiv 2001.00309. 2020.

      5. Wada K. Labelme: image polygonal annotation with Python, <https://github.com/wkentaro/labelme>; 2016 [accessed 13 September 2021].

        • Lin MT
        • Liao CL
        • Hsiao MY
        • Hsueh HW
        • Chao CC
        • Wu CH
        Volume matters in ultrasound-guided perineural dextrose injection for carpal tunnel syndrome: a randomized, double-blinded, three-arm trial.
        Front Pharmacol. 2020;11:625830
        • Wu YT
        • Ke MJ
        • Ho TY
        • Li TY
        • Shen YP
        • Chen LC
        Randomized double-blinded clinical trial of 5% dextrose versus triamcinolone injection for carpal tunnel syndrome patients.
        Ann Neurol. 2018; 84: 601-610