Semantic image segmentation, Dataset, Polygonal object, Transparent background, Augmentation, Segmentation network architecture, Empty image segmentation


Background. Every new semantic image segmentation task requires fine-tuning the segmentation network architecture, which is very hard to do on high-resolution images that may contain many categories and demand huge computational resources. The question is therefore whether segmentation network architectures can be tested much faster in order to find optimal solutions that could be transferred to real-world semantic image segmentation tasks.

Objective. The goal of the article is to design an infinitely scalable dataset that can serve as a test platform for semantic image segmentation. The dataset can contain any number of entries of any size required for testing.

Methods. A new artificial dataset is designed for semantic image segmentation. The dataset consists of grayscale images with a white background. A polygonal object is randomly placed on the background. The polygon edges are black, whereas the polygon body is transparent. Thus, a dataset image is the set of edges of a convex polygon on a white background. Each polygon edge is one pixel thick, but the transition between the white background and the black edges includes gray pixels in the vicinity of the one-pixel edges. This noise is a by-product of the image file format conversion. The number of polygon edges is generated randomly for every image. The polygon size and the position of its center of mass with respect to the image margins are randomized as well.
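The generation procedure described above can be illustrated with a minimal sketch. It is not the article's generator: the function names (`draw_line`, `make_sample`), the default image size, the vertex-count range, and the way convexity is obtained (vertices placed on a randomly positioned circle in angular order) are all assumptions made for illustration. The sketch also assumes that the "polygon" class covers the edge pixels only, and it does not reproduce the gray transition pixels that arise from file format conversion.

```python
import math
import random


def draw_line(img, x0, y0, x1, y1, value=0):
    """Bresenham's algorithm: draw a one-pixel-thick line between two integer points."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        img[y0][x0] = value
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def make_sample(size=64, min_vertices=3, max_vertices=8, rng=random):
    """Return (image, mask, vertices) for one randomly generated polygon outline.

    Assumption: vertices lie on a random circle that fits inside the image,
    which keeps the polygon convex up to pixel rounding.
    """
    n = rng.randint(min_vertices, max_vertices)        # random number of edges
    radius = rng.uniform(size / 8, size / 2 - 1)       # random polygon size
    cx = rng.uniform(radius, size - 1 - radius)        # random position
    cy = rng.uniform(radius, size - 1 - radius)
    angles = sorted(rng.uniform(0, 2 * math.pi) for _ in range(n))
    vertices = [
        (round(cx + radius * math.cos(a)), round(cy + radius * math.sin(a)))
        for a in angles
    ]

    image = [[255] * size for _ in range(size)]        # white background
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        draw_line(image, x0, y0, x1, y1, value=0)      # black edges, hollow body

    # Assumed labeling: 1 = "polygon" (edge pixels), 0 = "background".
    mask = [[1 if pixel == 0 else 0 for pixel in row] for row in image]
    return image, mask, vertices
```

Calling `make_sample(size=64, rng=random.Random(0))` returns one image together with its pixel labels, so a dataset of any volume can be produced by repeating the call. Passing a seeded `random.Random` instance makes the output reproducible.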

Results. A toy dataset of any volume and image size can be generated from scratch. In addition, the dataset generator automatically labels pixels with the classes “background” and “polygon”. The dataset does not need augmentation. As a result, the dataset is infinitely scalable, and it can serve as a fast test platform for segmentation network architectures.

Conclusions. The considered examples of using the polygonal dataset confirm its appropriateness and the capability of networks trained on it to successfully segment stacks of objects. Additionally, an early stopping criterion based on empty image segmentation is revealed.
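The abstract does not spell out the early stopping criterion, so the following is only one possible reading, sketched under explicit assumptions: training could be stopped once the network labels every pixel of an empty (all-white) image as "background". The names `empty_image` and `segments_empty_image_correctly`, the `predict` callable, and the `tolerance` parameter are hypothetical and do not come from the article.

```python
def empty_image(size, background=255):
    """Build an all-white grayscale image as a list of rows."""
    return [[background] * size for _ in range(size)]


def segments_empty_image_correctly(predict, size=64, background_label=0, tolerance=0.0):
    """Check whether `predict` labels an empty image as background.

    `predict` is a hypothetical stand-in for a trained network: it takes an
    image (list of rows) and returns a label mask of the same shape.
    `tolerance` is the assumed allowed fraction of non-background pixels.
    """
    mask = predict(empty_image(size))
    wrong = sum(1 for row in mask for label in row if label != background_label)
    return wrong / (size * size) <= tolerance


# Toy predictors standing in for a network at two training stages.
def threshold_predictor(image):
    return [[1 if pixel < 128 else 0 for pixel in row] for row in image]


def overeager_predictor(image):
    return [[1 for _ in row] for row in image]
```

With these stand-ins, `segments_empty_image_correctly(threshold_predictor)` holds while `segments_empty_image_correctly(overeager_predictor)` does not. In a training loop, such a check could be evaluated after each epoch.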

Author Biography

Vadim V. Romanuke, Polish Naval Academy

Вадим Васильевич Романюк

