Database construction and remodeling method on traditional Yi nationality patterns of China with GAN model

admin

12 months ago

Building process from image database to GAN modeling

Central to this study is a method for transforming traditional patterns (Fig. 1). Later research will explore the implementation and applications of this transformation method.

This research method and process consists of four parts. The following sections will detail the experimental methods and key technologies employed in each part.

The workflow is divided into the following four steps:

Analysis of esthetic style of traditional patterns: This study conducts an in-depth analysis of the esthetic styles of traditional Yi ethnic patterns, focusing on color, form, and application media. This foundational analysis offers theoretical insights and design guidance for digital preservation and contemporary applications.
Database establishment for traditional patterns: A comprehensive database of traditional Yi patterns is developed through pattern collection, feature extraction, and image annotation. This database forms the foundation for model training and creative pattern design.
Developing the pattern-generating model: This study trains a generative adversarial network (GAN) to synthesize pattern images, optimizing the generator and discriminator to enhance quality and diversity. The GAN, central to the pattern-generation framework, comprises two components: a generator that creates new patterns and a discriminator that evaluates their similarity to authentic Yi patterns. During training, images from the traditional pattern database teach the generator to produce novel patterns with traditional features, while the discriminator learns to differentiate real patterns from generated ones. As the discriminator improves, it prompts the generator to create higher-quality designs; generator refinement is guided by this feedback and may involve adjustments to architecture or training parameters. The refined generator ultimately synthesizes pattern images, which are used in the evaluation and application phases to validate the model’s effectiveness and innovation. This framework reimagines traditional Yi patterns, supporting cultural preservation while offering technical guidance and inspiration for modern product design.
Application and evaluation of patterns: Patterns generated by the GAN are evaluated for innovation and practical applications.

Analysis of esthetic style of traditional patterns

Foundational in this research is an analysis of art style. This analysis develops a comprehensive understanding of the visual language, design principles, and cultural significance of traditional ethnic patterns. This understanding offers accurate and effective data, guiding the deep learning processes of the GAN models. By exploring the unique visual characteristics and cultural meanings of ethnic patterns, this analysis offers clear identification of pattern characteristics for training the GAN models, ensuring that generated patterns inherit traditional esthetic qualities while preserving cultural integrity. This process creates a high-quality training dataset, enabling the model to effectively identify and reconstruct the visual characteristics of various patterns. The key design elements identified through this analysis also act as critical parameters in the pattern generation stage, optimizing the detail quality and overall coherence. The following section will illustrate this analytical phase by focusing on “Yi nationality patterns” as a case study, detailing the methodological approach.

Yi culture, influenced by natural forces, scientific and technological advancements, societal structures, economic progress, and social mores, has evolved over a long history. This study, drawing upon a three-tiered theory of culture, employs extensive fieldwork and archival research to appraise traditional Yi patterns. These patterns are analyzed across material, functional/technological, and spiritual/esthetic dimensions to thoroughly explore their cultural significance and artistic merit.

The visual and concrete carriers of Yi culture reside in the material stratum. At the behavioral level, underlying cultural principles and their transformations are reflected in the conventions and practices connected with the material stratum. The spiritual/esthetic level forms the foundation for cultural continuity. This study therefore examines the material, behavioral, and spiritual/esthetic dimensions of traditional Yi patterns, divides them into cultural characteristics, and categorizes their cultural expressions and esthetic qualities, as detailed in Table 1.

Table 1 Analysis of esthetic style of Yi traditional patterns

The outward expression of cultural values is clearly apparent in the material stratum, which exhibits a rich array of cultural symbols through appropriate visual forms. This study mainly evaluates the characteristics of Yi patterns across three dimensions: type, color, and composition. In terms of pattern types, Yi artisans employ methods of exaggeration and deformation while maintaining the essential forms of their subjects. They express their aspirations for a better life through patterns ranging from abstract and irregular, to concrete representations of nature such as flora, fauna, and insects, and finally to regular geometric designs. Regarding color, the Yi people of Liangshan primarily utilize the main colors of red, yellow, and black, coordinating these with secondary hues. Compositionally, Yi artists abstract the principles of nature to express an esthetic of order. They adhere to principles of variety, unity, balance, proportion, scale, rhythm, and rhyme. Moreover, they utilize methods such as repetition, continuity, symmetry, equilibrium, and hierarchical organization to create patterns that are both complex and well-ordered. Common compositional structures include several designs, radiating patterns, and bilateral symmetry⁹,¹⁰,²⁹.

In their practical and technological dimensions, Yi nationality patterns constitute a social practice expressed through production methods, lifestyles, and customs shaped by specific environments. This study evaluates the folk character of these patterns, analyzing them through the perspective of application scenarios and production methods. Regarding application scenarios, Yi patterns appear frequently in traditional architecture, furniture, and craft objects such as lacquerware, silverware, apparel, carvings, and paintings. Their configurations are constrained by factors such as production methods, available materials, and artistic intent. As for production methods, Yi apparel typically incorporates embroidery, floral artistry, and painted decoration. Embroidery is frequently employed in blouses, headscarves, small bags, and women’s skirts, while floral artistry comprises methods including picking, pasting, piercing, trifling, appliqué, and coiling. Painting, accordingly, comprises three basic components: dots, lines, and planes. Yi crafts and methods demonstrate programmatic characteristics that can exhibit order and rhythm³⁰.

The belief and esthetic dimensions of these patterns offer insights into the psychology and humanistic values of the Yi people. This study summarizes their ethnic beliefs, esthetic principles, and underlying philosophies of creation. The beliefs of the Yi people generally center around the veneration of nature, ancestors, and symbolic representation. Their esthetic preferences often draw inspiration from nature and work, giving voice to aspirations for a wonderful and united existence. In their creative philosophy, the Yi people prioritize utility, ethnic identity, religious significance, and a primal esthetic sensibility.

Analyzing the esthetic qualities of Yi patterns through esthetic logic makes the cultural phenomena of traditional Yi nationality patterns more systematic and provides a theoretical basis for the digital transformation of patterns. Digitally reimagining traditional Yi patterns requires exploring material properties, integrating contemporary values, and welcoming innovative components. It also demands adherence to the cultural core, assimilating valuable advancements from broader societal development, and translating the ethnic cultural meaning and artistic character into a design language compatible with computer-based artificial intelligence.

Furthermore, it is worth noting that the construction of the GAN model primarily depends on a detailed analysis of tangible pattern features, including morphology, color, and medium. In contrast, analyzing abstract elements like beliefs and philosophies offers unique academic value. Such analysis helps researchers systematically classify and manage pattern data during collection and offers a multidimensional retrieval system for later database use. For instance, researchers can search the database by visual attributes like color and morphology or by abstract keywords such as symbolic meaning, implications, or religious beliefs. This approach improves database clarity, enabling efficient matching with specific research needs.

Meanwhile, a key goal of reshaping patterns with the GAN model is to integrate generated patterns into modern product design, enhancing their market potential. Summarizing the abstract meanings of generated patterns provides valuable references for pattern applications. For instance, product designers can choose patterns based on visual features or their cultural connotations, emotional resonance, and social symbolism. Distilling and summarizing abstract meanings enhance the cultural value and usability of generated patterns while offering theoretical and practical guidance for applying traditional patterns in modern contexts. Ultimately, this research provides new pathways for innovatively inheriting and commercializing traditional patterns.

Following an analysis of the patterns’ esthetic qualities, the next phase involves developing a GAN model-based pattern database, organized according to these esthetic principles. This process comprises six key steps: manual image data collection, image pre-processing, calculation of pattern eigenvectors, computational data acquisition, pattern annotation, and adjustment and improvement of the database. Using Yi nationality patterns as a case study, this section introduces the specific operational methods for each step.

First, manual data collection is essential for database construction. Once a sufficient sample size is achieved, eigenvector extraction is performed on these samples to enable computerized database expansion. Finally, the database is improved through numbering and manual adjustment. This lays the foundation for the construction of image-generation models. This stage applies digital acquisition, processing, repair, preservation, management, integration, and other means to store, acquire, and process pattern data through artificial intelligence and computer programs³¹. It builds a database based on this to achieve digital inheritance and protection of traditional culture. The entire process is divided into the following six steps, as presented in Fig. 2:

Step 1—Manual acquisition of patterns: First, pattern data are collected from websites featuring Yi traditional culture, including promotional materials, e-commerce platforms, cultural and creative products, apparel, and historical artifacts. This dataset, exceeding 1000 pieces, provides a foundation for image pre-processing and computerized image acquisition.
Step 2—Image pre-processing: Image pre-processing is essential to enhance image quality and recover valuable authentic information potentially compromised by varying degrees of noise introduced during image acquisition. This procedure involves the following three steps: filtering, denoising, and normalization. Notably, grayscale or binary conversion is omitted for the patterns, as the color of Yi nationality patterns is a crucial medium for conveying ethnic beliefs and meanings.

Fig. 2

The database construction of Yi traditional pattern.

Filter pre-processing: Common filters include the mean filter, median filter, and Gaussian filter. Considering the complexity of the patterns and the Gaussian filter’s beneficial properties, such as its effective smoothing, edge preservation, and adaptability, it is selected for pattern pre-processing³².

Denoising pre-processing: Established methods include wavelet denoising, BM3D denoising, and NL means denoising. Wavelet denoising is preferred for pre-processing pattern images due to its superior performance in preserving image detail, frequency selectivity, adaptability, and broad applicability³³.

Normalization pre-processing: This critical step adjusts image brightness, contrast, and color balance, and scales pixel values to a specified range, optimizing them for algorithmic requirements³⁴.

$$H(i,j)=\frac{1}{2\pi \sigma^{2}}\exp \left(-\frac{(i-m)^{2}+(j-n)^{2}}{2\sigma^{2}}\right)$$

(1)

$$\hat{X}=\hat{W}\cdot X$$

(2)

$$I^{\prime\prime}(x^{\prime},y^{\prime})=\frac{I^{\prime}(x^{\prime},y^{\prime})-I_{\min}}{I_{\max}-I_{\min}}$$

(3)

The calculations for the three preceding pre-processing steps are given by formulas (1)–(3), respectively. In formula (1), “H(i, j)” denotes the elements of the convolution kernel, “m” and “n” represent the kernel’s center coordinates, and “σ” is the Gaussian function’s standard deviation. In formula (2), “W^” denotes the inverse wavelet transform matrix, and “X^” represents the denoised image. In formula (3), “I″” and “I′” represent the pixel values of the normalized and adjusted images, respectively, while “I_min” and “I_max” denote the corresponding minimum and maximum pixel values. In short, pre-processing pattern images in MATLAB, OpenCV, and Adobe Photoshop utilizing this procedure efficiently creates a preliminary database of Yi nationality patterns derived from manually collected samples.
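As a rough illustration of formulas (1) and (3), the following NumPy sketch builds a Gaussian kernel centered at (m, n) and applies min-max normalization. The function names, the kernel size, and the final renormalization of the kernel weights are assumptions made for this example; the study itself reports using MATLAB, OpenCV, and Adobe Photoshop for these operations.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Build a size x size kernel from formula (1), centered at (m, n)."""
    m = n = (size - 1) / 2.0  # kernel center coordinates
    i, j = np.mgrid[0:size, 0:size]
    h = np.exp(-((i - m) ** 2 + (j - n) ** 2) / (2.0 * sigma ** 2))
    h /= 2.0 * np.pi * sigma ** 2
    # Assumption: renormalize so the discrete weights sum to 1,
    # which keeps overall image brightness unchanged after filtering.
    return h / h.sum()


def min_max_normalize(image: np.ndarray) -> np.ndarray:
    """Scale pixel values to [0, 1] as in formula (3)."""
    image = image.astype(np.float64)
    i_min, i_max = image.min(), image.max()
    if i_max == i_min:  # flat image: avoid division by zero
        return np.zeros_like(image)
    return (image - i_min) / (i_max - i_min)
```

For a color pattern, the normalization would typically be applied to the whole array or per channel, depending on whether relative channel intensities should be preserved; the source does not say which was used.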
Step 3—Pattern eigenvector calculation: To expand the dataset through automated online image collection, it is essential to verify whether these online images indeed represent Yi nationality patterns. The eigenvector calculation, involving geometric and textural features, serves as a critical verification step, ensuring that the constructed pattern database accurately reflects the characteristics of traditional Yi patterns. This process enhances the precision of subsequent classification, recognition, and image-matching tasks.

The technical implementation of this step is structured into three clearly defined phases:

Feature retrieval: Feature retrieval encompasses both geometric and textural aspects. Geometric features primarily include pattern dimensions, contours, and edge characteristics, which are extracted via edge detection algorithms (e.g., Canny edge detection) and contour detection algorithms. Textural features such as color distribution, grayscale variations, and texture patterns are acquired through Gaussian filtering and wavelet transforms. Specifically, Gaussian filtering reduces noise and preserves edge clarity, which is crucial for accurate edge and contour detection. Its mathematical expression is provided as follows³⁵:

$$G(x,y)=\frac{1}{2\pi \sigma^{2}}\exp \left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$$

(4)

where “G(x, y)” denotes the two-dimensional Gaussian filter function and “σ” represents the standard deviation. Gaussian filtering efficiently removes image noise while preserving crucial edge details. The wavelet transform is mathematically expressed as

$$W=\mathop{\sum}\limits_{n}f(n)\,\psi(n)$$

(5)

where “f(n)” is the original image data at sampling point “n”, “ψ(n)” represents the wavelet transform function, and “W” signifies the wavelet coefficient values.

In practical implementation, we utilized MATLAB’s Image Processing Toolbox to perform these operations, applying built-in functions for Gaussian filtering, edge detection (such as the Canny or Sobel method), and wavelet transforms (e.g., Haar wavelet) to achieve robust feature extraction.
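To make the wavelet part of feature retrieval concrete, here is a minimal NumPy sketch of a single-level 2-D Haar transform applied to one image channel, followed by a simple texture descriptor built from the detail sub-bands. This is an illustrative stand-in rather than the MATLAB routines used in the study; the function names and the choice of mean squared coefficients as the descriptor are assumptions.

```python
import numpy as np


def haar_dwt2(channel: np.ndarray):
    """One level of an orthonormal 2-D Haar transform on a single channel.

    Returns the approximation sub-band followed by three detail sub-bands.
    Height and width are assumed to be even.
    """
    x = channel.astype(np.float64)
    # Sums and differences of horizontally adjacent pixels.
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2.0)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2.0)
    # Sums and differences of vertically adjacent rows of each result.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2.0)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2.0)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2.0)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2.0)
    return ll, lh, hl, hh


def texture_energy(channel: np.ndarray) -> np.ndarray:
    """Mean squared coefficient of each detail sub-band (hypothetical descriptor)."""
    _, lh, hl, hh = haar_dwt2(channel)
    return np.array([np.mean(lh ** 2), np.mean(hl ** 2), np.mean(hh ** 2)])
```

A flat region yields zero detail energy, while dense embroidery-like texture yields large values, which is why such coefficients are a plausible component of the textural eigenvector.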

Feature encoding: Feature encoding is performed using vector quantization with the k-means clustering algorithm to encode the extracted pattern features. Vector quantization aims to reduce feature dimensionality, facilitating efficient storage and retrieval. This step’s mathematical formulation is expressed as follows³⁶,³⁷: Given an eigenvector “x_i” and an encoding vector “s_i” of length “m”, the frequency “s_i(j)” with which eigenvector “x_i” maps to the “jth” cluster is represented by

$$s_{i}(j)=\sum_{x_{i}\in C_{j}}\left\|x_{i}-c_{j}\right\|^{2}$$

(6)

where “C_j” indicates the set of eigenvectors assigned to the “jth” cluster, and “c_j” represents the cluster center.
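The encoding step can be sketched with a plain k-means implementation in NumPy. Because the text describes “s_i(j)” as a frequency while formula (6) sums squared distances to the cluster center, the sketch returns both quantities per cluster; all function names, the iteration count, and the random initialization are assumptions for illustration.

```python
import numpy as np


def assign_clusters(features: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Index of the nearest center (squared Euclidean distance) for each feature."""
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(features: np.ndarray, k: int, n_iter: int = 20, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's k-means; returns the k cluster centers c_j."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), size=k, replace=False)
    centers = features[start].astype(np.float64)
    for _ in range(n_iter):
        assign = assign_clusters(features, centers)
        for j in range(k):
            members = features[assign == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers


def encode(features: np.ndarray, centers: np.ndarray):
    """Per-cluster squared-distance sums (formula (6)) and assignment counts."""
    assign = assign_clusters(features, centers)
    k = len(centers)
    s = np.zeros(k)
    counts = np.zeros(k, dtype=int)
    for j in range(k):
        members = features[assign == j]
        s[j] = ((members - centers[j]) ** 2).sum()
        counts[j] = len(members)
    return s, counts
```

The resulting length-k vector is much shorter than the raw feature set, which is the dimensionality reduction that vector quantization is meant to provide.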

Feature matching: Encoded eigenvectors of unidentified patterns are systematically compared to established reference eigenvectors within the database. The matching process involves quantitatively measuring the similarity or distance between encoded eigenvectors. By using Python-based scripts, the measured eigenvectors are effectively matched against stored reference data, enabling the reliable classification and recognition of Yi nationality patterns.
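The matching step might look like the sketch below, which compares an encoded query vector with stored reference vectors. The source does not state which similarity or distance measure was used; cosine similarity and the 0.9 acceptance threshold are assumptions chosen only for illustration, and all vectors are assumed to be non-zero.

```python
import numpy as np


def match_pattern(query: np.ndarray, references: np.ndarray, threshold: float = 0.9):
    """Return (index, similarity) of the closest reference vector.

    The index is None when the best cosine similarity is below the threshold,
    meaning the candidate image is not accepted as a known pattern type.
    """
    q = query / np.linalg.norm(query)
    r = references / np.linalg.norm(references, axis=1, keepdims=True)
    sims = r @ q  # cosine similarity to every reference vector
    best = int(sims.argmax())
    score = float(sims[best])
    return (best if score >= threshold else None), score
```

In a crawling pipeline, images whose best match falls below the threshold would be set aside, which is consistent with the later manual review of previously removed images.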

The resulting eigenvectors encapsulate essential visual attributes, including texture, color features, contour shapes, grayscale information, directional lines, and proportional color-block distributions. This rigorous technical approach ensures the database’s robustness, supporting subsequent computerized pattern recognition and extraction tasks detailed in step 4.
Step 4—Computerized pattern acquisition: Web crawlers are developed targeting a multitude of websites related to Yi nationality patterns, including those focused on cultural promotion and dissemination, shopping, clothing, historical relics, and cultural creativity. From these web pages, images are extracted and downloaded. The Python libraries “requests” and “Beautiful Soup” are utilized for retrieving web pages, parsing image links, and downloading the images. Once the image data is obtained, the pre-processing and eigenvector recognition procedures described in Steps 2 and 3 are repeated before the images are saved.
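The link-parsing part of Step 4 can be sketched as follows. The study names “requests” and “Beautiful Soup”; to stay dependency-free, this example swaps in Python’s standard-library `html.parser` and `urllib.parse`, and it omits page retrieval and image download entirely. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collect absolute URLs from the src attribute of <img> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # skip <img> tags without a usable source
            self.links.append(urljoin(self.base_url, src))


def extract_image_links(html: str, base_url: str) -> list:
    """Return de-duplicated image URLs in page order."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return list(dict.fromkeys(parser.links))
```

Resolving relative paths against the page URL matters here because many cultural and e-commerce sites reference images relative to the page, and de-duplication avoids downloading the same pattern several times.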
Step 5—Labeling of patterns: This step involves the collection, analysis, and annotation of traditional Yi patterns, drawing upon their esthetic characteristics, feature encoding, pattern interpretation, and common usage scenarios. A labeling system is then established, based on F (pattern), C (color), and S (composition), utilizing a one-to-many image-label relationship (PhotoID → label1, label2, label3). Each image is assigned a unique feature label, allowing designers to quickly access the label information for any given image.
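A minimal sketch of the one-to-many image-label relationship is shown below, with each record holding lists of F (pattern), C (color), and S (composition) labels. The record layout, the photo IDs, and the helper function are hypothetical; the label values in the usage example are drawn from the esthetic analysis earlier in this section.

```python
from dataclasses import dataclass, field


@dataclass
class PatternRecord:
    """One database image and its labels, keyed by category F, C, or S."""
    photo_id: str
    labels: dict = field(default_factory=dict)


def find_by_label(records, category: str, value: str) -> list:
    """Return photo IDs whose label list for the given category contains value."""
    return [r.photo_id for r in records if value in r.labels.get(category, [])]


# Hypothetical example records.
records = [
    PatternRecord("YI-0001", {"F": ["geometric"], "C": ["red", "black"],
                              "S": ["bilateral symmetry"]}),
    PatternRecord("YI-0002", {"F": ["flora"], "C": ["yellow"],
                              "S": ["radiating"]}),
]
```

Calling `find_by_label(records, "C", "red")` then returns the IDs of all images labeled with red, which is the kind of quick lookup the labeling system is meant to give designers.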
Step 6—Manual adjustment and improvement of database: Following the completion of the preceding five steps, a designer intervenes to perform final adjustments to the database. This includes verifying the accuracy and effectiveness of the data, assessing the feasibility of reintegrating previously removed images, and carrying out any other necessary operations. Moreover, the designer can enrich the database by generating original patterns through induction, abstraction, deformation, decomposition, and combination, thereby achieving the decomposition, transformation, optimization, generalization, variation, and extension of patterns.

Construction of an image-generation model for traditional patterns

Following the curation and administration of the image database, the process proceeds to its crucial third phase: the development of the GAN model. In contrast to traditional handmade patterns, GAN models can rapidly produce numerous novel patterns by adjusting model parameters, thereby governing the patterns’ appearance and characteristics. This introduces fresh concepts and opportunities for the design and progress of traditional patterns. Deep learning image generation models, including autoencoders, generative adversarial networks, and deep belief networks, support image synthesis, information retrieval, and cross-modal conversion between images, text, and audio³⁸. This study will employ a GAN, considering its strong generative capabilities, unsupervised learning nature, capacity for diversity, scalability, and interpretability, to construct a generative model for traditional Yi patterns. This construction will unfold in three steps:

Step 1—GAN model architecture: This study will utilize a deep convolutional neural network-based GAN (DCGAN) comprising a generator and a discriminator. The GAN seeks to create images, through the generator, sufficiently realistic to mislead the discriminator. The corresponding formulations for the generator and discriminator are presented in Eqs. (7) and (8). In these equations, “G” represents the generator, “D” denotes the discriminator, “z” indicates random noise, “θ_g” and “θ_d” express the parameters of the generator and discriminator, “x” depicts the input or generated pattern image, and H, W, and C, respectively, represent the height, width, and number of channels of the generated image.

$$G(z;\theta_{g}):z\in \mathbb{R}^{d}\to x\in \mathbb{R}^{H\times W\times C}$$

(7)

$$D(x;\theta_{d}):x\in \mathbb{R}^{H\times W\times C}\to [0,1]$$

(8)
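The input and output shapes in Eqs. (7) and (8) can be illustrated with the toy NumPy sketch below. It deliberately uses single linear layers rather than the convolutional stacks of a DCGAN, so it shows only the mappings, not the architecture used in the study; the noise dimension and image size are assumed values.

```python
import numpy as np

D_NOISE, H, W, C = 100, 64, 64, 3  # assumed sizes, not taken from the study

rng = np.random.default_rng(0)
theta_g = rng.normal(0.0, 0.02, size=(D_NOISE, H * W * C))  # generator parameters
theta_d = rng.normal(0.0, 0.02, size=(H * W * C,))          # discriminator parameters


def generator(z: np.ndarray, theta_g: np.ndarray) -> np.ndarray:
    """G(z; theta_g): maps a noise vector in R^d to an H x W x C image.

    tanh keeps pixel values in [-1, 1].
    """
    return np.tanh(z @ theta_g).reshape(H, W, C)


def discriminator(x: np.ndarray, theta_d: np.ndarray) -> float:
    """D(x; theta_d): maps an H x W x C image to a probability in [0, 1]."""
    logit = x.reshape(-1) @ theta_d
    return float(1.0 / (1.0 + np.exp(-logit)))
```

Sampling `z = rng.normal(size=D_NOISE)` and calling `discriminator(generator(z, theta_g), theta_d)` traces one pass through both mappings.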

The objective function of the GAN model includes the loss functions of both the generator and discriminator, and measures the difference between generated and real images³⁹. The mathematical representations of these loss functions are given in formulas (9) and (10). In these formulas, “L” represents the loss function, “z” denotes random noise, “p_z(z)” expresses the noise distribution, “p_data(x)” indicates the real image distribution, “x” depicts a real pattern image, and “1−D(G(z))” illustrates the discriminator’s discrimination result on the generated image.

$$L_{G}=-\frac{1}{2}\,\mathbb{E}_{z\sim p_{z}(z)}\left[\log D(G(z))\right]$$

(9)

$$L_{D}=-\frac{1}{2}\,\mathbb{E}_{x\sim p_{data}(x)}\left[\log D(x)\right]-\frac{1}{2}\,\mathbb{E}_{z\sim p_{z}(z)}\left[\log \left(1-D(G(z))\right)\right]$$

(10)
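Formulas (9) and (10) translate directly into code once the expectations are replaced by batch averages. The sketch below assumes the discriminator outputs for a batch of real and generated images are already available as arrays of probabilities; the function names and the small constant guarding against log(0) are additions for this example.

```python
import numpy as np

EPS = 1e-12  # guards against log(0); not part of the formulas


def generator_loss(d_fake: np.ndarray) -> float:
    """Formula (9): L_G = -1/2 * mean(log D(G(z)))."""
    return float(-0.5 * np.mean(np.log(d_fake + EPS)))


def discriminator_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    """Formula (10): L_D = -1/2 * mean(log D(x)) - 1/2 * mean(log(1 - D(G(z))))."""
    return float(-0.5 * np.mean(np.log(d_real + EPS))
                 - 0.5 * np.mean(np.log(1.0 - d_fake + EPS)))
```

When the discriminator cannot tell real from generated images and outputs 0.5 everywhere, L_D equals log 2 and L_G equals half of that, which is a useful reference point when monitoring training curves.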
Step 2—GAN model training: The generator and discriminator are trained separately, in alternation, with the loss functions monitored until they converge⁴⁰. First, the discriminator is trained on real images, and its parameters are updated through gradient descent to reduce its loss function. Then, the generator is trained to produce images from noise, which are passed to the discriminator. This process also decreases the loss function through gradient descent applied to the generator’s parameters⁴¹,⁴²,⁴³,⁴⁴. The corresponding mathematical representations are detailed in formulas (11) and (12). In these formulas, “θ” represents the parameters of the discriminator or generator, “α” denotes the learning rate, and “∇” expresses the gradient of the discriminator’s or generator’s loss function with respect to the corresponding parameters. This iterative parameter update through gradient descent improves the networks’ ability to create and recognize images.

$$\theta_{d}=\theta_{d}-\alpha \nabla_{\theta_{d}}L_{D}$$

(11)

$$\theta_{g}=\theta_{g}-\alpha \nabla_{\theta_{g}}L_{G}$$

(12)
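The alternating updates can be demonstrated on a deliberately tiny one-dimensional example with hand-derived gradients. Here the “generator” only shifts noise by a single parameter and the “discriminator” is a logistic unit, so this is a toy stand-in for the DCGAN, not the training code of the study. The sketch assumes the generator step descends the generator loss L_G of formula (9); the learning rate, batch size, and all names are illustrative.

```python
import numpy as np


def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))


def d_loss(theta_d, x_real, x_fake):
    """L_D of formula (10) for the toy discriminator D(x) = sigmoid(w * x + c)."""
    w, c = theta_d
    return float(-0.5 * np.mean(np.log(sigmoid(w * x_real + c)))
                 - 0.5 * np.mean(np.log(1.0 - sigmoid(w * x_fake + c))))


def d_step(theta_d, x_real, x_fake, alpha):
    """Formula (11): theta_d <- theta_d - alpha * gradient of L_D."""
    w, c = theta_d
    p_real = sigmoid(w * x_real + c)
    p_fake = sigmoid(w * x_fake + c)
    grad_w = 0.5 * np.mean(-(1.0 - p_real) * x_real) + 0.5 * np.mean(p_fake * x_fake)
    grad_c = 0.5 * np.mean(-(1.0 - p_real)) + 0.5 * np.mean(p_fake)
    return np.array([w - alpha * grad_w, c - alpha * grad_c])


def g_step(theta_g, theta_d, z, alpha):
    """Generator update: theta_g <- theta_g - alpha * gradient of L_G,
    for the toy generator G(z) = z + theta_g."""
    w, c = theta_d
    p_fake = sigmoid(w * (z + theta_g) + c)
    grad_b = -0.5 * np.mean((1.0 - p_fake) * w)
    return theta_g - alpha * grad_b


def train(x_real, steps=200, alpha=0.1, seed=0):
    """Alternate one discriminator step and one generator step per iteration."""
    rng = np.random.default_rng(seed)
    theta_d, theta_g = np.zeros(2), 0.0
    for _ in range(steps):
        z = rng.normal(size=x_real.shape)
        theta_d = d_step(theta_d, x_real, z + theta_g, alpha)
        theta_g = g_step(theta_g, theta_d, z, alpha)
    return theta_d, theta_g
```

Even in this toy setting the two updates pull against each other, which is why the losses are monitored during training rather than expected to fall monotonically.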
Step 3—Pattern image synthesis: Following the training of the GAN model with the complete codebase, the generator component can be employed to synthesize novel pattern images. By introducing random noise as input, the generator produces new pattern images, thereby achieving the innovative generation of traditional Yi patterns.

In summary, these three steps enable the construction of an image synthesis model for traditional Yi patterns. Simultaneously, the innovative application of generative models offers new avenues and concepts for the innovation and development of traditional patterns.

Application and evaluation of innovative patterns

After the successful construction and generation of pattern databases, image models, and simple patterns, an evaluation of the synthesized images is necessary before their practical application. This study adopts three methods to evaluate the generated pattern images: DR, MOS (mean opinion score), and FID (Fréchet inception distance). DR measures image diversity and uniformity by calculating inter-cluster distances. MOS is a subjective evaluation method based on human ratings, while FID compares the distributions of real and generated images³⁸. A lower FID value indicates higher quality. The procedures for these three methods are detailed in formulas (13), (14), and (15).

$$\mathrm{DR}=\frac{\sum \left(1-D(i,j)\right)}{k(k-1)/2}$$

(13)

$$\mathrm{MOS}=\frac{\sum q_{i}}{n}$$

(14)

$$\mathrm{FID}=\left\|\mu_{1}-\mu_{2}\right\|^{2}+\mathrm{Tr}\left(\Sigma_{1}+\Sigma_{2}-2\left(\Sigma_{1}\Sigma_{2}\right)^{1/2}\right)$$

(15)

In formula (13), “D” represents the distance between clusters, “k” denotes the number of clusters, and the value range of DR is [0, 1]. A higher score indicates more diverse images. In formula (14), “q_i” expresses the score of the “ith” subject, “n” indicates the number of subjects, and the range of MOS values is [1, 5]. A higher score indicates higher image quality. In formula (15), “μ_1” and “μ_2” depict the mean vectors of the real and generated images, respectively, “Σ_1” and “Σ_2” are the corresponding covariance matrices, “Tr” conveys the trace operation of the matrix, and the range of FID values is [0, ∞). A smaller score indicates better image quality.
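The three indicators can be sketched in NumPy as follows. Several points are assumptions: the DR sum is taken over the k(k−1)/2 unordered cluster pairs of a symmetric distance matrix with entries in [0, 1]; the FID inputs are feature vectors (one row per image) that have already been extracted elsewhere, typically with a pretrained Inception network; and the trace of the matrix square root is computed from eigenvalues rather than with a dedicated matrix-square-root routine.

```python
import numpy as np


def diversity_ratio(dist) -> float:
    """Formula (13), summing 1 - D(i, j) over unordered cluster pairs."""
    dist = np.asarray(dist, dtype=np.float64)
    k = dist.shape[0]
    upper = np.triu_indices(k, k=1)  # each pair (i, j) with i < j once
    return float(np.sum(1.0 - dist[upper]) / (k * (k - 1) / 2.0))


def mean_opinion_score(scores) -> float:
    """Formula (14): arithmetic mean of the n subject ratings."""
    return float(np.mean(scores))


def fid(feat_real: np.ndarray, feat_gen: np.ndarray) -> float:
    """Formula (15) on two (samples x features) arrays with at least two features."""
    mu1, mu2 = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    s1 = np.cov(feat_real, rowvar=False)
    s2 = np.cov(feat_gen, rowvar=False)
    # Tr((S1 S2)^(1/2)) is the sum of the square roots of the eigenvalues of
    # S1 S2, which are real and non-negative up to rounding error.
    eig = np.linalg.eigvals(s1 @ s2)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0.0, None)).sum()
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(s1) + np.trace(s2) - 2.0 * tr_sqrt)
```

Two identical feature sets give an FID of zero, and shifting one set by a constant vector adds exactly the squared length of that shift, which makes the mean term of formula (15) easy to sanity-check.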

These three evaluation indicators can help designers evaluate the diversity, realism, and visual quality of generated images, and determine the advantages and disadvantages of the generative model. The generated patterns can then be applied in various physical fields for creative design and expression, realizing the diversified application of innovative patterns in social life. The specific forms and physical carriers are demonstrated in the supplementary information.
