AI for virtual fitting: application scenarios and enabling technologies

  1. What is virtual fitting technology?
  2. Examples of virtual fitting rooms in both online and regular retail.
  3. The essential technologies behind virtual fitting.
  4. The relevant open & commercial datasets.
Photo by julien Tromeur on Unsplash

What is virtual fitting technology?

Virtual fitting rooms use technology to let shoppers see the size, style, and fit of items without physically touching or buying them.

Examples of virtual fitting rooms in both online and regular retail

Despite the name, virtual fitting rooms have nothing to do with physical rooms; all they need is a screen. As a result, the technology is widely used both in smartphone apps and in brick-and-mortar stores.

virtual dressing room
YourFit 1822 Denim Product Video
Photo by ThisisEngineering RAEng on Unsplash

The essential technologies behind virtual fitting.

Virtual fitting rooms may look like magic, but the technologies underneath them are fairly complex.

1. Image segmentation

Given a fashion photo, image segmentation can separate the tops, pants, and accessories worn by the person at the pixel level. In a virtual fitting scenario, the garment the shopper is currently wearing usually has to be identified and segmented so that the virtual garment can replace it accurately.
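The replacement step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production pipeline: it assumes a segmentation model (for example Mask R-CNN) has already produced a per-pixel label map, and that the virtual garment image has already been warped to line up with the shopper. The label ids and the function name `replace_garment` are hypothetical and chosen only for this example.

```python
import numpy as np

# Hypothetical class ids for this sketch; real segmentation models define their own.
BACKGROUND, SKIN, UPPER_CLOTHES, PANTS = 0, 1, 2, 3


def replace_garment(person, labels, garment, target_label=UPPER_CLOTHES):
    """Paste garment pixels wherever the label map equals target_label.

    person:  (H, W, 3) uint8 image of the shopper
    labels:  (H, W) integer map with one class id per pixel
    garment: (H, W, 3) uint8 image of the virtual garment, assumed pre-aligned
    """
    mask = labels == target_label      # (H, W) boolean mask of the worn garment
    result = person.copy()             # leave the input image untouched
    result[mask] = garment[mask]       # overwrite only the masked pixels
    return result, mask


# Tiny 4x4 example: the shopper's top occupies the central 2x2 block.
person = np.full((4, 4, 3), 100, dtype=np.uint8)
garment = np.full((4, 4, 3), 200, dtype=np.uint8)
labels = np.zeros((4, 4), dtype=np.int64)
labels[1:3, 1:3] = UPPER_CLOTHES

result, mask = replace_garment(person, labels, garment)
```

In this toy case only the four pixels labeled as the upper garment change; everything else (background, skin, pants) is copied from the original photo, which is why pixel-level accuracy of the mask matters.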

Mask R-CNN, from [2]

2. Cross-modal retrieval and generation

Cross-modal technology in the fashion field covers two tasks: generating text descriptions from images, and retrieving relevant clothing images from a large collection given a descriptive text (cross-modal retrieval). These techniques usually need a large number of paired images and text descriptions for large-scale training. A commonly used framework extracts image-side and text-side features, maps them into a shared subspace, and optimizes the distances there, so that features of related image-text pairs gradually move closer together while features of unrelated pairs are gradually pushed apart.
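The shared-subspace idea above can be illustrated with a small NumPy sketch. It assumes the image and text encoders already exist and output vectors in the same space; the embeddings below are hand-written toy values, and the function names and the margin of 0.2 are assumptions made for this example. The loss shown is a standard triplet ranking loss, one common choice for this kind of training, not necessarily the one used by any particular system.

```python
import numpy as np


def l2_normalize(x):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def retrieve(text_emb, image_embs, top_k=1):
    """Rank images by cosine similarity to the text query in the shared space."""
    sims = l2_normalize(image_embs) @ l2_normalize(text_emb)
    order = np.argsort(-sims)          # most similar first
    return order[:top_k], sims


def triplet_loss(text_emb, pos_img, neg_img, margin=0.2):
    """Zero when the matching image beats the non-matching one by at least margin."""
    t, p, n = l2_normalize(text_emb), l2_normalize(pos_img), l2_normalize(neg_img)
    return max(0.0, margin - float(t @ p) + float(t @ n))


# Toy 2-D embeddings (hypothetical values, not produced by a real encoder).
text_emb = np.array([1.0, 0.0])        # e.g. the query "blue denim jacket"
image_embs = np.array([
    [0.9, 0.1],                        # matching garment photo
    [0.0, 1.0],                        # unrelated garment photo
    [-1.0, 0.0],                       # very different garment photo
])

top, sims = retrieve(text_emb, image_embs, top_k=2)
loss_good = triplet_loss(text_emb, image_embs[0], image_embs[1])
loss_bad = triplet_loss(text_emb, image_embs[1], image_embs[0])
```

Here `loss_good` is zero because the related pair is already much closer than the unrelated one, while `loss_bad` is large because the roles are swapped. Minimizing this loss over many pairs is what pulls related features together and pushes unrelated ones apart.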


3. Image generation

Image generation can be divided into conditional and unconditional generation. Conditional generation takes an example garment as input and generates images in a similar style; pix2pix is a representative model. Unconditional generation takes random noise as input and produces more diverse images; DCGAN and ProGAN are examples.
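The difference between the two settings comes down to what the generator receives as input. The toy sketch below makes that concrete with a single untrained linear layer standing in for a real network. Everything in it (sizes, weight matrices, function names) is hypothetical, and it is a generic illustration rather than the architecture of pix2pix, DCGAN, or ProGAN.

```python
import numpy as np

rng = np.random.default_rng(0)
NOISE_DIM, IMG_SIDE = 8, 4
IMG_PIXELS = IMG_SIDE * IMG_SIDE

# Untrained random weights standing in for a learned generator network.
W_uncond = rng.normal(size=(IMG_PIXELS, NOISE_DIM))
W_cond = rng.normal(size=(IMG_PIXELS, IMG_PIXELS + NOISE_DIM))


def generate_unconditional(z):
    """Unconditional: the only input is a random noise vector z."""
    return np.tanh(W_uncond @ z).reshape(IMG_SIDE, IMG_SIDE)


def generate_conditional(condition_image, z):
    """Conditional: the output also depends on an example image."""
    x = np.concatenate([condition_image.ravel(), z])
    return np.tanh(W_cond @ x).reshape(IMG_SIDE, IMG_SIDE)


z1 = rng.normal(size=NOISE_DIM)
z2 = rng.normal(size=NOISE_DIM)
example_a = np.zeros((IMG_SIDE, IMG_SIDE))   # placeholder "garment" images
example_b = np.ones((IMG_SIDE, IMG_SIDE))

sample_1 = generate_unconditional(z1)        # variety comes from the noise alone
sample_2 = generate_unconditional(z2)
styled_a = generate_conditional(example_a, z1)
styled_b = generate_conditional(example_b, z1)  # same noise, different condition
```

With the same noise vector, changing the example image changes the conditional output, whereas the unconditional generator can only vary its output through the noise.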


4. 3D reconstruction

Because the example picture is usually a flat 2D photo of the garment, 3D reconstruction is needed to re-model the garment in a 3D scene. 3D reconstruction techniques fall into two groups: those based on traditional graphics algorithms and those based on deep learning.

The relevant open & commercial datasets

The days of visiting stores and queuing for fitting rooms to try clothes on are long gone.

Image source:
  2. He, Kaiming, et al. “Mask R-CNN.” Proceedings of the IEEE International Conference on Computer Vision. 2017.
  3. Brouet, R., et al. “Design preserving garment transfer.” ACM Transactions on Graphics (TOG), SIGGRAPH 2012 Conference Proceedings (2012).
  4. Guan, P., et al. “DRAPE: Dressing any person.” ACM Transactions on Graphics (TOG) 31.4 (2012): 1–10.
  5. Li, Xirong, et al. “W2VV++: Fully deep learning for ad-hoc video search.” Proceedings of the 27th ACM International Conference on Multimedia. 2019.


