Methodology & Data

How We Measure What Virtual Try-On Actually Does

Name: Wearo Virtual Try-On Impact Metrics
Creator: Wearo
Published: 2026-05-12
License: https://wearo.io/en/terms-of-service

We treat every claim on this site as a hypothesis we have to defend. This page documents how Wearo measures the impact of AI virtual try-on across conversion, returns, average order value, and engagement: what we observed with enterprise partners, what the academic literature says, and where our data has real limitations.

Reviewed and signed by Wandrille Stutz, Founder, Wearo. Last updated: May 16, 2026.

A Claim Without a Method Is Just a Number

Conversion lift, return reduction, AOV uplift, engagement metrics: every fashion-tech vendor cites them; almost none publish how they were measured. We think that's a problem. Marketing teams should be able to evaluate a vendor's numbers the way a quantitative analyst evaluates a financial report: with a method, a sample, a window, and an explicit list of what was and wasn't controlled for. This page is our attempt to publish that level of detail for the metrics Wearo cites on its product pages.

How do we measure?

All Wearo metrics are computed from first-party event data captured by the widget and the host store's analytics pipeline. We use a consistent definition across partners so comparisons across pilots remain apples-to-apples.

What counts as a conversion

A conversion is a completed purchase: a terminal-state order-confirmed event on your platform (order paid, status confirmed). We do not count add-to-cart, wishlist, or session-level intent signals as conversions. This is deliberately the most conservative definition available.
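As a rough illustration of this definition, the sketch below counts only sessions that contain a terminal order-confirmed event. The event names and the `Event` structure are hypothetical; a real host platform will expose its own schema.

```python
from dataclasses import dataclass

# Hypothetical event name standing in for the platform's terminal
# "order paid, status confirmed" event.
CONVERSION_EVENTS = {"order_confirmed"}


@dataclass(frozen=True)
class Event:
    session_id: str
    name: str  # e.g. "add_to_cart", "wishlist_add", "order_confirmed"


def converted_sessions(events):
    """Return the IDs of sessions with at least one completed purchase."""
    return {e.session_id for e in events if e.name in CONVERSION_EVENTS}


events = [
    Event("s1", "add_to_cart"),
    Event("s1", "order_confirmed"),
    Event("s2", "add_to_cart"),   # intent signal only: not a conversion
    Event("s3", "wishlist_add"),  # intent signal only: not a conversion
]
print(converted_sessions(events))
```

Only `s1` qualifies here, because add-to-cart and wishlist events are ignored.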

Engaged vs. non-engaged cohorts

For every pilot we compare two cohorts within the same time window: sessions where the visitor opened the Wearo widget and uploaded at least one image (engaged), versus sessions on identical product pages where the widget was shown but never opened (non-engaged). Both cohorts experience the same site, the same products, the same prices, and the same marketing surface. The only difference in what they experience is whether they chose to interact with the try-on; that choice is itself correlated with purchase intent (see FAQ).
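The cohort split can be sketched in a few lines of Python. This is a minimal illustration, not Wearo's production logic: the `Session` fields are hypothetical names, and excluding sessions that opened the widget without uploading an image is an assumption, since the definitions above do not say which cohort those sessions fall in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    session_id: str
    widget_shown: bool
    widget_opened: bool
    images_uploaded: int


def cohort(session):
    """Return "engaged", "non_engaged", or None if the session fits neither."""
    if not session.widget_shown:
        return None  # widget never displayed: outside the comparison
    if session.widget_opened and session.images_uploaded >= 1:
        return "engaged"
    if not session.widget_opened:
        return "non_engaged"
    # Opened but no upload: assumed excluded (not specified in the text).
    return None


print(cohort(Session("a", True, True, 1)))
print(cohort(Session("b", True, False, 0)))
```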

Attribution window

We attribute a conversion to a session when the purchase occurs within 7 days of the engagement event, matched on the host store's standard session/customer identifier. Beyond 7 days the signal becomes too noisy to isolate from broader marketing campaigns.
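A minimal sketch of the 7-day check follows. The function name is hypothetical, and two details are assumptions rather than documented behaviour: the window is treated as inclusive at exactly 7 days, and orders placed before the engagement event are not attributed.

```python
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(days=7)


def is_attributed(engaged_at, ordered_at, window=ATTRIBUTION_WINDOW):
    """True if the order falls within `window` after the engagement event."""
    elapsed = ordered_at - engaged_at
    return timedelta(0) <= elapsed <= window


engaged_at = datetime(2026, 5, 1, 12, 0)
print(is_attributed(engaged_at, datetime(2026, 5, 4, 9, 30)))   # inside the window
print(is_attributed(engaged_at, datetime(2026, 5, 10, 12, 0)))  # 9 days later
```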

Sample size and time period

Our published metrics are computed across thousands of sessions per pilot, over rolling 30-to-90-day windows on partner stores. We do not publish exact session counts to protect partner confidentiality, but we are willing to share methodology specifics under NDA on request.

What We Observed

Across pilots with ready-to-wear brand partners, four headline patterns emerged consistently. We present them as ranges, not point estimates, because the actual lift depends on traffic mix, product category, price point, and how prominently the widget is surfaced.

Conversion lift

Observed range: 5–9× engaged vs. non-engaged
Sample basis: Thousands of sessions per pilot, 30–90 day windows
What it means: Visitors who reached a rendered try-on result. Selection-biased toward higher-intent shoppers (see FAQ).

Return rate reduction

Observed range: Up to 30% below partner baseline
Sample basis: Orders containing at least one try-on event on the purchased SKU
What it means: The proposed mechanism is closing the visual expectation gap between the catalog photo and the customer's mirror moment at home. Industry baseline for apparel returns sits around 25–30% (NRF 2023 consumer returns report).

Average order value

Observed range: Meaningful uplift on engaged orders
Sample basis: Compared to store-level AOV baseline over the same window
What it means: Shoppers add complementary items (matching top, second colourway) after seeing the rendered look.

Engagement signals

Observed range: Longer dwell time + non-trivial share rate
Sample basis: Per-session dwell time and post-render share events
What it means: Shared try-on results bring returning traffic with above-average conversion intent.
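For the average order value comparison listed above, a simple way to express the uplift is the ratio of engaged-order AOV to the store-level AOV over the same window. The sketch below assumes that reading; the function name and the order totals are invented for illustration and are not partner data.

```python
from statistics import mean


def aov_uplift(engaged_order_totals, store_order_totals):
    """Relative AOV uplift of engaged orders over the store-level baseline."""
    baseline_aov = mean(store_order_totals)
    if baseline_aov == 0:
        raise ValueError("baseline AOV is zero; uplift is undefined")
    return mean(engaged_order_totals) / baseline_aov - 1


# Invented order totals, same time window for both lists.
engaged = [120, 180, 150]
store = [100, 120, 140, 120]
print(f"{aov_uplift(engaged, store):.0%}")
```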


Case Study: A Premium French Fashion Retailer

Our longest-running enterprise pilot is with a premium French fashion retailer (anonymised by NDA). The numbers below are from a single deployment and are not promised as typical results. They are reported here as a concrete data point alongside the ranges above.

Conversion lift among widget-engaged sessions: ≈ 11× the non-engaged baseline
Widget click-through rate from the product page: ≈ 3%
Try-on completion rate (started → result generated): ≈ 18.8%
Return rate on widget-engaged orders: ≈ 28% below the partner category baseline

What this does, and does not, mean

This is one deployment, in one product category (premium ready-to-wear), with one traffic mix. The 11× figure is an observed pilot result, not a forecast for other partners, and it sits above the general range. We present it because it is real and verifiable under NDA, not as a guarantee. The general 5–9× range above is the more defensible number for prospective customers: it spans multiple pilots and product types.

Industry Research & Academic Consensus

Our internal data is consistent with a growing peer-reviewed literature on virtual try-on. We cite three findings that are directly relevant to the metrics on this page.

Perceived immersion mediates the effect of AI-powered try-on on online buying intention

Gao & Liang (2025), Sustainability (MDPI)

Gao and Liang (2025), publishing in MDPI Sustainability, model how four core attributes of AI-powered try-on (visual vividness, interactive control, personalized configuration, and ease of use) shape online buying intention in fashion e-commerce. Using a modified S-O-R framework on a 366-respondent online survey analysed with PLS-SEM, they find that perceived immersion is a significant mediator alongside perceived utilitarian and hedonic value, with brand trust moderating the strength of the effect. The construct they test (impulsive buying intention in a young-Chinese-consumer sample) is not identical to the considered purchases on premium European stores Wearo serves, but the underlying mechanism is what we observe in our own data: the conversion lift concentrates in sessions where the visitor reaches a rendered result (the moment of perceived immersion), not merely in sessions where the widget is opened.

Virtual try-on systems and purchase confidence (systematic review)

Chen, C., Ni, J., & Zhang, P. (2024)

Chen, Ni, and Zhang (2024) published a systematic review of 69 academic papers on virtual try-on systems in fashion consumption (MDPI Applied Sciences, Donghua University). They identify the key factors that drive consumer purchasing decisions and VTO adoption intentions across the existing literature. Wearo does not directly address sizing decisions (those require a dedicated size prediction tool), but the broader VTO-to-purchase relationship their review documents is what we see at work behind our up-to-30% return reduction figure: when shoppers see the garment rendered on themselves before buying, the post-delivery 'this looks nothing like the photo' moment becomes rare.

Virtual try-on drives brand cognitive engagement and brand attitudes

Lavoye, V., et al. (2023), Journal of Services Marketing

Lavoye, Sipilä, Mero, and Tarkiainen (2023), publishing in the Journal of Services Marketing, ran a 500-participant quasi-experiment on virtual try-on for sunglasses and lipsticks. They find that when shoppers feel the virtual self represents them authentically (self-presence), they engage in greater style exploration, which increases brand cognitive processing and ultimately improves brand attitudes. The engagement and dwell-time effects we monitor are consistent with this mechanism: visitors who interact with the widget invest more cognitive attention in the brand, which correlates with both in-session conversion and return visits.

Frequently Asked Questions About Our Data

How exactly do you define a conversion?

A completed purchase (Shopify-confirmed order or equivalent terminal-state checkout event on the host platform). We do not count add-to-cart, wishlist, or session-level intent signals. This is deliberately the most conservative definition available, so our conversion counts are not inflated by softer intent signals.

Aren't engaged visitors self-selected? Wouldn't they have bought anyway?

Yes. And that is the most important honest disclosure on this page. Visitors who choose to open the widget have stronger purchase intent on average than those who don't. The 5–9× number is not 'Wearo causes 9× conversion'; it is 'sessions in which a visitor engaged with the widget converted at 5–9× the rate of sessions on the same pages where the widget was shown but not opened.' The non-engaged cohort is the cleanest control we can construct without running a randomised on/off experiment, which is on our roadmap for future pilots. When that data exists, we'll publish it here and adjust this answer.
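To make the arithmetic behind a figure like 5–9× concrete, the lift is a ratio of two conversion rates, one per cohort. The sketch below uses invented session counts and hypothetical function names; it describes an observed ratio between self-selected cohorts, not a causal effect.

```python
def conversion_rate(conversions, sessions):
    """Share of sessions that ended in a completed purchase."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return conversions / sessions


def conversion_lift(engaged_conv, engaged_sessions, other_conv, other_sessions):
    """Engaged conversion rate divided by the non-engaged conversion rate."""
    baseline = conversion_rate(other_conv, other_sessions)
    if baseline == 0:
        raise ValueError("non-engaged rate is zero; lift is undefined")
    return conversion_rate(engaged_conv, engaged_sessions) / baseline


# Invented counts: 70 of 1,000 engaged sessions convert (7%),
# 100 of 10,000 non-engaged sessions convert (1%).
lift = conversion_lift(70, 1_000, 100, 10_000)
print(f"{lift:.1f}x")
```

With these made-up numbers the engaged cohort converts at seven times the non-engaged rate, which would sit inside the published range.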

Are these results typical for every product category?

No. Our published data is from premium fashion ready-to-wear. Categories with high price points, complex fit, or strong visual differentiation (dresses, outerwear, knitwear) tend to show stronger lift. Basics, accessories, and lower-AOV categories tend to show weaker lift. Anyone evaluating Wearo for an out-of-category use case should expect to run their own pilot rather than assume our numbers transfer.

What sample size and time period are these numbers based on?

Thousands of sessions per pilot, on rolling 30-to-90-day windows. We do not publish exact session counts to protect partner confidentiality; we share specifics under NDA on request from prospective customers and analysts.

Was this an independent third-party study?

No. The data on this page is first-party: Wearo measured the impact of Wearo. We are explicit about that. Independent validation is on our roadmap. Until then, the strongest external corroboration we offer is the peer-reviewed literature in the section above, which converges on the same mechanisms (perceived immersion, visual confidence, and a narrowing expectation gap between catalog imagery and reality) that drive our internal numbers.

How do you measure return reduction?

By comparing the return rate on orders that included at least one Wearo try-on event for a purchased SKU against the partner's category baseline. The 'up to 30%' figure is the upper end of what we have observed; the realised reduction is smaller on partners with already-low baseline return rates and on categories where visual surprise at delivery is rare to begin with (basics, simple cuts where the catalog photo closely matches the at-home mirror moment).
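A short sketch of that comparison follows. It assumes the reduction is expressed relative to the baseline rate rather than in percentage points, which the text does not state explicitly; the function names and order counts are invented for illustration.

```python
def return_rate(returned_orders, total_orders):
    """Share of orders that were returned."""
    if total_orders <= 0:
        raise ValueError("total_orders must be positive")
    return returned_orders / total_orders


def relative_reduction(engaged_rate, baseline_rate):
    """Fraction by which the engaged return rate sits below the baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - engaged_rate) / baseline_rate


# Invented counts: 180 of 1,000 try-on orders returned (18%)
# against an assumed category baseline of 25%.
engaged = return_rate(180, 1_000)
print(f"{relative_reduction(engaged, 0.25):.0%} below baseline")
```

Under the relative reading, an 18% return rate against a 25% baseline is a 28% reduction; under a percentage-point reading the same data would be a 7-point drop, so it matters which one a vendor reports.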

What 'engagement' metrics do you actually look at?

Dwell time on the product page, widget click-through rate, try-on completion rate (started → rendered), and share rate on the rendered result. We track them as health indicators for each pilot, not as standalone marketing claims, because they correlate strongly with downstream conversion.
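These funnel indicators are simple ratios between consecutive steps. The sketch below shows one plausible set of denominators (page views for click-through, started try-ons for completion, rendered results for share rate); those choices, the function names, and the counts are assumptions for illustration, picked so the first two rates land near the ≈ 3% and ≈ 18.8% case-study figures.

```python
def rate(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def funnel_rates(page_views, widget_opens, tryons_started, tryons_rendered, shares):
    """Step-to-step rates for the try-on funnel."""
    return {
        "click_through_rate": rate(widget_opens, page_views),
        "completion_rate": rate(tryons_rendered, tryons_started),
        "share_rate": rate(shares, tryons_rendered),
    }


# Invented counts for a single pilot window.
rates = funnel_rates(
    page_views=10_000,
    widget_opens=300,
    tryons_started=250,
    tryons_rendered=47,
    shares=5,
)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```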

Run Your Own Numbers

Every metric on this page is visible in the Analytics dashboard inside your Wearo account from the moment the widget is installed. The free credits granted with high-volume plans easily cover ~15 days of live traffic, enough to see your first results and calibrate the rollout.
