Imagine you and I search for a movie on the same app and we both get completely different recommendation lists, even though we like the same kinds of movies. And what if the difference came down to your age, gender, race, or even personality type? Sounds strange, right?
But that's exactly what can happen when AI-powered systems recommend things to us. Introducing FairEval, a framework that is trying to solve this problem.
FairEval is a method to test whether AI recommendation systems, especially those built on Large Language Models (LLMs), are treating different users fairly.
Most of today's AI tools try to guess what you might like based on what you've liked in the past. But they may also (without meaning to) treat you differently depending on who you are.
FairEval checks if this is happening.
🔍 How does FairEval work?
It runs two sets of prompts through the AI:
One that is completely neutral, with no personal details.
One that includes sensitive info, like race, religion, or personality.
Then it compares the two sets of recommendations to see whether they differ, and if so, by how much and why (a minimal sketch of this setup follows below).
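To make that concrete, here is a minimal sketch of the paired-prompt protocol in Python. Everything in it is an assumption for illustration: `query_llm` is a hypothetical client that takes a prompt and returns a ranked list of titles, and the attribute values and prompt wording are placeholders, not FairEval's actual ones.

```python
# Minimal sketch of the paired-prompt protocol (illustrative assumptions only).
# `query_llm` is a hypothetical callable: prompt string in, ranked list of titles out.

NEUTRAL_PROMPT = "I am a fan of sci-fi movies. Please recommend 10 movies."

# Assumed sensitive attribute values to probe (race, religion, personality).
SENSITIVE_VALUES = [
    "a Black", "a white", "a Muslim", "a Christian",
    "an introverted", "an extroverted",
]

def sensitive_prompt(value: str) -> str:
    # Identical request to the neutral prompt, plus exactly one sensitive detail.
    return f"I am {value} fan of sci-fi movies. Please recommend 10 movies."

def collect_recommendations(query_llm):
    """Query the model once neutrally, then once per sensitive variant."""
    neutral = query_llm(NEUTRAL_PROMPT)
    sensitive = {v: query_llm(sensitive_prompt(v)) for v in SENSITIVE_VALUES}
    return neutral, sensitive
```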
To quantify these differences, it uses several metrics:
✅ Sensitive-to-Neutral Similarity Range (SNSR) & Sensitive-to-Neutral Similarity Variance (SNSV) check whether certain groups get noticeably better or worse recommendations than the neutral baseline
✅ Personality-Aware Fairness Score (PAFS), a new metric introduced with FairEval, checks how consistent the system's recommendations are across different personality types
✅ Search Engine Results Page Score (SERP) & Preference Ranking Agreement Grade (PRAG) check how much impact your identity has on the top-ranked results (see the sketch after this list)
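Here is one way the comparison itself could look, reusing the lists collected above. Using Jaccard overlap as the list-similarity measure is an assumption (ranked measures are also common in this space), and the PAFS formula shown is a plausible reading, not the official definition.

```python
import statistics

def jaccard(a: list[str], b: list[str]) -> float:
    """Overlap between two recommendation lists (assumed similarity measure)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def snsr_snsv(neutral, sensitive):
    """SNSR: gap between the best- and worst-treated group.
    SNSV: spread (std. dev.) of neutral-vs-group similarity across groups."""
    sims = [jaccard(neutral, recs) for recs in sensitive.values()]
    return max(sims) - min(sims), statistics.pstdev(sims)

def pafs(neutral, personality_recs):
    """Assumed reading of PAFS: 1 minus the similarity spread across
    personality-conditioned prompts, so 1.0 = perfectly consistent."""
    sims = [jaccard(neutral, recs) for recs in personality_recs.values()]
    return 1.0 - (max(sims) - min(sims))
```

A perfectly fair model would score SNSR ≈ 0, SNSV ≈ 0, and PAFS ≈ 1: adding a sensitive detail to the prompt would not move the recommendations at all.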
🎯 Why does this matter for ecommerce and marketplace platforms?
Most ecommerce and marketplace sites today use AI to show you:
1️⃣ What products to buy
2️⃣ Which listings to see first
3️⃣ What prices and promotions you qualify for
Now imagine these recommendations varying just because of your gender, religion, or personality, not your actual preferences.
That's a problem.
Let's say two users want a backpack, but one is shown more expensive or lower-quality options just because of how the AI interpreted their profile. That creates a poor experience, and it's unfair.
Frameworks like FairEval help catch these problems early. They help ensure the AI isn't unintentionally biased and that everyone gets a fair shot, whether you're a shopper or a seller.
If you’re building AI in ecommerce, ask yourself:
👉 Is the AI fair across age, gender, region, and personality?
👉 Are we testing for fairness before going live?
FairEval gives us the tools to ask these questions, and to fix what's broken before it affects real people.
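One practical way to act on that: wire the metrics into an ordinary pre-launch test. The threshold below is an invented placeholder, and the helpers come from the sketches above; tune both to your own product and risk tolerance.

```python
# Illustrative pre-launch fairness gate; reuses collect_recommendations()
# and snsr_snsv() from the sketches above. The threshold is an assumption.
SNSR_THRESHOLD = 0.2  # assumed maximum acceptable gap between groups

def fairness_gate(query_llm) -> None:
    neutral, sensitive = collect_recommendations(query_llm)
    snsr, snsv = snsr_snsv(neutral, sensitive)
    print(f"SNSR={snsr:.3f}  SNSV={snsv:.3f}")
    assert snsr <= SNSR_THRESHOLD, (
        f"Recommendations diverge across groups (SNSR={snsr:.3f}); "
        "investigate before going live."
    )
```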
💡 How can you ensure that your “AI Recommendations” are fair to everyone?