Reputation Should Be Queryable

A lot of people aren’t deep into how reputation works in Web3.

But after ~4 years building across DeFi, privacy, and now the Intuition MCP… one pattern keeps showing up:

Every system gives you a number - and calls it trust.


Most systems answer the wrong question.

They try to tell you:

“Is this person trusted?”

But that’s not how decisions work.


The Real Question Is Contextual

You’re not asking:

“Is Luda trusted?”

You’re asking:

  • Is Luda a trusted Solidity dev?

  • Is Luda a trusted trader?

  • Is Luda trusted by people I trust?

Same person.
Different answers.


Why Most Systems Fail

They collapse everything into:

  • one score

  • one label

  • one dimension

So you lose:

  • why someone is trusted

  • who trusts them

  • for what they’re trusted

A single number flattens everything that matters.


Intuition Changes the Primitive

Reputation isn’t a score.
It’s attestations

Example:

Billy → Luda
“strong Solidity developer” (0.9)

Zet → Luda
“reliable trader” (0.8)


Luda doesn’t have a reputation.

Luda has multiple reputations
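To make that concrete, here's a minimal sketch of those two attestations as plain data. The field names are my own illustration, not the actual Intuition atom/triple schema:

```python
# Two attestations from the example above, as plain data. Field names are
# illustrative only, not the actual Intuition schema.
attestations = [
    {"source": "Billy", "target": "Luda",
     "predicate": "strong Solidity developer", "weight": 0.9},
    {"source": "Zet", "target": "Luda",
     "predicate": "reliable trader", "weight": 0.8},
]

# Luda's "reputation" is not one number: each predicate is its own dimension.
reputations = {
    a["predicate"]: a["weight"] for a in attestations if a["target"] == "Luda"
}
print(reputations)
# {'strong Solidity developer': 0.9, 'reliable trader': 0.8}
```

Collapsing that dict into a single average is exactly the flattening step most systems make.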


Reputation Becomes Queryable

Instead of:

“Give me Luda’s trust score”

You can ask:

  • Who are the most trusted Solidity devs?

  • Who is trusted by Zet for trading?

  • Who is trusted by people Billy trusts?

Reputation stops being something you read.

It becomes something you query


Predicate Filtering = Signal Control

This is where things clicked for me while building https://mcp.intuition.box.

We implemented predicate filtering with weighted scoring across multiple predicate types.

The difference?

Night and day.


You can filter by:

  • predicate

  • source

  • weight

Example:

Ignore generic endorsements

Only use:

→ “is Solidity developer”
→ from high-trust devs

Now you get:

high signal, low noise
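Here's a hedged sketch of what that filtering could look like. The `source_trust` table and `score` helper are hypothetical, not the MCP's actual API:

```python
# Hypothetical source-trust table: how much we trust each attester.
source_trust = {"Billy": 1.0, "Zet": 0.9, "anon": 0.1}

attestations = [
    {"source": "Billy", "target": "Luda",
     "predicate": "is Solidity developer", "weight": 0.9},
    {"source": "anon", "target": "Luda",
     "predicate": "great person", "weight": 1.0},
    {"source": "Zet", "target": "Mira",
     "predicate": "is Solidity developer", "weight": 0.7},
]

def score(target, predicate, min_source_trust=0.5):
    """Sum attestation weights for one predicate, filtered and scaled by source trust."""
    return sum(
        a["weight"] * source_trust[a["source"]]
        for a in attestations
        if a["target"] == target
        and a["predicate"] == predicate                     # predicate filter
        and source_trust[a["source"]] >= min_source_trust   # drop low-trust sources
    )

print(score("Luda", "is Solidity developer"))  # 0.9 -- generic endorsement ignored
```

The generic "great person" claim never touches the score: it fails both the predicate filter and the source threshold.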


Why This Matters for Agents

Agents don’t need generic answers.

They need relevant answers

A hiring agent shouldn’t care about:

  • trading reputation

  • social popularity

It should filter for:

→ dev attestations
→ from credible sources


And this is why I also built a trust lens system into the Intuition MCP.

Each lens = a filtered view of the graph

Agents pick a lens → get only the signal they need
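One way a lens might be modeled (purely illustrative; the MCP's actual trust lens internals aren't shown here) is as a reusable predicate + source filter:

```python
from dataclasses import dataclass

@dataclass
class Lens:
    predicates: set           # which claims count for this view
    min_source_trust: float   # how credible an attester must be

def apply_lens(lens, attestations, source_trust):
    """Return only the attestations visible through this lens."""
    return [
        a for a in attestations
        if a["predicate"] in lens.predicates
        and source_trust.get(a["source"], 0.0) >= lens.min_source_trust
    ]

# Toy data for illustration only.
source_trust = {"Billy": 1.0, "anon": 0.1}
attestations = [
    {"source": "Billy", "target": "Luda",
     "predicate": "is Solidity developer", "weight": 0.9},
    {"source": "anon", "target": "Luda",
     "predicate": "is Solidity developer", "weight": 1.0},
    {"source": "Billy", "target": "Luda",
     "predicate": "reliable trader", "weight": 0.8},
]

hiring_lens = Lens(predicates={"is Solidity developer"}, min_source_trust=0.8)
visible = apply_lens(hiring_lens, attestations, source_trust)
# Only Billy's Solidity attestation survives the lens.
```

A trading agent would swap in a different `Lens` instance over the same graph; the data never changes, only the view.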


Builder Opportunities

This unlocks primitives that feel underexplored:

1. Reputation Query Engines
The query becomes the product

2. Domain Leaderboards
Separate trust per domain – not one global list

3. Predicate Marketplaces
Communities define what trust means

4. Agent-Specific Filters
Each agent defines its own trust logic


Where It Gets Interesting

Combine:

  • predicate filtering

  • graph traversal

  • trust propagation

And you can ask:

“Find Solidity devs trusted by people Billy trusts, weighted by multi-hop trust”


That’s not a score.

That’s logic
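That combined query can be sketched with a toy two-hop propagation. The `trusts` edges and the 0.5 per-hop decay factor are assumptions for illustration:

```python
# Toy trust graph: who trusts whom, and with what weight.
trusts = {"Billy": {"Zet": 0.9}, "Zet": {"Luda": 0.8}}
# Attestations on the target predicate: (attester, dev) -> claim weight.
dev_claims = {("Zet", "Luda"): 0.9}  # Zet: "Luda is a Solidity developer"

def propagated_trust(root, target, decay=0.5, depth=2):
    """Best trust path from root to target within `depth` hops, decayed per hop."""
    best = trusts.get(root, {}).get(target, 0.0)
    if depth > 1:
        for mid, w in trusts.get(root, {}).items():
            best = max(best, w * decay * propagated_trust(mid, target, decay, depth - 1))
    return best

def find_devs(root):
    """Rank devs by (trust in the attester) x (attestation weight)."""
    scores = {}
    for (attester, dev), w in dev_claims.items():
        t = propagated_trust(root, attester)
        if t > 0:
            scores[dev] = scores.get(dev, 0.0) + t * w
    return scores

print(round(find_devs("Billy")["Luda"], 2))  # 0.81
```

Billy never attested anything about Luda directly; the score is pure graph logic, composed from Billy's trust in Zet and Zet's claim about Luda.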


Open Questions

  • Who defines the “right” predicates?

  • Should they standardize or evolve?

  • Can agents learn what matters over time?

  • How do we reduce spam + low-signal attestations?


Final Thought

Reputation shouldn’t be something you check.

It should be something you ask


What’s the first reputation query you’d run on the graph?


This resonates hard. Especially the point about domain leaderboards and predicate filtering as separate primitives.

I’ve been building exactly this — contextual trust scoring where each [Agent] [hasAgentSkill] [Skill] triple has its own vault and its own independent score. So you don’t ask “is this agent trusted” but “is this agent trusted FOR this specific capability.” The overall score becomes a weighted average of domain scores.

Flipped the query too — instead of “what can this agent do” you ask “who is the best agent for this domain” and get a ranked leaderboard per skill. Different domains, different rankings, different top agents.
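In toy form (made-up agents and numbers, not the actual vault data), the per-skill leaderboard and the derived overall score look something like:

```python
# Hypothetical per-domain scores; in practice each would come from its own vault.
domain_scores = {
    "agentA": {"solidity": 0.9, "trading": 0.3},
    "agentB": {"solidity": 0.6, "trading": 0.8},
}

def leaderboard(skill):
    """Who is the best agent FOR this domain?"""
    return sorted(
        ((agent, s[skill]) for agent, s in domain_scores.items() if skill in s),
        key=lambda kv: kv[1], reverse=True,
    )

def overall(agent, weights):
    """Derived global score: weighted average of domain scores."""
    s = domain_scores[agent]
    return sum(s[d] * w for d, w in weights.items()) / sum(weights.values())

print(leaderboard("solidity"))  # [('agentA', 0.9), ('agentB', 0.6)]
print(round(overall("agentA", {"solidity": 2, "trading": 1}), 3))  # 0.7
```

Note the ordering flips per domain: agentA tops solidity, agentB tops trading. The global number is derived, never stored.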

On your open question about reducing low-signal attestations — one approach I’m exploring is accuracy-weighted staking. Your track record as an evaluator determines how much your signal weighs. Consistently back agents that maintain trust → your weight goes up. Back agents that crash → it goes down. It’s a natural spam filter because low-quality evaluators lose influence over time without needing manual moderation.
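A minimal sketch of that update loop, assuming a simple exponential-moving-average rule (the exact rule is still an open design choice on my end):

```python
def update_weight(weight, correct, lr=0.1):
    """Move evaluator weight toward 1.0 on a correct call, toward 0.0 on a bad one."""
    target = 1.0 if correct else 0.0
    return weight + lr * (target - weight)

# A mixed track record: two good calls, one crash, one good call.
w = 0.5
for outcome in [True, True, False, True]:
    w = update_weight(w, outcome)
# Weight ends slightly above where it started; a run of bad calls would
# decay it toward zero with no manual moderation.
```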

Your point about agents picking a lens is interesting. Each lens could map to a predicate + evaluator weight filter. High-accuracy evaluators only, specific domain only → clean signal for agent decision-making.


Out of curiosity how do you guys foresee the atom / triple structure looking? My use case is slightly different since my extension will be focused on who users trust on specific topics like politics, tech, crypto, sports, etc. Our users will browse X.com and be able to apply different lenses (topical trust circles). It’s a bit tricky since a typical X feed includes Tweets from a variety of topics but I can probably handle that by giving each lens a different color, etc.

I originally figured just an I - trust - smilingkylan.eth type triple would work, and we’d be able to extend with a (I - trust - smilingkylan.eth) - for topic - technology but I’m not sure how much value the initial (inner) triple truly has.

So if we go with something like smilingkylan.eth - is trusted for topic - technology that could definitely work. You could still figure out their aggregate trustworthiness by querying the first 2 of 3 atoms of the triple… if that’s of any value.

The big step that needs to be taken if we move forward with a scheme like this is figuring out how to divvy up the different topics. Maybe start off with broad topics (like the ones I mentioned earlier) then let more specific topics pop up in response to community demand? We could theoretically vote on it as a list / ranking as well although any vote is subject to manipulation. Still, such a critical list would likely get a lot of staking on it from whales so maybe they’d be able to drown out any attempt at manipulation.

These are good conversations and it sounds like most of the dapp developers are starting to converge towards a network-wide convention, which is nice.

Edit: another important question is whether you guys expect to use a user’s stake amount as a weight for their trust circle? Weighted scores can be a bit harder for users to audit (i.e. “are claims by my trust circle being weighted correctly by the amount of stake I’ve placed on them?”)

Edit #2: the other implication of this scheme is that new users will start off with empty trust circles, which is a terrible user experience. This means we will need to find a (decentralized) way to get new users a large list of people to trust. I kinda expect different dapps / individuals to put together lists of people they recommend users trust. The cool thing is that with batched staking a new user can follow dozens of EVM accounts as trustworthy for a given topic… but I’m not sure we have much of that infrastructure built within the community just yet.

And how would a dapp like Hive Mind decide which EVM accounts to trust for which topics? Should EVM accounts nominate themselves for given topics? What would that self-nomination process look like? Where would it take place? “Topical trust lists” (or whatever we want to call them) may end up playing a critical role in the future of Intuition. Or maybe the network will just look at how much stake is organically staked onto EVM accounts for specific topics and encourage new users to also trust them for that specific topic? That would probably create an echo effect though :thinking:


Yeah this resonates

The “trusted FOR something” framing is exactly the shift

And the accuracy-weighted staking idea is kinda powerful ngl
It basically makes bad signal decay on its own

The cold start problem feels tricky though

How do you stop early bad actors from shaping the initial weights?

This is a really thoughtful breakdown

I think you’re circling the right problem:
not just how to store trust - but how to make it usable across different contexts


On the triple structure, I’d lean strongly toward:

smilingkylan.eth → trusted_for → technology

instead of nesting like:

(I → trust → you) → for topic → X

The nested version feels harder to reason about and query

If the goal is composability and clean queries, keeping the predicate contextual from the start feels cleaner


The interesting part is you can still recover “global-ish” trust by aggregating across domains

So instead of:

global score → primary

It becomes:

domain scores → primary
global → derived


On topics — yeah this is where it gets tricky

I wouldn’t over-optimize it early

Start with broad domains:

  • tech

  • crypto

  • politics

  • sports

Then let more granular ones emerge on top of that

Trying to standardize too early might actually slow things down


Your point on accuracy-weighted evaluators + stake is :fire:

Feels like two separate axes though:

  • stake = economic weight

  • accuracy = informational weight

I wouldn’t collapse them too quickly

Because high stake ≠ high signal


On the empty trust circle problem - this is the real UX bottleneck imo

Cold start kills everything if not handled properly

Your idea of “topical trust lists” makes sense

But I think the key is:

→ make them forkable + composable

So instead of one canonical list, you get:

  • Billy tech trust list

  • Zet dev trust list

  • curated DAO lists

New users just subscribe to a base layer

Then refine over time


Also agree on the echo chamber risk

If we purely follow “most staked = most trusted”
we’ll just recreate popularity loops

So maybe:

→ discovery should bias diversity
→ not just weight


On your lens idea - I like the direction a lot

Feels like:

lens = predicate + source filter + weighting logic

Different lenses = different “views of trust”


One thing I’m still thinking about:

Should topics be:

  1. fixed primitives (clean, comparable)

  2. or fully emergent (flexible, messy)

Feels like the right answer might be a hybrid


Also curious how you’re thinking about:

→ topic overlap

Like tech vs crypto vs AI

Do you see those as separate graphs or overlapping layers?