Solid Protocol Case Study
Overview
Solid (Social Linked Data) is Tim Berners-Lee's project for decentralized personal data storage. Users keep their data in "pods" (personal online datastores), and apps request permission to access specific data.
Core Architecture: RDF-based data stored in pods, Web Access Control (WAC) for permissions, apps decoupled from storage, and the Linked Data Platform (LDP) for standard operations.
Key Insight: Solid scores 19/30 in our evaluation - meets minimum viability (P1, P6, P8 all adequate) but has significant weaknesses. The guild's skepticism has merit, but the reasons matter.
Architecture Decisions Mapped to Principles
Storage: RDF in Pods (Mutable Personal Data Store)
Implementation:
- Each user has a pod (web server hosting their data)
- All data stored as RDF triples
- Files and resources accessed via HTTP/LDP
- Can self-host or use pod provider
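To make "all data stored as RDF triples" concrete, here is a minimal sketch of the data model in Python. `TinyTripleStore`, the `ex:` identifiers, and the sample notes are hypothetical names invented for illustration; a real pod serves RDF documents over HTTP/LDP rather than exposing an in-memory set.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TinyTripleStore:
    """Hypothetical in-memory stand-in for the RDF data held in a pod."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self._triples.add((s, p, o))

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for t in self._triples:
            if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
                yield t


store = TinyTripleStore()
store.add("ex:note1", "dc:title", "Notes on CRDTs")
store.add("ex:note1", "ex:about", "ex:DistributedSystems")
store.add("ex:note2", "ex:about", "ex:DistributedSystems")

# Every fact is one (subject, predicate, object) statement, so relationships
# are queried the same way as plain attributes.
about_ds = sorted(t[0] for t in store.match(p="ex:about", o="ex:DistributedSystems"))
```

Because relationships and attributes share one shape, adding a new kind of statement needs no schema change, which is the property behind R11 and R12.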
Principle Alignment:
- Strongly supports P1 (Agent Sovereignty) - user owns pod (Score: 2/2)
- Strongly supports P3 (Semantic Richness) - RDF enables formal semantics (Score: 2/2)
- Strongly supports P6 (Interoperability) - RDF is W3C standard (Score: 2/2)
- Fails P9 (Performance Pragmatism) - RDF queries are slow (Score: 0/2)
Requirements Addressed:
- R10 (Tool-independent representation) - RDF is universal
- R11 (Relation preservation) - triples preserve relationships
- R13 (Human-readable export) - Turtle serialization is readable
Requirements Violated:
- R9 (Performance at scale) - SPARQL is prohibitively slow
- R37 (Low-friction integration) - RDF has steep learning curve
Gap Analysis:
- Does NOT address GAP-1 (Temporal Integrity) - no built-in versioning
- Partially addresses GAP-3 (Contextual Access) - WAC enables fine-grained control, but its complexity is a barrier to use
Why the Guild Might Dismiss: Performance is genuinely poor. SPARQL queries on even modest datasets (10k+ triples) are slow. For decades of personal knowledge, this is prohibitive.
What's Actually Good: RDF does provide maximum semantic richness. If performance weren't an issue, the expressiveness is unmatched. The question is whether that expressiveness is worth the cost for personal knowledge.
Related Terms: See glossary-engineering - RDF, Triple Store, SPARQL
Schema: RDF Vocabularies (Centralized Ontologies)
Implementation:
- Use existing vocabularies (schema.org, FOAF, Dublin Core)
- Create custom vocabularies if needed
- Everything is RDF, so vocabularies compose naturally
- No namespace collision (URIs are global)
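The "no namespace collision" point can be sketched as prefix expansion: a short name such as `foaf:name` is only an abbreviation for a full URI. The `expand` helper below is a hypothetical illustration, not part of any Solid library; the namespace URIs are the commonly published ones for schema.org, FOAF, and Dublin Core terms.

```python
# Prefix table mapping short vocabulary names to their global namespace URIs.
PREFIXES = {
    "schema": "https://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def expand(curie: str, prefixes: dict = PREFIXES) -> str:
    """Expand a prefixed name like 'foaf:name' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {curie!r}")
    return prefixes[prefix] + local


foaf_name = expand("foaf:name")
schema_name = expand("schema:name")
# Both vocabularies define "name", but the expanded URIs differ,
# so the two properties can coexist on the same resource.
names_collide = foaf_name == schema_name
```

This also shows where the friction comes from: the two `name` properties never clash, but the author still has to decide which one to use.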
Principle Alignment:
- Strongly supports P3 (Semantic Richness) - formal ontologies (Score: 2/2)
- Supports P6 (Interoperability) - shared vocabularies (Score: 2/2)
- Conflicts with P4 (Schema Pluralism) - assumes centralized ontology (Score: 2/2 but philosophically problematic)
- Conflicts with P5 (Friction Minimization) - requires upfront modeling (Score: 0/2)
Requirements Addressed:
- R12 (Schema evolution without data loss) - RDF accommodates new properties
- R11 (Relation preservation) - relationships are first-class
Why the Guild Might Dismiss: The "choose your ontology" problem. Either:
- Use existing vocabularies (lowest-common-denominator, doesn't fit personal knowledge)
- Create custom vocabulary (now you need to be an ontology engineer)
- Mix vocabularies (complexity explosion)
For personal note-taking, having to think "is this schema:Thing or foaf:Agent or should I create my:Thought?" is friction.
What's Actually Good: If you DO model properly, the interoperability is real. Your data can be consumed by any RDF-aware tool. This is P6 (Interoperability) at its best.
Comparison to atproto:
- atproto Lexicons: Namespaced, versioned, decentralized evolution
- Solid Vocabularies: Global URIs, centralized standards, immediate interoperability
Different tradeoff: atproto optimizes for evolution, Solid for immediate compatibility.
Access Control: Web Access Control (WAC)
Implementation:
- ACL documents specify permissions per resource
- ACLs are themselves RDF
- Can grant read, write, append, control
- Supports inheritance (folder to contents)
- Agent-based (specific users) or class-based (anyone, authenticated users)
Example ACL (Turtle):
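@prefix acl: <http://www.w3.org/ns/auth/acl#> .
# Grants Alice read and write access to a single resource.
<#authorization1>
    a acl:Authorization ;
    acl:agent <https://alice.example/profile#me> ;
    acl:accessTo <mnemegram123> ;
    acl:mode acl:Read, acl:Write .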
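The evaluation logic behind such an ACL can be sketched in Python. This is a simplified model under stated assumptions, not the WAC algorithm as specified: `Authorization`, `effective_rules`, and `is_allowed` are hypothetical names, resources are plain paths, and details such as the relationship between Write and Append, agent groups, and origin checks are left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

PUBLIC = "foaf:Agent"  # agent class meaning "anyone"


@dataclass(frozen=True)
class Authorization:
    modes: FrozenSet[str]
    agents: FrozenSet[str] = frozenset()
    agent_classes: FrozenSet[str] = frozenset()
    access_to: Optional[str] = None  # rule applies to this resource
    default: Optional[str] = None    # rule is inherited by this container's contents


def parent(path: str) -> Optional[str]:
    """Return the containing folder of a path, or None at the root."""
    if path == "/":
        return None
    trimmed = path.rstrip("/")
    return trimmed[: trimmed.rfind("/") + 1]


def effective_rules(resource: str, rules: List[Authorization]) -> List[Authorization]:
    """Use the resource's own rules, else the nearest container's inherited rules."""
    own = [r for r in rules if r.access_to == resource]
    if own:
        return own
    container = parent(resource)
    while container is not None:
        inherited = [r for r in rules if r.default == container]
        if inherited:
            return inherited
        container = parent(container)
    return []  # no rule found anywhere: deny by default


def is_allowed(agent: Optional[str], resource: str, mode: str, rules: List[Authorization]) -> bool:
    for rule in effective_rules(resource, rules):
        if mode not in rule.modes:
            continue
        if agent is not None and agent in rule.agents:
            return True
        if PUBLIC in rule.agent_classes:
            return True
    return False


ALICE = "https://alice.example/profile#me"
rules = [
    Authorization(modes=frozenset({"Read", "Write"}), agents=frozenset({ALICE}),
                  access_to="/notes/mnemegram123"),
    Authorization(modes=frozenset({"Read"}), agent_classes=frozenset({PUBLIC}),
                  default="/public/"),
]
alice_can_write = is_allowed(ALICE, "/notes/mnemegram123", "Write", rules)
stranger_can_read_note = is_allowed(None, "/notes/mnemegram123", "Read", rules)
stranger_can_read_public = is_allowed(None, "/public/essay", "Read", rules)
```

Access is denied unless some rule grants it, which is the behavior the P8 (Protection by Default) score rewards. The lookup walks up the folder hierarchy on every check, which hints at the per-request cost noted under P9.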
Principle Alignment:
- Strongly supports P8 (Protection by Default) - explicit ACLs (Score: 2/2)
- Strongly supports P10 (Contextual Access) - very expressive (Score: 2/2)
- Weakens P9 (Performance) - checking RDF ACLs on every request (Score: 0/2)
Requirements Addressed:
- R5 (Fine-grained access control) - per-resource ACLs
- R6 (Mnemegram-level access) - yes, per-resource
- R9 (Auditable access grants) - ACLs are inspectable RDF
Gap Analysis:
- Addresses GAP-3 (Contextual Access Control) - One of only two systems (with Notion) that score 2/2 on P10
Why the Guild Might Dismiss: WAC is complex. To use Solid effectively, you need to:
- Understand RDF
- Understand LDP (Linked Data Platform)
- Understand ACL model
- Write ACL documents for every resource
- Debug permission issues (SPARQL queries on ACL triples)
This is not "just use your notes." This is "be a semantic web engineer."
What's Actually Good: WAC is one of the most expressive access control systems we evaluated. You can encode almost any per-resource permission rule. The problem is that this expressiveness comes with complexity.
Comparison:
- Solid WAC: Maximum expressiveness, high complexity
- atproto: Simple permissions, limited expressiveness
- Obsidian: No access control (local files)
- Capabilities: Medium expressiveness, UX challenge
Solid chose maximum expressiveness. Reasonable choice IF you have semantic web expertise. Poor choice for general population.
Identity: WebID
Implementation:
- Each user has a WebID (a URI pointing to a profile document)
- Profile is RDF describing user
- WebID used for authentication
- Can be self-hosted or on pod provider
Principle Alignment:
- Supports P1 (Agent Sovereignty) - self-hostable identity (Score: 2/2)
- Supports P6 (Interoperability) - standard profile format (Score: 2/2)
Requirements Addressed:
- R10 (Tool-independent representation) - WebID is URI
- R22 (Decadal maintainability) - web-based, not company-dependent
Note: The Solid community has discussed supporting DIDs (Decentralized Identifiers) alongside WebID. If adopted, this would align with atproto's approach.
Query: SPARQL (The Performance Killer)
Implementation:
- Query pods using SPARQL
- Can federate queries across multiple pods
- Rich graph queries, inference, reasoning
Example Query:
PREFIX memex: <http://example.org/memex#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?thought ?date WHERE {
  ?thought memex:about <#DistributedSystems> .
  ?thought memex:created ?date .
  FILTER (?date > "2024-01-01"^^xsd:date)
}
ORDER BY ?date
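For readers unfamiliar with SPARQL, the query above can be read as a join of two triple patterns on `?thought`, followed by a filter and a sort. The Python sketch below mirrors those steps over a small hand-written dataset; the `run_query` function and the sample triples (`t1` to `t4`) are invented for illustration and say nothing about how a real SPARQL engine plans the query.

```python
from datetime import date

# Sample data: (subject, predicate, object). Identifiers are made up.
triples = [
    ("t1", "memex:about", "#DistributedSystems"),
    ("t1", "memex:created", date(2024, 3, 5)),
    ("t2", "memex:about", "#DistributedSystems"),
    ("t2", "memex:created", date(2023, 11, 20)),
    ("t3", "memex:about", "#Gardening"),
    ("t3", "memex:created", date(2024, 6, 1)),
    ("t4", "memex:about", "#DistributedSystems"),
    ("t4", "memex:created", date(2024, 1, 15)),
]


def run_query(triples, topic, after):
    # Pattern 1: ?thought memex:about <topic>
    about = {s for s, p, o in triples if p == "memex:about" and o == topic}
    # Pattern 2 joined on ?thought, plus the FILTER on ?date
    rows = [
        (s, o)
        for s, p, o in triples
        if p == "memex:created" and s in about and o > after
    ]
    # ORDER BY ?date
    return sorted(rows, key=lambda row: row[1])


results = run_query(triples, "#DistributedSystems", date(2024, 1, 1))
```

Here `results` holds `t4` and then `t1`: `t2` is excluded by the date filter and `t3` by the topic pattern.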
Principle Alignment:
- Maximizes P3 (Semantic Richness) - can reason over data (Score: 2/2)
- Catastrophically fails P9 (Performance Pragmatism) (Score: 0/2)
Why the Guild Should Dismiss (Legitimately): SPARQL performance at scale is genuinely bad. For personal knowledge spanning decades:
- 100k triples: Slow
- 1M triples: Very slow
- 10M triples: Unusable
atproto has millions of records per active user and performs well. Solid can't handle this scale.
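The Fundamental Problem: RDF triple stores are not optimized for the query patterns personal knowledge requires. Graph databases (Neo4j) are much faster for graph traversal because they store each node's relationships directly (index-free adjacency) rather than resolving every hop through index lookups and joins.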
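The difference in access pattern for graph traversal can be illustrated with a toy comparison. This is a sketch, not a benchmark: the `edges` data and both functions are hypothetical, and the scanning version overstates the gap because real triple stores maintain indexes rather than scanning every triple. It only shows why following stored neighbour lists does less work per hop than re-matching a pattern per hop.

```python
from collections import defaultdict, deque

edges = [
    ("note:a", "links", "note:b"),
    ("note:b", "links", "note:c"),
    ("note:c", "links", "note:d"),
    ("note:a", "links", "note:e"),
    ("note:x", "links", "note:y"),
]


def reachable_by_scanning(triples, start, predicate):
    """Traverse by re-matching a triple pattern for every visited node."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:  # one pass over all triples per hop
            if s == node and p == predicate and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen - {start}


def build_adjacency(triples, predicate):
    """Precompute each node's outgoing neighbours once."""
    adjacency = defaultdict(list)
    for s, p, o in triples:
        if p == predicate:
            adjacency[s].append(o)
    return adjacency


def reachable_by_adjacency(adjacency, start):
    """Traverse by following each node's stored neighbour list."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {start}


via_scan = reachable_by_scanning(edges, "note:a", "links")
via_adjacency = reachable_by_adjacency(build_adjacency(edges, "links"), "note:a")
```

Both functions return the same set of notes; only the amount of work per hop differs.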
What Would Fix This:
- Use property graph database instead of triple store
- Add SPARQL → Cypher translation layer
- But then you're not really using RDF anymore
This Is The Real Reason To Dismiss Solid: Not the complexity (that can be abstracted). Not the friction (that can be reduced). The performance ceiling is too low for personal data operations at scale.
System Evaluation Summary
Overall Score: 19/30 (63%)
Essential Principles (P1, P6, P8): 6/6 - Meets minimum viability
Detailed Scores:
| Principle | Score | Rationale |
|-----------|-------|-----------|
| P1: Agent Sovereignty | 2/2 | Full pod ownership, self-hostable |
| P2: Temporal Integrity | 0/2 | No built-in versioning |
| P3: Semantic Richness | 2/2 | RDF enables formal semantics |
| P4: Schema Pluralism | 2/2 | RDF accommodates any schema |
| P5: Friction Minimization | 0/2 | High learning curve |
| P6: Interoperability | 2/2 | W3C standards throughout |
| P7: Collective Possibility | 1/2 | Can share but complex |
| P8: Protection by Default | 2/2 | WAC fine-grained ACLs |
| P9: Performance Pragmatism | 0/2 | SPARQL too slow |
| P10: Contextual Access | 2/2 | WAC very expressive |
| P11: Proactive Surfacing | 0/2 | No surfacing mechanisms |
| P12: Provenance Traceability | 1/2 | Can model in RDF, not automatic |
| P13: Heterogeneous Integration | 2/2 | RDF handles any data type |
| P14: Longevity Over Features | 2/2 | W3C standard, open |
| P15: Graceful Degradation | 1/2 | Can work offline but complex |
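The summary figures can be rechecked from the per-principle scores. The short Python sketch below recomputes the total, the percentage, the essential-principle subtotal, and the list of zero-scored principles; the variable names are illustrative only.

```python
# Per-principle scores copied from the Detailed Scores table (each out of 2).
scores = {
    "P1": 2, "P2": 0, "P3": 2, "P4": 2, "P5": 0,
    "P6": 2, "P7": 1, "P8": 2, "P9": 0, "P10": 2,
    "P11": 0, "P12": 1, "P13": 2, "P14": 2, "P15": 1,
}
MAX_PER_PRINCIPLE = 2
ESSENTIAL = ("P1", "P6", "P8")

total = sum(scores.values())
maximum = MAX_PER_PRINCIPLE * len(scores)
percent = round(100 * total / maximum)
essential_total = sum(scores[p] for p in ESSENTIAL)

# Principles scored 0/2, ordered by principle number.
zeros = sorted((p for p, s in scores.items() if s == 0), key=lambda p: int(p[1:]))

print(f"{total}/{maximum} ({percent}%), essential {essential_total}/{MAX_PER_PRINCIPLE * len(ESSENTIAL)}")
print("zero-scored:", ", ".join(zeros))
```

The zero-scored principles (P2, P5, P9, P11) are exactly the four critical weaknesses discussed in this evaluation.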
Strengths:
- Maximum semantic richness (P3: 2/2)
- Best access control (P10: 2/2)
- True interoperability (P6: 2/2)
- Agent sovereignty (P1: 2/2)
- W3C standard (P14: 2/2)
Critical Weaknesses:
- Performance (P9: 0/2) - This is the killer
- No temporal integrity (P2: 0/2) - Gap-1 unaddressed
- High friction (P5: 0/2) - Semantic web expertise required
- No proactive surfacing (P11: 0/2) - Query-only
Should the Guild Dismiss Solid?
Legitimate Reasons to Dismiss:
- Performance is genuinely inadequate (P9: 0/2)
  - SPARQL doesn't scale to personal knowledge volumes
  - This isn't fixable without abandoning RDF
  - For decades of daily capture, this is disqualifying
- No temporal integrity (GAP-1 not addressed)
  - Versioning is external (git or similar)
  - Core weakness for reflective knowledge work
- Complexity barrier prevents adoption
  - Requires semantic web expertise
  - Friction kills actual use
  - Designed by/for researchers, not general population
Illegitimate Reasons to Dismiss:
- "RDF is overkill" - Maybe, but semantic richness IS valuable
  - The problem is performance cost, not expressiveness itself
  - If fast RDF existed, it would be excellent for personal knowledge
- "Too complex" - Complexity can be abstracted
  - Apps can hide RDF from users
  - WAC could have better UX
  - This is solvable with better tooling
- "Nobody uses it" - Chicken-egg problem
  - Low adoption partially due to performance
  - Also due to lack of compelling apps
  - Network effects matter, but low adoption doesn't invalidate the architecture
Nuanced Position:
Solid's architecture makes principled choices:
- Maximize semantic richness → RDF
- Maximize expressiveness → WAC
- Maximize interoperability → W3C standards
These are GOOD choices for certain values. The problem is that the performance tradeoff is too severe for personal data operations at scale.
The Correct Dismissal: "Solid's RDF architecture cannot achieve the performance needed for personal data operations at scale (P9). While its semantic richness (P3) and access control (P10) are excellent, SPARQL's performance ceiling makes it unsuitable for decades of knowledge. We need Solid's expressiveness with Neo4j's performance."
Not: "Solid is overengineered academic nonsense."
But: "Solid made a reasonable bet on RDF that performance data doesn't support."
What Personal Data Ops Can Learn from Solid
Adopt from Solid:
- Fine-grained access control philosophy (even if not WAC specifically)
- Semantic richness as goal (even if not RDF specifically)
- Pod model (user-owned storage, app separation)
- Agent sovereignty as non-negotiable
- Standards-based interoperability
Avoid from Solid:
- Triple stores for primary storage (too slow)
- SPARQL for queries (too slow)
- Requiring users to understand ontologies
- Complexity without abstraction layer
The Hybrid Approach: What if you had:
- Solid's pod model (user-owned storage)
- Solid's WAC philosophy (fine-grained contextual access)
- Property graphs instead of RDF (faster queries)
- Capabilities instead of ACL documents (simpler UX)
- Lexicons instead of ontologies (easier evolution)
You'd get Solid's benefits without its performance penalty.
Comparison with atproto
| Dimension | Solid | atproto | Winner |
|---|---|---|---|
| Semantic Richness | RDF (maximum) | Lexicons (good) | Solid |
| Performance | Poor (SPARQL) | Good (custom indexes) | atproto |
| Access Control | Excellent (WAC) | Basic (app-level) | Solid |
| Temporal Integrity | None | Excellent (commits) | atproto |
| Interoperability | Maximum (RDF) | Good (open protocol) | Solid |
| Complexity | Very high | Medium | atproto |
| Adoption | Very low | Growing | atproto |
Synthesis: Neither is perfect for personal data operations:
- Solid: Right goals, wrong implementation (performance)
- atproto: Right implementation, wrong goals (social not knowledge)
Ideal system: atproto's architecture (commits, performance) + Solid's goals (semantics, access control) + neither's current implementation.
Open Questions
- Can property graphs provide Solid-level semantics with better performance?
- Could SPARQL be optimized for personal knowledge query patterns?
- Is there an "RDF without the slowness" solution?
- Can Solid's pod model work with non-RDF storage?
- Would capabilities provide WAC's expressiveness with better UX?
Conclusion: Was the Guild Right to Dismiss Solid?
Short answer: Partially.
Long answer: The guild is right that Solid, as currently implemented, is not suitable for personal data operations at scale. The performance issues (P9: 0/2) are genuine and disqualifying.
However, Solid's design philosophy is sound:
- Agent sovereignty (P1)
- Semantic richness (P3)
- Fine-grained access control (P10)
- Interoperability (P6)
The mistake was betting on RDF/SPARQL for the implementation layer. The principles are correct; the technology stack can't deliver the performance.
Recommendation: Don't dismiss Solid's GOALS. Dismiss Solid's IMPLEMENTATION (specifically RDF triple stores and SPARQL).
Build something with:
- Solid's pod model and sovereignty principles
- atproto's commit model for temporal integrity
- Property graphs (not RDF) for semantic richness with performance
- Capabilities (not WAC documents) for access control
- Lexicons (not ontologies) for schema evolution
Final Assessment: Solid is a noble failure. It aimed for the right goals but chose an implementation stack that cannot deliver the performance required. The guild should learn from Solid's principles while avoiding its architectural choices.
References
Protocol Documentation:
- Main docs: https://solidproject.org/
- Specification: https://solidproject.org/TR/protocol
- WAC: https://solidproject.org/TR/wac
Related Analysis:
- system-evaluation - Full scoring
- gap-analysis - GAP-3 (Solid is one of two adequate solutions)
- access-control-models - WAC in depth
- query-approaches - SPARQL performance issues
- atproto-analysis - Comparison system