vault backup: 2026-02-27 10:49:39
453
Test/EventKit/04-FederationArchitecture.md
Normal file
@@ -0,0 +1,453 @@
---
tags:
  - eventkit
---

# Federation Architecture

## Purpose

Enable multiple EventKit instances to communicate securely, allowing cross-company workflows like [[01-InventoryManagement#Sub-hire Management|sub-hire tracking]], equipment availability queries, and cross-boundary [[05-BarcodeAndQRScanning|asset tracking]], while each company retains full sovereignty over its own data.

---

## Design Principles

| Principle                  | Description                                                                           |
| -------------------------- | ------------------------------------------------------------------------------------- |
| **Data sovereignty**       | Each instance owns its data. Nothing is shared without explicit configuration         |
| **Opt-in federation**      | Federation is optional. An instance works fully standalone                            |
| **Mutual trust**           | Both parties must agree to federate. Either can revoke at any time                    |
| **Minimal exposure**       | Only share what's needed. Internal events, clients, crew, and costs are never exposed |
| **Cryptographic identity** | Instances authenticate via key pairs and mutual TLS                                   |
| **Eventual consistency**   | Federation is asynchronous, so network outages don't break local operations           |
| **Hosting agnostic**       | Federation works identically regardless of where or how an instance is hosted         |

---

## Deployment & Hosting Models

Federation operates at the **application protocol level**, not the infrastructure level. This means instances federate identically whether they are self-hosted, on dedicated servers, or sharing infrastructure via a managed hosting provider.

### Three Hosting Tiers

```mermaid
graph TB
    subgraph "Managed SaaS Provider"
        subgraph "Company A (Tenant)"
            A_App["App A"]
            A_DB[(DB A)]
        end
        subgraph "Company B (Tenant)"
            B_App["App B"]
            B_DB[(DB B)]
        end
    end

    subgraph "Dedicated Hosting Provider"
        subgraph "Company C (Isolated VM)"
            C_App["App C"]
            C_DB[(DB C)]
        end
    end

    subgraph "Company D (Self-Hosted)"
        D_App["App D"]
        D_DB[(DB D)]
    end

    A_App <-->|"Federation"| B_App
    A_App <-->|"Federation"| C_App
    B_App <-->|"Federation"| D_App
    C_App <-->|"Federation"| D_App
```

| Tier                 | Model                                                                                   | Data Isolation                      | Best For                                                    |
| -------------------- | --------------------------------------------------------------------------------------- | ----------------------------------- | ----------------------------------------------------------- |
| **Managed SaaS**     | Multiple companies on shared infrastructure, each with their own database and subdomain | Logical (separate DBs, same server) | Small companies wanting zero ops overhead                   |
| **Dedicated Hosted** | Isolated instance on a hosting provider's infrastructure (own VM/container)             | Physical (separate VMs)             | Mid-size companies wanting isolation without self-hosting   |
| **Self-Hosted**      | Company runs their own server on their own infrastructure                               | Full physical                       | Large companies or those with strict data sovereignty needs |

### How It Works Per Tier

#### Managed SaaS

- A hosting provider runs many EventKit instances on shared infrastructure
- Each company gets their own **subdomain** (e.g. `companya.eventkit.io`, `companyb.eventkit.io`)
- Each company has a **separate database**, so data is never mixed
- Companies on the **same provider** federate through the same protocol as external instances
- The provider can optionally optimise same-host federation (internal routing instead of public internet), but the protocol remains identical
- Think of it like **email hosting**: Gmail hosts millions of accounts, but they all speak SMTP to each other and to self-hosted mail servers

#### Dedicated Hosted

- A hosting provider runs a **dedicated VM or container** per company
- Full resource isolation: own CPU, memory, and storage
- Company gets a **custom domain** or provider subdomain
- Provider handles updates, backups, and monitoring
- Federates identically to self-hosted instances

#### Self-Hosted

- Company downloads EventKit and runs it on their own infrastructure
- Full control over data, updates, and configuration
- Requires technical staff to manage
- Federates with any other instance regardless of how it's hosted

### Federation Transparency

A critical design property: **no instance can tell how another instance is hosted**. The federation protocol is the same for every pairing:

| Scenario                                  | Federation Behaviour                                                    |
| ----------------------------------------- | ----------------------------------------------------------------------- |
| SaaS tenant ↔ SaaS tenant (same provider) | Standard federation protocol (provider may optimise routing internally) |
| SaaS tenant ↔ Self-hosted                 | Standard federation protocol over the internet                          |
| Self-hosted ↔ Self-hosted                 | Standard federation protocol over the internet                          |
| Dedicated hosted ↔ SaaS tenant            | Standard federation protocol                                            |

### Hosting Provider Responsibilities

If a hosting provider offers managed EventKit instances, their responsibilities include:

| Responsibility       | Description                                                        |
| -------------------- | ------------------------------------------------------------------ |
| **Provisioning**     | Spin up new instances with a domain, database, and TLS certificate |
| **Isolation**        | Ensure tenant data is never accessible to other tenants            |
| **Updates**          | Roll out EventKit updates across managed instances                 |
| **Backups**          | Automated backups per tenant with restore capability               |
| **Monitoring**       | Health checks, uptime monitoring, alerting                         |
| **TLS certificates** | Manage certificates for federation (mTLS) and web access           |
| **DNS**              | Manage subdomains or custom domain mapping                         |

### Multi-Tenant vs. Multi-Instance (Provider Implementation)

Hosting providers can choose their internal architecture:

| Approach                     | Description                                     | Pros                                | Cons                                      |
| ---------------------------- | ----------------------------------------------- | ----------------------------------- | ----------------------------------------- |
| **Multi-instance**           | One container/VM + database per tenant          | Strongest isolation, simple ops     | Higher resource cost per tenant           |
| **Shared app, separate DBs** | One application pool, one database per tenant   | Lower resource cost, easier updates | Moderate isolation                        |
| **True multi-tenant**        | One application, one database, tenant ID column | Lowest cost                         | Weakest isolation, complex access control |

**Recommended for providers**: Multi-instance (containerised) for the best balance of isolation and operability. Each tenant gets a Docker container + PostgreSQL database, managed via orchestration (e.g. Kubernetes).

---

## Instance Architecture

```mermaid
graph TB
    subgraph "Single Instance (any hosting tier)"
        A_Core["Core Application"]
        A_Fed["Federation Service"]
        A_DB[(Database)]
        A_Core --> A_Fed
        A_Core --> A_DB
    end

    subgraph "Partner Instance (any hosting tier)"
        B_Core["Core Application"]
        B_Fed["Federation Service"]
        B_DB[(Database)]
        B_Core --> B_Fed
        B_Core --> B_DB
    end

    A_Fed <-->|"mTLS + Signed Requests"| B_Fed
```

### Components

| Component              | Role                                                                                       |
| ---------------------- | ------------------------------------------------------------------------------------------ |
| **Core Application**   | The main EventKit instance; handles all local operations                                   |
| **Federation Service** | A dedicated service/module that handles all inter-instance communication                   |
| **Trust Store**        | Stores public keys and connection details for all trusted partner instances                |
| **Outbox / Inbox**     | Queue for outbound and inbound federation messages (ensures delivery even during downtime) |

---

## Trust Establishment

### Handshake Flow

```mermaid
sequenceDiagram
    participant A as Instance A
    participant B as Instance B

    Note over A: Admin initiates federation request
    A->>B: POST /federation/request
    Note right of A: {instance_url, public_key, company_name, contact_email}

    B-->>A: 202 Accepted (pending review)

    Note over B: Admin reviews request
    Note over B: Admin approves

    B->>A: POST /federation/confirm
    Note right of B: {signed_challenge, public_key, permissions_granted}

    A-->>B: 200 OK
    Note over A,B: ✅ Federation established
```
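
The receiving side of the handshake above can be sketched as a small state machine. This is a minimal illustration, not EventKit's implementation: `TrustState`, `PartnerRecord`, and `TrustStore` are hypothetical names, storage is in memory, and verifying the `signed_challenge` against the partner's public key is deliberately left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class TrustState(Enum):
    PENDING = "pending"
    ESTABLISHED = "established"
    REVOKED = "revoked"


@dataclass
class PartnerRecord:
    # Fields mirror the request payload shown in the handshake diagram.
    instance_url: str
    public_key: str
    company_name: str
    contact_email: str
    state: TrustState = TrustState.PENDING
    permissions: list = field(default_factory=list)


class TrustStore:
    """In-memory stand-in for the Trust Store component (assumption)."""

    def __init__(self) -> None:
        self._partners = {}

    def receive_request(self, record: PartnerRecord) -> int:
        # Store the request for admin review and answer 202 Accepted.
        record.state = TrustState.PENDING
        self._partners[record.instance_url] = record
        return 202

    def approve(self, instance_url: str, permissions: list) -> PartnerRecord:
        # Only a pending request can be approved by an admin.
        record = self._partners[instance_url]
        if record.state is not TrustState.PENDING:
            raise ValueError(f"cannot approve a request in state {record.state.value}")
        record.state = TrustState.ESTABLISHED
        record.permissions = list(permissions)
        return record

    def revoke(self, instance_url: str) -> None:
        # Either party may revoke at any time.
        self._partners[instance_url].state = TrustState.REVOKED

    def state_of(self, instance_url: str) -> TrustState:
        return self._partners[instance_url].state
```

Keeping the pending state explicit means a request that was never reviewed can never be treated as trusted by accident.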

### Trust Levels

Companies can grant different permission levels to each partner:

| Level            | Permissions                                                   |
| ---------------- | ------------------------------------------------------------- |
| **Basic**        | View company profile only                                     |
| **Availability** | Query equipment availability for date ranges                  |
| **Sub-hire**     | Request and track sub-hire transactions                       |
| **Full**         | All above + cross-instance asset scanning + condition history |

Trust levels can be **changed or revoked** at any time by either party.
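
One way to enforce these levels is an ordered enum with a per-operation minimum. The sketch below assumes the levels are cumulative (the table only states this explicitly for **Full**), and the operation names in `REQUIRED_LEVEL` are hypothetical labels, not identifiers from EventKit.

```python
from enum import IntEnum
from typing import Optional


class TrustLevel(IntEnum):
    # Ordering is an assumption: each level includes everything below it.
    BASIC = 1
    AVAILABILITY = 2
    SUBHIRE = 3
    FULL = 4


# Hypothetical operation names mapped to the minimum level that permits them.
REQUIRED_LEVEL = {
    "view_profile": TrustLevel.BASIC,
    "query_availability": TrustLevel.AVAILABILITY,
    "request_subhire": TrustLevel.SUBHIRE,
    "scan_asset": TrustLevel.FULL,
    "view_condition_history": TrustLevel.FULL,
}


def is_allowed(granted: Optional[TrustLevel], operation: str) -> bool:
    """Return True if a partner holding `granted` may perform `operation`."""
    if granted is None:  # trust revoked or never established
        return False
    required = REQUIRED_LEVEL.get(operation)
    if required is None:  # unknown operations are denied by default
        return False
    return granted >= required
```

Denying unknown operations by default keeps the check aligned with the minimal-exposure principle.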

### Sub-hire Pricing

Sub-hire prices are **negotiated per transaction**, out-of-band (phone, email). The federation protocol tracks equipment movement, not pricing. Each instance records its own costs:

| Party        | Records                                                        |
| ------------ | -------------------------------------------------------------- |
| **Lender**   | Revenue from sub-hire, return deadline, condition at dispatch  |
| **Borrower** | Day rate / agreed cost, event allocation, condition at receipt |

### Asset Ownership Transfer

Assets can be **permanently transferred** between instances (e.g. selling equipment to a partner):

| Step | Action                                                                            |
| ---- | --------------------------------------------------------------------------------- |
| 1    | Seller initiates transfer request via federation                                  |
| 2    | Buyer accepts; asset data (history, maintenance) is transferred                   |
| 3    | Asset UUID remains the same; the QR code URL is updated to point at the new instance |
| 4    | Seller's instance marks asset as "Transferred" (end of local history)             |
| 5    | Buyer's instance creates asset with full imported history                         |

### Federation Versioning

The federation protocol relies on **Protobuf's built-in backward compatibility**:

- New fields are **additive**, so older instances ignore fields they do not recognise
- Deprecated fields are **never removed**, only marked obsolete
- Breaking changes are **documented** and communicated to partners
- There is no formal version negotiation during the handshake, which keeps the protocol simple

### Dispute Resolution

Equipment condition disputes are handled **out-of-band** (phone, email). EventKit provides supporting evidence:

| Feature                    | Description                                            |
| -------------------------- | ------------------------------------------------------ |
| Timestamped condition logs | Before/after condition recorded at dispatch and return |
| Photo evidence             | Optional photos attached to condition logs             |
| Audit trail                | Full history of who scanned what and when              |
| Dispute handling           | Out-of-band; there is no formal in-app dispute workflow |

### Federation Downtime Handling

| Behaviour              | Description                                                            |
| ---------------------- | ---------------------------------------------------------------------- |
| **Outbox queue**       | Pending requests queued when partner is offline                        |
| **Auto-retry**         | Automatic retry when partner comes back online                         |
| **Admin notification** | Notify admin after a configurable number of hours of failed retries    |
| **Manual retry**       | Option to manually retry or cancel queued requests                     |
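
The four behaviours in the table can be sketched together as a small outbox. This is an illustration under stated assumptions: `Outbox` and `QueuedMessage` are hypothetical names, the `send` callable stands in for the real transport and returns `True` on delivery, and the clock is passed in as `now` (seconds) so the retry logic is easy to test.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class QueuedMessage:
    partner: str
    payload: str
    first_failed_at: Optional[float] = None
    alerted: bool = False


class Outbox:
    def __init__(self, send: Callable[[str, str], bool], notify_after_hours: float = 4.0):
        self._send = send
        self._notify_after = notify_after_hours * 3600
        self.pending: List[QueuedMessage] = []
        self.admin_alerts: List[str] = []

    def enqueue(self, partner: str, payload: str) -> None:
        self.pending.append(QueuedMessage(partner, payload))

    def flush(self, now: float) -> int:
        """Try each queued message once (auto or manual retry); return the delivered count."""
        delivered = 0
        remaining = []
        offline = set()  # partners that failed in this pass; keeps per-partner order
        for msg in self.pending:
            if msg.partner not in offline and self._send(msg.partner, msg.payload):
                delivered += 1
                continue
            offline.add(msg.partner)
            if msg.first_failed_at is None:
                msg.first_failed_at = now
            if not msg.alerted and now - msg.first_failed_at >= self._notify_after:
                msg.alerted = True  # alert the admin once per message
                self.admin_alerts.append(f"{msg.partner}: message undelivered past threshold")
            remaining.append(msg)
        self.pending = remaining
        return delivered

    def cancel(self, partner: str) -> int:
        """Manually drop all queued messages for a partner; return how many were dropped."""
        before = len(self.pending)
        self.pending = [m for m in self.pending if m.partner != partner]
        return before - len(self.pending)
```

Once one message to a partner fails, later messages to the same partner are held back in the same pass, so delivery order per partner is preserved.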

### Federation Data Visibility

How much partner inventory data is visible is **configurable per trust level**:

| Trust Level  | Visibility                                              |
| ------------ | ------------------------------------------------------- |
| **Minimal**  | Real-time availability queries only                     |
| **Standard** | Cached summary (category counts, general availability)  |
| **Full**     | Product catalogue shared (not individual asset details) |

---

## Global Asset Identity

Every asset needs a globally unique identifier that works across instances.

### ID Format

```
UUID:    550e8400-e29b-41d4-a716-446655440000
URN:     urn:eventkit:{instance-id}:asset:{uuid}
QR URL:  https://inventory.company-a.com/asset/550e8400-e29b-41d4-a716-446655440000
```

| Component       | Purpose                                                                           |
| --------------- | --------------------------------------------------------------------------------- |
| **UUID**        | Canonical, globally unique identifier; generated locally, no coordination needed  |
| **Instance ID** | Identifies the owning instance; part of the URN used for federated lookups        |
| **QR URL**      | Points to the asset's page on the owning instance; works with any phone camera    |
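
The three forms above can be built and parsed with the standard library. In this sketch the helper names (`build_urn`, `parse_urn`, `qr_url`) are hypothetical, and it assumes an instance ID never contains a colon, which the document does not specify.

```python
import re
import uuid
from typing import NamedTuple


class AssetRef(NamedTuple):
    instance_id: str
    asset_uuid: uuid.UUID


# Assumption: instance IDs contain no ':' so the URN splits unambiguously.
_URN_RE = re.compile(
    r"^urn:eventkit:(?P<instance>[^:]+):asset:(?P<uuid>[0-9a-fA-F-]{36})$"
)


def build_urn(instance_id: str, asset_uuid: uuid.UUID) -> str:
    return f"urn:eventkit:{instance_id}:asset:{asset_uuid}"


def parse_urn(urn: str) -> AssetRef:
    match = _URN_RE.match(urn)
    if match is None:
        raise ValueError(f"not an EventKit asset URN: {urn!r}")
    # uuid.UUID raises ValueError if the 36 characters are not a valid UUID.
    return AssetRef(match["instance"], uuid.UUID(match["uuid"]))


def qr_url(base_url: str, asset_uuid: uuid.UUID) -> str:
    # The QR code points at the asset page on the owning instance.
    return f"{base_url.rstrip('/')}/asset/{asset_uuid}"
```

Because the UUID is generated locally (for example with `uuid.uuid4()`), no instance needs to coordinate with another to mint a new asset ID.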

### Asset Resolution

When scanning an asset that belongs to another instance:

```mermaid
sequenceDiagram
    participant Scanner as Company B (Scanner)
    participant B_API as Company B Instance
    participant A_API as Company A Instance

    Scanner->>B_API: Scan QR → URL: company-a.com/asset/{uuid}
    B_API->>B_API: URL domain ≠ local → federated asset
    B_API->>A_API: GET /federation/asset/{uuid}
    A_API->>A_API: Check: Is Company B a trusted partner?
    A_API-->>B_API: 200 OK {asset_details, status, condition}
    B_API-->>Scanner: Display asset info + "Owned by Company A"
```
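
The "URL domain ≠ local" decision in the diagram can be sketched as a pure function. This is an illustration only: `classify_scan` and `ScanResult` are hypothetical names, and it assumes the scanned QR URL carries a scheme and the `/asset/{uuid}` path shape shown under ID Format.

```python
import uuid
from typing import NamedTuple
from urllib.parse import urlparse


class ScanResult(NamedTuple):
    asset_uuid: uuid.UUID
    owner_host: str
    federated: bool  # True when the asset lives on another instance


def classify_scan(scanned_url: str, local_host: str) -> ScanResult:
    """Decide whether a scanned QR URL refers to a local or a federated asset."""
    parsed = urlparse(scanned_url)
    host = (parsed.hostname or "").lower()
    parts = [segment for segment in parsed.path.split("/") if segment]
    if not host or len(parts) != 2 or parts[0] != "asset":
        raise ValueError(f"not an asset URL: {scanned_url!r}")
    asset_uuid = uuid.UUID(parts[1])  # raises ValueError on a malformed UUID
    return ScanResult(asset_uuid, host, federated=(host != local_host.lower()))
```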

---

## Federation API (gRPC via ConnectRPC)

Federation between instances uses **gRPC over HTTP/2** with mTLS. The API is defined in Protocol Buffers and served via ConnectRPC, meaning it also supports the Connect protocol (HTTP/1.1 + JSON) for debugging and testing.

### Protobuf Service Definition

```protobuf
// proto/eventkit/v1/federation.proto
syntax = "proto3";
package eventkit.v1;

service FederationService {
  // Trust management
  rpc RequestFederation(FederationRequest) returns (FederationRequestResponse);
  rpc ConfirmFederation(FederationConfirm) returns (FederationConfirmResponse);
  rpc RevokeFederation(RevokeFederationRequest) returns (RevokeFederationResponse);
  rpc ListPartners(ListPartnersRequest) returns (ListPartnersResponse);

  // Availability
  rpc QueryAvailability(AvailabilityQuery) returns (AvailabilityResponse);
  rpc QueryProductAvailability(ProductAvailabilityQuery) returns (ProductAvailabilityResponse);

  // Sub-hire
  rpc RequestSubhire(SubhireRequest) returns (SubhireResponse);
  rpc AcceptSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc RejectSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc DispatchSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ReceiveSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ReturnSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ListSubhires(ListSubhiresRequest) returns (ListSubhiresResponse);

  // Asset resolution
  rpc GetAsset(GetFederatedAssetRequest) returns (FederatedAsset);
  rpc ReportAssetCheckin(AssetCheckinReport) returns (AssetReportResponse);
  rpc ReportAssetCheckout(AssetCheckoutReport) returns (AssetReportResponse);
  rpc ReportAssetDamage(AssetDamageReport) returns (AssetReportResponse);

  // Real-time event stream (server-streaming)
  rpc SubscribeEvents(SubscribeEventsRequest) returns (stream FederationEvent);
}
```

### RPC Reference

| Category         | RPC Method                 | Purpose                                               |
| ---------------- | -------------------------- | ----------------------------------------------------- |
| **Trust**        | `RequestFederation`        | Initiate federation handshake with a partner instance |
| **Trust**        | `ConfirmFederation`        | Accept an incoming federation request                 |
| **Trust**        | `RevokeFederation`         | Revoke trust and cancel all pending operations        |
| **Trust**        | `ListPartners`             | List current federation partners (local)              |
| **Availability** | `QueryAvailability`        | Query available equipment for a date range            |
| **Availability** | `QueryProductAvailability` | Check specific product availability                   |
| **Sub-hire**     | `RequestSubhire`           | Request equipment from a partner                      |
| **Sub-hire**     | `AcceptSubhire`            | Accept a sub-hire request                             |
| **Sub-hire**     | `RejectSubhire`            | Reject a sub-hire request                             |
| **Sub-hire**     | `DispatchSubhire`          | Notify partner that equipment has been dispatched     |
| **Sub-hire**     | `ReceiveSubhire`           | Confirm receipt of sub-hired equipment                |
| **Sub-hire**     | `ReturnSubhire`            | Notify partner that equipment is being returned       |
| **Sub-hire**     | `ListSubhires`             | List all active sub-hire transactions                 |
| **Asset**        | `GetAsset`                 | Get asset details (respects trust level)              |
| **Asset**        | `ReportAssetCheckin`       | Notify owner: "We received your asset"                |
| **Asset**        | `ReportAssetCheckout`      | Notify owner: "We're returning your asset"            |
| **Asset**        | `ReportAssetDamage`        | Report damage on a partner's asset                    |
| **Events**       | `SubscribeEvents`          | Server-streaming RPC for real-time federation events  |
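
The sub-hire RPCs above imply a lifecycle, which can be sketched as a transition table. The state names and the strict ordering (request, accept or reject, dispatch, receive, return) are assumptions read off the RPC names; the document does not define the states formally.

```python
from enum import Enum


class SubhireState(Enum):
    REQUESTED = "requested"
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    DISPATCHED = "dispatched"
    RECEIVED = "received"
    RETURNED = "returned"


# (RPC name, current state) -> next state. RequestSubhire creates the
# transaction in REQUESTED, so it has no entry here.
TRANSITIONS = {
    ("AcceptSubhire", SubhireState.REQUESTED): SubhireState.ACCEPTED,
    ("RejectSubhire", SubhireState.REQUESTED): SubhireState.REJECTED,
    ("DispatchSubhire", SubhireState.ACCEPTED): SubhireState.DISPATCHED,
    ("ReceiveSubhire", SubhireState.DISPATCHED): SubhireState.RECEIVED,
    ("ReturnSubhire", SubhireState.RECEIVED): SubhireState.RETURNED,
}


def apply_action(state: SubhireState, rpc: str) -> SubhireState:
    """Return the state after `rpc`, or raise if the action is out of order."""
    try:
        return TRANSITIONS[(rpc, state)]
    except KeyError:
        raise ValueError(f"{rpc} is not valid in state {state.value}") from None
```

Rejecting out-of-order actions on the receiving side means a replayed or delayed message cannot move a transaction backwards.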

### Federation Event Stream

Instead of webhooks, federation uses a **gRPC server-streaming RPC** for real-time notifications. Compared with webhooks, a long-lived stream gives the subscriber flow control (backpressure) and a natural point to resume from after a reconnect, and it fits the rest of the gRPC API:

```protobuf
// Requires `import "google/protobuf/timestamp.proto";` at the top of federation.proto.
message FederationEvent {
  string event_id = 1;
  google.protobuf.Timestamp timestamp = 2;
  oneof event {
    SubhireRequestedEvent subhire_requested = 10;
    SubhireAcceptedEvent subhire_accepted = 11;
    SubhireDispatchedEvent subhire_dispatched = 12;
    SubhireReceivedEvent subhire_received = 13;
    SubhireReturnedEvent subhire_returned = 14;
    AssetDamagedEvent asset_damaged = 15;
  }
}
```

| Event                | Trigger                          |
| -------------------- | -------------------------------- |
| `subhire_requested`  | New sub-hire request received    |
| `subhire_accepted`   | Sub-hire request accepted        |
| `subhire_dispatched` | Equipment dispatched to partner  |
| `subhire_received`   | Equipment received by partner    |
| `subhire_returned`   | Equipment returned to owner      |
| `asset_damaged`      | Asset damage reported by partner |

> [!NOTE]
> The event stream reconnects automatically after a disconnection. Events are persisted in the outbox and replayed from the last acknowledged sequence number, so no events are lost during downtime.
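
The replay behaviour described in the note can be sketched as an append-only log with a per-subscriber acknowledgement cursor. `EventLog` is a hypothetical name, storage is in memory, and sequence numbers are assumed to start at 1 and increase by one per event.

```python
from typing import Dict, List, Tuple


class EventLog:
    """Append-only event log with per-subscriber acknowledgement cursors."""

    def __init__(self) -> None:
        self._events: List[Tuple[int, str]] = []
        self._next_seq = 1
        self._acked: Dict[str, int] = {}

    def append(self, payload: str) -> int:
        seq = self._next_seq
        self._events.append((seq, payload))
        self._next_seq += 1
        return seq

    def ack(self, subscriber: str, seq: int) -> None:
        # The cursor never moves backwards, so a late ack cannot cause re-delivery gaps.
        self._acked[subscriber] = max(self._acked.get(subscriber, 0), seq)

    def replay(self, subscriber: str) -> List[Tuple[int, str]]:
        # On reconnect, send everything after the last acknowledged sequence number.
        last = self._acked.get(subscriber, 0)
        return [(seq, payload) for seq, payload in self._events if seq > last]
```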

---

## Security

| Measure                | Description                                                                      |
| ---------------------- | -------------------------------------------------------------------------------- |
| **mTLS**               | Mutual TLS: both instances verify each other's certificate                       |
| **Request signing**    | Each request includes a cryptographic signature using the instance's private key |
| **Scoped permissions** | Tokens carry specific scopes (e.g. `availability:read` does not grant `subhire:write`) |
| **Rate limiting**      | Per-partner rate limits to prevent abuse                                         |
| **IP allowlisting**    | Optional: restrict federation to known IP ranges                                 |
| **Audit logging**      | Every federation interaction is logged on both sides                             |
| **Data minimisation**  | Only share the minimum data needed for the requested operation                   |
| **Revocation**         | Either party can instantly revoke trust; all pending requests are cancelled      |
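
As one example of these measures, per-partner rate limiting could be done with a token bucket per partner. The document does not name an algorithm, so the token bucket is a swapped-in technique, and `PartnerRateLimiter` with its parameters is hypothetical. The clock is passed in as `now` (seconds) to keep the sketch deterministic.

```python
from typing import Dict, Tuple


class PartnerRateLimiter:
    """Token bucket per partner: `burst` requests at once, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second: float, burst: int) -> None:
        self._rate = rate_per_second
        self._burst = float(burst)
        # partner -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, partner: str, now: float) -> bool:
        # A partner seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(partner, (self._burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        if tokens >= 1.0:
            self._buckets[partner] = (tokens - 1.0, now)
            return True
        self._buckets[partner] = (tokens, now)
        return False
```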

---

## Offline & Resilience

| Scenario              | Behaviour                                                                        |
| --------------------- | -------------------------------------------------------------------------------- |
| **Partner offline**   | Outbound messages queue in the outbox; delivered when partner comes back         |
| **Network partition** | Local operations continue unaffected; federation syncs when connectivity returns |
| **Conflicting edits** | Owner instance is always the source of truth for asset data                      |
| **Message ordering**  | Messages include timestamps and sequence numbers for correct ordering            |
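
The message-ordering row can be illustrated with a reorder buffer on the receiving side. This sketch assumes one `OrderedInbox` (a hypothetical name) per partner and contiguous sequence numbers starting at 1; messages that arrive early are held until the gap is filled, and duplicates from retries are dropped.

```python
from typing import Dict, List


class OrderedInbox:
    """Releases messages from one partner strictly in sequence-number order."""

    def __init__(self) -> None:
        self._next = 1  # next sequence number we are allowed to release
        self._held: Dict[int, str] = {}

    def receive(self, seq: int, payload: str) -> List[str]:
        """Accept one message; return the payloads that are now ready, in order."""
        if seq < self._next or seq in self._held:
            return []  # duplicate, e.g. from an outbox retry
        self._held[seq] = payload
        ready = []
        while self._next in self._held:
            ready.append(self._held.pop(self._next))
            self._next += 1
        return ready
```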

---

## Data Sharing Summary

| Data                     | Shared with Partners?    | Notes                                                      |
| ------------------------ | ------------------------ | ---------------------------------------------------------- |
| Company name & contact   | ✅ Always                | Part of federation profile                                 |
| Equipment catalogue      | ✅ If availability trust | Product names, categories, quantities available            |
| Individual asset details | ✅ If full trust         | UUID, model, condition (owner controls detail level)       |
| Equipment pricing        | ⚠️ Optional              | Owner chooses to share rate cards or negotiate per-request |
| Event details            | ❌ Never                 | Internal data                                              |
| Client information       | ❌ Never                 | Internal data                                              |
| Crew details             | ❌ Never                 | Internal data                                              |
| Financial data           | ❌ Never                 | Internal data                                              |
| Damage/repair history    | ⚠️ Selective             | Owner decides what condition info to expose                |
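
The always, conditional, and never rows of this table can be sketched as an allowlist filter applied before any record leaves the instance. The field names and the numeric levels (1 basic, 2 availability, 3 sub-hire, 4 full) are assumptions for illustration; the optional and selective rows (pricing, damage history) are omitted because they depend on owner configuration.

```python
# Fields every partner may see, regardless of trust level.
ALWAYS_SHARED = {"company_name", "contact_email"}

# Hypothetical field names mapped to the minimum numeric trust level required.
MIN_LEVEL = {
    "catalogue": 2,      # shared from availability trust upwards
    "asset_details": 4,  # shared at full trust only
}


def partner_view(record: dict, level: int) -> dict:
    """Return only the fields a partner at `level` is allowed to receive."""
    view = {}
    for key, value in record.items():
        if key in ALWAYS_SHARED or (key in MIN_LEVEL and level >= MIN_LEVEL[key]):
            view[key] = value
    # Anything not listed (events, clients, crew, financials) is never included.
    return view
```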

---

## Related Documentation

- [[00-SystemOverview]]: High-level system overview
- [[01-InventoryManagement]]: Asset data that gets federated
- [[05-BarcodeAndQRScanning]]: Scanning federated assets
- [[07-TechnicalRequirements]]: Deployment and security considerations