---
tags:
  - eventkit
---

# Federation Architecture

## Purpose

Enable multiple EventKit instances to communicate securely, allowing cross-company workflows like [[01 - Inventory Management#Sub-hire Management|sub-hire tracking]], equipment availability queries, and cross-boundary [[05 - Barcode and QR Scanning|asset tracking]], while each company retains full sovereignty over their own data.

---

## Design Principles

| Principle | Description |
| -------------------------- | ------------------------------------------------------------------------------------- |
| **Data sovereignty** | Each instance owns its data. Nothing is shared without explicit configuration |
| **Opt-in federation** | Federation is optional. An instance works fully standalone |
| **Mutual trust** | Both parties must agree to federate. Either can revoke at any time |
| **Minimal exposure** | Only share what's needed. Internal events, clients, crew, and costs are never exposed |
| **Cryptographic identity** | Instances authenticate via key pairs and mutual TLS |
| **Eventual consistency** | Federation is async; network outages don't break local operations |
| **Hosting agnostic** | Federation works identically regardless of where or how an instance is hosted |

---

## Deployment & Hosting Models

Federation operates at the **application protocol level**, not the infrastructure level. This means instances federate identically whether they are self-hosted, on dedicated servers, or sharing infrastructure via a managed hosting provider.

### Three Hosting Tiers

```mermaid
graph TB
    subgraph "Managed SaaS Provider"
        subgraph "Company A (Tenant)"
            A_App["App A"]
            A_DB[(DB A)]
        end
        subgraph "Company B (Tenant)"
            B_App["App B"]
            B_DB[(DB B)]
        end
    end

    subgraph "Dedicated Hosting Provider"
        subgraph "Company C (Isolated VM)"
            C_App["App C"]
            C_DB[(DB C)]
        end
    end

    subgraph "Company D (Self-Hosted)"
        D_App["App D"]
        D_DB[(DB D)]
    end

    A_App <-->|"Federation"| B_App
    A_App <-->|"Federation"| C_App
    B_App <-->|"Federation"| D_App
    C_App <-->|"Federation"| D_App
```

| Tier | Model | Data Isolation | Best For |
| -------------------- | --------------------------------------------------------------------------------------- | ----------------------------------- | ----------------------------------------------------------- |
| **Managed SaaS** | Multiple companies on shared infrastructure, each with their own database and subdomain | Logical (separate DBs, same server) | Small companies wanting zero ops overhead |
| **Dedicated Hosted** | Isolated instance on a hosting provider's infrastructure (own VM/container) | Physical (separate VMs) | Mid-size companies wanting isolation without self-hosting |
| **Self-Hosted** | Company runs their own server on their own infrastructure | Full physical | Large companies or those with strict data sovereignty needs |

### How It Works Per Tier

#### Managed SaaS

- A hosting provider runs many EventKit instances on shared infrastructure
- Each company gets their own **subdomain** (e.g. `companya.eventkit.io`, `companyb.eventkit.io`)
- Each company has a **separate database**; data is never mixed
- Companies on the **same provider** federate through the same protocol as external instances
- The provider can optionally optimise same-host federation (internal routing instead of public internet), but the protocol remains identical
- Think of it like **email hosting**: Gmail hosts millions of accounts, but they all speak SMTP to each other and to self-hosted mail servers

#### Dedicated Hosted

- A hosting provider runs a **dedicated VM or container** per company
- Full resource isolation: own CPU, memory, storage
- Company gets a **custom domain** or provider subdomain
- Provider handles updates, backups, and monitoring
- Federates identically to self-hosted instances

#### Self-Hosted

- Company downloads EventKit and runs it on their own infrastructure
- Full control over data, updates, and configuration
- Requires technical staff to manage
- Federates with any other instance regardless of how it's hosted

### Federation Transparency

A critical design property: **no instance can tell how another instance is hosted**.
The federation protocol is the same regardless:

| Scenario | Federation Behaviour |
| ----------------------------------------- | ----------------------------------------------------------------------- |
| SaaS tenant ↔ SaaS tenant (same provider) | Standard federation protocol (provider may optimise routing internally) |
| SaaS tenant ↔ Self-hosted | Standard federation protocol over the internet |
| Self-hosted ↔ Self-hosted | Standard federation protocol over the internet |
| Dedicated hosted ↔ SaaS tenant | Standard federation protocol |

### Hosting Provider Responsibilities

If a hosting provider offers managed EventKit instances, their responsibilities include:

| Responsibility | Description |
| -------------------- | ------------------------------------------------------------------ |
| **Provisioning** | Spin up new instances with a domain, database, and TLS certificate |
| **Isolation** | Ensure tenant data is never accessible to other tenants |
| **Updates** | Roll out EventKit updates across managed instances |
| **Backups** | Automated backups per tenant with restore capability |
| **Monitoring** | Health checks, uptime monitoring, alerting |
| **TLS certificates** | Manage certificates for federation (mTLS) and web access |
| **DNS** | Manage subdomains or custom domain mapping |

### Multi-Tenant vs. Multi-Instance (Provider Implementation)

Hosting providers can choose their internal architecture:

| Approach | Description | Pros | Cons |
| ---------------------------- | ----------------------------------------------- | ----------------------------------- | ----------------------------------------- |
| **Multi-instance** | One container/VM + database per tenant | Strongest isolation, simple ops | Higher resource cost per tenant |
| **Shared app, separate DBs** | One application pool, one database per tenant | Lower resource cost, easier updates | Moderate isolation |
| **True multi-tenant** | One application, one database, tenant ID column | Lowest cost | Weakest isolation, complex access control |

**Recommended for providers**: Multi-instance (containerised) for the best balance of isolation and operability. Each tenant gets a Docker container + PostgreSQL database, managed via orchestration (e.g. Kubernetes).

---

## Instance Architecture

```mermaid
graph TB
    subgraph "Single Instance (any hosting tier)"
        A_Core["Core Application"]
        A_Fed["Federation Service"]
        A_DB[(Database)]
        A_Core --> A_Fed
        A_Core --> A_DB
    end

    subgraph "Partner Instance (any hosting tier)"
        B_Core["Core Application"]
        B_Fed["Federation Service"]
        B_DB[(Database)]
        B_Core --> B_Fed
        B_Core --> B_DB
    end

    A_Fed <-->|"mTLS + Signed Requests"| B_Fed
```

### Components

| Component | Role |
| ---------------------- | ------------------------------------------------------------------------------------------ |
| **Core Application** | The main EventKit instance; handles all local operations |
| **Federation Service** | A dedicated service/module that handles all inter-instance communication |
| **Trust Store** | Stores public keys and connection details for all trusted partner instances |
| **Outbox / Inbox** | Queue for outbound and inbound federation messages (ensures delivery even during downtime) |

---

## Trust Establishment

### Handshake Flow

```mermaid
sequenceDiagram
    participant A as Instance A
    participant B as Instance B

    Note over A: Admin initiates federation request
    A->>B: POST /federation/request
    Note right of A: {instance_url, public_key, company_name, contact_email}
    B-->>A: 202 Accepted (pending review)

    Note over B: Admin reviews request
    Note over B: Admin approves
    B->>A: POST /federation/confirm
    Note right of B: {signed_challenge, public_key, permissions_granted}
    A-->>B: 200 OK

    Note over A,B: ✅ Federation established
```

### Trust Levels

Companies can grant different permission levels to each partner:

| Level | Permissions |
| ---------------- | ------------------------------------------------------------- |
| **Basic** | View company profile only |
| **Availability** | Query equipment availability for date ranges |
| **Sub-hire** | Request and track sub-hire transactions |
| **Full** | All above + cross-instance asset scanning + condition history |

Trust levels can be **changed or revoked** at any time by either party.

### Sub-hire Pricing

Sub-hire prices are **negotiated per-transaction** out-of-band (phone, email). The federation protocol tracks equipment movement, not pricing. Each instance records its own costs:

| Party | Records |
| ------------ | -------------------------------------------------------------- |
| **Lender** | Revenue from sub-hire, return deadline, condition at dispatch |
| **Borrower** | Day rate / agreed cost, event allocation, condition at receipt |

### Asset Ownership Transfer

Assets can be **permanently transferred** between instances (e.g.
selling equipment to a partner):

| Step | Action |
| ---- | --------------------------------------------------------------------- |
| 1 | Seller initiates transfer request via federation |
| 2 | Buyer accepts; asset data (history, maintenance) transferred |
| 3 | Asset UUID remains the same; QR code URL updated to new instance |
| 4 | Seller's instance marks asset as "Transferred" (end of local history) |
| 5 | Buyer's instance creates asset with full imported history |

### Federation Versioning

The federation protocol relies on **Protobuf's built-in backward compatibility**:

- New fields are **additive**; old instances ignore unknown fields
- Deprecated fields are **never removed**, only marked obsolete
- Breaking changes are **documented** and communicated to partners
- No formal version negotiation during handshake, to keep it simple

### Dispute Resolution

Equipment condition disputes are handled **out-of-band** (phone, email). EventKit provides supporting evidence:

| Feature | Description |
| -------------------------- | ------------------------------------------------------ |
| Timestamped condition logs | Before/after condition recorded at dispatch and return |
| Photo evidence | Optional photos attached to condition logs |
| Audit trail | Full history of who scanned what and when |
| Dispute handling | Out-of-band; no formal in-app dispute workflow |

### Federation Downtime Handling

| Behaviour | Description |
| ---------------------- | ------------------------------------------------------- |
| **Outbox queue** | Pending requests queued when partner is offline |
| **Auto-retry** | Automatic retry when partner comes back online |
| **Admin notification** | Notify admin after configurable hours of failed retries |
| **Manual retry** | Option to manually retry or cancel queued requests |

### Federation Data Visibility

How much partner inventory data is visible is **configurable per trust level**:

| Trust Level | Visibility |
| ------------ | ------------------------------------------------------- |
| **Minimal** | Real-time availability queries only |
| **Standard** | Cached summary (category counts, general availability) |
| **Full** | Product catalogue shared (not individual asset details) |

---

## Global Asset Identity

Every asset needs a globally unique identifier that works across instances.

### ID Format

```
UUID:    550e8400-e29b-41d4-a716-446655440000
URN:     urn:eventkit:{instance-id}:asset:{uuid}
QR URL:  https://inventory.company-a.com/asset/550e8400-e29b-41d4-a716-446655440000
```

| Component | Purpose |
| --------------- | --------------------------------------------------------------------------------- |
| **UUID** | Canonical, globally unique identifier; generated locally, no coordination needed |
| **Instance ID** | Identifies the owning instance; part of the URN for federated lookups |
| **QR URL** | Points to the asset's page on the owning instance; works with any phone camera |

### Asset Resolution

When scanning an asset that belongs to another instance:

```mermaid
sequenceDiagram
    participant Scanner as Company B (Scanner)
    participant B_API as Company B Instance
    participant A_API as Company A Instance

    Scanner->>B_API: Scan QR → URL: company-a.com/asset/{uuid}
    B_API->>B_API: URL domain ≠ local → federated asset
    B_API->>A_API: GET /federation/asset/{uuid}
    A_API->>A_API: Check: Is Company B a trusted partner?
    A_API-->>B_API: 200 OK {asset_details, status, condition}
    B_API-->>Scanner: Display asset info + "Owned by Company A"
```

---

## Federation API (gRPC via ConnectRPC)

Federation between instances uses **gRPC over HTTP/2** with mTLS. The API is defined in Protocol Buffers and served via ConnectRPC, meaning it also supports the Connect protocol (HTTP/1.1 + JSON) for debugging and testing.

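
As a rough illustration of the Connect protocol path mentioned above, the following Python sketch builds (but does not send) a unary JSON request. It assumes the standard Connect convention of `POST /{package}.{Service}/{Method}` with a JSON body; the service and method names come from the `FederationService` definition in this document, while the base URL and the request field names are hypothetical, since the `AvailabilityQuery` message is not specified here. A real call would also need the mTLS client certificate described under Security.

```python
import json
import urllib.request

# Fully qualified service name from the FederationService definition.
SERVICE = "eventkit.v1.FederationService"


def build_connect_request(base_url: str, method: str, message: dict) -> urllib.request.Request:
    """Build (but do not send) a Connect-protocol unary request with a JSON body."""
    url = f"{base_url.rstrip('/')}/{SERVICE}/{method}"
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Connect-Protocol-Version": "1",
        },
    )


req = build_connect_request(
    "https://federation.company-a.example",  # hypothetical partner endpoint
    "QueryAvailability",
    {"startDate": "2025-06-01", "endDate": "2025-06-03"},  # hypothetical field names
)
print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the sketch runnable without a live partner instance, and makes the URL and headers easy to inspect when debugging.
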
### Protobuf Service Definition

```protobuf
// proto/eventkit/v1/federation.proto
syntax = "proto3";

package eventkit.v1;

service FederationService {
  // Trust management
  rpc RequestFederation(FederationRequest) returns (FederationRequestResponse);
  rpc ConfirmFederation(FederationConfirm) returns (FederationConfirmResponse);
  rpc RevokeFederation(RevokeFederationRequest) returns (RevokeFederationResponse);
  rpc ListPartners(ListPartnersRequest) returns (ListPartnersResponse);

  // Availability
  rpc QueryAvailability(AvailabilityQuery) returns (AvailabilityResponse);
  rpc QueryProductAvailability(ProductAvailabilityQuery) returns (ProductAvailabilityResponse);

  // Sub-hire
  rpc RequestSubhire(SubhireRequest) returns (SubhireResponse);
  rpc AcceptSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc RejectSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc DispatchSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ReceiveSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ReturnSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ListSubhires(ListSubhiresRequest) returns (ListSubhiresResponse);

  // Asset resolution
  rpc GetAsset(GetFederatedAssetRequest) returns (FederatedAsset);
  rpc ReportAssetCheckin(AssetCheckinReport) returns (AssetReportResponse);
  rpc ReportAssetCheckout(AssetCheckoutReport) returns (AssetReportResponse);
  rpc ReportAssetDamage(AssetDamageReport) returns (AssetReportResponse);

  // Real-time event stream (server-streaming)
  rpc SubscribeEvents(SubscribeEventsRequest) returns (stream FederationEvent);
}
```

### RPC Reference

| Category | RPC Method | Purpose |
| ---------------- | -------------------------- | ----------------------------------------------------- |
| **Trust** | `RequestFederation` | Initiate federation handshake with a partner instance |
| **Trust** | `ConfirmFederation` | Accept an incoming federation request |
| **Trust** | `RevokeFederation` | Revoke trust; cancels all pending operations |
| **Trust** | `ListPartners` | List current federation partners (local) |
| **Availability** | `QueryAvailability` | Query available equipment for a date range |
| **Availability** | `QueryProductAvailability` | Check specific product availability |
| **Sub-hire** | `RequestSubhire` | Request equipment from a partner |
| **Sub-hire** | `AcceptSubhire` | Accept a sub-hire request |
| **Sub-hire** | `RejectSubhire` | Reject a sub-hire request |
| **Sub-hire** | `DispatchSubhire` | Notify partner that equipment has been dispatched |
| **Sub-hire** | `ReceiveSubhire` | Confirm receipt of sub-hired equipment |
| **Sub-hire** | `ReturnSubhire` | Notify partner that equipment is being returned |
| **Sub-hire** | `ListSubhires` | List all active sub-hire transactions |
| **Asset** | `GetAsset` | Get asset details (respects trust level) |
| **Asset** | `ReportAssetCheckin` | Notify owner: "We received your asset" |
| **Asset** | `ReportAssetCheckout` | Notify owner: "We're returning your asset" |
| **Asset** | `ReportAssetDamage` | Report damage on a partner's asset |
| **Events** | `SubscribeEvents` | Server-streaming RPC for real-time federation events |

### Federation Event Stream

Instead of webhooks, federation uses a **gRPC server-streaming RPC** for real-time notifications.
This is more reliable than webhooks (built-in reconnection, backpressure) and works naturally with gRPC:

```protobuf
// Requires: import "google/protobuf/timestamp.proto"; at the top of the file.
message FederationEvent {
  string event_id = 1;
  google.protobuf.Timestamp timestamp = 2;

  oneof event {
    SubhireRequestedEvent subhire_requested = 10;
    SubhireAcceptedEvent subhire_accepted = 11;
    SubhireDispatchedEvent subhire_dispatched = 12;
    SubhireReceivedEvent subhire_received = 13;
    SubhireReturnedEvent subhire_returned = 14;
    AssetDamagedEvent asset_damaged = 15;
  }
}
```

| Event | Trigger |
| -------------------- | -------------------------------- |
| `subhire_requested` | New sub-hire request received |
| `subhire_accepted` | Sub-hire request accepted |
| `subhire_dispatched` | Equipment dispatched to partner |
| `subhire_received` | Equipment received by partner |
| `subhire_returned` | Equipment returned to owner |
| `asset_damaged` | Asset damage reported by partner |

> [!NOTE]
> The event stream automatically reconnects on disconnection. Events are persisted in the outbox and replayed from the last acknowledged sequence number, so no events are lost during downtime.

---

## Security

| Measure | Description |
| ---------------------- | -------------------------------------------------------------------------------- |
| **mTLS** | Mutual TLS: both instances verify each other's certificate |
| **Request signing** | Each request includes a cryptographic signature using the instance's private key |
| **Scoped permissions** | Tokens carry specific scopes (e.g. `availability:read` ≠ `subhire:write`) |
| **Rate limiting** | Per-partner rate limits to prevent abuse |
| **IP allowlisting** | Optional; restrict federation to known IP ranges |
| **Audit logging** | Every federation interaction is logged on both sides |
| **Data minimisation** | Only share the minimum data needed for the requested operation |
| **Revocation** | Either party can instantly revoke trust; all pending requests are cancelled |

---

## Offline & Resilience

| Scenario | Behaviour |
| --------------------- | -------------------------------------------------------------------------------- |
| **Partner offline** | Outbound messages queue in the outbox; delivered when partner comes back |
| **Network partition** | Local operations continue unaffected; federation syncs when connectivity returns |
| **Conflicting edits** | Owner instance is always the source of truth for asset data |
| **Message ordering** | Messages include timestamps and sequence numbers for correct ordering |

---

## Data Sharing Summary

| Data | Shared with Partners? | Notes |
| ------------------------ | ------------------------ | ---------------------------------------------------------- |
| Company name & contact | ✅ Always | Part of federation profile |
| Equipment catalogue | ✅ If availability trust | Product names, categories, quantities available |
| Individual asset details | ✅ If full trust | UUID, model, condition (owner controls detail level) |
| Equipment pricing | ⚠️ Optional | Owner chooses to share rate cards or negotiate per-request |
| Event details | ❌ Never | Internal data |
| Client information | ❌ Never | Internal data |
| Crew details | ❌ Never | Internal data |
| Financial data | ❌ Never | Internal data |
| Damage/repair history | ⚠️ Selective | Owner decides what condition info to expose |

---

## Related Documentation

- [[00 - System Overview]]: High-level system overview
- [[01 - Inventory Management]]: Asset data that gets federated
- [[05 - Barcode and QR Scanning]]: Scanning federated assets
- [[07 - Technical Requirements]]: Deployment and security considerations