Federation Architecture

Purpose

Enable multiple EventKit instances to communicate securely, supporting cross-company workflows such as sub-hire (see 01 - Inventory Management#Sub-hire Management), equipment availability queries, and cross-boundary scanning (see 05 - Barcode and QR Scanning), while each company retains full sovereignty over its own data.


Design Principles

| Principle | Description |
| --- | --- |
| Data sovereignty | Each instance owns its data. Nothing is shared without explicit configuration |
| Opt-in federation | Federation is optional. An instance works fully standalone |
| Mutual trust | Both parties must agree to federate. Either can revoke at any time |
| Minimal exposure | Only share what's needed. Internal events, clients, crew, and costs are never exposed |
| Cryptographic identity | Instances authenticate via key pairs and mutual TLS |
| Eventual consistency | Federation is asynchronous, so network outages don't break local operations |
| Hosting agnostic | Federation works identically regardless of where or how an instance is hosted |

Deployment & Hosting Models

Federation operates at the application protocol level, not the infrastructure level. This means instances federate identically whether they are self-hosted, on dedicated servers, or sharing infrastructure via a managed hosting provider.

Three Hosting Tiers

```mermaid
graph TB
    subgraph "Managed SaaS Provider"
        subgraph "Company A (Tenant)"
            A_App["App A"]
            A_DB[(DB A)]
        end
        subgraph "Company B (Tenant)"
            B_App["App B"]
            B_DB[(DB B)]
        end
    end

    subgraph "Dedicated Hosting Provider"
        subgraph "Company C (Isolated VM)"
            C_App["App C"]
            C_DB[(DB C)]
        end
    end

    subgraph "Company D (Self-Hosted)"
        D_App["App D"]
        D_DB[(DB D)]
    end

    A_App <-->|"Federation"| B_App
    A_App <-->|"Federation"| C_App
    B_App <-->|"Federation"| D_App
    C_App <-->|"Federation"| D_App
```

| Tier | Model | Data Isolation | Best For |
| --- | --- | --- | --- |
| Managed SaaS | Multiple companies on shared infrastructure, each with their own database and subdomain | Logical (separate DBs, same server) | Small companies wanting zero ops overhead |
| Dedicated Hosted | Isolated instance on a hosting provider's infrastructure (own VM/container) | Strong (separate VM or container per company) | Mid-size companies wanting isolation without self-hosting |
| Self-Hosted | Company runs their own server on their own infrastructure | Full physical | Large companies or those with strict data sovereignty needs |

How It Works Per Tier

Managed SaaS

  • A hosting provider runs many EventKit instances on shared infrastructure
  • Each company gets their own subdomain (e.g. companya.eventkit.io, companyb.eventkit.io)
  • Each company has a separate database — data is never mixed
  • Companies on the same provider federate through the same protocol as external instances
  • The provider can optionally optimise same-host federation (internal routing instead of public internet), but the protocol remains identical
  • Think of it like email hosting — Gmail hosts millions of accounts, but they all speak SMTP to each other and to self-hosted mail servers

Dedicated Hosted

  • A hosting provider runs a dedicated VM or container per company
  • Full resource isolation — own CPU, memory, storage
  • Company gets a custom domain or provider subdomain
  • Provider handles updates, backups, and monitoring
  • Federates identically to self-hosted instances

Self-Hosted

  • Company downloads EventKit and runs it on their own infrastructure
  • Full control over data, updates, and configuration
  • Requires technical staff to manage
  • Federates with any other instance regardless of how it's hosted

Federation Transparency

A critical design property: no instance can tell how another instance is hosted. The federation protocol is the same regardless:

| Scenario | Federation Behaviour |
| --- | --- |
| SaaS tenant ↔ SaaS tenant (same provider) | Standard federation protocol (provider may optimise routing internally) |
| SaaS tenant ↔ Self-hosted | Standard federation protocol over the internet |
| Self-hosted ↔ Self-hosted | Standard federation protocol over the internet |
| Dedicated hosted ↔ SaaS tenant | Standard federation protocol |

Hosting Provider Responsibilities

If a hosting provider offers managed EventKit instances, their responsibilities include:

| Responsibility | Description |
| --- | --- |
| Provisioning | Spin up new instances with a domain, database, and TLS certificate |
| Isolation | Ensure tenant data is never accessible to other tenants |
| Updates | Roll out EventKit updates across managed instances |
| Backups | Automated backups per tenant with restore capability |
| Monitoring | Health checks, uptime monitoring, alerting |
| TLS certificates | Manage certificates for federation (mTLS) and web access |
| DNS | Manage subdomains or custom domain mapping |

Multi-Tenant vs. Multi-Instance (Provider Implementation)

Hosting providers can choose their internal architecture:

| Approach | Description | Pros | Cons |
| --- | --- | --- | --- |
| Multi-instance | One container/VM + database per tenant | Strongest isolation, simple ops | Higher resource cost per tenant |
| Shared app, separate DBs | One application pool, one database per tenant | Lower resource cost, easier updates | Moderate isolation |
| True multi-tenant | One application, one database, tenant ID column | Lowest cost | Weakest isolation, complex access control |

Recommended for providers: Multi-instance (containerised) for the best balance of isolation and operability. Each tenant gets a Docker container + PostgreSQL database, managed via orchestration (e.g. Kubernetes).


Instance Architecture

```mermaid
graph TB
    subgraph "Single Instance (any hosting tier)"
        A_Core["Core Application"]
        A_Fed["Federation Service"]
        A_DB[(Database)]
        A_Core --> A_Fed
        A_Core --> A_DB
    end

    subgraph "Partner Instance (any hosting tier)"
        B_Core["Core Application"]
        B_Fed["Federation Service"]
        B_DB[(Database)]
        B_Core --> B_Fed
        B_Core --> B_DB
    end

    A_Fed <-->|"mTLS + Signed Requests"| B_Fed
```

Components

| Component | Role |
| --- | --- |
| Core Application | The main EventKit instance; handles all local operations |
| Federation Service | A dedicated service/module that handles all inter-instance communication |
| Trust Store | Stores public keys and connection details for all trusted partner instances |
| Outbox / Inbox | Queue for outbound and inbound federation messages (ensures delivery even during downtime) |

Trust Establishment

Handshake Flow

```mermaid
sequenceDiagram
    participant A as Instance A
    participant B as Instance B

    Note over A: Admin initiates federation request
    A->>B: POST /federation/request
    Note right of A: {instance_url, public_key, company_name, contact_email}

    B-->>A: 202 Accepted (pending review)

    Note over B: Admin reviews request
    Note over B: Admin approves

    B->>A: POST /federation/confirm
    Note right of B: {signed_challenge, public_key, permissions_granted}

    A-->>B: 200 OK
    Note over A,B: ✅ Federation established
```
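The receiving side of the handshake can be pictured as a small state machine over trust-store records. The following is a minimal sketch only: the `TrustStore` and `PartnerRecord` classes, the state names, and the in-memory dictionary are all illustrative assumptions, and verification of the signed challenge is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class TrustState(Enum):
    PENDING = "pending"          # request received, awaiting admin review
    ESTABLISHED = "established"  # admin approved and confirmation exchanged
    REVOKED = "revoked"          # either party withdrew trust


@dataclass
class PartnerRecord:
    # Fields mirror the request payload shown in the handshake diagram.
    instance_url: str
    public_key: str
    company_name: str
    contact_email: str
    state: TrustState = TrustState.PENDING


class TrustStore:
    """Hypothetical in-memory stand-in for the Trust Store component."""

    def __init__(self) -> None:
        self._partners: dict[str, PartnerRecord] = {}

    def receive_request(self, instance_url: str, public_key: str,
                        company_name: str, contact_email: str) -> int:
        """Record an incoming request as pending; returns an HTTP-style status."""
        self._partners[instance_url] = PartnerRecord(
            instance_url, public_key, company_name, contact_email
        )
        return 202  # accepted for review, not yet trusted

    def approve(self, instance_url: str) -> None:
        """Admin approval: only a pending request can become established."""
        record = self._partners[instance_url]
        if record.state is not TrustState.PENDING:
            raise ValueError(f"cannot approve from state {record.state.value}")
        record.state = TrustState.ESTABLISHED

    def revoke(self, instance_url: str) -> None:
        """Revocation is allowed from any state."""
        self._partners[instance_url].state = TrustState.REVOKED

    def is_trusted(self, instance_url: str) -> bool:
        record = self._partners.get(instance_url)
        return record is not None and record.state is TrustState.ESTABLISHED
```

A real implementation would also verify the partner's signature before moving a record to the established state and would persist records rather than hold them in memory.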

Trust Levels

Companies can grant different permission levels to each partner:

| Level | Permissions |
| --- | --- |
| Basic | View company profile only |
| Availability | Query equipment availability for date ranges |
| Sub-hire | Request and track sub-hire transactions |
| Full | All above + cross-instance asset scanning + condition history |

Trust levels can be changed or revoked at any time by either party.
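One way to enforce these levels is an ordered enum with a minimum level per operation. This sketch assumes the levels are cumulative (the table states this only for Full), and the operation names are hypothetical labels rather than part of the federation API.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    # Ordered so that a higher level includes everything below it (assumption).
    BASIC = 1
    AVAILABILITY = 2
    SUBHIRE = 3
    FULL = 4


# Hypothetical operation names mapped to the minimum level that permits them.
REQUIRED_LEVEL = {
    "view_profile": TrustLevel.BASIC,
    "query_availability": TrustLevel.AVAILABILITY,
    "request_subhire": TrustLevel.SUBHIRE,
    "scan_asset": TrustLevel.FULL,
    "view_condition_history": TrustLevel.FULL,
}


def is_allowed(granted: TrustLevel, operation: str) -> bool:
    """Return True if the partner's granted level covers the operation."""
    required = REQUIRED_LEVEL.get(operation)
    if required is None:
        return False  # unknown operations are denied by default
    return granted >= required
```

Denying unknown operations by default keeps the check aligned with the minimal-exposure principle.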


Global Asset Identity

Every asset needs a globally unique identifier that works across instances.

ID Format

```
UUID:    550e8400-e29b-41d4-a716-446655440000
URN:     urn:eventkit:{instance-id}:asset:{uuid}
QR URL:  https://inventory.company-a.com/asset/550e8400-e29b-41d4-a716-446655440000
```

| Component | Purpose |
| --- | --- |
| UUID | Canonical, globally unique identifier; generated locally, no coordination needed |
| Instance ID | Identifies the owning instance; part of the URN for federated lookups |
| QR URL | Points to the asset's page on the owning instance; works with any phone camera |
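The three forms can be derived from a single locally generated UUID. The helper names below are hypothetical, and the parser assumes the instance ID itself contains no colons; this is a sketch of the format, not a reference implementation.

```python
import uuid


def new_asset_id() -> str:
    """Generate a random (version 4) UUID locally; no coordination needed."""
    return str(uuid.uuid4())


def asset_urn(instance_id: str, asset_id: str) -> str:
    """Build the URN form: urn:eventkit:{instance-id}:asset:{uuid}."""
    return f"urn:eventkit:{instance_id}:asset:{asset_id}"


def parse_asset_urn(urn: str) -> tuple[str, str]:
    """Split a URN into (instance_id, asset_id); raises ValueError if malformed."""
    parts = urn.split(":")
    if (len(parts) != 5 or parts[0] != "urn"
            or parts[1] != "eventkit" or parts[3] != "asset"):
        raise ValueError(f"not an EventKit asset URN: {urn!r}")
    uuid.UUID(parts[4])  # raises ValueError if the UUID is not well formed
    return parts[2], parts[4]


def qr_url(base_url: str, asset_id: str) -> str:
    """Build the URL encoded in the QR code, pointing at the owning instance."""
    return f"{base_url.rstrip('/')}/asset/{asset_id}"
```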

Asset Resolution

When scanning an asset that belongs to another instance:

```mermaid
sequenceDiagram
    participant Scanner as Company B (Scanner)
    participant B_API as Company B Instance
    participant A_API as Company A Instance

    Scanner->>B_API: Scan QR → URL: company-a.com/asset/{uuid}
    B_API->>B_API: URL domain ≠ local → federated asset
    B_API->>A_API: GET /federation/asset/{uuid}
    A_API->>A_API: Check: Is Company B a trusted partner?
    A_API-->>B_API: 200 OK {asset_details, status, condition}
    B_API-->>Scanner: Display asset info + "Owned by Company A"
```
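The "URL domain ≠ local" step amounts to comparing the scanned URL's host with the scanning instance's own host. A minimal sketch, assuming the `/asset/{uuid}` path layout from the QR URL format; the function name and return shape are illustrative.

```python
from urllib.parse import urlparse


def classify_scan(scanned_url: str, local_host: str) -> tuple[str, str, str]:
    """Return (origin, host, asset_id), where origin is 'local' or 'federated'."""
    # Accept scheme-less values such as "company-a.com/asset/<uuid>".
    if "://" not in scanned_url:
        scanned_url = f"https://{scanned_url}"
    parsed = urlparse(scanned_url)
    host = (parsed.hostname or "").lower()

    # Expect exactly /asset/<id>; anything else is not an asset QR code.
    segments = [s for s in parsed.path.split("/") if s]
    if len(segments) != 2 or segments[0] != "asset":
        raise ValueError(f"not an asset URL: {scanned_url!r}")

    origin = "local" if host == local_host.lower() else "federated"
    return origin, host, segments[1]
```

For a federated result, the scanning instance would then look the host up in its trust store before calling the owner's asset endpoint.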

Federation API (gRPC via ConnectRPC)

Federation between instances uses gRPC over HTTP/2, secured with mTLS. The API is defined in Protocol Buffers and served via ConnectRPC, so the same handlers also accept the Connect protocol (plain HTTP/1.1 requests with JSON bodies), which is convenient for debugging and testing.

Protobuf Service Definition

```protobuf
// proto/eventkit/v1/federation.proto
syntax = "proto3";
package eventkit.v1;

service FederationService {
  // Trust management
  rpc RequestFederation(FederationRequest) returns (FederationRequestResponse);
  rpc ConfirmFederation(FederationConfirm) returns (FederationConfirmResponse);
  rpc RevokeFederation(RevokeFederationRequest) returns (RevokeFederationResponse);
  rpc ListPartners(ListPartnersRequest) returns (ListPartnersResponse);

  // Availability
  rpc QueryAvailability(AvailabilityQuery) returns (AvailabilityResponse);
  rpc QueryProductAvailability(ProductAvailabilityQuery) returns (ProductAvailabilityResponse);

  // Sub-hire
  rpc RequestSubhire(SubhireRequest) returns (SubhireResponse);
  rpc AcceptSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc RejectSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc DispatchSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ReceiveSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ReturnSubhire(SubhireActionRequest) returns (SubhireActionResponse);
  rpc ListSubhires(ListSubhiresRequest) returns (ListSubhiresResponse);

  // Asset resolution
  rpc GetAsset(GetFederatedAssetRequest) returns (FederatedAsset);
  rpc ReportAssetCheckin(AssetCheckinReport) returns (AssetReportResponse);
  rpc ReportAssetCheckout(AssetCheckoutReport) returns (AssetReportResponse);
  rpc ReportAssetDamage(AssetDamageReport) returns (AssetReportResponse);

  // Real-time event stream (server-streaming)
  rpc SubscribeEvents(SubscribeEventsRequest) returns (stream FederationEvent);
}
```

RPC Reference

| Category | RPC Method | Purpose |
| --- | --- | --- |
| Trust | RequestFederation | Initiate federation handshake with a partner instance |
| Trust | ConfirmFederation | Accept an incoming federation request |
| Trust | RevokeFederation | Revoke trust; cancels all pending operations |
| Trust | ListPartners | List current federation partners (local) |
| Availability | QueryAvailability | Query available equipment for a date range |
| Availability | QueryProductAvailability | Check specific product availability |
| Sub-hire | RequestSubhire | Request equipment from a partner |
| Sub-hire | AcceptSubhire | Accept a sub-hire request |
| Sub-hire | RejectSubhire | Reject a sub-hire request |
| Sub-hire | DispatchSubhire | Notify partner that equipment has been dispatched |
| Sub-hire | ReceiveSubhire | Confirm receipt of sub-hired equipment |
| Sub-hire | ReturnSubhire | Notify partner that equipment is being returned |
| Sub-hire | ListSubhires | List all active sub-hire transactions |
| Asset | GetAsset | Get asset details (respects trust level) |
| Asset | ReportAssetCheckin | Notify owner: "We received your asset" |
| Asset | ReportAssetCheckout | Notify owner: "We're returning your asset" |
| Asset | ReportAssetDamage | Report damage on a partner's asset |
| Events | SubscribeEvents | Server-streaming RPC for real-time federation events |
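The sub-hire RPCs imply a lifecycle in which each call is valid only from a particular state. The state names and the strictly linear order below are assumptions inferred from the RPC purposes, not a documented specification.

```python
# Hypothetical sub-hire lifecycle: (current state, RPC name) -> next state.
TRANSITIONS = {
    ("requested", "AcceptSubhire"): "accepted",
    ("requested", "RejectSubhire"): "rejected",
    ("accepted", "DispatchSubhire"): "dispatched",
    ("dispatched", "ReceiveSubhire"): "received",
    ("received", "ReturnSubhire"): "returned",
}


def apply_action(state: str, rpc: str) -> str:
    """Return the next state, or raise ValueError if the RPC is not valid here."""
    try:
        return TRANSITIONS[(state, rpc)]
    except KeyError:
        raise ValueError(f"{rpc} is not allowed in state {state!r}") from None


def run_lifecycle(rpcs: list[str]) -> str:
    """Apply a sequence of RPCs starting from a freshly requested sub-hire."""
    state = "requested"  # state immediately after RequestSubhire
    for rpc in rpcs:
        state = apply_action(state, rpc)
    return state
```

Rejecting out-of-order calls on the receiving side guards against replayed or duplicated messages changing a transaction's status.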

Federation Event Stream

Instead of webhooks, federation uses a gRPC server-streaming RPC for real-time notifications. This is more reliable than webhooks: HTTP/2 flow control provides backpressure, a dropped stream can be re-established by the subscriber, and the mechanism sits naturally alongside the other gRPC calls:

```protobuf
// Requires: import "google/protobuf/timestamp.proto";
message FederationEvent {
  string event_id = 1;
  google.protobuf.Timestamp timestamp = 2;
  oneof event {
    SubhireRequestedEvent subhire_requested = 10;
    SubhireAcceptedEvent subhire_accepted = 11;
    SubhireDispatchedEvent subhire_dispatched = 12;
    SubhireReceivedEvent subhire_received = 13;
    SubhireReturnedEvent subhire_returned = 14;
    AssetDamagedEvent asset_damaged = 15;
  }
}
```

| Event | Trigger |
| --- | --- |
| subhire_requested | New sub-hire request received |
| subhire_accepted | Sub-hire request accepted |
| subhire_dispatched | Equipment dispatched to partner |
| subhire_received | Equipment received by partner |
| subhire_returned | Equipment returned to owner |
| asset_damaged | Asset damage reported by partner |

Note

The event stream automatically reconnects on disconnection. Events are persisted in the outbox and replayed from the last acknowledged sequence number, so no events are lost during downtime.
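Replay from the last acknowledged sequence number can be sketched as an append-only outbox per partner. The `PartnerOutbox` class and its in-memory list are illustrative assumptions; how acknowledgements travel back to the sender is not specified here and is left out.

```python
class PartnerOutbox:
    """Hypothetical outbox holding the events destined for one partner."""

    def __init__(self) -> None:
        self._events: list[tuple[int, str]] = []  # (sequence, payload)
        self._next_seq = 1
        self._last_acked = 0

    def append(self, payload: str) -> int:
        """Persist an event and assign it the next sequence number."""
        seq = self._next_seq
        self._events.append((seq, payload))
        self._next_seq += 1
        return seq

    def acknowledge(self, seq: int) -> None:
        """Record the highest sequence number the partner has confirmed."""
        self._last_acked = max(self._last_acked, seq)

    def replay(self) -> list[tuple[int, str]]:
        """Events to send after a reconnect: everything not yet acknowledged."""
        return [(s, p) for s, p in self._events if s > self._last_acked]
```

Because unacknowledged events are sent again after a reconnect, the receiver should expect occasional duplicates and deduplicate on sequence number or event ID.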


Security

| Measure | Description |
| --- | --- |
| mTLS | Mutual TLS: both instances verify each other's certificate |
| Request signing | Each request includes a cryptographic signature using the instance's private key |
| Scoped permissions | Tokens carry specific scopes (e.g. availability:read, subhire:write) |
| Rate limiting | Per-partner rate limits to prevent abuse |
| IP allowlisting | Optional: restrict federation to known IP ranges |
| Audit logging | Every federation interaction is logged on both sides |
| Data minimisation | Only share the minimum data needed for the requested operation |
| Revocation | Either party can instantly revoke trust; all pending requests are cancelled |
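Per-partner rate limiting is commonly implemented with a token bucket per partner. The sketch below is one such approach, not necessarily the one EventKit uses; the class names, the capacity and refill parameters, and the injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


class PartnerRateLimiter:
    """Keeps an independent bucket per partner ID."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._buckets: dict[str, TokenBucket] = {}

    def allow(self, partner_id: str) -> bool:
        bucket = self._buckets.get(partner_id)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._clock)
            self._buckets[partner_id] = bucket
        return bucket.allow()
```

Keeping buckets independent means one noisy partner exhausts only its own allowance.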

Offline & Resilience

| Scenario | Behaviour |
| --- | --- |
| Partner offline | Outbound messages queue in the outbox; delivered when partner comes back |
| Network partition | Local operations continue unaffected; federation syncs when connectivity returns |
| Conflicting edits | Owner instance is always the source of truth for asset data |
| Message ordering | Messages include timestamps and sequence numbers for correct ordering |
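On the receiving side, sequence numbers let an inbox restore order and drop duplicates after an outage. This sketch assumes each partner numbers its messages contiguously from 1; the `PartnerInbox` class is hypothetical.

```python
class PartnerInbox:
    """Releases one partner's messages in sequence order, each at most once."""

    def __init__(self) -> None:
        self._next_expected = 1
        self._held: dict[int, str] = {}  # out-of-order messages awaiting a gap fill

    def receive(self, seq: int, payload: str) -> list[str]:
        """Accept a message and return the payloads that are now deliverable."""
        if seq < self._next_expected or seq in self._held:
            return []  # duplicate of something already delivered or held
        self._held[seq] = payload

        # Release the longest contiguous run starting at the expected number.
        ready = []
        while self._next_expected in self._held:
            ready.append(self._held.pop(self._next_expected))
            self._next_expected += 1
        return ready
```

A message that arrives early is held until the gap before it is filled, so local handlers always see a partner's messages in the order they were sent.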

Data Sharing Summary

| Data | Shared with Partners? | Notes |
| --- | --- | --- |
| Company name & contact | Always | Part of federation profile |
| Equipment catalogue | If availability trust | Product names, categories, quantities available |
| Individual asset details | If full trust | UUID, model, condition (owner controls detail level) |
| Equipment pricing | ⚠️ Optional | Owner chooses to share rate cards or negotiate per-request |
| Event details | Never | Internal data |
| Client information | Never | Internal data |
| Crew details | Never | Internal data |
| Financial data | Never | Internal data |
| Damage/repair history | ⚠️ Selective | Owner decides what condition info to expose |
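The sharing rules above lend themselves to an allowlist filter applied to every outbound record, so that anything not explicitly listed is withheld. All field names, the numeric levels, and the opt-in mechanism below are hypothetical; this is a sketch of the idea only.

```python
# Numeric trust levels (assumed ordering): 1 Basic, 2 Availability, 3 Sub-hire, 4 Full.
# Hypothetical field names mapped to the minimum level at which they are shared.
FIELD_MIN_LEVEL = {
    "company_name": 1,
    "contact_email": 1,
    "product_name": 2,
    "category": 2,
    "quantity_available": 2,
    "asset_uuid": 4,
    "model": 4,
    "condition": 4,
}

# Fields shared only when the owner has explicitly opted in for this partner.
OPT_IN_FIELDS = frozenset({"rate_card", "damage_history"})


def filter_for_partner(record: dict, trust_level: int,
                       opted_in: frozenset = frozenset()) -> dict:
    """Return only the fields this partner may see; unlisted fields never pass."""
    shared = {}
    for key, value in record.items():
        min_level = FIELD_MIN_LEVEL.get(key)
        if min_level is not None and trust_level >= min_level:
            shared[key] = value
        elif key in OPT_IN_FIELDS and key in opted_in:
            shared[key] = value
    return shared
```

Because the filter is an allowlist, internal data such as client or crew fields is excluded without having to be named.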