A healthcare AI platform needed to ingest ECG recordings from dozens of clinic sites into a cloud processing system. The original scope called for AWS Storage Gateway — the standard managed service for bridging SMB file shares to S3 storage. It was the obvious choice until AI agents analyzing the architecture flagged a fundamental security gap that had not been part of the original requirements review.

What started as a straightforward infrastructure task became a custom security engineering project — one that ultimately delivered a system that was more secure, 20% faster, and 73% cheaper than the AWS service it replaced.

The Security Gap

The core issue is structural, not configurational. AWS Storage Gateway acts as a proxy between the SMB protocol and S3. When a clinic connects to the gateway, the gateway authenticates the SMB session and then issues S3 API calls on the clinic's behalf. Every one of those calls originates from the gateway instance itself, so S3 sees the gateway's role and IP address, never the clinic's.

This creates cascading isolation failures:

  • All organizations share one SMB auth surface with guest access
  • IP-based S3 condition keys (such as aws:SourceIp) resolve to the gateway's IP, not the clinic's
  • The gateway's IAM role applies to every share, so there is no per-org storage isolation

For a platform handling protected health information across multiple healthcare organizations, this is a data isolation violation waiting to happen.
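
To make the source-IP problem concrete, here is a minimal Python sketch. It is a simplified stand-in for how an IpAddress condition on aws:SourceIp would be evaluated, not AWS's actual policy engine. The addresses, clinic names, and function names are hypothetical and exist only for illustration.

```python
import ipaddress

# Hypothetical addresses from documentation ranges; not taken from the real deployment.
GATEWAY_IP = "198.51.100.10"
CLINIC_CIDRS = {
    "clinic-a": "203.0.113.0/28",
    "clinic-b": "203.0.113.16/28",
}


def source_ip_seen_by_s3(client_ip: str, via_gateway: bool) -> str:
    """Return the address S3 would evaluate in an IP-based condition."""
    # Behind a proxying gateway, every request carries the gateway's address.
    return GATEWAY_IP if via_gateway else client_ip


def ip_condition_allows(org: str, observed_ip: str) -> bool:
    """Simplified stand-in for an IpAddress condition on aws:SourceIp."""
    network = ipaddress.ip_network(CLINIC_CIDRS[org])
    return ipaddress.ip_address(observed_ip) in network


def distinguishable(ip_a: str, ip_b: str, via_gateway: bool) -> bool:
    """Can S3 tell two clients apart by source IP alone?"""
    return source_ip_seen_by_s3(ip_a, via_gateway) != source_ip_seen_by_s3(ip_b, via_gateway)
```

In this model, two clinics are distinguishable when they reach S3 directly, but not once both are routed through the gateway: a per-clinic IP condition then has nothing clinic-specific left to match on.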

graph TD
    A1["Clinic Connection"] --> B1{"AWS Storage<br/>Gateway"}
    B1 --> C1["Shared IAM Role"]
    C1 --> D1["Single S3 Backend<br/>No Isolation"]

    A2["Clinic Connection"] --> B2{"Custom SMB<br/>Server"}
    B2 --> C2["IP Validation +<br/>Per-Org Credentials"]
    C2 --> D2["Per-Org S3 Bucket<br/>Full Isolation"]

    style A1 fill:#1a1a2e,stroke:#e94560,color:#fff
    style B1 fill:#1a1a2e,stroke:#e94560,color:#fff
    style C1 fill:#1a1a2e,stroke:#e94560,color:#fff
    style D1 fill:#1a1a2e,stroke:#e94560,color:#fff
    style A2 fill:#1a1a2e,stroke:#16c79a,color:#fff
    style B2 fill:#1a1a2e,stroke:#16c79a,color:#fff
    style C2 fill:#1a1a2e,stroke:#16c79a,color:#fff
    style D2 fill:#1a1a2e,stroke:#16c79a,color:#fff

Alternatives Evaluated and Rejected

Before building a custom solution, ML LABS systematically evaluated every AWS-native approach. AI agents compressed what would have been weeks of security architecture research into hours of structured analysis.

  • Per-share IAM roles: not supported by Storage Gateway
  • S3 Access Points: source IP is the gateway's — cannot distinguish clinics
  • Active Directory: $110-350/month, requires clinic-side domain joining
  • Separate Gateway per org: ~$550/month each, unworkable at scale
  • AWS Transfer Family: solves isolation but requires protocol changes

Every option fell short in one of three ways: it failed to provide per-organization isolation, it cost too much to run per organization at scale, or it required clinics to change their network or protocol configuration.

A Custom SMB Server

ML LABS built a custom SMB server — a self-managed Samba instance on EC2 that provides per-organization access control enforced at the protocol level, before authentication even begins.

Protocol-Level Isolation

Each organization gets a dedicated Samba share with IP-based access control using hosts allow and hosts deny directives. When a connection arrives, Samba checks the source IP against the share's allowed list. If the IP doesn't match, the connection is refused before any credential exchange occurs.
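
As a rough illustration of what a per-organization share could look like, the following Python sketch renders a Samba share stanza with `hosts allow` and `hosts deny`. The share name, path, user, and address range are hypothetical, and the actual configuration used in the project is not shown in this article. Exact parameter syntax should be checked against the smb.conf manual for the Samba version in use.

```python
import ipaddress


def render_share(org: str, path: str, allowed_cidrs: list[str], user: str) -> str:
    """Render one smb.conf share stanza restricted to an organization's networks."""
    for cidr in allowed_cidrs:
        # Fail early on a malformed entry instead of writing a broken config.
        ipaddress.ip_network(cidr)  # raises ValueError if invalid
    lines = [
        f"[{org}]",
        f"    path = {path}",
        f"    valid users = {user}",
        "    guest ok = no",
        "    read only = no",
        # Only the organization's own networks may connect to this share...
        f"    hosts allow = {' '.join(allowed_cidrs)}",
        # ...and every other source address is denied.
        "    hosts deny = 0.0.0.0/0",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical organization used only for illustration.
    print(render_share("clinic-a", "/srv/smb/clinic-a", ["203.0.113.0/28"], "clinic-a-svc"))
```

Generating stanzas from a single function keeps every share on the same template, which is one way to make "adding a clinic" a reviewable configuration change.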

A managed service that collapses all tenants into a single IAM role is not multi-tenant security — it is shared infrastructure where one misconfigured share name breaks isolation.

Real-Time File Sync

The custom SMB server uses inotify, the Linux kernel's file system event notification mechanism, to detect new files as soon as a write completes. An event-driven sync process uploads each file to the organization's S3 bucket within 1-2 seconds of write completion, whereas Storage Gateway uploads in batches with variable delays.
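
The routing step of such a sync process might look like the sketch below. It assumes a hypothetical layout in which each organization's share lives under its own directory, and it takes the uploader as a parameter so the logic can be exercised without AWS access. In a real deployment the handler would be driven by inotify close-write (IN_CLOSE_WRITE) events, and the uploader could be boto3's S3 client `upload_file(Filename, Bucket, Key)`. The bucket names and paths here are invented for illustration.

```python
from pathlib import PurePosixPath
from typing import Callable

# Hypothetical layout: /srv/smb/<org>/<path inside the share>
SHARE_ROOT = PurePosixPath("/srv/smb")

# Hypothetical per-organization buckets; one destination per org, never shared.
ORG_BUCKETS = {
    "clinic-a": "example-ecg-clinic-a",
    "clinic-b": "example-ecg-clinic-b",
}


def destination_for(path: PurePosixPath) -> tuple[str, str]:
    """Map a file under the share root to (bucket, key) for its organization."""
    rel = path.relative_to(SHARE_ROOT)  # ValueError if outside the share root
    org, *rest = rel.parts              # ValueError if the path is the root itself
    if not rest:
        raise ValueError(f"{path} is a share directory, not a file inside one")
    bucket = ORG_BUCKETS[org]           # KeyError for an unknown organization
    return bucket, "/".join(rest)


def on_close_write(path: PurePosixPath, upload: Callable[[str, str, str], None]) -> tuple[str, str]:
    """Handle one 'file finished writing' event by uploading to the org's bucket."""
    bucket, key = destination_for(path)
    upload(str(path), bucket, key)
    return bucket, key
```

Deriving the bucket from the share directory, rather than from anything the client sends, means a file written to one organization's share cannot be routed to another organization's bucket by this code path.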

Multi-Region Deployment

The platform operates in both US and UK regions to satisfy data residency requirements. The custom SMB server is deployed in both regions with identical configuration management, ensuring UK clinic data never transits through US infrastructure.

Monitoring and Audit

Every connection attempt, file sync event, and access denial is logged to CloudWatch. Automated alarms fire on failed connection attempts, sync delays, disk utilization thresholds, and configuration changes outside the deployment pipeline.

Performance and Cost

As the platform scales, it needs one custom SMB instance with failover — not a separate Storage Gateway per organization. Adding a new clinic means a configuration change, not new infrastructure.

  • 20% faster data movement: real-time inotify sync vs. batched gateway
  • 73% cost reduction: $150/month EC2 vs. ~$550/month Storage Gateway
  • Zero cross-org data leakage: protocol-level IP restriction by design

The cost stays flat while the number of organizations grows, and the operational surface stays small enough for a single team to manage.
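
The arithmetic behind the cost figures can be checked with a short Python sketch. It uses only the two monthly prices quoted above and, as a simplifying assumption, ignores storage, data transfer, and any separately billed failover capacity.

```python
# Monthly figures quoted in this article (USD).
CUSTOM_MONTHLY = 150           # one EC2-hosted SMB server
GATEWAY_MONTHLY_PER_ORG = 550  # approximate cost of one Storage Gateway


def savings_percent(baseline: float, replacement: float) -> int:
    """Percentage saved by the replacement relative to the baseline, rounded."""
    return round((baseline - replacement) / baseline * 100)


def gateway_cost(orgs: int) -> int:
    """Isolation via one gateway per organization scales linearly."""
    return GATEWAY_MONTHLY_PER_ORG * orgs


def custom_cost(orgs: int) -> int:
    """Assumption: one shared instance serves all organizations at a flat price."""
    return CUSTOM_MONTHLY


if __name__ == "__main__":
    print(savings_percent(GATEWAY_MONTHLY_PER_ORG, CUSTOM_MONTHLY))  # 73
    for n in (1, 10, 50):
        print(n, gateway_cost(n), custom_cost(n))
```

The 73% figure compares one custom instance against a single gateway. Against the gateway-per-organization alternative, the gap widens with every clinic added.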

First Steps

  1. Trace the identity chain. Verify the client's identity is preserved at every hop — proxies, gateways, and service boundaries.
  2. Find the collapse point. Check whether any layer merges multiple clients into a single identity. That is where isolation breaks.
  3. Test managed services. Evaluate whether they meet your isolation requirements. If not, custom protocol-level enforcement is the alternative.
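
Steps 1 and 2 can be modeled in a few lines of Python. This is a toy sketch, not a tool used in the project: each hop is represented as a function from the incoming identity to the identity the next layer sees, and the collapse point is the first hop after which two distinct clients look the same. The hop names and identities are hypothetical.

```python
from typing import Callable, Optional

# A hop is (name, function mapping the incoming identity to the outgoing one).
Hop = tuple[str, Callable[[str], str]]


def find_collapse_point(clients: list[str], hops: list[Hop]) -> Optional[str]:
    """Return the first hop after which distinct clients share an identity."""
    identities = list(clients)
    for name, transform in hops:
        identities = [transform(identity) for identity in identities]
        if len(set(identities)) < len(set(clients)):
            return name
    return None


# Hypothetical chains for illustration.
gateway_chain: list[Hop] = [
    ("smb-session", lambda c: c),
    ("storage-gateway", lambda c: "gateway-role"),  # every client becomes one role
    ("s3", lambda c: c),
]
custom_chain: list[Hop] = [
    ("smb-ip-check", lambda c: c),
    ("per-org-credentials", lambda c: f"{c}-role"),  # identity stays distinct
    ("s3", lambda c: c),
]

if __name__ == "__main__":
    clinics = ["clinic-a", "clinic-b"]
    print(find_collapse_point(clinics, gateway_chain))  # storage-gateway
    print(find_collapse_point(clinics, custom_chain))   # None
```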

Practical Solution Pattern

Replace managed file ingestion services that lack per-tenant isolation with protocol-level enforcement that refuses unauthorized connections before authentication begins. Use IP-based access control, per-organization credentials, and per-organization storage destinations to make cross-tenant access structurally impossible. Add real-time event-driven sync to eliminate the latency penalty of batched approaches.

This works because the security model shifts from "trust the managed service to isolate tenants" to "enforce isolation at the protocol layer where it cannot be bypassed." If your organization needs a security architecture review of existing ingestion infrastructure, an AI Technical Assessment maps the gaps before they become incidents.