PHX S3-Based Ansible Software Repository Design

Design for an Ansible software repository in the PHX project, using vRA S3 buckets for storing binaries and installers, with a reusable Ansible role for secure downloads on Linux and Windows targets.

Projects:  c2platform/phx/ansible


Introduction

This design document outlines the architecture and implementation of an Ansible software repository for the PHX project. The solution uses vRA S3 buckets as the storage backend because no alternative repository servers such as Nexus are available. It combines S3 storage with a generic, reusable Ansible role to provide secure and efficient software downloads for lifecycle management (LCM) and maintenance tasks. The role supports both direct downloads and S3 downloads out of the box.

Goals and Scope

Currently, the software artifacts that Ansible needs are stored on a Windows share. The goal is to provide a better solution for supplying Ansible-based automation with the necessary software artifacts. It should work equally well for Linux and Windows hosts, scale better than a Windows share, and use the most common download protocol, namely HTTP.

The primary goals of this design are to provide a robust, secure method for managing software artifacts in the air-gapped PHX environment, enabling automated LCM processes without external dependencies. The scope includes storage of binaries for Linux and Windows systems, integration with existing Ansible workflows, and extension of roles for broader compatibility. Key benefits include reduced manual intervention, improved idempotency for repeated operations, and enhanced security through controlled access mechanisms.

Assumptions and Constraints

This design assumes a fully air-gapped setup with no internet access, no dedicated repository servers such as Nexus, and reliance on vRA-managed S3 buckets. It adheres to Dutch government security policies, such as the principle of least privilege (PoLP) and data protection standards, so that all operations comply with internal guidelines for automation in sensitive environments. Related standards include Ansible best practices for idempotent configurations and vRA integration protocols.

Requirements

Functional Requirements

  • Store and manage software artifacts (binaries, installers) in an air-gapped environment using vRA S3 buckets.
  • Support downloads for both Linux and Windows targets via Ansible.
  • Provide a generic, reusable mechanism for S3 interactions, extending existing roles like c2platform.wincore.downloads to include Linux and S3 support.
  • Ensure idempotent operations aligned with desired state configuration.

Non-Functional Requirements

  • Security: Use presigned URLs for time-limited, secure access.
  • Compatibility: Work with standard Ansible modules without requiring AWS dependencies on Windows.
  • Scalability: Handle a large number of applications (~200) efficiently.
  • Environment Constraints: Operate without internet access or additional repository servers like Nexus.

Design Overview

The design uses vRA S3 buckets as the core storage for software artifacts. Downloads are managed through the c2platform.core.downloads Ansible role, which uses the s3_object module from the Amazon AWS collection on a Linux delegation server to generate presigned URLs. Target servers then download through these URLs with standard modules such as get_url or win_get_url, so they need no AWS dependencies of their own.

This setup extends the Windows-only c2platform.wincore.downloads role to support Linux and S3, ensuring a unified, reusable approach for Ansible operations in air-gapped environments.

High-Level Architecture Diagram

Key Components

S3 Buckets (via vRA)

Secure, local storage for software artifacts in the air-gapped PHX domain. The number of buckets is a matter of policy that Ansible does not constrain: one bucket per Ansible inventory project, one per application, or a single bucket for the whole organization are all possible. Responsibilities include versioning, access control, and metadata tagging for artifacts. Dependencies: vRA API for provisioning; configured with server-side encryption (SSE). Configuration: Buckets are named by application (e.g., phx-app-binaries), with lifecycle policies for retention.

Delegation Server

A RHEL Linux host equipped with boto3 and botocore libraries to handle S3 interactions and generate presigned URLs. Responsibilities: Execute delegated tasks, manage AWS collection modules. Dependencies: Python 3.x, AWS SDK; network access to vRA S3 endpoints. Configuration: Install via Ansible role, ensure IAM-like credentials for vRA authentication.
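
Preparing the delegation server could look roughly like the play below. This is a minimal sketch, not the project's actual role: the inventory group name `delegation_server` is hypothetical, and it assumes `pip` on that host is already configured to use an internal package index or local wheel directory, since the environment has no internet access. Installing the same libraries from RPM packages would be an equally valid route.

```yaml
---
# Sketch only: prepare the delegation server for S3 interactions.
# "delegation_server" is a hypothetical inventory group name.
- name: Prepare the S3 delegation server
  hosts: delegation_server
  become: true
  tasks:
    - name: Install the Python libraries needed by the AWS collection modules
      ansible.builtin.pip:
        name:
          - boto3
          - botocore
        state: present
      # Assumption: pip resolves packages from an internal index or a local
      # wheel directory, because the host cannot reach the public PyPI.
```

The libraries are needed on the delegation server because delegated modules execute there, not on the target hosts.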

Target Servers

Linux or Windows hosts that perform downloads using standard Ansible modules. Responsibilities: Fetch files via presigned URLs, verify integrity (e.g., checksums). Dependencies: None beyond core Ansible; supports offline execution. Configuration: Defined in inventory with group/host variables for download paths.
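
As an illustration of what such group variables might contain, the sketch below defines one download per platform. The document does not specify the role's variable interface, so the file paths, the variable name `phx_downloads`, its keys, and the object keys are all hypothetical; only the bucket name `phx-app-binaries` comes from the naming example above. The checksum values are placeholders to be replaced with real digests.

```yaml
---
# group_vars/phx_linux/downloads.yml (hypothetical file and variable names)
phx_downloads:
  - name: example-agent
    bucket: phx-app-binaries
    object: linux/example-agent-1.0.0.tar.gz      # hypothetical object key
    dest: /opt/software/example-agent-1.0.0.tar.gz
    checksum: "sha256:<hex digest of the artifact>"  # placeholder

---
# group_vars/phx_windows/downloads.yml (hypothetical file and variable names)
phx_downloads:
  - name: example-agent
    bucket: phx-app-binaries
    object: windows/example-agent-1.0.0.msi       # hypothetical object key
    dest: 'C:\Software\example-agent-1.0.0.msi'
    checksum: "sha256:<hex digest of the artifact>"  # placeholder
```

Keeping the destination path and checksum next to the object key lets the same role logic serve both platforms while the inventory carries the platform-specific details.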

Ansible Downloads Role

Core logic for downloads, supporting S3 and direct URLs, with idempotency for desired state configuration. Responsibilities: Orchestrate URL generation, handle delegation, ensure checksum validation. Dependencies: AWS collection.

Download Process

The download process is orchestrated by the downloads role. Below is a detailed sequence diagram illustrating the workflow.

Process Steps

  1. Delegate S3 interaction to the Linux delegation server using delegate_to in the Ansible task.
  2. Generate a presigned URL using the AWS collection’s s3_object module with parameters like bucket, object, and expiry.
  3. Pass the URL to the target server for download via get_url (Linux) or win_get_url (Windows), including checksum validation for idempotency.
  4. Confirm the download. On subsequent runs, check file existence and hash first, and re-download only when the file is missing or its hash differs.
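
The steps above can be sketched as the following task list. It is an illustration under stated assumptions rather than the role's actual implementation: `download_item` is a hypothetical dictionary with `bucket`, `object`, `dest`, and `checksum` keys (the checksum in `sha256:<hex digest>` form), and the `phx_s3_*` variables are hypothetical names for the vRA S3 endpoint, credentials, and delegation host. The `endpoint_url` parameter name applies to recent releases of the AWS collection; older releases may expect a different name.

```yaml
---
# Sketch of the process steps; all variable names are hypothetical.
- name: Generate a presigned URL on the delegation server (steps 1 and 2)
  amazon.aws.s3_object:
    mode: geturl
    bucket: "{{ download_item.bucket }}"
    object: "{{ download_item.object }}"
    expiry: 600                                 # URL lifetime in seconds
    endpoint_url: "{{ phx_s3_endpoint_url }}"   # assumed: vRA S3 endpoint
    access_key: "{{ phx_s3_access_key }}"
    secret_key: "{{ phx_s3_secret_key }}"
  delegate_to: "{{ phx_s3_delegate_host }}"
  register: presigned
  no_log: true   # the URL embeds a signature, so keep it out of the logs

- name: Download on a Linux target (step 3)
  ansible.builtin.get_url:
    url: "{{ presigned.url }}"
    dest: "{{ download_item.dest }}"
    checksum: "{{ download_item.checksum }}"
    mode: "0644"
  when: ansible_facts['os_family'] != 'Windows'

- name: Download on a Windows target (step 3)
  ansible.windows.win_get_url:
    url: "{{ presigned.url }}"
    dest: "{{ download_item.dest }}"
    # win_get_url takes the bare hex digest plus a separate algorithm name.
    checksum: "{{ download_item.checksum | regex_replace('^sha256:', '') }}"
    checksum_algorithm: sha256
  when: ansible_facts['os_family'] == 'Windows'
```

With a checksum supplied, `get_url` skips the transfer when the destination file already exists with a matching digest, which covers step 4 on Linux. `win_get_url` is expected to behave similarly, but this should be verified against the installed collection version. Note that the presigned URL is still generated on every run in this sketch; a role could add an existence and hash check before the first task to avoid that.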

Alternatives Considered

Several options were evaluated for the PHX software repository, with S3 via vRA selected for its balance of security, simplicity, and alignment with existing infrastructure.

  • Nexus Repository: Pros: Robust artifact management, versioning, and search. Cons: Requires additional server deployment, not available in air-gapped PHX setup, high maintenance overhead. Rejected due to environmental constraints and lack of resources, as confirmed in PHX prototypes showing deployment challenges.
  • Direct URL Downloads: Pros: Simple, no delegation needed. Cons: Less secure without time-limited access, exposes endpoints indefinitely, harder to manage in air-gapped scenarios. Kept as a fallback rather than the primary method because of these security risks; evaluations highlighted vulnerability to unauthorized access.
  • Custom Scripts: Pros: Tailored control. Cons: Poor maintainability, lacks idempotency, duplicates effort compared to Ansible roles. Avoided in favor of reusable Ansible roles for better scalability, as PHX testing showed scripts prone to errors in multi-platform environments.

S3 with Ansible was chosen for its integration with vRA, support for presigned URLs enhancing security, and reuse of existing roles, reducing development time based on PHX pilot evaluations.

Future Improvements

Potential enhancements include:

  • Automated artifact uploading to S3 using Ansible and the AWS collection modules.
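
One possible shape for such an upload is sketched below, using the same s3_object module in `put` mode. This is an assumption about how the enhancement could be built, not a committed design: the source path, object key, and `phx_s3_*` variables are hypothetical, and the file is assumed to already exist on the delegation server.

```yaml
---
# Sketch only: upload one artifact from the delegation server to a bucket.
- name: Upload an artifact to the software repository bucket
  amazon.aws.s3_object:
    mode: put
    bucket: phx-app-binaries
    object: linux/example-agent-1.0.0.tar.gz        # hypothetical object key
    src: /srv/artifacts/example-agent-1.0.0.tar.gz  # hypothetical local path
    overwrite: different     # upload only when the content differs
    endpoint_url: "{{ phx_s3_endpoint_url }}"       # assumed: vRA S3 endpoint
    access_key: "{{ phx_s3_access_key }}"
    secret_key: "{{ phx_s3_secret_key }}"
  delegate_to: "{{ phx_s3_delegate_host }}"
  run_once: true
```

Setting `overwrite: different` keeps repeated runs from re-uploading unchanged artifacts, in line with the idempotency goal stated earlier.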

Additional Information

  • Create and Test S3 Service: Set up a MinIO-based S3 service in a local development environment and test it using Ansible plays to create buckets, upload, and download files.
  • Designing a Flexible Software Repository for Ansible: This document presents RWS’s approach to managing software downloads using Ansible, emphasizing the c2platform.wincore.download Ansible role. This role is versatile, supporting both a simple Apache2 based repository and more advanced setups like Sonatype Nexus Repository Manager.

Last modified November 18, 2025: phx s3 software repo PHX-303 PHX-270 (b098d24)