4 min read · March 17, 2025

AWS Organizations operator framework

Operating an AWS organization often means writing scripts to inventory resources (EC2 instances, RDS instances, S3 buckets…), remediate tags, delete unused resources, and so on.

When running these operation scripts across an AWS organization, parallelization is the key to speeding up the process. Processing accounts and regions sequentially in a single script can take hours. With Step Functions and Lambda, we can process all accounts and operated regions in minutes from an operator account. As a bonus, it's serverless and incurs almost no additional cost.

Cross-account roles

To execute actions in other accounts, we deploy cross-account roles in all accounts via a CloudFormation StackSet. These cross-account roles are assumed by the operation Lambda functions.

CloudFormation template for cross-account roles deployed organization-wide via StackSet

AWSTemplateFormatVersion: 2010-09-09
Description: Cross account roles for operations

Parameters:

  OperationAccountId:
    Type: String
    Description: Account ID of the operation account whose Lambda roles are allowed to assume these roles

Resources:

  OperationAdmin:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: operation-admin
      AssumeRolePolicyDocument:
        Statement:
          - Action: 'sts:AssumeRole'
            Effect: Allow
            Principal:
              AWS: !Sub 'arn:aws:iam::${OperationAccountId}:role/operation-admin-lambda'
        Version: 2012-10-17
      Description: Ops admin role assumed by Lambda functions from Operations Accounts
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/AdministratorAccess'

  OperationReadonly:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: operation-readonly
      Path: '/'
      Description: Ops readonly role assumed by Lambda functions from Operations Accounts
      AssumeRolePolicyDocument:
        Statement:
          - Action: 'sts:AssumeRole'
            Effect: Allow
            Principal:
              AWS: !Sub 'arn:aws:iam::${OperationAccountId}:role/operation-readonly-lambda'
        Version: 2012-10-17
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/ReadOnlyAccess'

CloudFormation template for Lambda roles deployed only in the operation account

AWSTemplateFormatVersion: 2010-09-09
Description: Cross account lambda roles for operations

Resources:

  OperationAdminLambda:
    Type: AWS::IAM::Role
    Properties:
      RoleName: operation-admin-lambda
      Path: "/"
      Description: "Allows lambda to assume role operation-admin in target accounts"
      AssumeRolePolicyDocument:
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
      Policies:
        - PolicyName: "assume-role"
          PolicyDocument:
            Statement:
              - Action:
                  - sts:AssumeRole
                Effect: Allow
                Resource:
                  - arn:aws:iam::*:role/operation-admin
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

  OperationReadonlyLambda:
    Type: AWS::IAM::Role
    Properties:
      RoleName: operation-readonly-lambda
      Path: "/"
      Description: "Allows lambda to assume role operation-readonly in target accounts"
      AssumeRolePolicyDocument:
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
      Policies:
        - PolicyName: "assume-role"
          PolicyDocument:
            Statement:
              - Action:
                  - sts:AssumeRole
                Effect: Allow
                Resource:
                  - arn:aws:iam::*:role/operation-readonly
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
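
As a rough sketch of how an operation Lambda might use these roles, the function below calls STS `AssumeRole` on the target account's role and returns the temporary credentials as keyword arguments. The function name `assume_operation_role`, the session name, and passing the STS client in as a parameter are assumptions made for illustration; they are not taken from the original code.

```python
def assume_operation_role(sts_client, account_id,
                          role_name="operation-readonly",
                          session_name="operation-lambda"):
    """Assume a cross-account role and return credential kwargs.

    `sts_client` is expected to behave like boto3.client("sts").
    The role names match the templates above; the session name is
    a hypothetical value.
    """
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    credentials = response["Credentials"]
    # These key names are the ones boto3 accepts when creating a
    # client or session with explicit credentials.
    return {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }
```

With boto3, the returned dictionary could be unpacked into a client call, for example `boto3.client("ec2", region_name=region, **assume_operation_role(boto3.client("sts"), account_id))`. A Lambda running as `operation-admin-lambda` would pass `role_name="operation-admin"` instead.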

State machine

The state machine is split into three parts.

1. The Lambda fanout

Its aim is to fetch the account list from the organization's management (master) account and build an array with one object per account/region combination where we want to execute our operation script.

Output example:

{
  "AccountRegions": [
    {
      "AccountId": "2345678765434545343432",
      "AccountName": "project-1-prod",
      "Region": "eu-west-1"
    },
    {
      "AccountId": "2345678765434545343432",
      "AccountName": "project-1-prod",
      "Region": "us-east-1"
    },
    {
      "AccountId": "987654345567575364564",
      "AccountName": "project-2-dev",
      "Region": "eu-west-1"
    },
    {
      "AccountId": "987654345567575364564",
      "AccountName": "project-2-dev",
      "Region": "us-east-1"
    }
  ]
}

Each object can also contain other properties required by the worker Lambda.
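
A minimal sketch of such a fanout, assuming the accounts come from the Organizations `list_accounts` API (whose entries carry `Id`, `Name`, and `Status`) and that suspended accounts should be skipped. The function names and the choice to filter on `ACTIVE` are illustrative assumptions, not the original implementation.

```python
def list_accounts(org_client):
    """Collect all accounts, following pagination.

    `org_client` is expected to behave like boto3.client("organizations").
    """
    accounts = []
    for page in org_client.get_paginator("list_accounts").paginate():
        accounts.extend(page["Accounts"])
    return accounts


def build_account_regions(accounts, regions):
    """Build one item per active account and region combination."""
    return {
        "AccountRegions": [
            {
                "AccountId": account["Id"],
                "AccountName": account["Name"],
                "Region": region,
            }
            for account in accounts
            # Assumption: only ACTIVE accounts are worth processing.
            if account.get("Status") == "ACTIVE"
            for region in regions
        ]
    }
```

The fanout handler would then return `build_account_regions(list_accounts(org_client), regions)`, where `regions` is the list of operated regions.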

2. The workers map

The Map state iterates through the array and runs a Lambda worker for each item, concurrently. As of this writing, the maximum number of concurrent executions is 50. This step must contain at least one Lambda task, but you can add more tasks if you need to split your process further.

3. The result handler

As a last step, we can export the results of the map as a CSV file to S3, send an email, post a Slack notification, and so on.
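
For the CSV export, a result handler could look roughly like the sketch below. It only renders the CSV text; the column names in the usage note and the function name `results_to_csv` are hypothetical.

```python
import csv
import io


def results_to_csv(results, fieldnames):
    """Render a list of result dictionaries as CSV text.

    Keys not listed in `fieldnames` are ignored, and missing keys
    are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()
```

The text returned by `results_to_csv(results, ["AccountId", "Region", "Status"])` could then be uploaded with the S3 client's `put_object` call, passing the encoded text as `Body`.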

How to handle the payload limit

As your organization or the number of resources grows, you will hit the 256 KB size limit on task outputs. It often happens after the map step, because that step joins the outputs of all the Lambda workers. I suspect this limit is related to the Lambda asynchronous invocation payload limit.
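
To get a feel for how close a payload is to that limit, a rough estimate can be made by serializing it to JSON and counting the UTF-8 bytes. This is only an approximation of how Step Functions measures the payload, and the helper names are made up for this sketch.

```python
import json

# 256 KB limit mentioned above, expressed in bytes.
MAX_PAYLOAD_BYTES = 256 * 1024


def payload_size_bytes(payload):
    """Approximate the payload size as compact, UTF-8 encoded JSON."""
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


def fits_in_payload(payload, limit=MAX_PAYLOAD_BYTES):
    """Return True when the estimated size is within the limit."""
    return payload_size_bytes(payload) <= limit
```

Running this on a sample fanout output, or on the combined worker results, gives an early warning before the state machine fails on a larger organization.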

Fanout output

Let's say the Lambda workers share common parameters, for example a list of resource IDs such as Security Hub standard ARNs. Repeating these ARNs in every AccountRegions item would make the payload too large. To include them only once, we use the Parameters field of the Map state to build the input payload for each concurrent worker.

Fanout output:

{
  "StandardArns": [
    "arn:aws:securityhub:REGION::standards/aws-foundational-security-best-practices/v/1.0.0",
    "arn:aws:securityhub:REGION::standards/cis-aws-foundations-benchmark/v/1.4.0"
  ],
  "AccountRegions": [
    {
      "AccountId": "2345678765434545343432",
      "AccountName": "project-1-prod",
      "Region": "eu-west-1"
    },
    {
      "AccountId": "987654345567575364564",
      "AccountName": "project-2-dev",
      "Region": "eu-west-1"
    }
  ]
}

Map definition:

Type: Map
ItemsPath: $.AccountRegions
Parameters:
  # $$ reads the current item from the Map context object; $ reads the state input
  AccountId.$: $$.Map.Item.Value.AccountId
  AccountName.$: $$.Map.Item.Value.AccountName
  Region.$: $$.Map.Item.Value.Region
  StandardArns.$: $.StandardArns
Iterator:
  StartAt: inventory
  States:
    inventory:
      End: true
      Resource: arn:aws:lambda:eu-west-1:23456765432332435:function:securityhubInventory-47IgsXHr66xW
      Type: Task
MaxConcurrency: 0
Next: export

Worker input example:

{
  "AccountId": "2345678765434545343432",
  "AccountName": "project-1-prod",
  "Region": "eu-west-1",
  "StandardArns": [
    "arn:aws:securityhub:REGION::standards/aws-foundational-security-best-practices/v/1.0.0",
    "arn:aws:securityhub:REGION::standards/cis-aws-foundations-benchmark/v/1.4.0"
  ]
}
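
On the worker side, a handler receiving this input might start as in the sketch below. It assumes the `REGION` token in the ARNs is a placeholder that the worker replaces with its own region, and the returned `status` value is a dummy. Both are assumptions for illustration, and the actual Security Hub calls are left out.

```python
def resolve_standard_arns(event):
    """Replace the REGION placeholder in each ARN with the worker's region."""
    region = event["Region"]
    return [arn.replace(":REGION:", f":{region}:") for arn in event["StandardArns"]]


def handler(event, context):
    """Hypothetical worker entry point returning a deliberately small result."""
    standard_arns = resolve_standard_arns(event)
    # A real worker would assume the cross-account role here and
    # query Security Hub in event["Region"].
    return {
        "accountId": event["AccountId"],
        "region": event["Region"],
        "standards": ",".join(standard_arns),
        "status": "NOT_CHECKED",  # placeholder value, not a real status
    }
```

Keeping the returned object small matters because the Map state joins the outputs of all workers.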

Workers map output

One solution is to push the data to DynamoDB or S3 and read it back in the last step.

You can push your data to DynamoDB either from your Lambda function or via the "arn:aws:states:::dynamodb:putItem" resource from Step Functions.

Task definition:

Type: Task
Resource: arn:aws:states:::dynamodb:putItem
Parameters:
  Item:
    accountId:
      S.$: $.accountId
    region:
      S.$: $.region
    standards:
      S.$: $.standards
    status:
      S.$: $.status
  TableName: MyInventoryTable
OutputPath: $.SdkHttpMetadata.HttpStatusCode
End: true

Note the OutputPath: it keeps only the HTTP status code of the response, which minimizes the output payload size.
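
The other option, writing to DynamoDB from the Lambda itself, could be sketched as follows. It assumes every attribute is stored as a string, as in the task definition above; the helper names and the small return value are hypothetical.

```python
def to_dynamodb_item(result):
    """Convert a flat dictionary into DynamoDB string attributes."""
    return {key: {"S": str(value)} for key, value in result.items()}


def save_result(dynamodb_client, table_name, result):
    """Store one worker result and return a minimal payload.

    `dynamodb_client` is expected to behave like boto3.client("dynamodb").
    """
    dynamodb_client.put_item(TableName=table_name, Item=to_dynamodb_item(result))
    # Return as little as possible so the Map output stays small.
    return {"saved": True}
```

Returning only a tiny acknowledgement plays the same role as the OutputPath in the task definition.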

What's next?

It depends on your needs. Step Functions is a very powerful orchestrator. Here are more example scripts:

  • Inventories: unused EBS volumes, unused NAT gateways, Lambda functions, VPCs, accounts, CloudFront distributions, tags, public IPs, public endpoints
  • Compliance: tag remediation, remediation of security groups with open ingress
  • Security: Security Hub standards and controls management, Inspector activation

A code example is available on GitHub.
