Create a cross-account machine learning training and deployment environment with AWS CodePipeline

A continuous integration and continuous delivery (CI/CD) pipeline helps you automate steps in your machine learning (ML) applications such as data ingestion, data preparation, feature engineering, model training, and model deployment. A pipeline that spans multiple AWS accounts improves security, agility, and resilience, because an AWS account provides a natural security and access boundary for your AWS resources. This can keep your production environment available and secure for your customers while keeping training separate.
Setting up the required AWS Identity and Access Management (IAM) permissions for a multi-account CI/CD pipeline for ML workloads can be challenging. AWS provides services such as the AWS MLOps Framework and Amazon SageMaker Pipelines to help customers deploy cross-account ML pipelines quickly. However, customers still want to know how to set up the right cross-account IAM roles and trust relationships to create a cross-account pipeline while encrypting their central ML artifact store.
This post is aimed at customers who are familiar with, and prefer, AWS CodePipeline as their DevOps and automation tool of choice. Although we use machine learning operations (MLOps) as an example in this post, you can use it as a general guide for setting up a cross-account pipeline in CodePipeline that integrates many other AWS services. For MLOps in general, we recommend using SageMaker Pipelines.
Architecture Overview
We deploy a basic ML pipeline across three AWS accounts. The following architecture diagram represents what you will build.

For this post, we use three accounts:

The shared service account (account A) holds an AWS CodeCommit repository, an Amazon Simple Storage Service (Amazon S3) bucket, an AWS Key Management Service (AWS KMS) key, and the pipeline itself. The Amazon S3 bucket contains training data in CSV format and the model artifacts. You can access the GitHub repo for sample files, templates, and policies.
The pipeline, hosted in the first account, uses AWS CloudFormation to deploy an AWS Step Functions workflow that trains an ML model in the training account (account B). When training completes, the pipeline invokes an AWS Lambda function to create an Amazon SageMaker model endpoint in the production account.
To secure your ML data and artifacts, the bucket policy denies any unencrypted uploads; objects must be encrypted with the KMS key hosted in the shared service account. Furthermore, only the pipeline and the relevant roles in the training account and the production account have permission to use the KMS key.
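As a rough illustration, the deny statements in such a bucket policy commonly look like the following sketch, built here as a Python dict you could serialize into a policy document. The bucket name and key ARN are placeholders, and the exact policy in the repo's templates may differ:

```python
import json

# Hypothetical names; substitute your own bucket and KMS key ARN.
BUCKET = "shared-ml-artifact-bucket"
KMS_KEY_ARN = "arn:aws:kms:us-east-1:111111111111:key/example-key-id"

def build_deny_unencrypted_statements(bucket: str, key_arn: str) -> list:
    """Statements that deny PutObject unless SSE-KMS with the given key is used."""
    objects_arn = f"arn:aws:s3:::{bucket}/*"
    return [
        {
            "Sid": "DenyNonKmsUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": objects_arn,
            # Reject uploads that don't request SSE-KMS encryption at all.
            "Condition": {
                "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
            },
        },
        {
            "Sid": "DenyWrongKmsKey",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": objects_arn,
            # Reject uploads encrypted with any key other than the shared one.
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption-aws-kms-key-id": key_arn
                }
            },
        },
    ]

policy = {
    "Version": "2012-10-17",
    "Statement": build_deny_unencrypted_statements(BUCKET, KMS_KEY_ARN),
}
print(json.dumps(policy, indent=2))
```

Because both statements are explicit denies, they override any allow granted elsewhere, which is what keeps unencrypted objects out of the artifact store.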
When the pipeline detects a change in the CodeCommit repository (for example, new training data is loaded into Amazon S3), CodePipeline creates or updates the AWS CloudFormation stack in the training account. The stack creates a Step Functions state machine.
The state machine starts a training job in the training account using the SageMaker XGBoost container on the training data in Amazon S3. When training finishes, it writes the model artifact to the output path.
CodePipeline waits for manual approval before the final stage of the pipeline so you can verify the training results and confirm readiness for production.
When approved, a Lambda function deploys the model to a SageMaker endpoint in the production account (account C).

Account A: shared service account
Account B: training account
Account C: production account

Deploy AWS CloudFormation Templates
If you want to follow along, clone the Git repository to your local machine. You must have:

In the production account (Account C).

Acknowledge that AWS CloudFormation might create IAM resources and select Create stack.

When the state machine runs successfully, the pipeline requests manual approval before the final stage. On the CodePipeline console, choose Review and approve the changes. This moves the pipeline into its final stage: invoking the Lambda function to deploy the model.
When the training job completes, the Lambda function in the production account deploys the model endpoint. To do this, the Lambda function assumes the role in the shared service account to run the required PutJobSuccessResult CodePipeline command.
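A minimal sketch of that cross-account reporting step is shown below. The role ARN is a placeholder, the endpoint deployment itself is elided, and a production handler would also report failures with PutJobFailureResult:

```python
# Hypothetical shared-service role ARN; substitute the value from your stack outputs.
SHARED_SERVICE_ROLE_ARN = "arn:aws:iam::111111111111:role/codepipeline-cross-account-role"

def get_job_id(event: dict) -> str:
    """CodePipeline invokes Lambda with the job ID under the 'CodePipeline.job' key."""
    return event["CodePipeline.job"]["id"]

def lambda_handler(event, context):
    import boto3  # local import so this sketch loads without the AWS SDK installed

    job_id = get_job_id(event)

    # Assume the role in the shared service account so we can report back
    # to the pipeline that lives there.
    creds = boto3.client("sts").assume_role(
        RoleArn=SHARED_SERVICE_ROLE_ARN,
        RoleSessionName="report-pipeline-result",
    )["Credentials"]
    codepipeline = boto3.client(
        "codepipeline",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

    # ... create the SageMaker model and endpoint here, then signal success
    # so the pipeline execution can complete.
    codepipeline.put_job_success_result(jobId=job_id)
```

The key detail is that the temporary credentials returned by AssumeRole are passed to the CodePipeline client, so the success result lands on the pipeline in account A rather than in the production account.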

On the CodeCommit page, select the repository created by AWS CloudFormation. Upload the sf-sm-train-demo.json file into the empty repository; it should first be updated with the values from the AWS CloudFormation template outputs. Provide a name, an email address, and an optional message for the main branch, and choose Commit changes.

Conclusion
In this post, you created a cross-account ML pipeline using AWS CodePipeline, AWS CloudFormation, and Lambda. You set up the required IAM policies and roles to enable this cross-account access, using a shared services account to hold the ML artifacts in an S3 bucket and a customer managed KMS key for encryption. You deployed a pipeline in which different accounts ran different stages of the pipeline, using AWS CloudFormation to create a Step Functions state machine for model training and Lambda to invoke it.
You can use the steps outlined here to set up cross-account pipelines that fit your workloads. For example, you can use CodePipeline to securely monitor and deploy SageMaker endpoints. CodePipeline helps you automate steps in your software delivery process to enable agility and performance for your teams. Contact your account team to learn how you can get started today!

About the Authors
Peter Chung is a Solutions Architect for AWS, and is passionate about helping customers uncover insights from their data. He has been building solutions to help organizations make data-driven decisions in both the private and public sectors. He holds all AWS certifications as well as two GCP certifications. He enjoys coffee, cooking, staying active, and spending time with his family.
He received his Ph.D. in Operations Research after he broke his advisor's research grant account and failed to deliver the Nobel Prize he promised. Currently he helps customers in the financial services and insurance industry build machine learning solutions on AWS.
He helps enterprise customers build and run machine learning solutions on AWS. David enjoys hiking and following the latest machine learning advancements.
Rajdeep Saha is a specialist solutions architect for serverless and containers at Amazon Web Services (AWS). He helps customers design secure and scalable applications on AWS. Rajdeep is passionate about helping and teaching newcomers to cloud computing. He is based out of New York City and uses Twitter, sparingly, at @_rajdeepsaha.

Select Next.
You can add tags. Otherwise, keep the default stack options and select Next.
Acknowledge that AWS CloudFormation might create IAM resources and select Create stack.

AWS CloudFormation creates the required IAM policies, roles, and trust relationships for your cross-account pipeline. We'll use the AWS resources and IAM roles created by the templates to populate our pipeline and Step Functions workflow definitions.
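The trust relationships are what make the cross-account hops possible: the roles in accounts B and C must trust the shared service account so its pipeline can assume them. A minimal sketch of such a trust policy (the account ID is a placeholder, and the templates may scope the principal more tightly than the account root) looks like this:

```python
import json

SHARED_SERVICE_ACCOUNT_ID = "111111111111"  # placeholder; use your account A ID

def build_trust_policy(trusted_account_id: str) -> dict:
    """Trust policy letting principals in the trusted account assume this role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Trusting the account root delegates the decision of which
                # principals may assume the role to that account's IAM policies.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }

print(json.dumps(build_trust_policy(SHARED_SERVICE_ACCOUNT_ID), indent=2))
```

The cross-account role's permission policy (S3, KMS, SageMaker, and so on) is separate from this trust policy; both must be in place for the pipeline's AssumeRole calls to succeed.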
Setting up the pipeline
To run the ML pipeline, you must update the pipeline and Step Functions state machine definition files. You can download the files from the Git repository. Replace the string values inside the angle brackets (for example, <TrainingAccountID>) with the values created by AWS CloudFormation.
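One way to fill in those placeholders is a small script like the sketch below. The placeholder names and sample values here are illustrative, so match them to the ones that actually appear in your copies of the files:

```python
from pathlib import Path

# Map each angle-bracket placeholder to its CloudFormation output value.
# Placeholder names and values below are examples only.
REPLACEMENTS = {
    "<TrainingAccountID>": "222222222222",
    "<ProductionAccountID>": "333333333333",
    "<SharedS3Bucket>": "shared-ml-artifact-bucket",
}

def fill_placeholders(text: str, replacements: dict) -> str:
    """Return text with every placeholder swapped for its configured value."""
    for placeholder, value in replacements.items():
        text = text.replace(placeholder, value)
    return text

# Example usage: rewrite a definition file in place.
# path = Path("sf-sm-train-demo.json")
# path.write_text(fill_placeholders(path.read_text(), REPLACEMENTS))
```

A plain-text substitution like this is enough here because the placeholders only ever appear as literal strings in the JSON definitions.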
In the shared service account, navigate to the Amazon S3 console and select the Amazon S3 bucket created by AWS CloudFormation. Upload the train.csv file from the Git repository and place it in a folder labeled Data.
Note: The bucket policy denies any upload actions that don't use the KMS key. As a workaround, remove the bucket policy, upload the files, and re-apply the bucket policy.
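Alternatively, you can satisfy the policy directly by requesting SSE-KMS on the upload itself, without lifting the bucket policy. A hedged boto3 sketch (bucket name and key ARN are placeholders):

```python
def build_put_object_kwargs(bucket: str, key: str, body: bytes, kms_key_arn: str) -> dict:
    """Arguments for s3.put_object that request SSE-KMS with a specific key."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # These two fields set the encryption headers the bucket policy checks.
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_arn,
    }

def upload_training_data(bucket: str, kms_key_arn: str, csv_path: str = "train.csv"):
    import boto3  # local import so this sketch loads without the AWS SDK installed

    with open(csv_path, "rb") as f:
        boto3.client("s3").put_object(
            **build_put_object_kwargs(bucket, f"Data/{csv_path}", f.read(), kms_key_arn)
        )
```

Because the request carries the expected encryption headers, neither deny statement in the bucket policy matches, and the upload goes through.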

It takes a few minutes for AWS CloudFormation to deploy the resources. We'll use the outputs from the stack in account A as inputs for the stacks in accounts B and C.
In the training account (Account B).

Select Create stack in the AWS CloudFormation console in the production account.
Select Upload a template file and choose the c-cfn-blog.yml file. Select Next.
Provide the KMS key ARN, the S3 bucket name, and the shared service account ID. Select Next.
You can add tags. Otherwise, keep the default stack options and select Next.
Acknowledge that AWS CloudFormation might create IAM resources and select Create stack.

In the shared service account (Account A).

You can add tags. Otherwise, keep the default stack options and select Next.

Congratulations! You've built the foundation for a cross-account ML pipeline using CodePipeline for training and deployment. You can now see a live SageMaker endpoint in the shared service account, created by the Lambda function in the production account.

The role ARN for accessing CodeCommit in the shared service account.
The ARN of the KMS key.
The shared S3 bucket name.
The AWS account ID for the shared service account.

git clone https://github.com/aws-samples/cross-account-ml-train-deploy-codepipeline

Browse to the AWS CloudFormation console.
Select Create stack.
Select Upload a template file and choose the a-cfn-blog.yml file. Select Next.
Provide a stack name, a CodeCommit repo name, and the three AWS account IDs used for the pipeline. Select Next.

Select Create stack in the AWS CloudFormation console in the training account.
Select Upload a template file and select the b-cfn-blog.yml file. Select Next.
Give the stack a name and provide the following parameters from the outputs in stack A:


After a successful action, the custom deploy-cf-train-test stage creates an AWS CloudFormation stack in the training account. You can check the pipeline status in the CodePipeline console.
AWS CloudFormation deploys a Step Functions state machine that starts a model training job, assuming the CodePipeline role in the shared services account. The cross-account role in the training account allows access to the S3 bucket, the KMS key, and the CodeCommit repo, and can pass the execution role to the Step Functions state machine.
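The training step of such a state machine can be sketched roughly as follows in Amazon States Language, built here as a Python dict. The role ARN, image URI, bucket, instance type, and S3 paths are placeholders, so the definition in the repo will differ in its details:

```python
def build_training_state(role_arn: str, image_uri: str, bucket: str) -> dict:
    """A Step Functions task state that runs a SageMaker training job synchronously."""
    return {
        "Type": "Task",
        # The .sync integration makes the state wait until the training job finishes.
        "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
        "Parameters": {
            "TrainingJobName.$": "$.JobName",  # taken from the execution input
            "RoleArn": role_arn,
            "AlgorithmSpecification": {
                "TrainingImage": image_uri,
                "TrainingInputMode": "File",
            },
            "InputDataConfig": [
                {
                    "ChannelName": "train",
                    "DataSource": {
                        "S3DataSource": {
                            "S3DataType": "S3Prefix",
                            "S3Uri": f"s3://{bucket}/Data/",
                        }
                    },
                }
            ],
            "OutputDataConfig": {"S3OutputPath": f"s3://{bucket}/output/"},
            "ResourceConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",
                "VolumeSizeInGB": 10,
            },
            "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        },
        "End": True,
    }
```

Because the input and output locations both point at the shared bucket in account A, the execution role passed in RoleArn must be allowed to use the shared KMS key, which is exactly what the cross-account role created by the template provides.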

Now that everything is set up, you can create and deploy the pipeline.
Deploying the pipeline
We secured our S3 bucket with the bucket policy and KMS key. Only the CodePipeline service role and the cross-account roles created by the AWS CloudFormation templates in the training and production accounts can use the key. The same applies to the CodeCommit repository. We can run the following command from the shared services account to create the pipeline.

aws codepipeline create-pipeline --cli-input-json file://test_pipeline_v3.json

Run the following command to clone the Git repository.
