Case Study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.
Which AWS service or feature can aggregate the data from the various data sources?
A company that has hundreds of data scientists is using Amazon SageMaker to create ML models. The models are in model groups in the SageMaker Model Registry.
The data scientists are grouped into three categories: computer vision, natural language processing (NLP), and speech recognition. An ML engineer needs to implement a solution to organize the existing models into these groups to improve model discoverability at scale. The solution must not affect the integrity of the model artifacts and their existing groupings.
Which solution will meet these requirements?
A. Create a custom tag for each of the three categories. Add the tags to the model packages in the SageMaker Model Registry.
B. Create a model group for each category. Move the existing models into these category model groups.
C. Use SageMaker ML Lineage Tracking to automatically identify and tag which model groups should contain the models.
D. Create a Model Registry collection for each of the three categories. Move the existing model groups into the collections.
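For context on the tagging approach in option A: tags can be attached to Model Registry resources with the AddTags API without modifying the model artifacts or the existing groupings. A minimal boto3 sketch, assuming a hypothetical group name and tag values:

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical group name and tag value; AddTags works on model package
# groups (and other SageMaker resources) without touching the artifacts
# or the existing group membership.
group_arn = sm.describe_model_package_group(
    ModelPackageGroupName="example-nlp-models"
)["ModelPackageGroupArn"]

sm.add_tags(
    ResourceArn=group_arn,
    Tags=[{"Key": "team-category", "Value": "nlp"}],
)
```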
A company has trained and deployed an ML model by using Amazon SageMaker. The company needs to implement a solution to record and monitor all the API call events for the SageMaker endpoint. The solution also must provide a notification when the number of API call events breaches a threshold.
Which solution will meet these requirements?
A. Use SageMaker Debugger to track the inferences and to report metrics. Create a custom rule to provide a notification when the threshold is breached.
B. Use SageMaker Debugger to track the inferences and to report metrics. Use the tensor_variance built-in rule to provide a notification when the threshold is breached.
C. Log all the endpoint invocation API events by using AWS CloudTrail. Use an Amazon CloudWatch dashboard for monitoring. Set up a CloudWatch alarm to provide a notification when the threshold is breached.
D. Add the Invocations metric to an Amazon CloudWatch dashboard for monitoring. Set up a CloudWatch alarm to provide a notification when the threshold is breached.
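For context on the CloudWatch-based options: SageMaker publishes the Invocations metric automatically in the AWS/SageMaker namespace, so an alarm can be defined directly on it. A minimal boto3 sketch with hypothetical endpoint, threshold, and SNS topic values:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical endpoint name, threshold, and SNS topic ARN.
cloudwatch.put_metric_alarm(
    AlarmName="endpoint-invocation-threshold",
    Namespace="AWS/SageMaker",
    MetricName="Invocations",
    Dimensions=[
        {"Name": "EndpointName", "Value": "fraud-endpoint"},
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    Statistic="Sum",
    Period=300,                # evaluate invocation counts per 5 minutes
    EvaluationPeriods=1,
    Threshold=1000,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
```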
An ML engineer trained an ML model on Amazon SageMaker to detect automobile accidents from closed-circuit TV footage. The ML engineer used SageMaker Data Wrangler to create a training dataset of images of accidents and non-accidents.
The model performed well during training and validation. However, the model is underperforming in production because of variations in the quality of the images from various cameras.
Which solution will improve the model's accuracy in the LEAST amount of time?
A. Collect more images from all the cameras. Use Data Wrangler to prepare a new training dataset.
B. Recreate the training dataset by using the Data Wrangler corrupt image transform. Specify the impulse noise option.
C. Recreate the training dataset by using the Data Wrangler enhance image contrast transform. Specify the Gamma contrast option.
D. Recreate the training dataset by using the Data Wrangler resize image transform. Crop all images to the same size.
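As background for option C, gamma contrast rescales pixel intensities as out = 255 * (in / 255) ** gamma, which can make training images resemble footage from cameras of varying quality. An illustrative NumPy sketch (synthetic frame, hypothetical gamma values):

```python
import numpy as np

def gamma_contrast(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply gamma correction: out = 255 * (in / 255) ** gamma.

    gamma < 1 brightens mid-tones; gamma > 1 darkens them.
    """
    normalized = image.astype(np.float32) / 255.0
    corrected = np.power(normalized, gamma)
    return (corrected * 255.0).clip(0, 255).astype(np.uint8)

# Example: simulate varied camera quality by sweeping gamma values.
frame = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
augmented = [gamma_contrast(frame, g) for g in (0.5, 0.8, 1.2, 1.5)]
```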
A company is using Amazon SageMaker to create ML models. The company's data scientists need fine-grained control of the ML workflows that they orchestrate. The data scientists also need the ability to visualize SageMaker jobs and workflows as a directed acyclic graph (DAG). The data scientists must keep a running history of model discovery experiments and must establish model governance for auditing and compliance verifications.
Which solution will meet these requirements?
A. Use AWS CodePipeline and its integration with SageMaker Studio to manage the entire ML workflow. Use SageMaker ML Lineage Tracking for the running history of experiments and for auditing and compliance verifications.
B. Use AWS CodePipeline and its integration with SageMaker Experiments to manage the entire ML workflow. Use SageMaker Experiments for the running history of experiments and for auditing and compliance verifications.
C. Use SageMaker Pipelines and its integration with SageMaker Studio to manage the entire ML workflow. Use SageMaker ML Lineage Tracking for the running history of experiments and for auditing and compliance verifications.
D. Use SageMaker Pipelines and its integration with SageMaker Experiments to manage the entire ML workflow. Use SageMaker Experiments for the running history of experiments and for auditing and compliance verifications.
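For context on the SageMaker Pipelines options: a pipeline is defined as a set of steps whose dependencies form a DAG that SageMaker Studio can render visually. A minimal sketch using the SageMaker Python SDK, with hypothetical image URIs, role ARNs, and script names:

```python
from sagemaker.processing import ScriptProcessor
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep

# Hypothetical image URI, role ARN, and script name throughout.
processor = ScriptProcessor(
    image_uri="<processing-image-uri>",
    command=["python3"],
    role="<execution-role-arn>",
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

step_prepare = ProcessingStep(
    name="PrepareData",
    processor=processor,
    code="prepare.py",
)

# Steps are wired into a DAG that SageMaker Studio renders visually.
pipeline = Pipeline(name="fraud-model-pipeline", steps=[step_prepare])
pipeline.upsert(role_arn="<execution-role-arn>")
```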
A company needs to create a central catalog for all the company's ML models. The models are in AWS accounts where the company developed the models initially. The models are hosted in Amazon Elastic Container Registry (Amazon ECR) repositories.
Which solution will meet these requirements?
A. Configure ECR cross-account replication for each existing ECR repository. Ensure that each model is visible in each AWS account.
B. Create a new AWS account with a new ECR repository as the central catalog. Configure ECR cross-account replication between the initial ECR repositories and the central catalog.
C. Use the Amazon SageMaker Model Registry to create a model group for models hosted in Amazon ECR. Create a new AWS account. In the new account, use the SageMaker Model Registry as the central catalog. Attach a cross-account resource policy to each model group in the initial AWS accounts.
D. Use an AWS Glue Data Catalog to store the models. Run an AWS Glue crawler to migrate the models from the ECR repositories to the Data Catalog. Configure cross-account access to the Data Catalog.
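As background for option C, cross-account sharing of a model package group is implemented with a resource-based policy through the PutModelPackageGroupPolicy API. A minimal boto3 sketch with hypothetical account IDs and a hypothetical group name:

```python
import json

import boto3

sm = boto3.client("sagemaker")

# Hypothetical account IDs and group name for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::<central-account-id>:root"},
        "Action": [
            "sagemaker:DescribeModelPackage",
            "sagemaker:DescribeModelPackageGroup",
            "sagemaker:ListModelPackages",
        ],
        "Resource": "arn:aws:sagemaker:us-east-1:<source-account-id>:model-package-group/example-group",
    }],
}

# Attaches the cross-account resource policy to the model package group.
sm.put_model_package_group_policy(
    ModelPackageGroupName="example-group",
    ResourcePolicy=json.dumps(policy),
)
```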
Case Study -
A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.
The company needs to use the central model registry to manage different versions of models in the application.
Which action will meet this requirement with the LEAST operational overhead?
A. Create a separate Amazon Elastic Container Registry (Amazon ECR) repository for each model.
B. Use Amazon Elastic Container Registry (Amazon ECR) and unique tags for each model version.
C. Use the SageMaker Model Registry and model groups to catalog the models.
D. Use the SageMaker Model Registry and unique tags for each model version.
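For context on the Model Registry options: a model package group holds versioned model packages, and each registration auto-increments the version number. A minimal boto3 sketch with hypothetical names and a placeholder inference image:

```python
import boto3

sm = boto3.client("sagemaker")

# A model package group acts as the container for versions of one model.
sm.create_model_package_group(
    ModelPackageGroupName="churn-model",  # hypothetical name
    ModelPackageGroupDescription="All versions of the churn model",
)

# Registering a package into the group auto-increments the version.
sm.create_model_package(
    ModelPackageGroupName="churn-model",
    InferenceSpecification={
        "Containers": [{
            "Image": "<inference-image-uri>",
            "ModelDataUrl": "s3://example-bucket/model.tar.gz",
        }],
        "SupportedContentTypes": ["text/csv"],
        "SupportedResponseMIMETypes": ["text/csv"],
    },
    ModelApprovalStatus="PendingManualApproval",
)
```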
A company has AWS Glue data processing jobs that are orchestrated by an AWS Glue workflow. The AWS Glue jobs can run on a schedule or can be launched manually.
The company is developing pipelines in Amazon SageMaker Pipelines for ML model development. The pipelines will use the output of the AWS Glue jobs during the data processing phase of model development. An ML engineer needs to implement a solution that integrates the AWS Glue jobs with the pipelines.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use AWS Step Functions for orchestration of the pipelines and the AWS Glue jobs.
B. Use processing steps in SageMaker Pipelines. Configure inputs that point to the Amazon Resource Names (ARNs) of the AWS Glue jobs.
C. Use Callback steps in SageMaker Pipelines to start the AWS Glue workflow and to stop the pipelines until the AWS Glue jobs finish running.
D. Use Amazon EventBridge to invoke the pipelines and the AWS Glue jobs in the desired order.
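As background for option C, a Callback step publishes a token to an Amazon SQS queue and pauses the pipeline until an external worker reports success or failure. A minimal sketch using the SageMaker Python SDK, with a hypothetical queue URL and workflow name:

```python
from sagemaker.workflow.callback_step import (
    CallbackOutput,
    CallbackOutputTypeEnum,
    CallbackStep,
)

# Hypothetical SQS queue; a worker consumes the message, starts the
# AWS Glue workflow, and calls SendPipelineExecutionStepSuccess (or
# ...Failure) with the token once the Glue jobs finish.
glue_done = CallbackOutput(
    output_name="processed_data_uri",
    output_type=CallbackOutputTypeEnum.String,
)

step_glue = CallbackStep(
    name="RunGlueWorkflow",
    sqs_queue_url="https://sqs.us-east-1.amazonaws.com/123456789012/glue-callbacks",
    inputs={"workflow_name": "nightly-etl"},
    outputs=[glue_done],
)
```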
Case Study -
A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.
The company is experimenting with consecutive training jobs.
How can the company MINIMIZE infrastructure startup times for these jobs?
A. Use Managed Spot Training.
B. Use SageMaker managed warm pools.
C. Use SageMaker Training Compiler.
D. Use the SageMaker distributed data parallelism (SMDDP) library.
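For context on option B, SageMaker managed warm pools are enabled by setting a keep-alive period on the training job so the provisioned infrastructure is retained between consecutive jobs. A minimal sketch with hypothetical image, role, and data locations:

```python
from sagemaker.estimator import Estimator

# Hypothetical image URI, role ARN, and S3 path. The keep-alive period
# retains the provisioned instances after the job finishes, so the next
# consecutive job skips infrastructure startup.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="<execution-role-arn>",
    instance_count=1,
    instance_type="ml.g5.xlarge",
    keep_alive_period_in_seconds=1800,  # warm pool kept for 30 minutes
)

estimator.fit({"train": "s3://example-bucket/train/"})
```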
Case Study -
A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.
The company must implement a manual approval-based workflow to ensure that only approved models can be deployed to production endpoints.
Which solution will meet this requirement?
A. Use SageMaker Experiments to facilitate the approval process during model registration.
B. Use SageMaker ML Lineage Tracking on the central model registry. Create tracking entities for the approval process.
C. Use SageMaker Model Monitor to evaluate the performance of the model and to manage the approval.
D. Use SageMaker Pipelines. When a model version is registered, use the AWS SDK to change the approval status to "Approved."
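As background for option D, the approval gate in the Model Registry is the ModelApprovalStatus field, which the AWS SDK can flip after a manual review. A minimal boto3 sketch with a hypothetical model package ARN:

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical ARN; changing ModelApprovalStatus is what gates a
# pipeline-driven or EventBridge-driven deployment to production.
sm.update_model_package(
    ModelPackageArn="arn:aws:sagemaker:us-east-1:123456789012:model-package/churn-model/3",
    ModelApprovalStatus="Approved",
    ApprovalDescription="Manually reviewed and approved",
)
```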
Case Study -
A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.
The company needs to run an on-demand workflow to monitor bias drift for models that are deployed to real-time endpoints from the application.
Which action will meet this requirement?
A. Configure the application to invoke an AWS Lambda function that runs a SageMaker Clarify job.
B. Invoke an AWS Lambda function to pull the sagemaker-model-monitor-analyzer built-in SageMaker image.
C. Use AWS Glue Data Quality to monitor bias.
D. Use SageMaker notebooks to compare the bias.
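For context on option A, an on-demand bias check can be expressed as a SageMaker Clarify post-training bias job; in that design, this logic would run inside the Lambda handler. A hedged sketch using the SageMaker Python SDK, with hypothetical role, bucket, model, and column names:

```python
from sagemaker import Session, clarify

# Hypothetical role, bucket, model, and column names throughout.
session = Session()
processor = clarify.SageMakerClarifyProcessor(
    role="<execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=session,
)

data_config = clarify.DataConfig(
    s3_data_input_path="s3://example-bucket/monitoring-input/",
    s3_output_path="s3://example-bucket/bias-reports/",
    label="is_fraud",
    dataset_type="text/csv",
)
bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1],
    facet_name="age_group",
)
model_config = clarify.ModelConfig(
    model_name="fraud-model",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)
predictions_config = clarify.ModelPredictedLabelConfig(probability_threshold=0.5)

# Launches an on-demand Clarify processing job that computes
# post-training bias metrics and writes a report to S3.
processor.run_post_training_bias(
    data_config=data_config,
    data_bias_config=bias_config,
    model_config=model_config,
    model_predicted_label_config=predictions_config,
)
```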
Case Study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.
After the data is aggregated, the ML engineer must implement a solution to automatically detect anomalies in the data and to visualize the result.
Which solution will meet these requirements?
A. Use Amazon Athena to automatically detect the anomalies and to visualize the result.
B. Use Amazon Redshift Spectrum to automatically detect the anomalies. Use Amazon QuickSight to visualize the result.
C. Use Amazon SageMaker Data Wrangler to automatically detect the anomalies and to visualize the result.
D. Use AWS Batch to automatically detect the anomalies. Use Amazon QuickSight to visualize the result.
Case Study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.
The training dataset includes categorical data and numerical data. The ML engineer must prepare the training dataset to maximize the accuracy of the model.
Which action will meet this requirement with the LEAST operational overhead?
A. Use AWS Glue to transform the categorical data into numerical data.
B. Use AWS Glue to transform the numerical data into categorical data.
C. Use Amazon SageMaker Data Wrangler to transform the categorical data into numerical data.
D. Use Amazon SageMaker Data Wrangler to transform the numerical data into categorical data.
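As background, "transforming categorical data into numerical data" typically means an encoding such as one-hot encoding. An illustrative pandas sketch with hypothetical transaction columns:

```python
import pandas as pd

# Illustrative only: hypothetical transaction columns.
df = pd.DataFrame({
    "merchant_type": ["grocery", "travel", "grocery", "electronics"],
    "amount": [23.5, 410.0, 18.2, 199.9],
})

# One-hot encode the categorical column; numerical columns pass through.
encoded = pd.get_dummies(df, columns=["merchant_type"], dtype=int)
print(encoded)
```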
Case Study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.
Before the ML engineer trains the model, the ML engineer must resolve the issue of the imbalanced data.
Which solution will meet this requirement with the LEAST operational effort?
A. Use Amazon Athena to identify patterns that contribute to the imbalance. Adjust the dataset accordingly.
B. Use Amazon SageMaker Studio Classic built-in algorithms to process the imbalanced dataset.
C. Use AWS Glue DataBrew built-in features to oversample the minority class.
D. Use the Amazon SageMaker Data Wrangler balance data operation to oversample the minority class.
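For context on oversampling the minority class: the idea is to resample minority-class rows with replacement until the classes are balanced. An illustrative scikit-learn sketch on a tiny hypothetical fraud dataset:

```python
import pandas as pd
from sklearn.utils import resample

# Illustrative only: a tiny hypothetical fraud dataset with a rare
# positive class.
df = pd.DataFrame({
    "amount": [10, 12, 11, 950, 9, 14, 1020, 13],
    "is_fraud": [0, 0, 0, 1, 0, 0, 1, 0],
})
majority = df[df["is_fraud"] == 0]
minority = df[df["is_fraud"] == 1]

# Sample the minority class with replacement until the classes match.
minority_oversampled = resample(
    minority, replace=True, n_samples=len(majority), random_state=42
)
balanced = pd.concat([majority, minority_oversampled])
print(balanced["is_fraud"].value_counts())
```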
A company has deployed an XGBoost prediction model in production to predict if a customer is likely to cancel a subscription. The company uses Amazon SageMaker Model Monitor to detect deviations in the F1 score.
During a baseline analysis of model quality, the company recorded a threshold for the F1 score. After several months without changes to the model, the model's F1 score decreases significantly.
What could be the reason for the reduced F1 score?
A. Concept drift occurred in the underlying customer data that was used for predictions.
B. The model was not sufficiently complex to capture all the patterns in the original baseline data.
C. The original baseline data had a data quality issue of missing values.
D. Incorrect ground truth labels were provided to Model Monitor during the calculation of the baseline.
A company has a team of data scientists who use Amazon SageMaker notebook instances to test ML models. When the data scientists need new permissions, the company attaches the permissions to the individual IAM role that was created for each SageMaker notebook instance.
The company needs to centralize management of the team's permissions.
Which solution will meet this requirement?
A. Create a single IAM role that has the necessary permissions. Attach the role to each notebook instance that the team uses.
B. Create a single IAM group. Add the data scientists to the group. Associate the group with each notebook instance that the team uses.
C. Create a single IAM user. Attach the AdministratorAccess AWS managed IAM policy to the user. Configure each notebook instance to use the IAM user.
D. Create a single IAM group. Add the data scientists to the group. Create an IAM role. Attach the AdministratorAccess AWS managed IAM policy to the role. Associate the role with the group. Associate the group with each notebook instance that the team uses.
An ML engineer needs to use an ML model to predict the price of apartments in a specific location.
Which metric should the ML engineer use to evaluate the model's performance?
A. Accuracy
B. Area Under the ROC Curve (AUC)
C. F1 score
D. Mean absolute error (MAE)
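As background, mean absolute error is the average of |actual - predicted| and is expressed in the same units as the target, which suits regression targets such as prices. An illustrative scikit-learn sketch with hypothetical values:

```python
from sklearn.metrics import mean_absolute_error

# Illustrative only: hypothetical apartment prices in dollars.
actual = [250_000, 310_000, 198_000, 420_000]
predicted = [240_000, 325_000, 210_000, 400_000]

# MAE averages the absolute errors: (10000 + 15000 + 12000 + 20000) / 4.
print(mean_absolute_error(actual, predicted))  # 14250.0
```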
An ML engineer has trained a neural network by using stochastic gradient descent (SGD). The neural network performs poorly on the test set. The values for training loss and validation loss remain high and show an oscillating pattern. The values decrease for a few epochs and then increase for a few epochs before repeating the same cycle.
What should the ML engineer do to improve the training process?
A. Introduce early stopping.
B. Increase the size of the test set.
C. Increase the learning rate.
D. Decrease the learning rate.
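For context on the learning-rate options: an overly large step size makes gradient descent overshoot the minimum and oscillate, which is visible even on f(w) = w**2. An illustrative sketch:

```python
# Illustrative only: gradient descent on f(w) = w**2 with two step sizes.
def descend(lr: float, steps: int = 6, w: float = 1.0) -> list[float]:
    path = [w]
    for _ in range(steps):
        w -= lr * 2 * w  # gradient of w**2 is 2w
        path.append(w)
    return path

# lr too high: the iterate overshoots and oscillates around the minimum
# (the sign alternates and |w| shrinks only slowly).
print(descend(lr=0.9))
# smaller lr: smooth, monotonic convergence toward 0.
print(descend(lr=0.1))
```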
An ML engineer needs to process thousands of existing CSV objects, plus new CSV objects as they are uploaded. The CSV objects are stored in a central Amazon S3 bucket and have the same number of columns. One of the columns is a transaction date. The ML engineer must query the data based on the transaction date.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use an Amazon Athena CREATE TABLE AS SELECT (CTAS) statement to create a table based on the transaction date from data in the central S3 bucket. Query the objects from the table.
B. Create a new S3 bucket for processed data. Set up S3 replication from the central S3 bucket to the new S3 bucket. Use S3 Object Lambda to query the objects based on transaction date.
C. Create a new S3 bucket for processed data. Use AWS Glue for Apache Spark to create a job to query the CSV objects based on transaction date. Configure the job to store the results in the new S3 bucket. Query the objects from the new S3 bucket.
D. Create a new S3 bucket for processed data. Use Amazon Data Firehose to transfer the data from the central S3 bucket to the new S3 bucket. Configure Firehose to run an AWS Lambda function to query the data based on transaction date.
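As background for option A, a CTAS statement can write a partitioned copy of the data so that later queries prune by transaction date. A minimal boto3 sketch with hypothetical database, table, and bucket names:

```python
import boto3

athena = boto3.client("athena")

# Hypothetical database, table, and bucket names. CTAS writes a Parquet
# copy partitioned by transaction date so later queries prune partitions.
ctas = """
CREATE TABLE transactions_by_date
WITH (
    format = 'PARQUET',
    external_location = 's3://example-bucket/processed/',
    partitioned_by = ARRAY['transaction_date']
) AS
SELECT amount, customer_id, transaction_date  -- partition column last
FROM raw_transactions
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "fraud_db"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)
```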
A company has a large, unstructured dataset. The dataset includes many duplicate records across several key attributes.
Which solution on AWS will detect duplicates in the dataset with the LEAST code development?
A. Use Amazon Mechanical Turk jobs to detect duplicates.
B. Use Amazon QuickSight ML Insights to build a custom deduplication model.
C. Use Amazon SageMaker Data Wrangler to pre-process and detect duplicates.
D. Use the AWS Glue FindMatches transform to detect duplicates.
A company needs to run a batch data-processing job on Amazon EC2 instances. The job will run during the weekend and will take 90 minutes to finish running. The processing can handle interruptions. The company will run the job every weekend for the next 6 months.
Which EC2 instance purchasing option will meet these requirements MOST cost-effectively?
A. Spot Instances
B. Reserved Instances
C. On-Demand Instances
D. Dedicated Instances
An ML engineer has an Amazon Comprehend custom model in Account A in the us-east-1 Region. The ML engineer needs to copy the model to Account B in the same Region.
Which solution will meet this requirement with the LEAST development effort?
A. Use Amazon S3 to make a copy of the model. Transfer the copy to Account B.
B. Create a resource-based IAM policy. Use the Amazon Comprehend ImportModel API operation to copy the model to Account B.
C. Use AWS DataSync to replicate the model from Account A to Account B.
D. Create an AWS Site-to-Site VPN connection between Account A and Account B to transfer the model.
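For context on option B, the copy is performed with the Comprehend ImportModel API from the destination account, after the source account grants access through a resource-based policy. A minimal boto3 sketch with hypothetical ARNs:

```python
import boto3

comprehend = boto3.client("comprehend")  # running in Account B

# Hypothetical ARNs; Account A must first attach a resource-based policy
# to the model that grants Account B the comprehend:ImportModel action.
comprehend.import_model(
    SourceModelArn="arn:aws:comprehend:us-east-1:111111111111:document-classifier/fraud-docs/version/1",
    ModelName="fraud-docs-copy",
)
```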
An ML engineer is training a simple neural network model. The ML engineer tracks the performance of the model over time on a validation dataset. The model's performance improves substantially at first and then degrades after a specific number of epochs.
Which solutions will mitigate this problem? (Choose two.)
A. Enable early stopping on the model.
B. Increase dropout in the layers.
C. Increase the number of layers.
D. Increase the number of neurons.
E. Investigate and reduce the sources of model bias.
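As background for options A and B, early stopping and dropout are both one-line additions in most frameworks. An illustrative Keras sketch on synthetic stand-in data (hypothetical layer sizes and patience):

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data; real features and labels would replace this.
x = np.random.rand(200, 20).astype("float32")
y = np.random.randint(0, 2, size=(200, 1))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dropout(0.3),  # option B: dropout regularization
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Option A: stop when validation loss stops improving and keep the
# weights from the best epoch.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)
model.fit(x, y, validation_split=0.2, epochs=100,
          callbacks=[early_stop], verbose=0)
```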
A company has a Retrieval Augmented Generation (RAG) application that uses a vector database to store embeddings of documents. The company must migrate the application to AWS and must implement a solution that provides semantic search of text files. The company has already migrated the text repository to an Amazon S3 bucket.
Which solution will meet these requirements?
A. Use an AWS Batch job to process the files and generate embeddings. Use AWS Glue to store the embeddings. Use SQL queries to perform the semantic searches.
B. Use a custom Amazon SageMaker notebook to run a custom script to generate embeddings. Use SageMaker Feature Store to store the embeddings. Use SQL queries to perform the semantic searches.
C. Use the Amazon Kendra S3 connector to ingest the documents from the S3 bucket into Amazon Kendra. Query Amazon Kendra to perform the semantic searches.
D. Use an Amazon Textract asynchronous job to ingest the documents from the S3 bucket. Query Amazon Textract to perform the semantic searches.
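For context on option C, once the S3 connector has ingested the documents, a semantic search is a single Query call against the index. A minimal boto3 sketch with a hypothetical index ID and question:

```python
import boto3

kendra = boto3.client("kendra")

# Hypothetical index ID; the S3 connector keeps the index in sync with
# the bucket, and Query returns semantically ranked passages.
response = kendra.query(
    IndexId="<kendra-index-id>",
    QueryText="What is our refund policy for damaged goods?",
)

for result in response["ResultItems"]:
    print(result["Type"], result.get("DocumentTitle", {}).get("Text"))
```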
A company uses Amazon Athena to query a dataset in Amazon S3. The dataset has a target variable that the company wants to predict.
The company needs to use the dataset in a solution to determine if a model can predict the target variable.
Which solution will provide this information with the LEAST development effort?
A. Create a new model by using Amazon SageMaker Autopilot. Report the model's achieved performance.
B. Implement custom scripts to perform data pre-processing, multiple linear regression, and performance evaluation. Run the scripts on Amazon EC2 instances.
C. Configure Amazon Macie to analyze the dataset and to create a model. Report the model's achieved performance.
D. Select a model from Amazon Bedrock. Tune the model with the data. Report the model's achieved performance.
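As background for option A, an Autopilot job needs little more than the dataset location and the target column; it then reports the best candidate's objective metric, answering "can this target be predicted?" with minimal custom code. A minimal boto3 sketch with hypothetical names:

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical job name, bucket, target column, and role ARN.
sm.create_auto_ml_job(
    AutoMLJobName="target-feasibility-check",
    InputDataConfig=[{
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://example-bucket/dataset/",
        }},
        "TargetAttributeName": "target",
    }],
    OutputDataConfig={"S3OutputPath": "s3://example-bucket/autopilot-output/"},
    RoleArn="<execution-role-arn>",
)
```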