
AWS Certified Machine Learning - Specialty - (MLS-C01) Exam Questions

Total Questions: 333

Last Updated: SEP 2025


Question 1 Single Choice

A data science team at your company is planning to utilize Amazon SageMaker to train an XGBoost model to predict customer churn. The dataset comprises millions of rows, necessitating significant pre-processing to ensure model accuracy. To handle this task efficiently, the team has decided to leverage Apache Spark due to its capability for large-scale data processing. As the lead architect, you are tasked with designing a solution that integrates Apache Spark for data pre-processing while optimizing for simplicity and scalability.


What is the simplest architecture that allows the team to pre-process the data at scale using Apache Spark before training the model with XGBoost on SageMaker?

Question 2 Single Choice

A company uses the built-in PCA algorithm in Amazon SageMaker and stores its training data on Amazon S3. It has observed significant expenses from the Amazon Elastic Block Store (EBS) volumes attached to its SageMaker training instances.


Which parameter setting should they adjust in the AlgorithmSpecification to effectively reduce these EBS costs?
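For orientation, the AlgorithmSpecification referred to here is a structure in the SageMaker CreateTrainingJob request. The sketch below only illustrates its shape as a plain Python dictionary; the helper name and the placeholder image URI are hypothetical, and the value shown is the API default rather than a suggested answer.

```python
# Input modes accepted by the TrainingInputMode field of AlgorithmSpecification.
VALID_INPUT_MODES = ("File", "Pipe", "FastFile")


def build_algorithm_specification(training_image, input_mode="File"):
    """Return an AlgorithmSpecification-shaped dict (hypothetical helper).

    training_image is a placeholder for the algorithm container image URI.
    """
    if input_mode not in VALID_INPUT_MODES:
        raise ValueError(f"unsupported input mode: {input_mode!r}")
    return {
        "TrainingImage": training_image,
        "TrainingInputMode": input_mode,
    }


# Placeholder URI, not a real image; "File" is the default input mode.
spec = build_algorithm_specification("<pca-image-uri>")
```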

Question 3 Single Choice

In Amazon Elastic File System (EFS), when performance monitoring shows that IOPS usage is nearing 100%, which of the following actions should be taken to effectively manage the file system's performance?

Question 4 Multiple Choice

A machine learning team is building a recommendation system using user clickstream data collected from a popular e-commerce website. The raw data is semi-structured JSON and includes nested fields for session activity, product views, and user metadata. The team wants to process this data daily for feature engineering and store the transformed data in a format that is:

  • Efficient for analytical queries

  • Compatible with Amazon SageMaker training jobs

  • Cost-effective to store at scale

Which of the following solutions would best meet these requirements? (Select TWO)
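To make the data shape in this scenario concrete, here is a minimal sketch of flattening one nested clickstream record into flat column names, a typical first step before writing tabular features. The field names and the sample event are hypothetical and not taken from the question; the sketch uses only the Python standard library and does not imply any particular AWS service.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Recursively flatten nested dicts into a single-level dict."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Descend into nested objects, prefixing child keys with the parent name.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical clickstream event with nested user and session fields.
raw_event = json.loads(
    '{"user": {"id": "u1", "segment": "new"},'
    ' "session": {"id": "s9", "product_views": 3}}'
)
flat_event = flatten(raw_event)
```

Each key of flat_event (for example "session.product_views") then maps to one column of the transformed dataset.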

Question 5 Multiple Choice

While optimizing a machine learning model on Amazon SageMaker, you find that the automatic hyperparameter tuning job is excessively resource-intensive and costly. Which of the following strategies could effectively reduce these costs? (Select TWO)

Question 6 Single Choice

A healthcare company is planning to develop a machine learning model to predict patient readmission rates based on historical patient data. The data science team needs to create a data repository that integrates various types of patient data such as demographics, previous medical history, medication records, and lab test results.


Which strategy should the data engineering team use to identify and organize the primary data sources effectively, ensuring the data is accessible and formatted suitably for training the machine learning model?

Question 7 Single Choice

A data analyst is tasked with performing exploratory data analysis on a dataset of tweets to understand user sentiment towards various topics. The goal is to label tweets accurately for further sentiment analysis. Which AWS service or feature should the analyst use to efficiently categorize and label the dataset, ensuring a solid foundation for subsequent detailed analysis?

Question 8 Single Choice

A leading news portal wants to deliver personalized article recommendations by training a machine learning model daily on historical clickstream data. The volume of incoming data is normally consistent but spikes substantially during major elections, when site traffic increases. Which architecture would provide the most cost-effective and reliable framework for these conditions?

Question 9 Single Choice

A data engineering team is tasked with optimizing the storage of large-scale satellite imagery data, which will be used to train an Amazon SageMaker MXNet image classification algorithm.


Which data format should they use to ensure optimal training performance?

Question 10 Single Choice

An autonomous vehicle technology company is seeking an AWS solution capable of classifying street sign images with minimal latency, handling thousands of images each second. Which AWS services would most effectively fulfill this requirement?
