AWS Certified Machine Learning - Specialty - (MLS-C01) Exam Questions

Total Questions: 333

Last Updated: Sep 2025


Question 11 Single Choice

A data science team at your organization is building a machine learning model to forecast house sale prices from features such as the home's square footage. However, approximately 10% of the entries in the small training dataset are missing the square footage attribute. Given the importance of model accuracy in your application, which approach should the team use to handle the missing values in the training data?
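One option sometimes considered for a scenario like this is to predict each missing value from the other features rather than dropping rows or filling in a constant. The sketch below is only an illustration under stated assumptions: the second feature (bedroom count), the toy data, and the helper names `fit_line` and `impute_with_regression` are all hypothetical and do not come from the question.

```python
# Hypothetical toy data: (bedrooms, square_footage); None marks a missing value.
rows = [(2, 900.0), (3, 1400.0), (4, 1900.0), (3, None), (5, 2400.0)]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def impute_with_regression(rows):
    """Fill missing square footage with a prediction from the complete rows."""
    known = [(x, y) for x, y in rows if y is not None]
    slope, intercept = fit_line([x for x, _ in known], [y for _, y in known])
    return [(x, y if y is not None else slope * x + intercept) for x, y in rows]

imputed = impute_with_regression(rows)
print(imputed)
```

Only the rows with a known square footage are used to fit the line, so the imputed values never feed back into the fit.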

Question 12 Multiple Choice

A data scientist is training a deep learning model for image classification using a convolutional neural network (CNN). The model performs exceptionally well on the training data but significantly underperforms on new, unseen images. To minimize overfitting and improve the model's generalization to new data, which TWO of the following approaches should the data scientist take? (Select TWO)

Question 13 Single Choice

As a data scientist involved in the development of a self-driving car system, your task is to implement a computer vision solution capable of categorizing every pixel in images captured by the car's cameras. The categories include identifying objects like people, buildings, roads, signs, and vehicles.

Which approach should you use to build this solution?

Question 14 Single Choice

A financial services firm is leveraging Amazon SageMaker to develop machine learning models that predict market trends. Due to the sensitive nature of their data, the firm's policy prohibits direct internet access from their virtual private cloud (VPC) to ensure the security of their data. They require the ability to use SageMaker notebook instances for model development without exposing these instances to the internet. What approach should the firm take to securely utilize SageMaker notebooks within their VPC in compliance with their security policy?

Question 15 Single Choice

A data scientist at a retail company is analyzing customer purchase patterns to segment them into distinct groups for targeted marketing campaigns. To achieve this, the scientist is employing k-Means clustering. What is the most effective method for selecting the optimal number of clusters (k) to accurately categorize customers into meaningful segments?
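A widely used heuristic for choosing k is the elbow method: compute the within-cluster sum of squares (WCSS) for a range of k values and look for the point where further increases in k stop reducing it by much. The following is a minimal pure-Python sketch on one-dimensional data; the `spend` values, the deterministic centroid initialization, and the function name `kmeans_1d` are assumptions made for illustration only.

```python
def kmeans_1d(points, k, iterations=20):
    """Lloyd's algorithm on 1-D data; returns the within-cluster sum of squares."""
    ordered = sorted(points)
    # Deterministic init: spread the starting centroids evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: (p - centroids[c]) ** 2)
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sum(min((p - c) ** 2 for c in centroids) for p in points)

# Hypothetical customer spend values that form three obvious groups.
spend = [10, 11, 12, 13, 50, 51, 52, 53, 90, 91, 92, 93]

wcss = {k: kmeans_1d(spend, k) for k in range(1, 7)}
# How much WCSS falls when moving from k - 1 clusters to k clusters.
drops = {k: wcss[k - 1] - wcss[k] for k in range(2, 7)}

for k in sorted(wcss):
    print(k, round(wcss[k], 2))
```

On this toy data the reduction in WCSS is large up to k = 3 and negligible afterwards, which is the "elbow" pattern the method looks for. Real customer data is rarely this clean, so the plot is usually read alongside other checks.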

Question 16 Multiple Choice

As part of a major content digitization initiative, your team has been tasked with organizing a vast library of encyclopedia articles to enable efficient search and retrieval. The articles are currently stored as raw text, without any pre-assigned topic labels or categories. To unlock the full value of this content, you need a way to automatically classify the articles into relevant topics with minimal manual effort. Which AWS services or tools would you recommend for this task? (Select TWO)

Question 17 Single Choice

A healthcare analytics firm is leveraging Amazon SageMaker to train machine learning models on sensitive patient data. To comply with strict data privacy regulations, the training jobs are configured to run within a Virtual Private Cloud (VPC) that lacks direct internet access. What method should be employed to ensure these training jobs can securely access training data stored in an Amazon S3 bucket?

Question 18 Single Choice

In a clinical trial dataset that encompasses a variety of features including Mean Arterial Pressure (MAP), it's observed that the features exhibit low correlation with one another. The dataset is almost complete, with less than 1% of the MAP values missing. Aside from a few outliers, the MAP data is relatively uniformly distributed, and all other features are fully accounted for. Given these characteristics, which approach should be adopted to manage the missing MAP data most effectively?
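When only a small fraction of values is missing and a few outliers are present, a simple statistic that is robust to outliers, such as the median, is one candidate for filling the gaps. The sketch below compares it with the mean; the MAP readings and the helper name `impute_median` are hypothetical and chosen only to show the effect of an outlier.

```python
from statistics import mean, median

# Hypothetical MAP readings in mmHg; None marks a missing value, 250.0 is an outlier.
map_values = [88.0, 92.0, None, 95.0, 90.0, 250.0, 91.0]

def impute_median(values):
    """Replace missing entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

observed = [v for v in map_values if v is not None]
imputed = impute_median(map_values)

# The single outlier pulls the mean upward, while the median stays near typical values.
print("mean of observed:", round(mean(observed), 2))
print("median of observed:", median(observed))
```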

Question 19 Single Choice

A team is developing a "Universal Translator" application that can recognize spoken language, translate it into English, and then articulate the English translation audibly. Which sequence of AWS services should be implemented to achieve this functionality?

Question 20 Single Choice

An AI developer is fine-tuning a deep learning model for image recognition tasks. During training, the model's performance is measured by its accuracy on a separate validation dataset after each epoch. The model shows consistent improvement in accuracy up to the 100th epoch. After the 100th epoch, however, the training accuracy continues to improve while the validation accuracy starts to decline. What is the most appropriate remediation for this divergence between the training and validation accuracy?
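The pattern described here, where validation accuracy peaks and then declines while training accuracy keeps rising, is commonly handled with early stopping: halt training once the validation metric has not improved for a set number of epochs and keep the best checkpoint. A minimal sketch follows; the function name `early_stopping_epoch`, the `patience` value, and the accuracy history are illustrative assumptions rather than part of the question.

```python
def early_stopping_epoch(val_accuracies, patience=3):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops at the first epoch where validation accuracy has gone
    `patience` consecutive epochs without beating its best value so far.
    """
    best_acc = float("-inf")
    best_epoch = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Patience never ran out: training used every available epoch.
    return len(val_accuracies), best_epoch

# Hypothetical validation accuracy per epoch: rises, peaks, then declines.
history = [0.70, 0.78, 0.83, 0.85, 0.84, 0.83, 0.82, 0.80]
stop_epoch, best_epoch = early_stopping_epoch(history, patience=3)
print("stop at epoch", stop_epoch, "and restore weights from epoch", best_epoch)
```

In practice the weights from the best epoch would be saved as a checkpoint and restored after stopping; that bookkeeping is omitted here.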
