Certified Machine Learning Associate Exam Questions

Total Questions: 278

Last Updated: SEP 2025

Question 11 Single Choice

A machine learning engineer has evaluated a new Staging version of a model in the MLflow Model Registry. After passing all the tests, the engineer would like to move this model to production by transitioning it to the Production stage in the Model Registry.

From which section in Databricks Machine Learning can the engineer achieve this?

Choose only ONE best answer.

Question 12 Multiple Choice

Which of the following techniques help reduce overfitting?

Select ALL that Apply.

Question 13 Single Choice

In Databricks AutoML, how can you navigate to the code for the best model across all of the model iterations?

Choose only ONE best answer.

Question 14 Single Choice

A data scientist has built a Spark ML Pipeline with a random forest regressor as its final stage. They have started a cross-validation process that uses the whole pipeline, including the random forest regressor, as its estimator.

What potential downside could arise from placing the pipeline inside the cross-validation process?

Choose only ONE best answer.
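
As background for this scenario, the cost of cross-validating a whole pipeline can be sketched with simple arithmetic. The sketch below is plain Python rather than Spark code, and the fold count, the parameter grid, and the number of preprocessing stages are all hypothetical values chosen for illustration. It assumes that every fit of the pipeline refits every stage in it, and it ignores the final refit on the full training set.

```python
# Hypothetical settings, chosen only for illustration (not from the question).
num_folds = 3
param_grid = {"maxDepth": [5, 10], "numTrees": [20, 50]}
num_preprocessing_stages = 2  # assumed: e.g. an indexer and an assembler


def count_param_combinations(grid):
    """Number of distinct hyperparameter combinations in the grid."""
    total = 1
    for values in grid.values():
        total *= len(values)
    return total


def pipeline_fits(folds, grid):
    """Times the whole pipeline is fit during cross-validation
    (one fit per fold per combination, final refit excluded)."""
    return folds * count_param_combinations(grid)


combos = count_param_combinations(param_grid)
cv_fits = pipeline_fits(num_folds, param_grid)

# Assumption: each pipeline fit refits every preprocessing stage.
preprocessing_fits = cv_fits * num_preprocessing_stages

print(f"{combos} combinations x {num_folds} folds = {cv_fits} pipeline fits")
print(f"preprocessing stages refit {preprocessing_fits} times in total")
```

Under these assumed numbers the preprocessing stages are refit far more often than if they were fit once outside the cross-validation loop. The trade-off is that fitting them once means they see the validation folds too.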

Question 15 Single Choice

A novice data scientist has recently joined an ongoing machine learning project. The project runs as a daily scheduled retraining job, housed in a Databricks Repo. The scientist's task is to improve the feature engineering in the pipeline's preprocessing phase. They aim to change the code in a way that will integrate seamlessly into the project without disrupting the daily operations.

Which strategy should the data scientist adopt to successfully execute this task?

Choose only ONE best answer.

Question 16 Single Choice

In Databricks Model Registry, how are different versions of a model with the same model name distinguished from each other?

Choose only ONE best answer.

Question 17 Single Choice

What is the primary use case for mapInPandas() in Databricks?

Choose only ONE best answer.

Question 18 Single Choice

A data scientist is carrying out hyperparameter optimization using an iterative optimization algorithm. Each evaluation of a unique set of hyperparameter values is trained on a separate compute node. They are running eight evaluations in total on eight compute nodes. Although the model's accuracy varies across the eight evaluations, they observe no consistent pattern of improvement in accuracy.

What change could the data scientist make to improve the model's accuracy over the course of the tuning process?

Choose only ONE best answer.
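
As background for this scenario, the interaction between parallelism and an iterative search can be sketched in plain Python. This is a simplified model, not the API of any tuning library: it assumes evaluations run in synchronous rounds of `parallelism` trials, and that a trial can only use results from rounds that finished before it started. The function names are hypothetical.

```python
import math


def sequential_rounds(max_evals, parallelism):
    """Number of rounds needed when `parallelism` trials run at once."""
    return math.ceil(max_evals / parallelism)


def informed_evaluations(max_evals, parallelism):
    """Trials that start after at least one earlier round has finished,
    so the optimizer has some results to learn from."""
    return max(0, max_evals - parallelism)


# Compare the scenario above (8 trials on 8 nodes) with lower parallelism.
for parallelism in (8, 4, 2, 1):
    rounds = sequential_rounds(8, parallelism)
    informed = informed_evaluations(8, parallelism)
    print(f"parallelism={parallelism}: {rounds} round(s), "
          f"{informed} of 8 trials can use earlier results")
```

In this model, running all eight trials at once leaves a single round, so no trial can build on another and the search behaves like random sampling. Lower parallelism or more evaluations gives the algorithm later rounds to learn from.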

Question 19 Single Choice

Why can pandas API syntax be used inside a pandas UDF that is applied to a Spark DataFrame?

Choose only ONE best answer.

Question 20 Single Choice

Which in-memory columnar data format is used by Pandas API on Spark to efficiently transfer data between JVM and Python processes?
