Certified Data Analyst Associate Exam Questions

Question 11 Single Choice

In a Databricks Lakehouse, you are working with silver-level data, which consists of refined, cleansed, and transformed datasets.

You have a silver table named CustomerInteractions that needs deduplication based on the customer_id and interaction_date columns.

The requirement is to retain only the latest interaction for each customer on any given date. Which approach would be the most efficient for cleaning this data in Databricks?
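
As a study aid, the requirement can be sketched with a window function: number the rows within each (customer_id, interaction_date) group, newest first, and keep row 1. This is a minimal sketch, not the exam's answer key. It assumes a hypothetical `interaction_ts` column that orders interactions within a day and a hypothetical `channel` payload column, neither of which appears in the question, and it runs on Python's built-in SQLite (3.25 or later) as a local stand-in for Databricks SQL.

```python
import sqlite3

# Hypothetical sample rows: (customer_id, interaction_date, interaction_ts, channel).
# interaction_ts and channel are assumptions; the question names only
# customer_id and interaction_date.
SAMPLE_INTERACTIONS = [
    ("c1", "2025-01-01", "2025-01-01 09:00", "email"),
    ("c1", "2025-01-01", "2025-01-01 17:00", "phone"),
    ("c2", "2025-01-01", "2025-01-01 10:00", "chat"),
]

DEDUP_QUERY = """
    SELECT customer_id, interaction_date, interaction_ts, channel
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id, interaction_date
                   ORDER BY interaction_ts DESC
               ) AS rn
        FROM CustomerInteractions
    ) AS ranked
    WHERE rn = 1
    ORDER BY customer_id, interaction_date
"""


def dedupe_latest(rows):
    """Return one row per (customer_id, interaction_date): the latest one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CustomerInteractions ("
        "customer_id TEXT, interaction_date TEXT, "
        "interaction_ts TEXT, channel TEXT)"
    )
    conn.executemany(
        "INSERT INTO CustomerInteractions VALUES (?, ?, ?, ?)", rows
    )
    result = conn.execute(DEDUP_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in dedupe_latest(SAMPLE_INTERACTIONS):
        print(row)
```

The same `ROW_NUMBER() ... PARTITION BY ... ORDER BY ... DESC` pattern is standard SQL, so the query text should carry over to a Databricks SQL editor with little change, although that is not verified here.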

Question 12 Single Choice

How can you determine if a table in Databricks is managed or unmanaged?

Question 13 Single Choice

Which of the following best describes the key audience and side audiences for Databricks SQL? Select the most appropriate option.

Question 14 Single Choice

In a Databricks SQL environment, you are tasked with analyzing sales data. The table SalesRecords contains columns SaleID, ProductID, SaleAmount, and SaleDate.

You need to find all products that had their first sale amounting to more than $500.

Which SQL query using a subquery appropriately retrieves this information?
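
One way to read "first sale" is the row with the earliest SaleDate for each ProductID. The sketch below expresses that with a correlated subquery and is offered as an illustration only, not as the exam's listed answer. The sample rows are invented, it assumes each product has a single sale on its earliest date, and it uses Python's built-in SQLite as a stand-in for Databricks SQL.

```python
import sqlite3

# Invented sample rows: (SaleID, ProductID, SaleAmount, SaleDate).
SAMPLE_SALES = [
    (1, "A", 600.0, "2025-01-01"),  # first sale of A, above 500
    (2, "A", 100.0, "2025-01-05"),
    (3, "B", 200.0, "2025-01-02"),  # first sale of B, below 500
    (4, "B", 900.0, "2025-01-03"),
    (5, "C", 750.0, "2025-02-01"),  # only sale of C, above 500
]

FIRST_SALE_QUERY = """
    SELECT s.ProductID
    FROM SalesRecords AS s
    WHERE s.SaleDate = (
              -- correlated subquery: earliest sale date for this product
              SELECT MIN(f.SaleDate)
              FROM SalesRecords AS f
              WHERE f.ProductID = s.ProductID
          )
      AND s.SaleAmount > 500
    ORDER BY s.ProductID
"""


def products_with_large_first_sale(rows):
    """Return ProductIDs whose earliest-dated sale exceeded 500."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SalesRecords ("
        "SaleID INTEGER, ProductID TEXT, SaleAmount REAL, SaleDate TEXT)"
    )
    conn.executemany("INSERT INTO SalesRecords VALUES (?, ?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(FIRST_SALE_QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    print(products_with_large_first_sale(SAMPLE_SALES))
```

Product B is excluded even though it later had a 900 sale, because only the earliest sale is compared against the threshold.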

Question 15 Single Choice

Regarding the accessibility and functionality of Databricks SQL dashboards, which statement best reflects their use in a business environment by various stakeholders?

Question 16 Single Choice

In the context of Unity Catalog in Databricks, what is the purpose of specifying a MANAGED LOCATION when creating a catalog or schema?

Question 17 Single Choice

You have a table customer_orders in Databricks SQL with a column order_info in JSON format. This column includes a nested array items, where each element has product_id (string) and quantity (integer).

Which of the following queries returns a single row with two columns—product_id and quantity—for the product with product_id equal to “X100”?


Syntax used to create the table CustomerOrders:

Question 18 Single Choice

In Databricks, how does the persistence of data differ between a view and a temporary view?

Question 19 Multiple Choice

In a Databricks SQL context, consider a dataset with columns 'Department', 'Employee', and 'Sales'. You are required to analyze the data using the ROLLUP and CUBE functions.


Given this scenario, select the correct statements regarding the type of aggregations ROLLUP and CUBE would generate when applied to the 'Department' and 'Employee' columns.
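
To reason about this question, it can help to list the grouping sets each operator produces. In standard SQL, ROLLUP over n columns yields the n + 1 prefixes of the column list (down to the grand total), while CUBE yields all 2^n subsets. The following sketch enumerates them in plain Python; the helper names are hypothetical and no Databricks API is involved.

```python
from itertools import combinations


def rollup_grouping_sets(columns):
    """Grouping sets for GROUP BY ROLLUP(columns): every prefix, longest first."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def cube_grouping_sets(columns):
    """Grouping sets for GROUP BY CUBE(columns): every subset, largest first."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets


if __name__ == "__main__":
    cols = ["Department", "Employee"]
    print("ROLLUP:", rollup_grouping_sets(cols))
    print("CUBE:  ", cube_grouping_sets(cols))
```

For 'Department' and 'Employee', ROLLUP gives (Department, Employee), (Department), and the empty set for the grand total. CUBE gives those three plus (Employee) on its own.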

Question 20 Single Choice

What is the primary purpose of Databricks SQL endpoints/warehouses in a data analytics environment?
