

Certified Data Analyst Associate Exam Questions
Question 11 Single Choice
In a Databricks Lakehouse, you are working with silver-level data, which consists of refined, cleansed, and transformed datasets.
You have a silver table named CustomerInteractions that needs deduplication based on the customer_id and interaction_date columns.
The requirement is to retain only the latest interaction for each customer on any given date. Which approach would be the most efficient for cleaning this data in Databricks?
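One common way to express this kind of requirement is a window function that ranks rows within each (customer_id, interaction_date) group and keeps only the top-ranked row. The sketch below runs that pattern locally with Python's sqlite3 module as a stand-in for a Databricks SQL warehouse. The interaction_ts and channel columns and the sample rows are assumptions made for illustration, since the question does not name the column that defines "latest".

```python
import sqlite3

# In-memory SQLite database used only as a local stand-in for a SQL warehouse.
# Window functions require SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")

# interaction_ts and channel are hypothetical columns added for this example.
conn.execute(
    """
    CREATE TABLE CustomerInteractions (
        customer_id      INTEGER,
        interaction_date TEXT,
        interaction_ts   TEXT,
        channel          TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO CustomerInteractions VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-05", "2024-01-05 09:00:00", "email"),
        (1, "2024-01-05", "2024-01-05 17:30:00", "phone"),
        (1, "2024-01-06", "2024-01-06 08:15:00", "chat"),
        (2, "2024-01-05", "2024-01-05 11:00:00", "email"),
    ],
)

# Rank rows per (customer_id, interaction_date), newest first, and keep rank 1.
DEDUP_SQL = """
WITH ranked AS (
    SELECT customer_id, interaction_date, interaction_ts, channel,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id, interaction_date
               ORDER BY interaction_ts DESC
           ) AS rn
    FROM CustomerInteractions
)
SELECT customer_id, interaction_date, interaction_ts, channel
FROM ranked
WHERE rn = 1
ORDER BY customer_id, interaction_date
"""

latest_rows = conn.execute(DEDUP_SQL).fetchall()
for row in latest_rows:
    print(row)
```

The PARTITION BY columns define what counts as a duplicate, and the ORDER BY inside the window decides which row survives. The same ROW_NUMBER pattern is standard SQL, though a real silver table would need its own ordering column.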
Question 12 Single Choice
How can you determine if a table in Databricks is managed or unmanaged?
Question 13 Single Choice
Which of the following best describes the key audience and side audiences for Databricks SQL? Select the most appropriate option.
Question 14 Single Choice
In a Databricks SQL environment, you are tasked with analyzing sales data. The table SalesRecords contains columns SaleID, ProductID, SaleAmount, and SaleDate.
You need to find all products whose first sale amounted to more than $500.
Which SQL query using a subquery appropriately retrieves this information?
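As a rough illustration of the subquery pattern this question is about, the sketch below uses a correlated subquery to find each product's earliest SaleDate and then filters on the amount of that sale. It runs against Python's sqlite3 module as a local stand-in, and the sample rows are invented for the example. It also assumes that "first sale" means the row with the minimum SaleDate for the product.

```python
import sqlite3

# In-memory SQLite database used only as a local stand-in for a SQL warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE SalesRecords (
        SaleID     INTEGER,
        ProductID  TEXT,
        SaleAmount REAL,
        SaleDate   TEXT
    )
    """
)

# Hypothetical sample data: P1 and P3 open above 500, P2 opens below it.
conn.executemany(
    "INSERT INTO SalesRecords VALUES (?, ?, ?, ?)",
    [
        (1, "P1", 600.0, "2024-01-01"),
        (2, "P1", 100.0, "2024-02-01"),
        (3, "P2", 200.0, "2024-01-10"),
        (4, "P2", 900.0, "2024-03-01"),
        (5, "P3", 750.0, "2024-01-20"),
    ],
)

# The correlated subquery returns the earliest sale date for the outer row's
# product; the outer filter then checks the amount of that first sale.
# DISTINCT guards against several qualifying sales on the same first date.
FIRST_SALE_SQL = """
SELECT DISTINCT s.ProductID
FROM SalesRecords AS s
WHERE s.SaleDate = (
          SELECT MIN(f.SaleDate)
          FROM SalesRecords AS f
          WHERE f.ProductID = s.ProductID
      )
  AND s.SaleAmount > 500
ORDER BY s.ProductID
"""

first_sale_products = [row[0] for row in conn.execute(FIRST_SALE_SQL)]
print(first_sale_products)
```

Note that P2 is excluded even though it has a later sale of 900, because only the amount on the earliest date is tested.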
Question 15 Single Choice
Regarding the accessibility and functionality of Databricks SQL dashboards, which statement best reflects their use in a business environment by various stakeholders?
Question 16 Single Choice
In the context of Unity Catalog in Databricks, what is the purpose of specifying a MANAGED LOCATION when creating a catalog or schema?
Question 17 Single Choice
You have a table customer_orders in Databricks SQL with a column order_info in JSON format. This column includes a nested array items, where each element has product_id (string) and quantity (integer).
Which of the following queries returns a single row with two columns, product_id and quantity, for the product with product_id equal to "X100"?
Statement used to create the table customer_orders:
    CREATE TABLE customer_orders AS
    SELECT
      '{
        "items": [
          {"product_id": "X100", "quantity": 3},
          {"product_id": "Y200", "quantity": 1}
        ]
      }' AS order_info;
Question 18 Single Choice
In Databricks, how does the persistence of data differ between a view and a temporary view?
Question 19 Multiple Choice
In a Databricks SQL context, consider a dataset with columns 'Department', 'Employee', and 'Sales'. You are required to analyze the data using ROLLUP and CUBE in a GROUP BY clause.
Given this scenario, select the correct statements regarding the type of aggregations ROLLUP and CUBE would generate when applied to the 'Department' and 'Employee' columns.
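To make the difference concrete, the sketch below enumerates the grouping sets that ROLLUP and CUBE expand to under standard SQL semantics: ROLLUP produces the hierarchical prefixes of the column list, while CUBE produces every combination of the columns. The helper names rollup_grouping_sets and cube_grouping_sets are hypothetical, written only for this illustration, and the code does not run any SQL.

```python
from itertools import combinations


def rollup_grouping_sets(columns):
    """Return the grouping sets for GROUP BY ROLLUP(columns).

    ROLLUP keeps only left-to-right prefixes of the column list,
    ending with the empty set (the grand total).
    """
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def cube_grouping_sets(columns):
    """Return the grouping sets for GROUP BY CUBE(columns).

    CUBE keeps every subset of the column list, including the
    empty set (the grand total).
    """
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets


cols = ["Department", "Employee"]
rollup_sets = rollup_grouping_sets(cols)
cube_sets = cube_grouping_sets(cols)

print("ROLLUP:", rollup_sets)
print("CUBE:  ", cube_sets)
```

For two columns, ROLLUP yields three grouping sets (per Department and Employee, per Department, and the grand total), while CUBE adds a fourth: per Employee across all departments.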
Question 20 Single Choice
What is the primary purpose of Databricks SQL endpoints/warehouses in a data analytics environment?





