Which of the following layers of the medallion architecture is most commonly used by data analysts?
ANone of these layers are used by data analysts
BGold
CAll of these layers are used equally by data analysts
DSilver
EBronze
Which of the following approaches can be used to connect Databricks to Fivetran for data ingestion?
AUse Workflows to establish a SQL warehouse (formerly known as a SQL endpoint) for Fivetran to interact with
BUse Delta Live Tables to establish a cluster for Fivetran to interact with
CUse Partner Connect's automated workflow to establish a cluster for Fivetran to interact with
DUse Partner Connect's automated workflow to establish a SQL warehouse (formerly known as a SQL endpoint) for Fivetran to interact with
EUse Workflows to establish a cluster for Fivetran to interact with
Which of the following describes how Databricks SQL should be used in relation to other business intelligence (BI) tools like Tableau, Power BI, and Looker?
AAs an exact substitute with the same level of functionality
BAs a substitute with less functionality
CAs a complete replacement with additional functionality
DAs a complementary tool for professional-grade presentations
EAs a complementary tool for quick in-platform BI work
A data analyst has recently joined a new team that uses Databricks SQL, but the analyst has never used Databricks before. The analyst wants to know where in Databricks SQL they can write and execute SQL queries.
On which of the following pages can the analyst write and execute SQL queries?
AData page
BDashboards page
CQueries page
DAlerts page
ESQL Editor page
A data analyst wants to create a dashboard with three main sections: Development, Testing, and Production. They want all three sections on the same dashboard, but they want to clearly designate the sections using text on the dashboard.
Which of the following tools can the data analyst use to designate the Development, Testing, and Production sections using text?
ASeparate endpoints for each section
BSeparate queries for each section
CMarkdown-based text boxes
DDirect text written into the dashboard in editing mode
ESeparate color palettes for each section
A data analyst needs to use the Databricks Lakehouse Platform to quickly create SQL queries and data visualizations. It is a requirement that the compute resources in the platform can be made serverless, and it is expected that data visualizations can be placed within a dashboard.
Which of the following Databricks Lakehouse Platform services/capabilities meets all of these requirements?
ADelta Lake
BDatabricks Notebooks
CTableau
DDatabricks Machine Learning
EDatabricks SQL
Which of the following should data analysts consider when working with personally identifiable information (PII) data?
AOrganization-specific best practices for PII data
BLegal requirements for the area in which the data was collected
CNone of these considerations
DLegal requirements for the area in which the analysis is being performed
EAll of these considerations
Delta Lake stores table data as a series of data files, but it also stores a lot of other information.
Which of the following is stored alongside data files when using Delta Lake?
ANone of these
BTable metadata, data summary visualizations, and owner account information
CTable metadata
DData summary visualizations
EOwner account information
Which of the following benefits of using Databricks SQL is provided by Data Explorer?
AIt can be used to run UPDATE queries to update any tables in a database.
BIt can be used to view metadata and data, as well as view/change permissions.
CIt can be used to produce dashboards that allow data exploration.
DIt can be used to make visualizations that can be shared with stakeholders.
EIt can be used to connect to third-party BI tools.
A data analyst created and is the owner of the managed table my_table. They now want to change ownership of the table to a single other user using Data Explorer.
Which of the following approaches can the analyst use to complete the task?
AEdit the Owner field in the table page by removing their own account
BEdit the Owner field in the table page by selecting All Users
CEdit the Owner field in the table page by selecting the new owner's account
DEdit the Owner field in the table page by selecting the Admins group
EEdit the Owner field in the table page by removing all access
A data analyst has a managed table table_name in database database_name. They would now like to remove the table from the database and all of the data files associated with the table. The rest of the tables in the database must continue to exist.
Which of the following commands can the analyst use to complete the task without producing an error?
ADROP DATABASE database_name;
BDROP TABLE database_name.table_name;
CDELETE TABLE database_name.table_name;
DDELETE TABLE table_name FROM database_name;
EDROP TABLE table_name FROM database_name;
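The scenario above can be tried out locally. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for a Databricks SQL warehouse; the schema name main plays the role of database_name, and other_table is a hypothetical second table. It only shows that dropping one table by its qualified name leaves the other tables in place. It does not model the deletion of the data files behind a managed table.

```python
import sqlite3

# In-memory SQLite stands in for a Databricks SQL warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical tables; the "main" schema plays the role of database_name.
conn.execute("CREATE TABLE main.table_name (id INTEGER)")
conn.execute("CREATE TABLE main.other_table (id INTEGER)")

# Drop a single table by its qualified name.
conn.execute("DROP TABLE main.table_name")

# List the tables that still exist in the schema.
remaining_tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(remaining_tables)
```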
A data analyst runs the following command:
SELECT age, country
FROM my_table
WHERE age >= 75 AND country = 'canada';
Which of the following tables represents the output of the above command?
A
B
C
D
E
A data analyst has created a user-defined function using the following line of code:
CREATE FUNCTION price(spend DOUBLE, units DOUBLE)
RETURNS DOUBLE
RETURN spend / units;
Which of the following code blocks can be used to apply this function to the customer_spend and customer_units columns of the table customer_summary to create column customer_price?
ASELECT PRICE customer_spend, customer_units AS customer_price FROM customer_summary
BSELECT price FROM customer_summary
CSELECT function(price(customer_spend, customer_units)) AS customer_price FROM customer_summary
DSELECT double(price(customer_spend, customer_units)) AS customer_price FROM customer_summary
ESELECT price(customer_spend, customer_units) AS customer_price FROM customer_summary
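Calling a scalar user-defined function in a SELECT list can be sketched as follows. This assumes Python's built-in sqlite3 module rather than Databricks SQL: the function is registered from Python with create_function instead of CREATE FUNCTION, and the rows in customer_summary are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def price(spend, units):
    """Mirror the SQL UDF body: spend divided by units."""
    return spend / units


# Register a two-argument scalar function named price (SQLite stand-in
# for the CREATE FUNCTION statement in the question).
conn.create_function("price", 2, price)

# Hypothetical sample data for illustration only.
conn.execute(
    "CREATE TABLE customer_summary (customer_spend REAL, customer_units REAL)"
)
conn.executemany(
    "INSERT INTO customer_summary VALUES (?, ?)",
    [(100.0, 4.0), (45.0, 9.0)],
)

# Apply the function to two columns and alias the result.
customer_prices = [
    row[0]
    for row in conn.execute(
        "SELECT price(customer_spend, customer_units) AS customer_price "
        "FROM customer_summary ORDER BY rowid"
    )
]
print(customer_prices)
```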
Consider the following two statements:
Statement 1:
Statement 2:
Which of the following describes how the result sets will differ for each statement when they are run in Databricks SQL?
AThe first statement will return all data from the customers table and matching data from the orders table. The second statement will return all data from the orders table and matching data from the customers table. Any missing data will be filled in with NULL.
BWhen the first statement is run, only rows from the customers table that have at least one match with the orders table on customer_id will be returned. When the second statement is run, only those rows in the customers table that do not have at least one match with the orders table on customer_id will be returned.
CThere is no difference between the result sets for both statements.
DBoth statements will fail because Databricks SQL does not support those join types.
EWhen the first statement is run, all rows from the customers table will be returned and only the customer_id from the orders table will be returned. When the second statement is run, only those rows in the customers table that do not have at least one match with the orders table on customer_id will be returned.
A data analyst has been asked to count the number of customers in each region and has written the following query:
If there is a mistake in the query, which of the following describes the mistake?
AThe query is using count(*), which will count all the customers in the customers table, no matter the region.
BThe query is missing a GROUP BY region clause.
CThe query is using ORDER BY, which is not allowed in an aggregation.
DThere are no mistakes in the query.
EThe query is selecting region, but region should only occur in the ORDER BY clause.
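The analyst's query is not reproduced above, so the following is only an illustrative sketch of a per-region count, not a reconstruction of it. It assumes Python's built-in sqlite3 module as a stand-in for Databricks SQL, and the customers rows and the number_of_customers alias are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data for illustration only.
conn.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "east"), (2, "west"), (3, "east")],
)

# GROUP BY produces one output row per region; count(*) is evaluated
# within each group rather than over the whole table.
customers_per_region = conn.execute(
    "SELECT region, count(*) AS number_of_customers "
    "FROM customers "
    "GROUP BY region "
    "ORDER BY region"
).fetchall()
print(customers_per_region)
```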
Which of the following is a benefit of Databricks SQL using ANSI SQL as its standard SQL dialect?
AIt has increased customization capabilities
BIt is easy to migrate existing SQL queries to Databricks SQL
CIt allows for the use of Photon's computation optimizations
DIt is more performant than other SQL dialects
EIt is more compatible with Spark's interpreters
A business analyst has been asked to create a data entity/object called sales_by_employee. It should always stay up-to-date when new data are added to the sales table. The new entity should have the columns sales_person, which will be the name of the employee from the employees table, and sales, which will be all sales for that particular sales person. Both the sales table and the employees table have an employee_id column that is used to identify the sales person.
Which of the following code blocks will accomplish this task?
A
B
C
D
E
A data analyst has been asked to use the below table sales_table to get the percentage rank of products within region by the sales:
The result of the query should look like this:
Which of the following queries will accomplish this task?
A
B
C
D
E
A data analyst is processing a complex aggregation on a table with zero null values and their query returns the following result:
Which of the following queries did the analyst run to obtain the above result?
A
B
C
D
E
A data analyst has been asked to produce a visualization that shows the flow of users through a website.
Which of the following is used for visualizing this type of flow?
AHeatmap
BChoropleth
CWord Cloud
DPivot Table
ESankey
A data analyst has been asked to configure an alert for a query that returns the income in the accounts_receivable table for a date range. The date range is configurable using a Date query parameter.
The Alert does not work.
Which of the following describes why the Alert does not work?
AAlerts don't work with queries that access tables.
BQueries that return results based on dates cannot be used with Alerts.
CThe wrong query parameter is being used. Alerts only work with Date and Time query parameters.
DQueries that use query parameters cannot be used with Alerts.
EThe wrong query parameter is being used. Alerts only work with dropdown list query parameters, not dates.
A data analyst is working with gold-layer tables to complete an ad-hoc project. A stakeholder has provided the analyst with an additional dataset that can be used to augment the gold-layer tables already in use.
Which of the following terms is used to describe this data augmentation?
AData testing
BAd-hoc improvements
CLast-mile dashboarding
DLast-mile ETL
EData enhancement
A data analyst runs the following command:
INSERT INTO stakeholders.suppliers TABLE stakeholders.new_suppliers;
What is the result of running this command?
AThe suppliers table now contains both the data it had before the command was run and the data from the new_suppliers table, and any duplicate data is deleted.
BThe command fails because it is written incorrectly.
CThe suppliers table now contains both the data it had before the command was run and the data from the new_suppliers table, including any duplicate data.
DThe suppliers table now contains the data from the new_suppliers table, and the new_suppliers table now contains the data from the suppliers table.
EThe suppliers table now contains only the data from the new_suppliers table.
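How INSERT INTO treats existing rows can be explored with a small local sketch. It assumes Python's built-in sqlite3 module as a stand-in for Databricks SQL. SQLite does not accept the TABLE source shorthand, so the sketch swaps in INSERT INTO ... SELECT *, on the assumption that the two forms behave the same way. The supplier names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data; "Globex" appears in both tables on purpose.
conn.execute("CREATE TABLE suppliers (name TEXT)")
conn.execute("CREATE TABLE new_suppliers (name TEXT)")
conn.executemany("INSERT INTO suppliers VALUES (?)", [("Acme",), ("Globex",)])
conn.executemany(
    "INSERT INTO new_suppliers VALUES (?)", [("Globex",), ("Initech",)]
)

# Append every row of new_suppliers to suppliers. INSERT INTO does not
# overwrite the target and does not deduplicate.
conn.execute("INSERT INTO suppliers SELECT * FROM new_suppliers")

supplier_names = [
    row[0] for row in conn.execute("SELECT name FROM suppliers ORDER BY name")
]
print(supplier_names)
```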
Data professionals with varying titles use the Databricks SQL service as the primary touchpoint with the Databricks Lakehouse Platform. However, some users will use other services like Databricks Machine Learning or Databricks Data Science and Engineering.
Which of the following roles uses Databricks SQL as a secondary service while primarily using one of the other services?
ABusiness analyst
BSQL analyst
CData engineer
DBusiness intelligence analyst
EData analyst
After running DESCRIBE EXTENDED accounts.customers;, the following was returned:
Now, a data analyst runs the following command:
DROP accounts.customers;
Which of the following describes the result of running this command?
ARunning SELECT * FROM delta.`dbfs:/stakeholders/customers` results in an error.
BRunning SELECT * FROM accounts.customers will return all rows in the table.
CAll files with the .customers extension are deleted.
DThe accounts.customers table is removed from the metastore, and the underlying data files are deleted.
EThe accounts.customers table is removed from the metastore, but the underlying data files are untouched.