A new CUSTOMER table is created by a data pipeline in a Snowflake schema where MANAGED ACCESS is enabled.
Which roles can grant access to the CUSTOMER table? (Choose three.)
A. The role that owns the schema
B. The role that owns the database
C. The role that owns the CUSTOMER table
D. The SYSADMIN role
E. The SECURITYADMIN role
F. The USERADMIN role with the MANAGE GRANTS privilege
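For context on managed access schemas: grant decisions are centralized with the schema owner and with roles holding the MANAGE GRANTS privilege, rather than with individual object owners. A minimal sketch, assuming hypothetical schema, table, and role names:

```sql
-- In a managed access schema, object owners cannot grant privileges on
-- their own objects; the schema owner (or a role with MANAGE GRANTS,
-- such as SECURITYADMIN) grants instead.
CREATE SCHEMA sales_db.raw WITH MANAGED ACCESS;

-- Executed by the schema-owning role or a MANAGE GRANTS role:
GRANT SELECT ON TABLE sales_db.raw.customer TO ROLE analyst_role;
```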
Given the table SALES which has a clustering key of column CLOSED_DATE, which table function will return the average clustering depth for the SALES_REPRESENTATIVE column for the North American region?
C. select system$clustering_depth('Sales', 'sales_representative') where region = 'North America';
D. select system$clustering_information('Sales', 'sales_representative') where region = 'North America';
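For reference, SYSTEM$CLUSTERING_DEPTH takes an optional predicate as its third argument rather than a trailing WHERE clause; a sketch of the documented call shape (table and column names taken from the question):

```sql
-- Third argument is a predicate string applied to the rows before the
-- clustering depth is measured.
SELECT SYSTEM$CLUSTERING_DEPTH(
  'sales',
  '(sales_representative)',
  'region = ''North America'''
);
```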
A Data Engineer is working on a Snowflake deployment in AWS eu-west-1 (Ireland). The Engineer is planning to load data from staged files into target tables using the COPY INTO command.
Which sources are valid? (Choose three.)
A. Internal stage on GCP us-central1 (Iowa)
B. Internal stage on AWS eu-central-1 (Frankfurt)
C. External stage on GCP us-central1 (Iowa)
D. External stage in an Amazon S3 bucket on AWS eu-west-1 (Ireland)
E. External stage in an Amazon S3 bucket on AWS eu-central-1 (Frankfurt)
F. SSD attached to an Amazon EC2 instance on AWS eu-west-1 (Ireland)
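For context: COPY INTO loads only from stages (internal or external), never from arbitrary local disks such as an EC2-attached SSD; the stage's cloud and region need not match the Snowflake deployment, though cross-region loads can incur egress costs. A sketch, assuming hypothetical stage, bucket, and table names:

```sql
-- External stage over an S3 bucket (URL and credentials are placeholders).
CREATE STAGE my_s3_stage
  URL = 's3://my-bucket/data/'
  CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>');

-- Load staged files into the target table.
COPY INTO target_table
  FROM @my_s3_stage
  FILE_FORMAT = (TYPE = CSV);
```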
A large table with 200 columns contains two years of historical data. When queried, the table is filtered on a single day. Below is the Query Profile:
Using a size 2XL virtual warehouse, this query took over an hour to complete.
What will improve the query performance the MOST?
A. Increase the size of the virtual warehouse.
B. Increase the number of clusters in the virtual warehouse.
C. Implement the search optimization service on the table.
D. Add a date column as a cluster key on the table.
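For context: when a large table is consistently filtered on a date, defining that column as a clustering key lets Snowflake prune micro-partitions instead of scanning the full table. A sketch with hypothetical names:

```sql
-- Cluster the table on the date column used in the daily filter.
ALTER TABLE big_table CLUSTER BY (event_date);
```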
A Data Engineer wants to create a new development database (DEV) as a clone of the permanent production database (PROD). There is a requirement to disable Fail-safe for all tables.
Which command will meet these requirements?
A. CREATE DATABASE DEV CLONE PROD FAIL_SAFE = FALSE;
B. CREATE DATABASE DEV CLONE PROD;
C. CREATE TRANSIENT DATABASE DEV CLONE PROD;
D. CREATE DATABASE DEV CLONE PROD DATA_RETENTION_TIME_IN_DAYS = 0;
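For context: transient databases have no Fail-safe period, and cloning a permanent database into a transient one makes the cloned permanent tables transient. A sketch:

```sql
-- Permanent tables in PROD become transient tables in the clone,
-- which have no Fail-safe period.
CREATE TRANSIENT DATABASE dev CLONE prod;
```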
Which query will show a list of the 20 most recent executions of a specified task, MYTASK, that have been scheduled within the last hour that have ended or are still running?
A
B
C
D
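The answer options are not reproduced in this copy. For reference, a query of the kind the question describes is typically written against the TASK_HISTORY table function; a sketch, with the filter for "ended or still running" expressed via QUERY_ID (executions that actually started):

```sql
-- Last 20 scheduled executions of MYTASK within the past hour that have
-- ended or are still running (argument names per the
-- INFORMATION_SCHEMA.TASK_HISTORY table function).
SELECT *
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(
       SCHEDULED_TIME_RANGE_START => DATEADD('hour', -1, CURRENT_TIMESTAMP()),
       RESULT_LIMIT => 20,
       TASK_NAME => 'MYTASK'))
WHERE QUERY_ID IS NOT NULL;
```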
A stream called TRANSACTIONS_STM is created on top of a TRANSACTIONS table in a continuous pipeline running in Snowflake. After a couple of months, the TRANSACTIONS table is renamed to TRANSACTIONS_RAW to comply with new naming standards.
What will happen to the TRANSACTIONS_STM object?
A. TRANSACTIONS_STM will keep working as expected.
B. TRANSACTIONS_STM will be stale and will need to be re-created.
C. TRANSACTIONS_STM will be automatically renamed TRANSACTIONS_RAW_STM.
D. Reading from the TRANSACTIONS_STM stream will succeed for some time after the expected STALE_TIME.
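For context: renaming a stream's source table does not invalidate the stream, which continues to track changes on the renamed table. A sketch of how this could be checked:

```sql
-- Rename the source table; the stream keeps tracking it.
ALTER TABLE transactions RENAME TO transactions_raw;

-- The STALE column in the output should still report false.
SHOW STREAMS LIKE 'TRANSACTIONS_STM';
```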
A database contains a table and a stored procedure defined as:
The log_table is initially empty and a Data Engineer issues the following command:
CALL insert_log(NULL::VARCHAR);
No other operations are affecting the log_table.
What will be the outcome of the procedure call?
A. The log_table contains zero records and the stored procedure returned 1 as a return value.
B. The log_table contains one record and the stored procedure returned 1 as a return value.
C. The log_table contains one record and the stored procedure returned NULL as a return value.
D. The log_table contains zero records and the stored procedure returned NULL as a return value.
A Data Engineer has developed a dashboard that will issue the same SQL SELECT statement to Snowflake every 12 hours.
How long will Snowflake use the persisted query results from the result cache, provided that the underlying data has not changed?
A. 12 hours
B. 24 hours
C. 14 days
D. 31 days
A Data Engineer executes a complex query and wants to make use of Snowflake’s query results caching capabilities to reuse the results.
Which conditions must be met? (Choose three.)
A. The results must be reused within 72 hours.
B. The query must be executed using the same virtual warehouse.
C. The USE_CACHED_RESULT parameter must be included in the query.
D. The table structure contributing to the query result cannot have changed.
E. The new query must have the same syntax as the previously executed query.
F. The micro-partitions cannot have changed due to changes to other data in the table.
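For context: result reuse is governed by the USE_CACHED_RESULT session parameter (TRUE by default), and the repeated query text must match the earlier query exactly. A sketch with a hypothetical table:

```sql
-- Result caching is on by default; shown here for illustration.
ALTER SESSION SET USE_CACHED_RESULT = TRUE;

SELECT COUNT(*) FROM orders;  -- first run computes the result
SELECT COUNT(*) FROM orders;  -- identical text may be served from the cache
```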
Which Snowflake objects does the Snowflake Kafka connector use? (Choose three.)
A. Pipe
B. Serverless task
C. Internal user stage
D. Internal table stage
E. Internal named stage
F. Storage integration
A Data Engineer needs to load JSON output from some software into Snowflake using Snowpipe.
Which recommendations apply to this scenario? (Choose three.)
A. Load large files (1 GB or larger).
B. Ensure that data files are 100-250 MB (or larger) in size, compressed.
C. Load a single huge array containing multiple records into a single table row.
D. Verify that each value of each unique element stores a single native data type (string or number).
E. Extract semi-structured data elements containing null values into relational columns before loading.
F. Create data files that are less than 100 MB and stage them in cloud storage more frequently than once per minute.
How can the following relational data be transformed into semi-structured data using the LEAST amount of operational overhead?
A. Use the TO_JSON function.
B. Use the PARSE_JSON function to produce a VARIANT value.
C. Use the OBJECT_CONSTRUCT function to return a Snowflake object.
D. Use the TO_VARIANT function to convert each of the relational columns to VARIANT.
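For reference, OBJECT_CONSTRUCT(*) converts each relational row into a semi-structured object in a single step; a sketch with a hypothetical table:

```sql
-- One OBJECT per row, keyed by column name.
SELECT OBJECT_CONSTRUCT(*) AS row_obj
FROM customers;
```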
Which methods can be used to create a DataFrame object in Snowpark? (Choose three.)
A. session.jdbc_connection()
B. session.read.json()
C. session.table()
D. DataFrame.write()
E. session.builder()
F. session.sql()
What is the purpose of the BUILD_STAGE_FILE_URL function in Snowflake?
A. It generates an encrypted URL for accessing a file in a stage.
B. It generates a staged URL for accessing a file in a stage.
C. It generates a permanent URL for accessing files in a stage.
D. It generates a temporary URL for accessing a file in a stage.
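For reference, a sketch of the documented call shape, with a hypothetical stage and file path:

```sql
-- Returns a permanent (non-expiring) URL to the staged file; accessing
-- the URL still requires privileges on the stage.
SELECT BUILD_STAGE_FILE_URL(@my_stage, '/reports/summary.pdf');
```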
A company has an extensive script in Scala that transforms data by leveraging DataFrames. A Data Engineer needs to move these transformations to Snowpark.
What characteristics of data transformations in Snowpark should be considered to meet this requirement? (Choose two.)
A. It is possible to join multiple tables using DataFrames.
B. Snowpark operations are executed lazily on the server.
C. User-Defined Functions (UDFs) are not pushed down to Snowflake.
D. Snowpark requires a separate cluster outside of Snowflake for computations.
E. Columns in different DataFrames with the same name should be referred to with square brackets.
The following is returned from SYSTEM$CLUSTERING_INFORMATION() for a table named ORDERS with a DATE column named O_ORDERDATE:
What does the total_constant_partition_count value indicate about this table?
A. The table is clustered very well on O_ORDERDATE, as there are 493 micro-partitions that could not be significantly improved by reclustering.
B. The table is not clustered well on O_ORDERDATE, as there are 493 micro-partitions where the range of values in that column overlap with every other micro-partition in the table.
C. The data in O_ORDERDATE does not change very often, as there are 493 micro-partitions containing rows where that column has not been modified since the row was created.
D. The data in O_ORDERDATE has a very low cardinality, as there are 493 micro-partitions where there is only a single distinct value in that column for all rows in the micro-partition.
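For reference, a sketch of the call that produces this output (table name from the question):

```sql
-- Returns a JSON document whose fields include total_partition_count,
-- total_constant_partition_count, average_overlaps, average_depth,
-- and notes for the given column.
SELECT SYSTEM$CLUSTERING_INFORMATION('orders', '(o_orderdate)');
```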
The JSON below is stored in a VARIANT column named V in a table named jCustRaw:
Which query will return one row per team member (stored in the teamMembers array) along with all of the attributes of each team member?
A
B
C
D
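The answer options are not reproduced in this copy. For reference, a query of the kind the question describes would typically use LATERAL FLATTEN over the array; a sketch, with paths assumed from the question's description of the JSON:

```sql
-- One output row per element of the teamMembers array; each element's
-- attributes are available through the FLATTEN value column.
SELECT f.value
FROM jcustraw,
     LATERAL FLATTEN(input => v:teamMembers) f;
```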
A company is building a dashboard for thousands of Analysts. The dashboard presents the results of a few summary queries on tables that are regularly updated. The query conditions vary by topic according to what data each Analyst needs. Responsiveness of the dashboard queries is a top priority, and the data cache should be preserved.
How should the Data Engineer configure the compute resources to support this dashboard?
A. Assign queries to a multi-cluster virtual warehouse with economy auto-scaling. Allow the system to automatically start and stop clusters according to demand.
B. Assign all queries to a multi-cluster virtual warehouse set to maximized mode. Monitor to determine the smallest suitable number of clusters.
C. Create a virtual warehouse for every 250 Analysts. Monitor to determine how many of these virtual warehouses are being utilized at capacity.
D. Create a size XL virtual warehouse to support all the dashboard queries. Monitor query runtimes to determine whether the virtual warehouse should be resized.
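For context: a multi-cluster warehouse runs in maximized mode when its minimum and maximum cluster counts are equal, so all clusters stay running (and keep their data caches warm) whenever the warehouse runs. A sketch with hypothetical sizing:

```sql
-- Maximized mode: MIN_CLUSTER_COUNT = MAX_CLUSTER_COUNT, so clusters
-- are not stopped and restarted, preserving the local data cache.
CREATE WAREHOUSE dashboard_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 4
  MAX_CLUSTER_COUNT = 4;
```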
A Data Engineer ran a stored procedure containing various transactions. During the execution, the session abruptly disconnected, preventing one transaction from committing or rolling back. The transaction was left in a detached state and created a lock on resources.
What step must the Engineer take to immediately run a new transaction?
A. Call the system function SYSTEM$ABORT_TRANSACTION.
B. Call the system function SYSTEM$CANCEL_TRANSACTION.
C. Set the LOCK_TIMEOUT to FALSE in the stored procedure.
D. Set the TRANSACTION_ABORT_ON_ERROR to TRUE in the stored procedure.
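For reference, a sketch of the call shape; the transaction id shown is a placeholder (the real id would come from SHOW TRANSACTIONS or SHOW LOCKS):

```sql
-- Aborts the specified detached transaction by id, releasing its locks.
SELECT SYSTEM$ABORT_TRANSACTION(1855216208725);
```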
Which output is provided by both the SYSTEM$CLUSTERING_DEPTH function and the SYSTEM$CLUSTERING_INFORMATION function?
A. average_depth
B. notes
C. average_overlaps
D. total_partition_count
A table is loaded using Snowpipe and truncated afterwards. Later, a Data Engineer finds that the table needs to be reloaded, but the metadata of the pipe will not allow the same files to be loaded again.
How can this issue be solved using the LEAST amount of operational overhead?
A. Wait until the metadata expires and then reload the file using Snowpipe.
B. Modify the file by adding a blank row to the bottom and re-stage the file.
C. Set the FORCE=TRUE option in the Snowpipe COPY INTO command.
D. Recreate the pipe by using the CREATE OR REPLACE PIPE command.
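For context: recreating a pipe resets its load-history metadata, so file names that were already ingested can be loaded again. A sketch with a hypothetical pipe definition:

```sql
-- Recreating the pipe clears its load history, allowing the same
-- staged files to be ingested again.
CREATE OR REPLACE PIPE my_pipe AS
  COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (TYPE = JSON);
```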
Which methods will trigger an action that will evaluate a DataFrame? (Choose two.)
A. DataFrame.random_split()
B. DataFrame.collect()
C. DataFrame.select()
D. DataFrame.col()
E. DataFrame.show()
A Data Engineer is evaluating the performance of a query in a development environment.
Based on the Query Profile, what are some performance tuning options the Engineer can use? (Choose two.)
A. Add a LIMIT to the ORDER BY, if possible.
B. Use a multi-cluster virtual warehouse with the scaling policy set to standard.
C. Move the query to a larger virtual warehouse.
D. Create indexes to ensure sorted access to data.
E. Increase the MAX_CLUSTER_COUNT.
A Data Engineer has created table t1 with one column c1 with datatype VARIANT: create or replace table t1 (c1 variant);
The Engineer has loaded the following JSON data set, which has information about 4 laptop models, into the table.
The Engineer now wants to query that data set so that the results are shown as normal structured data: 4 rows and 4 columns, without the double quotes that surround data elements in the JSON.
The result should be the same as if the data were selected from a normal relational table t2, where t2 has string columns model_id, model, manufacturer, and model_name, and is queried with select * from t2;
Which select command will produce the correct results?
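A query of the kind the question describes would flatten the loaded JSON and cast each element to VARCHAR, which removes the JSON quoting; a sketch that assumes the four laptop records sit in an array at the top level of c1:

```sql
-- One row per laptop object; the ::varchar casts strip the JSON quotes.
SELECT f.value:model_id::varchar     AS model_id,
       f.value:model::varchar        AS model,
       f.value:manufacturer::varchar AS manufacturer,
       f.value:model_name::varchar   AS model_name
FROM t1,
     LATERAL FLATTEN(input => c1) f;
```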