A Data Scientist wants to get the coefficient of determination for two columns in a table.
Which function can be used for that?
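As background for this question, the coefficient of determination (R²) for two numeric columns can be computed by hand. The following is only a minimal sketch in plain Python, assuming a simple least-squares fit of one column on the other; the function name `r_squared` and the sample values are hypothetical and are not part of the question.

```python
from statistics import mean


def r_squared(x, y):
    """Coefficient of determination for a simple least-squares fit of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    # With a single predictor, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical column values, for illustration only.
x = [1, 2, 3, 4, 5]
y_perfect = [2, 4, 6, 8, 10]   # exact linear relationship
y_noisy = [2, 4, 5, 4, 5]      # weaker linear relationship

r2_perfect = r_squared(x, y_perfect)
r2_noisy = r_squared(x, y_noisy)
print(r2_perfect, r2_noisy)
```

A value near 1 means the linear fit explains most of the variation in the second column; a lower value means it explains less.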
A Data Scientist noticed a high usage on Snowflake Cortex LLM functions.
Which command will restrict user access?
A Data Scientist is tasked with gathering a summary of each customer review. The data is stored in a Snowflake table in a column named reviews.
How should they do this with the LEAST operational overhead?
A Data Scientist trained a neural network for 50 epochs. After training, the results showed high accuracy on the training data but fairly low accuracy on the test data (that is, the model is overfitting).
Which techniques would help resolve the overfitting issue? (Choose two.)
A Data Scientist has noticed that a model is producing more accurate predictions.
What metric should be used to evaluate why the model predictions have improved?
A Data Scientist executed this command to create a task:

What is the reason?
A Data Scientist wants to identify customer profiles that are likely to churn for a company. The training data set has 2% churn and 98% no churn, with churn encoded as 1 and no churn encoded as 0. The Data Scientist then ran the model in Snowpark.
After capturing the confusion matrix, what value could be misleading on how the model is performing due to an imbalanced data set?
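To see why a 2% / 98% class split matters when reading a confusion matrix, consider a hypothetical model that always predicts "no churn". The sketch below is plain Python with made-up labels that only mirror the class ratio in the question; the helper `confusion_counts` is an assumption for illustration, not part of Snowpark.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


# Hypothetical data: 2 churners (1) and 98 non-churners (0).
y_true = [1] * 2 + [0] * 98
# A trivial model that never predicts churn.
y_pred = [0] * 100

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
# Guard against division by zero when there are no positive cases.
recall = tp / (tp + fn) if (tp + fn) else 0.0
print(accuracy, recall)
```

In this made-up case the share of correct predictions is very high even though not a single churner is found, which is the kind of effect the question is probing.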
Which algorithms are examples of supervised learning? (Choose two.)
A Data Scientist created a session with Snowpark Python (snowflake-snowpark-python package) with this code:

Which parameters are required to authenticate the session? (Choose two.)
What does variance measure in statistics?
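For reference, variance can be computed directly from its definition. This is a minimal sketch using Python's standard `statistics` module; the sample values are hypothetical.

```python
from statistics import mean, pvariance, variance

# Hypothetical sample values, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(data)
# Population variance: the average squared deviation from the mean.
pop_var = sum((v - mu) ** 2 for v in data) / len(data)
# Sample variance divides by n - 1 instead of n.
sample_var = sum((v - mu) ** 2 for v in data) / (len(data) - 1)

print(pop_var, pvariance(data))
print(sample_var, variance(data))
```

The hand-computed values agree with `pvariance` and `variance`, which differ only in the divisor they use.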
While logging a model with the Snowflake Model Registry by calling the registry’s log_model method, which arguments are required? (Choose two.)
A Data Scientist needs to present results from a linear regression model used to estimate the quality of a user’s experience. The model performed measurements using multiple standardized numeric features that were then applied to a numeric satisfaction scale.
What diagram should be used to present the effect of each feature on user satisfaction scores?
A Data Scientist has trained a model that has been stored as a file on a Snowflake stage. Now they want to use a Python User-Defined Function (UDF) to score data in Snowflake with the model.
One requirement is that they should be able to replace the model file with a new one without having to recreate the UDF.
What is required in order to accomplish this?
A Data Scientist at a telecommunications company wants to store churn scores that are calculated with a Snowflake external function, configured to connect to a model endpoint.
The environment conditions are as follows:
The External Function DDL:

The CUSTOMER_DAILY table:

The churn value stored in a new table called CUSTOMER_CHURN:

Based on the environment conditions, how could the Data Scientist persistently store the churn scores from the external function in Snowflake?
Which features can be used with the Snowflake Model Registry to seamlessly manage models from development to production lifecycles? (Choose two.)
A Data Scientist is training a model. They used scikit-learn to create a partial dependence plot using the following command:

The following charts are created as a result:

Which statements reflect what is depicted in the charts? (Choose three.)
This correlation matrix was created when performing feature engineering:

Which combination of variables is the MOST correlated and could possibly help with feature reduction?
A Data Scientist is building a data pipeline for a customer churn model. To enable efficient processing of the model, they add a stream to the customer table.
Which function should be used to check if the stream has new or updated data?
A Data Scientist passes a SQL NULL as an argument to a Python User-Defined Function (UDF) parameter that has a Snowflake string data type.
What will be returned as a translated Python value?
Which step of the machine learning lifecycle does hyperparameter tuning fall under?
A remote weather sensor malfunctions and produces temperature readings higher than the normal range, which was around 69.8°F (21°C).
Ignoring units, what is the correct ordering of these key measures by magnitude?
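The key measures themselves are not listed in the question text. Assuming they are the mean, median, and mode, the sketch below shows how a few unusually high readings can affect each one. The readings are invented for illustration, and the ordering depends on the actual data.

```python
from statistics import mean, median, mode

# Hypothetical readings in °C: mostly near 21, plus two faulty high values.
readings = [21, 21, 21, 22, 22, 23, 60, 75]

avg = mean(readings)      # pulled upward by the two high readings
mid = median(readings)    # middle of the sorted values, barely affected
common = mode(readings)   # most frequent value, unaffected here

print(common, mid, avg)
```

With these particular values the mode is smallest, the median is slightly larger, and the mean is largest, because only the mean is sensitive to the size of the extreme readings.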
This chart in Snowsight for a New York City ride-share bike service shows the number of trips taken to the destination borough:

A Data Scientist wants to build a classifier that predicts which borough will be the most likely destination when a trip is initiated.
Which techniques should be used to handle the class imbalance depicted in the Snowsight chart? (Choose two.)
For which Snowflake Cortex LLM functions would both input and output tokens be counted? (Choose three.)
A company’s platform team wants to integrate their existing data lake with Snowflake. The data lake is hundreds of TBs in size and the team does not want to duplicate most of the data into Snowflake. A Data Scientist at the company wants to be able to query and access the data lake’s metadata. There is already an external stage in Snowflake referencing the data lake’s location.
What is the MOST efficient way to integrate the existing data lake into the Snowflake environment?
Which characteristic applies to Snowpark Python stored procedures?