A Data Scientist noticed high usage of Snowflake Cortex LLM functions.
Which command will restrict user access?
A. REVOKE ROLE SNOWFLAKE.CORTEX_USER FROM ROLE PUBLIC;
B. DROP DATABASE ROLE SNOWFLAKE.CORTEX_USER;
C. DROP ROLE SNOWFLAKE.CORTEX_USER;
D. REVOKE DATABASE ROLE SNOWFLAKE.CORTEX_USER FROM ROLE PUBLIC;
A Data Scientist is tasked with gathering a summary of each customer review. The data is stored in a Snowflake table in a column named reviews.
How should they do this with the LEAST operational overhead?
A. Use the SQL function snowflake.cortex.summarize('reviews') to summarize using the built-in summarization model.
B. Use the SQL function snowflake.cortex.embed('e5-base-v2', 'reviews') to summarize using the built-in text embedding model.
C. Use the SQL function snowflake.cortex.complete('llama3.1-405b', 'reviews') to summarize with the llama3.1-405b model.
D. Use Snowpark Container Services to deploy the llama3.1-405b model from the Hugging Face Hub and submit the 'text_data' column to the model API.
A Data Scientist trained a neural network for 50 epochs. After training, the results showed high accuracy on the training data but fairly low accuracy on the test data (that is, an overfitting problem).
Which techniques would help in resolving the overfitting issue? (Choose two.)
A. Use regularization.
B. Use gradient descent.
C. Modify the learning rate.
D. Implement early stopping.
E. Add additional features to the training data.
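As an illustration of one technique named in the options, here is a minimal sketch of early stopping. It assumes a per-epoch validation loss is available; the function name, the patience value, and the simulated losses are hypothetical and stand in for a real training loop.

```python
def train_with_early_stopping(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    val_losses stands in for the loss a real training loop would measure
    on held-out data after each epoch (hypothetical values).
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Validation loss improved: remember this epoch and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # No improvement for `patience` epochs in a row: stop training.
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Simulated losses: validation loss falls, then rises as the model overfits.
simulated_val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
stop_epoch, best_epoch = train_with_early_stopping(simulated_val_losses, patience=3)
print(stop_epoch, best_epoch)
```

In practice the weights saved at the best epoch would be restored, so the later epochs that only fit the training data more closely are discarded.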
A Data Scientist has noticed that a model is producing more accurate predictions.
What metric should be used to evaluate why the model predictions have improved?
A. Confidence intervals
B. Shapley Additive exPlanations (SHAP)
C. Root Mean Squared Error (RMSE)
D. R²
Question 6: Implement Snowflake data science best practices
Question 7: Data science concepts
Question 8: Data science concepts
Question 9: Implement Snowflake data science best practices
Question 10: Data science concepts
Question 11: Train and use machine learning models
Question 12: Data science concepts
Question 13: Train and use machine learning models
Question 14: Train and use machine learning models
Question 15: Train and use machine learning models
Question 16: Data science concepts
Question 17: Prepare data and use feature engineering in Snowflake
Question 18: Prepare data and use feature engineering in Snowflake
Question 19: Data science concepts
Question 20: Train and use machine learning models
Question 21: Data science concepts
Question 22: Train and use machine learning models
Question 23: Use GenAI and LLM capabilities in Snowflake
Question 24: Implement Snowflake data science best practices
Question 25: Implement Snowflake data science best practices
A Data Scientist executed this command to create a task:
What is the reason?
A. The task has not been set to resume.
B. The cron job must be specified in UTC.
C. The SQL is invalid for calling a stored procedure.
D. The ALLOW_OVERLAPPING_EXECUTION command is not valid with a fixed SCHEDULE.
A Data Scientist wants to identify customer profiles that are likely to churn. The training data set has 2% churn and 98% no churn, with churn encoded as 1 and no churn as 0. The Data Scientist then ran the model in Snowpark.
After capturing the confusion matrix, which metric could give a misleading picture of model performance because the data set is imbalanced?
A. Accuracy
B. F-measure
C. Precision
D. Recall
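To make the imbalance concrete, the sketch below builds a hypothetical data set with the same 2% / 98% split and scores a trivial classifier that always predicts "no churn". The labels and predictions are made up for illustration and do not come from a real model.

```python
# Hypothetical data: 1,000 customers, 2% churn (1) and 98% no churn (0).
labels = [1] * 20 + [0] * 980
# A trivial classifier that always predicts "no churn".
predictions = [0] * len(labels)

# Confusion-matrix cells.
tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)

accuracy = (tp + tn) / len(labels)
# Guard against division by zero when there are no positive predictions.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

The classifier never finds a single churner, yet it is right 98% of the time simply because the majority class dominates the count of correct predictions.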
Which algorithms are examples of supervised learning? (Choose two.)
A. Apriori
B. CART
C. K-Means
D. PCA
E. SVM
A Data Scientist created a session with Snowpark Python (snowflake-snowpark-python package) with this code:
Which parameters are required to authenticate the session? (Choose two.)
A. Account
B. Database
C. Role
D. User
E. Virtual warehouse
What does variance measure in statistics?
A. The variability from the minimum value
B. The variability from the median value
C. The variability from the mean value
D. The variability from the mode value
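For reference, population variance is the average squared deviation of the values from their mean. A minimal sketch using made-up values and Python's standard `statistics` module:

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative sample, not from the question

mean = sum(values) / len(values)
# Population variance: average of the squared deviations from the mean.
manual_variance = sum((x - mean) ** 2 for x in values) / len(values)

# The standard library computes the same quantity.
library_variance = statistics.pvariance(values)

print(mean, manual_variance, library_variance)
```

`statistics.variance` would instead divide by n - 1, which gives the sample variance.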
While logging a model with the Snowflake Model Registry by calling the registry’s log_model method, which arguments are required? (Choose two.)
A. code_paths
B. model
C. model_name
D. python_version
E. version_name
A Data Scientist needs to present results from a linear regression model used to estimate the quality of a user’s experience. The model performed measurements using multiple standardized numeric features that were then applied to a numeric satisfaction scale.
What diagram should be used to present the effect of each feature on user satisfaction scores?
A. Confusion matrix
B. Correlation matrix
C. Scatterplot
D. Coefficient plot
A Data Scientist has trained a model that has been stored as a file on a Snowflake stage. Now they want to use a Python User-Defined Function (UDF) for scoring data in Snowflake with the model.
One requirement is that they should be able to replace the model file with a new one without having to recreate the UDF.
What is required in order to accomplish this?
A. Load the model file using the path to the model file including stage and file name in the UDF.
B. Deploy the UDF using Snowpark for Python and use the trained model object in the UDF.
C. Provide the path to the model file including stage and file name as a parameter to the function and load the file in the UDF.
D. Add the path to the model file including stage and file name to the IMPORT parameter for the UDF and load the file in the UDF.
A Data Scientist at a telecommunications company wants to store churn scores that are calculated with a Snowflake external function, configured to connect to a model endpoint.
The environment conditions are as follows:
The External Function DDL:
The CUSTOMER_DAILY table:
The churn value stored in a new table called CUSTOMER_CHURN:
Based on the environmental conditions, how could the Data Scientist persistently store the churn scores from the external function in Snowflake?
A. insert into customer_churn select cust_id, get_churn(*) as churn_score from customer_daily;
B. insert into customer_churn select cust_id, call get_churn(object_construct(*)) as churn_score from customer_daily;
C. insert into customer_churn select cust_id, call get_churn(*) as churn_score from customer_daily;
D. insert into customer_churn select cust_id, get_churn(object_construct(*)) as churn_score from customer_daily;
Which features can be used with the Snowflake Model Registry to seamlessly manage models from development to production lifecycles? (Choose two.)
A. Tags
B. Zero-copy clones
C. Account replication
D. Model Copy
E. Time Travel
A Data Scientist is training a model. They used scikit-learn to create a partial dependence plot using the following command:
The following charts are created as a result:
Which statements reflect what is depicted in the charts? (Choose three.)
A. As Median income increases, the target housing price decreases.
B. As Median income increases, the target housing price increases.
C. As the Age of the House increases, the target housing price decreases.
D. As the Age of the House increases, the target housing price is fairly constant.
E. As the Average Occupancy increases, the target housing price decreases.
F. As the Average Occupancy increases, the target housing price increases.
This correlation matrix was created when performing feature engineering:
Which combination of variables is the MOST correlated and could possibly help with feature reduction?
A. DIS and CRIM
B. DIS and TAX
C. RAD and TAX
D. INDUS and CRIM
A Data Scientist is building a data pipeline for a customer churn model. To enable efficient processing of the model, they add a stream to the customer table.
Which function should be used to check if the stream has new or updated data?
A. SYSTEM$STREAM_STATUS('MYSTREAM')
B. SYSTEM$STREAM_HAS_DATA('MYSTREAM')
C. SYSTEM$STREAM_GET_UPDATES('MYSTREAM')
D. SYSTEM$STREAM_GET_TABLE_TIMESTAMP('MYSTREAM')
A Data Scientist passes a SQL NULL as an argument of a Snowflake string data type to a Python User-Defined Function (UDF).
What Python value will the NULL be translated to?
A. 1
B. False
C. None
D. An empty string
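A handler for a string-typed Python UDF typically guards against a missing argument before calling string methods on it. The sketch below is plain Python and is called directly rather than registered in Snowflake; the function name `normalize_review` is hypothetical, and the sketch assumes that a SQL NULL string argument reaches the handler as Python `None`.

```python
def normalize_review(text):
    """Hypothetical UDF handler: trim and lower-case a review string."""
    # Assumption: a SQL NULL string argument arrives here as Python None.
    if text is None:
        # Returning None lets the result map back to SQL NULL.
        return None
    return text.strip().lower()


# Called directly to simulate NULL and non-NULL inputs.
print(normalize_review(None))
print(normalize_review("  Great Product  "))
```

Without the guard, calling `.strip()` on the missing value would raise an `AttributeError`.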
Which step of the machine learning lifecycle does hyperparameter tuning fall under?
A. Model training
B. Model deployment
C. Model validation
D. Feature engineering
A remote weather sensor malfunctions and produces temperature readings higher than the normal range, which was around 69.8°F (21°C).
Ignoring units, what is the correct ordering by magnitude of these key measures?
A. Mean > Median > Skewness > 0 (a right skew and a mean greater than the median)
B. Median > Mean > Skewness > 0 (a right skew and a median greater than the mean)
C. Mean > Median > 0 > Skewness (a left skew and a mean greater than the median)
D. Median > Mean > 0 > Skewness (a left skew and a median greater than the mean)
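The effect of a few abnormally high readings can be checked numerically. The readings below are made up for illustration (values near 70 plus one faulty high value), and skewness is computed as the third standardized moment using the population standard deviation.

```python
import statistics

# Hypothetical readings in °F: mostly near 70, with one faulty high value.
readings = [69, 70, 70, 70, 71, 69, 70, 150]

mean = statistics.mean(readings)
median = statistics.median(readings)
std = statistics.pstdev(readings)

# Population skewness: mean cubed deviation divided by the cubed standard deviation.
skewness = sum((x - mean) ** 3 for x in readings) / (len(readings) * std ** 3)

print(f"mean={mean:.3f} median={median:.1f} skewness={skewness:.3f}")
```

The high outlier pulls the mean upward while the median barely moves, and the long upper tail makes the skewness positive.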
This chart in Snowsight for a New York City ride-share bike service shows the number of trips taken to the destination borough:
A Data Scientist wants to build a classifier that predicts which borough will be the most likely destination when a trip is initiated.
Which techniques should be used to handle the class imbalance depicted in the Snowsight chart? (Choose two.)
A. Introduce regularization parameters.
B. Collapse all minority classes into a single class.
C. Undersample the Manhattan borough for training.
D. Utilize bootstrapping to synthesize additional data for the non-Manhattan boroughs.
E. Utilize Synthetic Minority Oversampling Technique (SMOTE) to oversample the non-Manhattan boroughs for training.
For which Snowflake Cortex LLM functions would both input and output tokens be counted? (Choose three.)
A. SUMMARIZE
B. SENTIMENT
C. TRANSLATE
D. EMBED_TEXT_768
E. EMBED_TEXT_1024
F. CLASSIFY_TEXT
A company’s platform team wants to integrate their existing data lake with Snowflake. The data lake is hundreds of TBs in size and the team does not want to duplicate most of the data into Snowflake. A Data Scientist at the company wants to be able to query and access the data lake’s metadata. There is already an external stage in Snowflake referencing the data lake’s location.
What is the MOST efficient way to integrate the existing data lake into the Snowflake environment?
A. Move the data lake files to an internal stage in Snowflake to allow for access to the metadata and data from the data lake.
B. Using the existing external stage, create SELECT statements that can be run on-demand, referencing the files in the data lake directly.
C. Create a PIPE for each data lake table which will allow for on-demand querying of the data lake files by leveraging the Snowpipe service.
D. Create an external table for each corresponding data lake table to enable querying data stored in files in the data lake as if the data lake table was inside a database.
Which characteristic applies to Snowpark Python stored procedures?