You are an ML engineer at a manufacturing company. You are creating a classification model for a predictive maintenance use case. You need to predict whether a crucial machine will fail in the next three days so that the repair crew has enough time to fix the machine before it breaks. Regular maintenance of the machine is relatively inexpensive, but a failure would be very costly. You have trained several binary classifiers to predict whether the machine will fail, where a prediction of 1 means that the ML model predicts a failure.
You are now evaluating each model on an evaluation dataset. You want to choose a model that prioritizes detection while ensuring that more than 50% of the maintenance jobs triggered by your model address an imminent machine failure. Which model should you choose?
A. The model with the highest area under the receiver operating characteristic curve (AUC ROC) and precision greater than 0.5.
B. The model with the lowest root mean squared error (RMSE) and recall greater than 0.5.
C. The model with the highest recall where precision is greater than 0.5.
D. The model with the highest precision where recall is greater than 0.5.
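The selection rule described in option C can be sketched in code: filter out models that fail the precision constraint, then maximize recall among the rest. The model names and metric values below are hypothetical, invented for the example.

```python
# Pick the model with the highest recall among those whose precision exceeds
# 0.5, so that most triggered maintenance jobs address a real failure.
# The candidate metrics below are hypothetical evaluation results.
candidates = {
    "model_a": {"precision": 0.80, "recall": 0.60},
    "model_b": {"precision": 0.55, "recall": 0.85},
    "model_c": {"precision": 0.40, "recall": 0.95},  # too many false alarms
}

# Constraint: more than 50% of triggered jobs must address a real failure.
eligible = {name: m for name, m in candidates.items() if m["precision"] > 0.5}

# Objective: prioritize detection, i.e. maximize recall.
best = max(eligible, key=lambda name: eligible[name]["recall"])
print(best)  # model_b: highest recall with precision still above 0.5
```

Note that model_c, despite the best recall, is excluded because fewer than half of its alerts would correspond to an imminent failure.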
You built a custom ML model using scikit-learn. Training time is taking longer than expected. You decide to migrate your model to Vertex AI Training, and you want to improve the model’s training time. What should you try out first?
A. Train your model in a distributed mode using multiple Compute Engine VMs.
B. Train your model using Vertex AI Training with CPUs.
C. Migrate your model to TensorFlow, and train it using Vertex AI Training.
D. Train your model using Vertex AI Training with GPUs.
You are an ML engineer at a retail company. You have built a model that predicts a coupon to offer an ecommerce customer at checkout based on the items in their cart. When a customer goes to checkout, your serving pipeline, which is hosted on Google Cloud, joins the customer's existing cart with a row in a BigQuery table that contains the customers' historic purchase behavior and uses that as the model's input. The web team is reporting that your model is returning predictions too slowly to load the coupon offer with the rest of the web page. How should you speed up your model's predictions?
A. Attach an NVIDIA P100 GPU to your deployed model’s instance.
B. Use a low-latency database for the customers’ historic purchase behavior.
C. Deploy your model to more instances behind a load balancer to distribute traffic.
D. Create a materialized view in BigQuery with the necessary data for predictions.
You work for a small company that has deployed an ML model with autoscaling on Vertex AI to serve online predictions in a production environment. The current model receives about 20 prediction requests per hour with an average response time of one second. You have retrained the same model on a new batch of data, and now you are canary testing it, sending ~10% of production traffic to the new model. During this canary test, you notice that prediction requests for your new model are taking between 30 and 180 seconds to complete. What should you do?
A. Submit a request to raise your project quota to ensure that multiple prediction services can run concurrently.
B. Turn off autoscaling for the online prediction service of your new model. Use manual scaling with one node always available.
C. Remove your new model from the production environment. Compare the new model and existing model codes to identify the cause of the performance bottleneck.
D. Remove your new model from the production environment. For a short trial period, send all incoming prediction requests to BigQuery. Request batch predictions from your new model, and then use the Data Labeling Service to validate your model’s performance before promoting it to production.
You want to train an AutoML model to predict house prices by using a small public dataset stored in BigQuery. You need to prepare the data and want to use the simplest, most efficient approach. What should you do?
A. Write a query that preprocesses the data by using BigQuery and creates a new table. Create a Vertex AI managed dataset with the new table as the data source.
B. Use Dataflow to preprocess the data. Write the output in TFRecord format to a Cloud Storage bucket.
C. Write a query that preprocesses the data by using BigQuery. Export the query results as CSV files, and use those files to create a Vertex AI managed dataset.
D. Use a Vertex AI Workbench notebook instance to preprocess the data by using the pandas library. Export the data as CSV files, and use those files to create a Vertex AI managed dataset.
You work for a hospital that wants to optimize how it schedules operations. You need to create a model that uses the relationship between the number of surgeries scheduled and beds used. You want to predict how many beds will be needed for patients each day in advance based on the scheduled surgeries. You have one year of data for the hospital organized in 365 rows.
The data includes the following variables for each day:
• Number of scheduled surgeries
• Number of beds occupied
• Date
You want to maximize the speed of model development and testing. What should you do?
A. Create a BigQuery table. Use BigQuery ML to build a regression model, with number of beds as the target variable, and number of scheduled surgeries and date features (such as day of week) as the predictors.
B. Create a BigQuery table. Use BigQuery ML to build an ARIMA model, with number of beds as the target variable, and date as the time variable.
C. Create a Vertex AI tabular dataset. Train an AutoML regression model, with number of beds as the target variable, and number of scheduled minor surgeries and date features (such as day of the week) as the predictors.
D. Create a Vertex AI tabular dataset. Train a Vertex AI AutoML Forecasting model, with number of beds as the target variable, number of scheduled surgeries as a covariate, and date as the time variable.
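For context on option B, BigQuery ML's time-series model type is `ARIMA_PLUS`; a sketch of what such a statement could look like is below, built as a string (the dataset, table, and column names are hypothetical):

```python
# Sketch of a BigQuery ML time-series model statement similar to option B.
# All identifiers (hospital.bed_forecast, hospital.daily_stats, column names)
# are hypothetical examples, not from the question.
def arima_create_model_sql(source_table: str) -> str:
    return f"""
CREATE OR REPLACE MODEL `hospital.bed_forecast`
OPTIONS(
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'date',
  time_series_data_col = 'beds_occupied'
) AS
SELECT date, beds_occupied
FROM `{source_table}`
""".strip()

sql = arima_create_model_sql("hospital.daily_stats")
print(sql)
```

Running the generated statement would train the model entirely inside BigQuery, with no data movement.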
You work for a rapidly growing social media company. Your team builds TensorFlow recommender models in an on-premises CPU cluster. The data contains billions of historical user events and 100,000 categorical features. You notice that as the data increases, the model training time increases. You plan to move the models to Google Cloud. You want to use the most scalable approach that also minimizes training time. What should you do?
A. Deploy the training jobs by using TPU VMs with TPUv3 Pod slices, and use the TPUEmbedding API.
B. Deploy the training jobs in an autoscaling Google Kubernetes Engine cluster with CPUs.
C. Deploy a matrix factorization model training job by using BigQuery ML.
D. Deploy the training jobs by using Compute Engine instances with A100 GPUs, and use the tf.nn.embedding_lookup API.
You work for a semiconductor manufacturing company. You need to create a real-time application that automates the quality control process. High-definition images of each semiconductor are taken at the end of the assembly line in real time. The photos are uploaded to a Cloud Storage bucket along with tabular data that includes each semiconductor’s batch number, serial number, dimensions, and weight. You need to configure model training and serving while maximizing model accuracy. What should you do?
A. Use Vertex AI Data Labeling Service to label the images, and train an AutoML image classification model. Deploy the model, and configure Pub/Sub to publish a message when an image is categorized into the failing class.
B. Use Vertex AI Data Labeling Service to label the images, and train an AutoML image classification model. Schedule a daily batch prediction job that publishes a Pub/Sub message when the job completes.
C. Convert the images into an embedding representation. Import this data into BigQuery, and train a BigQuery ML K-means clustering model with two clusters. Deploy the model, and configure Pub/Sub to publish a message when a semiconductor’s data is categorized into the failing cluster.
D. Import the tabular data into BigQuery, use Vertex AI Data Labeling Service to label the data, and train an AutoML tabular classification model. Deploy the model, and configure Pub/Sub to publish a message when a semiconductor’s data is categorized into the failing class.
You developed a Vertex AI ML pipeline that consists of preprocessing and training steps, each of which runs on a separate custom Docker image. Your organization uses GitHub, with GitHub Actions as CI/CD to run unit and integration tests. You need to automate the model retraining workflow so that it can be initiated both manually and when a new version of the code is merged in the main branch. You want to minimize the steps required to build the workflow while also allowing for maximum flexibility. How should you configure the CI/CD workflow?
A. Trigger a Cloud Build workflow to run tests, build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines.
B. Trigger GitHub Actions to run the tests, launch a job on Cloud Run to build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines.
C. Trigger GitHub Actions to run the tests, build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines.
D. Trigger GitHub Actions to run the tests, launch a Cloud Build workflow to build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines.
You are working with a dataset that contains customer transactions. You need to build an ML model to predict customer purchase behavior. You plan to develop the model in BigQuery ML, and export it to Cloud Storage for online prediction. You notice that the input data contains a few categorical features, including product category and payment method. You want to deploy the model as quickly as possible. What should you do?
A. Use the TRANSFORM clause with the ML.ONE_HOT_ENCODER function on the categorical features at model creation, and select the categorical and non-categorical features.
B. Use the ML.ONE_HOT_ENCODER function on the categorical features, and select the encoded categorical features and non-categorical features as inputs to create your model.
C. Use the CREATE MODEL statement and select the categorical and non-categorical features.
D. Use the ML.MULTI_HOT_ENCODER function on the categorical features, and select the encoded categorical features and non-categorical features as inputs to create your model.
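To make option A concrete, the sketch below builds the kind of CREATE MODEL statement it describes, with a TRANSFORM clause applying ML.ONE_HOT_ENCODER so the encoding is carried along with the exported model. All table, column, and model names are hypothetical. (Note that for many BigQuery ML model types, categorical features are also one-hot encoded automatically at model creation, which is the basis of option C.)

```python
# Hypothetical BigQuery ML statement illustrating option A's TRANSFORM clause.
# Identifiers (shop.purchase_model, shop.transactions, column names) are
# invented for the example.
sql = """
CREATE OR REPLACE MODEL `shop.purchase_model`
TRANSFORM(
  ML.ONE_HOT_ENCODER(product_category) OVER() AS product_category_enc,
  ML.ONE_HOT_ENCODER(payment_method) OVER() AS payment_method_enc,
  purchase_amount,
  label
)
OPTIONS(model_type = 'logistic_reg', input_label_cols = ['label']) AS
SELECT product_category, payment_method, purchase_amount, label
FROM `shop.transactions`
""".strip()

print(sql)
```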
You need to develop an image classification model by using a large dataset that contains labeled images in a Cloud Storage bucket. What should you do?
A. Use Vertex AI Pipelines with the Kubeflow Pipelines SDK to create a pipeline that reads the images from Cloud Storage and trains the model.
B. Use Vertex AI Pipelines with TensorFlow Extended (TFX) to create a pipeline that reads the images from Cloud Storage and trains the model.
C. Import the labeled images as a managed dataset in Vertex AI, and use AutoML to train the model.
D. Convert the image dataset to a tabular format using Dataflow. Load the data into BigQuery, and use BigQuery ML to train the model.
You are developing a model to detect fraudulent credit card transactions. You need to prioritize detection, because missing even one fraudulent transaction could severely impact the credit card holder. You used AutoML to train a model on users' profile information and credit card transaction data. After training the initial model, you notice that the model is failing to detect many fraudulent transactions. How should you adjust the training parameters in AutoML to improve model performance? (Choose two.)
A. Increase the score threshold.
B. Decrease the score threshold.
C. Add more positive examples to the training set.
D. Add more negative examples to the training set.
E. Reduce the maximum number of node hours for training.
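The mechanism behind the score-threshold options can be illustrated in a few lines: lowering the threshold means more transactions clear the bar and are flagged, which raises recall (at the cost of precision). The scores and labels below are made up for the example.

```python
# Illustration of how the score threshold affects recall.
# Hypothetical model scores and ground-truth labels (1 = fraudulent).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

def recall_at(threshold):
    # Fraction of fraudulent transactions whose score clears the threshold.
    caught = sum(y for s, y in zip(scores, labels) if s >= threshold)
    return caught / sum(labels)

print(recall_at(0.70))  # 2 of 3 fraud cases caught
print(recall_at(0.35))  # 3 of 3 fraud cases caught: lower threshold, higher recall
```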
You need to deploy a scikit-learn classification model to production. The model must be able to serve requests 24/7, and you expect millions of requests per second to the production application from 8 am to 7 pm. You need to minimize the cost of deployment. What should you do?
A. Deploy an online Vertex AI prediction endpoint. Set the max replica count to 1.
B. Deploy an online Vertex AI prediction endpoint. Set the max replica count to 100.
C. Deploy an online Vertex AI prediction endpoint with one GPU per replica. Set the max replica count to 1.
D. Deploy an online Vertex AI prediction endpoint with one GPU per replica. Set the max replica count to 100.
You work with a team of researchers to develop state-of-the-art algorithms for financial analysis. Your team develops and debugs complex models in TensorFlow. You want to maintain the ease of debugging while also reducing the model training time. How should you set up your training environment?
A. Configure a v3-8 TPU VM. SSH into the VM to train and debug the model.
B. Configure a v3-8 TPU node. Use Cloud Shell to SSH into the Host VM to train and debug the model.
C. Configure an n1-standard-4 VM with 4 NVIDIA P100 GPUs. SSH into the VM and use ParameterServerStrategy to train the model.
D. Configure an n1-standard-4 VM with 4 NVIDIA P100 GPUs. SSH into the VM and use MultiWorkerMirroredStrategy to train the model.
You created an ML pipeline with multiple input parameters. You want to investigate the tradeoffs between different parameter combinations. The parameter options are:
• Input dataset
• Max tree depth of the boosted tree regressor
• Optimizer learning rate
You need to compare the pipeline performance of the different parameter combinations measured in F1 score, time to train, and model complexity. You want your approach to be reproducible, and track all pipeline runs on the same platform. What should you do?
A.
1. Use BigQuery ML to create a boosted tree regressor, and use the hyperparameter tuning capability.
2. Configure the hyperparameter syntax to select different input datasets, max tree depths, and optimizer learning rates. Choose the grid search option.
B.
1. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline’s parameters to include those you are investigating.
2. In the custom training step, use the Bayesian optimization method with F1 score as the target to maximize.
C.
1. Create a Vertex AI Workbench notebook for each of the different input datasets.
2. In each notebook, run different local training jobs with different combinations of the max tree depth and optimizer learning rate parameters.
3. After each notebook finishes, append the results to a BigQuery table.
D.
1. Create an experiment in Vertex AI Experiments.
2. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline’s parameters to include those you are investigating.
3. Submit multiple runs to the same experiment, using different values for the parameters.
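The run-per-combination pattern described in option D can be sketched without the Vertex AI SDK. Here a stand-in `submit_run` function simply records what would be submitted as one experiment run; the dataset names and parameter values are hypothetical.

```python
import itertools

# Hypothetical parameter options being investigated.
datasets = ["sales_2022", "sales_2023"]
max_depths = [4, 8]
learning_rates = [0.05, 0.1]

runs = []

def submit_run(params):
    # Stand-in for launching one pipeline run logged to the same experiment;
    # the real call would go through the Vertex AI SDK.
    runs.append(params)

# One run per parameter combination, all tracked on the same platform.
for ds, depth, lr in itertools.product(datasets, max_depths, learning_rates):
    submit_run({"input_dataset": ds, "max_tree_depth": depth, "learning_rate": lr})

print(len(runs))  # 8 runs: 2 datasets x 2 depths x 2 learning rates
```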
You received a training-serving skew alert from a Vertex AI Model Monitoring job running in production. You retrained the model with more recent training data, and deployed it back to the Vertex AI endpoint, but you are still receiving the same alert. What should you do?
A. Update the model monitoring job to use a lower sampling rate.
B. Update the model monitoring job to use the more recent training data that was used to retrain the model.
C. Temporarily disable the alert. Enable the alert again after a sufficient amount of new production traffic has passed through the Vertex AI endpoint.
D. Temporarily disable the alert until the model can be retrained again on newer training data. Retrain the model again after a sufficient amount of new production traffic has passed through the Vertex AI endpoint.
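The key intuition here is that training-serving skew is measured between the serving distribution and whatever training baseline the monitoring job was configured with, so retraining alone does not silence the alert. A toy illustration with a simple L1 distance over hypothetical category frequencies for one feature:

```python
# Skew is computed against the *configured* baseline, not the model's actual
# training data. All frequencies below are hypothetical.
def l1_distance(p, q):
    return sum(abs(p[k] - q[k]) for k in p)

old_baseline = {"cat_a": 0.70, "cat_b": 0.30}  # original training data
new_baseline = {"cat_a": 0.45, "cat_b": 0.55}  # recent retraining data
serving      = {"cat_a": 0.40, "cat_b": 0.60}  # production traffic

print(l1_distance(serving, old_baseline))  # large: alert keeps firing
print(l1_distance(serving, new_baseline))  # small: updating the baseline resolves it
```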
You work for a delivery company. You need to design a system that stores and manages features such as parcels delivered and truck locations over time. The system must retrieve the features with low latency and feed those features into a model for online prediction. The data science team will retrieve historical data at a specific point in time for model training. You want to store the features with minimal effort. What should you do?
A. Store features in Bigtable as key/value data.
B. Store features in Vertex AI Feature Store.
C. Store features as a Vertex AI dataset, and use those features to train the models hosted in Vertex AI endpoints.
D. Store features in BigQuery timestamp-partitioned tables, and use the BigQuery Storage Read API to serve the features.
You have created a Vertex AI pipeline that automates custom model training. You want to add a pipeline component that enables your team to most easily collaborate when running different executions and comparing metrics both visually and programmatically. What should you do?
A. Add a component to the Vertex AI pipeline that logs metrics to a BigQuery table. Query the table to compare different executions of the pipeline. Connect BigQuery to Looker Studio to visualize metrics.
B. Add a component to the Vertex AI pipeline that logs metrics to a BigQuery table. Load the table into a pandas DataFrame to compare different executions of the pipeline. Use Matplotlib to visualize metrics.
C. Add a component to the Vertex AI pipeline that logs metrics to Vertex ML Metadata. Use Vertex AI Experiments to compare different executions of the pipeline. Use Vertex AI TensorBoard to visualize metrics.
D. Add a component to the Vertex AI pipeline that logs metrics to Vertex ML Metadata. Load the Vertex ML Metadata into a pandas DataFrame to compare different executions of the pipeline. Use Matplotlib to visualize metrics.
You developed a custom model by using Vertex AI to forecast the sales of your company’s products based on historical transactional data. You anticipate changes in the feature distributions and the correlations between the features in the near future. You also expect to receive a large volume of prediction requests. You plan to use Vertex AI Model Monitoring for drift detection and you want to minimize the cost. What should you do?
A. Use the features for monitoring. Set a monitoring-frequency value that is higher than the default.
B. Use the features for monitoring. Set a prediction-sampling-rate value that is closer to 1 than 0.
C. Use the features and the feature attributions for monitoring. Set a monitoring-frequency value that is lower than the default.
D. Use the features and the feature attributions for monitoring. Set a prediction-sampling-rate value that is closer to 0 than 1.
You recently used XGBoost to train a model in Python that will be used for online serving. Your model prediction service will be called by a backend service implemented in Golang running on a Google Kubernetes Engine (GKE) cluster. Your model requires pre and postprocessing steps. You need to implement the processing steps so that they run at serving time. You want to minimize code changes and infrastructure maintenance, and deploy your model into production as quickly as possible. What should you do?
A. Use FastAPI to implement an HTTP server. Create a Docker image that runs your HTTP server, and deploy it on your organization’s GKE cluster.
B. Use FastAPI to implement an HTTP server. Create a Docker image that runs your HTTP server. Upload the image to Vertex AI Model Registry, and deploy it to a Vertex AI endpoint.
C. Use the Predictor interface to implement a custom prediction routine. Build the custom container, upload the container to Vertex AI Model Registry, and deploy it to a Vertex AI endpoint.
D. Use the XGBoost prebuilt serving container when importing the trained model into Vertex AI. Deploy the model to a Vertex AI endpoint. Work with the backend engineers to implement the pre- and postprocessing steps in the Golang backend service.
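The custom prediction routine in option C centers on a class exposing preprocess/predict/postprocess hooks. The pure-Python sketch below mimics that shape without the Vertex AI SDK or XGBoost: the scaling constant, the stand-in "model", and the output format are all invented for the example.

```python
class SketchPredictor:
    """Illustrates the preprocess/predict/postprocess shape of a custom
    prediction routine. The 'model' is a stand-in, not a real XGBoost booster."""

    def preprocess(self, instances):
        # e.g. scale raw features before they reach the model (factor is hypothetical)
        return [[x / 100.0 for x in row] for row in instances]

    def predict(self, instances):
        # Stand-in for booster.predict(): sum of features as a 'score'.
        return [sum(row) for row in instances]

    def postprocess(self, predictions):
        # e.g. map raw scores to the labels the backend service expects
        return [{"label": "high" if p > 1.0 else "low", "score": p}
                for p in predictions]

predictor = SketchPredictor()
out = predictor.postprocess(
    predictor.predict(predictor.preprocess([[120, 30], [10, 5]]))
)
print(out)  # first instance scores 1.5 -> "high"; second scores 0.15 -> "low"
```

Because all three steps live inside the served container, the Golang backend only needs to send raw instances and read back labels.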
You work for a retail company. You have a managed tabular dataset in Vertex AI that contains sales data from three different stores. The dataset includes several features, such as store name and sale timestamp. You want to use the data to train a model that makes sales predictions for a new store that will open soon. You need to split the data between the training, validation, and test sets. What approach should you use to split the data?
A. Use Vertex AI manual split, using the store name feature to assign one store for each set.
B. Use Vertex AI default data split.
C. Use Vertex AI chronological split, and specify the sales timestamp feature as the time variable.
D. Use Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set.
You have developed a BigQuery ML model that predicts customer churn, and deployed the model to Vertex AI Endpoints. You want to automate the retraining of your model by using minimal additional code when model feature values change. You also want to minimize the number of times that your model is retrained to reduce training costs. What should you do?
A.
1. Enable request-response logging on Vertex AI Endpoints.
2. Schedule a TensorFlow Data Validation job to monitor prediction drift.
3. Execute model retraining if there is significant distance between the distributions.
B.
1. Enable request-response logging on Vertex AI Endpoints.
2. Schedule a TensorFlow Data Validation job to monitor training/serving skew.
3. Execute model retraining if there is significant distance between the distributions.
C.
1. Create a Vertex AI Model Monitoring job configured to monitor prediction drift.
2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected.
3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.
D.
1. Create a Vertex AI Model Monitoring job configured to monitor training/serving skew.
2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected.
3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.
You have been tasked with deploying prototype code to production. The feature engineering code is in PySpark and runs on Dataproc Serverless. The model training is executed by using a Vertex AI custom training job. The two steps are not connected, and the model training must currently be run manually after the feature engineering step finishes. You need to create a scalable and maintainable production process that runs end-to-end and tracks the connections between steps. What should you do?
A. Create a Vertex AI Workbench notebook. Use the notebook to submit the Dataproc Serverless feature engineering job. Use the same notebook to submit the custom model training job. Run the notebook cells sequentially to tie the steps together end-to-end.
B. Create a Vertex AI Workbench notebook. Initiate an Apache Spark context in the notebook, and run the PySpark feature engineering code. Use the same notebook to run the custom model training job in TensorFlow. Run the notebook cells sequentially to tie the steps together end-to-end.
C. Use the Kubeflow Pipelines SDK to write code that specifies two components:
- The first is a Dataproc Serverless component that launches the feature engineering job.
- The second is a custom component wrapped in the create_custom_training_job_from_component utility that launches the custom model training job.
Create a Vertex AI Pipelines job to link and run both components.
D. Use the Kubeflow Pipelines SDK to write code that specifies two components:
- The first component initiates an Apache Spark context that runs the PySpark feature engineering code.
- The second component runs the TensorFlow custom model training code.
Create a Vertex AI Pipelines job to link and run both components.
You work for an online grocery store. You recently developed a custom ML model that recommends a recipe when a user arrives at the website. You chose the machine type on the Vertex AI endpoint to optimize costs by using the queries per second (QPS) that the model can serve, and you deployed it on a single machine with 8 vCPUs and no accelerators.
A holiday season is approaching and you anticipate four times more traffic during this time than the typical daily traffic. You need to ensure that the model can scale efficiently to the increased demand. What should you do?
A.
1. Maintain the same machine type on the endpoint.
2. Set up a monitoring job and an alert for CPU usage.
3. If you receive an alert, add a compute node to the endpoint.
B.
1. Change the machine type on the endpoint to have 32 vCPUs.
2. Set up a monitoring job and an alert for CPU usage.
3. If you receive an alert, scale the vCPUs further as needed.
C.
1. Maintain the same machine type on the endpoint. Configure the endpoint to enable autoscaling based on vCPU usage.
2. Set up a monitoring job and an alert for CPU usage.
3. If you receive an alert, investigate the cause.
D.
1. Change the machine type on the endpoint to have a GPU. Configure the endpoint to enable autoscaling based on GPU usage.
2. Set up a monitoring job and an alert for GPU usage.
3. If you receive an alert, investigate the cause.
You work at a bank. You need to develop a credit risk model to support loan application decisions. You decide to implement the model by using a neural network in TensorFlow. Due to regulatory requirements, you need to be able to explain the model’s predictions based on its features. When the model is deployed, you also want to monitor the model’s performance over time. You decided to use Vertex AI for both model development and deployment. What should you do?
A. Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution drift.
B. Use Vertex Explainable AI with the sampled Shapley method, and enable Vertex AI Model Monitoring to check for feature distribution skew.
C. Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution drift.
D. Use Vertex Explainable AI with the XRAI method, and enable Vertex AI Model Monitoring to check for feature distribution skew.
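The sampled Shapley method referenced in options A and B estimates each feature's contribution by averaging its marginal effect on the model output over random feature orderings. A minimal sketch for a toy additive model, where the estimate matches the feature's own term exactly (the model and inputs are invented for the example):

```python
import random

random.seed(0)

# Toy additive model: f(x) = 2*x0 + 1*x1. For additive models, each feature's
# Shapley value equals its own term relative to the baseline.
def model(x):
    return 2 * x[0] + 1 * x[1]

def sampled_shapley(x, baseline, feature, n_samples=200):
    # Average the marginal contribution of `feature` over random orderings.
    total = 0.0
    idx = list(range(len(x)))
    for _ in range(n_samples):
        order = random.sample(idx, len(idx))
        current = list(baseline)
        for i in order:
            before = model(current)
            current[i] = x[i]   # switch feature i from baseline to actual value
            after = model(current)
            if i == feature:
                total += after - before
    return total / n_samples

phi0 = sampled_shapley([1.0, 1.0], [0.0, 0.0], feature=0)
print(round(phi0, 6))  # 2.0: feature 0's coefficient in the additive model
```

A deep TensorFlow credit-risk model is not additive, which is exactly why the sampling over orderings is needed there rather than a single marginal.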