A Generative AI Engineer has created a RAG application to look up answers to questions about a series of fantasy novels that are being asked on the author’s web forum. The fantasy novel texts are chunked and embedded into a vector store with metadata (page number, chapter number, book title), retrieved with the user’s query, and provided to an LLM for response generation. The Generative AI Engineer used their intuition to pick the chunking strategy and associated configurations but now wants to more methodically choose the best values.
Which TWO strategies should the Generative AI Engineer take to optimize their chunking strategy and parameters? (Choose two.)
A. Change embedding models and compare performance.
B. Add a classifier for user queries that predicts which book will best contain the answer. Use this to filter retrieval.
C. Choose an appropriate evaluation metric (such as recall or NDCG) and experiment with changes in the chunking strategy, such as splitting chunks by paragraphs or chapters. Choose the strategy that gives the best performance metric.
D. Pass known questions and best answers to an LLM and instruct the LLM to provide the best token count. Use a summary statistic (mean, median, etc.) of the best token counts to choose chunk size.
E. Create an LLM-as-a-judge metric to evaluate how well previous questions are answered by the most appropriate chunk. Optimize the chunking parameters based upon the values of the metric.
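The metric-driven comparison described in option C can be sketched as follows. This is a minimal illustration, not a recommended retriever: the word-overlap ranking stands in for embedding similarity, and the sample text, questions, and function names are all hypothetical.

```python
import re

def chunk_by_paragraph(text):
    """One chunk per paragraph (split on blank lines)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def chunk_fixed(text, size=8):
    """Fixed-size windows of `size` words, no overlap."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query, chunks, k=1):
    """Toy retriever (assumption): rank chunks by number of shared words."""
    q = set(re.findall(r"\w+", query.lower()))
    ranked = sorted(
        chunks,
        key=lambda c: len(q & set(re.findall(r"\w+", c.lower()))),
        reverse=True,
    )
    return ranked[:k]

def recall_at_k(eval_set, chunks, k=1):
    """Fraction of questions whose known answer appears in a top-k chunk."""
    hits = 0
    for question, answer in eval_set:
        top = retrieve(question, chunks, k)
        hits += any(answer.lower() in c.lower() for c in top)
    return hits / len(eval_set)

# Hypothetical novel excerpt and known question/answer pairs.
text = (
    "The dragon Vyr guards the northern pass. Its lair lies under Mount Keld.\n\n"
    "Queen Asha rules the river city. Her crown was forged in Keld."
)
eval_set = [
    ("Where is the lair of the dragon Vyr?", "Mount Keld"),
    ("Who rules the river city?", "Queen Asha"),
]

scores = {
    "paragraph": recall_at_k(eval_set, chunk_by_paragraph(text)),
    "fixed_8_words": recall_at_k(eval_set, chunk_fixed(text, size=8)),
}
best_strategy = max(scores, key=scores.get)
print(scores, best_strategy)
```

The same loop extends to other strategies (chapter-level chunks, overlapping windows) or to a different metric such as NDCG; the point is only that each configuration is scored against the same fixed evaluation set.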
A Generative AI Engineer interfaces with an LLM with prompt/response behavior that has been trained on customer calls inquiring about product availability. The LLM is designed to output “In Stock” if the product is available or only the term “Out of Stock” if not.
Which prompt will work to allow the engineer to respond to call classification labels correctly?
A. Respond with “In Stock” if the customer asks for a product.
B. You will be given a customer call transcript where the customer asks about product availability. The outputs are either “In Stock” or “Out of Stock”. Format the output in JSON, for example: {"call_id": "123", "label": "In Stock"}.
C. Respond with “Out of Stock” if the customer asks for a product.
D. You will be given a customer call transcript where the customer inquires about product availability. Respond with “In Stock” if the product is available or “Out of Stock” if not.
A company has a typical RAG-enabled, customer-facing chatbot on its website.
Select the correct sequence of components a user's questions will go through before the final output is returned. Use the diagram above for reference.
C. 1. response-generating LLM, 2. vector search, 3. context-augmented prompt, 4. embedding model
D. 1. response-generating LLM, 2. context-augmented prompt, 3. vector search, 4. embedding model
A Generative AI Engineer is testing a simple prompt template in LangChain using the code below, but is getting an error.
Assuming the API key was properly defined, what change does the Generative AI Engineer need to make to fix their chain?
A
B
C
D
A Generative AI Engineer is designing an LLM-powered live sports commentary platform. The platform provides real-time updates and LLM-generated analyses for any users who would like to have live summaries, rather than reading a series of potentially outdated news articles.
Which tool below will give the platform access to real-time data for generating game analyses based on the latest game scores?
A. DatabricksIQ
B. Foundation Model APIs
C. Feature Serving
D. AutoML
Question 6: Governance
Question 7: Governance
Question 8: Data Preparation
Question 9: Design Applications
Question 10: Assembling and Deploying Applications
Question 11: Evaluation and Monitoring
Question 12: Application Development
Question 13: Evaluation and Monitoring
Question 14: Evaluation and Monitoring
Question 15: Design Applications
Question 16: Design Applications
Question 17: Assembling and Deploying Applications
Question 18: Data Preparation
Question 19: Governance
Question 20: Evaluation and Monitoring
Question 21: Governance
Question 22: Data Preparation
Question 23: Design Applications
Question 24: Data Preparation
Question 25: Evaluation and Monitoring
When developing an LLM application, it’s crucial to ensure that the data used for training the model complies with licensing requirements to avoid legal risks.
Which action is NOT appropriate to avoid legal risks?
A. Reach out to the data curators directly before you have started using the trained model to let them know.
B. Use any available data you personally created which is completely original and you can decide what license to use.
C. Only use data explicitly labeled with an open license and ensure the license terms are followed.
D. Reach out to the data curators directly after you have started using the trained model to let them know.
A Generative AI Engineer is developing a chatbot designed to assist users with insurance-related queries. The chatbot is built on a large language model (LLM) and is conversational. However, to maintain the chatbot’s focus and to comply with company policy, it must not provide responses to questions about politics. Instead, when presented with political inquiries, the chatbot should respond with a standard message:
“Sorry, I cannot answer that. I am a chatbot that can only answer questions around insurance.”
Which framework type should be implemented to solve this?
A. Safety Guardrail
B. Security Guardrail
C. Contextual Guardrail
D. Compliance Guardrail
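The behavior described in this question (intercept political questions and return the fixed message) can be sketched as a check that runs before the LLM is called. This is only an illustration: the keyword set is a hypothetical stand-in for a real topic classifier, and `fake_llm` is a placeholder for the actual model call.

```python
import re

REFUSAL = (
    "Sorry, I cannot answer that. I am a chatbot that can only answer "
    "questions around insurance."
)

# Hypothetical keyword list; a production guardrail would more likely
# use a trained classifier or a dedicated guardrail model.
POLITICAL_TERMS = {"election", "politics", "political", "senator", "president", "vote"}

def is_political(message):
    """Return True if the message contains any flagged term."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return bool(words & POLITICAL_TERMS)

def guarded_reply(message, llm):
    """Return the standard refusal for political input, else call the LLM."""
    if is_political(message):
        return REFUSAL
    return llm(message)

def fake_llm(message):
    """Placeholder for the insurance chatbot's model call (assumption)."""
    return f"[insurance answer to: {message}]"

print(guarded_reply("Who should I vote for in the election?", fake_llm))
print(guarded_reply("Does my policy cover flood damage?", fake_llm))
```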
A Generative AI Engineer is responsible for developing a chatbot to enable their company’s internal HelpDesk Call Center team to more quickly find related tickets and provide resolution. While creating the GenAI application work breakdown tasks for this project, they realize they need to start planning which data sources (either Unity Catalog volume or Delta table) they could choose for this application. They have collected several candidate data sources for consideration:
call_rep_history: a Delta table with primary keys representative_id, call_id. This table is maintained to calculate representatives’ call resolution from fields call_duration and call_start_time.
transcript Volume: a Unity Catalog Volume of all recordings as *.wav files, but also a text transcript as *.txt files.
call_cust_history: a Delta table with primary keys customer_id, call_id. This table is maintained to calculate how much internal customers use the HelpDesk to make sure that the charge back model is consistent with actual service use.
call_detail: a Delta table that includes a snapshot of all call details updated hourly. It includes root_cause and resolution fields, but those fields may be empty for calls that are still active.
maintenance_schedule: a Delta table that includes a listing of both HelpDesk application outages as well as planned upcoming maintenance downtimes.
They need sources that could add context to best identify ticket root cause and resolution.
Which TWO sources do that? (Choose two.)
A. call_cust_history
B. maintenance_schedule
C. call_rep_history
D. call_detail
E. transcript Volume
A Generative AI Engineer is creating an LLM-based application. The documents for its retriever have been chunked to a maximum of 512 tokens each. The Generative AI Engineer knows that cost and latency are more important than quality for this application. They have several context length levels to choose from.
Which will fulfill their need?
A. context length 514: smallest model is 0.44GB and embedding dimension 768
B. context length 2048: smallest model is 11GB and embedding dimension 2560
C. context length 32768: smallest model is 14GB and embedding dimension 4096
D. context length 512: smallest model is 0.13GB and embedding dimension 384
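One way to reason about these options is to keep only the models whose context length covers the largest chunk, then take the smallest model as a proxy for cost and latency. The sketch below applies that rule to the four options as listed; treating model size as the cost/latency proxy is an assumption, and the field names are hypothetical.

```python
# The four options from the question, encoded as plain dictionaries.
candidates = [
    {"label": "A", "context_length": 514, "size_gb": 0.44, "dim": 768},
    {"label": "B", "context_length": 2048, "size_gb": 11.0, "dim": 2560},
    {"label": "C", "context_length": 32768, "size_gb": 14.0, "dim": 4096},
    {"label": "D", "context_length": 512, "size_gb": 0.13, "dim": 384},
]

def pick_model(models, max_chunk_tokens):
    """Keep models that can embed a full chunk, then return the smallest one."""
    fitting = [m for m in models if m["context_length"] >= max_chunk_tokens]
    if not fitting:
        raise ValueError("no model can embed a chunk of this size")
    return min(fitting, key=lambda m: m["size_gb"])

choice = pick_model(candidates, max_chunk_tokens=512)
print(choice["label"], choice["size_gb"], choice["dim"])
```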
A Generative AI Engineer is designing a RAG application for answering user questions on technical regulations as they learn a new sport.
What are the steps needed to build this RAG application and deploy it?
A. Ingest documents from a source -> Index the documents and save to Vector Search -> User submits queries against an LLM -> LLM retrieves relevant documents -> Evaluate model -> LLM generates a response -> Deploy it using Model Serving
B. Ingest documents from a source -> Index the documents and save to Vector Search -> User submits queries against an LLM -> LLM retrieves relevant documents -> LLM generates a response -> Evaluate model -> Deploy it using Model Serving
C. Ingest documents from a source -> Index the documents and save to Vector Search -> Evaluate model -> Deploy it using Model Serving
D. User submits queries against an LLM -> Ingest documents from a source -> Index the documents and save to Vector Search -> LLM retrieves relevant documents -> LLM generates a response -> Evaluate model -> Deploy it using Model Serving
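The stages named in these options (ingest, index, query, retrieve, generate, evaluate) can be lined up in a toy pipeline. Everything here is a stand-in: the documents are hard-coded, the "index" is a dictionary of word sets rather than a vector index, `generate` only builds the augmented prompt instead of calling an LLM, and deployment is left out.

```python
import re

def tokenize(text):
    """Lowercase word set used as a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z]+", text.lower()))

def ingest():
    """Load source documents (hard-coded here instead of read from files)."""
    return {
        "offside": "A player is offside if nearer to the goal line than the ball and the second-last opponent.",
        "throw_in": "A throw-in is awarded when the ball crosses the touchline.",
    }

def build_index(docs):
    """Index the documents; a real system would store embeddings instead."""
    return {doc_id: tokenize(text) for doc_id, text in docs.items()}

def retrieve(question, index, k=1):
    """Return the ids of the k documents sharing the most words with the query."""
    q = tokenize(question)
    ranked = sorted(index, key=lambda d: len(q & index[d]), reverse=True)
    return ranked[:k]

def generate(question, context):
    """Stand-in for the LLM call: returns the context-augmented prompt."""
    return f"Context: {context}\nQuestion: {question}"

docs = ingest()
index = build_index(docs)
question = "When is a player offside?"
top_ids = retrieve(question, index)
answer = generate(question, docs[top_ids[0]])

# Evaluate before deploying; here a single retrieval check on a known case.
retrieval_ok = top_ids == ["offside"]
print(retrieval_ok)
```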
A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries.
Which metric should they monitor for their customer service LLM application in production?
A. Number of customer inquiries processed per unit of time
B. Energy usage per query
C. Final perplexity scores for the training of the model
D. HuggingFace Leaderboard values for the base LLM
A Generative AI Engineer is building a Generative AI system that suggests the best matched employee team member to newly scoped projects. The team member is selected from a very large team. The match should be based upon project date availability and how well their employee profile matches the project scope. Both the employee profile and project scope are unstructured text.
How should the Generative AI Engineer architect their system?
A. Create a tool for finding available team members given project dates. Embed all project scopes into a vector store, perform a retrieval using team member profiles to find the best team member.
B. Create a tool for finding team member availability given project dates, and another tool that uses an LLM to extract keywords from project scopes. Iterate through available team members’ profiles and perform keyword matching to find the best available team member.
C. Create a tool to find available team members given project dates. Create a second tool that can calculate a similarity score for a combination of team member profile and the project scope. Iterate through the team members and rank by best score to select a team member.
D. Create a tool for finding available team members given project dates. Embed team profiles into a vector store and use the project scope and filtering to perform retrieval to find the available best matched team members.
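The combination of an availability filter with similarity retrieval over embedded profiles, as described in option D, can be sketched as below. The bag-of-words vectors and cosine similarity are a stand-in for a real embedding model and vector store, and the team data, the boolean `available` flag, and all names are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding (assumption): word-count vector of the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical team; `available` would come from the availability tool.
team = [
    {"name": "Ana", "available": True, "profile": "data engineer with spark and delta lake pipelines"},
    {"name": "Ben", "available": False, "profile": "spark streaming and delta lake pipelines expert"},
    {"name": "Cy", "available": True, "profile": "frontend developer react and css"},
]

# Embed profiles once, as a vector store would.
store = [{**m, "vector": embed(m["profile"])} for m in team]

def best_match(project_scope, store):
    """Filter to available members, then rank by similarity to the scope."""
    query = embed(project_scope)
    available = [m for m in store if m["available"]]
    return max(available, key=lambda m: cosine(query, m["vector"]))["name"]

match = best_match("build spark pipelines on delta lake", store)
print(match)
```

Embedding the profiles once and querying with the project scope avoids scoring every profile against every scope with an LLM, which is what makes this shape attractive for a very large team.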
A Generative AI Engineer has a provisioned throughput model serving endpoint as part of a RAG application and would like to monitor the serving endpoint’s incoming requests and outgoing responses. The current approach is to include a micro-service in between the endpoint and the user interface to write logs to a remote server.
Which Databricks feature should they use instead which will perform the same task?
A. Vector Search
B. Lakeview
C. DBSQL
D. Inference Tables
A Generative AI Engineer is building a system which will answer questions on the latest stock news articles.
Which will NOT help with ensuring the outputs are relevant to financial news?
A. Implement a comprehensive guardrail framework that includes policies for content filters tailored to the finance sector.
B. Increase the compute to improve processing speed of questions to allow greater relevancy analysis.
C. Implement a profanity filter to screen out offensive language.
D. Incorporate manual reviews to correct any problematic outputs prior to sending to the users.
A Generative AI Engineer has been asked to build an LLM-based question-answering application. The application should take into account new documents that are frequently published. The engineer wants to build this application with the least cost and least development effort and have it operate at the lowest cost possible.
Which combination of chaining components and configuration meets these requirements?
A. For the application a prompt, a retriever, and an LLM are required. The retriever output is inserted into the prompt which is given to the LLM to generate answers.
B. The LLM needs to be frequently retrained with the new documents in order to provide the most up-to-date answers.
C. For the question-answering application, prompt engineering and an LLM are required to generate answers.
D. For the application a prompt, an agent and a fine-tuned LLM are required. The agent is used by the LLM to retrieve relevant content that is inserted into the prompt which is given to the LLM to generate answers.
A Generative AI Engineer wants to build an LLM-based solution to help a restaurant improve its online customer experience with bookings by automatically handling common customer inquiries. The goal of the solution is to minimize escalations to human intervention and phone calls while maintaining a personalized interaction. To design the solution, the Generative AI Engineer needs to define the input data to the LLM and the task it should perform.
Which input/output pair will support their goal?
A. Input: Online chat logs; Output: Group the chat logs by users, followed by summarizing each user’s interactions
B. Input: Online chat logs; Output: Buttons that represent choices for booking details
A Generative AI Engineer is using the code below to test setting up a vector store:
Assuming they intend to use Databricks managed embeddings with the default embedding model, what should be the next logical function call?
A. vsc.get_index()
B. vsc.create_delta_sync_index()
C. vsc.create_direct_access_index()
D. vsc.similarity_search()
What is an effective method to preprocess prompts using custom code before sending them to an LLM?
A. Directly modify the LLM’s internal architecture to include preprocessing steps
B. It is better not to introduce custom code to preprocess prompts as the LLM has not been trained with examples of the preprocessed prompts
C. Rather than preprocessing prompts, it’s more effective to postprocess the LLM outputs to align the outputs to desired outcomes
D. Write an MLflow PyFunc model that has a separate function to process the prompts
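The idea in option D, keeping prompt preprocessing in its own function inside a custom model class, can be sketched without MLflow installed. The class below mirrors the `predict(context, model_input)` shape of `mlflow.pyfunc.PythonModel` but deliberately does not subclass it, so it is not itself an MLflow model; the template text, `echo_llm`, and all other names are hypothetical.

```python
import re

def preprocess_prompt(raw):
    """Collapse whitespace and wrap the user text in a fixed template."""
    cleaned = re.sub(r"\s+", " ", raw).strip()
    return f"Answer concisely.\nUser: {cleaned}"

class PreprocessingModel:
    """Plain-Python stand-in shaped like a PyFunc model (assumption).

    In an MLflow setup this class would subclass
    mlflow.pyfunc.PythonModel and be logged with mlflow.pyfunc.log_model.
    """

    def __init__(self, llm):
        self.llm = llm  # any callable that maps a prompt string to a response

    def predict(self, context, model_input):
        # Preprocess every prompt before it reaches the LLM.
        return [self.llm(preprocess_prompt(p)) for p in model_input]

def echo_llm(prompt):
    """Hypothetical LLM stub that upper-cases whatever it receives."""
    return prompt.upper()

model = PreprocessingModel(echo_llm)
outputs = model.predict(None, ["  what   is\nRAG? "])
print(outputs)
```

Keeping `preprocess_prompt` separate from `predict` means the same preprocessing can be unit-tested on its own and is packaged together with the model when it is served.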
A Generative AI Engineer is developing an LLM application that users can use to generate personalized birthday poems based on their names.
Which technique would be most effective in safeguarding the application, given the potential for malicious user inputs?
A. Implement a safety filter that detects any harmful inputs and ask the LLM to respond that it is unable to assist
B. Reduce the time that the users can interact with the LLM
C. Ask the LLM to remind the user that the input is malicious but continue the conversation with the user
D. Increase the amount of compute that powers the LLM to process input faster
Which indicator should be considered to evaluate the safety of the LLM outputs when qualitatively assessing LLM responses for a translation use case?
A. The ability to generate responses in code
B. The similarity to the previous language
C. The latency of the response and the length of text generated
D. The accuracy and relevance of the responses
A Generative AI Engineer is developing a patient-facing healthcare-focused chatbot. If the patient’s question is not a medical emergency, the chatbot should solicit more information from the patient to pass to the doctor’s office and suggest a few relevant pre-approved medical articles for reading. If the patient’s question is urgent, the chatbot should direct the patient to call their local emergency services.
Given the following user input:
“I have been experiencing severe headaches and dizziness for the past two days.”
Which response is most appropriate for the chatbot to generate?
A. Here are a few relevant articles for your browsing. Let me know if you have questions after reading them.
B. Please call your local emergency services.
C. Headaches can be tough. Hope you feel better soon!
D. Please provide your age, recent activities, and any other symptoms you have noticed along with your headaches and dizziness.
A Generative AI Engineer is building a RAG application that answers questions about internal documents for the company SnoPen AI.
The source documents may contain a significant amount of irrelevant content, such as advertisements, sports news, or entertainment news, or content about other companies.
Which approach is advisable when building a RAG application to achieve this goal of filtering irrelevant information?
A. Keep all articles because the RAG application needs to understand non-company content to avoid answering questions about them.
B. Include in the system prompt that any information it sees will be about SnoPen AI, even if no data filtering is performed.
C. Include in the system prompt that the application is not supposed to answer any questions unrelated to SnoPen AI.
D. Consolidate all SnoPen AI related documents into a single chunk in the vector database.
After changing the response generating LLM in a RAG pipeline from GPT-4 to a model with a shorter context length that the company self-hosts, the Generative AI Engineer is getting the following error:
What TWO solutions should the Generative AI Engineer implement without changing the response generating model? (Choose two.)
A. Use a smaller embedding model to generate embeddings
B. Reduce the maximum output tokens of the new model
C. Decrease the chunk size of embedded documents
D. Reduce the number of records retrieved from the vector database
E. Retrain the response generating model using ALiBi
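The error text is not shown here, so the following is only a sketch under the assumption that the prompt sent to the new model exceeds its context window. Under that assumption the input size is roughly the fixed prompt text plus chunk size times the number of retrieved chunks; all numbers and names below are hypothetical.

```python
def input_tokens(prompt_tokens, chunk_tokens, n_chunks):
    """Approximate tokens sent to the LLM: fixed prompt plus retrieved context."""
    return prompt_tokens + chunk_tokens * n_chunks

def fits(context_length, prompt_tokens, chunk_tokens, n_chunks):
    """True if the augmented prompt fits within the model's context window."""
    return input_tokens(prompt_tokens, chunk_tokens, n_chunks) <= context_length

CONTEXT_LENGTH = 4096  # hypothetical window of the self-hosted model
PROMPT_TOKENS = 300    # hypothetical instructions plus user question

original = fits(CONTEXT_LENGTH, PROMPT_TOKENS, chunk_tokens=1024, n_chunks=5)
smaller_chunks = fits(CONTEXT_LENGTH, PROMPT_TOKENS, chunk_tokens=512, n_chunks=5)
fewer_records = fits(CONTEXT_LENGTH, PROMPT_TOKENS, chunk_tokens=1024, n_chunks=3)
print(original, smaller_chunks, fewer_records)
```

Both the chunk size and the number of retrieved records multiply into the same term, which is why either one can bring the prompt back under the limit without touching the response-generating model.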
A Generative AI Engineer has successfully ingested unstructured documents and chunked them by document sections. They would like to store the chunks in a Vector Search index. The current format of the dataframe has two columns: (i) original document file name (ii) an array of text chunks for each document.
What is the most performant way to store this dataframe?
A. Split the data into train and test set, create a unique identifier for each document, then save to a Delta table
B. Flatten the dataframe to one chunk per row, create a unique identifier for each row, and save to a Delta table
C. First create a unique identifier for each document, then save to a Delta table
D. Store each chunk as an independent JSON file in Unity Catalog Volume. For each JSON file, the key is the document section name and the value is the array of text chunks for that section
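The flattening step in option B (one chunk per row, each with its own identifier) can be sketched in plain Python. The rows below stand in for the two-column dataframe described in the question; the column names and the `file#position` identifier scheme are hypothetical, and writing the result to a Delta table is not shown.

```python
# Stand-in for the dataframe: file name plus an array of chunks per document.
rows = [
    {"file_name": "handbook.pdf", "chunks": ["Introduction text", "Leave policy text"]},
    {"file_name": "faq.pdf", "chunks": ["Password reset steps"]},
]

def flatten(rows):
    """Explode the chunk arrays into one record per chunk with a unique id."""
    flat = []
    for row in rows:
        for position, chunk in enumerate(row["chunks"]):
            flat.append(
                {
                    "chunk_id": f'{row["file_name"]}#{position}',
                    "file_name": row["file_name"],
                    "chunk_text": chunk,
                }
            )
    return flat

flat_rows = flatten(rows)
for record in flat_rows:
    print(record)
```

In Spark the same reshaping is typically done by exploding the array column; the per-row identifier then serves as the primary key that an index over the table needs.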
A Generative AI Engineer has created a RAG application which can help employees retrieve answers from an internal knowledge base, such as Confluence pages or Google Drive. The prototype application is now working with some positive feedback from internal company testers. Now the Generative AI Engineer wants to formally evaluate the system’s performance and understand where to focus their efforts to further improve the system.
How should the Generative AI Engineer evaluate the system?
A. Use cosine similarity score to comprehensively evaluate the quality of the final generated answers.
B. Curate a dataset that can test the retrieval and generation components of the system separately. Use MLflow’s built-in evaluation metrics to perform the evaluation on the retrieval and generation components.
C. Benchmark multiple LLMs with the same data and pick the best LLM for the job.
D. Use an LLM-as-a-judge to evaluate the quality of the final answers generated.
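Evaluating retrieval and generation separately, as option B describes, can be illustrated with hand-written metrics. These are not MLflow's built-in metrics, only simple stand-ins: precision and recall at k for the retriever, and token-level F1 against a reference answer for the generator. The document ids and answers are hypothetical.

```python
import re
from collections import Counter

def recall_at_k(retrieved_ids, relevant_ids, k):
    """Share of the relevant documents that appear in the top-k results."""
    top = retrieved_ids[:k]
    return len(set(top) & set(relevant_ids)) / len(relevant_ids)

def precision_at_k(retrieved_ids, relevant_ids, k):
    """Share of the top-k results that are relevant."""
    top = retrieved_ids[:k]
    return len(set(top) & set(relevant_ids)) / len(top)

def token_f1(prediction, reference):
    """Token-overlap F1 between a generated answer and a reference answer."""
    pred = re.findall(r"\w+", prediction.lower())
    ref = re.findall(r"\w+", reference.lower())
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Retrieval component: which knowledge-base pages came back for a question.
retrieved = ["d1", "d3", "d4"]
relevant = ["d1", "d2"]
retrieval_recall = recall_at_k(retrieved, relevant, k=3)
retrieval_precision = precision_at_k(retrieved, relevant, k=3)

# Generation component: answer produced when given the correct context.
generation_f1 = token_f1("the vpn resets at midnight", "vpn resets at midnight utc")

print(retrieval_recall, retrieval_precision, generation_f1)
```

Scoring the two components on their own shows whether a wrong answer came from missing context or from poor use of good context, which is what tells the engineer where to focus next.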