Databricks Databricks-Generative-AI-Engineer-Associate today updated questions

Databricks Certified Generative AI Engineer Associate Questions and Answers

Question 1

Which indicator should be considered to evaluate the safety of the LLM outputs when qualitatively assessing LLM responses for a translation use case?

Options:

The ability to generate responses in code

The similarity to the previous language

The latency of the response and the length of text generated

The accuracy and relevance of the responses

Question 2

A Generative AI Engineer developed an LLM application using the provisioned throughput Foundation Model API. Now that the application is ready to be deployed, they realize their volume of requests are not sufficiently high enough to create their own provisioned throughput endpoint. They want to choose a strategy that ensures the best cost-effectiveness for their application.

What strategy should the Generative AI Engineer use?

Options:

Switch to using External Models instead

Deploy the model using pay-per-token throughput as it comes with cost guarantees

Change to a model with a fewer number of parameters in order to reduce hardware constraint issues

Throttle the incoming batch of requests manually to avoid rate limiting issues

Question 3

A Generative AI Engineer is building an LLM to generate article summaries in the form of a type of poem, such as a haiku, given the article content. However, the initial output from the LLM does not match the desired tone or style.

Which approach will NOT improve the LLM’s response to achieve the desired response?

Options:

Provide the LLM with a prompt that explicitly instructs it to generate text in the desired tone and style

Use a neutralizer to normalize the tone and style of the underlying documents

Include few-shot examples in the prompt to the LLM

Fine-tune the LLM on a dataset of desired tone and style

Question 4

What is an effective method to preprocess prompts using custom code before sending them to an LLM?

Options:

Directly modify the LLM’s internal architecture to include preprocessing steps

It is better not to introduce custom code to preprocess prompts as the LLM has not been trained with examples of the preprocessed prompts

Rather than preprocessing prompts, it’s more effective to postprocess the LLM outputs to align the outputs to desired outcomes

Write a MLflow PyFunc model that has a separate function to process the prompts

Question 5

A Generative Al Engineer is creating an LLM system that will retrieve news articles from the year 1918 and related to a user's query and summarize them. The engineer has noticed that the summaries are generated well but often also include an explanation of how the summary was generated, which is undesirable.

Which change could the Generative Al Engineer perform to mitigate this issue?

Options:

Split the LLM output by newline characters to truncate away the summarization explanation.

Tune the chunk size of news articles or experiment with different embedding models.

Revisit their document ingestion logic, ensuring that the news articles are being ingested properly.

Provide few shot examples of desired output format to the system and/or user prompt.

Question 6

A Generative AI Engineer is designing a chatbot for a gaming company that aims to engage users on its platform while its users play online video games.

Which metric would help them increase user engagement and retention for their platform?

Options:

Randomness

Diversity of responses

Lack of relevance

Repetition of responses

Question 7

A Generative AI Engineer wants to build an LLM-based solution to help a restaurant improve its online customer experience with bookings by automatically handling common customer inquiries. The goal of the solution is to minimize escalations to human intervention and phone calls while maintaining a personalized interaction. To design the solution, the Generative AI Engineer needs to define the input data to the LLM and the task it should perform.

Which input/output pair will support their goal?

Options:

Input: Online chat logs; Output: Group the chat logs by users, followed by summarizing each user’s interactions

Input: Online chat logs; Output: Buttons that represent choices for booking details

Input: Customer reviews; Output: Classify review sentiment

Input: Online chat logs; Output: Cancellation options

Question 8

A Generative Al Engineer is tasked with developing an application that is based on an open source large language model (LLM). They need a foundation LLM with a large context window.

Which model fits this need?

Options:

DistilBERT

MPT-30B

Llama2-70B

DBRX

Question 9

A Generative AI Engineer is developing a chatbot designed to assist users with insurance-related queries. The chatbot is built on a large language model (LLM) and is conversational. However, to maintain the chatbot’s focus and to comply with company policy, it must not provide responses to questions about politics. Instead, when presented with political inquiries, the chatbot should respond with a standard message:

“Sorry, I cannot answer that. I am a chatbot that can only answer questions around insurance.”

Which framework type should be implemented to solve this?

Options:

Safety Guardrail

Security Guardrail

Contextual Guardrail

Compliance Guardrail

Question 10

A Generative Al Engineer would like an LLM to generate formatted JSON from emails. This will require parsing and extracting the following information: order ID, date, and sender email. Here’s a sample email:

They will need to write a prompt that will extract the relevant information in JSON format with the highest level of output accuracy.

Which prompt will do that?

Options:

You will receive customer emails and need to extract date, sender email, and order ID. You should return the date, sender email, and order ID information in JSON format.

You will receive customer emails and need to extract date, sender email, and order ID. Return the extracted information in JSON format.

Here’s an example: {“date”: “April 16, 2024”, “sender_email”: “sarah.lee925@gmail.com”, “order_id”: “RE987D”}

You will receive customer emails and need to extract date, sender email, and order ID. Return the extracted information in a human-readable format.

You will receive customer emails and need to extract date, sender email, and order ID. Return the extracted information in JSON format.

Question 11

A Generative Al Engineer is responsible for developing a chatbot to enable their company’s internal HelpDesk Call Center team to more quickly find related tickets and provide resolution. While creating the GenAI application work breakdown tasks for this project, they realize they need to start planning which data sources (either Unity Catalog volume or Delta table) they could choose for this application. They have collected several candidate data sources for consideration:

call_rep_history: a Delta table with primary keys representative_id, call_id. This table is maintained to calculate representatives’ call resolution from fields call_duration and call start_time.

transcript Volume: a Unity Catalog Volume of all recordings as a *.wav files, but also a text transcript as *.txt files.

call_cust_history: a Delta table with primary keys customer_id, cal1_id. This table is maintained to calculate how much internal customers use the HelpDesk to make sure that the charge back model is consistent with actual service use.

call_detail: a Delta table that includes a snapshot of all call details updated hourly. It includes root_cause and resolution fields, but those fields may be empty for calls that are still active.

maintenance_schedule – a Delta table that includes a listing of both HelpDesk application outages as well as planned upcoming maintenance downtimes.

They need sources that could add context to best identify ticket root cause and resolution.

Which TWO sources do that? (Choose two.)

Options:

call_cust_history

maintenance_schedule

call_rep_history

call_detail

transcript Volume

Answer:

D, E

Explanation:

In the context of developing a chatbot for a company's internal HelpDesk Call Center, the key is to select data sources that provide the most contextual and detailed information about the issues being addressed. This includes identifying the root cause and suggesting resolutions. The two most appropriate sources from the list are:

Call Detail (Option D):

Contents: This Delta table includes a snapshot of all call details updated hourly, featuring essential fields like root_cause and resolution.

Relevance: The inclusion of root_cause and resolution fields makes this source particularly valuable, as it directly contains the information necessary to understand and resolve the issues discussed in the calls. Even if some records are incomplete, the data provided is crucial for a chatbot aimed at speeding up resolution identification.

Transcript Volume (Option E):

Contents: This Unity Catalog Volume contains recordings in .wav format and text transcripts in .txt files.

Relevance: The text transcripts of call recordings can provide in-depth context that the chatbot can analyze to understand the nuances of each issue. The chatbot can use natural language processing techniques to extract themes, identify problems, and suggest resolutions based on previous similar interactions documented in the transcripts.

Why Other Options Are Less Suitable:

A (Call Cust History): While it provides insights into customer interactions with the HelpDesk, it focuses more on the usage metrics rather than the content of the calls or the issues discussed.

B (Maintenance Schedule): This data is useful for understanding when services may not be available but does not contribute directly to resolving user issues or identifying root causes.

C (Call Rep History): Though it offers data on call durations and start times, which could help in assessing performance, it lacks direct information on the issues being resolved.

Therefore, Call Detail and Transcript Volume are the most relevant data sources for a chatbot designed to assist with identifying and resolving issues in a HelpDesk Call Center setting, as they provide direct and contextual information related to customer issues.

Question 12

A Generative Al Engineer interfaces with an LLM with prompt/response behavior that has been trained on customer calls inquiring about product availability. The LLM is designed to output “In Stock” if the product is available or only the term “Out of Stock” if not.

Which prompt will work to allow the engineer to respond to call classification labels correctly?

Options:

Respond with “In Stock” if the customer asks for a product.

You will be given a customer call transcript where the customer asks about product availability. The outputs are either “In Stock” or “Out of Stock”. Format the output in JSON, for example: {“call_id”: “123”, “label”: “In Stock”}.

Respond with “Out of Stock” if the customer asks for a product.

You will be given a customer call transcript where the customer inquires about product availability. Respond with “In Stock” if the product is available or “Out of Stock” if not.

Question 13

A small and cost-conscious startup in the cancer research field wants to build a RAG application using Foundation Model APIs.

Which strategy would allow the startup to build a good-quality RAG application while being cost-conscious and able to cater to customer needs?

Options:

Limit the number of relevant documents available for the RAG application to retrieve from

Pick a smaller LLM that is domain-specific

Limit the number of queries a customer can send per day

Use the largest LLM possible because that gives the best performance for any general queries

Load More Databricks-Generative-AI-Engineer-Associate Questions

Winter Special Flat 65% Limited Time Discount offer - Ends in 0d 00h 00m 00s - Coupon code: suredis

Databricks Databricks-Generative-AI-Engineer-Associate Databricks Certified Generative AI Engineer Associate Exam Practice Test

Databricks Certified Generative AI Engineer Associate Questions and Answers

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation: