CertNexus AIP-210 CertNexus Certified Artificial Intelligence Practitioner (CAIP) Exam Practice Test

Demo: 26 questions
Total 90 questions

CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers

Question 1

When working with textual data and trying to classify text into different languages, which approach to representing features makes the most sense?

Options:

A.

Bag of words model with TF-IDF

B.

Bag of bigrams (2-letter pairs)

C.

Word2Vec algorithm

D.

Clustering similar words and representing words by group membership
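
To make the "bag of bigrams" representation in option B concrete, here is a minimal sketch that counts 2-letter pairs within each word. The function name `char_bigrams` and the sample text are assumptions chosen for illustration; they are not part of the exam material.

```python
from collections import Counter

def char_bigrams(text):
    """Count overlapping 2-letter pairs inside each whitespace-separated word."""
    counts = Counter()
    for word in text.lower().split():
        # A word of length n contributes n - 1 overlapping pairs.
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    return counts

counts = char_bigrams("the theme")
print(counts.most_common(2))
```

Each text becomes a vector of pair frequencies, which a classifier can then compare across languages.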

Question 2

You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?

Options:

A.

Decision tree

B.

Logistic regression

C.

Random forest

D.

XGBoost

Question 3

Which of the following equations best represents an L1 norm?

Options:

A.

|x| + |y|

B.

|x|+|y|^2

C.

|x|-|y|

D.

|x|^2+|y|^2
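
For reference, a minimal sketch of the standard L1 and L2 norm definitions for a vector of any length. The helper names `l1_norm` and `l2_norm` and the sample point are assumptions made for this example.

```python
def l1_norm(vector):
    """L1 norm: sum of the absolute values of the components."""
    return sum(abs(v) for v in vector)

def l2_norm(vector):
    """L2 (Euclidean) norm: square root of the sum of squared components."""
    return sum(v * v for v in vector) ** 0.5

point = (3, -4)
print(l1_norm(point))
print(l2_norm(point))
```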

Question 4

A data scientist is tasked with extracting business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist must not forget to include?

Options:

A.

Cyberprotection

B.

Cybersecurity

C.

Data privacy

D.

Data security

Question 5

What is the primary benefit of the Federated Learning approach to machine learning?

Options:

A.

It does not require a labeled dataset to solve supervised learning problems.

B.

It protects the privacy of the user's data while providing well-trained models.

C.

It requires less computation to train the same model using a traditional approach.

D.

It uses large, centralized data stores to train complex machine learning models.

Question 6

We are using the k-nearest neighbors algorithm to classify the new data points. The features are on different scales.

Which method can help us to solve this problem?

Options:

A.

Log transformation

B.

Normalization

C.

Square-root transformation

D.

Standardization
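
As a rough illustration of two of the rescaling options listed above, the sketch below applies min-max normalization and z-score standardization to one feature. The function names and the sample `incomes` values are hypothetical, and both helpers assume the feature is not constant (otherwise they would divide by zero).

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

incomes = [20000, 40000, 60000, 80000]  # hypothetical feature on a large scale
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)
print(normalized)
print(standardized)
```

Either rescaling keeps a large-valued feature from dominating the distance calculation in k-nearest neighbors.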

Question 7

Which of the following occurs when a data segment is collected in such a way that some members of the intended statistical population are less likely to be included than others?

Options:

A.

Algorithmic bias

B.

Sampling bias

C.

Stereotype bias

D.

Systematic value distortion

Question 8

What is the open framework designed to help detect, respond to, and remediate threats in ML systems?

Options:

A.

Adversarial ML Threat Matrix

B.

MITRE ATT&CK® Matrix

C.

OWASP Threat and Safeguard Matrix

D.

Threat Susceptibility Matrix

Question 9

Which of the following pieces of AI technology provides the ability to create fake videos?

Options:

A.

Generative adversarial networks (GAN)

B.

Long short-term memory (LSTM) networks

C.

Recurrent neural networks (RNN)

D.

Support-vector machines (SVM)

Question 10

For each of the last 10 years, your team has been collecting data from a group of subjects, including their age and numerous biomarkers collected from blood samples. You are tasked with creating a prediction model of age using the biomarkers as input. You start by performing a linear regression using all of the data over the 10-year period, with age as the dependent variable and the biomarkers as predictors.

Which assumption of linear regression is being violated?

Options:

A.

Equality of variance (Homoscedasticity)

B.

Independence

C.

Linearity

D.

Normality

Question 11

Which of the following describes a neural network without an activation function?

Options:

A.

A form of a linear regression

B.

A form of a quantile regression

C.

An unsupervised learning technique

D.

A radial basis function kernel
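
To show what removing the activation function does, here is a small sketch in which two stacked layers without activations give the same output as a single linear map. The weights `W1` and `W2`, the input `x`, and the helper names are made up for illustration, and biases are omitted for brevity.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # hidden layer: 2 inputs -> 3 units
W2 = [[2.0, 0.0, 1.0]]                      # output layer: 3 units -> 1 output
x = [1.0, 2.0]

two_layer = matvec(W2, matvec(W1, x))   # pass x through both layers in turn
collapsed = matvec(matmul(W2, W1), x)   # apply the single combined matrix W2 @ W1
print(two_layer, collapsed)
```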

Question 12

Which of the following is NOT a valid cross-validation method?

Options:

A.

Bootstrapping

B.

K-fold

C.

Leave-one-out

D.

Stratification
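
For readers unfamiliar with the k-fold option, the sketch below generates train/test index splits for k contiguous folds. The generator name `k_fold_indices` is an assumption for this example, and no shuffling is performed; setting `k` equal to the number of samples gives leave-one-out splits.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    # The first n_samples % k folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

for train, test in k_fold_indices(10, 3):
    print(len(train), test)
```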

Question 13

Which of the following is NOT an activation function?

Options:

A.

Additive

B.

Hyperbolic tangent

C.

ReLU

D.

Sigmoid
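
For reference, a minimal sketch of the three commonly used activation functions named in the options, written with the standard library only. The function names are chosen for this example.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hyperbolic_tangent(z):
    """Hyperbolic tangent: maps any real z into the open interval (-1, 1)."""
    return math.tanh(z)

def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)

for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), hyperbolic_tangent(z), relu(z))
```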

Question 14

Workflow design patterns for machine learning pipelines:

Options:

A.

Aim to explain how the machine learning model works.

B.

Represent a pipeline with directed acyclic graph (DAG).

C.

Seek to simplify the management of machine learning features.

D.

Separate inputs from features.

Question 15

A company is developing a merchandise sales application. The product team uses training data to teach an AI model to predict sales, and discovers emergent bias. What caused the biased results?

Options:

A.

The AI model was trained in winter and applied in summer.

B.

The application was migrated from on-premise to a public cloud.

C.

The team set flawed expectations when training the model.

D.

The training data used was inaccurate.

Question 16

R-squared is a statistical measure that:

Options:

A.

Combines precision and recall of a classifier into a single metric by taking their harmonic mean.

B.

Expresses the extent to which two variables are linearly related.

C.

Is the proportion of the variance for a dependent variable that's explained by independent variables.

D.

Represents the extent to which two random variables vary together.
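
As a worked illustration, the sketch below computes R-squared from its usual definition, one minus the ratio of the residual sum of squares to the total sum of squares. The function name and the sample `y_true` and `y_pred` values are hypothetical.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot for observed and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]
print(r_squared(y_true, y_pred))
```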

Question 17

You are implementing a support-vector machine on your data, and a colleague suggests you use a polynomial kernel. In what situation might this help improve the prediction of your model?

Options:

A.

When it is necessary to save computational time.

B.

When the categories of the dependent variable are not linearly separable.

C.

When the distribution of the dependent variable is Gaussian.

D.

When there is high correlation among the features.

Question 18

You are building a prediction model to develop a tool that can diagnose a particular disease so that individuals with the disease can receive treatment. The treatment is cheap and has no side effects. Patients with the disease who don't receive treatment have a high risk of mortality.

It is of primary importance that your diagnostic tool has which of the following?

Options:

A.

High negative predictive value

B.

High positive predictive value

C.

Low false negative rate

D.

Low false positive rate

Question 19

The following confusion matrix is produced when a classifier is used to predict labels on a test dataset. How precise is the classifier?

Options:

A.

48/(48+37)

B.

37/(37+8)

C.

37/(37+7)

D.

(48+37)/100
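
For reference, precision is the number of true positives divided by all predicted positives. The sketch below uses hypothetical counts, not the figures from the question's confusion matrix, which is not reproduced in this text.

```python
def precision(true_positives, false_positives):
    """Fraction of predicted positives that are actually positive."""
    return true_positives / (true_positives + false_positives)

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts chosen only for illustration.
tp, tn, fp, fn = 30, 55, 10, 5
print(precision(tp, fp))
print(accuracy(tp, tn, fp, fn))
```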

Question 20

Which of the following is a privacy-focused law that an AI practitioner should adhere to while designing and adapting an AI system that utilizes personal data?

Options:

A.

General Data Protection Regulation (GDPR)

B.

ISO/IEC 27001

C.

PCI DSS

D.

Sarbanes-Oxley (SOX)

Question 21

Your dependent variable data is a proportion. The observed range of your data is 0.01 to 0.99. The instrument used to generate the dependent variable data is known to generate low quality data for values close to 0 and close to 1. A colleague suggests performing a logit-transformation on the data prior to performing a linear regression. Which of the following is a concern with this approach?

Definition of logit-transformation

If p is the proportion: logit(p) = log(p/(1-p))

Options:

A.

After logit-transformation, the data may violate the assumption of independence.

B.

Noisy data could become more influential in your model.

C.

The model will be more likely to violate the assumption of normality.

D.

Values near 0.5 before logit-transformation will be near 0 after.
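
To get a feel for the transformation defined in the question, here is a small sketch that evaluates the logit at several proportions across the observed range. The natural logarithm is assumed, and the sample proportions are chosen only for illustration.

```python
import math

def logit(p):
    """logit(p) = log(p / (1 - p)) for a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(p, round(logit(p), 3))
```

The output shows how proportions near the ends of the range are stretched far more than those in the middle.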

Question 22

Which of the following algorithms is an example of unsupervised learning?

Options:

A.

Neural networks

B.

Principal components analysis

C.

Random forest

D.

Ridge regression

Question 23

A product manager is designing an Artificial Intelligence (AI) solution and wants to do so responsibly, evaluating both positive and negative outcomes.

The team creates a shared taxonomy of potential negative impacts and conducts an assessment along vectors such as severity, impact, frequency, and likelihood.

Which modeling technique does this team use?

Options:

A.

Business

B.

Harms

C.

Process

D.

Threat

Question 24

A big data architect needs to be cautious about personally identifiable information (PII) that may be captured with their new IoT system. What is the final stage of the Data Management Life Cycle, which the architect must complete in order to implement data privacy and security appropriately?

Options:

A.

De-Duplicate

B.

Destroy

C.

Detain

D.

Duplicate

Question 25

A market research team has ratings from patients who have a chronic disease, on several functional, physical, emotional, and professional needs that stay unmet with the current therapy. The dataset also captures ratings on how the disease affects their day-to-day activities.

A pharmaceutical company is introducing a new therapy to cure the disease and would like to design their marketing campaign such that different groups of patients are targeted with different ads. These groups should ideally consist of patients with similar unmet needs.

Which of the following algorithms should the market research team use to obtain these groups of patients?

Options:

A.

k-means clustering

B.

k-nearest neighbors

C.

Logistic regression

D.

Naive Bayes
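
To make option A concrete, the sketch below runs a bare-bones version of k-means (Lloyd's algorithm) on a handful of 2-D rating vectors. The `ratings` data, the starting centroids, and the fixed iteration count are all assumptions made for this example; a real analysis would typically use a library implementation with better initialization.

```python
def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical patient ratings on two unmet-need scales.
ratings = [(1, 2), (2, 1), (1, 1), (8, 9), (9, 8), (9, 9)]
centroids, clusters = k_means(ratings, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```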

Question 26

Which of the following sentences is TRUE about the definition of cloud models for machine learning pipelines?

Options:

A.

Data as a Service (DaaS) can host the databases providing backups, clustering, and high availability.

B.

Infrastructure as a Service (IaaS) can provide CPU, memory, disk, network and GPU.

C.

Platform as a Service (PaaS) can provide some services within an application such as payment applications to create efficient results.

D.

Software as a Service (SaaS) can provide AI practitioner data science services such as Jupyter notebooks.
