CertNexus AIP-210 CertNexus Certified Artificial Intelligence Practitioner (CAIP) Exam Practice Test

Demo: 27 questions
Total 92 questions

CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers

Question 1

You are developing a prediction model. Your team indicates they need an algorithm that is fast and requires low memory and low processing power. Assuming the following algorithms have similar accuracy on your data, which is most likely to be an ideal choice for the job?

Options:

A.

Deep learning neural network

B.

Random forest

C.

Ridge regression

D.

Support-vector machine

Question 2

What is the open framework designed to help detect, respond to, and remediate threats in ML systems?

Options:

A.

Adversarial ML Threat Matrix

B.

MITRE ATT&CK® Matrix

C.

OWASP Threat and Safeguard Matrix

D.

Threat Susceptibility Matrix

Question 3

A classifier has been implemented to predict whether or not someone has a specific type of disease. Considering that only 1% of the population in the dataset has this disease, which measures will work the BEST to evaluate this model?

Options:

A.

Mean squared error

B.

Precision and accuracy

C.

Precision and recall

D.

Recall and explained variance
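
The scenario above can be explored with a small worked example. The sketch below is only an illustration: the dataset of 1,000 people, the two classifiers, and every function name are hypothetical and not part of the exam material. It computes accuracy, precision, and recall when 1% of the samples are positive.

```python
# Hypothetical dataset: 1,000 people, 10 of whom (1%) have the disease.
y_true = [1] * 10 + [0] * 990

# A degenerate classifier that always predicts "no disease".
y_always_negative = [0] * 1000

# A hypothetical classifier: finds 8 of the 10 cases, raises 12 false alarms.
y_model = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978


def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(actual, predicted):
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(actual, predicted):
    tp, fp, _, _ = confusion_counts(actual, predicted)
    # Defined as 0.0 here when the classifier makes no positive predictions.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(actual, predicted):
    tp, _, fn, _ = confusion_counts(actual, predicted)
    return tp / (tp + fn) if (tp + fn) else 0.0


for name, preds in [("always negative", y_always_negative), ("model", y_model)]:
    print(name, accuracy(y_true, preds), precision(y_true, preds), recall(y_true, preds))
```

With these made-up numbers, the classifier that never detects the disease still reaches 99% accuracy, while its precision and recall are both zero.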

Question 4

Which of the following text vectorization methods is appropriate and correctly defined for an English-to-Spanish translation machine?

Options:

A.

Using TF-IDF because in translation machines, we do not care about the order of the words.

B.

Using TF-IDF because in translation machines, we need to consider the order of the words.

C.

Using Word2vec because in translation machines, we do not care about the order of the words.

D.

Using Word2vec because in translation machines, we need to consider the order of the words.

Question 5

You have a dataset with thousands of features, all of which are categorical. Using these features as predictors, you are tasked with creating a prediction model to accurately predict the value of a continuous dependent variable. Which of the following would be appropriate algorithms to use? (Select two.)

Options:

A.

K-means

B.

K-nearest neighbors

C.

Lasso regression

D.

Logistic regression

E.

Ridge regression

Question 6

Which of the following describes a typical use case of video tracking?

Options:

A.

Augmented dreaming

B.

Medical diagnosis

C.

Traffic monitoring

D.

Video composition

Question 7

Which of the following is a type 1 error in statistical hypothesis testing?

Options:

A.

The null hypothesis is false, but fails to be rejected.

B.

The null hypothesis is false and is rejected.

C.

The null hypothesis is true and fails to be rejected.

D.

The null hypothesis is true, but is rejected.

Question 8

R-squared is a statistical measure that:

Options:

A.

Combines precision and recall of a classifier into a single metric by taking their harmonic mean.

B.

Expresses the extent to which two variables are linearly related.

C.

Is the proportion of the variance for a dependent variable that's explained by independent variables.

D.

Represents the extent to which two random variables vary together.
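
For reference, R-squared is commonly computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that standard formula. The sample values and the function name r_squared are hypothetical.

```python
def r_squared(y_actual, y_predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_actual) / len(y_actual)
    # Residual sum of squares: squared error left over after prediction.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_actual, y_predicted))
    # Total sum of squares: squared deviation of the data around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_actual)
    return 1 - ss_res / ss_tot


# Hypothetical observations and model predictions.
y_actual = [3.0, 5.0, 7.0, 9.0]
y_predicted = [2.5, 5.5, 6.5, 9.5]

print(r_squared(y_actual, y_predicted))  # 0.95
```

Here SS_res is 1.0 and SS_tot is 20.0, so the result is 0.95. Predicting the mean for every sample gives 0.0, and perfect predictions give 1.0.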

Question 9

You create a prediction model with 96% accuracy. While the model's true positive rate (TPR) is performing well at 99%, the true negative rate (TNR) is only 50%. Your supervisor tells you that the TNR needs to be higher, even if it decreases the TPR. Upon further inspection, you notice that the vast majority of your data is truly positive.

What method could help address your issue?

Options:

A.

Normalization

B.

Oversampling

C.

Principal components analysis

D.

Quality filtering

Question 10

We are using the k-nearest neighbors algorithm to classify the new data points. The features are on different scales.

Which method can help us to solve this problem?

Options:

A.

Log transformation

B.

Normalization

C.

Square-root transformation

D.

Standardization
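
To see why feature scale matters for a distance-based method such as k-nearest neighbors, the sketch below compares Euclidean distances before and after rescaling. The ages, incomes, and helper names are hypothetical. Both min-max normalization and z-score standardization are shown for comparison, without implying which option the exam expects.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical features on very different scales: age (years), income (dollars).
ages = [25, 30, 60]
incomes = [50_000, 51_000, 50_500]

raw = list(zip(ages, incomes))
normalized = list(zip(min_max_normalize(ages), min_max_normalize(incomes)))
standardized = list(zip(standardize(ages), standardize(incomes)))

for name, points in [("raw", raw), ("normalized", normalized), ("standardized", standardized)]:
    d01 = euclidean(points[0], points[1])
    d02 = euclidean(points[0], points[2])
    print(name, "nearest neighbor of point 0 is point", 1 if d01 < d02 else 2)
```

On the raw data, income dominates the distance, so the 60-year-old (point 2) looks closer to the 25-year-old (point 0) than the 30-year-old (point 1) does. After either rescaling, point 1 becomes the nearest neighbor.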

Question 11

You are implementing a support-vector machine on your data, and a colleague suggests you use a polynomial kernel. In what situation might this help improve the prediction of your model?

Options:

A.

When it is necessary to save computational time.

B.

When the categories of the dependent variable are not linearly separable.

C.

When the distribution of the dependent variable is Gaussian.

D.

When there is high correlation among the features.

Question 12

Which of the following can benefit from deploying a deep learning model as an embedded model on edge devices?

Options:

A.

A more complex model

B.

Guaranteed availability of enough space

C.

Increase in data bandwidth consumption

D.

Reduction in latency

Question 13

You and your team need to process large datasets of images as fast as possible for a machine learning task. The project will also use a modular framework with extensible code and an active developer community. Which of the following would BEST meet your needs?

Options:

A.

Caffe

B.

Keras

C.

Microsoft Cognitive Services

D.

TensorBoard

Question 14

A data scientist is tasked with extracting business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist must not overlook?

Options:

A.

Cyberprotection

B.

Cybersecurity

C.

Data privacy

D.

Data security

Question 15

In a self-driving car company, ML engineers want to develop a model for dynamic pathing. Which of the following approaches would be optimal for this task?

Options:

A.

Dijkstra's algorithm

B.

Reinforcement learning

C.

Supervised learning

D.

Unsupervised learning

Question 16

Which of the following is a privacy-focused law that an AI practitioner should adhere to while designing and adapting an AI system that utilizes personal data?

Options:

A.

General Data Protection Regulation (GDPR)

B.

ISO/IEC 27001

C.

PCI DSS

D.

Sarbanes-Oxley (SOX)

Question 17

Which two of the following decrease technical debt in ML systems? (Select two.)

Options:

A.

Boundary erosion

B.

Design anti-patterns

C.

Documentation readability

D.

Model complexity

E.

Refactoring

Question 18

Which type of regression represents the following formula: y = c + b*x, where y = estimated dependent variable score, c = constant, b = regression coefficient, and x = score on the independent variable?

Options:

A.

Lasso regression

B.

Linear regression

C.

Polynomial regression

D.

Ridge regression
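
The formula y = c + b*x can be fitted to data by ordinary least squares, where b is the covariance of x and y divided by the variance of x, and c is chosen so the line passes through the point of means. The sketch below assumes that closed-form solution. The data points and the function name fit_line are hypothetical.

```python
def fit_line(xs, ys):
    """Return (c, b) minimizing the squared error of y = c + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    # Intercept: forces the fitted line through (mean_x, mean_y).
    c = mean_y - b * mean_x
    return c, b


# Hypothetical data generated from y = 1 + 2*x with no noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

c, b = fit_line(xs, ys)
print(c, b)  # 1.0 2.0
```

Because the sample points lie exactly on a line, the fit recovers the constant c = 1.0 and the coefficient b = 2.0.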

Question 19

For each of the last 10 years, your team has been collecting data from a group of subjects, including their age and numerous biomarkers collected from blood samples. You are tasked with creating a prediction model of age using the biomarkers as input. You start by performing a linear regression using all of the data over the 10-year period, with age as the dependent variable and the biomarkers as predictors.

Which assumption of linear regression is being violated?

Options:

A.

Equality of variance (Homoscedasticity)

B.

Independence

C.

Linearity

D.

Normality

Question 20

Which of the following is the primary purpose of hyperparameter optimization?

Options:

A.

Controls the learning process of a given algorithm

B.

Makes models easier to explain to business stakeholders

C.

Improves model interpretability

D.

Increases recall over precision

Question 21

Which of the following sentences is true about model evaluation and model validation in ML pipelines?

Options:

A.

Model evaluation and validation are the same.

B.

Model evaluation is defined as an external component.

C.

Model validation is defined as a set of tasks to confirm the model performs as expected.

D.

Model validation occurs before model evaluation.

Question 22

For a particular classification problem, you are tasked with determining the best algorithm among SVM, random forest, K-nearest neighbors, and a deep neural network. Each of the algorithms has similar accuracy on your data. The stakeholders indicate that they need a model that can convey each feature's relative contribution to the model's accuracy. Which is the best algorithm for this use case?

Options:

A.

Deep neural network

B.

K-nearest neighbors

C.

Random forest

D.

SVM

Question 23

A classifier has been implemented to predict whether or not someone has a specific type of disease. Considering that only 1% of the population in the dataset has this disease, which measures will work the BEST to evaluate this model?

Options:

A.

Mean squared error

B.

Precision and accuracy

C.

Precision and recall

D.

Recall and explained variance

Question 24

You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?

Options:

A.

Decision tree

B.

Logistic regression

C.

Random forest

D.

XGBoost

Question 25

Which two encoders can be used to transform categorical data into numerical features? (Select two.)

Options:

A.

Count Encoder

B.

Log Encoder

C.

Mean Encoder

D.

Median Encoder

E.

One-Hot Encoder
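
As background on how categorical values can be turned into numbers, the sketch below implements three widely used schemes by hand: one-hot, count, and mean (target) encoding. The color column, the target values, and the function names are hypothetical. It is a simplified illustration, not the API of any particular library, and it does not indicate which options the exam expects.

```python
from collections import Counter, defaultdict

# Hypothetical categorical feature and continuous target.
colors = ["red", "blue", "red", "green", "red", "blue"]
target = [10.0, 20.0, 14.0, 30.0, 12.0, 24.0]


def one_hot_encode(values):
    """One binary column per category, with columns in sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def count_encode(values):
    """Replace each category with how often it occurs in the column."""
    counts = Counter(values)
    return [counts[v] for v in values]


def mean_encode(values, targets):
    """Replace each category with the mean target value observed for it."""
    sums = defaultdict(float)
    counts = Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    return [sums[v] / counts[v] for v in values]


print(one_hot_encode(colors)[0])    # [0, 0, 1] for "red" (columns: blue, green, red)
print(count_encode(colors))         # [3, 2, 3, 1, 3, 2]
print(mean_encode(colors, target))  # [12.0, 22.0, 12.0, 30.0, 12.0, 22.0]
```

In practice, mean encoding is usually computed on training folds only, since using the target of the same rows can leak information into the features.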

Question 26

A company is developing a merchandise sales application. The product team uses training data to train the AI model to predict sales, and discovers emergent bias. What caused the biased results?

Options:

A.

The AI model was trained in winter and applied in summer.

B.

The application was migrated from on-premises to a public cloud.

C.

The team set flawed expectations when training the model.

D.

The training data used was inaccurate.

Question 27

Which of the following models are text vectorization methods? (Select two.)

Options:

A.

Lemmatization

B.

PCA

C.

Skip-gram

D.

TF-IDF

E.

Tokenization

F.

t-SNE
