Which of the following should an analyst do to best summarize the data on a data set?
An analyst wants to test the association between the number of doors in a car and the number of gears in the car. Which of the following is the best test to use?
A column is being used to store strings of variable lengths. Performance is a concern, so the column needs to use as little space as possible. Which of the following data types best meets these requirements?
A data analyst is helping a retail store categorize its customers into five different groups based on the following information:
• How recently the customers made purchases
• How frequently the customers made purchases
• How much the customers spent
Given the following information:
Which of the following would be most important for the analysis?
Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.
In what phase are the group's R skills most relevant?
A data analyst needs to present the results of an online marketing campaign to the marketing manager. The manager wants to see the most important KPIs and measure the return on marketing investment. Which of the following should the data analyst use to BEST communicate this information to the manager?
What SQL command is used to delete an entire table from a database?
Which of the following will MOST likely be streamed live?
An analyst is reporting on the average income for a county and is reviewing the following data:
Which of the following is the reason the analyst would need to cleanse the data in this data set?
Which of the following would be used to store unstructured data from different sources?
An analyst is currently working on a ticket for revamping a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?
Given the following data tables:
Which of the following MDM processes needs to take place FIRST?
Which of the following is an example of PII?
Which of the following report types is most appropriate for a high-level, year-end report requested by a Chief Executive Officer?
A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?
Which of the following is used for calculations and pivot tables?
Which of the following data types best describe 4Ac1? (Select two).
Taylor wants to investigate how manufacturing, marketing, and sales expenditures impact overall profitability for her company.
Which of the following systems is the most appropriate?
Which of the following best describes an exploratory analysis?
Which of the following analysis techniques is an unsupervised data mining process?
Given the table below:
Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?
Which of the following is the best description of discrete data types?
Which of the following data cleansing issues will be fixed when a DISTINCT function is applied?
A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?
An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?
A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:
Which of the following types of charts should be considered?
A user receives a large custom report to track company sales across various date ranges. The user then completes a series of manual calculations for each date range. Which of the following should an analyst suggest so the user has a dynamic, seamless experience?
Which of the following data governance concepts fits into the security requirements category?
Which of the following data types should an analyst use to provide the most flexibility when recording emails on a form?
Which one of the following programming languages is specifically designed for use in analytics applications?
Which of the following variable name formats would be problematic if used in the majority of data software programs?
A junior web developer is developing a new application where users can upload short videos. The first task is to create a homepage that shows the headline "Upload Your Short Videos" and a clickable button that says "upload now".
Which of the following HTML commands would help the developer to complete the task successfully?
Which of the following is a difference between a primary key and a unique key?
After completing web scraping, which of the following file formats needs to be parsed?
Which of the following is a control measure for preventing a data breach?
Which of the following is an example of a flat file?
An organization wants to evaluate whether project activities are within the set projections and in line to meet the desired project targets. Which of the following types of analysis is best suited for this situation?
Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?
Which of the following is the best variable formal to store a customer's age using the least possible amount of storage data?
A data analyst for a media company needs to determine the most popular movie genre. Given the table below:
Which of the following must be done to the Genre column before this task can be completed?
An analyst has generated a report that includes the number of months in the first two quarters of 2019 when sales exceeded $50,000:
Which of the following functions did the analyst use to generate the data in the Sales_indicator column?
Which of the following is an example of a strategy to reduce statistical errors?
A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:
Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?
Which one of the following is a common data warehouse schema?
A data analyst needs to create a master file that includes customer information from the tables below:
Given the three tables above, the analyst wants to filter down the information prior to joining it together. In which of the following orders should this data manipulation bo approached for the most efficient result?
Given the following data set:
Which of the following is the best reason for cleansing the data?
An analyst is creating a resource to improve users' experience when they select specific records based on particular dates. Which of the following should the analyst use to create a resource that best meets user needs?
During data cleansing, an analyst conducts measures of central tendency on a data set. Which of the following data is the analyst attempting to identify?
An analyst is building a new dashboard for a user. After an initial conversation with the user. the analyst created a mock-up of the dashboard. Which of the following best explains why the analyst created the mock-up?
Which of the following is the median of the number set:3, 7, 5, 6, 9?
A collections manager has a team calling customers who are past due on their accounts in an attempt to collect payments. The manager receives the call list in the form of a printed report that is generated by the accounting department at the beginning of each week. Consequently, the collections team calls some customers who have made payments in the time since the report was last printed. Which of the following reporting enhancements could the accounting department implement to best reduce the number of calls on current accounts?
A Chief Executive Officer (CEO) is requesting more up-to-date sales data for improved visibility prior to month-end. An analyst must determine the frequency of a sales report that was previously distributed on an as-needed basis. Which of the following would be the most appropriate frequency for this report?
Which of the following defines the policies and procedures for managing the master data?
Given the following data sample:
Which of the following best describes the data quality issue?
An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?
A data analyst has been asked to organize the table below in the following ways:
By sales from high to low -
By state in alphabetic order -
Which of the following functions will allow the data analyst to organize the table in this manner?
An analyst needs to know what data an organization possesses. Which of the following is the best document for the analyst to consult?
Given the following customer and order tables:
Which of the following describes the number of rows and columns of data that would be present after performing an INNER JOIN of the tables?
Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?
A database administrator needs to ensure only approved users can access specific database tables to perform financial functions. Which of the following is the best access control method for the administrator to use?
A database administrator is required to mask certain table columns containing Pll in order to comply with the company privacy policy. Which of the following are the most likely types of information the administrator should mask? (Select two).
An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?
Angela is aggregating data from CRM system with data from an employee system.
While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.
What kind of issues is Angela facing?
Choose the best answer.
A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:
An analyst is required to run a text analysis of data that is found in articles from a digital news outlet. Which of the following would be the BEST technique for the analyst to apply to acquire the data?
A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?
Which one of the following in NOT a common data integration tool?
Which of the following programming languages are best suited for analysis and machine-learning applications? (Select two).
An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:
Which of the following charts would be BEST to use?
A data analyst needs to apply quality control concepts to a data set for accuracy. Which of the following is the best way to do this?
Given the following data:
Which of the following BEST describes the data set?
A data analyst has a set of data that shows the number of gallons of oil produced each day. The company would like to know the standard deviation for the data set. The variance for the data is 36 gallons. Which of the following is the standard deviation for gallons produced?
A data analyst needs to present the results of an online marketing campaign to the marketing manager. The manager wants to see the most important KPIs and measure the return on marketing investment. Which of the following should the data analyst use to BEST communicate this information to the manager?
Which of the following is the first step an analyst should perform upon receiving a business request for analysis?
A reporting analyst is creating a dashboard that shows the year-over-year performance for a sales organization. Which of the following is the best visual for the analyst use to illustrate the organization's performance?
Which of the following is an example of a at flat file?
Given the image below:
Which of the following file formats is depicted?
You are working with a professional statistician to perform an analysis and would like to use a statistics package.
Which one of the following would be the most appropriate?
A development company is constructing a new Init in its apartment complex. The complex has the following floor plans:
Using the average cost per square foot of the original floor plans. which of the following should be the price of the Rose Init?
Consider two different datasets, one with gas prices and the other with food prices. Which of the following measures is most affected by outliers?
A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be themost efficient way to deliver this report?
Which of the following is the best reason for removing data outliers?
Which of the following types of dashboards should a business intelligence engineer develop in order to provide information about failed data pipelines?
An analyst has written the following code:
SELECT *
FROM Cust_table
WHERE age > 60 AND City = "New York"
Which of the following criteria is the analyst retrieving?
Which of the following describes the method of sampling in which elements of data are selected randomly from each of the small subgroups within a population?
Which of the following is an example of a data-mining ETL tool?
Given the following grocery store orders:
If a query is made to the table with the following logic:
Order_Total > 132 OR (Order Total >= 25 AND Order_Total < 74)
Which of the following is the number of orders that will be returned by the query?
An analyst wants to check the progress and performance regarding the number of customers an organization served in the last six years. Which of the following represents the type of analysis theanalyst should perform?
A data analyst is developing a dashboard to track and monitor metrics. Which of the following best practices should be taken into during the FIRST pment process?
An analyst needs to join two tables of data together for analysis. All the names and cities in the first table should be joined with the corresponding ages in the second table, if applicable.
Which of the following is the correct join the analyst should complete. and how many total rows will be in one table?
Which of the following is most likely to be used as a data-mining ETL tool?
What category of data stewardship work is focused on ensuring that the organization respects the wishes of data subjects?
Given the following table:
Which of the following describes the data quality issues with theagedata?
A research analyst collects ten data points from 1.000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?
Which of the following occurs if a 90% confidence interval increases to 95%?
Which of the following data sampling methods involves dividing a population into subgroups by similar characteristics?
Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?
Which one the following is not considered an aggregate function?
The duration of a phone call in milliseconds is an example of:
After the daily ETL jobs are completed, the data in the reports does not appear complete, and a lot of data seems to be missing. Which of the following concepts should be used to assess and investigate further?
Given the following data:
CustomerID
ItemBought
Date
Tre_234
Sofa
2022-09-08
216_Tre
Shoes
08/02/2021
215/Tre
Blanket
2021/06/20
045/Tre
Mug
12-26-2021
Tre-345
Lamp
31/08/2022
TREJD19
Bucket
2022'08/01
Which of the following best describes the main issue in the data set?
A data analyst needs to create a weekly recurring report on sales performance and distribute it to all sales managers. Which of the following would be the BEST method to automate and ensure successful delivery for this task?
An analyst has been asked to validate data quality. Which of the following are the BEST reasons to validate data for quality control purposes? (Choose two.)
Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:
Using this information, which of the following students had the BEST score?
Which of the following best describes a business analytics tool with interactive visualization and business capabilities and an interface that is simple enough for end users to create their own reports and dashboards?
Python
Which of the following concepts should be applied if a data set with 40 fields needs to be pared down to 20 fields and contains similar data across multiple fields?
When analyzing the values of two variables, you decide to convert both variables so they are on a scale of 0 to 1.
What term describes this action?
A sales team wants visibility of current sales numbers, pipeline, and team performance. The team would also like to see calculations of individuals’ earned commissions and projected commissions based on sales, but they want that information to be kept confidential. Which of the following would be the BEST way to provide this visibility?