DY0-001 by CompTIA - Page 1 | ExamCademy

Question 1

Question 2

Question 3

The term "greedy algorithms" refers to machine-learning algorithms that:

A update priors as more data is seen.
B examine every node of a tree before making a decision.
C apply a theoretical model to the distribution of the data.
D make the locally optimal decision.
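
As a general illustration of what "greedy" means, here is a minimal sketch using coin change rather than a machine-learning model. The function name and the denominations are hypothetical and chosen only for this example. At each step the code takes the largest coin that still fits, which is the locally optimal choice, and it never revisits an earlier decision.

```python
def greedy_coin_change(amount, denominations):
    """Build change by always taking the largest coin that still fits."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        # Locally optimal step: use this coin as often as possible
        # before considering any smaller coin.
        while amount >= coin:
            amount -= coin
            coins.append(coin)
    return coins


print(greedy_coin_change(63, [25, 10, 5, 1]))  # [25, 25, 10, 1, 1, 1]
print(greedy_coin_change(6, [4, 3, 1]))        # [4, 1, 1]
```

The second call shows the usual caveat: the greedy result uses three coins, while 3 + 3 would use two, so a locally optimal choice does not guarantee a globally optimal answer.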

Question 4

A data scientist is deploying a model that needs to be accessed by multiple departments with minimal development effort by the departments. Which of the following APIs would be best for the data scientist to use?

A SOAP
B RPC
C JSON
D REST

Question 5

Which of the following compute delivery models allows packaging of only critical dependencies while developing a reusable asset?

A Thin clients
B Containers
C Virtual machines
D Edge devices

Question 6

A data analyst is analyzing data and would like to build conceptual associations. Which of the following is the best way to accomplish this task?

A n-grams
B NER
C TF-IDF
D POS

Question 7

Which of the following belong in a presentation to the senior management team and/or C-suite executives? (Choose two.)

A Full literature reviews
B Code snippets
C Final recommendations
D High-level results
E Detailed explanations of statistical tests
F Security keys and login information

Question 8

During EDA, a data scientist wants to look for patterns, such as linearity, in the data. Which of the following plots should the data scientist use?

A Violin
B Box-and-whisker
C Scatter
D Q-Q

Question 9

Which of the following distribution methods or models can most effectively represent the actual arrival times of a bus that runs on an hourly schedule?

A Binomial
B Exponential
C Normal
D Poisson

Question 10

A data scientist has constructed a model that meets the minimum performance requirements specified in the proposal for a prediction project. The data scientist thinks the model's accuracy should be improved, but the proposed deadline is approaching. Which of the following actions should the data scientist take first?

A Continue collecting data.
B Request additional funding.
C Consult the key project stakeholder.
D Test additional model specifications.

Question 11

Which of the following best describes the minimization of the residual term in a ridge linear regression?

A |e|
B e
C e²
D 0
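
For reference, the standard ridge regression objective can be written as below. The symbols are assumptions introduced here, not taken from the question: e_i is the residual for observation i, beta_j are the coefficients, and lambda >= 0 is the penalty strength. The residual enters through its square, and the penalty adds the squared coefficients.

```latex
\hat{\beta}^{\text{ridge}}
  = \arg\min_{\beta}
    \left( \sum_{i=1}^{n} e_i^{2} + \lambda \sum_{j=1}^{p} \beta_j^{2} \right),
\qquad e_i = y_i - x_i^{\top}\beta
```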

Question 12

A data scientist is performing a linear regression and wants to construct a model that explains the most variation in the data. Which of the following should the data scientist maximize when evaluating the regression performance metrics?

A Accuracy
B R²
C p value
D AUC
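
The coefficient of determination measures the share of variation in the target that a regression explains. The sketch below computes it from its usual definition, 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name and the sample values are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Variation left unexplained by the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total variation of the target around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


y = [1, 2, 3, 4]
print(r_squared(y, [1, 2, 3, 4]))  # 1.0, predictions explain all variation
print(r_squared(y, [2.5] * 4))     # 0.0, no better than predicting the mean
```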

Question 13

A statistician notices gaps in data associated with age-related illnesses and wants to further aggregate these observations. Which of the following is the best technique to achieve this goal?

A Label encoding
B Linearization
C Binning
D Imputing
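
Binning groups a continuous variable such as age into a small number of ranges so that sparse observations can be aggregated. A minimal sketch follows, assuming hypothetical bin edges and labels that are not part of the question.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges and labels, chosen only for this example.
EDGES = [18, 40, 65]
LABELS = ["0-17", "18-39", "40-64", "65+"]


def bin_age(age):
    """Map an age to the label of the range that contains it."""
    # bisect_right counts how many edges are <= age, which is the bin index.
    return LABELS[bisect_right(EDGES, age)]


ages = [12, 18, 35, 64, 65, 81]
counts = Counter(bin_age(a) for a in ages)
print(counts["0-17"], counts["18-39"], counts["40-64"], counts["65+"])  # 1 2 1 2
```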

Question 14

A data scientist needs to analyze a company's chemical businesses and is using the master database of the conglomerate company. Nothing in the data differentiates the data observations for the different businesses. Which of the following is the most efficient way to identify the chemical businesses' observations?

A Ingest the data from all of the hard drives and perform exploratory data analysis to identify which business is responsible for chemical operations.
B Perform analysis on all of the data and create a summary report on the results relevant to chemical operations.
C Consult with the business team to identify which sites are responsible for chemical operations and ingest only the relevant data for analysis.
D Ingest data from the hard drive containing the most data and present sample results on the chemical operations.

Question 15

Which of the following distance metrics for KNN is best described as a straight line?

A Radial
B Euclidean
C Cosine
D Manhattan
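
To make the geometric difference between two of these metrics concrete, the sketch below compares Euclidean and Manhattan distance on the same pair of points. The helper names and the points are illustrative assumptions. Euclidean distance is the length of the straight segment joining the points, while Manhattan distance sums the absolute differences along each axis.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences (grid-style path)."""
    return sum(abs(x - y) for x, y in zip(a, b))


p, q = (0, 0), (3, 4)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7
```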

Question 16

A data scientist is building a forecasting model for the price of copper. The only input in this model is the daily price of copper for the last ten years. Which of the following forecasting techniques is the most appropriate for the data scientist to use?

A Autoregressive
B Moving average
C Dynamic time warping
D Relative strength

Question 17

An analyst wants to show how the component pieces of a company's business units contribute to the company's overall revenue. Which of the following should the analyst use to best demonstrate this breakdown?

A Box-and-whisker chart
B Sankey diagram
C Scatter plot matrix
D Residual chart

Which of the following issues should a data scientist be most concerned about when generating a synthetic data set?

A The data set consuming too many resources
B The data set having insufficient features
C The data set having insufficient row observations
D The data set not being representative of the population

A data scientist needs to:
- Build a predictive model that gives the likelihood that a car will get a flat tire.
- Provide a data set of cars that had flat tires and cars that did not.

All the cars in the data set had sensors taking weekly measurements of tire pressure, similar to the sensors that will be installed in the cars consumers drive. Which of the following is the most immediate data concern?

A Granularity misalignment
B Multivariate outliers
C Insufficient domain expertise
D Lagged observations