E20-007 Online Practice Questions and Answers

Question 4

What is a core deliverable at the end of the analytic project?

A. An implemented database design

B. A whitepaper describing the project and the implementation

C. A presentation for project sponsors

D. The training materials

Question 5

Refer to the exhibit.

In the exhibit, what is the chart's primary flaw with respect to effective visualization?

A. The use of 3 dimensions.

B. The slanting of axis labels.

C. The location of the legend.

D. The order of the columns.

Question 6

A disk drive manufacturer has a defect rate of less than 1.0% with 98% confidence. A quality assurance team samples 1000 disk drives and finds 14 defective units. Which action should the team recommend?

A. The manufacturing process should be inspected for problems.

B. A larger sample size should be taken to determine if the plant is functioning properly.

C. A smaller sample size should be taken to determine if the plant is functioning properly.

D. The manufacturing process is functioning properly and no further action is required.
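
One way to reason about the sample is to ask how likely 14 or more defects would be if the true defect rate sat exactly at the 1.0% boundary. The sketch below computes that one-sided exact binomial tail probability in Python. The helper name `binomial_upper_tail` and the choice of a one-sided exact binomial test are assumptions made for illustration; the question itself does not prescribe a test.

```python
from math import comb

def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    # Sum the lower tail P(X <= k-1) and subtract it from 1.
    lower = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))
    return 1.0 - lower

# Values taken from the question: 1000 drives sampled, 14 defective,
# a 1.0% defect-rate boundary, and 98% confidence (alpha = 0.02).
p_value = binomial_upper_tail(n=1000, k=14, p=0.01)
alpha = 1 - 0.98
print(f"P(X >= 14 | p = 0.01) = {p_value:.4f}")
print("exceeds alpha" if p_value > alpha else "below alpha")
```

If the tail probability is larger than alpha, the sample is consistent with the stated rate at that confidence level; if it is smaller, the sample counts as evidence against it.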

Question 7

Refer to the exhibit.

In the exhibit, the x-axis represents the derived probability of a borrower defaulting on a loan. Also in the exhibit, the pink represents borrowers that are known to have not defaulted on their loan, and the blue represents borrowers that are known to have defaulted on their loan.

Which analytical method could produce the probabilities needed to build this exhibit?

A. Logistic Regression

B. Linear Regression

C. Discriminant Analysis

D. Association Rules

Question 8

Before building an ARMA model, how can you determine if the time series is weakly stationary?

A. Constant variance around a constant mean is apparent

B. Mean of the series is close to 0

C. Series is normally distributed

D. No trend component is apparent
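
Weak stationarity is usually described as a mean and variance that do not change over time, with autocovariance depending only on the lag. A rough, informal check is to split the series into consecutive windows and compare each window's mean and variance. The function name `window_stats`, the window size, and both sample series are illustrative assumptions; this is an inspection aid, not a formal test.

```python
from statistics import fmean, pvariance

def window_stats(series, size):
    """Return (mean, variance) for each full consecutive window of `size` points."""
    stats = []
    for start in range(0, len(series) - size + 1, size):
        window = series[start:start + size]
        stats.append((fmean(window), pvariance(window)))
    return stats

# A series oscillating around a fixed level: window means and variances match.
flat = [1.0, -1.0] * 20
# A series with an upward trend: window means drift upward.
trending = [float(t) for t in range(40)]

print(window_stats(flat, 10))
print(window_stats(trending, 10))
```

Windows whose means or variances drift noticeably from one another suggest the series should be differenced or detrended before an ARMA model is fitted.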

Question 9

Which statement describes a true property of the logistic regression method?

A. It is robust with redundant variables and correlated variables.

B. It handles missing values well.

C. It works well with discrete variables that have many distinct values.

D. It works well with variables that affect the outcome in a discontinuous way.

Question 10

On analyzing your time series data, you suspect that the data represented as y1, y2, y3, ..., yn-1, yn may have a trend component that is quadratic in nature. Which pattern of data will indicate that the trend in the time series data is quadratic in nature?

A. (y3 - y2) - (y2 - y1) = ... = (yn - yn-1) - (yn-1 - yn-2)

B. (y2 - y1) = (y3 - y2) = ... = (yn - yn-1)

C. ((y2 - y1) / y1) * 100% = ... = ((yn - yn-1) / yn-1) * 100%

D. (y4 - y2) - (y3 - y1) = ... = (yn - yn-2) - (yn-1 - yn-3)
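
For a series generated by a quadratic in time, the first differences change linearly, so the second differences are constant. The sketch below checks this on a made-up quadratic series; the helper `differences` and the example coefficients are assumptions for illustration only.

```python
def differences(values):
    """Return consecutive differences: values[i+1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]

# Hypothetical quadratic trend y_t = 2*t^2 + 3*t + 1 for t = 1..8.
y = [2 * t * t + 3 * t + 1 for t in range(1, 9)]

first = differences(y)        # changes linearly for a quadratic trend
second = differences(first)   # constant for a quadratic trend

print(first)
print(second)
```

On real data the second differences will not be exactly equal, so in practice one looks for them to be roughly constant.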

Question 11

In addition to less data movement and the ability to use larger datasets in calculations, what is a benefit of analytical calculations in a database?

A. quicker time to insight

B. more efficient handling of categorical values

C. improved connections between disparate data sources

D. full use of data aggregation functionality

Question 12

You are using MADlib for linear regression analysis. Which value does the following statement return?

SELECT (linregr(depvar, indepvar)).r2 FROM zeta1;

A. Goodness of fit

B. Coefficients

C. Standard error

D. P-value
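
In MADlib's linear regression output, the `r2` field is the coefficient of determination (R-squared). As a rough illustration of what that number measures, the Python sketch below fits a one-variable least-squares line and computes R-squared by hand. The function name `r_squared` and the sample data are assumptions; the code does not call MADlib.

```python
def r_squared(x, y):
    """Fit y = a + b*x by ordinary least squares and return R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares versus total sum of squares around the mean.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Made-up data: one perfectly linear response and one noisy response.
x = [1, 2, 3, 4, 5]
y_exact = [3, 5, 7, 9, 11]
y_noisy = [3, 6, 6, 10, 10]
print(r_squared(x, y_exact))
print(r_squared(x, y_noisy))
```

A value near 1 means the fitted line accounts for most of the variation in the dependent variable; a value near 0 means it accounts for very little.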

Question 13

Based on the exhibit, the table shows the values for the input Boolean attributes A, B, and C. In addition, the exhibit shows the values for the output attribute "class".

Which decision tree is valid for the data?

A. Tree A

B. Tree B

C. Tree C

D. Tree D

Question 14

Which word or phrase completes the statement?

Theater actor is to "Artistic and Expressive" as Data Scientist is to ________________

A. "Communicative and Collaborative"

B. "Introverted and Technical"

C. "Logical and Steadfast"

D. "Independent and Intelligent"

Question 15

Which word or phrase completes the statement?

Business Intelligence is to ad-hoc reporting and dashboards as Data Science is to ______________ .

A. Optimization and Predictive Modeling

B. Alerts and Queries

C. Structured Data and Data Sources

D. Sales and profit reporting

Question 16

In MADlib, what does MAD stand for?

A. Magnetic, Agile, Deep

B. Machine Learning, Algorithms for Databases

C. Mathematical Algorithms for Databases

D. Modular, Accurate, Dependable

Question 17

The web analytics team uses Hadoop to process access logs. They now want to correlate this data with structured user data residing in a production single-instance JDBC database. They collaborate with the production team to import the data into Hadoop. Which tool should they use?

A. Sqoop

B. Pig

C. Chukwa

D. Scribe

Question 18

Refer to the exhibit.

You are asked to write a report on how specific variables impact your client's sales using a data set provided to you by the client. The data includes 15 variables that the client views as directly related to sales, and you are restricted to these variables only.

After a preliminary analysis of the data, the following findings were made:

1. Multicollinearity is not an issue among the variables

2. Only three variables (A, B, and C) have significant correlation with sales

You build a linear regression model on the dependent variable of sales with the independent variables of A, B, and C. The results of the regression are seen in the exhibit.

Which interpretation is supported by the analysis?

A. Variables A, B, and C are significantly impacting sales, but are not effectively estimating sales

B. Variables A, B, and C are significantly impacting sales and are effectively estimating sales

C. Due to the R2 of 0.10, the model is not valid; the linear regression should be re-run with all 15 variables forced into the model to increase the R2

D. Due to the R2 of 0.10, the model is not valid; a different analytical model should be attempted

Exam Code: E20-007
Exam Name: Data Science and Big Data Analytics
Last Update: Mar 17, 2025
Questions: 198 Q&As
