Which one of the following statements is FALSE regarding the communication between DataNodes and a federation of NameNodes in Hadoop 2.0?
A. Each DataNode receives commands from one designated master NameNode.
B. DataNodes send periodic heartbeats to all the NameNodes.
C. Each DataNode registers with all the NameNodes.
D. DataNodes send periodic block reports to all the NameNodes.
Examine the following Hive statements:

Assuming the statements above execute successfully, which one of the following statements is true?
A. Hive reformats File1 into a structure that Hive can access and moves it to /user/joe/x/
B. The file named File1 is moved to /user/joe/x/
C. The contents of File1 are parsed as comma-delimited rows and loaded into /user/joe/x/
D. The contents of File1 are parsed as comma-delimited rows and stored in a database
Which HDFS command uploads a local file X into an existing HDFS directory Y?
A. hadoop scp X Y
B. hadoop fs -localPut X Y
C. hadoop fs -put X Y
D. hadoop fs -get X Y
You need to move a file titled "weblogs" into HDFS. When you try to copy the file, you can't. You know you have ample space on your DataNodes. Which action should you take to relieve this situation and store more files in HDFS?
A. Increase the block size on all current files in HDFS.
B. Increase the block size on your remaining files.
C. Decrease the block size on your remaining files.
D. Increase the amount of memory for the NameNode.
E. Increase the number of disks (or size) for the NameNode.
F. Decrease the block size on all current files in HDFS.
What does the following command do? register '/piggybank/pig-files.jar';
A. Invokes the user-defined functions contained in the jar file
B. Assigns a name to a user-defined function or streaming command
C. Transforms Pig user-defined functions into a format that Hive can accept
D. Specifies the location of the JAR file containing the user-defined functions
A combiner reduces:
A. The number of values across different keys in the iterator supplied to a single reduce method call.
B. The amount of intermediate data that must be transferred between the mapper and reducer.
C. The number of input files a mapper must process.
D. The number of output files a reducer must produce.
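As a rough illustration of local aggregation, the following Python sketch simulates word-count map output from two mappers and counts how many records would cross the shuffle with and without a combining step. The function names (map_words, combine) and the sample input are hypothetical; this is a simulation, not Hadoop API code.

```python
from collections import Counter

def map_words(line):
    """Emit one (word, 1) pair per word, as a word-count mapper would."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Locally sum the counts per key on the mapper side, before the shuffle."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

# Hypothetical input: one line of text per mapper.
mapper_inputs = ["a b a a", "b b a"]
map_outputs = [map_words(line) for line in mapper_inputs]

# Number of intermediate records sent to the reducers in each case.
shuffled_without = sum(len(out) for out in map_outputs)
shuffled_with = sum(len(combine(out)) for out in map_outputs)

print("records shuffled without combining:", shuffled_without)
print("records shuffled with combining:", shuffled_with)
```

The final counts per word are the same either way; only the volume of intermediate data changes.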
Which two of the following statements are true about HDFS? Choose 2 answers.
A. An HDFS file that is larger than dfs.block.size is split into blocks
B. Blocks are replicated to multiple datanodes
C. HDFS works best when storing a large number of relatively small files
D. Block sizes for all files must be the same size
Table metadata in Hive is:
A. Stored as metadata on the NameNode.
B. Stored along with the data in HDFS.
C. Stored in the Metastore.
D. Stored in ZooKeeper.
You use the hadoop fs -put command to write a 300 MB file using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of this file, what would another user see when trying to access this file?
A. They would see Hadoop throw a ConcurrentFileAccessException when they try to access this file.
B. They would see the current state of the file, up to the last bit written by the command.
C. They would see the current state of the file through the last completed block.
D. They would see no content until the whole file is written and closed.
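The block arithmetic behind this scenario can be worked out with a short Python sketch. It only does the division implied by the numbers in the question (300 MB file, 64 MB blocks, 200 MB written); the helper name block_layout is hypothetical, and the sketch says nothing about what HDFS actually exposes to readers.

```python
MB = 1024 * 1024

def block_layout(bytes_written, block_size):
    """Return (completed_blocks, bytes_in_partial_block) for the data written so far."""
    return divmod(bytes_written, block_size)

block_size = 64 * MB

# Total blocks the finished 300 MB file will occupy (ceiling division).
total_blocks = -(-300 * MB // block_size)

# State of the file after 200 MB has been written.
completed, partial = block_layout(200 * MB, block_size)

print("total blocks when finished:", total_blocks)
print("completed blocks after 200 MB:", completed)
print("MB in the block being written:", partial // MB)
```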
Examine the following Pig commands:

Which one of the following statements is true?
A. The SAMPLE command generates an "unexpected symbol" error
B. Each MapReduce task will terminate after executing for 0.2 minutes
C. The reducers will only output the first 20% of the data passed from the mappers
D. A random sample of approximately 20% of the data will be output
Assuming default settings, which best describes the order of data provided to a reducer's reduce method:
A. The keys given to a reducer aren't in a predictable order, but the values associated with those keys always are.
B. Both the keys and values passed to a reducer always appear in sorted order.
C. Neither keys nor values are in any predictable order.
D. The keys given to a reducer are in sorted order but the values associated with each key are in no predictable order
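To make the grouping step concrete, here is a minimal Python sketch of a shuffle that groups values by key and hands the keys to the reducer in sorted order. The values are simply left in the order they arrive; in a real cluster that arrival order depends on which map outputs are fetched first. The function name shuffle and the sample pairs are hypothetical, and no secondary sort is assumed.

```python
from collections import defaultdict

def shuffle(map_outputs):
    """Group values by key; return the groups with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in map_outputs:
        groups[key].append(value)  # values stay in arrival order, not sorted
    return [(key, groups[key]) for key in sorted(groups)]

# Hypothetical intermediate pairs in the order they happen to arrive.
pairs = [("pear", 3), ("apple", 9), ("pear", 1), ("apple", 2)]
reducer_input = shuffle(pairs)

for key, values in reducer_input:
    print(key, values)
```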
You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses TextInputFormat: the mapper applies a regular expression over input values and emits key-value pairs with the key consisting of the matching text, and the value containing the filename and byte offset. Determine the difference between setting the number of reducers to one and setting the number of reducers to zero.
A. There is no difference in output between the two settings.
B. With zero reducers, no reducer runs and the job throws an exception. With one reducer, instances of matching patterns are stored in a single file on HDFS.
C. With zero reducers, all instances of matching patterns are gathered together in one file on HDFS. With one reducer, instances of matching patterns are stored in multiple files on HDFS.
D. With zero reducers, instances of matching patterns are stored in multiple files on HDFS. With one reducer, all instances of matching patterns are gathered together in one file on HDFS.
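The effect of the reducer count on the number of output files can be sketched in Python as follows. The sketch assumes the part-m-NNNNN / part-r-NNNNN naming used by the newer MapReduce API and assumes exactly one map task per input file (each file fitting in a single split); output_files is a hypothetical helper, not a Hadoop call.

```python
def output_files(num_map_tasks, num_reducers):
    """List the output file names a job would write (assumed naming scheme)."""
    if num_reducers == 0:
        # Map-only job: each map task writes its own output file directly.
        return [f"part-m-{i:05d}" for i in range(num_map_tasks)]
    # Otherwise each reducer writes one output file.
    return [f"part-r-{i:05d}" for i in range(num_reducers)]

map_only = output_files(num_map_tasks=100, num_reducers=0)
single_reducer = output_files(num_map_tasks=100, num_reducers=1)

print(len(map_only), "files, first:", map_only[0])
print(len(single_reducer), "file:", single_reducer[0])
```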
For each intermediate key, each reducer task can emit:
A. As many final key-value pairs as desired. There are no restrictions on the types of those key-value pairs (i.e., they can be heterogeneous).
B. As many final key-value pairs as desired, but they must have the same type as the intermediate key-value pairs.
C. As many final key-value pairs as desired, as long as all the keys have the same type and all the values have the same type.
D. One final key-value pair per value associated with the key; no restrictions on the type.
E. One final key-value pair per key; no restrictions on the type.
Assuming the following Hive query executes successfully:

Which one of the following statements describes the result set?
A. A bigram of the top 80 sentences that contain the substring "you are" in the lines column of the inputdata table.
B. An 80-value ngram of sentences that contain the words "you" or "are" in the lines column of the inputdata table.
C. A trigram of the top 80 sentences that contain "you are" followed by a null space in the lines column of the inputdata table.
D. A frequency distribution of the top 80 words that follow the subsequence "you are" in the lines column of the inputdata table.
Consider the following two relations, A and B.

A Pig JOIN statement that combined relations A by its first field and B by its second field would produce what output?
A. 2 Jim Chris 2 3 Terry 3 4 Brian 4
B. 2 cherry 2 cherry 3 orange 4 peach
C. 2 cherry Jim, Chris 3 orange Terry 4 peach Brian
D. 2 cherry Jim 2 2 cherry Chris 2 3 orange Terry 3 4 peach Brian 4