SEC-c Bda Quiz 1

By Yogesh
Community Contributor
Quizzes Created: 1 | Total Attempts: 177
Questions: 14 | Attempts: 177



Questions and Answers
  • 1. 

    Point out the correct statement:

    • A.

      Raw data is the original source of data

    • B.

      Preprocessed data is the original source of data

    • C.

      Raw data is the data obtained after processing steps

    • D.

      None of the Mentioned

    Correct Answer
    A. Raw data is the original source of data
    Explanation
    Raw data is the unprocessed, unorganized information collected directly from the source; it has not undergone any manipulation or analysis. Preprocessed data, by contrast, has already been cleaned, transformed, and organized for further analysis, so it cannot be the original source. Option A is therefore the correct statement.
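As a rough illustration of the distinction, the sketch below treats a small list of strings as raw data and derives a preprocessed copy from it. The sample readings and the `preprocess` helper are invented for this example and are not part of the quiz.

```python
# Hypothetical raw readings, exactly as collected: stray whitespace,
# an empty value, and a malformed entry are all still present.
raw_readings = [" 21.5", "22.0 ", "", "n/a", "19.8"]


def preprocess(readings):
    """Return a cleaned list of floats; the raw input is left untouched."""
    cleaned = []
    for value in readings:
        text = value.strip()
        try:
            cleaned.append(float(text))
        except ValueError:
            # Drop entries that cannot be parsed as numbers.
            continue
    return cleaned


preprocessed = preprocess(raw_readings)
print(preprocessed)
```

The raw list is never modified, so the original source stays available if the cleaning rules need to change later.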


  • 2. 

    Which of the following is performed by a data scientist?

    • A.

      Define the question

    • B.

      Create reproducible code

    • C.

      Challenge results  

    • D.

      All of the Mentioned

    Correct Answer
    D. All of the Mentioned
    Explanation
    Data scientists perform all of the mentioned tasks. They define the question or problem they are trying to solve, create reproducible code to analyze and manipulate data, and challenge the results to ensure accuracy and reliability. By doing all of these tasks, data scientists are able to extract insights and make data-driven decisions.


  • 3. 

    Point out the wrong statement:

    • A.

      Merging concerns combining datasets on the same observations to produce a result with more variables

    • B.

      Data visualization is the organization of information according to preset specifications

    • C.

      Subsetting can be used to select and exclude variables and observations

    • D.

      All of the Mentioned

    Correct Answer
    B. Data visualization is the organization of information according to preset specifications
    Explanation
    Option B is the wrong statement. Data visualization is the representation of data in graphical or visual form to provide insights and communicate patterns or trends; it is not the organization of information according to preset specifications. Options A and C describe merging and subsetting accurately.
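Merging and subsetting, as described in options A and C, can be sketched with plain dictionaries. The two small datasets and their field names below are hypothetical and exist only for illustration.

```python
# Two hypothetical datasets describing the same observations (keyed by id).
heights = {1: {"height_cm": 170}, 2: {"height_cm": 182}, 3: {"height_cm": 165}}
weights = {1: {"weight_kg": 68}, 2: {"weight_kg": 90}, 3: {"weight_kg": 54}}

# Merging: same observations, more variables per observation.
merged = {key: {**heights[key], **weights[key]} for key in heights}

# Subsetting observations: keep only rows that satisfy a condition.
tall = {key: row for key, row in merged.items() if row["height_cm"] > 168}

# Subsetting variables: keep only the selected column.
heights_only = {key: {"height_cm": row["height_cm"]} for key, row in merged.items()}

print(merged)
print(tall)
```

In practice a data-frame library would usually handle this, but the underlying idea is the same: merging adds variables, subsetting selects or excludes them.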


  • 4. 

    Which of the following approaches should be used to ask a data analysis question?

    • A.

      Find only one solution for a particular problem

    • B.

      Find out the question which is to be answered

    • C.

      Find out the answer from the dataset without asking a question

    • D.

      None of the mentioned

    Correct Answer
    B. Find out the question which is to be answered
    Explanation
    The correct approach to ask a Data Analysis question is to first identify the question that needs to be answered. This involves understanding the problem at hand and determining what specific information or insights are required from the dataset. Once the question is clearly defined, appropriate analysis techniques can be applied to find the answer. The other options mentioned, such as finding only one solution or directly extracting the answer from the dataset without asking a question, do not align with the systematic approach of data analysis.


  • 5. 

    Which of the following is a key data science skill?

    • A.

      Statistics  

    • B.

      Machine Learning

    • C.

      Data Visualization

    • D.

      All of the Mentioned

    Correct Answer
    D. All of the Mentioned
    Explanation
    All of the mentioned options are key data science skills. Statistics is essential for analyzing and interpreting data, Machine Learning is crucial for building predictive models and making data-driven decisions, and Data Visualization is important for effectively communicating insights and patterns from data. Therefore, all of these skills are fundamental in the field of data science.


  • 6. 

    Which of the following is the most important language for data science?

    • A.

      Java

    • B.

      Ruby

    • C.

      R

    • D.

      None of the Mentioned

    Correct Answer
    C. R
    Explanation
    Of the options listed, R is the best fit for data science because it was designed for statistical computing and graphics. It has a wide range of packages that make complex data analysis tasks easier, and its large, active user community means plenty of resources and support are available. R also integrates with other programming languages and tools commonly used in data science, which makes it a versatile choice for the field.


  • 7. 

    A salesman offers you a choice of three boxes, one containing a million dollars and two containing fifty dollars and tells you to pick one. He then shows you fifty dollars in one of the other two boxes and asks you if you want to change your choice to the remaining box that you have neither picked nor seen inside. What do you do?

    • A.

      Change to the other box

    • B.

      Stay with the one you picked originally

    • C.

      It doesn't matter, so do nothing

    • D.

      You don't have enough information to figure out whether you should change, so do nothing

    Correct Answer
    A. Change to the other box
    Explanation
    The correct answer is to change to the other box. This is a version of the Monty Hall problem. Initially, there is a 1/3 chance of picking the box with a million dollars and a 2/3 chance of picking one with fifty dollars. Provided the salesman knows what is in each box and always reveals a fifty-dollar box, the reveal does not change the 1/3 chance attached to your original pick, so the remaining unopened box holds the million dollars with probability 2/3. Switching is therefore advantageous.
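The 1/3 versus 2/3 split can be checked by enumerating every equally likely case. This is a minimal sketch that assumes the salesman knows where the million dollars is and always opens a fifty-dollar box you did not pick; the function name is made up for illustration.

```python
from fractions import Fraction

BOXES = 3  # one box holds a million dollars, two hold fifty dollars


def win_probabilities():
    """Return (P(win by staying), P(win by switching)) by exact enumeration."""
    stay_wins = 0
    switch_wins = 0
    cases = 0
    for prize in range(BOXES):
        for first_pick in range(BOXES):
            cases += 1
            if first_pick == prize:
                # Staying wins only when the first pick was already right.
                stay_wins += 1
            else:
                # The salesman must open the other fifty-dollar box,
                # so the one remaining box holds the million.
                switch_wins += 1
    return Fraction(stay_wins, cases), Fraction(switch_wins, cases)


stay, switch = win_probabilities()
print(stay, switch)  # 1/3 2/3
```

If the salesman opened a box at random and merely happened to show fifty dollars, the assumption behind this enumeration would no longer hold and the result would differ.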


  • 8. 

    Which of the following is preferred for text analytics?

    • A.

      R

    • B.

      Python  

    • C.

      S

    • D.

      All of the mentioned

    Correct Answer
    A. R
    Explanation
    R is preferred for text analytics because it has a wide range of packages designed for natural language processing and text mining tasks. These packages provide functionality such as tokenization, stemming, sentiment analysis, and topic modeling. R also has robust visualization capabilities, which make textual data easier to analyze and interpret. In addition, R has strong community support and many resources available online, making it a popular choice for text analytics.


  • 9. 

    ______ is the simplest class of analytics:

    • A.

      Descriptive

    • B.

      Predictive

    • C.

      Prescriptive

    • D.

      All of the mentioned

    Correct Answer
    A. Descriptive
    Explanation
    Descriptive analytics is the simplest class of analytics because it focuses on analyzing historical data to understand what has happened in the past. It involves summarizing and interpreting data to gain insights and identify patterns and trends. Descriptive analytics does not involve making predictions or prescribing actions for the future, unlike predictive and prescriptive analytics. Instead, it provides a foundation for further analysis and decision-making by providing a clear understanding of past events and their implications.
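A minimal sketch of descriptive analytics, using Python's standard `statistics` module to summarize what has already happened. The monthly sales figures are hypothetical.

```python
import statistics

# Hypothetical monthly sales figures (historical data only).
monthly_sales = [120, 135, 128, 150, 142, 160]

# Descriptive analytics: summarize the past, with no prediction involved.
summary = {
    "count": len(monthly_sales),
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "min": min(monthly_sales),
    "max": max(monthly_sales),
}
print(summary)
```

Predictive or prescriptive analytics would build on a summary like this, for example by fitting a model to forecast the next month.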


  • 10. 

    Your company is attempting to build a Big Data environment. The vendors you are working with tell you that an additional $1m of capital expenditure is needed on top of the $10m made so far. You are worried that the existing environment will not provide all the capability you need, however. Do you:

    • A.

      Finalize the work you are doing with your current vendors because there isn't much left to do

    • B.

      Pause work while you consider what would be needed to gain the extra capability you need

    • C.

      Scrap the project as it seems it will not be fit for purpose

    • D.

      None of the above

    Correct Answer
    B. Pause work while you consider what would be needed to gain the extra capability you need
    Explanation
    Pausing work while considering what would be needed to gain the extra capability is the most logical choice in this situation. The concern that the existing environment will not provide all the necessary capability indicates that further evaluation and planning are required before making a decision. The $10m already spent is a sunk cost, so it justifies neither finishing the work regardless nor scrapping the project outright. By pausing, the company can assess whether its requirements can be met with the additional $1m investment and determine what adjustments are needed for the Big Data environment project to succeed.


  • 11. 

    You are operating a public health screening post at an airport and 200 people with a disease are identified. Three quarters of these are young, and two-thirds of all young people are diseased. There are as many non-diseased old people as there are young people in total. You now screen a new previously unseen individual – what is the chance they are old?

    • A.

      Impossible to say from the data given

    • B.

      Impossible to say without knowledge of the previously unseen individual's gender

    • C.

      55%

    • D.

      40%

    Correct Answer
    C. 55%
    Explanation
    Working through the figures: three quarters of the 200 diseased people are young, so 150 diseased people are young and 50 are old. Those 150 make up two-thirds of all young people, so there are 225 young people in total, 75 of them non-diseased. There are as many non-diseased old people as young people in total, so 225 old people are non-diseased, giving 50 + 225 = 275 old people. The screened population is therefore 225 + 275 = 500, and, assuming the new individual resembles those already screened, the chance they are old is 275/500 = 55%.
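The figures in the question can be worked through step by step. This sketch assumes the new individual is drawn at random from a population like the one already screened; the variable names are chosen for illustration.

```python
from fractions import Fraction

diseased_total = 200
diseased_young = Fraction(3, 4) * diseased_total   # 150
diseased_old = diseased_total - diseased_young     # 50

# Two-thirds of all young people are diseased: young_total * 2/3 = 150.
young_total = diseased_young / Fraction(2, 3)      # 225

# As many non-diseased old people as there are young people in total.
healthy_old = young_total                          # 225

old_total = diseased_old + healthy_old             # 275
population = young_total + old_total               # 500

p_old = old_total / population
print(p_old, float(p_old))  # 11/20 0.55
```

Using `Fraction` keeps every intermediate value exact, so the final 55% is not affected by floating-point rounding.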


  • 12. 

    Data by itself is not useful unless:

    • A.

      It is massive

    • B.

      It is processed to obtain information

    • C.

      It is collected from diverse sources

    • D.

      It is properly stated

    Correct Answer
    B. It is processed to obtain information
    Explanation
    Data by itself is raw and unorganized. To derive meaningful insights or make informed decisions, it must be processed and analyzed to extract information. Processing involves organizing, cleaning, and transforming the data into a more structured format, which allows patterns, trends, and relationships to be identified. Processing is therefore what turns data into something useful.


  • 13. 

    For making decisions, data must be:

    • A.

      Very accurate

    • B.

      Massive

    • C.

      Processed correctly

    • D.

      Collected from diverse sources

    Correct Answer
    C. Processed correctly
    Explanation
    To make informed decisions, it is crucial that the data is processed correctly. This means organizing, cleaning, and transforming it in a way that eliminates errors and inconsistencies. Correctly processed data supports meaningful insights and accurate conclusions; without proper processing, the data may be unreliable and lead to incorrect decisions. Accuracy, volume, and diversity of sources all matter, but correct processing is what allows them to be used effectively.


  • 14. 

    Point out the correct statement:

    • A.

      Hadoop is an ideal environment for extracting and transforming small volumes of data

    • B.

      Hadoop stores data in HDFS and supports data compression/decompression

    • C.

      The Giraph framework is less useful than a MapReduce job to solve graph and machine learning

    • D.

      None of the mentioned

    Correct Answer
    B. Hadoop stores data in HDFS and supports data compression/decompression
    Explanation
    Hadoop stores data in HDFS and supports data compression/decompression. This means that Hadoop has the capability to store large volumes of data in its distributed file system (HDFS) and also provides the functionality to compress and decompress the data. This feature is important in big data processing as it helps in reducing storage space and improving data processing efficiency.
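The storage benefit of compression can be illustrated outside Hadoop. The sketch below is not Hadoop code and does not use HDFS; it relies only on Python's standard `gzip` module and a made-up, highly repetitive byte string to show a compress and decompress round trip.

```python
import gzip

# Hypothetical, highly repetitive log data; real-world ratios will vary.
original = b"INFO request served\n" * 1000

compressed = gzip.compress(original)
restored = gzip.decompress(compressed)

# The round trip is lossless, and the compressed form is smaller here.
print(len(original), len(compressed))
```

Repetitive data such as logs tends to compress well, which is part of why compression matters when storing large volumes.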


Quiz Review Timeline


  • Current Version
  • Mar 18, 2023
    Quiz Edited by
    ProProfs Editorial Team
  • Aug 22, 2019
    Quiz Created by
    Yogesh