Data Analytics Ultimate Quiz! Trivia

Approved & Edited by ProProfs Editorial Team
By Kamil Lazim
Community Contributor
Quizzes Created: 1 | Total Attempts: 126
Questions: 40 | Attempts: 126

Questions and Answers
  • 1. 

    What skills are required in data science?

    • A.

      Statistics / Mathematics skills

    • B.

      Coding / Hacking skills

    • C.

      Domain Knowledge / Business Knowledge

    • D.

      All the above

    Correct Answer
    D. All the above
    Explanation
    The correct answer is "All the above". Data Science requires a combination of statistics/mathematics skills, coding/hacking skills, and domain/business knowledge. Statistics and mathematics skills are essential for analyzing and interpreting data. Coding and hacking skills are necessary for programming and manipulating data. Domain and business knowledge is important for understanding the context and making informed decisions. Therefore, all of these skills are required in Data Science.

  • 2. 

    Which of the following best describes the principal goal of data science?

    • A.

      To collect and archive exhaustive data sets from various source systems for corporate record keeping uses.

    • B.

      To mine and analyze large amounts of data to uncover information that can be leveraged for operational improvements and business gains.

    • C.

      To prepare data for analysts to use as part of analytics applications.

    • D.

      All the above

    Correct Answer
    B. To mine and analyze large amounts of data to uncover information that can be leveraged for operational improvements and business gains.
    Explanation
    The principal goal of data science is to mine and analyze large amounts of data to uncover information that can be leveraged for operational improvements and business gains. This involves using techniques and tools to extract valuable insights from data, which can then be used to make informed decisions and drive business growth. Collecting and archiving data sets for record keeping purposes and preparing data for analysts are important steps in the data science process, but they are not the ultimate goal.

  • 3. 

    Which of the following is performed by a data scientist?

    • A.

      Create reproducible code

    • B.

      Define the question

    • C.

      Challenge results

    • D.

      All the above

    Correct Answer
    B. Define the question
    Explanation
    A data scientist is responsible for defining the question or problem that needs to be solved using data analysis. They need to understand the business objectives and formulate the right questions to guide their analysis. Creating reproducible code and challenging results are also important tasks for a data scientist, but they are not exclusive to their role. Other individuals involved in data analysis may also perform these tasks. Therefore, the correct answer is defining the question.

  • 4. 

    Which of the following is the most widely used language for data science?

    • A.

      Ruby

    • B.

      Java

    • C.

      Python

    • D.

      C++

    Correct Answer
    C. Python
    Explanation
    Python is the most widely used language for data science due to its simplicity, versatility, and extensive libraries such as NumPy, Pandas, and Scikit-learn. These libraries provide powerful tools for data manipulation, analysis, and machine learning. Python's syntax is easy to understand and its large community of users contribute to its popularity by sharing code and resources. Its integration with other languages and frameworks also makes it a preferred choice for data scientists.

  • 5. 

    Which of the following is not a step in data analysis?

    • A.

      EDA

    • B.

      Obtaining data

    • C.

      Cleaning data

    • D.

      Securing data

    Correct Answer
    D. Securing data
    Explanation
    Securing data is not a step in data analysis because it is a separate process that focuses on protecting data from unauthorized access, disclosure, alteration, or destruction. While it is essential to secure data to maintain its integrity and confidentiality, it is not directly involved in the analysis of data. The other options, EDA (Exploratory Data Analysis), obtaining data, and cleaning data, are all crucial steps in the data analysis process. EDA involves exploring and understanding the data, obtaining data involves collecting relevant data sources, and cleaning data involves removing errors, inconsistencies, and outliers from the dataset.

  • 6. 

    Which of the following steps is performed by a data scientist after acquiring the data?

    • A.

      Data Integration

    • B.

      Data Cleansing

    • C.

      Data Replication

    • D.

      All the above

    Correct Answer
    B. Data Cleansing
    Explanation
    After acquiring the data, one of the steps performed by a data scientist is data cleansing. This involves identifying and removing any errors, inconsistencies, or irrelevant information from the dataset. Data cleansing ensures that the data is accurate, complete, and suitable for analysis. It may involve tasks such as removing duplicate records, filling in missing values, correcting inconsistencies, and standardizing formats. By performing data cleansing, the data scientist ensures that the data is of high quality and can be effectively used for further analysis and modeling.
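
    The cleansing tasks named above (removing duplicates, handling missing values, standardizing formats) can be sketched in a few lines of Python. This is only an illustration: the records, field names, and the `clean` helper are invented for the example and are not part of the quiz.

```python
# Hypothetical raw records; field names and values are invented for illustration.
raw_records = [
    {"name": " Alice ", "city": "springfield", "age": "34"},
    {"name": "Bob", "city": "Shelbyville", "age": ""},
    {"name": " Alice ", "city": "springfield", "age": "34"},  # exact duplicate
]

def clean(records, default_age=None):
    """Standardize text formats, fill missing ages, and drop duplicate rows."""
    seen = set()
    cleaned = []
    for rec in records:
        row = {
            "name": rec["name"].strip(),          # trim stray whitespace
            "city": rec["city"].strip().title(),  # consistent capitalization
            "age": int(rec["age"]) if rec["age"].strip() else default_age,
        }
        key = (row["name"], row["city"], row["age"])
        if key not in seen:                       # keep the first copy only
            seen.add(key)
            cleaned.append(row)
    return cleaned

cleaned_records = clean(raw_records)
```

    Here a missing age is filled with a placeholder (`None`); a real project would choose a fill strategy that suits the data.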

  • 7. 

    What is the simplest class of analytics?

    • A.

      Descriptive

    • B.

      Predictive

    • C.

      Prescriptive

    • D.

      All the above

    Correct Answer
    A. Descriptive
    Explanation
    Descriptive analytics is the simplest class of analytics as it focuses on summarizing and interpreting historical data to provide insights into past events and trends. It involves organizing and presenting data in a way that is easy to understand, such as through charts, graphs, and reports. Descriptive analytics helps in understanding what has happened in the past, but it does not involve making predictions or prescribing actions for the future like predictive and prescriptive analytics do.

  • 8. 

    Point out the correct statement:

    • A.

      Raw data is the original source of data

    • B.

      Preprocessed data is the original source of data

    • C.

      Raw data is the data obtained after processing steps

    • D.

      None of the above

    Correct Answer
    A. Raw data is the original source of data
    Explanation
    The correct statement is that raw data is the original source of data. Raw data refers to the unprocessed and unorganized data that is collected directly from the source without any modifications or transformations. It is the initial data that is collected before any processing steps are applied to it. Preprocessed data, on the other hand, refers to the data that has undergone some form of cleaning, transformation, or organization to make it more suitable for analysis or use. Therefore, the correct answer is that raw data is the original source of data.

  • 9. 

    The below image is an example of:

    • A.

      Name box

    • B.

      Formula bar

    • C.

      Column heading

    • D.

      Row heading

    Correct Answer
    C. Column heading
    Explanation
    The image provided shows the labels at the top of each column in a spreadsheet. These labels indicate the specific data or information contained within each column. Therefore, the correct answer is "Column heading."

  • 10. 

    The below image is an example of

    • A.

      Cell

    • B.

      Column heading

    • C.

      Spreadsheet

    • D.

      Formula bar

    Correct Answer
    D. Formula bar
    Explanation
    The given image is an example of a formula bar. The formula bar is a feature in spreadsheet software that displays the contents of the active cell and allows users to enter or edit formulas and data. It is typically located at the top of the spreadsheet interface and provides a convenient way to input and manipulate data in cells.

  • 11. 

    Which is the most suitable chart for discrete data?

    • A.

      Bar chart

    • B.

      Line chart

    Correct Answer
    A. Bar chart
    Explanation
    A bar chart is the most suitable chart for discrete data because it displays data in separate bars, with each bar representing a specific category or group. This allows for easy comparison between different categories or groups, making it ideal for displaying discrete data. A line chart, on the other hand, is more suitable for continuous data, where the data points are connected by lines to show trends over time.

  • 12. 

    Which is the most suitable chart for continuous data?

    • A.

      Line chart

    • B.

      Pie chart

    Correct Answer
    A. Line chart
    Explanation
    A line chart is the most suitable chart for continuous data because it connects successive data points, showing how a value changes along a continuous axis such as time. It is ideal for displaying trends, patterns, and changes over a continuous period. In contrast, a pie chart is more suitable for displaying categorical data and comparing parts of a whole.

  • 13. 

    What is the Excel feature that quickly allows us to show trend information in a single cell?

    • A.

      Pivot table

    • B.

      VLOOKUP

    • C.

      Sparklines

    • D.

      Conditional formatting

    Correct Answer
    C. Sparklines
    Explanation
    Sparklines is the correct answer because it is an Excel feature that allows us to show trend information in a single cell. Sparklines are small, condensed charts that can be inserted within a cell and provide a visual representation of data trends, such as line graphs, bar charts, or win/loss charts. They are useful for quickly analyzing and understanding data patterns without the need for creating a separate chart or graph.

  • 14. 

    How can you easily perform a summary over a large detailed data set?

    • A.

      Pie chart

    • B.

      Pivot table

    • C.

      Formula

    • D.

      Combo chart

    Correct Answer
    B. Pivot table
    Explanation
    A pivot table is a useful tool for summarizing large detailed data sets. It allows you to quickly and easily analyze and summarize data by creating a table with rows and columns that can be rearranged and manipulated. You can easily group and aggregate data, perform calculations, and generate summaries such as totals, averages, and percentages. This makes it efficient and convenient to get an overview of the data and identify patterns or trends without having to manually sift through and analyze each individual data point.
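
    The grouping and aggregation a pivot table performs can be approximated in plain Python. The sketch below is not how Excel implements the feature; the sales rows and the `pivot_sum` helper are hypothetical and only show the idea of totalling detail rows by a row label and a column label.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, product, sales amount).
sales = [
    ("North", "Widget", 120),
    ("North", "Gadget", 80),
    ("South", "Widget", 200),
    ("South", "Widget", 50),
    ("North", "Widget", 30),
]

def pivot_sum(rows):
    """Total the amount for each (row label, column label) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for region, product, amount in rows:
        table[region][product] += amount
    # Convert nested defaultdicts to plain dicts for easy printing.
    return {region: dict(cols) for region, cols in table.items()}

summary = pivot_sum(sales)
```

    Five detail rows collapse into three totals, one per region and product combination that actually occurs.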

  • 15. 

    Which Excel feature can you use to ensure that users do not enter irrelevant data?

    • A.

      Autofill

    • B.

      VLOOKUP

    • C.

      Data validation

    • D.

      Formula

    Correct Answer
    C. Data validation
    Explanation
    Data validation is the correct answer because it is an Excel feature that allows users to set specific criteria for data entry. By using data validation, users can restrict the type of data that can be entered in a cell, such as numbers within a certain range, dates, or specific text. This helps to ensure that irrelevant or incorrect data is not entered, improving the accuracy and reliability of the data in the spreadsheet.
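
    The same idea, accepting an entry only if it meets a stated rule, can be sketched in Python. This is an analogy rather than Excel's own mechanism; the `validate_age` helper and its 0 to 120 range are assumptions made for the example.

```python
def validate_age(value, low=0, high=120):
    """Return True only for whole numbers inside the allowed range."""
    is_whole_number = isinstance(value, int) and not isinstance(value, bool)
    return is_whole_number and low <= value <= high

# Hypothetical user entries, some of which break the rule.
entries = [34, -5, "n/a", 250, 61]
accepted = [v for v in entries if validate_age(v)]
rejected = [v for v in entries if not validate_age(v)]
```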

  • 16. 

    Which of the following is the correct formula to add the values in cell A1 to A3?

    • A.

      A1+A2+A3

    • B.

      =SUM(A1:A3)

    • C.

      SUM(A1,A2,A3)

    • D.

      =1+2+3

    Correct Answer
    B. =SUM(A1:A3)
    Explanation
    The correct formula to add the values in cell A1 to A3 is =SUM(A1:A3). This formula uses the SUM function in Excel to add the values within the specified range, which in this case is A1 to A3. This formula is the most efficient and concise way to add multiple values in Excel.

  • 17. 

    In what follows, S is the sample space of the experiment in question and E is the event of interest. n(S) is the number of elements in the sample space S and n(E) is the number of elements in the event E. A die is rolled. Find the probability that an even number is obtained.

    • A.

      1/2

    • B.

      4/6

    • C.

      3/2

    • D.

      5/6

    Correct Answer
    A. 1/2
    Explanation
    The sample space S consists of all possible outcomes when rolling a die, which are the numbers 1, 2, 3, 4, 5, and 6. The event E consists of the outcomes that are even numbers, which are 2, 4, and 6. Therefore, n(S) = 6 and n(E) = 3. The probability of event E occurring is given by n(E)/n(S) = 3/6 = 1/2.
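
    The counting argument n(E)/n(S) can be checked by enumerating the sample space directly. The sketch below assumes a fair six-sided die; the `probability` helper is a name introduced for the example.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}                      # S: faces of the die
event = {x for x in sample_space if x % 2 == 0}        # E: even faces

def probability(event, sample_space):
    """Classical probability n(E) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

p_even = probability(event, sample_space)
```

    `Fraction` reduces 3/6 to 1/2 automatically, matching answer A.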

  • 18. 

    Two coins are tossed. Find the probability that two heads are obtained. Note: Each coin has two possible outcomes, H (heads) and T (tails).

    • A.

      1/2

    • B.

      3/4

    • C.

      1/4

    • D.

      1

    Correct Answer
    C. 1/4
    Explanation
    When two coins are tossed, there are a total of four possible outcomes: HH, HT, TH, and TT. Since we are interested in the probability of obtaining two heads, there is only one favorable outcome (HH) out of the four possible outcomes. Therefore, the probability of obtaining two heads is 1 out of 4, which can be expressed as 1/4.
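
    The four outcomes can be generated rather than listed by hand. A minimal sketch, assuming two fair coins:

```python
from fractions import Fraction
from itertools import product

# Every ordered pair of results for two coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))
favorable = [o for o in outcomes if o == ("H", "H")]

p_two_heads = Fraction(len(favorable), len(outcomes))
```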

  • 19. 

    Which of these numbers cannot be a probability?

    • A.

      0

    • B.

      0.452

    • C.

      1.0001

    • D.

      20%

    Correct Answer
    C. 1.0001
    Explanation
    A probability must be a number between 0 and 1, inclusive. 1.0001 is greater than 1, which is outside the range of possible probabilities.

  • 20. 

    What is the probability of the shaded sector from the spinner below?

    • A.

      1/3

    • B.

      3/4

    • C.

      2/6

    • D.

      2/5

    Correct Answer
    D. 2/5
    Explanation
    The spinner is divided into 5 equal sectors, and the shaded sector occupies 2 of these sectors. Therefore, the probability of landing on the shaded sector is 2/5.

  • 21. 

    A jar contains 3 red marbles, 7 green marbles and 10 white marbles. If a marble is drawn from the jar at random, what is the probability that this marble is white?

    • A.

      1/2

    • B.

      4/20

    • C.

      2/6

    • D.

      7/10

    Correct Answer
    A. 1/2
    Explanation
    The probability of drawing a white marble can be found by dividing the number of white marbles by the total number of marbles in the jar. In this case, there are 10 white marbles and a total of 20 marbles in the jar. Therefore, the probability of drawing a white marble is 10/20, which simplifies to 1/2.
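
    The same division can be written as a small function over the jar's contents. The marble counts come from the question; the `draw_probability` helper is a name introduced for the example.

```python
from fractions import Fraction

jar = {"red": 3, "green": 7, "white": 10}

def draw_probability(counts, colour):
    """Probability of drawing one marble of the given colour at random."""
    return Fraction(counts[colour], sum(counts.values()))

p_white = draw_probability(jar, "white")
```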

  • 22. 

    The blood groups of 200 people are distributed as follows: 50 have type A blood, 65 have B blood type, 70 have O blood type and 15 have type AB blood. If a person from this group is selected at random, what is the probability that this person has O blood type?

    • A.

      5/10

    • B.

      85/200

    • C.

      70/200

    • D.

      120/200

    Correct Answer
    C. 70/200
    Explanation
    The probability of selecting a person with O blood type can be calculated by dividing the number of people with O blood type (70) by the total number of people (200). Therefore, the probability is 70/200, which simplifies to 7/20.

  • 23. 

    A die is rolled. Find the probability that the number obtained is greater than 4.

    • A.

      1/3

    • B.

      4/6

    • C.

      2/3

    • D.

      1/6

    Correct Answer
    A. 1/3
    Explanation
    The probability of rolling a number greater than 4 on a die can be determined by counting the favorable outcomes (numbers 5 and 6) and dividing it by the total number of possible outcomes (numbers 1 to 6). In this case, there are 2 favorable outcomes (5 and 6) out of 6 possible outcomes. Therefore, the probability is 2/6, which simplifies to 1/3.

  • 24. 

    A card is drawn at random from a standard deck of 52 cards. Find the probability of getting a queen.

    • A.

      1/14

    • B.

      3/12

    • C.

      4/13

    • D.

      1/13

    Correct Answer
    D. 1/13
    Explanation
    The sample space S consists of all possible outcomes of drawing one card at random from a standard deck, so n(S) = 52. The event of interest is drawing a queen, and the deck contains four queens, so n(E) = 4. Therefore, the probability of getting a queen is 4/52, which simplifies to 1/13.
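
    The 52-card sample space can be built explicitly to confirm the count. A minimal sketch, assuming a standard deck with no jokers:

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]

deck = list(product(ranks, suits))                 # 13 ranks x 4 suits
queens = [card for card in deck if card[0] == "Q"]

p_queen = Fraction(len(queens), len(deck))
```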

  • 25. 

    The expected value or _______ of a random variable is the center of its distribution.

    • A.

      Mode

    • B.

      Median

    • C.

      Mean

    • D.

      Bayesian inference

    Correct Answer
    C. Mean
    Explanation
    The expected value of a random variable is the center of its distribution. It represents the average value that the random variable is expected to take on over a large number of trials. The mean is calculated by summing up all the possible values of the random variable, each multiplied by their respective probabilities. It is a measure of central tendency and provides a measure of the typical value of the random variable.
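
    The "sum of each value times its probability" rule can be illustrated with a fair six-sided die. The die is an example chosen here, not part of the question.

```python
from fractions import Fraction

# Probability distribution of a fair die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expected_value(distribution):
    """E[X] = sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in distribution.items())

mean_roll = expected_value(die)
```

    The result is 7/2, or 3.5, a value the die can never actually show. The mean describes the center of the distribution, not a typical single outcome.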

  • 26. 

    Which of the following properties of a random variable is a measure of spread?

    • A.

      Variance

    • B.

      Standard deviation

    • C.

      Empirical mean

    • D.

      All of the Mentioned

    Correct Answer
    A. Variance
    Explanation
    Variance is a measure of spread because it quantifies how much the values of a random variable vary from the mean. It calculates the average of the squared differences between each value and the mean, providing a measure of the overall dispersion or spread of the data. A higher variance indicates a greater spread of values, while a lower variance indicates a more concentrated or narrow distribution. Therefore, variance is a commonly used statistical measure to understand the variability or spread of a random variable.
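
    For a random variable with a known distribution, the variance is the expected squared distance from the mean. The sketch below applies that definition to a fair six-sided die, which is an example chosen here rather than part of the question.

```python
from fractions import Fraction

# Probability distribution of a fair die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def variance(distribution):
    """Var(X) = E[(X - mu)^2], where mu = E[X]."""
    mu = sum(x * p for x, p in distribution.items())
    return sum((x - mu) ** 2 * p for x, p in distribution.items())

die_variance = variance(die)
```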

  • 27. 

    The square root of the variance is called the ________ deviation.

    • A.

      Empirical

    • B.

      Mean

    • C.

      Continuous

    • D.

      Standard

    Correct Answer
    D. Standard
    Explanation
    The square root of the variance is called the standard deviation. This is a commonly used measure of the amount of variation or dispersion in a set of data. It tells us how spread out the data points are from the mean. By taking the square root of the variance, we obtain the standard deviation, which is expressed in the same units as the original data.
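
    The relationship can be confirmed with Python's `statistics` module. The data set below is hypothetical, chosen so the numbers come out round; `pvariance` and `pstdev` are the population versions of the two measures.

```python
import math
import statistics

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_variance = statistics.pvariance(data)   # mean squared deviation
pop_std_dev = statistics.pstdev(data)       # population standard deviation

# The standard deviation is the square root of the variance.
same = math.isclose(math.sqrt(pop_variance), pop_std_dev)
```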

  • 28. 

    Given the set of numbers 20, 24, 25, 36, 25, 22, 23, what is the mode?

    • A.

      20

    • B.

      25

    • C.

      22

    • D.

      36

    Correct Answer
    B. 25
    Explanation
    The mode is the number that appears most frequently in a set of numbers. In this set, the number 25 appears twice, which is more than any other number. Therefore, the mode of this set of numbers is 25.
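
    Counting occurrences makes the mode easy to verify. A minimal sketch using the quiz's data set:

```python
from collections import Counter

data = [20, 24, 25, 36, 25, 22, 23]

counts = Counter(data)                              # value -> frequency
mode_value, mode_count = counts.most_common(1)[0]   # most frequent value
```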

  • 29. 

    Given the set of numbers 20, 24, 25, 36, 25, 22, 23, what is the mean?

    • A.

      22.5

    • B.

      26

    • C.

      25

    • D.

      24.3

    Correct Answer
    C. 25
    Explanation
    The correct answer is 25. To find the mean, you add up all the numbers in the set and then divide by the total number of values. In this case, the sum of the numbers is 175. There are a total of 7 numbers in the set. So, when you divide 175 by 7, you get 25. Therefore, the mean of the set of numbers is 25.
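
    The arithmetic can be reproduced in two lines. A minimal sketch using the quiz's data set:

```python
data = [20, 24, 25, 36, 25, 22, 23]

total = sum(data)                 # sum of all values
mean_value = total / len(data)    # divide by the number of values
```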

  • 30. 

    Given the set of numbers 20, 24, 25, 36, 25, 22, 23, what is the median?

    • A.

      25

    • B.

      23

    • C.

      48

    • D.

      24

    Correct Answer
    D. 24
    Explanation
    The median is the middle value in a set of numbers when they are arranged in ascending order. In this case, when the numbers are arranged in ascending order, they become 20, 22, 23, 24, 25, 25, 36. The middle value is 24, which is the correct answer.
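
    Sorting and picking the middle element can be sketched as follows. The `median` helper is written out for illustration and also handles an even count by averaging the two middle values.

```python
data = [20, 24, 25, 36, 25, 22, 23]

def median(values):
    """Middle value of the sorted data (mean of the two middle values if even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

median_value = median(data)
```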

  • 31. 

    Given the set of numbers 20, 24, 25, 36, 25, 22, 23, what is the standard deviation? (approximate value)

    • A.

      4.934

    • B.

      3.241

    • C.

      51.64

    • D.

      5.164

    Correct Answer
    D. 5.164
    Explanation
    The correct answer is 5.164. The standard deviation measures how spread out the numbers are around the mean. Here the mean is 25 and the squared deviations from it sum to 160, so the sample variance is 160 / 6 ≈ 26.667. The sample standard deviation is the square root of that value, approximately 5.164.

  • 32. 

    Given the set of numbers 20, 24, 25, 36, 25, 22, 23, what is the variance? (approximate value)

    • A.

      23.673

    • B.

      26.666

    • C.

      25.133

    • D.

      24.340

    Correct Answer
    B. 26.666
    Explanation
    The correct answer is 26.666. To calculate the variance, first find the mean: the numbers sum to 175, and 175 / 7 = 25. Next, subtract the mean from each number and square the result, giving 25, 1, 0, 121, 0, 9, and 4, which sum to 160. Dividing by n - 1 = 6 (the sample variance) gives 160 / 6 ≈ 26.667, which the answer choice truncates to 26.666. Dividing by n = 7 instead (the population variance) would give about 22.857.
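
    The quiz's variance and standard deviation answers correspond to the sample formulas, which divide by n - 1. A short check with Python's `statistics` module, shown alongside the population variance for comparison:

```python
import statistics

data = [20, 24, 25, 36, 25, 22, 23]

sample_variance = statistics.variance(data)       # divides by n - 1
sample_std_dev = statistics.stdev(data)           # square root of the above
population_variance = statistics.pvariance(data)  # divides by n
```

    Rounded to three decimals these are 26.667, 5.164, and 22.857 respectively.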

  • 33. 

    Which of the following gave rise to the need for graphs in data analysis?

    • A.

      Data visualization

    • B.

      Communicating results

    • C.

      Decision making

    • D.

      All of the above

    Correct Answer
    D. All of the above
    Explanation
    The need for graphs in data analysis arose due to various reasons, including data visualization, communicating results, and decision making. Data visualization helps in representing complex data in a visual format, making it easier to understand and interpret. Communicating results through graphs allows for effective presentation and sharing of information. Graphs also aid in decision making by providing a clear and concise representation of data, enabling better analysis and informed decision making. Therefore, all of the mentioned reasons contributed to the need for graphs in data analysis.

  • 34. 

    Which of the following graphs can be used for simple summarization of data?

    • A.

      Cumulative Frequency

    • B.

      Overlaying

    • C.

      Bar Plot

    • D.

      Frequency Polygon

    Correct Answer
    C. Bar Plot
    Explanation
    A bar plot can be used for simple summarization of data because it visually represents the frequency or count of different categories or groups. It consists of rectangular bars where the length of each bar corresponds to the quantity or value it represents. This type of graph allows for easy comparison between different categories and is particularly useful for displaying categorical data. It provides a clear and concise summary of the data by showing the distribution and relative frequencies of each category.

  • 35. 

    In a Statistical data graph, a ____ is a representation of frequency distribution by means of the four-sided figure whose width represents class intervals and whose areas are directly proportional to the corresponding frequencies.

    • A.

      Bar Plot

    • B.

      Histogram

    • C.

      Box Plot

    • D.

      Line Graph

    Correct Answer
    B. Histogram
    Explanation
    A histogram is a representation of frequency distribution in a statistical data graph. It uses a four-sided figure where the width represents class intervals and the areas of the bars are directly proportional to the corresponding frequencies. This allows for a visual representation of the distribution of the data, making it easier to identify patterns and trends. A histogram is commonly used to display continuous data and is particularly useful for analyzing large data sets.
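
    The counting step behind a histogram, assigning each value to a class interval, can be sketched without any plotting library. The example reuses the number set from the statistics questions in this quiz; the bin start, width, and count are assumptions chosen for the illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_bins:          # ignore values outside the binned range
            counts[idx] += 1
    return counts

data = [20, 24, 25, 36, 25, 22, 23]
# Class intervals: [20, 25), [25, 30), [30, 35), [35, 40)
counts = histogram_counts(data, start=20, width=5, n_bins=4)
```

    With equal-width intervals, bar height and bar area are both proportional to these counts.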

  • 36. 

    Which of the following pieces of information is not given by a box plot?

    • A.

      Mode

    • B.

      Median

    • C.

      Minimum

    • D.

      First quartile

    Correct Answer
    A. Mode
    Explanation
    The mode is not given from a box plot. A box plot displays the minimum, first quartile, median, third quartile, and maximum values of a dataset. The mode, however, represents the most frequently occurring value in the dataset and is not represented in a box plot. Therefore, the mode is not given from a box plot.
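
    The five values a box plot displays can be computed directly. The sketch below reuses the number set from the statistics questions in this quiz and relies on `statistics.quantiles` with its default "exclusive" method; other quartile conventions exist and can give slightly different Q1 and Q3 values.

```python
import statistics

data = [20, 24, 25, 36, 25, 22, 23]

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # default "exclusive" method
    return (min(values), q1, q2, q3, max(values))

summary = five_number_summary(data)
```

    None of the five values records how often any number occurs, which is why the mode cannot be read from a box plot.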

  • 37. 

    Color and shape can be used to add dimensions to graph data.

    • A.

      True

    • B.

      False

    Correct Answer
    A. True
    Explanation
    Color and shape can be used to add dimensions to graph data. By assigning different colors and shapes to different data points, additional information can be conveyed in the graph. For example, different colors can represent different categories or groups, while different shapes can represent different variables or conditions. This helps to visually differentiate and distinguish the data points, making it easier for the viewer to interpret and analyze the graph. Therefore, the statement "Color and shape can be used to add dimensions to graph data" is true.

  • 38. 

    Which of the following dimension types is related to the graphs in the table below?

    Bar plot | Box plot | Density plot | Histogram

    • A.

      One-dimensional

    • B.

      Two-dimensional

    • C.

      Three-dimensional

    • D.

      Four-dimensional

    Correct Answer
    B. Two-dimensional
    Explanation
    The correct answer is two-dimensional. This is because a two-dimensional graph, such as a bar plot, box plot, density plot, or histogram, is commonly used to represent data from a table. These types of graphs allow for the visualization of data in two dimensions, typically with one variable on the x-axis and another variable on the y-axis.

  • 39. 

    Point out the wrong statement:

    • A.

      Plots are created with multiple functions only

    • B.

      Plots are created with both single and multiple function calls

    • C.

      Annotation in plot is not especially intuitive

    • D.

      None of the Mentioned

    Correct Answer
    A. Plots are created with multiple functions only
    Explanation
    Statement A is the wrong one. A plot can be created with a single function call, given the necessary data and arguments, or with multiple function calls that add elements or customize the plot further. Multiple calls are therefore optional, not required, and the accurate statement is "Plots are created with both single and multiple function calls."

  • 40. 

    The most heavily used summarization visualization is the ______, which measures the correlation between every pair of values in a dataset and plots a result in color.

    • A.

      Box Plot

    • B.

      Scatter Plot

    • C.

      Correlation Plot

    • D.

      Parallel Coordinates Plot

    Correct Answer
    C. Correlation Plot
    Explanation
    A correlation plot is a type of visualization that measures the correlation between every pair of values in a dataset and represents it using colors. This plot helps in understanding the relationship between variables and identifying patterns or trends in the data. It is widely used for summarizing and analyzing large datasets to gain insights into the strength and direction of the relationships between variables.
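
    The numbers behind a correlation plot form a matrix of pairwise correlation coefficients, which a plotting tool then maps to colors. The sketch below computes that matrix with a hand-written Pearson correlation; the three columns are hypothetical data and the `pearson` helper is a name introduced for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical columns: y rises with x, z falls as x rises.
columns = {"x": [1, 2, 3, 4], "y": [2, 4, 6, 8], "z": [4, 3, 2, 1]}

# One coefficient for every pair of columns, including each column with itself.
matrix = {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}
```

    Values near 1 and -1 would be drawn in the strongest colors, and values near 0 in the faintest.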

Quiz Review Timeline

  • Current Version
  • Mar 21, 2023
    Quiz Edited by
    ProProfs Editorial Team
  • May 21, 2018
    Quiz Created by
    Kamil Lazim