The editorial team at ProProfs Quizzes consists of a select group of subject experts, trivia writers, and quiz masters who have authored over 10,000 quizzes taken by more than 100 million users. This team includes our in-house seasoned quiz moderators and subject matter experts. Our editorial experts, spread across the world, are rigorously trained using our comprehensive guidelines to ensure that you receive the highest quality quizzes.
This quiz covers data analysis and data mining topics, including association rules, the Apriori algorithm, CRISP-DM, and decision trees. Take it to test your knowledge!
Questions and Answers
1.
Association rules provide information in the form of "if-then" statements.
A.
True
B.
False
Correct Answer
A. True
Explanation Association rules are used in data mining to discover relationships or patterns in datasets. These rules are typically in the form of "if-then" statements, where the antecedent represents the condition or itemset, and the consequent represents the outcome or result. By analyzing large datasets, association rules can provide valuable insights and information about the relationships between different items or variables. Therefore, the statement "Association rules provide information in the form of 'if-then' statements" is true.
2.
Support is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
A.
True
B.
False
Correct Answer
B. False
Explanation The statement is false because it describes confidence, not support. Support is not a conditional probability: it is the proportion of all transactions that contain every item in both the antecedent and the consequent. The conditional probability that a transaction includes the consequent, given that it includes the antecedent, is the confidence of the rule.
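As a minimal sketch of how support can be computed, the example below counts how many transactions contain a given itemset. The toy transactions and the function name `support` are illustrative assumptions, not part of the quiz.

```python
# Hypothetical toy transactions; the item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of ALL transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


# The support of the rule {bread} -> {milk} is the support of {bread, milk}:
# the antecedent and consequent items taken together.
rule_support = support({"bread", "milk"}, transactions)
print(rule_support)  # 3 of the 5 transactions contain both items -> 0.6
```

Note that the denominator is the total number of transactions, not the number of transactions containing the antecedent.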
3.
Confidence is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
A.
True
B.
False
Correct Answer
A. True
Explanation The given statement is true. Confidence is indeed the conditional probability that a randomly selected transaction will include all the items in the consequent, given that the transaction includes all the items in the antecedent. In other words, it measures the likelihood of the consequent occurring when the antecedent is present in a transaction.
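The definition above can be sketched in a few lines. The toy transactions and the function name `confidence` are assumptions made for illustration. The denominator counts only the transactions that contain the antecedent.

```python
# Hypothetical toy transactions; the item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent in transaction | antecedent in transaction)."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= t]
    if not with_antecedent:
        return 0.0  # the rule never applies, so report zero confidence
    both = sum(1 for t in with_antecedent if consequent <= t)
    return both / len(with_antecedent)


# Confidence is not symmetric: swapping antecedent and consequent changes it.
butter_to_bread = confidence({"butter"}, {"bread"}, transactions)
bread_to_butter = confidence({"bread"}, {"butter"}, transactions)
print(butter_to_bread)  # both butter transactions include bread -> 1.0
print(bread_to_butter)  # 2 of the 4 bread transactions include butter -> 0.5
```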
4.
One major shortcoming of association analysis is that the support-confidence framework often:
A.
Generates too many rules.
B.
Generates too few rules.
C.
Generates inaccurate rules.
Correct Answer
A. Generates too many rules.
Explanation Association analysis is a data mining technique used to discover patterns and relationships between items in a dataset. The support-confidence framework is commonly used in association analysis to generate rules based on the frequency of itemsets and the confidence of the associations. However, one major shortcoming of this framework is that it often generates too many rules. This occurs because the number of candidate itemsets, and of the rules that can be formed from them, grows exponentially with the number of items, so even moderate support and confidence thresholds can let through a very large number of rules, many of them redundant or uninteresting.
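To give a rough sense of scale, the sketch below counts how many distinct rules X -> Y (with X and Y non-empty and disjoint) can be formed from d items before any threshold is applied. It checks an enumeration against the closed form 3^d - 2^(d+1) + 1. The function names are illustrative assumptions.

```python
from itertools import combinations


def count_rules_by_enumeration(d):
    """Count rules X -> Y over d items, with X and Y non-empty and disjoint."""
    total = 0
    for k in range(2, d + 1):
        for _itemset in combinations(range(d), k):
            # Every k-itemset can be split into antecedent/consequent in
            # 2**k - 2 ways (all subsets except the empty and the full set).
            total += 2 ** k - 2
    return total


def count_rules_closed_form(d):
    """Closed form for the same count: 3^d - 2^(d+1) + 1."""
    return 3 ** d - 2 ** (d + 1) + 1


for d in (3, 6, 10):
    assert count_rules_by_enumeration(d) == count_rules_closed_form(d)
    print(d, count_rules_closed_form(d))
# Prints:
# 3 12
# 6 602
# 10 57002
```

Minimum support and confidence thresholds prune this space, but with many items the surviving rule set can still be too large to inspect by hand.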
5.
Which data mining method is used to analyse transaction data?
A.
Clustering
B.
Classification
C.
Market Basket (Association)
Correct Answer
C. Market Basket (Association)
Explanation Market Basket (Association) is the correct answer because it is a data mining method specifically designed to analyze transaction data. This method is used to identify patterns and relationships between items that are frequently purchased together. It helps in understanding customer behavior, making recommendations, and improving marketing strategies. Clustering and classification are also data mining methods but they are not specifically focused on analyzing transaction data.
6.
The Apriori algorithm is used for which of the following data mining tasks?
A.
Classification
B.
Clustering
C.
Association
Correct Answer
C. Association
Explanation The Apriori algorithm is specifically designed for association rule mining, which involves discovering relationships or associations between different items in a dataset. It helps identify frequent itemsets and generate association rules based on the support and confidence measures. This algorithm is not suitable for classification or clustering tasks, as it focuses solely on finding associations between items.
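The frequent-itemset step can be sketched as follows. This is a simplified, unoptimized illustration of the level-wise idea behind Apriori (join, prune, then count support), not a production implementation. The toy data and the `min_support` value are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frequent itemset: support} using level-wise candidate generation."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (the Apriori property): every (k-1)-subset of a
        # frequent itemset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical toy transactions; the item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
frequent = apriori(transactions, min_support=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With these assumed inputs, only bread, milk, and the pair of bread and milk meet the 0.6 threshold. Rules would then be generated from the frequent itemsets and filtered by confidence.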
7.
How many phases does the Cross-Industry Standard Process for Data Mining (CRISP-DM) consist of?
A.
3
B.
4
C.
6
D.
7
Correct Answer
C. 6
Explanation The Cross-Industry Standard Process for Data Mining (CRISP-DM) consists of six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Each phase plays a crucial role in the overall data mining process, from understanding the business objectives, through gathering and analyzing the data and building and evaluating models, to deploying the results. These six phases ensure a systematic and comprehensive approach to data mining, enabling organizations to extract valuable insights and make informed decisions based on the data.
8.
Pick the correct order of the six phases of CRISP-DM.
A.
Data Preparation, Modelling, Deployment, Evaluation, Business Understanding, Data Understanding
B.
Business Understanding, Data modelling, Data Understanding, Evaluation, Data Preparation, Deployment
C.
Data Understanding, Business Understanding, Data Modelling, Evaluation, Deployment, Data Preparation
D.
Business understanding, Data Understanding, Data Preparation, Data Modelling, Evaluation, Deployment
Correct Answer
D. Business understanding, Data Understanding, Data Preparation, Data Modelling, Evaluation, Deployment
Explanation The correct order of the 6 phases of the CRISP-DM (Cross-Industry Standard Process for Data Mining) is as follows: Business understanding, Data Understanding, Data Preparation, Data Modelling, Evaluation, Deployment. This order ensures that the business goals and objectives are clearly defined and understood before proceeding to understand the available data. Once the data is understood, it is prepared for analysis and modelling. The models are then evaluated to assess their effectiveness and accuracy. Finally, the successful models are deployed for use in the business operations.
9.
Without proper data processing it is very difficult to select the appropriate model for the data.
A.
True
B.
False
Correct Answer
A. True
Explanation Data processing is an essential step in selecting the appropriate model for the data. Without proper data processing, the data may contain inconsistencies, errors, or irrelevant information, which can lead to inaccurate model selection. Data processing involves cleaning, transforming, and analyzing the data to ensure its quality and relevance. By properly processing the data, one can identify patterns, relationships, and characteristics that can inform the selection of an appropriate model for the data. Therefore, without proper data processing, it is indeed difficult to select the appropriate model for the data.
10.
In a transaction data set, we have three variables:
(1) Customer Identification
(2) Product bought
(3) Order of product purchase
What model role must we assign to (2), i.e., Product bought?
A.
ID
B.
Sequence
C.
Target
Correct Answer
C. Target
Explanation In an association (market basket) analysis of this data set, "Customer Identification" takes the ID role because it groups rows into transactions, and "Order of product purchase" takes the Sequence role because it records the order in which items were bought. "Product bought" is the variable whose patterns we want to analyze and predict, so it should be assigned the Target role.
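As a rough illustration of how the three roles are used, the sketch below groups rows by the ID variable into baskets of target items, and uses the sequence variable to order each customer's purchases. The rows and column layout are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (customer_id, product_bought, purchase_order).
rows = [
    (1, "bread", 1),
    (1, "milk", 2),
    (2, "milk", 1),
    (2, "eggs", 2),
    (2, "bread", 3),
]

baskets = defaultdict(set)     # ID -> set of target items (association analysis)
sequences = defaultdict(list)  # ID -> ordered target items (sequence analysis)

# Sort by ID, then by the sequence variable, so each list keeps purchase order.
for customer_id, product, order in sorted(rows, key=lambda r: (r[0], r[2])):
    baskets[customer_id].add(product)
    sequences[customer_id].append(product)

print(dict(baskets))
print(dict(sequences))
```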
11.
In data mining, a decision tree is a predictive model.
A.
True
B.
False
Correct Answer
A. True
Explanation A decision tree is a predictive model in data mining because it uses a tree-like structure to represent decisions and their possible consequences. It starts with a root node and branches out into different paths based on different conditions or attributes. Each branch represents a decision or outcome, and the final leaves of the tree represent the predicted results. By following the path from the root to a leaf, it is possible to predict the outcome or make a decision based on the given input. Therefore, the statement "a decision tree is a predictive model" is true.
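The root-to-leaf traversal described above can be sketched with a small hand-built tree. The features, thresholds, and predictions here are invented for illustration. In practice a tree would be learned from data rather than written by hand.

```python
# Hypothetical tree: internal nodes test one feature against a threshold,
# and leaves hold the predicted outcome.
tree = {
    "feature": "income",
    "threshold": 50000,
    "left": {"prediction": "no"},   # taken when income <= 50000
    "right": {                      # taken when income > 50000
        "feature": "age",
        "threshold": 30,
        "left": {"prediction": "no"},    # age <= 30
        "right": {"prediction": "yes"},  # age > 30
    },
}


def predict(node, record):
    """Follow the branches from the root until a leaf is reached."""
    while "prediction" not in node:
        goes_left = record[node["feature"]] <= node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["prediction"]


print(predict(tree, {"income": 60000, "age": 45}))  # yes
print(predict(tree, {"income": 40000, "age": 45}))  # no
```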