1.
Dividing a database into three parts (a training data set, a validation data set, and a testing data set) is known as:
A.
Data Understanding
B.
Data Partitioning
C.
Association Analysis
D.
Predictive Modeling
Correct Answer
B. Data Partitioning
Explanation Dividing a database into a training data set, a validation data set, and a testing data set is known as data partitioning. The model is fitted on the training set, tuned and compared against alternatives on the validation set, and given a final assessment on the testing set. Because the testing set plays no part in fitting or tuning, accuracy measured on it estimates how well the model generalizes to unseen data.
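As a rough illustration of the partitioning described above, the following sketch shuffles a list of records and splits it into three parts. The function name `partition`, the 60/20/20 proportions, and the fixed seed are assumptions chosen for the example, not something the quiz prescribes.

```python
import random


def partition(rows, train_frac=0.6, valid_frac=0.2, seed=42):
    """Shuffle rows and split them into training, validation, and testing sets."""
    if train_frac <= 0 or valid_frac <= 0 or train_frac + valid_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains is the test set
    return train, valid, test


# Hypothetical data: 100 record identifiers.
records = list(range(100))
train, valid, test = partition(records)
print(len(train), len(valid), len(test))  # 60 20 20
```

Shuffling before slicing matters when the records are stored in a meaningful order (for example, by date), because otherwise the three parts would not be comparable samples.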
2.
What is the correct order of the six CRISP-DM phases?
A.
Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
B.
Data Understanding, Data Preparation, Business Understanding, Modeling, Evaluation, Deployment
C.
Data Understanding, Business Understanding, Data Preparation, Modeling, Evaluation, Deployment
D.
Business Understanding, Data Preparation, Data Understanding, Modeling, Evaluation, Deployment
Correct Answer
A. Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
Explanation The correct order of the six CRISP-DM phases is Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment. The sequence starts with understanding the business objectives and requirements, then moves on to understanding the available data. Once the data is understood, it can be prepared for analysis. Modeling involves building and testing models, and Evaluation checks whether they meet the business objectives. Finally, Deployment puts the models to work in the business process.
3.
What is SAS Enterprise Miner used for?
A.
ONLY for Market Basket Analysis
B.
ONLY for Predictive Modeling
C.
Creating Accurate Descriptive and Predictive Models
D.
None of the above
Correct Answer
C. Creating Accurate Descriptive and Predictive Models
Explanation SAS Enterprise Miner is a software tool used for creating accurate descriptive and predictive models. It is not limited to market basket analysis or to predictive modeling alone. Users can explore and analyze data, build and validate models, and deploy those models to make predictions and gain insights. The software offers a wide range of data mining and machine learning techniques.
4.
What is Market Basket Analysis?
A.
The process of discovering association rules between variables in a dataset.
B.
The process of developing clusters in order to segregate data and discover the relevant categories of data.
C.
Market Basket analysis uses decision trees to predict outcomes.
D.
All of the above.
Correct Answer
A. The process of discovering association rules between variables in a dataset.
Explanation Market Basket Analysis is the process of discovering association rules between variables in a dataset. This technique is commonly used in retail and marketing industries to understand the purchasing patterns of customers. It helps identify relationships between products that are frequently bought together, enabling businesses to make strategic decisions such as product placement, cross-selling, and targeted marketing campaigns. By analyzing the associations between variables, businesses can gain valuable insights into customer behavior and optimize their operations for increased sales and customer satisfaction.
5.
What is Predictive Modeling?
A.
The process of using decision trees to predict certain outcomes.
B.
The process of developing clusters in order to segregate data and discover the relevant categories of data.
C.
The process of discovering association rules between variables in a dataset.
D.
None of the above.
Correct Answer
A. The process of using decision trees to predict certain outcomes.
Explanation Of the options listed, A is the intended answer, although it names only one technique. Predictive modeling is the broader process of using historical data to build a model that predicts an outcome for new cases. Decision trees are one popular algorithm for this, used for both classification and regression, alongside others such as regression models and neural networks. A decision tree represents a sequence of decisions and their possible consequences as a tree-like model. Predictive models are widely used in fields such as finance, marketing, and healthcare to support decisions.
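To make the idea of a tree-based prediction concrete, here is a minimal sketch of the smallest possible decision tree: a single split on one numeric feature, sometimes called a decision stump. The function names and the spend/response data are hypothetical, and real tools grow much deeper trees using criteria such as Gini impurity or entropy rather than the raw error count used here.

```python
def fit_stump(xs, ys):
    """Find the threshold on one numeric feature that best separates two classes (0/1)."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are the midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        for left_label in (0, 1):
            right_label = 1 - left_label
            # Count the training cases this split would misclassify.
            errors = sum(
                (left_label if x <= threshold else right_label) != y
                for x, y in zip(xs, ys)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label, right_label)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    _, threshold, left_label, right_label = best
    return threshold, left_label, right_label


def predict(stump, x):
    """Apply the learned split to a new value."""
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


# Hypothetical data: monthly spend and whether the customer responded (1) or not (0).
spend = [10, 20, 30, 40, 60, 70, 80, 90]
responded = [0, 0, 0, 0, 1, 1, 1, 1]

stump = fit_stump(spend, responded)
print(stump)                                   # (50.0, 0, 1)
print(predict(stump, 25), predict(stump, 75))  # 0 1
```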
6.
What is Data Processing and Analysis?
A.
The overall method of creating models to address real world situations
B.
The method of understanding and identifying the outcome of the data
C.
The method of developing steps and/or methods in order to clarify the data.
D.
None of the above
Correct Answer
A. The overall method of creating models to address real world situations
Explanation Data processing and analysis refers to the overall method of creating models to address real-world situations. This involves using various techniques and tools to collect, organize, and analyze data in order to gain insights and make informed decisions. It includes processes such as data cleaning, transformation, visualization, and statistical analysis to understand patterns, trends, and relationships within the data. By creating models, researchers and analysts can simulate different scenarios and predict outcomes based on the data, helping to solve real-world problems and improve decision-making processes.
7.
Association rules describe the relationships between certain variables in a large database.
A.
True
B.
False
Correct Answer
A. True
Explanation Association rules are used in data mining to discover relationships or patterns between variables in a large database. These rules help to identify the co-occurrence or dependencies between different items or attributes. By analyzing the data, association rules can provide insights into the relationships and associations that exist within the dataset. Therefore, the statement "Association rules describe the relationships between certain variables in a large database" is true.
8.
In market basket analysis, confidence is:
A.
The general measure of association between the two item sets.
B.
The conditional probability that a transaction contains item set B given that it contains item set A
C.
The probability that the two item sets occur together.
D.
The conditional probability that a transaction contains item set B given that it does not contain item set A.
Correct Answer
B. The conditional probability that a transaction contains item set B given that it contains item set A
Explanation Confidence in market basket analysis is the conditional probability that a transaction contains item set B given that it contains item set A. In terms of support, confidence(A → B) = support(A and B together) / support(A). It measures how likely item set B is to be in the basket when item set A is already there, so it indicates how reliable the rule A → B is.
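The conditional probability described above can be computed by counting transactions. The sketch below assumes each transaction is a Python set of item names. The `confidence` function and the five baskets are made up for illustration.

```python
def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket) from transaction counts."""
    a, b = set(antecedent), set(consequent)
    with_a = [t for t in transactions if a <= t]  # baskets containing all of A
    if not with_a:
        raise ValueError("antecedent never occurs, so confidence is undefined")
    with_both = [t for t in with_a if b <= t]     # of those, baskets that also contain B
    return len(with_both) / len(with_a)


# Hypothetical transactions.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

print(confidence(baskets, {"bread"}, {"butter"}))  # 0.75
print(confidence(baskets, {"butter"}, {"bread"}))  # 1.0
```

The two results differ because confidence is directional: every basket with butter also has bread, but not every basket with bread has butter.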
9.
In market basket analysis, support is the general measure of association between the two item sets.
A.
True
B.
False
Correct Answer
B. False
Explanation Support is not a measure of association between two item sets in market basket analysis. Support is the proportion of transactions in which an item set appears, so it measures how frequent the item set is rather than how strongly two item sets are related. Association between item sets is measured with other metrics, such as confidence and lift, which describe the strength of the relationship between items in a transaction. Therefore, the statement that support is a measure of association is false.
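Support as a simple frequency can be sketched as follows, again assuming each transaction is a set of item names. The `support` function and the sample baskets are hypothetical.

```python
def support(transactions, itemset):
    """Return the fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    if not transactions:
        raise ValueError("no transactions supplied")
    return sum(1 for t in transactions if items <= t) / len(transactions)


# Hypothetical transactions.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

print(support(baskets, {"bread"}))            # 0.8
print(support(baskets, {"bread", "butter"}))  # 0.6
```

Note that the second figure only says bread and butter appear together in 60% of baskets. It does not say whether that is more or less than would be expected by chance, which is why support alone does not measure association.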
10.
Which of these is NOT part of the CRISP-DM Data Understanding phase?
A.
Collecting relevant data.
B.
Finding and identifying any problems within the data sets.
C.
Cleaning and addressing any problems with the data sets.
D.
These are all part of the data understanding phase
Correct Answer
C. Cleaning and addressing any problems with the data sets.
Explanation Cleaning and addressing any problems with the data sets is not part of the CRISP-DM Data Understanding phase. The Data Understanding phase includes collecting relevant data and finding and identifying any problems within the data sets. Cleaning and addressing those problems belongs to the Data Preparation phase, which comes after Data Understanding in the CRISP-DM methodology.
11.
When discovering association rules, it is most important to look for rules that generate:
A.
Low support and confidence.
B.
Low support but high confidence and great lift
C.
High support and confidence as well as great lift.
D.
None of the above
Correct Answer
C. High support and confidence as well as great lift.
Explanation When discovering association rules, it is most important to look for rules that have high support and confidence as well as great lift. Support measures how often the item sets in the rule occur together in the data set, confidence measures how reliably the second item set appears when the first is present, and lift compares that co-occurrence with what would be expected if the item sets were independent. A lift above 1 indicates a positive association. Rules with high support and confidence are therefore frequent and reliable, and great lift shows that the association is not simply a by-product of the items being popular.
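The three measures can be combined into a simple filter over candidate rules. The sketch below considers only single-item rules and uses arbitrary cut-offs (support at least 0.5, confidence at least 0.7, lift above 1). The function name, thresholds, and baskets are assumptions for illustration, and practical tools search larger item sets with algorithms such as Apriori.

```python
from itertools import permutations


def rule_metrics(transactions, a, b):
    """Return (support, confidence, lift) for the single-item rule a -> b."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if a in t)
    n_b = sum(1 for t in transactions if b in t)
    n_ab = sum(1 for t in transactions if a in t and b in t)
    support = n_ab / n
    confidence = n_ab / n_a if n_a else 0.0
    # lift = support(a and b) / (support(a) * support(b)), written with counts
    lift = (n_ab * n) / (n_a * n_b) if n_a and n_b else 0.0
    return support, confidence, lift


# Hypothetical transactions.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

items = sorted(set().union(*baskets))
strong = []
for a, b in permutations(items, 2):
    s, c, l = rule_metrics(baskets, a, b)
    # Keep only rules that are frequent, reliable, and positively associated.
    if s >= 0.5 and c >= 0.7 and l > 1.0:
        strong.append((a, b, s, c, l))

for a, b, s, c, l in strong:
    print(f"{a} -> {b}: support={s}, confidence={c}, lift={l}")
# bread -> butter: support=0.6, confidence=0.75, lift=1.25
# butter -> bread: support=0.6, confidence=1.0, lift=1.25
```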