1.
Data skew is when partitions in a database become disproportionately sized?
Correct Answer
A. True
Explanation
Data skew occurs when the partitions in a database have disproportionate sizes. This means that some partitions contain significantly more data than others, leading to an imbalance in the distribution of data across the database. This can have negative effects on performance, as queries may take longer to execute on partitions with larger data sizes. Therefore, the statement "True" is correct as it accurately describes data skew in a database.
2.
Which of the following are true about the variant data type in Snowflake? Select all that apply.
Correct Answer(s)
A. Optimized storage based on repeated elements
C. Can be queried using JSON path notation
Explanation
The variant data type in Snowflake provides optimized storage based on repeated elements, allowing for efficient storage and retrieval of data. Additionally, the variant data type can be queried using JSON path notation, making it easy to extract specific data elements from the variant column. However, it does not require a custom mapping for each record type, as the variant data type can handle different types of data without the need for explicit mapping.
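As an illustration (the table and column names below are hypothetical), a VARIANT column can be queried with path notation and cast to a standard SQL type:

    -- Hypothetical table holding raw JSON documents in a VARIANT column
    CREATE TABLE events (v VARIANT);

    -- Path notation drills into nested elements; :: casts to a SQL type
    SELECT v:device.type::STRING    AS device_type,
           v:payload.items[0]::INT  AS first_item
    FROM events;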
3.
The Snowflake Lodge is a community site that does not allow members to post questions?
Correct Answer
B. False
Explanation
The statement "The Snowflake Lodge is a community site that does not allow members to post questions" is false. This means that members of the Snowflake Lodge community site are allowed to post questions.
4.
When dealing with a large complex query, the user must:
Correct Answer
B. Scale up the cluster
Explanation
When dealing with a large, complex query, scaling up the cluster is the correct approach. In Snowflake, scaling up means resizing the warehouse to a larger size, which gives each query more compute resources (CPU, memory, and local storage) and allows larger, more complex queries to complete faster. Rewriting the query may help optimize its performance, but it is not the primary lever for a single large query. Scaling out, on the other hand, adds clusters to a multi-cluster warehouse, which improves concurrency rather than the performance of an individual large query.
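For example, scaling up in Snowflake is a warehouse resize (the warehouse name is illustrative):

    -- Resize the warehouse so each query gets more compute resources
    ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'XLARGE';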
5.
The closer the ratio of scanned micro-partitions and columnar data is to the ratio of actual data selected, the more efficient is the pruning performed on the table?
Correct Answer
A. True
Explanation
The statement is true because when the ratio of scanned micro-partitions and columnar data closely matches the ratio of actual data selected, it indicates that the pruning performed on the table is efficient. This means that only the necessary data is being scanned and retrieved, leading to faster and more efficient query execution. Therefore, the closer the ratio is, the more efficient the pruning is.
6.
When dealing with query throughput, the user must:
Correct Answer
C. Scale out the cluster
Explanation
Scale out the cluster to allow for more concurrency: adding clusters to a multi-cluster warehouse lets more queries run in parallel instead of queuing.
7.
Snowflake includes administration settings for resource consumption in order to:
Correct Answer
A. Help control costs associated with unexpected credit usage of warehouses
Explanation
Snowflake includes administration settings for resource consumption in order to help control costs associated with unexpected credit usage of warehouses. This means that Snowflake provides tools and settings that allow users to monitor and manage the amount of credits being used by their warehouses. By having control over resource consumption, users can prevent excessive credit usage and avoid unexpected costs.
8.
Some DDL operations are metadata only operations?
Correct Answer
A. True
Explanation
Some DDL operations are metadata-only. For example, deleting all rows in a table is a metadata-only operation.
9.
In a multi-cluster environment, what are the names of the scaling policies that can be selected (select all that apply)?
Correct Answer(s)
B. Standard
C. Economy
Explanation
In a multi-cluster environment, the scaling policies that can be selected are Standard and Economy. Normal and Maximize are not valid scaling policies. Therefore, the correct answers are B. Standard and C. Economy.
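As a sketch (the warehouse name and cluster counts are illustrative; multi-cluster warehouses require Enterprise edition or higher), the policy is set on the warehouse:

    -- Multi-cluster warehouse that adds clusters under load,
    -- conserving credits with the ECONOMY policy
    CREATE WAREHOUSE multi_wh WITH
      WAREHOUSE_SIZE    = 'MEDIUM'
      MIN_CLUSTER_COUNT = 1
      MAX_CLUSTER_COUNT = 3
      SCALING_POLICY    = 'ECONOMY';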
10.
What are the types of tables in Snowflake (select all that apply)?
Correct Answer(s)
B. Transient
C. Permanent
E. Temporary
Explanation
The types of tables in Snowflake are Transient, Permanent, and Temporary. Temporary tables exist only for the duration of the session that created them and are dropped automatically when the session ends. Transient tables persist until explicitly dropped, but they have no Fail-safe period and only limited Time Travel, making them suited to data that does not need full protection. Permanent tables persist until dropped and carry full Time Travel and Fail-safe protection. Therefore, the correct answers are B. Transient, C. Permanent, and E. Temporary.
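For example (table names are illustrative), the table type is chosen at creation time:

    CREATE TEMPORARY TABLE stage_tmp (id INT);  -- dropped when the session ends
    CREATE TRANSIENT TABLE landing   (id INT);  -- persists, but no Fail-safe
    CREATE TABLE core_orders         (id INT);  -- permanent: Time Travel + Fail-safe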
11.
What are the steps in Snowflake pruning?
Correct Answer(s)
B. Eliminate micro-partitions that are not needed for the query
D. Eliminate by column within the remaining micro-partitions
Explanation
Snowflake pruning is a technique used to optimize query performance. The steps are to eliminate micro-partitions that are not needed for the query and then to eliminate by column within the remaining micro-partitions. Snowflake first uses micro-partition metadata (such as the min/max value ranges of each column) to skip micro-partitions that cannot contain matching rows, reducing the amount of data that needs to be scanned. Then, within the remaining micro-partitions, it reads only the columns referenced by the query. This minimizes the amount of data processed and results in faster query execution.
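As an illustration (table and columns are hypothetical, and the comment assumes the table is well-clustered on order_date), a selective filter plus a narrow column list is what lets both pruning steps kick in:

    -- Only micro-partitions whose order_date range overlaps March 2023 are scanned,
    -- and only the two referenced columns are read within them
    SELECT customer_id, amount
    FROM   orders
    WHERE  order_date BETWEEN '2023-03-01' AND '2023-03-31';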
12.
Create Role can be granted within a Snowflake account by the administrator?
Correct Answer
A. True
Explanation
The statement is true because the administrator of a Snowflake account has the authority to grant the "Create Role" privilege to users within the account. This privilege allows users to create new roles, which are used to manage and assign permissions to groups of users. By granting this privilege, the administrator can delegate some administrative tasks and empower users to create and manage roles as needed.
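A minimal sketch of delegating this privilege (the role name is illustrative):

    -- Allow a delegated admin role to create new roles in the account
    GRANT CREATE ROLE ON ACCOUNT TO ROLE useradmin_delegate;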
13.
What is the recommended size of files to be loaded via Snowflake's Snowpipe?
Correct Answer
D. 10-100MB compressed
Explanation
The recommended size of files to be loaded via Snowflake's Snowpipe is 10-100MB compressed. This size range ensures efficient and optimal performance during the loading process. Files within this range can be processed quickly and effectively, minimizing the overall load time and maximizing the system's capabilities.
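As a sketch (the pipe, table, and stage names are hypothetical), a pipe wraps a COPY statement that loads the staged files as they arrive:

    -- Continuously load compressed JSON files from a stage
    CREATE PIPE events_pipe
      AUTO_INGEST = TRUE
      AS COPY INTO raw_events
         FROM @events_stage
         FILE_FORMAT = (TYPE = 'JSON');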
14.
How many resource monitors can you have at the account level?
Correct Answer
B. 1
Explanation
At the account level, you can have only one resource monitor. Additional resource monitors can be created, but they must be assigned to individual warehouses rather than to the account.
15.
Drop Role can be granted within a Snowflake account by the administrator?
Correct Answer
A. True
Explanation
In Snowflake, the administrator has the ability to grant the "DROP ROLE" privilege to users. This privilege allows users to drop or delete roles within the Snowflake account. Therefore, it is true that the "DROP ROLE" privilege can be granted by the administrator in a Snowflake account.
16.
Create User permission can be granted within a Snowflake account by the administrator?
Correct Answer
A. True
Explanation
The explanation for the given correct answer is that the administrator of a Snowflake account has the ability to grant the Create User permission. This permission allows the administrator to create new user accounts within the Snowflake account. By granting this permission, the administrator can control who has access to the account and manage the user accounts effectively.
17.
What is the largest size of a micro-partition?
Correct Answer
C. 16 MB compressed
Explanation
The largest size of a micro-partition is 16 MB when it is compressed. This means that the data within the micro-partition is compressed to reduce its size, allowing for more efficient storage and processing. The compressed size of 16 MB allows for better utilization of storage resources and improved performance in data retrieval and analysis.
18.
Snowflake has three types of caching to optimize performance. Select the three types of caches from the list:
Correct Answer(s)
B. Warehouse
D. Results
F. Metadata
Explanation
Snowflake has three types of caching to optimize performance: Warehouse, Results, and Metadata. The Warehouse (data) cache stores table data read from remote storage on the warehouse's local disk and memory, so repeated reads of the same micro-partitions do not have to go back to remote storage. The Results cache stores the results of recently executed queries, allowing identical queries to return results without re-execution. The Metadata cache stores information about database schemas, table structures, and micro-partition statistics, reducing the need for frequent metadata lookups.
19.
Drop User permission can be granted within a Snowflake account by the administrator?
Correct Answer
A. True
Explanation
In Snowflake, the administrator has the authority to grant the Drop User permission within the account. This permission allows the administrator to delete a user from the Snowflake account. Therefore, the statement "Drop User permission can be granted within a Snowflake account by the administrator" is true.
20.
Which proven data modeling techniques does Snowflake support (check all that apply)?
Correct Answer
E. All of the above
Explanation
Snowflake supports all of the mentioned proven data modeling techniques, including 3rd Normal Form, Data Vault, Star Schema, and Snowflake Schema. This means that users can choose the most suitable modeling technique for their specific data requirements and design their data structures accordingly. Snowflake's flexibility in supporting various modeling techniques allows for efficient and optimized data storage and retrieval based on the specific needs of the organization.
21.
The clustering depth for a table is an absolute or precise measure of whether the table is well-clustered.
Correct Answer
B. False
Explanation
Ultimately, query performance is the best indicator of how well-clustered a table is.
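For reference (the table and column names are illustrative), clustering depth can be inspected with system functions, but query performance remains the real test:

    SELECT SYSTEM$CLUSTERING_DEPTH('orders');
    SELECT SYSTEM$CLUSTERING_INFORMATION('orders', '(order_date)');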
22.
Grant Privilege permission can be granted within a Snowflake account by the administrator?
Correct Answer
A. True
Explanation
In Snowflake, the administrator has the ability to grant privilege permissions within the account. This means that the administrator can assign specific privileges to users or roles, allowing them to perform certain actions or access certain resources within the Snowflake account. Therefore, the statement "Grant Privilege permission can be granted within a Snowflake account by the administrator" is true.
23.
A table that consists of a single micro-partition or no micro-partitions (i.e. an empty table) always has a clustering depth of 0?
Correct Answer
A. True
Explanation
Clustering depth measures the average depth of overlapping micro-partitions for the specified columns. A table that consists of a single micro-partition or no micro-partitions has no overlap to measure, so its clustering depth is always 0.
24.
Snowflake provides specific administration features and capabilities to support the following activities except:
Correct Answer
D. Manage 3rd party applications providing data to a Snowflake account
Explanation
Snowflake provides specific administration features and capabilities to support managing databases and warehouses within a Snowflake account, managing roles and users within a Snowflake account, and monitoring usage and managing resources to control costs in a Snowflake account. However, Snowflake does not provide specific administration features and capabilities to manage 3rd party applications providing data to a Snowflake account.
25.
The use case and audience drives the selection of the proven data modeling techniques?
Correct Answer
A. True
Explanation
The use case and audience play a crucial role in determining the appropriate data modeling techniques to be used. Different use cases may require different levels of complexity and granularity in data modeling. For example, a simple data model may be sufficient for a small-scale application, while a more sophisticated and detailed data model may be necessary for a large enterprise system. Similarly, the audience's level of technical expertise and their specific requirements also influence the selection of data modeling techniques. Therefore, it is true that the use case and audience drive the selection of proven data modeling techniques.
26.
Snowflake tables are logical structures of collections of columns and rows of data physically stored in Snowflake's FDN files (micro-partitions)
Correct Answer
A. True
Explanation
Snowflake tables are indeed logical structures that organize data into columns and rows. These tables are physically stored in Snowflake's FDN files, which are micro-partitions. Micro-partitions are a key feature of Snowflake's architecture, as they enable efficient data storage and query processing. Therefore, the given answer, "True," is correct.
27.
A defined data set that is a point-in-time snapshot and can be updated by customers is?
Correct Answer
A. Snowflake's Cloning
Explanation
Snowflake's Cloning allows customers to create a defined data set, which is a point in time snapshot of their data. This snapshot can then be updated by customers, meaning they can make changes or additions to the data set as needed. This feature enables customers to have control over their data and ensure that it is up to date and accurate.
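For example (table names are illustrative), zero-copy cloning creates an independent, writable point-in-time copy:

    -- Snapshot of the table as it exists right now
    CREATE TABLE orders_snapshot CLONE orders;

    -- Combined with Time Travel: the table as of 24 hours ago
    CREATE TABLE orders_yesterday CLONE orders AT (OFFSET => -86400);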
28.
As a general rule, Snowflake recommends ordering the keys from lowest cardinality to highest cardinality for a clustering key.
Correct Answer
A. True
Explanation
Snowflake recommends ordering the keys from lowest cardinality to highest cardinality for a clustering key because this generally produces better pruning. The lower-cardinality column groups the data into a small number of large, well-defined ranges, and each subsequent, higher-cardinality key refines the ordering within those ranges. Leading with a very high-cardinality column tends to make clustering maintenance more expensive while providing little additional pruning benefit. Therefore, it is advisable to follow this general rule when defining a clustering key in Snowflake.
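As an illustration (the table and columns are hypothetical), a low-cardinality column leads and a higher-cardinality column refines it:

    -- region has few distinct values; order_date has many more
    ALTER TABLE orders CLUSTER BY (region, order_date);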
29.
Cluster keys should be defined with high cardinality columns
Correct Answer
B. False
Explanation
Cluster keys should not be defined directly on high cardinality columns. Columns with a very large number of distinct values (for example, unique IDs or raw timestamps) make clustering maintenance expensive and provide little additional pruning benefit. Instead, cluster keys should be defined on columns with lower cardinality, or on expressions that reduce the cardinality of a high-cardinality column, to get effective pruning at a reasonable maintenance cost.
30.
If you want to use a column with very high cardinality as a clustering key, we recommend defining the key as an expression on the column, rather than on the column directly, to reduce the number of distinct values.
Correct Answer
A. True
Explanation
For example, instead of clustering on c_timestamp directly, use to_date(c_timestamp).
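A minimal sketch of that recommendation (the table and column names are illustrative):

    -- Cluster on the date, not the full timestamp,
    -- to cut the number of distinct clustering values
    ALTER TABLE events CLUSTER BY (TO_DATE(c_timestamp));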
31.
A defined data set that can be shared with customers outside the Snowflake account is?
Correct Answer
B. Snowflake's Data Sharing
Explanation
Snowflake's Data Sharing allows a defined data set to be shared with customers outside the Snowflake account. This feature enables organizations to securely and efficiently share data with external parties, such as customers or partners, without the need for data movement or duplication. It provides real-time access to the shared data, ensuring that all parties have access to the most up-to-date information. With Snowflake's Data Sharing, organizations can collaborate and exchange data seamlessly, enhancing their business relationships and improving decision-making processes.
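As a sketch (the share, database, table, and account names are hypothetical), the provider side of a share looks like this:

    CREATE SHARE sales_share;
    GRANT USAGE  ON DATABASE sales_db               TO SHARE sales_share;
    GRANT USAGE  ON SCHEMA   sales_db.public        TO SHARE sales_share;
    GRANT SELECT ON TABLE    sales_db.public.orders TO SHARE sales_share;
    ALTER SHARE sales_share ADD ACCOUNTS = partner_acct;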
32.
This type of cache lives on the Compute instance?
Correct Answer
D. Warehouse
Explanation
The correct answer is D. Warehouse. The warehouse (data) cache lives on the compute instances of a virtual warehouse. It stores table data read from remote storage on local disk and in memory, so repeated reads of the same micro-partitions do not have to fetch the data from remote storage again, improving query performance.
33.
Which of the following are actions that can be taken from a Resource Monitor trigger (select all that apply)?
Correct Answer(s)
A. Suspend
C. Suspend immediately
D. Notify
Explanation
A Resource Monitor trigger can take three actions: suspend the assigned warehouses after running queries complete, suspend them immediately (cancelling any running queries), or send a notification. Therefore, options A, C, and D are all valid actions that can be taken from a Resource Monitor trigger.
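For example (the monitor name, quota, and thresholds are illustrative), all three actions can be combined on one monitor:

    CREATE RESOURCE MONITOR monthly_rm WITH CREDIT_QUOTA = 100
      TRIGGERS ON 80  PERCENT DO NOTIFY
               ON 95  PERCENT DO SUSPEND
               ON 100 PERCENT DO SUSPEND_IMMEDIATE;

    -- Attach it to a warehouse (or, for the single account-level monitor,
    -- use ALTER ACCOUNT SET RESOURCE_MONITOR = monthly_rm)
    ALTER WAREHOUSE my_wh SET RESOURCE_MONITOR = monthly_rm;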
34.
Query statement encryption is supported on ___________ accounts?
Correct Answer
D. Virtual Private Snowflake (VPS)
Explanation
Query statement encryption is supported on Virtual Private Snowflake (VPS) accounts. This feature provides an additional layer of security by encrypting the queries that are executed on the Snowflake platform. So, it’s true that query statement encryption is supported on Virtual Private Snowflake (VPS) accounts.
35.
To recluster a table, an admin would execute the RECLUSTER command?
Correct Answer
B. False
Explanation
Reclustering is done automatically by Snowflake; manual reclustering has been deprecated.
36.
Metadata cache is used to optimize queries and improve query compile time?
Correct Answer
A. True
Explanation
In Snowflake, the metadata cache lives in the Cloud Services layer and stores information about database objects such as tables, columns, and micro-partitions, including statistics like row counts and min/max column values. The query optimizer uses this information to generate efficient execution plans and to prune micro-partitions, and having it cached means it can be accessed quickly during query compilation. Therefore, the statement that the metadata cache is used to optimize queries and improve query compile time is true.
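As an illustration (the table name is hypothetical), some queries can be answered entirely from micro-partition metadata without scanning table data:

    -- Row counts and column min/max values come from metadata
    SELECT COUNT(*) FROM orders;
    SELECT MIN(order_date), MAX(order_date) FROM orders;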
37.
Warehouse cache may be reset or invalidated if the Virtual Warehouse is suspended and resumed?
Correct Answer
A. True
Explanation
When a Virtual Warehouse is suspended, its compute resources are released, and the local warehouse cache is released along with them. When the warehouse is resumed, it is provisioned with fresh compute resources and an empty (cold) cache that must be rebuilt as queries run. Therefore, the statement "Warehouse cache may be reset or invalidated if the Virtual Warehouse is suspended and resumed" is true.
38.
Caching techniques are supported by Snowflake's performance optimizing query methods?
Correct Answer
A. True
Explanation
Snowflake's performance optimizing query methods do support caching techniques. Caching is a technique used to store frequently accessed data in a cache, which allows for faster retrieval and reduces the need to access the underlying storage system. By utilizing caching techniques, Snowflake can improve query performance by retrieving data from the cache instead of fetching it from the disk. This helps to reduce latency and improve overall query response times, making the statement "True" accurate.
39.
Semi-structured data can be queried using SQL while incorporating JSON path notation?
Correct Answer
A. True
Explanation
Semi-structured data refers to data that does not have a strict schema or structure, such as JSON or XML. SQL is a standard query language used to manage and manipulate structured data in relational databases. However, with the increasing popularity of semi-structured data formats like JSON, SQL has also evolved to support querying and manipulating such data using JSON path notation. This allows users to query and extract specific data elements from semi-structured data using familiar SQL syntax. Therefore, the statement that semi-structured data can be queried using SQL while incorporating JSON path notation is true.
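For example (the table and column names are hypothetical), nested arrays can be combined with ordinary SQL using path notation and FLATTEN:

    -- Turn each element of the nested "events" array into a row
    SELECT r.v:user.id::STRING   AS user_id,
           e.value:name::STRING  AS event_name
    FROM   raw_events r,
           LATERAL FLATTEN(input => r.v:events) e;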
40.
B-tree type indexes are supported by Snowflake's performance optimizing query methods?
Correct Answer
B. False
Explanation
B-tree type indexes are not supported by Snowflake. Instead of traditional indexes, Snowflake relies on micro-partition metadata and pruning (optionally guided by clustering keys) to optimize query performance.
41.
In general, you should try to match the size of the warehouse to the expected size and complexity of the queries to be processed by the warehouse?
Correct Answer
A. True
Explanation
It is important to match the size of the warehouse to the expected size and complexity of the queries to be processed. If the warehouse is too small, it may not be able to handle the volume of data and the complexity of the queries efficiently, leading to slower processing times and potential performance issues. On the other hand, if the warehouse is too large for the expected workload, it may result in unnecessary costs and resources. Therefore, it is crucial to carefully consider the size and complexity of the queries when determining the appropriate size of the warehouse.
42.
SQL functionality can be extended via (select all of the appropriate answers):
Correct Answer(s)
A. SQL User Functions (UDF)
B. Javascript UDFs
C. Session Variables
Explanation
SQL functionality can be extended via SQL User Functions (UDF), Javascript UDFs, and Session Variables. SQL User Functions (UDF) allow users to define their own functions in SQL that can be used in queries. Javascript UDFs enable users to write custom functions using Javascript code that can be called within SQL queries. Session Variables are variables that can be set and used within a session to store temporary data. Therefore, options A, B, and C are all correct as they all provide ways to extend the functionality of SQL.
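Minimal sketches of each mechanism (names and logic are illustrative):

    -- SQL UDF
    CREATE FUNCTION area_of_circle(radius FLOAT)
      RETURNS FLOAT
      AS 'pi() * radius * radius';

    -- JavaScript UDF (arguments are referenced in uppercase inside the body)
    CREATE FUNCTION add_one(x FLOAT)
      RETURNS FLOAT
      LANGUAGE JAVASCRIPT
      AS 'return X + 1;';

    -- Session variable, referenced with $
    SET min_order_date = '2023-01-01';
    SELECT * FROM orders WHERE order_date >= $min_order_date;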
43.
This type of cache lives on the Cloud Services layer?
Correct Answer
C. Metadata
Explanation
The correct answer is C. Metadata. The metadata cache lives in the Cloud Services layer. It holds statistics and structural information about databases, tables, columns, and micro-partitions, which the optimizer uses during query compilation and which allows some queries to be answered without scanning table data. Keeping this information cached improves performance by reducing repeated metadata lookups.
44.
A customer needs to have a preexisting account with AWS or Microsoft Azure in order to have a Snowflake account or to use Snowflake?
Correct Answer
B. False
Explanation
False. A customer does not need to have a preexisting account with AWS or Microsoft Azure in order to have a Snowflake account or to use Snowflake. Snowflake is a separate cloud-based data warehousing platform that can be used independently of AWS or Azure. While Snowflake can be integrated with AWS or Azure for additional functionality, it is not a requirement to have an account with either of these platforms to use Snowflake.
45.
How many availability zones does Snowflake replicate to for disaster recovery and high availability?
Correct Answer
C. Three
Explanation
Snowflake automatically replicates data across three availability zones for each account. This is built in and fully managed by Snowflake.
46.
The warehouse cache stores the results of the queries that have been executed for 24 hours unless the underlying data changes, at which point the entry is invalidated?
Correct Answer
B. False
Explanation
The given statement is false. It is the Results cache, not the warehouse cache, that stores the results of executed queries for 24 hours (and invalidates the entry if the underlying data changes). The warehouse cache stores raw table data read from remote storage on the warehouse's local disk and memory.
47.
Retrieving results of previous query from cache is supported by Snowflake's performance optimizing query methods?
Correct Answer
A. True
Explanation
Snowflake's performance optimizing query methods include the ability to retrieve results of previous queries from cache. This means that if a query has already been executed and the results are stored in the cache, subsequent queries can retrieve the results from the cache instead of re-executing the query. This can significantly improve query performance and reduce the amount of data that needs to be processed. Therefore, the statement "Retrieving results of previous query from cache is supported by Snowflake's performance optimizing query methods" is true.
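Result reuse is on by default; for benchmarking it can be toggled per session with the session parameter shown below:

    ALTER SESSION SET USE_CACHED_RESULT = FALSE;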
48.
Snowflake security supports user-based access control.
Correct Answer
A. True
Explanation
Snowflake security supports user-based access control, meaning that it allows administrators to grant or restrict access to specific users or groups. This feature ensures that only authorized individuals can access and manipulate data within the Snowflake system. By implementing user-based access control, Snowflake enhances data security and helps organizations maintain strict control over their data assets.
49.
Federated authentication in Snowflake is compliant with SAML 2.0?
Correct Answer
A. True
Explanation
Federated authentication in Snowflake is compliant with SAML 2.0. This means that Snowflake supports the use of SAML 2.0 for federated authentication, allowing users to authenticate to Snowflake using their existing SAML 2.0 identity provider. SAML 2.0 is an industry-standard protocol for exchanging authentication and authorization data between parties, making it a widely supported and secure option for federated authentication.
50.
A best practice for loading and storing semi-structured data in Snowflake is to parse the semi-structured string into structured columns on source data load?
Correct Answer
B. False
Explanation
False. Instead of parsing the semi-structured string into structured columns at load time, the data should be loaded and stored in a VARIANT column.
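A minimal sketch of that best practice (the table, stage, and file format are illustrative):

    -- Land the raw documents as-is in a VARIANT column...
    CREATE TABLE raw_events (v VARIANT);

    COPY INTO raw_events
      FROM @events_stage
      FILE_FORMAT = (TYPE = 'JSON');

    -- ...and shape them with SQL at query time
    SELECT v:order_id::STRING, v:total::NUMBER(10,2) FROM raw_events;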