According to the SnowPro Advanced: Architect documents and learning resources, column masking policies are applied dynamically at query time, based on the role and context of the user who runs the query. If a user is authorized to see unmasked data in a column, they see the original values when they query that column. If they then load that column's data into another column that has no masking policy, the unmasked values are written into the new column, and any user who can query the new column sees the unmasked data as well. The masking policy does not change the underlying data in the protected column; it only transforms query results.
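For illustration, here is a minimal sketch of that behavior, assuming hypothetical object names (customers, email, customers_copy, an ANALYST_FULL role) and an edition that supports masking policies:

```sql
-- Hypothetical objects; requires an edition that supports dynamic data masking.
CREATE OR REPLACE MASKING POLICY email_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() = 'ANALYST_FULL' THEN val   -- authorized role sees real values
    ELSE '***MASKED***'
  END;

ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask;

-- The policy is evaluated at query time, so what this CTAS writes depends on
-- the role that runs it. If that role sees unmasked values, the real emails
-- are copied into customers_copy, which has no masking policy, and every
-- reader of customers_copy then sees them in clear text.
CREATE TABLE customers_copy AS
SELECT customer_id, email
FROM customers;
```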
References:
Snowflake Documentation: Column Masking
Snowflake Learning: Column Masking
Question 21
When using the COPY INTO <table> command with the CSV file format, how does the MATCH_BY_COLUMN_NAME parameter behave?
Options:
A. It expects a header to be present in the CSV file, which is matched to a case-sensitive table column name.
B. The parameter will be ignored.
C. The command will return an error.
D. The command will return a warning stating that the file has unmatched columns.
Option B is the best design to meet the requirements because it uses Snowpipe to ingest the data continuously and efficiently as new records arrive in the object storage, leveraging event notifications. Snowpipe is a service that automates the loading of data from external sources into Snowflake tables [1]. It also uses streams and tasks to orchestrate transformations on the ingested data. Streams are objects that store the change history of a table, and tasks are objects that execute SQL statements on a schedule or when triggered by another task [2]. Option B also uses an external function to do model inference with Amazon Comprehend and write the final records to a Snowflake table. An external function is a user-defined function that calls an external API, such as Amazon Comprehend, to perform computations that are not natively supported by Snowflake [3]. Finally, option B uses the Snowflake Marketplace to make the de-identified final data set available publicly for advertising companies who use different cloud providers in different regions. The Snowflake Marketplace is a platform that enables data providers to list and share their data sets with data consumers, regardless of the cloud platform or region they use [4].
Option A is not the best design because it uses copy into to ingest the data, which is not as efficient and continuous as Snowpipe. Copy into is a SQL command that loads data from files into a table in a single transaction. It also exports the data into Amazon S3 to do model inference with Amazon Comprehend, which adds an extra step and increases the operational complexity and maintenance of the infrastructure.
Option C is not the best design because it uses Amazon EMR and PySpark to ingest and transform the data, which also increases the operational complexity and maintenance of the infrastructure. Amazon EMR is a cloud service that provides a managed Hadoop framework to process and analyze large-scale data sets. PySpark is a Python API for Spark, a distributed computing framework that can run on Hadoop. Option C also develops a python program to do model inference by leveraging the Amazon Comprehend text analysis API, which increases the development effort.
Option D is not the best design because it is identical to option A, except for the ingestion method. It still exports the data into Amazon S3 to do model inference with Amazon Comprehend, which adds an extra step and increases the operational complexity and maintenance of the infrastructure.
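To make option B concrete, here is a minimal sketch of the Snowpipe, stream, task, and external function pieces; every object name, the API integration, and the proxy URL are hypothetical placeholders, not part of the original question:

```sql
-- All object names below (stage, tables, pipe, stream, task, warehouse, API
-- integration, proxy URL) are hypothetical placeholders used only to sketch
-- the moving parts of option B.
CREATE OR REPLACE TABLE raw_events (v VARIANT);           -- raw landing table
CREATE OR REPLACE TABLE final_events (
    id STRING, sentiment VARIANT, ts TIMESTAMP_NTZ);       -- de-identified output

-- Snowpipe loads new files from the external stage as event notifications arrive.
CREATE OR REPLACE PIPE raw_ingest_pipe AUTO_INGEST = TRUE AS
  COPY INTO raw_events
  FROM @my_ext_stage
  FILE_FORMAT = (TYPE = 'JSON');

-- A stream records the rows Snowpipe appends to the raw table.
CREATE OR REPLACE STREAM raw_events_stream ON TABLE raw_events;

-- A hypothetical external function that would forward text to an Amazon
-- Comprehend proxy endpoint for model inference.
CREATE OR REPLACE EXTERNAL FUNCTION detect_sentiment_udf(input_text STRING)
  RETURNS VARIANT
  API_INTEGRATION = comprehend_api_int
  AS 'https://example.execute-api.us-east-1.amazonaws.com/prod/detect-sentiment';

-- A task picks up new rows from the stream on a schedule, calls the external
-- function, and writes the final records.
CREATE OR REPLACE TASK transform_events_task
  WAREHOUSE = transform_wh
  SCHEDULE = '5 MINUTE'
WHEN SYSTEM$STREAM_HAS_DATA('RAW_EVENTS_STREAM')
AS
  INSERT INTO final_events
  SELECT v:event_id::STRING,
         detect_sentiment_udf(v:event_text::STRING),
         v:event_ts::TIMESTAMP_NTZ
  FROM raw_events_stream;

ALTER TASK transform_events_task RESUME;
```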
References:
1: Snowpipe Overview
2: Using Streams and Tasks to Automate Data Pipelines
3: External Functions Overview
4: Snowflake Data Marketplace Overview
[Loading Data Using COPY INTO]
[What is Amazon EMR?]
[PySpark Overview]
The COPY INTO <table> command is used to load data from staged files into an existing table in Snowflake. The command supports various file formats, such as CSV, JSON, AVRO, ORC, PARQUET, and XML [1].
The MATCH_BY_COLUMN_NAME parameter is a copy option that enables loading semi-structured data into separate columns in the target table that match corresponding columns represented in the source data. The parameter can have one of the following values [2]: CASE_SENSITIVE, CASE_INSENSITIVE, or NONE (the default).
The MATCH_BY_COLUMN_NAME parameter only applies to semi-structured data, such as JSON, AVRO, ORC, PARQUET, and XML. It does not apply to CSV data, which is considered structured data [2].
When using the COPY INTO <table> command with the CSV file format, the MATCH_BY_COLUMN_NAME parameter is therefore ignored and the load proceeds as if the option had not been specified, which is why option B is correct [2].
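As a sketch with hypothetical stage and table names, here is the same target table loaded from Parquet (where the copy option matches columns by name) versus from CSV (where loading is positional):

```sql
-- Semi-structured source: MATCH_BY_COLUMN_NAME maps Parquet columns to target
-- columns by name, case-insensitively here.
COPY INTO target_table
FROM @my_stage/parquet_files/
FILE_FORMAT = (TYPE = 'PARQUET')
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;

-- CSV source: columns are matched by position, not by name; the header row is
-- simply skipped rather than interpreted.
COPY INTO target_table
FROM @my_stage/csv_files/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```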
References:
1: COPY INTO <table> | Snowflake Documentation
2: MATCH_BY_COLUMN_NAME | Snowflake Documentation
Question 22
Which Snowflake data modeling approach is designed for BI queries?
In the context of business intelligence (BI) queries, which are typically focused on data analysis and reporting, the star schema is the most suitable data modeling approach.
Option B: Star Schema - The star schema is a type of relational database schema that is widely used for developing data warehouses and data marts for BI purposes. It consists of a central fact table surrounded by dimension tables. The fact table contains the core data metrics, and the dimension tables contain descriptive attributes related to the fact data. The simplicity of the star schema allows for efficient querying and aggregation, which are common operations in BI reporting.
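A minimal sketch of what a star schema looks like in practice, using hypothetical fact and dimension tables and a typical BI aggregation query:

```sql
-- Hypothetical star-schema objects: one fact table plus two dimension tables.
CREATE TABLE dim_customer (
    customer_key  INT,
    customer_name STRING,
    region        STRING
);

CREATE TABLE dim_date (
    date_key       INT,
    calendar_date  DATE,
    fiscal_quarter STRING
);

CREATE TABLE fact_sales (
    customer_key INT,            -- joins to dim_customer
    date_key     INT,            -- joins to dim_date
    quantity     INT,
    sales_amount NUMBER(12,2)
);

-- A typical BI aggregation: join the fact table to its dimensions and group
-- by descriptive attributes.
SELECT d.fiscal_quarter, c.region, SUM(f.sales_amount) AS total_sales
FROM fact_sales f
JOIN dim_customer c ON f.customer_key = c.customer_key
JOIN dim_date d     ON f.date_key = d.date_key
GROUP BY d.fiscal_quarter, c.region;
```

Because every dimension joins directly to the central fact table, BI tools can generate simple star joins like the one above and aggregate them efficiently.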
Question 23
How can the Snowpipe REST API be used to keep a log of data load history?
Options:
A. Call insertReport every 20 minutes, fetching the last 10,000 entries.
B. Call loadHistoryScan every minute for the maximum time range.
C. Call insertReport every 8 minutes for a 10-minute time range.
D. Call loadHistoryScan every 10 minutes for a 15-minute time range.
Snowpipe is a service that automates and optimizes the loading of data from external stages into Snowflake tables. Snowpipe uses a queue to ingest files as they become available in the stage. Snowpipe also provides REST endpoints to load data and retrieve load history reports [1].
The loadHistoryScan endpoint returns the history of files that have been ingested by Snowpipe within a specified time range. The endpoint accepts the following parameters [2]: the fully qualified pipe name (in the request path), startTimeInclusive (required), endTimeExclusive (optional), and requestId (optional).
The loadHistoryScan endpoint can be used to keep a log of data load history by calling it periodically with a suitable time range. The best option among the choices is D: call loadHistoryScan every 10 minutes for a 15-minute time range. This calls the endpoint frequently enough to capture the latest files that have been ingested, and the overlapping 15-minute window is wide enough to avoid missing files that were delayed or retried by Snowpipe. The other options are either too infrequent or too narrow, or they rely on the insertReport endpoint, which retains only the most recent events (roughly a 10-minute, 10,000-event window) and is therefore unsuitable as a durable load log [3].
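The REST endpoints themselves are called from an external client over HTTPS. As a complementary, SQL-side way to inspect roughly the same window (not the REST API itself), here is a sketch using the INFORMATION_SCHEMA.COPY_HISTORY table function, with a hypothetical table name:

```sql
-- Inspect the last 15 minutes of load history for a hypothetical target table.
SELECT file_name, last_load_time, row_count, status, first_error_message
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
    TABLE_NAME => 'RAW_EVENTS',
    START_TIME => DATEADD(minute, -15, CURRENT_TIMESTAMP())
))
ORDER BY last_load_time DESC;
```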
References:
1: Introduction to Snowpipe | Snowflake Documentation
2: loadHistoryScan | Snowflake Documentation
3: Monitoring Snowpipe Load History | Snowflake Documentation