provocationofmind.com

Unlocking Sensitive Data Security with Snowflake's New UI


At Snowflake, our mission is to empower customers to fully leverage their data while adhering to strict compliance requirements and ensuring the protection of sensitive information. Recognizing the critical need for quick and effective identification of sensitive data, we consistently innovate with features such as classification, tag-based policies, and an intuitive Data Governance UI.

We are thrilled to announce the launch of the new Snowflake Data Classification UI in Snowsight, now available to all users. This feature simplifies the identification and tagging of sensitive information, streamlining data management and safeguarding processes for organizations. We will start with a brief overview, followed by an examination of the classification process using SQL and API calls, before delving into the specifics of this new user interface.

Simplifying Data Identification and Protection

Snowflake’s Data Classification feature equips organizations to effectively locate and label sensitive data. By automating the detection and tagging of personal information, users can bolster their data security while ensuring compliance with privacy laws.

Main Benefits

  • Control Over Data Access: Enables informed decisions regarding who has access to sensitive information.
  • Data Sharing Insights: Assists in recognizing Personally Identifiable Information (PII), facilitating better choices about third-party data sharing.
  • Policy Enforcement: Aids in the implementation of masking or row access policies to protect sensitive information.
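
As an illustration of the policy enforcement point, a masking policy can be attached to a column that classification has flagged as sensitive. This is a minimal sketch, not a recommended design: the database, schema, table, column, policy, and role names (my_db, my_schema, hr_data, fname, name_mask, HR_ADMIN) are all hypothetical, and masking policies require an edition of Snowflake that supports them.

```sql
-- Hypothetical policy: only sessions with the HR_ADMIN role see real values.
CREATE MASKING POLICY IF NOT EXISTS my_db.my_schema.name_mask
  AS (val STRING) RETURNS STRING ->
  CASE
    WHEN IS_ROLE_IN_SESSION('HR_ADMIN') THEN val
    ELSE '***MASKED***'
  END;

-- Attach the policy to a column that was tagged as personal data.
ALTER TABLE my_db.my_schema.hr_data
  MODIFY COLUMN fname
  SET MASKING POLICY my_db.my_schema.name_mask;
```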

Functionality of the Feature

The Data Classification feature in Snowflake scans table columns to identify personal and sensitive information, applying predefined system tags to the detected data. These tags are divided into two primary categories:

  • Semantic Category (SNOWFLAKE.CORE.SEMANTIC_CATEGORY): Marks personal attributes such as names, ages, or phone numbers.
  • Privacy Category (SNOWFLAKE.CORE.PRIVACY_CATEGORY): Rates how sensitive a personal attribute is, using the values IDENTIFIER, QUASI_IDENTIFIER, or SENSITIVE.

These tags play an essential role for data engineers in monitoring and safeguarding information, ensuring compliance with privacy regulations.
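
Once the system tags are applied, they can be queried like any other tag. The sketch below lists every column tagged as a direct identifier, assuming the current role can read the SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES view. Account Usage views are populated with some latency, so recently applied tags may not appear immediately.

```sql
-- List columns that classification tagged as direct identifiers.
SELECT object_database,
       object_schema,
       object_name,
       column_name,
       tag_value
FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES
WHERE tag_database = 'SNOWFLAKE'
  AND tag_schema = 'CORE'
  AND tag_name = 'PRIVACY_CATEGORY'
  AND tag_value = 'IDENTIFIER'
  AND domain = 'COLUMN'
  AND object_deleted IS NULL  -- skip dropped objects
ORDER BY object_database, object_schema, object_name, column_name;
```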

Classification Management via SQL

To utilize Snowflake’s classification process, users need specific privileges, including roles with SELECT and APPLY TAG rights. Snowflake provides built-in views and functions to help track classification activities and tag assignments.
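
The privileges described above might be granted along the following lines. This is a sketch under assumptions: the role, warehouse, and object names (data_engineer, my_wh, my_db, my_schema, hr_data) are hypothetical, and the exact set of grants your account needs may differ.

```sql
-- Let the role reach and read the table to be classified.
GRANT USAGE ON DATABASE my_db TO ROLE data_engineer;
GRANT USAGE ON SCHEMA my_db.my_schema TO ROLE data_engineer;
GRANT SELECT ON TABLE my_db.my_schema.hr_data TO ROLE data_engineer;

-- Let the role set tags on objects in the account.
GRANT APPLY TAG ON ACCOUNT TO ROLE data_engineer;

-- Classification runs on a warehouse.
GRANT USAGE ON WAREHOUSE my_wh TO ROLE data_engineer;

-- Optional: read access to governance views in the SNOWFLAKE database.
GRANT DATABASE ROLE SNOWFLAKE.GOVERNANCE_VIEWER TO ROLE data_engineer;
```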

Classification can be performed through SQL commands or via the Snowsight interface, offering flexibility in data management. Users can opt to classify individual tables or conduct asynchronous classifications for all tables within a schema.

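  • To classify a specific table, call the SYSTEM$CLASSIFY stored procedure with the table name and an options object. Setting auto_tag to true applies the recommended tags automatically.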

CALL SYSTEM$CLASSIFY('<table_name>', {'auto_tag': true});

  • To view tag assignments, query TAG_REFERENCES_ALL_COLUMNS.

SELECT * FROM TABLE(<db_name>.INFORMATION_SCHEMA.TAG_REFERENCES_ALL_COLUMNS('<table_name>', 'table'));
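
For the schema-wide option mentioned earlier, Snowflake provides the SYSTEM$CLASSIFY_SCHEMA stored procedure. The sketch below uses hypothetical object names (my_db.my_schema, hr_data). The second call shows how a single table could be classified without tagging, so the recommendations can be reviewed first.

```sql
-- Classify every table in the schema and apply the recommended tags.
CALL SYSTEM$CLASSIFY_SCHEMA('my_db.my_schema', {'auto_tag': true});

-- Classify one table without applying tags, to review the results first.
CALL SYSTEM$CLASSIFY('my_db.my_schema.hr_data', null);
```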

Data Classification via API

While the Data Classification UI in Snowsight provides an easy method for managing and applying tags to sensitive information, advanced users may prefer Snowflake’s traditional APIs for more detailed control. These APIs, including EXTRACT_SEMANTIC_CATEGORIES and ASSOCIATE_SEMANTIC_CATEGORY_TAGS, facilitate comprehensive data classification and tag management.

Overview of Traditional APIs

The classic APIs for data classification cater to users who want additional control or wish to automate their classification workflows. These APIs remain functional but no longer receive new features. For more complex requirements, the following APIs are available:

  1. EXTRACT_SEMANTIC_CATEGORIES: Analyzes table columns to identify and extract semantic categories, such as names or ages.
  2. ASSOCIATE_SEMANTIC_CATEGORY_TAGS: Assigns classification tags to columns based on the results from EXTRACT_SEMANTIC_CATEGORIES.

Classifying Data with Traditional APIs

1. Classify a Single Table

To classify a specific table, follow these steps:

  • Analyze: Execute the EXTRACT_SEMANTIC_CATEGORIES function to identify the semantic categories of the columns.

SELECT EXTRACT_SEMANTIC_CATEGORIES('my_db.my_schema.hr_data');

  • Review: Check the output to ensure the categories are accurate.
  • Apply: Use the ASSOCIATE_SEMANTIC_CATEGORY_TAGS stored procedure to automatically apply tags.

CALL ASSOCIATE_SEMANTIC_CATEGORY_TAGS('my_db.my_schema.hr_data', EXTRACT_SEMANTIC_CATEGORIES('my_db.my_schema.hr_data'));

Alternatively, tags can be applied manually using an ALTER TABLE statement.

ALTER TABLE my_db.my_schema.hr_data
  MODIFY COLUMN fname
  SET TAG SNOWFLAKE.CORE.SEMANTIC_CATEGORY = 'NAME';
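
For the review step, the JSON returned by EXTRACT_SEMANTIC_CATEGORIES can be flattened into one row per column, which is easier to scan than the raw object. This sketch assumes the output nests each column's result under a "recommendation" key with "semantic_category", "privacy_category", and "confidence" fields. Check the shape of the output in your account before relying on these paths.

```sql
-- One row per column, showing the recommended categories and confidence.
SELECT f.key::VARCHAR AS column_name,
       f.value:"recommendation":"semantic_category"::VARCHAR AS semantic_category,
       f.value:"recommendation":"privacy_category"::VARCHAR AS privacy_category,
       f.value:"recommendation":"confidence"::VARCHAR AS confidence
FROM TABLE(
       FLATTEN(EXTRACT_SEMANTIC_CATEGORIES('my_db.my_schema.hr_data')::VARIANT)
     ) AS f
ORDER BY column_name;
```

Columns where the classifier found no category return NULL in the category fields, which makes them easy to filter out or inspect separately.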

Utilizing the Classification API unlocks powerful capabilities. For further details, refer to the official documentation linked below.

Data Classification via Snowsight

Within Snowsight, users can initiate a data classification job across an entire schema or select specific tables. This method allows for larger-scale classifications without needing to analyze results on a per-table basis. An auto-tagging feature is also available, enabling high-confidence classifiers to automatically tag objects, simplifying the classification process.

Users can choose to apply these tags automatically or review them manually prior to application. The user-friendly interface provides a clear overview of classification outcomes, facilitating the management and protection of sensitive information.

Steps to Classify and Tag Tables in a Schema:

  1. Start Classification and Tagging:
    • Open Snowsight and navigate to the desired schema via the object explorer.
    • Access options by selecting the More menu (...).
    • Choose Classify and Tag Sensitive Data.
  2. Select Warehouse and Tables:
    • If not already in use, select a warehouse.
    • Choose the tables you wish to classify. By default, no tables are preselected.
  3. Configure Advanced Options:
    • Auto-tagging Data: Automatically applies tags to columns after classification. This is enabled by default but can be disabled if needed.
    • Include Custom Classifiers: Utilize custom classifiers available to you. Check access by selecting View custom classifiers and running the provided command in a worksheet.
    • After reviewing and adjusting these settings, select Classify and Tag Sensitive Data. Note that Snowsight can classify up to 1,000 tables.
  4. Monitor and Review Classification:
    • Allow the classification process to complete. A green checkmark will indicate completion in the CLASSIFICATION column.
    • Click View Results to inspect and, if necessary, modify tag values. Ensure you have the necessary privileges on the SNOWFLAKE database to make changes.
  5. Finalize Classification:
    • Follow prompts to review and approve classification results. Adjust tag values as necessary.
    • Select Complete Classification to apply the reviewed tags.
  6. Verify Tag Assignments:
    • To confirm tag assignments, select the table, navigate to the Columns tab, and review the TAGS column.
    • Alternatively, use a worksheet to invoke the TAG_REFERENCES_ALL_COLUMNS function to view tag assignments for specific columns.
  7. Review Classification Records:
    • Consult the Account Usage DATA_CLASSIFICATION_LATEST view for records by navigating through the object explorer or querying in a worksheet.
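
The last step above can be run from a worksheet with a query like the following. It is a sketch only: MY_DB and MY_SCHEMA are hypothetical names, and the selected column names are assumed from the view's documented layout, so verify them against your account.

```sql
-- Most recent classification result for each table in one schema.
SELECT database_name,
       schema_name,
       table_name,
       last_classified_on,
       result
FROM SNOWFLAKE.ACCOUNT_USAGE.DATA_CLASSIFICATION_LATEST
WHERE database_name = 'MY_DB'
  AND schema_name = 'MY_SCHEMA'
ORDER BY last_classified_on DESC;
```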

Best Practices for Classification

  • Prioritization: Classify frequently accessed data objects first.
  • Consistency: Employ clear, consistent column names and appropriate data types to improve classification accuracy.

Conclusion

Snowflake’s Data Classification feature helps organizations identify, label, and protect sensitive data. By automating detection and tagging, it lets teams spend less time locating personal information and more time analyzing data, while staying compliant with data privacy standards. Whether you use the Snowsight UI or SQL commands, classification feeds directly into data governance and security controls such as masking and row access policies.

For further information, visit: https://docs.snowflake.com/en/user-guide/classify-intro
