Cost Governance of Snowflake Connector for SharePoint¶
Note
The Snowflake Connector for SharePoint is subject to the Connector Terms.
This topic provides best practices for cost governance and finding the optimal warehouse size for the Snowflake Connector for SharePoint.
Measuring Cost of the Connector¶
If the connector has a separate account only for data ingestion and storage, and the account shows no other activity (such as executing queries by users using the ingested data), you can read the overall cost on the account level. To learn more, refer to Exploring overall cost.
If the account is not dedicated only to the connector, or you need to investigate the costs further, you should analyze the charged costs for the three components separately: compute, storage, and data transfer.
For an introduction to these three components of cost, refer to Understanding overall cost.
General Recommendations¶
To isolate the cost generated by the connector, we recommend that you create a separate account used solely by the connector. This way you can track the exact data transfer generated by the connector.
If you cannot use a separate account for the connector, try the following:
Create a separate database for storing ingested data, to make storage cost easier to track.
Allocate a warehouse only for the connector to get the exact compute cost.
Use object tags on databases and a warehouse to build custom cost reports.
Compute Cost¶
We recommend that you create a separate warehouse used only by the connector. This setup allows you to create resource monitors on the warehouse. You can use the monitors to send email alerts and suspend the warehouse, which stops the connector when the configured credit quota is exceeded. The connector resumes automatically after the credit quota is renewed. Note that setting the credit quota too low in configurations that ingest large volumes of data may prevent the connector from ingesting all of the data.
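As an illustration, a resource monitor along these lines could cap the connector warehouse's spend. The monitor and warehouse names and the 10-credit quota are placeholder values, not objects the connector creates:

```sql
-- Hypothetical names and quota; adjust to your environment.
CREATE RESOURCE MONITOR sharepoint_connector_monitor
  WITH CREDIT_QUOTA = 10
       FREQUENCY = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
       TRIGGERS ON 80 PERCENT DO NOTIFY      -- email alert before the quota is reached
                ON 100 PERCENT DO SUSPEND;   -- suspend the warehouse, stopping the connector

ALTER WAREHOUSE sharepoint_connector_wh
  SET RESOURCE_MONITOR = sharepoint_connector_monitor;
```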
For information on how to check credits consumed by the warehouse, refer to Exploring compute cost. You can also assign object tags to the warehouse and use the tags to create cost reports.
If the warehouse used by the connector is shared with other workloads, you can split the cost by role. To split usage by role, see Attributing cost and add the following WHERE clause when querying the QUERY_HISTORY view:
warehouse_name = '<connector warehouse name>' AND role_name = 'APP_PRIMARY'
The query gives only an approximation of the cost.
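A sketch of such a query, using summed elapsed query time as a rough proxy for warehouse usage by the connector over the last 30 days (the warehouse name is a placeholder):

```sql
SELECT
    warehouse_name,
    role_name,
    SUM(total_elapsed_time) / 1000 AS total_elapsed_seconds  -- TOTAL_ELAPSED_TIME is in milliseconds
FROM snowflake.account_usage.query_history
WHERE warehouse_name = '<connector warehouse name>'
  AND role_name = 'APP_PRIMARY'
  AND start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY warehouse_name, role_name;
```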
Note
Only one native app should use the warehouse; otherwise, the costs of different applications cannot be separated, because every native app uses the same role name (APP_PRIMARY).
Parse document function cost¶
For cost considerations related to the Parse document function, see cost considerations.
Cortex Search service cost¶
For cost considerations related to the Cortex Search service, see cost considerations.
Storage Cost¶
The Snowflake Connector for SharePoint stores data in two places:
The connector database, which is created from the listing and holds the connector's internal state.
The user-specified schema where the ingested data is stored.
Data storage is also used by the Snowflake Fail-safe feature. The amount of data stored in Fail-safe depends on the table updates done by the connector. The amount of data increases if the table rows ingested from SharePoint are updated frequently or the whole table is reloaded. Typically, seven to ten days after the connector is set up, the amount of Fail-safe data stabilizes (assuming that no reloads are performed and that the flow of ingested data is at a steady rate).
If you want to check storage usage in Snowsight, we recommend keeping a separate database for storing ingested data. This way you can filter the storage usage graphs by object, which shows usage per database. You can also query the DATABASE_STORAGE_USAGE_HISTORY view and filter by both databases used by the connector.
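For example, assuming placeholder database names, a query over that view might look like:

```sql
SELECT
    usage_date,
    database_name,
    average_database_bytes,
    average_failsafe_bytes
FROM snowflake.account_usage.database_storage_usage_history
WHERE database_name IN ('<connector database>', '<ingested data database>')
ORDER BY usage_date DESC;
```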
If the database contains other schemas not related to the connector, you can query storage usage of a specific schema that is dedicated to the data ingested from the connector. You can get the information from TABLE_STORAGE_METRICS view after filtering by database and schema names and aggregating columns with storage usage.
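A sketch of such a query, summing the storage-usage columns for one schema (database and schema names are placeholders):

```sql
SELECT
    table_catalog,
    table_schema,
    SUM(active_bytes + time_travel_bytes + failsafe_bytes) AS total_storage_bytes
FROM snowflake.account_usage.table_storage_metrics
WHERE table_catalog = '<database name>'
  AND table_schema  = '<destination schema>'
GROUP BY table_catalog, table_schema;
```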
Data Transfer Cost¶
The connector uses external access to retrieve data from SharePoint. Snowflake charges only for the egress traffic generated by the connector, based on the size of the requests sent from the connector to SharePoint. The responses from SharePoint do not generate cost on the Snowflake side.
Information on data transfer usage is available only in the aggregated form for all external access integrations on the account level. To access the number of transferred bytes, use the DATA_TRANSFER_HISTORY view and filter by the EXTERNAL_ACCESS transfer type.
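For example, the following query returns the aggregated external-access transfer volume for the account:

```sql
SELECT
    start_time,
    end_time,
    bytes_transferred
FROM snowflake.account_usage.data_transfer_history
WHERE transfer_type = 'EXTERNAL_ACCESS'
ORDER BY start_time DESC;
```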
Healthcheck Task Cost¶
The connector creates a serverless task that regularly checks the health status of your app instance and sends only a summarized result (whether or not it is healthy) to Snowflake. The task is created after you complete the installation wizard (or call FINALIZE_CONNECTOR_CONFIGURATION in a worksheet). It runs in the background and generates a fixed cost of up to 0.5 credits per day, even if no SharePoint folder is enabled for replication. The task cannot be manually stopped or dropped. However, to reduce this cost you can call the PAUSE_CONNECTOR procedure, which disables the task so that no cost is generated while the connector is unused.
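For example (the database name is a placeholder, and the assumption that the procedure lives in the connector database's PUBLIC schema should be verified against your installation):

```sql
CALL <connector database>.PUBLIC.PAUSE_CONNECTOR();
```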
Cost Optimization¶
Determining the Optimal Warehouse Size for the Connector Instance¶
To find the optimal warehouse size for the connector, consider the factors that affect the connector's performance, such as the size of the instance, the number of enabled tables, and the schedule for synchronizing each table. For example, if only a few tables are enabled, the connector might not benefit from increased parallelization.
We recommend that you define a set of measurable expectations, such as the time interval within which all tables should be synchronized, and pick the smallest warehouse size that meets these expectations. For large amounts of ingested data with tens of synchronized tables, the default recommendation is a Large warehouse. On the other hand, if you just want to try out the connector and enable a single table for ingestion, an X-Small warehouse should be sufficient. To find out whether you can downsize the warehouse, refer to Monitoring warehouse load.