The Capacity scenario is a storage use case powered by our available SharePoint datasets. It helps customers understand how their storage is being used by providing insights and analytics based on the SharePoint Sites and SharePoint Files datasets.
After you follow these steps, you will have a great set of Power BI dashboards related to SharePoint Capacity Scenario, like the one shown below.
The first step to running this template is to create an application in the tenant and use its application ID and secret to set up the other required resources.
- Navigate to app registrations in your subscription.
- Register a new application.
- Save the application ID (in the screenshot, the one ending in 9826), then navigate to API permissions.
- In the App, navigate to Owners and add an owner who will be responsible for running the pipeline. This is required for the pipeline to run successfully.
- Navigate to "Certificates and secrets" in the left pane and click on "New client secret"
- Provide a description and add a secret
- Copy the value of this new secret and save it securely before navigating away from this page
- Use this link to initiate the setup of the prerequisites: Custom deployment - Microsoft Azure. Use the application ID and secret created in the previous steps.
The link above sets up the prerequisites for using the capacity analytics template. It performs the following actions:
- Create a Synapse Workspace
- Create a Spark Pool for the Synapse workspace
- Create a storage account for the extracted data
- Grant the Synapse workspace and the MGDC service principal the Storage Blob Data Contributor role on the storage account
- Create an Azure SQL Server (optional)
- Create a sample database within the Azure SQL Server (optional)
By clicking on the above button (or navigating to the linked URL), users will be brought to the Azure portal on the Custom deployment page.
On that screen, in addition to choosing the resource group and region to deploy the components into, they will need to provide the following information:
- Application Id to be used by MGDC (from step #3, ending in 9826)
- Application secret for that app
- A new password for the Azure SQL Server
Once all required information has been provided, click on the Review + create button at the bottom of the page:
This will validate that the information provided to the template is correct. Once the information has been validated, click on the Create button at the bottom of the page.
This will initiate the deployment. It should normally take about 5 minutes for the whole deployment to complete.
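The same three inputs can also be captured in an ARM parameters file, which may be convenient if the deployment is repeated. The sketch below is a minimal illustration and not part of the template: the parameter names `mgdcAppId`, `mgdcAppSecret`, and `sqlAdminPassword` are hypothetical and must be replaced with the names shown on the Custom deployment page.

```python
import json

# Standard schema URL for ARM deployment parameter files.
PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)


def build_deployment_parameters(app_id, app_secret, sql_password):
    """Return an ARM parameters document for the three required inputs.

    The keys under "parameters" are hypothetical placeholders, not the
    template's real parameter names.
    """
    inputs = {
        "mgdcAppId": app_id,
        "mgdcAppSecret": app_secret,
        "sqlAdminPassword": sql_password,
    }
    for name, value in inputs.items():
        if not value:
            raise ValueError(f"{name} must not be empty")
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in inputs.items()},
    }


# Example with placeholder values; do not commit real secrets to source control.
example = build_deployment_parameters("<application-id>", "<secret>", "<password>")
print(json.dumps(example, indent=2))
```

Rejecting empty values early mirrors the portal's own validation step, so mistakes surface before the deployment starts.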
- After the pre-reqs are complete, navigate to the Synapse workspace just created
- Open the Synapse Studio
- Navigate to "Integrate -> Add new resource -> Browse gallery"
- Search for "SharePoint" and select the "Unlock Capacity Analytics and Insights for SharePoint and OneDrive using Microsoft 365 Datasets" template and Continue
- Create the new Target Linked services required by this pipeline
- Provide the parameters of the Linked Service
- Select Authentication Type = Service Principal
- Use the storage account name, SPN id and secret (SPN key) from the pre-req steps above
- Test Connection and then click on Create
- Repeat the Linked Service creation steps for the source Linked Service
- Select "Open Pipeline"
- Click on "Publish All" to validate and publish the pipeline
- Review the changes and click Publish
- Verify that the pipeline has been successfully published
- Trigger the pipeline
- Provide the required parameters: StartTime, EndTime, StorageAccount, and StorageContainer, using the values created by the pre-req steps above (names are case sensitive). Note: This template is designed to run only for a full snapshot; therefore, StartTime and EndTime must be the same.
- The pipeline you ran created the request for the SharePoint Sites and SharePoint Files datasets, which will be used to power the report. The data will be processed and delivered to your storage account.
- You will see the data in the storage account. The report will pull the data from the uploaded capacity directory in your storage.
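The full-snapshot rule for the trigger parameters (StartTime equal to EndTime) can be sketched as a small helper. This is an illustration under assumptions: the four parameter names come from the steps above, but the UTC timestamp format and the function name `build_pipeline_parameters` are hypothetical, so check the format the pipeline expects before reusing it.

```python
from datetime import datetime, timezone


def build_pipeline_parameters(snapshot, storage_account, storage_container):
    """Build the four trigger parameters for a full-snapshot run.

    StartTime and EndTime are deliberately set to the same value.
    The "YYYY-MM-DDTHH:MM:SSZ" format is an assumption.
    """
    if snapshot.tzinfo is None:
        raise ValueError("snapshot must be timezone-aware")
    if not storage_account or not storage_container:
        raise ValueError("storage account and container are required")
    stamp = snapshot.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "StartTime": stamp,
        "EndTime": stamp,
        # Names are case sensitive, so they are passed through unchanged.
        "StorageAccount": storage_account,
        "StorageContainer": storage_container,
    }


# Example with placeholder storage names.
params = build_pipeline_parameters(
    datetime(2024, 1, 15, tzinfo=timezone.utc), "mystorageaccount", "mgdc"
)
print(params)
```

Deriving both timestamps from one value makes it impossible to pass a mismatched range by accident.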
Note: To load data into the Power BI report, you first need to run the Synapse pipeline as instructed in the previous section.
The report will always pull the latest data from your Azure Data Lake Storage Gen2. The report is to be seen as a template and is to be modified based on your requirements.
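For orientation, Azure Data Lake Storage Gen2 data is addressed through the storage account's `dfs` endpoint. The sketch below shows how such a URL is composed; the account name, container name, and the `capacity` directory used in the example are placeholders, and the helper `adls_gen2_url` is hypothetical rather than part of the template.

```python
import re

# Storage account names are 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def adls_gen2_url(storage_account, container, directory=""):
    """Compose an ADLS Gen2 URL for a container and optional directory."""
    if not _ACCOUNT_NAME.fullmatch(storage_account):
        raise ValueError(f"invalid storage account name: {storage_account!r}")
    # Drop stray slashes so the parts join cleanly.
    parts = [part.strip("/") for part in (container, directory)]
    if not parts[0]:
        raise ValueError("container is required")
    path = "/".join(part for part in parts if part)
    return f"https://{storage_account}.dfs.core.windows.net/{path}"


# Example with placeholder names.
print(adls_gen2_url("mystorageaccount", "mgdc", "capacity"))
# prints "https://mystorageaccount.dfs.core.windows.net/mgdc/capacity"
```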
The steps below link the datasets generated by the Synapse pipeline above to the Power BI template.
- Download and install Microsoft Power BI Desktop if you don’t have it installed already on your machine.
- Link to download: Download Microsoft Power BI Desktop from Official Microsoft Download Center.
- Download the pre-created Power BI Capacity report, which generates insights from the data that the Synapse pipeline writes to your Azure storage location. Link to download: Power BI Report.
- Open the Power BI file and click on Transform data → Data source settings
- You will see 1 data source in the Data source settings page. Change the storage account in its URL to the storage account where the Synapse pipeline wrote the data in the steps above. This is the storage account used in Step 6 of the Synapse template pipeline above.
- Now we need to give the right storage account key / credentials for this data source.
- Congratulations, you are all set and will see that the report will be refreshed with the latest data.
- If you see any errors or if the data is not being refreshed, please make sure you have entered the correct storage account details, path, and credentials in the data source settings.
- The report is designed to filter out system sites. System sites include, but are not limited to, those with Site Types such as 'SPSMSITEHOST', 'SRCHCENTERLITE', and 'TENANTADMIN'. Please adjust this filter to suit your requirements.
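The system-site filter described above can be pictured as a simple exclusion by site type. This is a minimal sketch and not the report's actual implementation: the `SiteType` key, the sample rows, and the case-insensitive comparison are assumptions, and only the three site types named above are included.

```python
# Site types named in this guide; the real list is longer.
SYSTEM_SITE_TYPES = frozenset({"SPSMSITEHOST", "SRCHCENTERLITE", "TENANTADMIN"})


def exclude_system_sites(sites, site_type_key="SiteType", excluded=SYSTEM_SITE_TYPES):
    """Return only the rows whose site type is not in the excluded set.

    The comparison ignores case, which is an assumption of this sketch.
    """
    return [
        site
        for site in sites
        if str(site.get(site_type_key, "")).upper() not in excluded
    ]


# Illustrative rows with made-up URLs and storage values.
rows = [
    {"Url": "https://contoso.sharepoint.com/sites/hr", "SiteType": "GROUP", "StorageUsed": 1200},
    {"Url": "https://contoso-admin.sharepoint.com", "SiteType": "TENANTADMIN", "StorageUsed": 5},
    {"Url": "https://contoso.sharepoint.com/search", "SiteType": "SRCHCENTERLITE", "StorageUsed": 1},
]
kept = exclude_system_sites(rows)
print([row["Url"] for row in kept])
# prints ['https://contoso.sharepoint.com/sites/hr']
```

Passing a different `excluded` set is one way to model adjusting the filter to your own requirements.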
Additional Notes:
- The Power BI report will always pull the latest data copied during the Synapse pipeline run.
- The Power BI report in this document is simply a template and is to be modified based on your requirements.