One of the projects I completed a while back was implementing a SharePoint 2003 Site Usage report.
- Note: SP2010 has a Web Analytics service running at different levels and provides detailed web analytics OOTB. In SP2013 it is no longer a SharePoint service; instead it is integrated into the Search service and is a standalone feature with its own content DB. From 2003 –> 2007 –> 2010 –> 2013, web analytics has changed significantly. Although the component changed dramatically, the Object Model is still similar, and the approaches listed below are still useful and can serve as a solution design reference.
Requirement: Before upgrading from 2003 to 2010, it is helpful to understand which SharePoint sites/pages are the most popular, and which sites/pages have never been used.
Step One: Extract the site usage data from the SharePoint sites, including all sites, sub sites, and each individual page.
Options:
1. Use a third-party tool such as CardioLog. It appears to provide exactly what I need, but the full version of the software is not free.
2. Use the existing usage statistics page: this is the same view you get after configuring usage analysis in Central Administration. As we already know, this approach does not include sub sites, so we would have to navigate to every site and sub site to collect the data. That is not workable when we need to go through over 5,000 sites.
3. Use a custom web page to collect the site and sub site usage data and aggregate it all onto one page. This way we do not need to navigate to every sub site; a single page returns all the usage data for one site collection. This is the preferred approach: we can then iterate through all the site collections and gather the usage data from every site and sub site in one place.
4. Use the Stsadm tool. There is no command related to site usage analysis, though there are a couple of commands for looping through site collections. I could not find any built-in option for listing usage data, so it looks like I would have to build an stsadm command extension. That is fairly painful, especially since the SharePoint 2003 Object Model is not as clean as the 2007 version.
5. Configure the log files, then find or build a log reader to extract the usage data and a UI to display it. This also requires custom code to parse the log data.
6. Work against the SharePoint database: all usage information is recorded in a SharePoint database table, so we could query the database directly to calculate the summary data. This requires a deeper understanding of how the DB schema stores usage data, and it means accessing the SharePoint DB without going through the API.
7. Write or use a tool to parse the user request data directly at the IIS level, since IIS 6.0 supports the ISAPI pipeline. This is a good approach if we do not require detailed request information, but it involves a lot of work parsing the IIS log data.
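To give a feel for the log-parsing work behind options 5 and 7, here is a minimal sketch in Python (the real tooling could be anything) that counts requests per URL from IIS W3C extended log lines. The sample log lines are invented for illustration; the parser reads the field layout from the `#Fields:` directive rather than hard-coding column positions, since IIS lets administrators change which fields are logged.

```python
from collections import Counter

def parse_iis_log(lines):
    """Count requests per URL stem from IIS W3C extended log lines."""
    fields = []
    hits = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # e.g. ['date', 'time', 'cs-uri-stem', 'sc-status']
            continue
        if not line or line.startswith("#"):
            continue  # skip blanks and other directives (#Software, #Date, ...)
        record = dict(zip(fields, line.split()))
        hits[record.get("cs-uri-stem", "-")] += 1
    return hits

# Invented sample lines in W3C extended format:
sample = [
    "#Fields: date time cs-uri-stem sc-status",
    "2005-01-10 08:00:01 /sites/hr/default.aspx 200",
    "2005-01-10 08:00:05 /sites/hr/default.aspx 200",
    "2005-01-10 08:00:09 /sites/it/lists/tasks.aspx 200",
]
counts = parse_iis_log(sample)
```

Even this toy version shows why option 7 is labor-intensive: real logs roll over daily per virtual server, and mapping URL stems back to SharePoint sites takes additional logic.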
Comparing the approaches:
Option 3 is the best choice. It requires using .NET 1.1 against the SharePoint 2003 Object Model in Visual Studio. There are two deployment options:
1. Create a dedicated virtual directory under the portal web site in IIS and deploy the page with its code-behind there, so that it does not interfere with the running SharePoint sites.
2. Use a script page that accesses the Object Model to gather the information. We can deploy it under a secured site so that only administrators can view the results.
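The actual page would be .NET 1.1 code against the SharePoint 2003 Object Model, but the traversal logic behind option 3 can be sketched language-neutrally. In the Python below, the `Site` class and its `hits` numbers are invented stand-ins for the real SPWeb-style objects and their usage-analysis results; the point is the recursion that flattens every site and sub site into one list of rows.

```python
class Site:
    """Mock stand-in for a SharePoint web object; fields invented for illustration."""
    def __init__(self, url, hits, subsites=()):
        self.url = url
        self.hits = hits              # would come from the usage-analysis API
        self.subsites = list(subsites)

def collect_usage(site):
    """Flatten a site and all of its sub sites into (url, hits) rows."""
    rows = [(site.url, site.hits)]
    for sub in site.subsites:
        rows.extend(collect_usage(sub))  # recurse so no sub site is missed
    return rows

# A mocked portal with nested sub sites:
portal = Site("/", 120, [
    Site("/sites/hr", 45, [Site("/sites/hr/benefits", 7)]),
    Site("/sites/it", 0),
])
rows = collect_usage(portal)
```

The real script page repeats this loop across every virtual server and site collection, which is exactly the part the OOTB usage statistics page cannot do.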
Step Two: Analyze the data
A well-structured data format is the foundation of a simple usage analysis. Since a script page controls exactly what it writes out, I decided to use the XmlWriter class to write XML-formatted output to the page, then save the page as an .xml file. Excel 2010 can read the XML file directly, and the analysis can be done from there.
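The project itself uses .NET's XmlWriter; purely to illustrate the shape of the output (the element names `UsageReport`, `Site`, `Url`, and `Hits` are my own choices, not SharePoint's), here is the equivalent serialization in Python. A flat, repeating element structure like this is what lets Excel infer a table from the XML.

```python
import xml.etree.ElementTree as ET

def usage_to_xml(rows):
    """Serialize (url, hits) rows into a flat XML document Excel can import as a table."""
    root = ET.Element("UsageReport")
    for url, hits in rows:
        site = ET.SubElement(root, "Site")
        ET.SubElement(site, "Url").text = url
        ET.SubElement(site, "Hits").text = str(hits)
    return ET.tostring(root, encoding="unicode")

xml_text = usage_to_xml([("/sites/hr", 45), ("/sites/it", 0)])
```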
Solution Design:
The final decision is option 3: write a script page that uses the SharePoint 2003 Object Model to loop through all virtual servers, site collections, sub sites, and pages, calling the usage analysis functions to retrieve the usage data. Since we are using the object model, we get the current usage data for the latest 31 days by default. The script writes XML-formatted output onto the page itself; the page is then saved as an .xml file and processed with Excel 2010.
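In the final design Excel 2010 does the analysis, so no code is needed for this step. Purely as an illustration of the kind of summary it produces from the saved file, this Python sketch (with an invented sample document matching the XML shape described above) ranks sites by hits and flags the never-used ones, which answers the original upgrade question directly.

```python
import xml.etree.ElementTree as ET

# Invented sample matching the report's XML shape:
XML = """<UsageReport>
  <Site><Url>/sites/hr</Url><Hits>45</Hits></Site>
  <Site><Url>/sites/it</Url><Hits>0</Hits></Site>
  <Site><Url>/</Url><Hits>120</Hits></Site>
</UsageReport>"""

def rank_by_hits(xml_text):
    """Return (url, hits) pairs sorted most-visited first."""
    doc = ET.fromstring(xml_text)
    rows = [(s.findtext("Url"), int(s.findtext("Hits"))) for s in doc.findall("Site")]
    return sorted(rows, key=lambda r: r[1], reverse=True)

ranking = rank_by_hits(XML)
unused = [url for url, hits in ranking if hits == 0]  # candidates to skip in the upgrade
```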
