Wednesday, October 24, 2012

Tips to get accurate list of user created documents exclude system ones from SharePoint database



There are some use cases and scenarios where you need to retrieve SharePoint site collection information quickly without performance suffer through object model that will iterate all sub-sites and lists. One of the use cases is to evaluate third party search crawling index process accuracy on the site collection. We would like to get the accurate list of all documents exclude system ones in a site collection to compare the crawling result. The key here is to exclude system documents.

There are many different ways including using server object model, client object model, PowerShell, search API, SOAP/REST web service, or 3rd party tools. Another way, and a really quick one, is querying directly the SP content db tables or views. Since reading from a SharePoint DB (with the exception of the logging database) might leave it in an unsupported state indicated in KB article, I would suggest doing this in non-production environment. You could also confirm the script with Microsoft before run on production.

If you read the SP content db schema, you might find several SQL scripts to get list of documents from SharePoint database. Some of them are from SP 2007 that will no longer work and some of them will list many system files/documents. I modified the SQL script and added several enhancements so we could get accurate list of user created documents exclude system ones from SharePoint database without blocking other queries. Here are some tips.

  1. Use Docs view instead of AllDocs table to simplify query
  2. Join Docs view with UserData view to exclude system files/documents 
  3. All (NOLOCK) hint to both Docs and UserData views to prevent performance degradation
  4. Filter out file type to exclude system like files. Example is LeafName LIKE '%.stp' to exclude solution files
  5. Filter out directory type to exclude system like files. Example is filesDirName LIKE '%_catalogs%' as branding files
 Here is the script and you might need to exclude more additional system like files/documents.

SELECT

distinct DirName AS Directory,

UserData.nvarchar7 AS Title,

LeafName AS DocName,

Docs.Size AS Size,

Docs.TimeCreated AS Time

FROM Docs AS Docs WITH (NOLOCK) INNER JOIN UserData(NOLOCK) as UserData ON (Docs.DoclibRowId = UserData.tp_ID) AND Docs.ListId = UserData.tp_ListId

WHERE (DoclibRowId > 0)

AND Docs.ListID = UserData.tp_ListID
AND Docs.DoclibRowId = UserData.tp_ID
AND Docs.Type <> 1
AND (LeafName NOT LIKE '%.stp') 
AND (LeafName NOT LIKE '%.aspx')
AND (LeafName NOT LIKE '%.xfp')
AND (LeafName NOT LIKE '%.dwp')
AND (LeafName NOT LIKE '%template%')
AND (LeafName NOT LIKE '%.inf')
AND (LeafName NOT LIKE '%.css')
AND (LeafName NOT LIKE '%.master')
AND (LeafName NOT LIKE '%.xml')
AND (Docs.Size > 1)
AND (DirName NOT LIKE '%_catalogs%')
AND (DirName NOT LIKE 'personal%')
AND (DirName NOT LIKE '%Style%')
AND (DirName NOT LIKE '%SiteCollectionImages')
AND (DirName NOT LIKE 'mysites%')
AND (DirName NOT LIKE '%Publishing%')
AND (DirName NOT LIKE '%OntolicaStyles%')
AND (DirName NOT LIKE '%Data Connections for PerformancePoint%')
AND (DirName NOT LIKE '%ProjectBICenter/Templates%')
AND (DirName NOT LIKE '%Data Connections%')
AND (DirName NOT LIKE '%Reporting Templates%')
AND (DirName NOT LIKE '%FormServerTemplates%')
AND (DirName NOT LIKE '%ProjectBICenter/Sample Reports%')
AND (DirName NOT LIKE '%Pages%')

ORDER BY Directory


There are some other useful SQL query on users on the site collection and list data you could play around. Please remember to use these on non-production environment and consultant with Microsoft support before execute on production.

Tuesday, October 9, 2012

Tips and tricks to use Quark Microsoft SharePoint 2010 Adapter



Quark offers integration between SharePoint and the dynamic publishing software so that customers can quickly implement and easily use SharePoint from within Quark. This is done through Quark Microsoft SharePoint Adapter to connect your creative and publishing systems to the most widely used content management and business collaboration platform, establishing an enterprise-wide content management and collaboration platform with all the power of a dynamic publishing system. This helps you deliver materials to multiple media more quickly, improve accuracy, reduce bottle necks, and lower overall marcom costs.

Here are key capabilities of the solution:

  • By connecting Quark dynamic publishing software with Microsoft SharePoint, the Marcom and Creative teams gain full access to SharePoint, and knowledge workers gain access to Quark’s publishing capabilities so they can participate directly in the publishing process — all while remaining in SharePoint.
  • Use SharePoint as the repository for all assets used in the production process, and for the final output of PDF, Flash®, HTML, and XML, eliminating data redundancy.
  • Use the same workflows that other knowledge workers use, improving collaboration and productivity
  • Allocate resources, make task assignments, share calendars, and use SharePoint’s communications functions, improving operational efficiency.
  • Manage their own projects and tasks and participate in other projects across the enterprise, improving project management efficiency.
  • Use SharePoint’s social features to support blogs and wikis.
  • Continue using their specialized publishing tools such as QuarkXPress®, Adobe® Photoshop®, Illustrator®, and others, while also using the organization’s standard business tools such as Microsoft Office. 
After we install Quark SharePoint 2010 adapter to our environment, we have identified some issues. Here are some tips and tricks that might help you to manage the Quark SharePoint 2010 adapter deployment.
1.  How to enable Quark to webapp without any site collection? The simple answer is you can't. You need to have a site collection URL in order to complete Quark installation. Otherwise, you will get the following error.
2. How to enable Quark to multiple webapps? When you install Quark SharePoint adapter, it will require to enter a site URL. The adapter will will be fully installed to that webapp. Quark hidden features will be enabled on that site collection. However, any other site collection inside other webapp will not have Quark features. Here is the trick to copy all required dll from webapp deployed to webapp you need to enable Quark.

The following directory is the typical path for the “bin” folder on dedfault webapp
C:\inetpub\wwwroot\wss\VirtualDirectories\80\bin


Quark.DS.LibraryUtilities.dll
Quark.DS.ReferenceResolver.dll
Quark.DS.SharePointAlert.dll
Quark.DS.SharepointSnapshot.dll


Copy to another webapp with port number 46777
C:\inetpub\wwwroot\wss\VirtualDirectories\46777\bin

You could verify the Quark web service will be available for webapp (http://<servername>:46777) after you done this.

The follow two Quark features will be deployed to 14/TEMPLATES/FEATURES folder.
 3. How to enable Quark feature enabled on site collections that need it? First, you need to follow instruction from tip #2 to install Quark, and then follow Quark instruction to activate features on the site.
 Start > Programs > Quark XML Author SharePoint Adapter > Site Collection Configuration Tool and enter site URL and the "Install".

You could also use powershell command below to enable the Quark features.
Enable-SPFeature –Identity XA.SharePoint.ServerAdapter_QXAFeatures –url http://sbx18:46777
Enable-SPFeature –Identity XA.SharePoint.ServerAdapter_XADocumentType –url http://sbx18:46777


4. How to remove or avoid Quark feature to be enabled on site collections that do not need it? If you provide a URL during Quark installation, the site will have Quark features installed and enabled. System admin will be able to use powershell to see one site feature and one web feature enabled on the site. Users will see one command from Site Actions and many site columns as shown in the below screen shots.




If you want to remove the Quark features, you would need to use powershell to disable the two features.

Disable-SPFeature –Identity XA.SharePoint.ServerAdapter_XADocumentType –url http://sbx18:46777
Disable-SPFeature –Identity XA.SharePoint.ServerAdapter_QXAFeatures –url http://sbx18:46777

Please note even you disable the Quark features, only the custom command will be removed, the site columns will remain on the site! There is no clean way to remove all Quark related features functions when you enabled on the site. 

Our suggestion is to create the site collection that will be used for Quark collaboration first and pass it to Quark installation.

5. How to remove or uninstall Quark software? You could run uninstall Quark software but most of the components like dlls in bin directory, web services will NOT be removed. You may need to check the regestry entry whether those will be cleaned up. 
 [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0
\Search\Applications\<Site>\Gather\Portal_Content\Extensions\ExtensionList]

You might need to remove them manually if you need the system to be clean.

As you see, the major two issues for Quark SharePoint adapter are lacking enable multiple webapp and uninstall clean up procedures. You might need to pay additional attention for any SharePoint upgrade process that might impact your Quark function since Quark adapter web services are deployed to same SharePoint web service location.