As part of an enterprise search initiative, our company started evaluating the Attivio search engine, which is intended to add world-class content analysis, search, and navigation. Attivio’s Active Intelligence Engine™ (AIE) indexes content and metadata, performs advanced linguistic and contextual analysis, and delivers permission-aware, relevance-ranked search and content navigation. One of the search integration targets is SharePoint 2010. Because there are few architecture diagrams or documents that explain how Attivio integrates with SharePoint and how that integration affects SharePoint, this blog covers the Attivio architecture, the Attivio SharePoint connector, and the limitations of the Attivio SharePoint integration, so you can use it as a reference when managing the Attivio integration with SharePoint.
The Attivio architecture has three major layers: the Endpoint API layer (ingestion), which crawls all the content; the Universal Index layer, which creates the indexes; and the Query API layer, which exposes search. Other services include ingestion services and asynchronous workflows that cleanse and enrich content before it is persisted in the Universal Index, system services for backup and logs, and a Transport Layer that enables workflow communication and distribution across one or many nodes. The detailed architecture diagram is shown below.
The Attivio SharePoint connector supports all SharePoint lists, including document libraries, calendars, tasks, issues, and discussion boards, as well as all SharePoint objects, and it provides a read/write feature. It also gives users access to all site collections in a farm, including subsites.
Installing the Attivio SharePoint connector is simple: a single WSP solution named entropysoft.sharepoint.webservice.wsp is deployed to the SharePoint farm. Since there is very limited documentation on how the connector works, we will dig into which components are deployed so that we can understand how it works.
After the Attivio SharePoint connector is installed, the following changes are made to the SharePoint farm.
2. Four DLL files are deployed to the global assembly cache (GAC)
3. Two web service files are deployed to the ISAPI folder
<deny users="?" />
<add type="Entropysoft.Sharepoint.Webservice.ExceptionSoapExtension, Entropysoft.Sharepoint.Webservice, Version=18.104.22.168, Culture=neutral, PublicKeyToken=08ab0f4d3c6ea37b" priority="2" group="0" />
<add type="Microsoft.Web.Services2.WebServicesExtension, Microsoft.Web.Services2,Version=22.214.171.124, Culture=neutral, PublicKeyToken=31bf3856ad364e35" priority="1" group="0" />
After the connector is installed, you can verify it through its web service. The URL is http://<servername>/<sites>/_vti_bin/sharepointConnector.asmx. You can view the WSDL by appending ?wsdl, as you normally do for other SharePoint web services.
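As a quick check, the WSDL address can be derived from a site URL. The following Python sketch is illustrative only: the helper name connector_wsdl_url and the example host are assumptions, and only the _vti_bin/sharepointConnector.asmx path and the ?wsdl suffix come from the text above.

```python
from urllib.parse import urlsplit, urlunsplit

# Path of the connector web service relative to a site, as given above.
CONNECTOR_PATH = "_vti_bin/sharepointConnector.asmx"


def connector_wsdl_url(site_url: str) -> str:
    """Return the WSDL URL of the connector web service for a site URL.

    Hypothetical helper: it only builds the URL string and does not
    contact the server.
    """
    parts = urlsplit(site_url)
    # Normalize the site path so exactly one slash precedes the service path.
    site_path = parts.path.rstrip("/")
    path = f"{site_path}/{CONNECTOR_PATH}"
    return urlunsplit((parts.scheme, parts.netloc, path, "wsdl", ""))


if __name__ == "__main__":
    # "sp2010.example.com" and "/sites/hr" are placeholder values.
    print(connector_wsdl_url("http://sp2010.example.com/sites/hr/"))
```

Opening the resulting URL in a browser, with an account that has access to the site, should show the service description if the solution was deployed correctly.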
If you add a service account through Central Administration with read access to all site collections in the web application, and pass only the web application's root site collection URL to the connector, you have completed the SharePoint-side installation and configuration. The Attivio crawling process uses the web service call GetSiteCollectionsUrls to retrieve all site collections inside the web application and then calls web services to index all content and metadata inside each site collection. After the first full crawl, the connector uses the web service call GetChanges to index any subsequent changes.
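The crawl sequence described above (enumerate site collections, index everything once, then pick up only changes) can be sketched in Python against an in-memory stand-in. This is not the connector's real API: the class FakeConnectorService, its method signatures, and the return shapes are all assumptions made for illustration. Only the operation names GetSiteCollectionsUrls and GetChanges come from the connector.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeConnectorService:
    """In-memory stand-in for the connector web service (not the real API)."""

    # Maps each site collection URL to the items it currently holds.
    sites: Dict[str, List[str]]
    # Items changed since the last crawl, keyed by site collection URL.
    pending: Dict[str, List[str]] = field(default_factory=dict)

    def get_site_collections_urls(self, webapp_url: str) -> List[str]:
        # Stands in for GetSiteCollectionsUrls; the parameter is assumed.
        return [url for url in self.sites if url.startswith(webapp_url)]

    def get_items(self, site_url: str) -> List[str]:
        # Stands in for the unnamed calls that read content and metadata.
        return list(self.sites[site_url])

    def get_changes(self, site_url: str) -> List[str]:
        # Stands in for GetChanges; returns and clears pending changes.
        return self.pending.pop(site_url, [])


def full_crawl(service: FakeConnectorService, webapp_url: str) -> Dict[str, List[str]]:
    """First crawl: index every item in every site collection."""
    index: Dict[str, List[str]] = {}
    for site_url in service.get_site_collections_urls(webapp_url):
        index[site_url] = service.get_items(site_url)
    return index


def incremental_crawl(service: FakeConnectorService, webapp_url: str,
                      index: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Later crawls: index only what changed since the previous crawl."""
    for site_url in service.get_site_collections_urls(webapp_url):
        for item in service.get_changes(site_url):
            if item not in index.setdefault(site_url, []):
                index[site_url].append(item)
    return index


if __name__ == "__main__":
    # Placeholder URLs and file names.
    service = FakeConnectorService(
        sites={"http://sp/": ["home.aspx"], "http://sp/sites/hr": ["policy.docx"]}
    )
    index = full_crawl(service, "http://sp/")
    service.pending["http://sp/sites/hr"] = ["handbook.docx"]
    incremental_crawl(service, "http://sp/", index)
    print(index)
```

The point of the sketch is the cost profile: the full crawl touches every item once, while later crawls only ask each site collection for its changes, which is why the first crawl is the one to plan around.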
As a SharePoint administrator, you may be concerned about the performance impact on the system, especially during the first full crawl. You should performance-test the crawl in a non-production environment and schedule it during non-working hours in production.
You should now feel comfortable managing the installation, configuration, and support of the Attivio SharePoint 2010 connector. We will focus on some of the issues in the next blog.