Showing posts with label SharePoint 2010 Search. Show all posts

Friday, September 14, 2012

Attivio SharePoint 2010 search integration issues and concerns you need to know

If you implement the Attivio SharePoint 2010 search connector, users can search SharePoint content and metadata through a centralized Attivio search interface alongside other content such as wikis, email, file shares, Documentum, and eRoom. After reviewing the Attivio architecture and the SharePoint connector guide, we have some concerns and questions that should be resolved for the SharePoint search integration. Some of them are critical and need to be addressed before going to production.

1. System pages and galleries crawling - The Attivio SharePoint 2010 connector guide indicates that the SharePoint Object Model connector does not support crawling system galleries. However, this limitation is not mentioned for the SharePoint Web Services connector, which is the one we are using. Since SharePoint system pages and galleries contain many pages, excluding them from the crawl is a best practice to reduce the performance impact. The following galleries and system pages should be excluded from the crawl. There may be more that need to be excluded; check the screen shot for your SharePoint site.
  • Web Part Gallery
  • Site Template Gallery
  • List Template Gallery
  • Master Page Gallery
  • Theme Gallery
  • Form Template system files
  • IWConvertedForms
  • Workflow Forms
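
The exclusion list above can be expressed as a simple URL filter. The following is only a sketch in Python, not Attivio connector configuration: the function name `should_crawl` is made up for illustration, and the URL fragments are assumptions about where these galleries usually live on a SharePoint 2010 site, so they should be checked against your own farm.

```python
from urllib.parse import urlparse

# Assumed URL fragments for some of the galleries listed above.
# Verify them against your own sites before relying on this list.
EXCLUDED_PATH_FRAGMENTS = (
    "/_catalogs/wp/",          # Web Part Gallery
    "/_catalogs/wt/",          # Site Template Gallery
    "/_catalogs/lt/",          # List Template Gallery
    "/_catalogs/masterpage/",  # Master Page Gallery
    "/_catalogs/theme/",       # Theme Gallery
    "/iwconvertedforms/",      # IWConvertedForms
)


def should_crawl(url: str) -> bool:
    """Return False when the URL points into one of the excluded galleries."""
    # Lower-case the path and add a trailing slash so that a URL ending
    # exactly at the gallery folder is matched as well.
    path = urlparse(url).path.lower() + "/"
    return not any(fragment in path for fragment in EXCLUDED_PATH_FRAGMENTS)
```

A filter like this could be applied to the list of candidate URLs before they are handed to the crawler, or used to check a crawl report for pages that should not have been indexed.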
 
2. SharePoint 2010 permissions crawling - The Attivio SharePoint 2010 connector guide indicates that target audiences and audience filtering are not supported. There is no way to return the target audience of an item. As a result, search will not apply target audience filtering, and users outside the target audience might be able to search and view the content. This needs to be verified and addressed.

3. SharePoint 2010 content type crawling - The Attivio SharePoint 2010 connector guide indicates that content types are not supported, and we were concerned that content with customized content types might not be indexed. Attivio consultants have confirmed that this is not correct and that content of all content types will be indexed and searchable. This needs to be tested.

4. SharePoint 2010 crawling configuration - The Attivio SharePoint 2010 connector guide indicates that the NoCrawl property for lists and sites is not available. As a result, we cannot exclude any list or site from Attivio search. We have some secret site collections in the system that we do not expose to anyone except a few restricted users. Owners of these sites might not want to expose any content through another search UI even when permissions have been properly applied. We may need to identify a workaround to address this.

5. SharePoint 2010 MySite crawling - The Attivio SharePoint 2010 connector guide says to pass http://host:port/personal/username rather than http://host:port/MySite, because SharePoint treats MySites as separate repositories. We are not sure whether we need to pass each and every personal MySite URL, of which there are more than 10,000 in our company. This needs to be addressed if MySite content needs to be searchable through Attivio.
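
If every personal site does have to be passed individually, the URL list would have to be generated rather than typed. A minimal Python sketch, assuming the user names can be exported from the directory and map directly onto the /personal/username pattern quoted above (the function name and the sample user names are hypothetical):

```python
def personal_site_urls(base_url, usernames):
    """Build one personal-site URL per user from the host base URL."""
    base = base_url.rstrip("/")  # avoid a double slash before "personal"
    return [f"{base}/personal/{name}" for name in usernames]


# Example with placeholder values only.
urls = personal_site_urls("http://host:port/", ["jsmith", "mlee"])
```

Whether the connector accepts such a generated list in bulk is exactly the open question raised in this point.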

6. SharePoint 2010 Meeting Workspaces crawling - The Attivio SharePoint 2010 connector guide indicates that crawling Meeting Workspaces causes the server to queue child pages, such as Workspace Pages, that do not exist, which in turn causes an Exception error message during a crawl. This needs to be tested and verified.

7. SharePoint 2010 audit and logs – SharePoint keeps audit logs and other logs inside the content database. At this point, we are not sure whether Attivio will index any of these. We hope they will not be indexed, to avoid performance issues. This needs to be confirmed.

8. SharePoint 2010 entitlement policy – We are implementing the Nextlabs SharePoint entitlement solution to deny certain groups of users access to selected site content even when those users have been granted permissions through SharePoint. SharePoint search will be integrated with the Nextlabs SharePoint entitlement policies and will block those users from searching or viewing the selected content. However, the Attivio SharePoint 2010 connector will not be aware of the Nextlabs SharePoint entitlement policies and may expose the selected content to those users. We might need to customize Attivio search to check the Nextlabs SharePoint entitlement policies through the Nextlabs policy web services before displaying search results to end users.
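
The kind of customization described in this point amounts to trimming results after the search and before display. The Python sketch below shows the idea only: `is_allowed` is a hypothetical callable standing in for whatever call the Nextlabs policy web services actually expose, and nothing here reflects a real Attivio or Nextlabs API.

```python
from typing import Callable, Iterable, List


def trim_results(
    user: str,
    result_urls: Iterable[str],
    is_allowed: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the results the entitlement policy allows for this user."""
    trimmed = []
    for url in result_urls:
        try:
            if is_allowed(user, url):
                trimmed.append(url)
        except Exception:
            # Fail closed: if the policy check cannot be completed,
            # the item is hidden rather than shown.
            continue
    return trimmed
```

Failing closed is a design choice that fits the requirement here, since the policy exists to block access; the cost is that a policy service outage hides results instead of exposing them.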

9. SharePoint 2010 retention policy – We are applying retention policies to some site content. For example, if we apply a seven-year retention policy to a site, its content will be deleted automatically after seven years. The SharePoint backup tapes may have a one-year retention policy and will be recycled after one year. The same one-year policy should be applied to the Attivio index tapes. In other words, anything deleted from SharePoint should not exist on the Attivio side, not even on backup tapes.

10. Attivio SharePoint 2010 connector web service – This web service contains several interfaces that not only read but also update and delete SharePoint content. Although this is not a real issue right now, we are surprised that a web service used for crawling contains update and change interfaces. We need to be careful to grant the Attivio SharePoint crawling account READ-only permission so that it cannot use the following update interfaces.
  • CancelCheckOut
  • CheckOut
  • Checkin
  • CopyItem
  • CreateDocument
  • CreateFolder
  • DeleteItems
  • DeleteVersion
  • MoveItems
  • Promote
  • SetAttachments
  • SetPermissions
  • UpdateItem
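
One way to confirm that the crawl account never uses these interfaces is to compare the operation names observed during a crawl against the list above. This is a Python sketch under the assumption that the operation names can be extracted from some log or trace; how to obtain them is not covered by the connector guide, and the function name is hypothetical.

```python
# Update operations listed above; a read-only crawl should never call them.
WRITE_OPERATIONS = frozenset({
    "CancelCheckOut", "CheckOut", "Checkin", "CopyItem", "CreateDocument",
    "CreateFolder", "DeleteItems", "DeleteVersion", "MoveItems", "Promote",
    "SetAttachments", "SetPermissions", "UpdateItem",
})


def find_write_calls(observed_operations):
    """Return the distinct update operations seen, in alphabetical order."""
    return sorted({op for op in observed_operations if op in WRITE_OPERATIONS})
```

An empty result for a full crawl would support the expectation that the connector only reads.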
If you find anything else we should be concerned about with Attivio SharePoint 2010 search, please share it with us.

Thursday, April 21, 2011

Why does SharePoint 2010 search not show some results?

SharePoint 2010 search is better than ever before. Enterprise search for SharePoint 2010 contains all the features and functionality of MOSS 2007 Search, like people search, but goes further with richer navigation, refinement and related search capabilities. Microsoft acquired FAST 2 years ago and integrated it with SharePoint; it is now offered as a separate add-on to SharePoint for those willing to invest in high-end enterprise search.

Since search is a very important component of SharePoint adoption, there are many cases where users report that they cannot find the expected items in the search result. I would like to share the tips we have collected since August 2010 so you can explain and resolve those "issues" quickly.

Before we dig into specific search "issues", first take a look at the SharePoint versions search comparison before implementing search in your company. Features like People Search, Social, and Taxonomy integration are not included in Search Server 2010 Express. Second, verify that the search services have been set up and associated with the web application, that the crawl has completed without errors, and that incremental crawls have been scheduled. You can verify this from Central Administration and refer to the setup instructions. Now let's dig into the specific search "issues" users report frequently.

1. Why are documents or items not displayed in my search result while they are for other people? This is a very common complaint, and in most cases it is a permission issue. Specific users may not have permission to read those documents, and as a result the documents are not displayed in their search result.

2. Why do documents on one site not show in the search result while others do?
Besides checking permissions, this is typically caused by search being disabled in the site settings. Go to "Site Actions"->"Site Settings"->"Site Administration"->"Search and offline availability" and verify that "Allow this site to appear in search results" is set to yes, as shown in the screen shot. This is a site-level setting, and site owners can use it intentionally to keep the items on the site out of the search result.


3. Why does search not show any results for anonymous users?
This is similar to the site search setting issue described above. You can change the site setting and select "Always index all Web Parts on this site".

4. Why do some of my documents not show in the search result?
This has happened to several users, and it may be related to draft documents in a document library that requires approval. You can check whether approval is enabled for your document library by going to "Document Library Settings"->"Version Settings" and checking whether "Require content approval for submitted items?" is set to yes, as in the screen shots.




There is another interesting document library setting, "Draft Item Security", that will impact your search result.

Take the case where file 1 has version 1.0 in published status and version 1.1 checked in but not approved.

If "Draft Item Security" is set to "Only users who can edit items", file 1 will NOT appear in any search result. This is by design, since the crawl only indexes the latest version. In this case the latest version has not been published, so the file is not indexed and does not appear in the search result.

However, people have reported that if "Draft Item Security" is set to "Any user who can read items", version 1.0 of file 1 stays in the index and is displayed in the search result. My testing does NOT show this file in the search result, which is consistent with the golden rule: only the latest published version is indexed, and if the latest version is a draft, none of the versions are indexed, as designed.
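
The golden rule can be written down as a small model. This Python sketch only encodes the behavior observed in my testing as described above; the `Version` class and `is_indexed` function are illustrative names, not part of any SharePoint API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Version:
    label: str      # for example "1.0" or "1.1"
    approved: bool  # True when the version is published/approved


def is_indexed(versions: List[Version]) -> bool:
    """A file is indexed only when its latest version is approved.

    `versions` is ordered from oldest to newest.
    """
    return bool(versions) and versions[-1].approved


# File 1 from the example: 1.0 is published, 1.1 is checked in but not approved.
file_1 = [Version("1.0", True), Version("1.1", False)]
```

For `file_1` the function returns False, which matches the test result: the pending draft 1.1 keeps the whole file, including the published 1.0, out of the search result.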

4. Why do PDF files not show in the search result?
This one is easy to answer. You need to install a PDF iFilter to index PDF files. You can follow the Microsoft instructions to install it.

5. Why do social tagging and discussions not show in the result?
Check whether the job that indexes social tagging is running correctly. The job is scheduled hourly, and you can change the schedule from Central Administration as an administrator. The job is named "User Profile Service - Social Data Maintenance Job"; it aggregates social tags and ratings and cleans the social data change log.

6. Why does people search not show any result?
Besides checking that you are NOT running Search Server 2010 Express, this may be a MySite search index issue. You should include the MySite in the indexed content, as in the screen shot. You may refer to the instructions listed in some blogs to verify it.


7. Why do no results show in the default simple search but show in advanced search?
If you can see your items in advanced search but not in the default simple search, this might be the web application zone setting bug described on the MSDN discussion board, which we also identified during testing. If you have multiple zones set up for the same web application, please make sure the default zone is Intranet and NOT another setting. It seems to be a bug that you cannot get search results if the default zone is NOT Intranet. We may need to submit this bug to Microsoft.

8. Why does my external line of business data not show in the result even though we have BCS set up already?
If you have set up BCS to bring external data into SharePoint through an external content type and cannot find that data in the search result, this might be a search setting issue. You can configure this in SharePoint Central Administration as described in a previous blog.

9. Why are documents or items not displayed in my search result even though I have permission through SharePoint?
If you have implemented a security policy product like NextLabs and enabled Search Result Trimming, it allows SharePoint to limit the display of search results to only those web parts or documents (i.e., list items) which the search user is authorized to view based on the NextLabs entitlement policy. In other words, even if users have been granted permission through SharePoint to access that content, they will not see it in the search result if they are restricted by the NextLabs entitlement policy. This is one of the requirements from our security team, to block certain groups from accessing sensitive content. See my new blog for details.


There are some other search issues that we may share later.