After we upgraded from SharePoint 2010 to SharePoint 2013, we found that internal
users could not consistently log in to the two web applications that use
mixed authentication. Different internal users hitting different web
applications on different web front-end servers at different times might get any of the following
results.
- Login without issue
- Sorry, something went wrong. An unexpected error has occurred
- Stay on Sign In page
- Sorry this page has not been shared for you
- Server Error in ‘/’ Application
Here are the screenshots of the different errors.
We have been working with Microsoft support on this critical
production SharePoint 2013 issue for several days since go-live, without a solution so far.
The issue seems to be related to SharePoint 2013 farms that have web applications with multiple mixed authentication
providers (forms and Windows, for example), and it can be reproduced in several environments. The exception from the ULS log is listed below.
Unexpected
System.ArgumentException: Exception of type 'System.ArgumentException' was thrown. Parameter name: encodedValue
at Microsoft.SharePoint.Administration.Claims.SPClaimEncodingManager.DecodeClaimFromFormsSuffix(String encodedValue)
at Microsoft.SharePoint.Administration.Claims.SPClaimProviderManager.GetProviderUserKey(IClaimsIdentity claimsIdentity, String encodedIdentityClaimSuffix)
at Microsoft.SharePoint.Administration.Claims.SPClaimProviderManager.GetProviderUserKey(String encodedIdentityClaimSuffix)
at Microsoft.SharePoint.Utilities.SPUtility.GetFullUserKeyFromLoginName(String loginName)
at Microsoft.SharePoint.ApplicationRuntime.SPHeaderManager.AddIsapiHeaders(HttpContext context, String encodedUrl, NameValueCollection headers)
at Microsoft.SharePoint.Application...
If you decompile the Microsoft SharePoint assembly, the exception seems to be thrown when SharePoint tries to decode the user ID and fails to find the "|" inside a claim such as "i:0#.w|DOMAIN\username". Since there is no quick solution at this point, we studied the exception and came up with the workarounds listed below. Here are the details of the reasoning and the adjustments we have applied to production to at least reduce the issue, if
not resolve it. We will have to do more research to identify which changes are absolutely necessary.
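To make the failure mode concrete, here is a rough sketch of how a claims-encoded login name is laid out around the "|" separator. The function name Split-ClaimLoginName and the sample user "jdoe" are hypothetical, and this is not SharePoint's actual decoding logic (that lives in SPClaimEncodingManager). It only illustrates that a value without the "|" cannot be split into a prefix and a suffix.

```powershell
# Hypothetical helper, NOT SharePoint's implementation: split a claims-encoded
# login name such as "i:0#.w|DOMAIN\username" at the first "|".
function Split-ClaimLoginName {
    param([Parameter(Mandatory = $true)][string]$LoginName)

    $index = $LoginName.IndexOf('|')
    if ($index -lt 0) {
        # Mirrors the symptom seen in the ULS log: no separator, nothing to decode.
        $message = "No '|' separator found in '$LoginName'"
        throw (New-Object System.ArgumentException -ArgumentList $message, 'encodedValue')
    }

    [pscustomobject]@{
        Prefix = $LoginName.Substring(0, $index)    # e.g. "i:0#.w"
        Suffix = $LoginName.Substring($index + 1)   # e.g. "DOMAIN\username"
    }
}

# Windows claim: splits into "i:0#.w" and "DOMAIN\username".
Split-ClaimLoginName -LoginName 'i:0#.w|DOMAIN\username'

# Forms claim: the suffix still carries the provider name and the user name.
Split-ClaimLoginName -LoginName 'i:0#.f|ldapmember|jdoe'
```

Running a suspicious login name from the ULS log through a helper like this is a quick way to see whether the separator is missing, which is the condition the exception appears to point at.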
1. The first change is to
configure LdapMembershipProvider to use version 15 (the 2013 version)
instead of version 14 (the 2010 version). The reasoning
is that SharePoint 2013 might have modified the LdapMembershipProvider
implementation, so we may run into authentication issues if we keep using the 2010
version. The updated configuration for the web
application web.config is listed below.
<add name="LdapMember"
type="Microsoft.Office.Server.Security.LdapMembershipProvider, Microsoft.Office.Server, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c"
server="xldap.qualcomm.com"
port="636"
useSSL="true"
connectionUsername="uid=spexovd,ou=People,o=Sharepoint Extranet,o=qualcomm.com"
connectionPassword="Qualcomm123"
useDNAttribute="false"
userDNAttribute="entrydn"
userNameAttribute="uid"
userContainer="ou=people,o=Corporate Legal,o=qualcomm.com"
userObjectClass="person"
userFilter="(ObjectClass=person)"
scope="Subtree"
otherRequiredUserAttributes="sn,givenname,cn" />
You still need to modify the Security Token Service web.config and the Central Administration web.config, as discussed in other blog posts.
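Since the same provider entry has to be updated in several web.config files, a small check can help spot entries that were missed. The sketch below is only an illustration: the function name Find-StaleMembershipProvider is hypothetical, and the file path is an assumed example that will differ on your servers.

```powershell
# Hypothetical helper: list membership providers in a web.config whose type
# still references the SharePoint 2010 (14.0.0.0) assembly version.
function Find-StaleMembershipProvider {
    param([Parameter(Mandatory = $true)][xml]$Config)

    # Providers are registered under <membership>/<providers>/<add>.
    $Config.SelectNodes('//membership/providers/add') |
        Where-Object { $_.GetAttribute('type') -match 'Version=14\.0\.0\.0' } |
        ForEach-Object { $_.GetAttribute('name') }
}

# Assumed path; adjust to the IIS virtual directory of your web application,
# and repeat for the Security Token Service and Central Administration files.
$path = 'C:\inetpub\wwwroot\wss\VirtualDirectories\80\web.config'
if (Test-Path $path) {
    Find-StaleMembershipProvider -Config ([xml](Get-Content -Path $path -Raw))
}
```

An empty result means no provider in that file still points at the 2010 assembly version.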
2. The second change is that we removed the session state configuration from the web.config. The following three elements have been removed, since the issue seems to be that server-side authentication gets confused by the cached user authentication. If we remove the server-side session and rely on the client cookie, it might reduce the issue.
<sessionState mode="SQLServer" timeout="60" allowCustomSqlDatabase="true"
sqlConnectionString="Data Source=SPSQLSTG3;Initial Catalog=SessionStateDatabase;Integrated Security=True;Enlist=False;Pooling=True;Min Pool Size=0;Max Pool Size=100;Connect Timeout=15" />
<remove name="Session" />
<add name="Session" type="System.Web.SessionState.SessionStateModule" />
3. The third change is to disable the page session state. The reasoning is the same as for the previous change.
<pages enableSessionState="false" enableViewState="true" enableViewStateMac="true" validateRequest="false" clientIDMode="AutoID"
pageParserFilterType="Microsoft.SharePoint.ApplicationRuntime.SPPageParserFilter, Microsoft.SharePoint, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c"
asyncTimeout="7">
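Changes 2 and 3 both touch session state, and they have to match on every web front-end server. As a rough way to confirm what a given web.config currently contains, something like the following could be used. The function name Get-SessionStateSummary is hypothetical, and the XPath expressions assume the usual web.config layout.

```powershell
# Hypothetical helper: report which session-state settings remain in a web.config.
function Get-SessionStateSummary {
    param([Parameter(Mandatory = $true)][xml]$Config)

    $sessionState  = $Config.SelectSingleNode('//system.web/sessionState')
    $sessionModule = $Config.SelectSingleNode("//modules/add[@name='Session']")
    $pages         = $Config.SelectSingleNode('//system.web/pages')

    # Read the page-level flag only when a <pages> element exists.
    $pagesFlag = $null
    if ($null -ne $pages) { $pagesFlag = $pages.GetAttribute('enableSessionState') }

    [pscustomobject]@{
        HasSessionStateElement  = ($null -ne $sessionState)
        HasSessionModule        = ($null -ne $sessionModule)
        PagesEnableSessionState = $pagesFlag
    }
}
```

After the two changes, the expectation would be that both Has... properties are False and PagesEnableSessionState is "false".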
4. The fourth change is to add a persistent session lifetime of one hour to the client cookie, as described in Jalil Sear's blog. The reasoning is that, since we now rely on the client cookie instead of the server session to persist the user's information, we do not want the cookie to expire quickly. The change is the persistentSessionLifetime attribute.
<cookieHandler mode="Custom" path="/" persistentSessionLifetime="60">
5. The fifth change is to reset the security token service configuration to its default values.
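# Show the current security token service configuration
Get-SPSecurityTokenServiceConfig
# Forms token lifetime, in minutes
Set-SPSecurityTokenServiceConfig -FormsTokenLifetime 600
# The two values below are TimeSpan ticks (1 tick = 100 nanoseconds)
$sec=Get-SPSecurityTokenServiceConfig
$sec.LogonTokenCacheExpirationWindow=6000000000
$sec.Update()
$sec
$sec=Get-SPSecurityTokenServiceConfig
$sec.CookieLifetime=4320000000000
$sec.Update()
$sec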
The reasoning is to make sure we have the correct security token service configuration on SharePoint 2013.
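The raw numbers assigned to the security token service are easier to sanity-check once they are converted to durations. The short sketch below uses only .NET TimeSpan, so it runs in any PowerShell session. The assumption that -FormsTokenLifetime is expressed in minutes comes from the cmdlet documentation, not from this post.

```powershell
# TimeSpan ticks are 100-nanosecond units; convert the raw values to durations.
$window = [TimeSpan]::FromTicks(6000000000)       # LogonTokenCacheExpirationWindow
$cookie = [TimeSpan]::FromTicks(4320000000000)    # CookieLifetime
$forms  = [TimeSpan]::FromMinutes(600)            # FormsTokenLifetime (assumed minutes)

"{0} minutes" -f $window.TotalMinutes   # 10 minutes
"{0} days"    -f $cookie.TotalDays      # 5 days
"{0} hours"   -f $forms.TotalHours      # 10 hours
```

Ten minutes, five days and ten hours are the values we treated as the defaults when resetting the configuration.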
6. The sixth change is to fix a SharePoint 2013 distributed cache bug, as Jason Warren described in his blog. The issue Jason described is that occasionally a user would click on a link and, instead of receiving the expected page, would unexpectedly be redirected to the sign-in page and prompted to log in again. This is similar to what we experienced.
As Jason explained, when SharePoint tried to retrieve the token from the distributed cache, the connection would time out or no connection would be available, and the comparison would fail. Since it couldn't validate the presented token, SharePoint had no choice but to log the user out and redirect them to the sign-in page.
The fix he provided is summarized below.
- Apply AppFabric Cumulative Update 3, AppFabric Cumulative Update 4, or a later AppFabric CU to all servers in the farm
- Add backgroundGC key to DistributedCacheService.exe.config file on all cache servers
- Restart AppFabric Windows Service on all cache servers
- Restart Distributed Cache SharePoint service on all cache servers
- Reset IIS (IISRESET) on all servers in the farm
- Increase distributed cache client settings for affected containers using the Set-SPDistributedCacheClientSetting cmdlet.
- Increase security token service values with Get-SPSecurityTokenServiceConfig
- Restart AppFabric, and Distributed Cache on cache servers
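The "increase distributed cache client settings" step in the list above could look roughly like the sketch below for the logon token cache. The container name and cmdlets are the standard SharePoint 2013 ones, but the timeout and connection values are illustrative assumptions, not numbers confirmed by this post or by Microsoft support. Test any values in a non-production farm first.

```powershell
# Values below are illustrative assumptions, not settings confirmed by this post.
$container = 'DistributedLogonTokenCache'

if (Get-Command Get-SPDistributedCacheClientSetting -ErrorAction SilentlyContinue) {
    # Read the current client settings, raise the limits, and write them back.
    $settings = Get-SPDistributedCacheClientSetting -ContainerType $container
    $settings.RequestTimeout         = 3000   # milliseconds
    $settings.ChannelOpenTimeOut     = 3000   # milliseconds
    $settings.MaxConnectionsToServer = 100
    Set-SPDistributedCacheClientSetting -ContainerType $container -DistributedCacheClientSettings $settings
} else {
    Write-Warning 'SharePoint cmdlets not loaded; run this from the SharePoint 2013 Management Shell.'
}
```

The guard around the cmdlets only keeps the script from failing on a machine without the SharePoint snap-in. On a farm server the settings take effect after the service restarts listed above.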
7. The seventh change is to the load balancer for the two web applications with multiple authentication providers. One VIP URL now points to only one server, and the other points to a different server. The reasoning is based on inside information from a Microsoft DSE that other SharePoint 2013 customers with multiple authentication providers have similar login issues. The exception indicates that a bad user claim, possibly introduced by the multiple authentication providers, might be sitting in the cache.
8. The eighth change is to remove sticky sessions from the load balancer for the two web applications with multiple
authentication providers. The original thought was that a user could fail over to another
server after encountering an error. However, since we have modified each VIP
to point to only one server, this setting is no longer relevant as far as I
can see. We will try to add this setting back and verify.
Although the login issue has been dramatically reduced, we are still getting these errors randomly. If you have a similar issue, please let me know so we can push Microsoft for a final solution.
Harry,
Thanks for this post.
We have done the same as you in our organization.
We also migrated from SP 2010 to SP 2013, and we also have multiple authentication providers on the same URL.
If a user is inactive for more than 30 minutes, they get the "Sorry, something went wrong" error mentioned above.
It would be very helpful if you could guide us.
the same issue, 7-8, thank you
Did Microsoft ever reach any solution for this issue? Is there a KB article about it?