Page MenuHomePhabricator

OAuth doesn't work when $wgBlockDisablesLogin is true
Open, Stalled, HighPublic

Description

Stashbot, Striker, Horizon, and our custom hooks in Keystone/OpenStack all use consumer-only OAuth tokens to authenticate to wikitech. This has been broken since authentication-related configuration became more strict.

This is actively preventing logins to Horizon, logins to Striker, and Stashbot updates to the wikitech SAL pages.

See T218608#5034880 for details of the error. As far as we can tell it has been there all along since the AuthManager rewrite (although it's a bit surprising that no third-party wiki would have reported it in the three years since then).

Event Timeline

Verified that password and password+2fa login/logout via web ui are working as expected.

There are no wikitech log records whatsoever in logstash, probably something is broken there.

OAuth works via a SessionManager provider (which takes priority over the usual one and uses the Authorization header instead of the cookie to identify a user). When $wgBlockDisablesLogin is set, that provider checks whether the user is blocked (so it can reject such sessions). User::getBlockedStatus() (which is at the core of all block checks) used $wgUser, but that only gets set after SessionManager has finished to determine the session user. There is a check in User (User::isSafeToLoad) to handle that situation somewhat, but unlike RequestContext (which will return a temporary anonymous user with that flag set to false), $wgUser will just be null, so the flag did not help there.

Chances are this has been broken since 2016 when the block logic was updated for SessionManager, there just wasn't any wiki using both OAuth and $wgBlockDisablesLogin that would have triggered this bug.

T159299: Deprecate and remove $wgUser would help (the RequestContext user is a lot more manageable than the global). OTOH it is a lot of work (the code changes will be trivial; the test changes sometimes not), and after T159299: Deprecate and remove $wgUser all that code needs to be touched again. So probably not worth the effort right now.

Past on hurdle an on to the next:

2019-03-19T00:28:11Z Stashbot     WARNING : type: pubnotice, source: barjavel.freenode.net, target: #wikimedia-labs, arguments: [u'*** Notice -- TS for #wikimedia-labs changed from 1552955291 to 1320108738'], tags: []
2019-03-19T00:28:16Z Stashbot     ERROR   : Error writing to wiki
Traceback (most recent call last):
  File "/data/project/stashbot/stashbot/sal.py", line 149, in log
    url = self._write_to_wiki(bang, channel_conf)
  File "/data/project/stashbot/stashbot/sal.py", line 277, in _write_to_wiki
    site = self._get_mediawiki_client(channel_conf['wiki'])
  File "/data/project/stashbot/stashbot/sal.py", line 320, in _get_mediawiki_client
    access_secret=conf['access_secret']
  File "/data/project/stashbot/stashbot/mediawiki.py", line 36, in __init__
    url, consumer_token, consumer_secret, access_token, access_secret)
  File "/data/project/stashbot/stashbot/mediawiki.py", line 56, in _site_for_url
    force_login=force_login
  File "/data/project/stashbot/venv-k8s-py2/local/lib/python2.7/site-packages/mwclient/client.py", line 125, in __init__
    raise errors.OAuthAuthorizationError(e.code, e.info)
OAuthAuthorizationError: The authorization headers in your request are not valid: Nonce already used: 42194491243016053901552955295
OAuthAuthorizationError: The authorization headers in your request are not valid: Nonce already used: 42194491243016053901552955295

Looking at the code, this could be caused by a failure of the BagOStuff used as a transient cache by MWOAuthDataStore:

backend/MWOAuthDataStore.php
if ( !$this->cache->add( $key, 1, $timestamp + 300 ) ) {
    $this->logger->info( "$key exists, so nonce has been used by this consumer+token" );
    return true;
}

The client code has not changed at all, so cache failue or some other change on the MediaWiki side that causes the request signature to be checked twice seems the most likely cause.

This comment was removed by bd808.
2019-03-19 02:37:41 [ce9c80f20243ddb8e2ae939f] labweb1001 labswiki 1.33.0-wmf.21 OAuth INFO: DEBUG-T218608 {"used":false} 
[Exception Exception] (/srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/backend/MWOAuthDataStore.php:130) 
  #0 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/lib/OAuth.php(759): MediaWiki\Extensions\OAuth\MWOAuthDataStore->lookup_nonce(MediaWiki\Extensions\OAuth\MWOAuthConsumer, MediaWiki\Extensions\OAuth\MWOAuthToken, string, string)
  #1 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/lib/OAuth.php(707): MediaWiki\Extensions\OAuth\OAuthServer->check_nonce(MediaWiki\Extensions\OAuth\MWOAuthConsumer, MediaWiki\Extensions\OAuth\MWOAuthToken, string, string)
  #2 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/lib/OAuth.php(611): MediaWiki\Extensions\OAuth\OAuthServer->check_signature(MediaWiki\Extensions\OAuth\MWOAuthRequest, MediaWiki\Extensions\OAuth\MWOAuthConsumer, MediaWiki\Extensions\OAuth\MWOAuthToken)
  #3 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/backend/MWOAuthServer.php(268): MediaWiki\Extensions\OAuth\OAuthServer->verify_request(MediaWiki\Extensions\OAuth\MWOAuthRequest)
  #4 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/api/MWOAuthSessionProvider.php(89): MediaWiki\Extensions\OAuth\MWOAuthServer->verify_request(MediaWiki\Extensions\OAuth\MWOAuthRequest)
  #5 /srv/mediawiki/php-1.33.0-wmf.21/includes/session/SessionManager.php(466): MediaWiki\Extensions\OAuth\MWOAuthSessionProvider->provideSessionInfo(WebRequest)
  #6 /srv/mediawiki/php-1.33.0-wmf.21/includes/session/SessionManager.php(191): MediaWiki\Session\SessionManager->getSessionInfoForRequest(WebRequest)
  #7 /srv/mediawiki/php-1.33.0-wmf.21/includes/WebRequest.php(750): MediaWiki\Session\SessionManager->getSessionForRequest(WebRequest)
  #8 /srv/mediawiki/php-1.33.0-wmf.21/includes/session/SessionManager.php(130): WebRequest->getSession()
  #9 /srv/mediawiki/php-1.33.0-wmf.21/includes/Setup.php(816): MediaWiki\Session\SessionManager::getGlobalSession()
  #10 /srv/mediawiki/php-1.33.0-wmf.21/includes/WebStart.php(77): include(string)
  #11 /srv/mediawiki/php-1.33.0-wmf.21/api.php(35): include(string)
  #12 /srv/mediawiki/w/api.php(3): include(string)
  #13 {main}

2019-03-19 02:37:41 [ce9c80f20243ddb8e2ae939f] labweb1001 labswiki 1.33.0-wmf.21 OAuth INFO: DEBUG-T218608 {"used":true} 
[Exception Exception] (/srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/backend/MWOAuthDataStore.php:130) 
  #0 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/lib/OAuth.php(759): MediaWiki\Extensions\OAuth\MWOAuthDataStore->lookup_nonce(MediaWiki\Extensions\OAuth\MWOAuthConsumer, MediaWiki\Extensions\OAuth\MWOAuthToken, string, string)
  #1 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/lib/OAuth.php(707): MediaWiki\Extensions\OAuth\OAuthServer->check_nonce(MediaWiki\Extensions\OAuth\MWOAuthConsumer, MediaWiki\Extensions\OAuth\MWOAuthToken, string, string)
  #2 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/lib/OAuth.php(611): MediaWiki\Extensions\OAuth\OAuthServer->check_signature(MediaWiki\Extensions\OAuth\MWOAuthRequest, MediaWiki\Extensions\OAuth\MWOAuthConsumer, MediaWiki\Extensions\OAuth\MWOAuthToken)
  #3 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/backend/MWOAuthServer.php(268): MediaWiki\Extensions\OAuth\OAuthServer->verify_request(MediaWiki\Extensions\OAuth\MWOAuthRequest)
  #4 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/api/MWOAuthSessionProvider.php(89): MediaWiki\Extensions\OAuth\MWOAuthServer->verify_request(MediaWiki\Extensions\OAuth\MWOAuthRequest)
  #5 /srv/mediawiki/php-1.33.0-wmf.21/includes/session/SessionManager.php(466): MediaWiki\Extensions\OAuth\MWOAuthSessionProvider->provideSessionInfo(WebRequest)
  #6 /srv/mediawiki/php-1.33.0-wmf.21/includes/session/SessionManager.php(191): MediaWiki\Session\SessionManager->getSessionInfoForRequest(WebRequest)
  #7 /srv/mediawiki/php-1.33.0-wmf.21/includes/WebRequest.php(750): MediaWiki\Session\SessionManager->getSessionForRequest(WebRequest)
  #8 /srv/mediawiki/php-1.33.0-wmf.21/includes/user/User.php(3571): WebRequest->getSession()
  #9 /srv/mediawiki/php-1.33.0-wmf.21/includes/user/User.php(3878): User->getRights()
  #10 /srv/mediawiki/php-1.33.0-wmf.21/includes/user/User.php(1851): User->isAllowed(string)
  #11 /srv/mediawiki/php-1.33.0-wmf.21/includes/user/User.php(2288): User->getBlockedStatus(boolean)
  #12 /srv/mediawiki/php-1.33.0-wmf.21/includes/user/User.php(2277): User->getBlock(boolean)
  #13 /srv/mediawiki/php-1.33.0-wmf.21/extensions/OAuth/api/MWOAuthSessionProvider.php(120): User->isBlocked()
  #14 /srv/mediawiki/php-1.33.0-wmf.21/includes/session/SessionManager.php(466): MediaWiki\Extensions\OAuth\MWOAuthSessionProvider->provideSessionInfo(WebRequest)
  #15 /srv/mediawiki/php-1.33.0-wmf.21/includes/session/SessionManager.php(191): MediaWiki\Session\SessionManager->getSessionInfoForRequest(WebRequest)
  #16 /srv/mediawiki/php-1.33.0-wmf.21/includes/WebRequest.php(750): MediaWiki\Session\SessionManager->getSessionForRequest(WebRequest)
  #17 /srv/mediawiki/php-1.33.0-wmf.21/includes/session/SessionManager.php(130): WebRequest->getSession()
  #18 /srv/mediawiki/php-1.33.0-wmf.21/includes/Setup.php(816): MediaWiki\Session\SessionManager::getGlobalSession()
  #19 /srv/mediawiki/php-1.33.0-wmf.21/includes/WebStart.php(77): include(string)
  #20 /srv/mediawiki/php-1.33.0-wmf.21/api.php(35): include(string)
  #21 /srv/mediawiki/w/api.php(3): include(string)
  #22 {main}

So, when OAuth is trying to authenticate the user and calls User::getBlockedStatus, that calls User::isAllowed, which triggers another authentication (the user will be cached in the WebRequest object, but only when the WebRequest::getSession call towards the top of the stack returns). Ugh. If the nonce check didn't fail, this would result in a loop.

More precisely:

  • SessionManager tries to determine the session user during setup phase.
  • The OAuth session provider reads the user ID, loads the user and checks whether the user is blocked.
  • User::getBlockedStatus() calls User::isAllowed() (before checking whether the session user is safe to load), that tries to call Session::getAllowedUserRights() (which is used by OAuth and similar extensions to limit user rights based on the authentication method), but getting the session triggers another call to the session provider, and a loop.

So the actual user object being checked here is actually safe to load (it's loaded by ID, it isn't the session user) but that doesn't help.

From IRC, maybe this was a clearer description of the problem: SessionManager asks OAuth to provide a session (= session id + identity of global user) -> OAuth session provider identifies user but wants to return null if user is blocked and $wgBlockDisablesLogin is set -> User::getBlockedStatus() checks whether the user is IP-blocked, which depends on whether the user has ipblock-exempt right -> User::getRights() asks the current session if any rights are revoked by the session provider -> session is not loaded yet, goto step 1

So the root problem is: say we are doing an early check during Setup.php, like an autocreation permission check. The user runs into an autoblock (which would prevent autocreation) but has ipblock-exempt. Except the session provider might want to revoke that right (e.g. OAuth where the relevant box was not checked), but we don't know that since we don't have a session yet. How should that fail?

The current failure mode is that we try to load the session, which might trigger the same permission check, so infinite loop. Alternatives include:

  • Just assume that this is a normal session (no rights revoked) but then the user might be able to do things they shouldn't.
  • Be paranoid and revoke all rights, but that would surely lead to surprising behavior in many scenarios.
  • Have a separate version of User::getBlock() which is session-safe (and either does not look at IP blocks or does not look at whether the user is exempt from them).
  • Try to rethink the entire session loading sequence.

An attempt to do the third option is at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/497449/.

One question worth asking is whether User::newFromId( $id )->getRequest() === $wgRequest really makes sense. It causes an unfortunate side effect in the API that I just filed as T218674: User::getRights() applies session rights restrictions to non-session users. Removing that behavior would also break this loop, since the User created in MWOAuthSessionProvider wouldn't be the session user (yet).

If we did this, then ipblock-exempt would apply to OAuth and BotPasswords SessionProviders even if the grants don't allow that right. But OAuth and BotPasswords should still prevent the resulting session-user from taking advantage of ipblock-exempt, and the check added in T129738 would then prevent the user from taking advantage of any of their rights beyond what an IP has anyway.

Tgr renamed this task from OAuth doesn't work when LDAP is in strict mode to OAuth doesn't work when $wgBlockDisablesLogin is true.Mar 19 2019, 4:43 PM
Tgr updated the task description. (Show Details)

One question worth asking is whether User::newFromId( $id )->getRequest() === $wgRequest really makes sense.

IMO having a User::getRequest method doesn't really make sense in the first place.

Removing that behavior would also break this loop, since the User created in MWOAuthSessionProvider wouldn't be the session user (yet).

User::getRights() triggering session loading (and possibly a loop) is a landmine that can probably be triggered in more ways than this. Getting rid of that would definitely be a win.

If we did this, then ipblock-exempt would apply to OAuth and BotPasswords SessionProviders even if the grants don't allow that right. But OAuth and BotPasswords should still prevent the resulting session-user from taking advantage of ipblock-exempt, and the check added in T129738 would then prevent the user from taking advantage of any of their rights beyond what an IP has anyway.

I think specifically for $wgBlockDisablesLogin we don't really care about IP blocks anyway so this would be fine. The risky part is that any access check that happens during Setup.php would not have the session restrictions applied. I can't think of any actions happening that early (other than autocreation, which it does not make much sense to restrict for OAuth anyway), but maybe on a private wiki which disallows the read permission for OAuth apps it could cause an info leak in some way?

The other option I have been thinking about is moving $wgBlockDisablesLogin handling to SessionManager (which seems like a nice thing to do in general). So

  1. SessionManager calls providers to determine the user identity (and they are not allowed to do block checks)
  2. SessionManager sets the session in User::$mRequest (but not the global request yet; create a DerivativeReuqest, I guess? ugh)
  3. SessionManager checks for blocks (which works because User::$mRequest now has a fully loaded session) and aborts authentication if necessary
  4. SessionManager sets the session for the global request

IMO having a User::getRequest method doesn't really make sense in the first place.

Internally the request passed to User::newFromSession() is still needed, but I can't think of any need for it to be exposed publicly.

The risky part is that any access check that happens during Setup.php would not have the session restrictions applied. I can't think of any actions happening that early (other than autocreation, which it does not make much sense to restrict for OAuth anyway), but maybe on a private wiki which disallows the read permission for OAuth apps it could cause an info leak in some way?

Possibly a good step would be to audit for where $user->getRights() is called on a User object that wasn't created by User::newFromSession(). Some, like API list=allusers and list=users, would be ok, while others may be suspect.

The other option I have been thinking about is moving $wgBlockDisablesLogin handling to SessionManager (which seems like a nice thing to do in general). So

  1. SessionManager calls providers to determine the user identity (and they are not allowed to do block checks)

I note this bug isn't really caused by the block check itself but by the rights check for ipblock-exempt. Shouldn't you be forbidding any rights check?

  1. SessionManager sets the session in User::$mRequest (but not the global request yet; create a DerivativeReuqest, I guess? ugh)

You'd have to add a way for SessionManager to set User::$mRequest in the first place. Right now it'll be the defaulted-to $wgRequest.

More interesting is that the Session probably doesn't actually exist yet at the point where it would make most sense to add this check, you probably only have a SessionInfo. Adding it later in the process might break RequestContext::importScopedSession() or other code paths that load a session by ID rather than by interpreting fields in a WebRequest. That's not insurmountable, but it might take some thought to do well.

  1. SessionManager checks for blocks (which works because User::$mRequest now has a fully loaded session) and aborts authentication if necessary

"aborts authentication" is probably not the correct terminology here, since it has nothing to do with AuthManager. "rejects the session, using a new anonymous-user session" would make more sense.

  1. SessionManager sets the session for the global request

WebRequest->getSession() calls SessionManager to ask for the Session corresponding to itself, not the other way around. That's probably immaterial though.

jijiki triaged this task as High priority.Mar 19 2019, 6:15 PM

Internally the request passed to User::newFromSession() is still needed, but I can't think of any need for it to be exposed publicly.

And then that exposed object gets used even when the user was not created from the session. Which probably leads to incorrect behavior more often than not.

I note this bug isn't really caused by the block check itself but by the rights check for ipblock-exempt. Shouldn't you be forbidding any rights check?

Yeah, ideally other ways of triggering it should be prevented too, so it would be more of a quick patch than a full fix. I think there is value in moving the block check to SessionManager regardless, since most extensions will forget to do it.

More interesting is that the Session probably doesn't actually exist yet at the point where it would make most sense to add this check, you probably only have a SessionInfo. Adding it later in the process might break RequestContext::importScopedSession() or other code paths that load a session by ID rather than by interpreting fields in a WebRequest. That's not insurmountable, but it might take some thought to do well.

True. Probably not the best approach to fix a not-quite-but-almost-UBN issue then.

In T218608#5037102, @Anomie wrote: random

Possibly a good step would be to audit for where $user->getRights() is called on a User object that wasn't created by User::newFromSession(). Some, like API list=allusers and list=users, would be ok, while others may be suspect.

I spent some time doing that but gave up; it would take days, with all the places where User::isAllowed() is called. We should just log warnings when it happens.

One thing to note is that $request->getSession()->getUser() seems like a reasonable way to get a session user in many situations where one already has a request object, but (unlike $wgUser and RequestContext::getUser()) the user that gets returned is typically not one that has been created by User::newFromSession() (since it was created by the session provider which had to specify it by name or ID). So probably whether the user is equal to the session user is a better test than whether it came from User::newFromSession().

I spent some time doing that but gave up; it would take days, with all the places where User::isAllowed() is called. We should just log warnings when it happens.

I think that's a decent plan: create a log channel, write logs to it, and fix things until there are no more things. More below.

One thing to note is that $request->getSession()->getUser() seems like a reasonable way to get a session user in many situations where one already has a request object, but (unlike $wgUser and RequestContext::getUser()) the user that gets returned is typically not one that has been created by User::newFromSession() (since it was created by the session provider which had to specify it by name or ID). So probably whether the user is equal to the session user is a better test than whether it came from User::newFromSession().

Except testing name equality means there's no way to have a non-session User object for the same username as the session User. We have, basically, four cases:

  1. Session User object: returned from User::newFromSession(), $session->getUser(), and RequestContext->getUser().
  2. Non-session User object with a different name from the session user.
  3. Non-session User object with the same name as the session user.
  4. Non-session User object where we don't know if it's #2 or #3 because the Session isn't available yet.

For #1, we'll need to add some way to have $session->getUser() return a properly-flagged User object. These should always apply Session->getAllowedUserRights().

For #2, we never want to apply Session->getAllowedUserRights().

For #4, I'd think we can consider the provisional "session user" to be the anon with the request's IP, which would usually convert #4 into #2. I note that's what User::getBlockedStatus() is already trying to do. Can you think of any cases where we wouldn't want to do that?

That leaves #3. To resolve T218674 we also need these to never apply Session->getAllowedUserRights(), but first we need to fix any callers using a #3 where they should be using a #1. That's where the logging comes in: if ->isAllowed() is called on a #3, log and treat it like a #1. It looks like there are few enough direct calls to ->getRights() that we could audit them ahead of time, which would prevent false log warnings every time the API modules are used.

Can we find a way to ensure toolsadmin.wikimedia.org doesn't start allowing LDAP user creations when this is fixed? Right now user creation is stopped via wikitech, and is broken there but it would be best to couple the two.

Can we find a way to ensure toolsadmin.wikimedia.org doesn't start allowing LDAP user creations when this is fixed? Right now user creation is stopped via wikitech, and is broken there but it would be best to couple the two.

How does user creation work? Does Striker do it on its own, and it just checks something via OAuth first (which now breaks), or does it create accounts via the wikitech API? If it's the latter, just revoke its createaccount permission.

Can we find a way to ensure toolsadmin.wikimedia.org doesn't start allowing LDAP user creations when this is fixed? Right now user creation is stopped via wikitech, and is broken there but it would be best to couple the two.

How does user creation work? Does Striker do it on its own, and it just checks something via OAuth first (which now breaks), or does it create accounts via the wikitech API? If it's the latter, just revoke its createaccount permission.

From IRC

bd808> Striker does manipulate LDAP directly, but it also asks wikitech quiests about if the ip and username are ok to create accounts for
1:55 PM <bd808> *queries
1:55 PM <bd808> there's no "off switch" for the account creation feature (probably something to add)
1:56 PM <bd808> so revoking the grant would be the most guaranteed way to keep it down
1:56 PM <bd808> that will also keep other things from working like creation of new tool accounts

For #4, I'd think we can consider the provisional "session user" to be the anon with the request's IP, which would usually convert #4 into #2. I note that's what User::getBlockedStatus() is already trying to do. Can you think of any cases where we wouldn't want to do that?

I can come up with contrived examples (a wiki that allows autocreation via CentralAuth but wants to prevent OAuth tools from doing that, because there's a clickthrough agreement on your first pageview or something) but none that we should care right now given 1) this is kinda UBN territory 2) T218558: Move User::getRights and related methods into PermissionManager probably obviates some of these concerns in the long run. So let's do that.

That leaves #3. To resolve T218674 we also need these to never apply Session->getAllowedUserRights(), but first we need to fix any callers using a #3 where they should be using a #1. That's where the logging comes in: if ->isAllowed() is called on a #3, log and treat it like a #1. It looks like there are few enough direct calls to ->getRights() that we could audit them ahead of time, which would prevent false log warnings every time the API modules are used.

Per above, I'm not sure a long-term fix-all-callers process is worth the effort before knowing how PermissionManager will look. Within the current architecture, my preference would be a two-step getSessionForRequest process where the first step returns a provisional User and Session object, and the second step is all the permission checks, with User using the provisional session for that. But there's no point in doing that anymore, with the upcoming changes. So yeah, let's go with the logging, knowing more about callers never hurt anyone; and once PermissionManager is fleshed out, we'll see if the session user / not session user flag on User still makes sense (I'd guess we'll want to move away from it in light of also trying to move away from User in favor of small value objects, and will instead have some way of callers indicating to PermissionManager whether they care about session restrictions).

Can we find a way to ensure toolsadmin.wikimedia.org doesn't start allowing LDAP user creations when this is fixed? Right now user creation is stopped via wikitech, and is broken there but it would be best to couple the two.

https://gerrit.wikimedia.org/r/#/c/labs/striker/+/497952 Add feature flag to disable account creation

Per above, I'm not sure a long-term fix-all-callers process is worth the effort before knowing how PermissionManager will look.

Good point.

and will instead have some way of callers indicating to PermissionManager whether they care about session restrictions).

Yes, that would be needed then.

Bawolff changed the visibility from "Custom Policy" to "Public (No Login Required)".Mar 22 2019, 1:08 AM
sbassett changed the task status from Open to Stalled.Oct 4 2019, 5:46 PM
sbassett moved this task from Incoming to Watching on the Security-Team board.

This is an upstream problem as far as cloud services is concerned at this point.