Acquire an OAuth token

Hong Ooi

This is a short introduction to authenticating with Azure Active Directory (AAD) with AzureAuth.

The get_azure_token function

The main function in AzureAuth is get_azure_token, which obtains an OAuth token from AAD:

library(AzureAuth)

token <- get_azure_token(resource="myresource", tenant="mytenant", app="app_id",
    password="mypassword", username="username", certificate="encoded_cert", version=1, ...)

The function has the following arguments:

Scopes in AAD v2.0 consist of a URL or a GUID, along with a path that designates the scope. If a scope doesn’t have a path, get_azure_token will append the /.default path with a warning. A special scope is offline_access, which requests a refresh token from AAD along with the access token: without this, you will have to reauthenticate if you want to refresh the token.

# request an AAD v1.0 token for Resource Manager
token1 <- get_azure_token("https://management.azure.com/", "mytenant", "app_id")

# same request to AAD v2.0, along with a refresh token
token2 <- get_azure_token(c("https://management.azure.com/.default", "offline_access"),
    "mytenant", "app_id", version=2)

Authentication methods

AzureAuth supports the following methods for authenticating with AAD: authorization_code, device_code, client_credentials, resource_owner and on_behalf_of.

  1. Using the authorization_code method is a multi-step process. First, get_azure_token opens a login window in your browser, where you can enter your AAD credentials. In the background, it loads the httpuv package to listen on a local port. Once you have logged in, the AAD server redirects your browser to a local URL that contains an authorization code. get_azure_token retrieves this authorization code and sends it to the AAD access endpoint, which returns the OAuth token.

    The httpuv package must be installed to use this method, as it requires a web server to listen on the (local) redirect URI. Since it opens a browser to load the AAD authorization page, your machine must also have an Internet browser installed that can be run from inside R. In particular, if you are using a Linux Data Science Virtual Machine in Azure, you may run into difficulties; use one of the other methods instead.
# obtain a token using authorization_code
# no user credentials needed
get_azure_token("myresource", "mytenant", "app_id", auth_type="authorization_code")
  1. The device_code method is similar in concept to authorization_code, but is meant for situations where you are unable to browse the Internet – for example if you don’t have a browser installed or your computer has input constraints. First, get_azure_token contacts the AAD devicecode endpoint, which responds with a login URL and an access code. You then visit the URL and enter the code, possibly using a different computer. Meanwhile, get_azure_token polls the AAD access endpoint for a token, which is provided once you have entered the code.
# obtain a token using device_code
# no user credentials needed
get_azure_token("myresource", "mytenant", "app_id", auth_type="device_code")
  1. The client_credentials method is much simpler than the above methods, requiring only one step. get_azure_token contacts the access endpoint, passing it the credentials. This can be either a client secret or a certificate, which you supply in the password or certificate argument respectively. Once the credentials are verified, the endpoint returns the token. This is the method typically used by service accounts.
# obtain a token using client_credentials
# supply credentials in password arg
get_azure_token("myresource", "mytenant", "app_id",
                password="client_secret", auth_type="client_credentials")

# can also supply a client certificate as a PEM/PFX file...
get_azure_token("myresource", "mytenant", "app_id",
                certificate="mycert.pem", auth_type="client_credentials")

# ... or as an object in Azure Key Vault
cert <- AzureKeyVault::key_vault("myvault")$certificates$get("mycert")
get_azure_token("myresource", "mytenant", "app_id",
                certificate=cert, auth_type="client_credentials")
  1. The resource_owner method also requires only one step. In this method, get_azure_token passes your (personal) username and password to the AAD access endpoint, which validates your credentials and returns the token.
# obtain a token using resource_owner
# supply credentials in username and password args
get_azure_token("myresource", "mytenant", "app_id",
                username="myusername", password="mypassword", auth_type="resource_owner")
  1. The on_behalf_of method is used to authenticate with an Azure resource by passing a token obtained beforehand. It is mostly used by intermediate apps to authenticate for users. In particular, you can use this method to obtain tokens for multiple resources, while only requiring the user to authenticate once.
# obtaining multiple tokens: authenticate (interactively) once...
tok0 <- get_azure_token("serviceapp_id", "mytenant", "clientapp_id", auth_type="authorization_code")
# ...then get tokens for each resource with on_behalf_of
tok1 <- get_azure_token("resource1", "mytenant," "serviceapp_id",
                        password="serviceapp_secret", auth_type="on_behalf_of", on_behalf_of=tok0)
tok2 <- get_azure_token("resource2", "mytenant," "serviceapp_id",
                        password="serviceapp_secret", auth_type="on_behalf_of", on_behalf_of=tok0)

If you don’t specify the method, get_azure_token makes a best guess based on the presence or absence of the other authentication arguments, and whether httpuv is installed.

# this will default to authorization_code if httpuv is installed, and device_code if not
get_azure_token("myresource", "mytenant", "app_id")

# this will use on_behalf_of method
get_azure_token("myresource", "mytenant", "app_id",
                password="client_secret", on_behalf_of=token)

OpenID Connect

You can also use get_azure_token to obtain ID tokens, in addition to access tokens.

With AAD v1.0, using an interactive authentication flow (authorization_code or device_code) will return an ID token by default – you don’t have to do anything extra. The token will be in the credentials$id_token component of the returned object.

AAD v2.0 does not return an ID token by default, but you can get one by specifying openid as a scope. Again, this applies only to interactive authentication.

# ID token with AAD v1.0
# if you only want an ID token, set the resource to blank ("")
tok <- get_azure_token("", "mytenant", "app_id")
tok$credentials$id_token

# ID token with AAD v2.0
tok2 <- get_azure_token(c("openid", "offline_access"), "mytenant", "app_id", version=2)
tok2$credentials$id_token

Caching

AzureAuth caches tokens based on all the inputs to get_azure_token, as listed above. It defines its own directory for cached tokens, using the rappdirs package. On recent Windows versions, this will usually be in the location C:\Users\(username)\AppData\Local\AzureR. On Linux, it will be in ~/.local/share/AzureR, and on MacOS, it will be in ~/Library/Application Support/AzureR. Note that a single directory is used for all tokens, and the working directory is not touched (which significantly lessens the risk of accidentally introducing cached tokens into source control).

For reasons of CRAN policy, the first time that AzureAuth is loaded, it will prompt you for permission to create this directory. Unless you have a specific reason otherwise, it’s recommended that you allow the directory to be created. Note that most other cloud engineering tools save credentials in this way, including Docker, Kubernetes, and the Azure CLI itself. The prompt only appears in an interactive session; if AzureAuth is loaded in a batch script, the directory is not created if it doesn’t already exist.

To list all cached tokens on disk, use list_azure_tokens. This returns a list of token objects, named according to their MD5 hashes.

To delete a cached token, use delete_azure_token. This takes the same inputs as get_azure_token, or you can specify the MD5 hash directly in the hash argument. To delete all cached tokens, use clean_token_directory.

# list all tokens
list_azure_tokens()

# <... list of token objects ...>

# delete a token
delete_azure_token("myresource", "mytenant", "app_id",
                   password="client_credentials", auth_type="client_credentials")

Refreshing

A token object can be refreshed by calling its refresh() method. If the token’s credentials contain a refresh token, this is used; otherwise a new access token is obtained by reauthenticating. In most situations you don’t need to worry about this, as the AzureR packages will check if the credentials have expired and automatically refresh them for you.

One scenario where you might want to refresh manually is using a token for one resource to obtain a token for another resource. Note that in AAD, a refresh token can be used to obtain an access token for any resource or scope that you have permissions for. Thus, for example, you could use a refresh token issued on a request for https://management.azure.com to obtain a new access token for https://graph.microsoft.com (assuming you’ve been granted permission).

To obtain an access token for a new resource, change the object’s resource (for an AAD v1.0 token) or scope field (for an AAD v2.0 token) before calling refresh(). If you also want to retain the token for the old resource, you should call the clone() method first to create a copy.

# use a refresh token from one resource to get an access token for another resource
tok <- get_azure_token("https://myresource", "mytenant", "app_id")
tok2 <- tok$clone()
tok2$resource <- "https://anotherresource"
tok2$refresh()

# same for AAD v2.0
tok <- get_azure_token(c("https://myresource/.default", "offline_access"),
    "mytenant", "app_id", version=2)
tok2 <- tok$clone()
tok2$scope <- c("https://anotherresource/.default", "offline_access")
tok2$refresh()

More information

For the details on Azure Active Directory, consult the Microsoft documentation.