Rubrik : Using GraphQL


Background

Sending REST API calls to the cluster is sometimes slow and not very efficient. I came across some interesting posts on that topic. While REST returns the complete result set (including the fields you are not interested in), there is another query type that returns only what you ask for: GraphQL. Let's have a look!

GraphQL

GraphQL is a query language created by Facebook in 2012. It has been fully open source since 2015 and is actively maintained by the community. The main purpose of this approach is performance. Because only the requested data is returned, you save on both data transfer and data processing. There is no need to sort the result in your code: most of the sorting and selection is done on the backend, before the data is transmitted.

Rubrik

Rubrik started to use GraphQL a few years ago in the CDM UI, and more extensively with the Polaris framework.
This is very interesting when creating elaborate scripts! I'm thinking of the Rubrik Central script: it would be awesome if all the sorting were done on the Rubrik cluster before the data is sent back. Performance-wise, I'm sure that would bring a lot of benefits.

When using the Rubrik API Code Capture, I had already noticed some GraphQL queries floating around.

So, why not give it a try!

Rubrik Playground

There is an endpoint in the Rubrik playground located at https://<cluster_ip>/docs/internal/playground/#/%2Fgraphql, but it is empty and offers no way to try anything.


Instead, Rubrik created a companion application called Rubrik GraphQL Playground, available on GitHub here. It has been released to the community as an open source project, and you can get it for Windows and macOS as well as the source code, so I believe some folks will be able to run it on Linux.


When you start the application, you have to choose between Polaris and CDM. Depending on your choice, the application loads the matching schema and the related documentation.

The goal here is not to explain how GraphQL works; there are plenty of tutorials and documentation available. The official website is always a great starting point.

Next, you can check this document from Drew Russel, which presents the basics of working with GraphQL and Rubrik.

My First query!

So, when you start the GraphQL Playground and choose CDM, you will see a window similar to this one:


Type this query in the left pane:

query ClusterDetails {
  cluster(id: "me") {
    version
    id
    brikCount
    isSingle
    isBootstrapped
  }
}

Click the "Play" button at the top and you should see a similar result:

{
  "data": {
    "cluster": {
      "isSingle": false,
      "brikCount": 1,
      "version": "6.0.2-13213",
      "id": "me",
      "isBootstrapped": true
    }
  }
}

Congratulations, this is your first GraphQL query!

As you can see, both the query and the result are very easy to read: the result is plain JSON, and the query follows the same shape.
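Because the response is plain JSON, reading it from a script takes only a few lines. Here is a minimal sketch in PHP, assuming the response body shown above has been captured in a string (the variable names are purely illustrative):

```php
<?php
// Response body copied from the playground result above.
$body = '{"data":{"cluster":{"isSingle":false,"brikCount":1,"version":"6.0.2-13213","id":"me","isBootstrapped":true}}}';

// json_decode() returns nested stdClass objects by default.
$response = json_decode($body);
$cluster = $response->data->cluster;

echo "Version: ".$cluster->version."\n";
echo "Briks:   ".$cluster->brikCount."\n";
```

Every field requested in the query is reachable under data, with the same nesting as the query itself.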

What's next ?

Next? What about PHP? Yes, we can do it. The query is sent with the POST method, and we can easily reuse what has been done in the past.

function rkGraphQL($clusterConnect,$query)
{
    $API="/api/internal/graphql";

    // The query must be wrapped in a valid JSON string: {"query": "$query"}.
    // Double quotes inside $query must already be escaped (\").
    $config_params="{\"query\":\"".$query."\"}";
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_POST, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS,$config_params);
    curl_setopt($curl, CURLOPT_USERPWD, $clusterConnect["username"].":".$clusterConnect["password"]);
    curl_setopt($curl, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
    curl_setopt($curl, CURLOPT_URL, "https://".$clusterConnect["ip"].$API);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($curl);
    curl_close($curl);

    return(json_decode($result));
}

The above function is really straightforward: it takes the $query variable, wraps it in the JSON payload and sends it as is to the relevant endpoint. That's it! The result is simply decoded into a PHP object, which is the easiest way to parse the result and pick what we want.

The query is this one:

$query="query Cluster{cluster(id:\\\"me\\\"){version id brikCount isSingle isBootstrapped}}";

Pay attention to the escape characters: the double quotes around me have to survive both the PHP string and the JSON wrapper added by the function, hence the \\\". It took me some time to figure out how to format the query without errors.
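One way to avoid escaping by hand is to let json_encode() build the payload. This is only a sketch and not what rkGraphQL() does above: the helper name rkBuildPayload is hypothetical, and it expects the raw GraphQL text with ordinary, unescaped double quotes.

```php
<?php
// Hypothetical helper: wraps a raw GraphQL query into {"query": "..."}.
// json_encode() takes care of escaping quotes, backslashes and newlines.
function rkBuildPayload($query)
{
    return json_encode(array("query" => $query));
}

// Single-quoted PHP string, so the double quotes need no escaping at all.
$query = 'query Cluster{cluster(id:"me"){version id brikCount isSingle isBootstrapped}}';
$payload = rkBuildPayload($query);
echo $payload."\n";
```

The resulting string could be handed directly to CURLOPT_POSTFIELDS instead of the hand-built $config_params.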

When called, this function returns the requested details:

$ php GraphQL.php
object(stdClass)#4 (1) {
  ["data"]=>
  object(stdClass)#3 (1) {
    ["cluster"]=>
    object(stdClass)#2 (7) {
      ["isSingle"]=>
      bool(false)
      ["__typename"]=>
      string(7) "Cluster"
      ["brikCount"]=>
      int(1)
      ["version"]=>
      string(11) "6.0.2-13213"
      ["id"]=>
      string(2) "me"
      ["isRegistered"]=>
      bool(false)
      ["isBootstrapped"]=>
      bool(true)
    }
  }
}
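Note that rkGraphQL() returns whatever json_decode() produces, so a failed call comes back as null, and a rejected query comes back without the expected data. Per the GraphQL specification, query errors are reported in a top-level errors list. Below is a minimal sketch of a guard, where rkCheckResult is a hypothetical name and the sample error message is invented for the illustration:

```php
<?php
// Hypothetical guard around a decoded GraphQL response.
// Returns the "data" member, or throws if the call or the query failed.
function rkCheckResult($response)
{
    if (!is_object($response)) {
        throw new RuntimeException("No valid JSON response received");
    }
    if (!empty($response->errors)) {
        // Each error entry carries a human-readable "message" field.
        throw new RuntimeException("GraphQL error: ".$response->errors[0]->message);
    }
    return $response->data;
}

// Illustrative responses (the error text is made up for this example).
$ok = json_decode('{"data":{"cluster":{"id":"me"}}}');
$ko = json_decode('{"errors":[{"message":"Cannot query field \"foo\""}]}');

echo rkCheckResult($ok)->cluster->id."\n";
try {
    rkCheckResult($ko);
} catch (RuntimeException $e) {
    echo $e->getMessage()."\n";
}
```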

Polaris

The GraphQL implementation at Rubrik follows two very different tracks: CDM and Polaris.

First, on the CDM side, you can only query; you cannot make any changes (mutations, in GraphQL terms). In addition, the schema exposed by CDM is very limited, so you cannot do much as of today, but this might evolve in the future.

Second, Polaris provides many more capabilities. The entire Polaris UI is GraphQL-based, which is of course oriented towards performance and flexibility. Rubrik provides an API Code Capture facility (more details here). This tool makes it easier to reverse engineer the UI and get the queries right ;)

Sample Polaris GraphQL query to get all clusters from the GPS main page:

query DataSitesMapQuery {
  clusterConnection(filter: {type: [OnPrem]}) {
    nodes {
      id
      name
      version
      estimatedRunway      
      status

      geoLocation {
        address
        latitude
        longitude
      }
    }
  }
}

Result : 

{
  "data": {
    "clusterConnection": {
      "nodes": [
        {
          "id": "c96b3e02-ea58-4ef5-xxxx-xxxxxxxx",
          "name": "rub01",
          "version": "6.0.2-p2-13398",
          "estimatedRunway": 370,
          "status": "Connected",
          "geoLocation": {
            "address": "Asia",
            "latitude": 22.xx,
            "longitude": 114.xx
          }
        },
        {
          "id": "590b1ae4-202d-4d0f-xxxx-xxxxxxxx",
          "name": "rub02",
          "version": "6.0.2-p2-13398",
          "estimatedRunway": 682,
          "status": "Connected",
          "geoLocation": {
            "address": "USA",
            "latitude": 38.xx,
            "longitude": -77.xx
          }
        }
      ]
    }
  }
}

It returns the two clusters that I currently have in Polaris. I have of course obfuscated some of the information for security reasons.

Now, how do we script this query outside of the playground?

Polaris Token

The complex part is the authentication to Polaris. We need a service account and a token (used as a bearer).
To create a service account, simply go to the Polaris UI, click the gear icon, then Users and Roles.


Click "Add service account"


Choose an appropriate name


Select the role.

The next screen is very important: this is where your token credentials are displayed and can be saved as JSON. You absolutely need to keep that JSON file in a safe place, as it gives access to your Polaris tenant with the roles selected previously.


Click on Download as JSON.

The next step is to create a bearer. This is the token that will authenticate you when sending GraphQL queries to Polaris.

You need to send a POST request to your tenant URL with the following payload:

{
  "client_id": "client_id_from_json",
  "client_secret": "client_secret_from_json",
  "grant_type": "client_credentials"
}

Remember the JSON file you downloaded above? It contains the values you need to put in the payload. So, replace client_id_from_json and client_secret_from_json with the values from your own file.

To get the bearer (a unique token for authentication), you need to cURL the endpoint that issues it. Replace the placeholders in angle brackets below with your own details.

$ curl --request POST --header "Content-Type: application/json" --data '{  "client_secret": "<client_secret>",  "client_id": "<client_id>",  "grant_type": "client_credentials"}' https://<my_tenant>.my.rubrik.com/api/client_token

This request returns the bearer in the access_token field. It is a very long string:

{"client_id":"client|Cx7gyixxxxxxxxxxx8ZcSS7m79R","access_token":"xxxxxxxx"}
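The same request can be sent from PHP. The sketch below only assumes the endpoint, payload and response format shown above; the function names rkTokenPayload, rkExtractToken and rkRequestToken are hypothetical and are not necessarily how the framework's own token function is written.

```php
<?php
// Build the JSON payload from the values found in the service account file.
function rkTokenPayload($clientId, $clientSecret)
{
    return json_encode(array(
        "client_id" => $clientId,
        "client_secret" => $clientSecret,
        "grant_type" => "client_credentials"
    ));
}

// Pull the access_token out of the raw response body; null if it is absent.
function rkExtractToken($body)
{
    $decoded = json_decode((string)$body);
    return isset($decoded->access_token) ? $decoded->access_token : null;
}

// POST the payload to the tenant and return the bearer (needs network access).
function rkRequestToken($clientId, $clientSecret, $tenant)
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, "https://".$tenant.".my.rubrik.com/api/client_token");
    curl_setopt($curl, CURLOPT_POST, true);
    curl_setopt($curl, CURLOPT_POSTFIELDS, rkTokenPayload($clientId, $clientSecret));
    curl_setopt($curl, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($curl);
    curl_close($curl);

    return rkExtractToken($body);
}
```

Keeping the payload and the parsing in their own small functions makes them easy to check without contacting the tenant.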

Of course, the access_token above has been changed for obvious security reasons. Keep that token safe, you will use it as the bearer. On my side, the bearer is 920 characters long! I initially assumed that the bearer would stay the same as long as the service account was not changed, so that this step would only have to be done once; see the edit below. Once you have the bearer, it is used in every query from now on.

[EDIT 1st Feb 2022]: the token has a TTL of 24 hours, which means that you have to generate a new token every day. I have added a function that generates a token, so I recommend calling it at the top of each Polaris-related script to avoid any issues.
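Since the token lives for 24 hours, a script that runs often does not strictly need a new one on every run. Here is a minimal sketch of a freshness check, assuming you store the time at which the token was obtained next to the token itself; the function name, the cache layout and the five-minute safety margin are all illustrative choices:

```php
<?php
// Token lifetime stated above: 24 hours, expressed in seconds.
define("RK_TOKEN_TTL", 24 * 3600);

// Hypothetical check: $cache is array("token" => ..., "obtained_at" => unix time).
// Returns true while the token is younger than the TTL minus a safety margin.
function rkTokenIsFresh($cache, $now, $margin = 300)
{
    if (!is_array($cache) || empty($cache["token"]) || !isset($cache["obtained_at"])) {
        return false;
    }
    return ($now - $cache["obtained_at"]) < (RK_TOKEN_TTL - $margin);
}

// Example: a token obtained one hour ago.
$cache = array("token" => "xxxxxxxx", "obtained_at" => time() - 3600);
echo rkTokenIsFresh($cache, time()) ? "reuse cached token\n" : "request a new token\n";
```

Requesting a new token at the top of each script remains the simplest option; a check like this only matters if you want to limit the number of token requests.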

Do you remember the query above? We will format it as a JSON string:

{
"query" : "query DataSitesMapQuery {clusterConnection(filter: {type: [OnPrem]}) { nodes { id name version estimatedRunway status geoLocation { address latitude longitude }}}}"
}

Now, we can run this query against our Polaris tenant to get the same results as we had in the playground before.

$ curl -X POST -H 'Content-Type: application/json' -H 'Authorization: Bearer xxxxxxxx' -d '{"query" : "query DataSitesMapQuery {clusterConnection(filter: {type: [OnPrem]}) { nodes { id name version estimatedRunway status geoLocation { address latitude longitude }}}}"}' https://<my_tenant>.my.rubrik.com/api/graphql -k

Replace <my_tenant> and the bearer (xxxxxxxx) with the values matching your environment.

{"data":{"clusterConnection":{"nodes":[{"id":"c96b3e02-ea58-4ef5-xxxx-xxxxxxxx","name":"rub01","version":"6.0.2-p2-13398","estimatedRunway":392,"status":"Connected","geoLocation":{"address":"Asia","latitude":22.xx,"longitude":114.xx}},{"id":"590b1ae4-202d-4d0f-xxxx-xxxxxxxx","name":"rub02","version":"6.0.2-p2-13398","estimatedRunway":481,"status":"Connected","geoLocation":{"address":"USA","latitude":38.xx,"longitude":-77.xx}}]}}}

Let's try this with PHP

Same philosophy as for the CDM implementation, but this time we are not using a username and password but a token, and we have to send the query to our tenant URL.

Let's see !

$ php GraphQL.php
string(163) "query DataSitesMapQuery {clusterConnection(filter: {type: [OnPrem]}) { nodes { id name version estimatedRunway status geoLocation { address latitude longitude }}}}"

object(stdClass)#10 (1) {
  ["data"]=>
  object(stdClass)#9 (1) {
    ["clusterConnection"]=>
    object(stdClass)#8 (1) {
      ["nodes"]=>
      array(3) {
        [0]=>
        object(stdClass)#2 (6) {
          ["id"]=>
          string(36) "c96b3e02-ea58-4ef5-xxxx-xxxxxxxx"
          ["name"]=>
          string(11) "rub01"
          ["version"]=>
          string(23) "6.0.2-p2-13398"
          ["estimatedRunway"]=>
          int(426)
          ["status"]=>
          string(9) "Connected"
          ["geoLocation"]=>
          object(stdClass)#3 (3) {
            ["address"]=>
            string(19) "Asia"
            ["latitude"]=>
            float(22.xx)
            ["longitude"]=>
            float(114.xx)
          }
        }
        [1]=>
        object(stdClass)#4 (6) {
          ["id"]=>
          string(36) "590b1ae4-202d-4d0f-xxxx-xxxxxxxx"
          ["name"]=>
          string(11) "rub02"
          ["version"]=>
          string(14) "6.0.2-p2-13398"
          ["estimatedRunway"]=>
          int(436)
          ["status"]=>
          string(9) "Connected"
          ["geoLocation"]=>
          object(stdClass)#5 (3) {
            ["address"]=>
            string(46) "USA"
            ["latitude"]=>
            float(38.xx)
            ["longitude"]=>
            float(-77.xx)
          }
        }
      }
    }
  }
}

The following PHP function sends the query to the specified endpoint:

function rkPolGraphQL($polarisConnect,$query)
{
    $API="/api/graphql";

    // Query must be encapsulated into a valid JSON string. {"query": "$query"}
    $config_params="{\"query\":\"".$query."\"}";
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_POST, true);
    curl_setopt($curl, CURLOPT_POSTFIELDS,$config_params);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_HTTPHEADER, array('Authorization: Bearer '.$polarisConnect["token"],'Content-Type: application/json'));
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
    curl_setopt($curl, CURLOPT_URL, "https://".$polarisConnect["tenant"].".my.rubrik.com".$API);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($curl);
    curl_close($curl);

    return(json_decode($result));
}

Sample call : 

$tenant="my_tenant";
// Polaris credentials
$polarisConnect=array(
"token" => rkpolGetToken($clientID,$clientSecret,$tenant),
"tenant" => $tenant
);
$polarisQuery="query DataSitesMapQuery {clusterConnection(filter: {type: [OnPrem]}) { nodes { id name version estimatedRunway status geoLocation { address latitude longitude }}}}";
$res=rkPolGraphQL($polarisConnect,$polarisQuery);

$res contains the result of the query.
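As an illustration of what can be done with $res, the sketch below walks through the nodes and builds one line per cluster. It runs on a hand-made sample shaped like the response above (a subset of the fields, with values taken from the playground result), so it works without a tenant; rkListClusters is a hypothetical name.

```php
<?php
// Hypothetical helper: returns one summary line per cluster node.
function rkListClusters($res)
{
    $lines = array();
    foreach ($res->data->clusterConnection->nodes as $node) {
        $lines[] = sprintf(
            "%s (%s): %s, runway %d",
            $node->name,
            $node->version,
            $node->status,
            $node->estimatedRunway
        );
    }
    return $lines;
}

// Sample shaped like the Polaris response shown earlier.
$sample = json_decode('{"data":{"clusterConnection":{"nodes":['
    .'{"name":"rub01","version":"6.0.2-p2-13398","estimatedRunway":370,"status":"Connected"},'
    .'{"name":"rub02","version":"6.0.2-p2-13398","estimatedRunway":682,"status":"Connected"}'
    .']}}}');

foreach (rkListClusters($sample) as $line) {
    echo $line."\n";
}
```

In a real script, $sample would simply be replaced by the $res returned by rkPolGraphQL().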

This function has been added to the PHP framework on my GitHub repository, and the GraphQL example is here.

I can imagine you will have to read this a couple of times to get it right. It was not easy for me to get it at first. If this is confusing, let me know.

This opens a brand new world of efficient queries. The complex part is to learn the schema of the various endpoints (CDM and Polaris have different schemas) and start to play around.

I really hope this helps others; it took me a few days to collect information from various sources and compile it into one place. Do not hesitate to comment below if you have any questions, issues or concerns.

Special thanks to Jaap Brasser and the Rubrik support team for answering my many questions ;)

See you next time !




