Splunk Search

How to create an interesting field "statusCode" and have it sorted by different statusCode values?

dickersons
Explorer

Hi,

I am using a regex to extract a field "statusCode", which can have multiple values, e.g. "200", "400", "500", etc.  I am attempting to create an Interesting Field "statusCode" and have the results sorted by the different statusCode values.

I am trying to perform a search using the following:

 

 

\\Sample Query
index=myCoolIndex cluster_name="myCoolCluster" sourcetype=myCoolSourceType label_app=myCoolAppName ("\"statusCode\"") | rex field=_raw \"statusCode\"\s:\s\"?(?<statusCode>2\d{2}|4\d{2}|5\d{2})\"?

\\Sample Log (looks like a JSON object, but it's a string):
"{
  "correlationId" : "",
  "message" : "",
  "tracePoint" : "",
  "priority" : "",
  "category" : "",
  "elapsed" : 0,
  "locationInfo" : {
    "lineInFile" : "",
    "component" : "",
    "fileName" : "",
    "rootContainer" : ""
  },
  "timestamp" : "",
  "content" : {
    "message" : "",
    "originalError" : {
      "statusCode" : "200",
      "errorPayload" : {
        "error" : ""
      }
    },
    "standardizedError" : {
      "statusCode" : "400",
      "errorPayload" : {
        "errors" : [ {
          "error" : {
            "traceId" : "",
            "errorCode" : "",
            "errorDescription" : "",
            "errorDetails" : ""
          }
        } ]
      }
    },
    "standardizedError" : {
      "statusCode" : "500",
      "errorPayload" : {
        "errors" : [ {
          "error" : {
            "traceId" : "",
            "errorCode" : "",
            "errorDescription" : ""
            "errorDetails" : ""
          }
        } ]
      }
    }
  },
}"

 

 

Using online regex tools and a sample output of a log I have confirmed the regEx works outside of a Splunk query.  I have also gone through numerous Splunk community threads where I have tried different permutations based on suggestions with no luck.  Any help would be appreciated.

 

1 Solution

yuanliu
SplunkTrust

It is not completely true that spath only works on conformant JSON; Splunk does try extra hard to deal with nonconformant JSON.  The nice thing about spath, even with partially conformant data, is that it gives you the path, so you know exactly which segment of the data a value comes from; sometimes that is critical to know.

If you don't need to know the path, you can still use regex and hope that your developer doesn't change the format in the future.  The command you gave did not quote the regex.  I just added quotes and it works fine.  In the following, I also added max_match=0 to capture all occurrences of the pattern.

| rex max_match=0 "\"statusCode\"\s:\s\"?(?<statusCode>2\d{2}|4\d{2}|5\d{2})\""
``` no need to specify field; default is _raw ```
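For anyone who wants to try the quoted rex without access to the original index, here is a minimal sketch that can be pasted into a search bar. It is an illustration under assumptions, not part of the original answer: the event is a shortened, hypothetical version of the sample log built with makeresults and eval, and the mvexpand, stats, and sort stages are one possible way to get a row per statusCode value, which seems to be what the question title asks for.

```spl
| makeresults
| eval _raw="{ \"originalError\" : { \"statusCode\" : \"200\" }, \"standardizedError\" : { \"statusCode\" : \"400\" }, \"standardizedError\" : { \"statusCode\" : \"500\" } }"
| rex max_match=0 "\"statusCode\"\s:\s\"?(?<statusCode>2\d{2}|4\d{2}|5\d{2})\""
| mvexpand statusCode
| stats count by statusCode
| sort statusCode
```

Here rex with max_match=0 turns statusCode into a multivalue field holding every match in the event, mvexpand splits that into one result per value, and stats count by statusCode groups them. On the synthetic event this should give one row each for 200, 400, and 500.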

 


dickersons
Explorer

Had to add a "?" at the end, after the 2nd-to-last double quote, but it worked like a charm; this is a lifesaver.  I absolutely understand what you are getting at in terms of using regEx; if it were up to me, everything would be in a structured format.  Thanks again for the assistance, it is much appreciated!
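The adjustment described above can be sketched as follows. The trailing "?" makes the closing double quote optional, so the pattern no longer requires a quote after the digits. The synthetic event below is an assumption made for illustration (it is not from the original logs): it mixes a quoted value with an unquoted one, which is one situation where the extra "?" would matter.

```spl
| makeresults
| eval _raw="{ \"originalError\" : { \"statusCode\" : \"200\" }, \"standardizedError\" : { \"statusCode\" : 404 } }"
| rex max_match=0 "\"statusCode\"\s:\s\"?(?<statusCode>2\d{2}|4\d{2}|5\d{2})\"?"
| table statusCode
```

With both quotes optional, statusCode should come back as a multivalue field containing 200 and 404. Without the trailing "?", the unquoted 404 would not match, because the pattern would still demand a closing quote.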


dickersons
Explorer

It should return all values; the expected goal is to separate those values by their statusCode.  One response can have multiple values. I am looking to find all of those values and then separate them once they have been categorized as "statusCode".  "spath" ONLY works (according to Splunk) with XML and JSON, so "spath" is not a way forward as a solution.

dickersons
Explorer

Hi,

Thanks for the response, but unfortunately the entirety of the response is in string format, not XML or JSON.  The content looks like JSON, but it is a string, which is why I am attempting to use regEx to extract statusCode from the string.  I am not too concerned about multiple values, as I am going to "dedup" based on another string field extraction for the traceId of the response.  Any suggestion regarding the regEx is appreciated.  Again, the regEx works in regex101 and only fails in a Splunk query.


yuanliu
SplunkTrust

What is the expected return?  In originalError, you have 200, then in standardizedError, you have 400 and 500. Do you want them all or do you want them separate?

I get the feeling that your developer botched the log format when they really intended it to be JSON.  So, the first action is perhaps to ask the developers to fix the log.  But even without that, spath happens to be able to extract those fields just as well, at least on the illustrated sample.

 

| spath
| table content.originalError.statusCode content.standardizedError.statusCode

 

content.originalError.statusCode    content.standardizedError.statusCode
200                                 400
                                    500
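The spath approach can also be tried on a synthetic event. This is a minimal sketch under assumptions: the event is a small, well-formed JSON string invented for illustration (unlike the real log, it has no duplicate keys or missing commas), and the short field names originalStatus and standardizedStatus are hypothetical names introduced only to make grouping easier.

```spl
| makeresults
| eval _raw="{ \"content\" : { \"originalError\" : { \"statusCode\" : \"200\" }, \"standardizedError\" : { \"statusCode\" : \"400\" } } }"
| spath
| rename "content.originalError.statusCode" AS originalStatus
| rename "content.standardizedError.statusCode" AS standardizedStatus
| stats count by originalStatus standardizedStatus
```

Because spath keeps the full path in each field name, the two status codes stay distinguishable by where they came from, which the regex approach does not preserve.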