Solved: Multiple queries to extract one value

rh417692 · ‎05-16-2017

I eventually have to find out the 'N-' value. When I do this manually, I have to do it like this:

Query: index="prod" "MainRefinementMenuHandler exceptionjava.lang.NullPointerException"
Result: 16:09:11,714 ERROR assembler.cartridge.handler.MainRefinementMenuHandler JSESSIONID:hT0NrpquJyI47wOYg28J8oTv.PS012.PROD1A, PROFILE_ID:2870650047, APACHE_UNIQUE_ID:WRoK5wpGHxQAAFf2XJkAAAAL, SOURCE_IP:55.55.555.555, Client-Source-IP:55.555.555.555, MainRefinementMenuHandler exceptionjava.lang.NullPointerException`From the above result, I need to capture the JSESSIONID for the next query. Need help with that JSESSIONID extraction. Trying to use rex but unsuccessful so far
Using the JSESSIONID, I use this query to find the 'N-' value
Query: index="prod" "hT0NrpquJyI47wOYg28J8oTv" | reverse
Result: (result 1) ....
(result 2) ....
(result 3) 16:09:11,716 ERROR .droplet.InvokeAssembler JSESSIONID:hT0NrpquJyI47wOYg28J8oTv.PS012.PROD1A, PROFILE_ID:1234567891, APACHE_UNIQUE_ID:WRoK5wpGHxQAAFf2XJkAAAAL, SOURCE_IP:55.55.555.555, Client-Source-IP:55.555.555.555, Servicing 'error' oparam. An exception occurred invoking the Assembler with ContentItem {content={@type=PageSlot, name=Dynamic Page Slot,
contentPaths=[/content/Navigation/CategoryPages], templateTypes=[Page], templateIds=[], ruleLimit=1, endeca:siteRootPath=/pages, endeca:contentPath=//category, contents=[{@type=OneColumnPage, name=One Column Page, mainContent=[{removeAllAction=com.endeca.infront.cartridge.model.NavigationAction@629453c8, refinementCrumbs=[com.endeca.infront.cartridge.model.RefinementBreadcrumb@7808a207,com.endeca.infront.cartridge.model.RefinementBreadcrumb@7ec5b721], geoFilterCrumb=null, @type=Breadcrumbs, name=Breadcrumbs, searchCrumbs=[], rangeFilterCrumbs=[]}, {@type=ContentSlotMain, name=FeatureZone, contentPaths=[/content/Contents/CategoryPageContents/Category Headers], templateTypes=[MainContent], templateIds=[], ruleLimit=2, contents=[]}, {@type=ContentSlotMain, name=TextBreacrumbZone, contentPaths=[/content/Contents/CategoryPageContents/TextBreadCrumb], templateTypes=[MainContent], templateIds=[], ruleLimit=1, contents=[{@type=TextBreadcrumb, name=Default, textBreadcrumb=}]}, {@type=ContentSlotMain, name=RefineZone, contentPaths=[/content/Contents/CategoryPageContents/Refinement Zone], templateTypes=[MainContent], templateIds=[], ruleLimit=1, contents=[{@type=MainGuidedNavigation, name=Navigation Container, navigation=[{refinements=null, multiSelect=true, expandRefinement=false, allDeadEnd=yes, @type=MainRefinementMenu, name=Gender, ancestors=[com.endeca.infront.cartridge.model.Ancestor@667a188c], displayName=gender, dimensionName=gender, whyPrecedenceRuleFired=null}, {refinements=null, multiSelect=true, expandRefinement=false, allDeadEnd=yes, @type=MainRefinementMenu, name=Occasion, ancestors=[com.endeca.infront.cartridge.model.Ancestor@59e1c3e3], displayName=occassion, dimensionName=occassion, whyPrecedenceRuleFired=null}]}]},{@type=ContentSlotMain, name=ResultList Zone, contentPaths=[/content/Contents/CategoryPageContents/ResultList Zone], templateTypes=[MainContent], templateIds=[], ruleLimit=1, contents=[{subRecordFieldNames=[colorCode, color, heelHeight, toeShape, materials, width, listPrice, sku.isClearanceItem, sku.inventory], sortOption=product.isClearance|0||product.inventory|1||product.daysAvailable|0||rating|1, subRecordsPerAggregateRecord=ALL, fieldNames=[product.displayName, product.productTypeWeb, product.repositoryId, msrp, nonMemberPrice, product.nonMemberMinPrice, product.nonMemberMaxPrice, isClearance, product.defaultColorCode, brand, gender, product.category, product.originalPrice, product.showPriceInCart, allAncestors.repositoryId, product.hasAnimatedImage], boostStrata=[], @type=ResultsList, name=Results List, relRankStrategy=nterms,maxfield,exact,static(product.analytics.conversion_rate,descending), buryStrata=[], offset=0, recordsPerPage=90}]}, {@type=ContentSlotMain, name=Main Content Slot, contentPaths=[/content/Contents/CategoryPageContents/SEO Content], templateTypes=[MainContent], templateIds=[], ruleLimit=1, contents=[]}]}]}, contentUri=/pages/category, @type=RedirectAwareContentInclude}com.endeca.infront.assembler.AssemblerException: java.lang.NullPointerException
(result 4) 16:09:11,716 INFO com.commerce.common.servlet.SessionIDRequestFilter [REQUEST_URI:/rest/model/com/commerce/endeca/controller/v1_0/content/ContentActor/getPageContent/_/N-27c8Z1z141ggZ1z13vpx, METHOD:GET, FORWARDED_PATH:null, QUERY_STRING:No=0&_dynSessConf=-3081247017807204448&locale=en_US&pagePath=/pages/category&pushSite=&skipHeaderFooterContent=true, JSESSIONID:hT0NrpquJyI47wOYg28J8oTv.012.PROD1A, UNIQUE_ID:WRoK5wpGHxQAAFf2XJkAAAAL, SOURCE_IP:55.555.555.555, Client-Source-IP:55.555.555.555, DYN_USER_ID:null, X-Akamai-Edgescape:, CLIENT_DEVICE_TYPE:tablet, EXECUTION_TIME: 29 ms]

Basically, I need the 'N-' value that is below the big result in (result 3) as that is the bad 'N-' value I need. Any other random 'N-' value are good values that I do not need to capture. Need a query for this.

There are multiple queries for this. Any help in finding the entire solution or even part of it would be great. Thanks!

gcusello · ‎05-17-2017

Hi rh417692,
I'm not sure about your request.

Anyway the regex to extract the bold value in item 1 is

JSESSIONID:(?<JSESSIONID>[^\.]*)\.

the regex to extract the bold value in item 2 is

\[REQUEST_URI:.*(?<field>N-[^,]*),

Bye.
Giuseppe

View solution in original post

gcusello · ‎05-17-2017

Hi rh417692,
I'm not sure about your request.

Anyway the regex to extract the bold value in item 1 is

JSESSIONID:(?<JSESSIONID>[^\.]*)\.

the regex to extract the bold value in item 2 is

\[REQUEST_URI:.*(?<field>N-[^,]*),

Bye.
Giuseppe

rh417692 · ‎05-17-2017

Hi @cusello. Using the below search query, I get unique JSESSIONIDs. How would I add your second query string to this query:

index="atg_prod" "MainRefinementMenuHandler exceptionjava.lang.NullPointerException" | rex "JSESSIONID:(?[^.]+)" | dedup JSESSIONID

I'm very new to Splunk so apologies if this seems trivial...

cmerriman · ‎05-17-2017

if my above JSESSIONID rex works, would something like this work for combining it all? i could be way off base, but this is just an idea from reading the above. adding a subsearch for your first search to get all of the JSESSIONIDs into the second search and then doing a streamstats to search for only the 3rd result.

index="atg_prod1a" [search index="atg_prod" "MainRefinementMenuHandler exceptionjava.lang.NullPointerException"|rex field=_raw JSESSIONID:(?<JSESSIONID>\w+)|stats count by JSESSIONID |fields - count |format] | reverse|streamstats count by JSESSIONID|search count=3

rh417692 · ‎05-17-2017

I'm currently using this query to get unique JSESSIONIDs:

index="atg_prod" "MainRefinementMenuHandler exceptionjava.lang.NullPointerException" | rex "JSESSIONID:(?[^.]+)" | dedup JSESSIONID

With your query, I actually do not see any result. Maybe adding this would help?

\[REQUEST_URI:.(?N-[^,]),

I just don't know how to add it or if needs editing as I'm so new to Splunk

cmerriman · ‎05-17-2017

i just realized that my subsearch wouldn't work anyways since you don't have JSESSIONID defined in that index. try something like this, maybe

index="atg_prod1a"  | rex "JSESSIONID\:(?<JSESSIONID>[^.]+)"|join JSESSIONID [search index="atg_prod" "MainRefinementMenuHandler exceptionjava.lang.NullPointerException" | rex "JSESSIONID\:(?<JSESSIONID>[^.]+)"|stats count by JSESSIONID |fields - count ] | reverse|streamstats count by JSESSIONID|search count=3|rex "\[REQUEST_URI:.(?<REQUEST_URI>N-[^,]),"|table REQUEST_URI

rh417692 · ‎05-17-2017

We're almost there @cmerriman. It seems to be getting a result but it does not display anything. When I remove the 'table REQUEST_URI, I get the result with all details when I only need the 'N-' value. Is there any way we can see only the 'N-' values?

cmerriman · ‎05-18-2017

when you remove the |table REQUEST_URI and look over at the list of available fields in verbose mode on the left hand side, is REQUEST_URI in there? If so, is it the N- value you are looking for? if not, it's likely a problem with the rex command. I just noticed there is a difference in the rex i have above and the rex @cusello provided. try this:

 index="atg_prod1a"  | rex "JSESSIONID\:(?<JSESSIONID>[^.]+)"|join JSESSIONID [search index="atg_prod" "MainRefinementMenuHandler exceptionjava.lang.NullPointerException" | rex "JSESSIONID\:(?<JSESSIONID>[^.]+)"|stats count by JSESSIONID |fields - count ] | reverse|streamstats count by JSESSIONID|search count=3|rex "\[REQUEST_URI:.*(?<REQUEST_URI>N-[^,]*),"|table REQUEST_URI

rh417692 · ‎05-19-2017

I'm getting results but the rows in the table does not show any value. What could potentially be the reason?
I feel we're really close to achieving our goal. Thank you so much for your help so far!!

cmerriman · ‎05-19-2017

when you're in verbose mode, what are the available fields listed on the left hand side? do you have REQUEST_URI? if so, what is the value? any other important fields/values listed? are they accurate? it could be the regex being inaccurate.

rh417692 · ‎05-19-2017

In verbose mode, I'm seeing all fields and there's no REQUEST_URI field anywhere, unfortunately. There's JSESSIONID listed among other fields but apart from that field, I can't think of any other field to be equally important displayed.

cmerriman · ‎05-19-2017

i think it's a problem with the regex statement for REQUEST_URI. in the events tab, are you getting the correct events back before the table statement? I'm not sure why that regex wouldn't work. perhaps try something like this:

|rex "REQUEST_URI:.(?N-.), METHOD"

regex101.com is a great tool to tweak regex statements. you might have to just tweak it until you get that field extracted.

rh417692 · ‎05-19-2017

Good news @cmerriman! We're almost there! I did a few tiny tweaks in your query and I'm getting a list with the N- values in the table. Majority of the rows are still coming blank but I can see a few 'N-' values. This is the query I used:

Now, I have 2 questions:
1. How do I get actual values in the table instead of some blank values?
2. How do I make sure that all the N- values in the list are the bad values I'm looking for and not a mix of bad and good?

cmerriman · ‎05-19-2017

In your example it was the third result. Is it always the third ? What qualifies it as bad?

To remove null values, add |search REQUEST_URI=*

rh417692 · ‎05-22-2017

@cmerriman That query removes the null values. Thanks you!

So regarding the N- values. There are good and bad N- values. Good N- values are those where if put inside a url, gives an actual webpage while a bad N- value gives a 'Page not Found' page which is what I'm looking for. Now, when I used to do this query index="prod" "JSESSIONID" | reverse', I would get multiple values with that JSESSIONID. However, whenever I would get a massive event (like in result 3), the event right after that one (result 4 in our case) has an N- value which gives the exact N- value I'm looking for as those values are always bad values. So is there a way to get that N- value from an event right after the big event? And no, it does not necessarily have to be the third result. It can be in any order but the bad N- value come in an event which is right after the big event like you see in my example. Hope my explanation was clear enough? I feel we're maybe only one step away from the needed result. Thank you again for all your help so far!!

cmerriman · ‎05-22-2017

how about adding something like

...| reverse|streamstats count by JSESSIONID|eval length=len(_raw)|streamstats window=1 current=f values(length) as previousLength by JSESSIONID|eval lengthDiff=prevLength-length

this would give you the number of characters in the _raw event and then the number of characters in the previous event, you might be able to search where the difference of lengths between events is so much?

rh417692 · ‎05-23-2017

This does not give me the table like in the previous query and even when I put the table tag, don't get my result.

Is there a way to skip to the next event if we have a big event? Let's say we have an event with more than 5 lines (most events have 5 lines or under) and when we have that event (big event) we skip to the next line and fetch the N- value from there and once we've got that value, we scan again for the next big event and repeat the same way. I think that way, we can get all the required N- values. Hope it makes sense and that there's a way to do it!

cmerriman · ‎05-17-2017

will this rex work for your extraction of JSESSIONID:

JSESSIONID:(?<JSESSIONID>\w+)

rh417692 · ‎05-17-2017

Yup! This gets me the JSESSIONID. Thanks @cmerriman!

Multiple queries to extract one value

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

.conf24 | Registration Open!

ICYMI - Check out the latest releases of Splunk Edge Processor