Solved: How to retrive a Date column from another search(i...

kiru2992 · ‎07-27-2020

Hello Everyone!

I have a scenario to get a Date column from index1 in search1 and remove the rows with null values in Date column in search2 on index2. I have column id common in both the indexes

Example :

Search1:

index="index1" sourcetype="st1" field1="abc"

|table id1 Date1

Search2:

index="index2" source="xyz"

|?????????

| eval Date2 =Date1 where id2 =id1

|.........

Output:

|table id2 Date2 where Date2 NOT NULL

What's the best way to go about it? Including both indexes at the start of the search is not feasible given the absurd size of the second index.

Can anyone please help me here?

Thank you in advance.

gcusello · ‎07-28-2020

Hi @kiru2992,

if id field is the same in both the searches, you don't need of eval and coalesce, so try something like this:

(index="index1" sourcetype="st1" field1="abc") OR (index="index2" source="xyz")
| stats values(fielda) AS fielda values(fieldb) AS fieldb values(DateField) AS DateField BY id

eventually for DateField you can use earliest or lates instead values.

For your information coalesce takes an arbitrary number of arguments and returns the first value that is not NULL and it's useful when you want to take the values from two or more columns in a new one (for more infos see at https://docs.splunk.com/Documentation/Splunk/8.0.5/SearchReference/CommonEvalFunctions ).

Ciao.

Giuseppe

View solution in original post

gcusello · ‎07-27-2020

Hi @kiru2992,

when you say that search2 has an "absurd size", you mean that it's very large, I think.

This means that you cannot put the second search in a subsearch, because there's the limit of 50,000 results in subsearches.

I don't like join because it's very slow and I always prefer to use a main search with both the searches correlating the values with a stats command BY the common key.

In other words something like this:

(index="index1" sourcetype="st1" field1="abc") OR (index="index2" source="xyz")
| eval id=coalesce(id2, id1)
| stats values(Date2) AS Date2 BY id

obviously I cannot test your search, but I'd like to show you the approach to this kind of use cases, Splunk isn't a DB where join is the best approach.

Ciao.

Giuseppe

kiru2992 · ‎07-28-2020

Hi @gcusello,
Yes, by "absurd" I meant large. Thank you for reply.
I was not sure about how coalesce work, nevertheless I tried you suggestion.
And found few things mentioned below:
In both the indexes the id is stored as 'id' so I tried 'rename' within parentheses of search and received the below error even when they were balanced. From what I understood, introducing a "|" symbol leads to the below error.Error: Error in 'search' command: Unable to parse the search: unbalanced parentheses.
id1 is a subset of id2 and index2 has duplicate id's and I require them for my calculation. According to my knowledge using coalesce removes duplicate values.
To make things clear, in index1 I have to extract set of ids(id1) and their respective Date(Date1) and then I have to find all(even duplicates) the ids(id2) in index2 which is equivalent to id1 and map the respective Date.
For example:
Sample Data
index1 :    id   field1   field2   DateField
                100   abc      xyz        10-06-2020
                101    def      www       08-06-2020
                102    abc      uuu         25-05-2020

index2 :    id fielda    fieldb
                100   open             ""
                101    closed    Solved
                100   Progress    ""
                101    closed      Solved
                102    Progress ""
                100   Closed       Invalid
   102    open           ""
Sample Output:
                id fielda    fieldb         DateField
                100   open             ""           10-06-2020
                101    closed    Solved        08-06-2020
                100   Progress    ""            10-06-2020
                101    closed      Solved        08-06-2020
                102    Progress ""            25-05-2020
                101    closed      Solved      08-06-2020
                100   Closed       Invalid       10-06-2020
   102    open           ""                 25-05-2020
I hope this helps to understand the scenario better. To proceed further, please let me know how to use 'rename' within parentheses and how to map date to even duplicate ids.
Any help or lead is much appreciated. Kindly let me know in case any further queries.

Thanks,
@kiru2992

gcusello · ‎07-28-2020

Hi @kiru2992,

if id field is the same in both the searches, you don't need of eval and coalesce, so try something like this:

(index="index1" sourcetype="st1" field1="abc") OR (index="index2" source="xyz")
| stats values(fielda) AS fielda values(fieldb) AS fieldb values(DateField) AS DateField BY id

eventually for DateField you can use earliest or lates instead values.

For your information coalesce takes an arbitrary number of arguments and returns the first value that is not NULL and it's useful when you want to take the values from two or more columns in a new one (for more infos see at https://docs.splunk.com/Documentation/Splunk/8.0.5/SearchReference/CommonEvalFunctions ).

Ciao.

Giuseppe

kiru2992 · ‎07-28-2020

Hi @gcusello,
Thank you for your swift reply. I would have a look at linked page.
When I used your method, I am not able to see duplicate ids..
For Example:

Actual Output:
                id fielda    fieldb         DateField
                101    closed      Solved      08-06-2020
                100   Closed       Invalid       10-06-2020
   102    open           ""                 25-05-2020

Required Output:

id fielda fieldb DateField

100 open "" 10-06-2020

101 closed Solved 08-06-2020

100 Progress "" 10-06-2020

101 closed Solved 08-06-2020

102 Progress "" 25-05-2020

101 closed Solved 08-06-2020

100 Closed Invalid 10-06-2020

102 open "" 25-05-2020

Can you please let me know how to retain the duplicate ids as I require them to calculate 'fielda' transitions?

Thank you in advance.

gcusello · ‎07-28-2020

hi @kiru2992,

ok, so try this please:

(index="index1" sourcetype="st1" field1="abc") OR (index="index2" source="xyz")
| fillnull fielda value="-"
| fillnull fieldb value="-"
| stats values(DateField) AS DateField BY id fielda fieldb

Ciao.

Giuseppe

kiru2992 · ‎07-28-2020

hi @gcusello,

I am sorry, still I am not able to get duplicates.
From my understanding, the below line itself removes duplicates. May I please know whether we have a way to retain duplicates here.
(index="index1" sourcetype="st1" field1="abc") OR (index="index2" source="xyz")
Thank you in advance.

gcusello · ‎07-28-2020

Hi @kiru2992,

no the main search doesn't remove duplicates, it gives you all the events.

You remove duplicates with the stats command indicating (with BY clause) which are the keys to consider.

Ciao.

Giuseppe

kiru2992 · ‎07-30-2020

Hi @gcusello,
Yes... You are right.. But for me stats with both id and fields resulted in only distinct values. Luckily, streamstats worked for me.. Thanks a lot for your help:)

spitchika · ‎07-27-2020

You should go with "inner Join" with subsearch...

Something like this

Index=index1 sourcetype="st1" field1="abc"
| join type=inner ID [search index=index2 ... remaining query}

| table your fields

Note : Joining field names should be same, so if you have different names then rename them using "rename" command.

Hope it gives you some clue. Thanks.

How to retrive a Date column from another search(index)

join

Threat Hunting Unlocked: How to Uplevel Your Threat Hunting With the PEAK Framework ...

Splunk APM: New Product Features + Community Office Hours Recap!

Index This | Forward, I’m heavy; backward, I’m not. What am I?