Hi,
I have data in the following format from Microsoft Windows OS process executions:
FileName,ProcessID,ParentProcessID
child1.exe,126,108
parent1.exe,108,93
grandparent1.exe,93,24
child2.exe,276,92
parent2.exe,92,93
...
As you can see, for example, the process hierarchies here would be as follows:
grandparent1.exe -> parent1.exe -> child1.exe
grandparent1.exe -> parent2.exe -> child2.exe
And there could be many more relationships with various parents and grandparents, as you would expect.
I would like to output these relationships in the following manner:
FileName,ParentFileName,GrandparentFileName
child1.exe,parent1.exe,grandparent1.exe
child2.exe,parent2.exe,grandparent1.exe
...
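To make the desired mapping concrete, here is a minimal sketch of the two-hop lookup written in Python rather than SPL. It is only an illustration of the output I'm after, not something I can run in the hosted Splunk environment. The function name `resolve_ancestors` and the `SAMPLE` string are hypothetical, and the sample assumes parent2.exe has ParentProcessID 93 so that it resolves to grandparent1.exe as in the hierarchy above.

```python
import csv
import io

# Sample rows in the same layout as the data above.
# Assumption: parent2.exe is given ParentProcessID 93 so that it
# resolves to grandparent1.exe, matching the stated hierarchy.
SAMPLE = """FileName,ProcessID,ParentProcessID
child1.exe,126,108
parent1.exe,108,93
grandparent1.exe,93,24
child2.exe,276,92
parent2.exe,92,93
"""


def resolve_ancestors(text):
    """Return (FileName, ParentFileName, GrandparentFileName) tuples.

    Rows whose parent or grandparent is not present in the data are skipped.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    # Index every process by its own ProcessID for direct lookup.
    by_pid = {r["ProcessID"]: r for r in rows}

    out = []
    for r in rows:
        # First hop: child -> parent.
        parent = by_pid.get(r["ParentProcessID"])
        if parent is None:
            continue
        # Second hop: parent -> grandparent.
        grandparent = by_pid.get(parent["ParentProcessID"])
        if grandparent is None:
            continue
        out.append((r["FileName"], parent["FileName"], grandparent["FileName"]))
    return out


if __name__ == "__main__":
    print("FileName,ParentFileName,GrandparentFileName")
    for row in resolve_ancestors(SAMPLE):
        print(",".join(row))
```

In other words, what I need in Splunk is the equivalent of applying the same ProcessID-to-ParentProcessID lookup twice.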
The limitation here is that, frustratingly, I don't have permission to use lookup tables in the hosted Splunk environment I'm using.
Currently, I can quite easily get the parent information using the following:
| inputcsv dispatch=t procs.csv
| append [
search event=ProcessExecution earliest=-1y latest=now [
search event=ProcessExecution FileName="cmd.exe"
| rename ParentProcessID AS ProcessID
| outputcsv dispatch=t procs.csv
| fields ProcessID
]
| rename FileName AS ParentFileName
| fields ProcessID ParentFileName
]
| stats values(FileName) as FileName
values(ParentFileName) as ParentFileName
by ProcessID
But I'm totally lost on how I would get the grandparent information into this.
I'd like to stay away from using 'join' because I'll sometimes be processing far more than 50,000 records. As you can see above, I'm limiting my first subsearch to 'FileName' matching 'cmd.exe' and only querying for the parent processes of those records. This way, the search stays efficient and never hits the 50,000 limit.
Any help is much appreciated, thank you.