Good day,
I have a usecase explained below -
Index A has Reporting_Host (mix of IP address, hostname, FQDN) and Index CMDBB had data from CMDB ( so contains hostname, FQDN, IP Address, Server owner information etc ). My requirement is to map Reporting_Host data from index A again CMDB data and display server owner information along with hostname, IP etc. Issue here is, Index CMDB has data in multiple fields like Hostname (contains servername), CI_Name(contains FQDN), IP_address(obvious IP address). How do I match Reporting_Host field values agains these 3 fields in CMDB and display the output?
I tried using join but able to compare with any one field in CMDB data but not 3. Sample query below -
index=A sourcetype=syslog_stats | stats min(_time) as old, max(_time) as new by Reporting_Host | stats min(old) as oldest, max(new) as newest by Reporting_Host | eval diff = tostring((newest - oldest), "duration") | where newest < now() - (86400 * 2) | eval stopped= (now()-newest) | eval stopped_for = round(stopped/86400, 0) | convert ctime(oldest) | convert ctime(newest) | join Reporting_Host [ search index=CMDB | rename HostName as Reporting_Host ] | fields oldest newest diff stopped_for Reporting_Host Server_Owner
I did a field alias for CI_Name, IP_address and Hostname and named it as HostName but its not working.
You may want to clarify your requirements. If I take your description literally, I see two indices, each contain three fields, IP address, hostname, and FQDN, and each contain some extra fields; I would interpret your desire to match multiple fields as a requirement to have exact match of all three fields. Assuming that matching field names are identical in both indices, there is no reason to not add all three fields to join command. Is this what you wanted?
But really, you should avoid using join in the first place because it is inefficient. That is why there is a special section Alternative commands in that document. The best one would be a combination of OR and stats, like
(index=A sourcetype=syslog_stats) OR index=CMDB
| stats min(_time) as old, max(_time) as new
values(owner_info1) as owner_info1 values(owner_info2) as owner_info2
by IP hostname FQDN
I have some suspicion that this is NOT the actual requirement, however, because you mentioned only one field Reporting_Host in index A, but described it as "mix of IP address, hostname, FQDN". If you don't illustrate what this "mix" means, no one can really help you.
Do you mean that "Reporting_Host" is a string that combines IP, hostname, and FQDN that can match corresponding values in index CMDB? If I assume that Reporting_Host is of format "127.0.0.0-zeus-jupiter.mycompany.com", i.e., "<ipv4>-<hostname>-<FQDN>", you can split it into the three components and run stats just like above. In this example, you do
index=A sourcetype=syslog_stats OR index=CMDB
| rex field=Reporting_Host "(?<IP>[^-]+)-(?<hostname>[^-]+)-(?<FQDN>.+)"
| stats min(_time) as old, max(_time) as new values(owner_info1) as owner_info1 values(owner_info2) as owner_info2 by IP hostname FQDN
Hi @yuanliu
Thank you for looking into it. When I say mix of values (Reporting_host), I meant like below
Reporting_Host (not a string that contains IP host and FQDN but its like below but list of individual values)
100.10.10.10
abc
xyz
44.16.23.01
and goes on..
This explanation is much clearer. So, Reporting_Host could be any of the three things, and you don't have a reliable way to predict which one it is in a given event. If so, you will need to process it first to know which one each event comes with. For example,
(index=A sourcetype=syslog_stats) OR index=CMDB
| rex field=Reporting_Host "^[^@]+@(?<FQDN>.+)"
| rex field=Reporting_Host "^(?<hostname>[^\.]+)$"
| rex field=Reporting_Host "^(?<IP>\d+(\.\d+){3}$")
| stats min(_time) as old, max(_time) as new
values(owner_info1) as owner_info1 values(owner_info2) as owner_info2
by IP hostname FQDN
Because you didn't provide additional identifying information/criteria, the above takes a lot of assumptions about your data, e.g., "hostname contains no dot (.)." If you have additional conditions to help figure out which event comes with which, definitely use it.
😞
I will put some sample data here for better understanding of my usecase.
Index A contains events like below. Here I already did a field extraction for IP address Hostname FQDN and named it as Reporting host. This is basically list of servers that are forwarding logs.
2023-02-21 00:14:43.6543016070 2.2.2.2
2023-02-21 00:14:43.6213010920 abc.domain.net
2023-02-21 00:14:43.6543016070 4.3.2.1
2023-02-21 00:14:43.6213010920 xyz.domain.net
2023-02-21 00:14:43.6543016070 1.1.1.1
2023-02-21 00:14:43.6213010920 pqr
so when I index=A | stats count by Reporting_host, the result is below. First of the query works.
Reporting_Host
2.2.2.2
abc.domain.net
4.3.2.1
xyz.domain.net
1.1.1.1
pqr
Now I have second index which is from our CMDB which contains Server information like owner, BU, location etc.
2023-02-20 07:47:14.269, CI_Name="xyz.domain.net", Hostname="xyz", Domain="example.com", Environment="QA", IP_Address="3.5.4.6", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="Owner3"
2023-02-20 07:47:14.269, CI_Name="pqr.domain.net", Hostname="pqr", Domain="example.com", Environment="QA", IP_Address="13.15.14.16", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="owner2"
2023-02-20 07:47:14.269, CI_Name="Host1.domain.net", Hostname="Host1", Domain="example.com", Environment="QA", IP_Address="2.2.2.2", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="owner1"
My usecase is I need to map the Reporting_host from index A with index B data and get the server information. Challenge I am facing here is, Reporting_host as shown above is list of IP address, FQDN, Hostnames ( all comes in separate events like some hosts are reporting with IP address, some are reporting with FQDN and rest with IP address so the results) so how can I compare REporting_host with CMDB data and get the Server information. If the Reporting_host comes in one format like hostname or IP address the comparison could have been easy so I will just use join and the common field would be Reporting_host comparing it with Hostname in index B. Since the Reporting_host is has 3 different formats how do I compare them ? I hope I explained my usecase in detail now. Sorry for too many post.
For above ex: my output should be
Reporting_Host Server_Owner
2.2.2.2 Owner1
xyz.domain.net Owner3
pqr Owner2
Yes, if you want specific help, it is extremely important to illustrate and explain your data. The main difference between this latest illustration and what you explained previously is in field names. Also, that Reporting_Host field does not contain E-mail address format. Therefore I will not bother to match that. Otherwise, the general formula remains unchanged.
(index=A sourcetype=syslog_stats) OR index=CMDB
| rex field=Reporting_Host "^(?<Hostname>[^\.]+)$"
| rex field=Reporting_Host "^(?<IP_Address>\d+(\.\d+){3})$"
| rex field=Reporting_Host "^(?<CI_Name>[^\.]+(\.[^\.]+)+)$"
| eval CI_Name = if(index=="A" AND isnotnull(IP_Address), null(), CI_Name)
| stats values(Server_Owner) as Server_Owner by IP_Address CI_Name Hostname
Using your samples, the first line will give you events like these
_raw | _time |
2023-02-20 07:47:14.269, CI_Name="xyz.domain.net", Hostname="xyz", Domain="example.com", Environment="QA", IP_Address="3.5.4.6", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="Owner3" | 2023-02-20 07:47:14.269 |
2023-02-20 07:47:14.269, CI_Name="pqr.domain.net", Hostname="pqr", Domain="example.com", Environment="QA", IP_Address="13.15.14.16", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="owner2" | 2023-02-20 07:47:14.269 |
2023-02-20 07:47:14.269, CI_Name="Host1.domain.net", Hostname="Host1", Domain="example.com", Environment="QA", IP_Address="2.2.2.2", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="owner1" | 2023-02-20 07:47:14.269 |
2023-02-21 00:14:43.6543016070 2.2.2.2 | 2023-02-21 00:14:43.6543016070 |
2023-02-21 00:14:43.6213010920 abc.domain.net | 2023-02-21 00:14:43.6213010920 |
2023-02-21 00:14:43.6543016070 4.3.2.1 | 2023-02-21 00:14:43.6543016070 |
2023-02-21 00:14:43.6213010920 xyz.domain.net | 2023-02-21 00:14:43.6213010920 |
2023-02-21 00:14:43.6543016070 1.1.1.1 | 2023-02-21 00:14:43.6543016070 |
2023-02-21 00:14:43.6213010920 pqr | 2023-02-21 00:14:43.6213010920 |
And the full search will give you
IP_Address | CI_Name | Hostname | Server_Owner |
13.15.14.16 | pqr.domain.net | pqr | owner2 |
2.2.2.2 | Host1.domain.net | Host1 | owner1 |
3.5.4.6 | xyz.domain.net | xyz | Owner3 |