Removing a Host from Data Summary.

Shridhar7Hitesh
Explorer

Hi,
I mistakenly uploaded the same data twice, and now I want to remove one host along with the data linked to it. When I click the Data Summary option on the Search and Reporting page, it shows the same host name twice. Please suggest how to remove it.

Also, I want to know whether duplicate data can cause problems with finding or fetching data. Searching worked properly before I uploaded the duplicate copy, but not afterwards. Even a simple search like "fail*" does not return anything.

Please help.
Thanks,
Hitesh


Richfez
SplunkTrust

Carefully craft a search that returns those rows and ONLY those rows. There isn't enough information here to know precisely what that search will look like.

You could start with something like * | stats count by sourcetype to find out which sourcetype the events you want to get rid of have, since I think that is your determining factor (remove everything with the older sourcetype).
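
A slightly wider breakdown may make the duplicate easier to spot. This is only a sketch: it assumes your role can search the index the upload went into and that the time range picker is set to All time.

```spl
index=*
| stats count by index, host, sourcetype, source
| sort - count
```

If the same file was loaded twice, you would expect two rows with similar counts that differ in host or sourcetype.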

Once you've determined the sourcetype (or anything unique) of the data you want to remove, craft yourself a search that displays just those events. Perhaps

index=* sourcetype=Y 

Make double sure this works right. It should include ALL the rows you want to have removed, but include NO rows that you want to keep. This is the search we'll use to actually do the delete with.

Then follow the steps in the documentation on removing data from indexes, using YOUR search, to prevent those results from showing up again. To recap "how to delete data" from the docs, it's basically
1) Give a role the delete capability (the built-in can_delete role has it; preferably assign it to a special user)
2) Log in as that user
3) Run the search we made above and double-check that it returns the right data and only that data.
4) Then run that search | delete
5) Watch the output; it'll tell you how many events got deleted.
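
As a rough sketch of steps 3 and 4, the check and the delete could look like the two searches below. The angle-bracket values are placeholders, not real names; substitute your own index, host and sourcetype. First the check, run over All time:

```spl
index=<your_index> host="<host_to_remove>" sourcetype="<its_sourcetype>"
| stats count by host, sourcetype, source
```

Then, only once the counts look right, the same base search piped to delete:

```spl
index=<your_index> host="<host_to_remove>" sourcetype="<its_sourcetype>"
| delete
```

Keep in mind that delete only marks the events as unsearchable; it does not free disk space. As far as I recall from the docs, it also does not update index metadata, so the host may still be listed in Data Summary even though its events no longer come back in searches.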

Then log OUT as that special user (and I'd suggest disabling it, but you can do what you want), log back in as your usual user and check that you only have the right data in there now.


Shridhar7Hitesh
Explorer

Yes, the solution is: change the time setting from "All time (real-time)" to "All time", or pick another time range. "All time (real-time)" and "All time" confused me at first, so make sure you select the correct one.
Now that my searches return results, my original question still stands.

Question with Scenario:

Scenario:
I uploaded the same file twice to my Splunk server using the "Upload" option, under two different names and with different settings.
The two files are visible under "Data Summary" as hosts "ABC" and "XYZ".
For file "ABC", I selected the default setting under "sourcetype" when uploading.
For file "XYZ", I selected "Application->db_audit" under "sourcetype".

Now I want to delete the file with host "ABC".

Question: How do I do the deletion?


Richfez
SplunkTrust

What index is this data in, and what other data is in that index?

What results do you get with the following search? index=*

If you put this into its own index and there's no other data in that index except this duplicated data, and if you can load the data in once again (which seems likely since you have already done it twice), it is super easy to just delete the index and reload the data. That won't delete any of the knowledge objects or searches or anything related to it, just the data.

If this data is intermingled in an index with other data you do not want to lose, then this gets harder.
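
To answer the index question, a search along these lines may help. It is a sketch that assumes the two uploads really do carry the host values ABC and XYZ mentioned above, and it should be run over All time:

```spl
index=* (host="ABC" OR host="XYZ")
| stats count by index, host, sourcetype
```

If both hosts turn up in an index that holds nothing else, the delete-and-reload route applies; if other sourcetypes or hosts share that index, you are in the harder case.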


Shridhar7Hitesh
Explorer

Hi,
What I am saying is that I used the "Add Data" option and then "Upload Data" to feed my data into Splunk. I chose the same file both times but with different options: once with "sourcetype=csv" and default settings, and the second time with "database->db_audit".
After this, nothing comes back from the search bar, even though Data Summary shows 320,000 events indexed.
Even fail* does not retrieve anything.


Richfez
SplunkTrust

That's all good information, but what does the following search get you when run over all time? (All time is important - newest versions of Splunk default to showing you last 24 hours and we need that to be opened up.)

index=*

Don't give it a more specific search - your search for fail* is looking for events with that string in them; there's a variety of reasons this may not work so we need to start at the bottom and work our way up.

So does that search I pasted above return any data or not? If it does, what is the sourcetype of the data returned?
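
If it does return data, a variant like this one (again just a sketch, run over All time) would also show when the events are timestamped, which matters if the upload wizard parsed the timestamps differently than you expected. The names first_event and last_event are ones I made up for the output columns.

```spl
index=*
| stats count min(_time) as first_event max(_time) as last_event by host, sourcetype
| convert ctime(first_event) ctime(last_event)
```

If first_event and last_event fall outside the time range you normally search, that alone would explain searches coming back empty.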

Also, when you uploaded with the database->db_audit source type, did all the rest of the data upload wizard make sense and work right? Did it find your timestamps, fields and so on? You usually can't just change a sourcetype from a csv to a db_audit unless that's what type of data it is, and only if it matches what that sourcetype expects the data to look like. And it seems to me that something in a CSV isn't likely to be properly parsed as a DB Audit trail.

What I mean by that is the sourcetype isn't really a tag you apply to some data to tell you what the data is, it's a tag based on the data itself to tell the system what kind of data it is. Yes, you can apply it, yes you can make your own and so on, but if you are using an existing one it has to match the actual data being ingested or it won't work right. Does that make sense?


Shridhar7Hitesh
Explorer

Hi,
Yes, I ran the search index=* and initially it didn't return any data or any sourcetype.
I changed the window to real time and set earliest to 1 week and latest to now. It returned data with the expected sourcetype.
Then I changed the time setting to Week to date, and again I didn't get any results.
Then I changed it to month, and results were retrieved.
I had made some changes in the time selector and forgot to roll them back when I asked this question. But now I am even more surprised by this variation: when I give week to date as the earliest and latest values in real time, I get results, but when I do the same in relative time, I get none. Interesting.
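
For anyone comparing the same settings, my understanding is that these picker choices map roughly onto the time modifiers below. Treat the exact mapping as an assumption and check it against your own picker. Each line is a separate search.

```spl
index=* earliest=rt-1w latest=rt
index=* earliest=@w0 latest=now
index=* earliest=@mon latest=now
```

The first is a rolling one-week real-time window, the second starts at the beginning of the current week, and the third at the beginning of the current month. If that mapping is right, events timestamped a few days before the start of this week would show up in the first and third searches but not in the second, which might be what happened here.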

Anyway, thanks a lot for the quick responses. For now it's working. Sometimes silly things can create a lot of fuss.


Richfez
SplunkTrust

AH! If you actually found a solution could you write that up as an answer here? Then you can accept your own answer - It's OK to do that occasionally when it makes sense and I think in this case it makes sense!

If you still have related issues about this data, you can either write up an answer and accept it here then start a new, more specific question for your new problems...

-- OR --

Provide a bit more context here about what it's doing and what it's not doing and we can continue debugging in this thread.

Your choice!

Happy Splunking,
Rich
