Search reads first event instead of desired event

crobicha
Explorer

I am using the Google Analytics Data Export API to pull some data down into a log file so it can be indexed by Splunk. The data is written out in the following format:

ga:eventCategory=ViewChange | ga:eventAction=Grid ga:totalEvents=6 ga:uniqueEvents=2

ga:eventCategory=SessionEvent | ga:eventAction=UserLogin ga:totalEvents=13 ga:uniqueEvents=9

My search for counting logins is:

source="googleanalytics.txt" "ga:eventCategory=SessionEvent | ga:eventAction=UserLogin" | extract kvdelim="=" | timechart span=1d sum(totalEvents) as "Total Logins", sum(uniqueEvents) as "Unique Logins"

The problem is that the search takes the first occurrence of ga:totalEvents, regardless of whether it belongs to a UserLogin event.

Edit: To be clearer, for the above example the timechart displays 6 total and 2 unique logins instead of the expected 13 total and 9 unique. The pipe inside the quotes appears to be treated as a literal character in the search string, but I removed it just to make sure and still see the same result when searching only for "ga:eventAction=UserLogin".
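
One way to picture the symptom, assuming both sample lines end up inside a single indexed event (the thread does not confirm this), is a key/value extraction that keeps only the first value it sees for each key. The Python sketch below is only an illustration of that idea. `first_values` is a hypothetical helper, not Splunk's extraction code.

```python
import re

# Both sample lines from the post, assumed here to be merged into one event.
EVENT = (
    "ga:eventCategory=ViewChange | ga:eventAction=Grid ga:totalEvents=6 ga:uniqueEvents=2\n"
    "ga:eventCategory=SessionEvent | ga:eventAction=UserLogin ga:totalEvents=13 ga:uniqueEvents=9"
)


def first_values(event: str) -> dict:
    """Hypothetical helper: keep only the first value seen for each ga: key."""
    fields = {}
    for key, value in re.findall(r"ga:(\w+)=(\w+)", event):
        # setdefault ignores later occurrences of a key that is already present.
        fields.setdefault(key, value)
    return fields


fields = first_values(EVENT)
print(fields["totalEvents"], fields["uniqueEvents"])  # prints: 6 2
```

Under that assumption the UserLogin counts (13 and 9) are never reached, which matches the numbers the timechart shows.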

crobicha
Explorer

I ended up using a regex after spending way too much time messing with sourcetypes and props.conf, this is my final search:

source="googleanalytics.txt" "ga:eventCategory=SessionEvent | ga:eventAction=UserLogin" | rex field=_raw "ga:eventAction=UserLogin[\s]ga:totalEvents=(?<totalEvents>\d+)[\s]ga:uniqueEvents=(?<uniqueEvents>\d+)" | eval _time = _time - 172800 | timechart span=1d sum(totalEvents) as "Total Logins", sum(uniqueEvents) as "Unique Logins"
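
The idea behind the rex, anchoring the capture on ga:eventAction=UserLogin so that counts from other events are skipped, can be tried outside Splunk. This is a rough Python sketch, not the search itself. The group names totalEvents and uniqueEvents are assumed from the fields the timechart sums. Python spells named groups `(?P<name>...)`, and the two sample lines are assumed to sit in one event.

```python
import re

# Both sample lines from the original question, assumed to share one event.
EVENT = (
    "ga:eventCategory=ViewChange | ga:eventAction=Grid ga:totalEvents=6 ga:uniqueEvents=2\n"
    "ga:eventCategory=SessionEvent | ga:eventAction=UserLogin ga:totalEvents=13 ga:uniqueEvents=9"
)

# Anchor on the UserLogin action so only the counts that follow it are captured.
LOGIN_RE = re.compile(
    r"ga:eventAction=UserLogin\sga:totalEvents=(?P<totalEvents>\d+)"
    r"\sga:uniqueEvents=(?P<uniqueEvents>\d+)"
)

match = LOGIN_RE.search(EVENT)
# Convert the captured strings to integers so they can be summed later.
login_counts = {k: int(v) for k, v in match.groupdict().items()} if match else {}
print(login_counts)  # prints: {'totalEvents': 13, 'uniqueEvents': 9}
```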

The eval _time statement is there because I haven't gotten Splunk to pick up the timestamp in the log file properly; instead, it stamps each event with the time the script is run. GA data isn't guaranteed to be accurate until 48 hours later, so the script pulls from the 24-hour period starting 3 days ago.
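
For reference, 172800 seconds is exactly two days (2 x 24 x 3600). The Python sketch below shows the same subtraction on an epoch timestamp. The sample date is made up for illustration and `shift_back` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

OFFSET_SECONDS = 172800  # the value subtracted from _time in the search


def shift_back(epoch_seconds: float, offset: int = OFFSET_SECONDS) -> float:
    """Hypothetical helper mirroring `eval _time = _time - 172800`."""
    return epoch_seconds - offset


# Made-up index time, standing in for the moment the script ran.
indexed_at = datetime(2024, 3, 10, 6, 0, tzinfo=timezone.utc)
shifted = datetime.fromtimestamp(shift_back(indexed_at.timestamp()), tz=timezone.utc)

print(timedelta(seconds=OFFSET_SECONDS))  # prints: 2 days, 0:00:00
print(shifted.date())                     # prints: 2024-03-08
```

Whether a two-day shift lands each event on the day the data describes depends on when the script runs relative to the period it pulls, which the post does not spell out.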

crobicha
Explorer

I've run this in the search window and it does work. Because it is in quotes, Splunk must recognize it as a literal string and not a pipe.

tgow
Splunk Employee

Why don't you try this instead:

source="googleanalytics.txt" (ga:eventCategory="SessionEvent" OR ga:eventAction="UserLogin") | extract kvdelim="=" | timechart span=1d sum(totalEvents) as "Total Logins", sum(uniqueEvents) as "Unique Logins"

crobicha
Explorer

I may switch to this syntax since it is clearer and doesn't use the pipe, but it doesn't fix my issue.

Ayn
Legend

You must have mispasted your search - the "|ga:eventAction" would be a syntax error as Splunk would try to interpret that as a search command. Please check.
