Splunk Search

Sorting help

changwoo
Communicator

i am searching like this

sourcetype=user
|fields user_id, user_gender, user_age,user_occup,user_zipcode 
|rename user_id as rate_user_id
|join rate_user_id [search sourcetype=rate |rename rate_movie_id as movie_id | 
join movie_id [search sourcetype = movie ] ]
| stats avg(rate_rating) as avg_rating by movie_id,movie_name 
| stats max(avg_rating) as ttt by movie_name 
|sort -ttt | head 100

What i am asking is can i get the other columns??? instead of only movie_name columns?
result looks like
movie_name | avg_rating
i want something like this
movie_name | avg_rating | movie_id .. . . .| ..... | . .. . . .

Tags (1)
0 Karma
1 Solution

lguinn2
Legend

First, you seem to be using Splunk like a relational database. What you are doing may work, but you aren't leveraging Splunk's capabilities. In Splunk, the user data and the movie data would probably be better as lookup tables, and only the ratings would be Splunk events. If you did it that way, it would be much easier to get what you want. (And BTW, there is no need for unique field names like rate_movie_id - if the field is the id of the movie, just use `movie_id".)

However, the following will work and will be more efficient as well:

[edit] fixed example

sourcetype=rate |  rename rate_movie_id as movie_id
| stats avg(rating) AS avg_rating count as number_of_ratings by movie_id
| join movie_id [ search sourcetype = movie ]
| table movie_name avg_rating number_of_ratings movie_id
| sort -avg_rating | head 100

Since you aren't using any of the user information, I ignored it. This lists the 100 movies with the highest average rating.

View solution in original post

lguinn2
Legend

First, you seem to be using Splunk like a relational database. What you are doing may work, but you aren't leveraging Splunk's capabilities. In Splunk, the user data and the movie data would probably be better as lookup tables, and only the ratings would be Splunk events. If you did it that way, it would be much easier to get what you want. (And BTW, there is no need for unique field names like rate_movie_id - if the field is the id of the movie, just use `movie_id".)

However, the following will work and will be more efficient as well:

[edit] fixed example

sourcetype=rate |  rename rate_movie_id as movie_id
| stats avg(rating) AS avg_rating count as number_of_ratings by movie_id
| join movie_id [ search sourcetype = movie ]
| table movie_name avg_rating number_of_ratings movie_id
| sort -avg_rating | head 100

Since you aren't using any of the user information, I ignored it. This lists the 100 movies with the highest average rating.

changwoo
Communicator

Thank you!

0 Karma

lguinn2
Legend

Thanks @yannK

0 Karma

yannK
Splunk Employee
Splunk Employee

missing "AS" to the search, I just edited the search.

yannK
Splunk Employee
Splunk Employee
0 Karma

changwoo
Communicator

avg_rating is is showing invalid argument is it right ?;

and can you give me an example with using lookup?

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...