i am searching like this
sourcetype=user
|fields user_id, user_gender, user_age,user_occup,user_zipcode
|rename user_id as rate_user_id
|join rate_user_id [search sourcetype=rate |rename rate_movie_id as movie_id |
join movie_id [search sourcetype = movie ] ]
| stats avg(rate_rating) as avg_rating by movie_id,movie_name
| stats max(avg_rating) as ttt by movie_name
|sort -ttt | head 100
What i am asking is can i get the other columns??? instead of only movie_name columns?
result looks like
movie_name | avg_rating
i want something like this
movie_name | avg_rating | movie_id .. . . .| ..... | . .. . . .
First, you seem to be using Splunk like a relational database. What you are doing may work, but you aren't leveraging Splunk's capabilities. In Splunk, the user data and the movie data would probably be better as lookup tables, and only the ratings would be Splunk events. If you did it that way, it would be much easier to get what you want. (And BTW, there is no need for unique field names like rate_movie_id
- if the field is the id of the movie, just use `movie_id".)
However, the following will work and will be more efficient as well:
[edit] fixed example
sourcetype=rate | rename rate_movie_id as movie_id
| stats avg(rating) AS avg_rating count as number_of_ratings by movie_id
| join movie_id [ search sourcetype = movie ]
| table movie_name avg_rating number_of_ratings movie_id
| sort -avg_rating | head 100
Since you aren't using any of the user information, I ignored it. This lists the 100 movies with the highest average rating.
First, you seem to be using Splunk like a relational database. What you are doing may work, but you aren't leveraging Splunk's capabilities. In Splunk, the user data and the movie data would probably be better as lookup tables, and only the ratings would be Splunk events. If you did it that way, it would be much easier to get what you want. (And BTW, there is no need for unique field names like rate_movie_id
- if the field is the id of the movie, just use `movie_id".)
However, the following will work and will be more efficient as well:
[edit] fixed example
sourcetype=rate | rename rate_movie_id as movie_id
| stats avg(rating) AS avg_rating count as number_of_ratings by movie_id
| join movie_id [ search sourcetype = movie ]
| table movie_name avg_rating number_of_ratings movie_id
| sort -avg_rating | head 100
Since you aren't using any of the user information, I ignored it. This lists the 100 movies with the highest average rating.
Thank you!
Thanks @yannK
missing "AS" to the search, I just edited the search.
here is the lookup manual
http://docs.splunk.com/Documentation/Splunk/6.0.1/Search/Useexternalfieldlookups
and the lookup command
http://docs.splunk.com/Documentation/Splunk/6.0.1/SearchReference/Lookup
avg_rating is is showing invalid argument is it right ?;
and can you give me an example with using lookup?