website input charset problem

akdake
Explorer

I have created an input for scraping data from web pages with the Website Inputs add-on, and it works well.

However, the search results show unreadable characters instead of Chinese characters, such as Âé×핽ʽ£ºÈ«Â.
The HTML charset of the page is gb2312.

Also, the CHARSET setting in props.conf is set to gb2312.

I have also tried the charsets HZ, utf-8, and AUTO, all in vain.

Can anyone tell me why? Thanks.
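For reference, output like this is typical of GB2312 bytes being decoded as a single-byte charset such as Latin-1. The sketch below reproduces the effect in Python; the sample text is a hypothetical stand-in, not content from the page in question, and the choice of Latin-1 as the wrong charset is an assumption.

```python
# Hypothetical sample text standing in for the page content.
original = "中文"

# Bytes as a server would send them for a gb2312 page.
raw = original.encode("gb2312")

# Decoding with the wrong single-byte charset yields unreadable characters.
garbled = raw.decode("latin-1")

# Reversing the wrong decode and applying the right charset recovers the text.
repaired = garbled.encode("latin-1").decode("gb2312")

print(repr(garbled))
print(repaired)
```

If the garbled text in the index can be repaired this way, the bytes were most likely decoded with the wrong charset before the data ever reached props.conf.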


LukeMurphey
Champion

Please let me know if version 0.8 fixes your problem (or accept the answer so I know it worked).


LukeMurphey
Champion

This is a bug. The input isn't correctly determining the encoding of the page. I have created a bug report for it and will get it fixed very soon.

Update:
This should work now as of version 0.8.
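To illustrate what determining the encoding of a page usually involves, here is a minimal sketch. It is not the add-on's actual code; the function names `detect_charset` and `decode_page` are hypothetical. It assumes the charset is taken first from the HTTP Content-Type header, then from a meta tag in the HTML, and finally from a default.

```python
import re

# Matches charset declarations such as <meta charset="gb2312"> or
# <meta http-equiv="Content-Type" content="text/html; charset=gb2312">.
META_CHARSET_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)
HEADER_CHARSET_RE = re.compile(
    r'charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def detect_charset(content_type, body, default="utf-8"):
    """Return a charset name from the header, then the meta tag, then the default."""
    if content_type:
        match = HEADER_CHARSET_RE.search(content_type)
        if match:
            return match.group(1).lower()
    # Only scan the start of the document, where the meta tag normally lives.
    match = META_CHARSET_RE.search(body[:4096])
    if match:
        return match.group(1).decode("ascii").lower()
    return default


def decode_page(content_type, body):
    """Decode the page bytes, falling back to lossy UTF-8 if the charset fails."""
    charset = detect_charset(content_type, body)
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        return body.decode("utf-8", errors="replace")


# Hypothetical page that declares gb2312 only in its meta tag.
page = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=gb2312"></head><body>'
    + "中文".encode("gb2312")
    + b"</body></html>"
)
print(detect_charset(None, page))
print(decode_page(None, page))
```

A page that declares gb2312 only in its meta tag, as in the sample above, is garbled by any decoder that looks only at the HTTP header.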

LukeMurphey
Champion

Is the site you are trying to get information from public? If so, could you share the selector you are using and the URL you are trying to load data from so that I could reproduce the issue?

akdake
Explorer

Yes, no search results were captured after installing version 0.8. I have also tried this version on a different Splunk demo.


LukeMurphey
Champion

Are you saying that it is no longer logging the results in Splunk?


akdake
Explorer

Many thanks. I updated the add-on to 0.8, but it doesn't work: I cannot get any search results. Please confirm.
