Can the Export function export a CSV file in the BIG5 charset?

SamChang
Path Finder

Dear Sir,

Our customer exports search results to a CSV file and opens that file with Microsoft Excel.
Because the CSV file contains Traditional Chinese text,
Microsoft Excel can't display its content correctly (the file is in UTF-8).
Can the Export function export the CSV file in the BIG5 charset?
Please help me fix this issue. Thanks.


jrodman
Splunk Employee

Looking at the problem more closely, this is an Excel bug.

Converting the UTF-8 to Big5, as you propose, doesn't work, at least with my copy of Excel 2008. Excel fails to understand the text at all, and it offers no option to select a character set.

Converting the UTF-8 to UTF-16LE sort of works. It causes Excel to drop into the import wizard, where you have to specify that the file is separator-delimited and that the separator is a comma, despite the file having a "csv" extension. It will further prompt you to interpret some of the fields, which might be useful. This means that even with UTF-16 you face a process of around 5-6 steps to load the files. It's hard to know whether that meets your use case, but the limitation is strictly on Excel's side. The text does appear to be intact at that point, however.

Customers using other versions of Excel will see different behavior. They might already have success with the change proposed above.
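
For anyone who wants to try the UTF-16LE route on a file that has already been exported, here is a minimal sketch of the conversion in Python. It is not part of Splunk: the function name utf8_csv_to_utf16le is made up for illustration, and it assumes the export really is valid UTF-8.

```python
import codecs


def utf8_csv_to_utf16le(data):
    """Re-encode UTF-8 CSV bytes as UTF-16LE with a leading BOM.

    Hypothetical helper for illustration; not a Splunk API.
    """
    # Drop a UTF-8 BOM if one is already present so it is not carried over.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # Decode the original bytes to text; raises UnicodeDecodeError on bad input.
    text = data.decode("utf-8")
    # "utf-16-le" does not emit a BOM on its own, so prepend it explicitly.
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
```

To use it, read the exported file in binary mode, pass the bytes through the function, and write the result to a new file in binary mode. Whether Excel then opens the file directly or drops into the import wizard depends on the Excel version, as described above.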


Stephen_Sorkin
Splunk Employee

Here is an extension to my previous answer. Excel seems to honor UTF-8 encoded files as long as the files start with the UTF-8 BOM. Below is a small patch that can be made to the appserver to allow this. Note that this code is experimental and shouldn't be applied in general. This code will also disrupt non-Windows users of the CSV files, as most other platforms don't expect a UTF-8 BOM. I will file an enhancement request to add another option for Excel-specific CSV.

In the file $SPLUNK_HOME/lib/python2.6/site-packages/splunk/appserver/mrsparkle/controllers/search.py, go to around line 266 and find the statement output = job.getFeed(asset, http_method=jobFeedRequestMethod, **kwargs).

Add the following code after that line:

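    # Only touch responses that are being downloaded as a file.
    if 'isDownload' in kwargs:
        import codecs
        # Prepend the UTF-8 byte order mark so Excel detects the encoding.
        output = "".join((codecs.BOM_UTF8, output))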

This should allow the files to be read by Excel.
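
If patching the appserver is not an option, the same BOM can be added to a file after it has been exported. This is only a sketch of that idea, not something Splunk ships; the function name add_utf8_bom is hypothetical, and it assumes the file content is already UTF-8.

```python
import codecs


def add_utf8_bom(data):
    """Return the given UTF-8 bytes with a UTF-8 BOM at the front.

    Hypothetical helper for illustration; not a Splunk API.
    """
    # Leave the data alone if it already starts with the BOM.
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data
```

Read the CSV in binary mode, pass its bytes through the function, and write the result back in binary mode. The same caveat applies as for the patch: tools on other platforms may not expect the BOM.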


jrodman
Splunk Employee

I think the behavior may well differ by platform as well as by version. Do you need this to work on Excel 2008 for the Mac, Excel for Windows, or both? I think we're encountering at least one Microsoft bug, so this information is important.


Genti
Splunk Employee

I tried this, and it seems that the BOM is not being set. I will work with Sorkin to find out why.


Stephen_Sorkin
Splunk Employee

I'd suggest exporting to a file after making this change and verifying that the UTF-8 BOM is the first part of the file.
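
One way to do that check is to read the first three bytes of the exported file and compare them with the UTF-8 BOM (EF BB BF). A minimal sketch follows; the function name starts_with_utf8_bom is made up for illustration, and the path you pass in is whatever file you exported.

```python
import codecs


def starts_with_utf8_bom(path):
    """Return True if the file at `path` begins with the UTF-8 BOM.

    Hypothetical helper for illustration; not a Splunk API.
    """
    # Open in binary mode so no decoding hides or strips the BOM.
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
```

If this returns False for a file exported after the change, the patch is probably not being reached for that download.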


dmlee
Communicator

Hi Stephen,
I tried modifying search.py as you described. My Splunk is 4.1.6 and the source code is somewhat different, so I may not have added the code in the right place. Anyway, I added your code as shown below, restarted the Splunk services, ran a search, exported the results as a CSV file, and opened the CSV file in Excel 2008. Excel still doesn't recognise the Chinese characters (they are garbled).

    # pass through the search options
    output = job.getFeed(asset, **kwargs)

    if 'isDownload' in kwargs:
        import codecs
        output = "".join((codecs.BOM_UTF8, output))

Stephen_Sorkin
Splunk Employee

No, export can only produce UTF-8. I will do some experiments with exporting to UTF-16LE and see if Excel honors that. If so, we could provide a patch for the appserver to deliver files in this encoding.
