Getting Data In

CSV Japanese header extraction

msona
Explorer

Dear all,

I want to extract the Japanese CSV header fields from my CSV log files. My configurations are as follows.

inputs.conf
---------------------------------------
[monitor:C:\Program Files\Splunk\etc\apps\tougou\tougou_logs\*.csv]
disabled = false
host = My-PC
index = myindex

props.conf
[source::...CPU...]
sourcetype = cpu

[source::...Disk...]
sourcetype = disk

[source::...Network...]
sourcetype = network

[source::...Swap_Rate...]
sourcetype = swap-rate


[disk]
CHECK_FOR_HEADER=TRUE
 #REPORT-disk=argus_extractions_disk

[network]
CHECK_FOR_HEADER=TRUE
#REPORT-network=argus_extractions_network

[cpu]
CHECK_FOR_HEADER=TRUE
#REPORT-cpu=argus_extractions_cpu

[swap-rate]
CHECK_FOR_HEADER=TRUE
#REPORT-swap-rate=argus_extractions_swap_rate


----------------------------------------------------
transforms.conf
----------------------------------------------------
[argus_extractions_disk]
DELIMS=","
FIELDS="タイムゾーン","記録時間","システム名","タイム・スタンプ","ディスク名","マウント・ポイント","ファイル・システム・タイプ","サイズ (MB)","使用 ディスク (MB)","使用 ディスク 率","フリー・ ディスク (MB)","フリー・ ディスク 率","合計 i ノード","使用済み i ノード数","i ノード使用率","空き i ノード","フリーの i ノード 率"

[argus_extractions_network]
DELIMS=","
FIELDS="タイムゾーン","記録時間","システム名","タイム・スタンプ","ネットワーク・インターフェース名","IPアドレス","インターフェース状況","最大転送単位","受信 数 (KB)","1 秒当たりの 受信 バイト","送信 数 (KB)","1 秒当たりの 送信 バイト","受信パケット","1 秒当たりの 受信 パケット","送信パケット","1 秒当たりの 送信 パケット","入力エラー","出力エラー","合計衝突","衝突  (分あたり)","衝突率","入力エラー (分あたり)","出力エラー (分あたり)","エラー (%)","ドロップした入力パケット","ドロップした出力パケット","入力 FIFO バッファー・オーバーラン","出力 FIFO バッファー・オーバーラン","パケット・フレーム・エラー","キャリア・ロス","入力エラー (%)","出力エラー (%)","デバイス・タイプ","MACアドレス"

[argus_extractions_cpu]
DELIMS=","
FIELDS="タイムゾーン","記録時間","システム名","タイム・スタンプ","CPU ID","ユーザー CPU (%)","ユーザー・ナイス CPU (%)","システム CPU (%)","アイドル CPU (%)","使用中の CPU (%)","I/O 待機率 (%)","システム CPU に対するユーザー (%)"

[argus_extractions_swap_rate]
DELIMS=","
FIELDS="タイムゾーン","記録時間","システム名","タイム・スタンプ","合計 スワップ・スペース  (MB)  (移動平均)","使用 スワップ・スペース  (MB)  (移動平均)","使用 スワップ・ スペース (バイト/時)","スワップ・スペースが いっぱいになる までの日数","スワップ・ スペース使用の ピーク  (MB)","スワップがフルになるまでの最小日数","空いている 実メモリーの 最低水準点  (KB)"

Your help would be appreciated.

Regards newbie


Hajime
Path Finder

Hi. Unfortunately, you cannot use Japanese field names.

Splunk only accepts field names containing alphanumeric characters or underscores.

For more information, please see:
http://www.splunk.com/base/Documentation/latest/Admin/Configureindex-timefieldextraction#Define_addi...
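As a sketch of how to work within that restriction, the disk extraction could keep the Japanese header row in the file but assign ASCII field names in transforms.conf. The English names below are my hypothetical translations of the original headers, not anything defined in the CSV itself:

```
[argus_extractions_disk]
DELIMS = ","
FIELDS = "timezone","record_time","system_name","timestamp","disk_name","mount_point","filesystem_type","size_mb","used_disk_mb","used_disk_pct","free_disk_mb","free_disk_pct","total_inodes","used_inodes","inode_usage_pct","free_inodes","free_inode_pct"
```

The extracted values stay the same; only the field names shown in search change.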


Suda
Communicator

Hello,

You might be able to extract fields using Japanese field names in this case.

I don't recommend it, so Hajime-san is right. But you can (I confirmed it with Splunk 4.1.7).

Also, keep in mind that Splunk cannot handle field names containing space characters. Could you try changing each " " (space) to "_" (underscore) in your transforms.conf settings?

For example,

[argus_extractions_disk]
DELIMS=","
FIELDS="タイムゾーン","記録時間","システム名","タイム・スタンプ","ディスク名","マウント・ポイント","ファイル・システム・タイプ","サイズ_(MB)","使用_ディスク_(MB)","使用_ディスク_率","フリー・_ディスク_(MB)","フリー・_ディスク_率","合計_i_ノード","使用済み_i_ノード数","i_ノード使用率","空き_i_ノード","フリーの_i_ノード_率"

I hope it will help your splunking!

Thanks.

Kenichi Suda


Hajime
Path Finder

Please try adding the "crcSalt" setting to the monitor stanza in inputs.conf. For more information, please see: http://www.splunk.com/base/Documentation/latest/Admin/Monitorfilesanddirectories#Monitor_syntax_and_...
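A sketch of how this could look, applied to the monitor stanza from the inputs.conf above (assuming the documented `<SOURCE>` literal, which mixes the full source path into the CRC so files with identical beginnings are tracked separately):

```
[monitor:C:\Program Files\Splunk\etc\apps\tougou\tougou_logs\*.csv]
disabled = false
host = My-PC
index = myindex
crcSalt = <SOURCE>
```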


msona
Explorer

Hi Hajime san,

I am facing a problem: not all of the data is getting indexed, and I got the following error in splunkd.log:

02-25-2011 19:41:58.030 ERROR TailingProcessor - Ignoring path due to: File will not be read, is too small to match seekptr checksum (file=C:\EDN\test01\kednwbs01_KLZ_Disk_110213.csv). Last time we saw this initcrc, filename was different. You may wish to use a CRC salt on this source. Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.


Hajime
Path Finder

Yes, you should set it individually.
However, you can use [default] to set it globally.
For example, in props.conf:
[default]
TRANSFORMS-null= setnull


msona
Explorer

Hi, yes, I understand, but all my sourcetypes start with タイムゾーン, so the REGEX is the same for every sourcetype. Should I just use a different name for each?


Hajime
Path Finder

Hello. As you point out, my example applies to a single sourcetype.
I think your settings are correct.


msona
Explorer

Dear Hajime San,
Thank you very much for the quick answers. I tried the above, but the header was removed for only one sourcetype. Here is my configuration.

props.conf
[disk]
REPORT-disk=argus_extractions_disk
TRANSFORMS-null= setnull

[network]
REPORT-network=argus_extractions_network
TRANSFORMS-null= setnull

[cpu]
REPORT-cpu=argus_extractions_cpu
TRANSFORMS-null= setnull

[swap-rate]
REPORT-swap-rate=argus_extractions_swap_rate
TRANSFORMS-null= setnull

transforms.conf:
[setnull]
REGEX = ^タイムゾーン,*
DEST_KEY = queue
FORMAT = nullQueue


Hajime
Path Finder

If the header starts from "タイムゾーン,", you can remove that header in the following way.

props.conf: (e.g. sourcetype=cpu)

[cpu]
TRANSFORMS-null = setnull

transforms.conf:

[setnull]
REGEX = ^タイムゾーン,.*
DEST_KEY = queue
FORMAT = nullQueue

Please try it; it should work.
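For header events that were indexed before this change takes effect, a search-time workaround (just a sketch; substitute your own sourcetype) is to filter them out with the regex command:

```
sourcetype=cpu | regex _raw!="^タイムゾーン,"
```

The nullQueue transform only applies to newly indexed data, so this can hide the leftover header events in the meantime.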


msona
Explorer

Hello,

Thanks for the answer, but how can I remove this header?
I have written English field names so that I can extract all the fields, but the header row itself is still being indexed as an event and shows up in the extracted fields. How can I remove that header?
