Why you have to start loving JSON and stop using XML [draft]

I recently had to onboard XML data into Splunk. To put it mildly, this is not something that works straight out of the box.

This time around I decided to try to work smarter, so I started looking into tools to convert the XML data into something acceptable for Splunk – enter JSON.

The reason for going with JSON is that Splunk can ingest well-formed JSON with little configuration. Every event carries its own field names as keys, so no field extraction magic is required either.

I scoured the net and ended up using the now deprecated XML2JSON from GitHub.
This Python script simply takes XML as input and spits out JSON, no questions asked.
For this project I specified the options “--strip_namespace --strip_newlines --pretty --strip_text”.
xml2json.py -t xml2json -o ${OUTPUTFILE}.json ${INPUTFILE}.xml
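Since XML2JSON is deprecated, the same idea can be sketched with nothing but the Python standard library. This is not the XML2JSON implementation, just a minimal illustration of the conversion, assuming Python 3. The function names and the '@' and '#text' key conventions are my own choices for the example. Namespaces are dropped and surrounding whitespace is trimmed, roughly what the strip options above are for.

```python
import json
import xml.etree.ElementTree as ET


def strip_namespace(tag):
    """Drop a leading '{namespace}' from an ElementTree tag name."""
    return tag.split("}", 1)[1] if tag.startswith("{") else tag


def element_to_dict(elem):
    """Recursively convert an element into plain dicts, lists and strings."""
    node = {}
    # Attributes get an '@' prefix so they cannot collide with child tags.
    for name, value in elem.attrib.items():
        node["@" + strip_namespace(name)] = value
    for child in elem:
        key = strip_namespace(child.tag)
        value = element_to_dict(child)
        if key in node:
            # Repeated sibling tags are collected into a list.
            if not isinstance(node[key], list):
                node[key] = [node[key]]
            node[key].append(value)
        else:
            node[key] = value
    text = (elem.text or "").strip()
    if not node:
        # Leaf element: return its trimmed text (empty string if none).
        return text
    if text:
        node["#text"] = text
    return node


def xml_to_json(xml_string):
    """Parse an XML string and return pretty-printed JSON."""
    root = ET.fromstring(xml_string)
    return json.dumps({strip_namespace(root.tag): element_to_dict(root)}, indent=2)
```

Calling xml_to_json() on the content of ${INPUTFILE}.xml returns a JSON string that can be written to ${OUTPUTFILE}.json.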

The original XML contained several levels, so I had to use the marvellous tool jq to specify that I was only interested in the data some levels down in the tree.
jq .message.body.bodyContent.meeting ${INPUTFILE}.json > ${OUTPUTFILE}.json
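If jq is not available on the host, the same drill-down can be approximated in Python. This is a rough sketch, not a jq replacement: extract_path is a hypothetical helper that only understands simple dotted key paths like the one above, and, like jq, it yields None (null) when a key is missing.

```python
import json


def extract_path(document, dotted_path):
    """Follow a jq-style dotted path such as '.a.b.c' through nested dicts."""
    node = document
    for key in (k for k in dotted_path.split(".") if k):
        if node is None:
            # A missing parent yields null, as it does in jq.
            return None
        node = node.get(key)
    return node


def extract_file(input_path, output_path, dotted_path):
    """Read a JSON file, keep only the sub-tree at dotted_path, write it out."""
    with open(input_path, encoding="utf-8") as source:
        document = json.load(source)
    with open(output_path, "w", encoding="utf-8") as target:
        json.dump(extract_path(document, dotted_path), target, indent=2)
```

For the data in this post, the call would be extract_file(input_file, output_file, ".message.body.bodyContent.meeting"), where the two file names are placeholders for your own paths.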

On the forwarder (a heavy forwarder in my case), I had to specify a scripted input to actually fetch the data from the API, and also a monitor input to read the resulting JSON:


ignoreOlderThan = 14d

props.conf on the forwarder:

DATETIME_CONFIG=CURRENT is used because these events did not contain any timestamps, so Splunk stamps each event with the time it was indexed.

props.conf on the search heads:

KV_MODE=none tells the search heads that they do not need to extract fields, as this is already done at index time.

Please note that this configuration results in Splunk performing the extractions at index time and not at search time.
This could result in better search performance if you search using the key::value syntax, or using tstats to search indexed data.

This WILL also consume more storage on the indexers.
Final note: indexed extractions are usually not recommended except in specific cases.

Add trusted root certificate to Python running under Splunk context [DRAFT]

Create a .py file with the following content:

from requests.utils import DEFAULT_CA_BUNDLE_PATH; print(DEFAULT_CA_BUNDLE_PATH)

Run the script in the context of Splunk:

[splunker@splunk splunk]$ splunk cmd python ca.py

Then append your certificate in Base64 (PEM) format to the file the script prints, so that Splunk trusts your SSL-inspecting proxy. Bear in mind that this file is write-protected, so you might need to run chmod u+w /exp/splunk/lib/python2.7/site-packages/certifi/cacert.pem first, and then chmod u-w after changing the file.
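The chmod, append, chmod sequence can also be scripted. The following is a minimal sketch, assuming Python 3 and a PEM-encoded certificate. append_certificate is a hypothetical helper of my own, not part of requests or certifi. It skips the write if the certificate is already in the bundle, and restores the original file mode afterwards.

```python
import os
import stat


def append_certificate(bundle_path, pem_certificate):
    """Append a PEM certificate to a CA bundle; return False if already present."""
    pem_bytes = pem_certificate.strip().encode("ascii")
    original_mode = stat.S_IMODE(os.stat(bundle_path).st_mode)
    # Equivalent of 'chmod u+w' on the write-protected bundle.
    os.chmod(bundle_path, original_mode | stat.S_IWUSR)
    try:
        with open(bundle_path, "rb+") as bundle:
            existing = bundle.read()
            if pem_bytes in existing:
                return False
            bundle.seek(0, os.SEEK_END)
            if existing and not existing.endswith(b"\n"):
                # Keep the new certificate on its own line.
                bundle.write(b"\n")
            bundle.write(pem_bytes + b"\n")
            return True
    finally:
        # Put the original permissions back, like 'chmod u-w' afterwards.
        os.chmod(bundle_path, original_mode)
```

Run it as the user that owns the Splunk installation, passing the path printed by ca.py as bundle_path and the text of your proxy's root certificate as pem_certificate.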

GIAC Network Forensic Analyst (GNFA)

Had a great time taking SANS FOR572 in Amsterdam back in August 2019. Nailed the exam with 90% in December, making me a GIAC certified network forensic analyst (GNFA).

This is my second GIAC certification. I earned the GIAC Certified Detection Analyst (GCDA) in October 2018.