Upload files into Splunk through the API
Learn how to programmatically upload CSV files through the API for data ingestion, using either the standard Linux command curl or an HTTP client written in Go.
Although you can use “Upload and Index” in the Splunk GUI to upload and index a file, there are advantages to having an API endpoint that provides the same functionality.
There are a few CLI functions that can perform one-time bulk loads of data. However, we are not going to cover CLI examples here; we are going to focus on the Splunk API.
Scenarios when this method might be appropriate:
- Daily scheduled reports that need to be imported
- Files are in a standard format such as CSV and can be directly imported
- The only access to the Splunk server is through the REST interface
Keep reading for a quick overview of the Splunk API method or jump further down to see the examples.
Splunk API
Splunk has extensive online documentation on how to use the API; however, we will focus only on this endpoint:
https://<host>:<mPort>/services/receivers/stream
In the expanded usage frame the following is provided:
For HTTP uploads, if the caller passes a content-type of “multipart/form-data”, the HTTP file upload protocol is used and files are indexed.
Before attempting to post a file let's review what will be needed.
Splunk Checklist
In order to make an API post you will need the following:
- Access to the Splunk server on port 8089, which is the default HTTPS REST API port.
- Splunk Credentials, either username and password OR a token.
- Name of the index to use.
- Name of the SourceType to use.
- A file to upload.
In the examples below I will be using:
- Splunk Server: splunk.mydomain.com
- Index: mycustomindex
- SourceType: mycustomcsv (a custom source type, because I have mapped specific fields from my CSV file in Splunk)
- File: my-datafile-20210832.csv
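These values end up in a single request URL. As a rough Go sketch of how they combine, the helper below (streamURL is a hypothetical name) attaches the index and source type as query parameters to the receivers/stream endpoint. The host parameter and its value go-testing are assumptions, modeled on the host parameter used in the curl command in this article.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// streamURL builds the receivers/stream URL for a Splunk server and
// management port, attaching the metadata as query parameters.
func streamURL(server, port, index, sourcetype, sourceHost string) string {
	q := url.Values{}
	q.Set("index", index)
	q.Set("sourcetype", sourcetype)
	q.Set("host", sourceHost) // assumption: label for the sending host or application

	u := url.URL{
		Scheme:   "https",
		Host:     net.JoinHostPort(server, port),
		Path:     "/services/receivers/stream",
		RawQuery: q.Encode(), // Encode escapes values and sorts keys alphabetically
	}
	return u.String()
}

func main() {
	fmt.Println(streamURL("splunk.mydomain.com", "8089", "mycustomindex", "mycustomcsv", "go-testing"))
}
```

Because Encode sorts the keys, the parameters come out in the order host, index, sourcetype. The order of query parameters does not matter to the server.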
Before uploading the file through the API, I recommend manually uploading a sample file from the Web UI to work through any issues or customizations with the data.
Upload Using Curl
If you are not familiar with the Linux command curl, here is how its man page describes it:
curl is a tool to transfer data from or to a server, using one of
the supported protocols (DICT, FILE, FTP, FTPS, GOPHER, HTTP,
HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS,
RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET or TFTP). The
command is designed to work without user interaction.
curl offers a busload of useful tricks like proxy support, user
authentication, FTP upload, HTTP post, SSL connections, cookies,
file transfer resume and more. As you will see below, the number
of features will make your head spin!
Simple Example
curl -D - -u $USERNAME:$PASSWORD -F 'data=@my-datafile-20210832.csv' "https://splunk.mydomain.com:8089/services/receivers/stream?sourcetype=mycustomcsv&index=mycustomindex&host=curl-testing"
Let’s break down the above command in detail. We know that curl is uploading the file to Splunk but the flags need some explaining.
-D/--dump-header <file>
This tells curl to dump the protocol headers it receives from the server. The extra "-" after the "-D" indicates standard out instead of a file.
-u $USERNAME:$PASSWORD
Specify the user name and password to use for server authentication. In this example I am referencing environment variables where I have currently stored my credentials.
-F 'data=@my-datafile-20210832.csv'
This lets curl emulate a filled-in form in which a user has pressed the submit button. This causes curl to POST data using the Content-Type multipart/form-data according to RFC 2388. This enables uploading of binary files etc. To force the 'content' part to be a file, prefix the file name with an @ sign.
The “-F” flag is the critical one: it makes curl send the file in multipart/form-data format, which Splunk supports. You can find more examples in the curl man pages if needed.
The query string parameters are standard Splunk parameters; the only one that needs explaining is “host”, which indicates the host or application that sends in the data.
Using a Token
Using a username and password is not a good solution if this needs to be deployed and automated. Fortunately, Splunk supports authorization tokens:
curl -D - -H "Authorization: Bearer $SPLUNK_TOKEN" -F 'data=@my-datafile-20210832.csv' "https://splunk.mydomain.com:8089/services/receivers/stream?sourcetype=mycustomcsv&index=mycustomindex&host=curl-testing"
Upload Using Go
The Go example uses an API token to upload a file from disk to Splunk using the same multipart/form-data format that curl used above.
The added complexity in this example exists because I don’t want the entire file loaded into memory during the upload process. Depending on the infrastructure where this code runs and the size of the files in question, memory could be an issue. Therefore I used an io.Pipe to stream the data from the file system as it is being uploaded into Splunk.
Let's Recap
We have covered two upload examples, using both standard username/password credentials and token authentication. The real advantage of this method is that the data does not go through a transformation process. A lot of the Splunk examples demonstrate parsing a file into JSON and then uploading events. Here we are uploading the file as-is, without modification.