Table of Contents
- Preface
- Splunk Sidebar
- Splunk Search
- Splunk Fields in Results
- Splunk Knowledge Objects
- Debrief
1. Preface
I made this blog to provide common Q&A information to anyone interested in using Splunk. It also works well as a reference. Please visit Splunk for the official learning courses.
Splunk Q&A Study Guide
Splunk Enterprise — Q&A — Fields
2. Splunk Sidebar
How are fields broken down?
By selected fields and interesting fields lists
Where are selected fields displayed?
At the top of the sidebar and at the bottom of your events
What are the default fields?
Host, source, and sourcetype
What fields have values in at least 20% of the events?
Interesting fields
What does the letter ‘a’ denote?
It denotes a string value
What does a hash mark denote?
It denotes a numeric value
What is shown when you click on a field?
It shows a list of values for the field, a count of the values, and a percentage of the events the value shows up in
How do you add a field value pair?
Click the value in the field window to add the field value pair to your search. From the same window you can launch quick reports based on the field, or add the field to the selected fields list. The available quick reports will differ depending on the returned values
How do you create a transforming search?
By clicking on a quick report, which displays your results as statistical data
What happens when you add a field to the selected fields list?
The field appears in every event where it occurs and persists for subsequent searches.
Where can you see all fields for the search?
By using the “All Fields” link at the top of the fields list or the “more fields” link at the bottom
What can you do in the “All Fields” and “more fields” windows?
You can filter fields, see the number of values, coverage, and type of each field, and add a field to the selected fields
3. Splunk Search
How do you limit events returned?
Searching for a field name together with a value returns only the events that contain that field value pair
sourcetype=linux_sample
Is case sensitivity an issue with field names or values?
Field names are case sensitive, while values are not
Explain the field operators = and !=
They can be used with fields of numerical or string values
Explain >, <, <= and >=
They can be used for fields with numerical values
How can fields be added to your search?
By clicking on a value in a fields window
Using !=, filter out mail*
host!="mail*"
Use NOT to achieve the same result
NOT host="mail*"
How do you nest search terms in parentheses?
Wrap the terms in parentheses so the operator applies to the whole group. For example, adding www1 to the NOT:
NOT (host="mail*" OR host=www1)
For fields containing an IP address, explain wildcards
wildcards are subnet and CIDR aware
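To make “CIDR aware” a little more concrete, here is a small Python sketch (not Splunk code) of what it means for an IP address to fall inside a CIDR block. The clientip field name and the addresses are made up for illustration.

```python
import ipaddress

def matches_cidr(ip, cidr):
    """Return True when the address falls inside the CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

# Hypothetical events with a hypothetical clientip field
events = [
    {"clientip": "10.1.2.3"},
    {"clientip": "10.2.0.7"},
    {"clientip": "192.168.1.5"},
]

# Keep only the addresses inside the 10.1.0.0/16 subnet
in_subnet = [e["clientip"] for e in events if matches_cidr(e["clientip"], "10.1.0.0/16")]
print(in_subnet)  # ['10.1.2.3']
```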
What is the difference between != and NOT?
!= is a field operator and NOT is a Boolean operator, so they will not always return the same results or number of events
status!=400 returns events that have a status field whose value is not 400
NOT status=400 returns every event that does not have a status field equal to 400
So with NOT, an event that does not have a status field at all is still included in the results
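The difference is easier to see with a tiny model. This is a Python sketch of the two behaviours described above, not Splunk code, and the three sample events are invented.

```python
# Three invented events; the last one has no status field at all
events = [
    {"status": "200"},
    {"status": "400"},
    {"action": "purchase"},
]

def status_not_equal(event, value):
    """Model of status!=value: the field must exist and differ."""
    return "status" in event and event["status"] != value

def not_status_equal(event, value):
    """Model of NOT status=value: everything that is not a match."""
    return not ("status" in event and event["status"] == value)

ne_results = [e for e in events if status_not_equal(e, "400")]
not_results = [e for e in events if not_status_equal(e, "400")]
print(len(ne_results), len(not_results))  # 1 2
```

The event without a status field only shows up in the NOT results, which is why the two searches can return different event counts.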
What is an alternative to chaining together several OR operators?
Splunk can also use the IN operator instead of OR, with the values we want to include in our results wrapped in parentheses
index=web status IN ("200", "300", "404")
What command can be used to exclude or include fields from your search?
The fields command can be used to include or exclude fields from your search
Show a search that runs the stats command against a field
index=web status IN ("200", "300", "404") | stats count by status
Since the only field values being used belong to the status field, only that field (with its count) will be displayed
Show the same search with a fields command added
index=web status IN ("200", "300", "404")
| fields status
| stats count by status
Filtering as early as possible in a search is best practice, so the fields command is placed before the stats command
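What does the fields command use in order to include or exclude a field?
A plus or minus operator. It defaults to inclusion if no operator is specified
index=web status IN ("200", "300", "404")
| fields - status
Here the minus operator removes status from the results. A stats count by status placed after it would return nothing, because the field is already gone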
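As a rough mental model of include versus exclude, here is a Python sketch. It is not how Splunk implements the fields command and it leaves out details of the real command. The sample event and its host and bytes fields are made up.

```python
def fields(events, names, exclude=False):
    """Keep only the named fields, or drop them when exclude is True."""
    if exclude:
        return [{k: v for k, v in e.items() if k not in names} for e in events]
    return [{k: v for k, v in e.items() if k in names} for e in events]

# One invented event with three fields
events = [{"status": "200", "host": "www1", "bytes": "512"}]

kept = fields(events, {"status"})                   # like: fields status
dropped = fields(events, {"status"}, exclude=True)  # like: fields - status
print(kept)     # [{'status': '200'}]
print(dropped)  # [{'host': 'www1', 'bytes': '512'}]
```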
Explain | rename
It is used to rename fields in your search to give them more meaningful or user-friendly names
Supply the field name, then an as clause followed by the new field label. If the new label is a phrase, it needs to be wrapped in quotes. To rename multiple fields with the same command, separate them with commas
index=web status IN ("200", "300", "404")
| fields status
| stats count by status
| rename status as "Example Example", count as "Number of Events"
4. Splunk Fields in Results
When Splunk ingests data into the index, a select number of fields are automatically extracted. What does this include?
This includes metadata fields such as host, source, and sourcetype, and internal fields such as _time and _raw, which contain the event’s timestamp and the original raw data of the event
At search time, field discovery extracts additional fields from raw event data. Explain
Splunk will automatically extract fields from your data based on its assigned sourcetype, as well as key value pairs found in the data. These fields are persistent and will be extracted every time a search is run containing the same search terms, unless you explicitly tell Splunk to not return them
Explain temporary fields
Temporary fields can also be created on an ad hoc basis using commands such as | eval
What is the eval command used for?
The eval command is used to calculate and manipulate field values. The results of an eval command can be written to a new temporary field at search time, or replace an existing field’s values
A new web access policy has been added. Write a basic search showing which sites were being misused during business hours
sourcetype=cisco_wsa_squid s_hostname=*
| stats values(s_hostname) by cs_username
The usage is broken down into categories of usage types
business, borderline, personal, unknown, violation
Use the stats command to find the sum of all bytes used for each usage type
index=network sourcetype=cisco_wsa_squid
| stats sum(sc_bytes) as Bytes by usage
The resulting report shows the amount of data used, in bytes
Convert the bytes to megabytes (a megabyte is 1024 to the power of two bytes)
Pipe the search into an eval command that divides the bytes by 1024, then divides the result by 1024 again
index=network sourcetype=cisco_wsa_squid
| stats sum(sc_bytes) as Bytes by usage
| eval bandwidth = Bytes/1024/1024
Since we used a field name that did not already exist for the results, a new field named “bandwidth” is created
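The arithmetic behind the conversion can be checked with a few lines of Python. This is only a sketch of the math, not Splunk code, and the byte totals per usage type are invented numbers.

```python
def bytes_to_megabytes(byte_count):
    """Divide by 1024 twice, the same steps as the eval expression."""
    return byte_count / 1024 / 1024

# Invented stats-style rows: one total per usage type
rows = [
    {"usage": "Business", "Bytes": 5242880},
    {"usage": "Personal", "Bytes": 1572864},
]

# Writing to a key that does not exist yet creates it, like the new bandwidth field
for row in rows:
    row["bandwidth"] = bytes_to_megabytes(row["Bytes"])

print([row["bandwidth"] for row in rows])  # [5.0, 1.5]
```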
Explain Field extractor
The field extractor utility can be used to extract fields from your data that were not automatically extracted for its assigned sourcetype
You can also use the | erex and | rex commands to extract fields that were not automatically extracted
These commands extract fields from your data using regular expressions, or regex
I recommend watching the Regex Overview Video for more information on regex
While fields extracted with the field extractor are persistent across searches, there might be times when you want to extract values temporarily, for the duration of a search. What does Splunk provide for this?
Splunk provides the erex and rex commands for this. The erex command is a lot like automatic field extraction in the field extractor: you give it a sample of values and Splunk will try to extract what you want
Give an example of this situation
There is unextracted data representing character names for our users. We pipe our search into the erex command, provide a field name to use for the extracted values, use the fromfield argument to tell Splunk which field in our data to match on, and use the examples argument to provide a list of sample data
index=games sourcetype=ExampleUnextracted
| erex Character fromfield=_raw examples="sam, snoot"
What happens when it is run?
When this is run, Splunk builds a regular expression based on the sample data, checks it against the raw event data, and adds matched values to the Character field
While the erex command is very handy when you need quick results, it can suffer from the same issue as automatic field extraction
What does erex look for?
It only knows what to look for based on the sample you have given it
Using the where command with the isnull function, we can see some character names were missed, how can this be fixed?
Adding more samples to the examples argument will fix this
index=games sourcetype=ExampleUnextracted
| erex Character fromfield=_raw examples="sam, snoot, kate"
| where isnull(Character)
Where can you view the regular expression?
We can see the regular expression Splunk used to match by clicking on the Job menu
As with the field extractor, erex is a great way to start your regex, but the result should be checked and edited to match your needs
What information in the job menu suggests using the rex command with the generated regex?
consider using: | rex "(?i) CharacterName:'(?P<Character>[^']+)'"
The rex command allows you to use regular expression named capture groups to extract values at search time
It can be used on field values or raw data, and it allows you to match multiple groups
Extract both the User and Character name values from our beta data, using the _raw field as the field to match on
index=games sourcetype=ExampleGame
| rex field=_raw
The _raw field is used by default if no field argument is given. Note that running rex over the _raw field can have a performance impact, so if you already have an extracted field that you can use, always default to using it
Enter the regular expression created earlier inside double quotes and run the search
index=games sourcetype=ExampleGame
| rex field=_raw "^[^'\n]*'(?P<User>[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.]+)"
we now have a field of User that includes the email addresses
Continue the expression to return a Character field. Behind our capture group, add an apostrophe followed by \s to match the white space. To match the character name label in the data, add a character class of upper and lower case letters and a colon, with a plus quantifier to match all characters within the class, followed by an apostrophe. Then create another capture group named Character. Inside this group, match one or more characters from a class that includes upper and lower case letters and digits, along with the period and dash symbols
index=games sourcetype=ExampleGame
| rex field=_raw "^[^'\n]*'(?P<User>[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.]+)'\s[a-zA-Z:]+'(?P<Character>[a-zA-Z0-9.-]+)"
after running the search we now have User and Character fields available to view
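If you want to try the named capture groups outside Splunk, Python’s re module accepts the same (?P<name>...) syntax. The sketch below runs the expression against a made-up raw event. The real ExampleGame data is not shown in this post, so the event layout here is only an assumption that fits the pattern.

```python
import re

# The same expression as the rex example, split over two lines for readability
PATTERN = re.compile(
    r"^[^'\n]*'(?P<User>[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.]+)'"
    r"\s[a-zA-Z:]+'(?P<Character>[a-zA-Z0-9.-]+)"
)

# Invented raw event; the layout is an assumption, not real ExampleGame data
raw = "2024-01-01 12:00:00 User:'sam@example.com' CharacterName:'snoot'"

match = PATTERN.search(raw)
# groupdict() returns one entry per named capture group
extracted = match.groupdict() if match else {}
print(extracted)  # {'User': 'sam@example.com', 'Character': 'snoot'}
```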
Explain using erex versus rex
erex is easier to use: you do not need to know regex, but you do need to provide sample values, and Splunk generates the regular expression for you
rex requires regex knowledge but does not require sample data, and it can use the regex generated by erex as a starting point
When possible, use rex
5. Splunk Knowledge Objects
When you store an eval command in a calculated field, what will Splunk do?
Automatically create the field at search time, every time a search containing the Bytes field is run
One thing to keep in mind when creating a calculated field is that it can only reference fields that are already present in the events returned by a search. In our earlier search, the Bytes field was created by another command in the search. For calculated fields to work correctly, make sure they are configured to reference a field that has already been extracted
Explain Field Aliases
Field aliases allow you to assign alternate names to fields in your data. If you have fields from multiple sourcetypes, such as User and Username, that all contain similar values, you can create a field alias so that similar values are grouped together. This allows you to search for all values at once in the shared field alias. Field aliases do not replace or remove the original field name, so you can search your data using either the original name or its alias
Explain lookups
Lookups allow you to add other fields and values to your events that are not part of the indexed data. These field value pairs can be configured to be appended to events automatically at search time, allowing you to add related information that lends additional context to your data
What is the order of the search time operations?
The order of search-time operations for knowledge objects related to fields is: field extractions > field aliases > calculated fields > lookups > event types > tags
This means calculated fields can add additional context to extracted fields, since they are evaluated third, but a field alias cannot reference a value from a lookup
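One way to remember the rule is that a knowledge object can only use what an earlier step has already produced. The Python sketch below just encodes the order listed above and checks it. The can_reference helper is a made-up name for illustration, not anything from Splunk.

```python
# The search-time order described above, earliest first
SEARCH_TIME_ORDER = [
    "field extractions",
    "field aliases",
    "calculated fields",
    "lookups",
    "event types",
    "tags",
]

def can_reference(consumer, producer):
    """True when the producer step runs before the consumer step."""
    return SEARCH_TIME_ORDER.index(producer) < SEARCH_TIME_ORDER.index(consumer)

print(can_reference("calculated fields", "field extractions"))  # True
print(can_reference("field aliases", "lookups"))                # False
```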
6. Debrief
I hope this helped answer some general starter questions for anyone just learning Splunk. I really enjoyed doing this and will be making more in the future.