Nov 03

Lua Lesson 1 – Tapping TCP Expert data

In this blog, I am going to introduce Lua tap scripting for Tshark. Specifically, this blog is intended to provide a conceptual overview and foundation for more complex development tasks, which will be presented in future blogs.

Download Lua script

Introduction

Wireshark is a great tool for analyzing packet captures. However, there are many cases in which Wireshark doesn’t represent the data in a usable and efficient manner for the given task at hand. Thus, I have often found myself exporting the data to text, using a language such as Python or Perl to perform string parsing and/or pulling the data into Excel for processing and correlation. More recently I have been using Lua for these sorts of tasks.

Lua is an embeddable, interpreted language which is frequently used in the gaming world. While it bears many similarities to languages such as Python, it has fewer data types and built-in functions. This sparseness keeps Lua very small, stable and efficient. Coming from more of a Python and C background, I experienced a period of adjustment, though the basic concepts of programming (control structures, operators, functions, string parsing, etc.) still apply. The Lua interpreter is integrated directly into Wireshark. While this integration allows for the creation of dissectors, we are going to focus on taps, as we are processing data that has already been dissected/decoded by Wireshark/Tshark. The Wireshark Wiki provides additional information on the Lua integration.

To determine whether your version of Wireshark was compiled with Lua support, open “About Wireshark” under the Help menu. Here is my about page, with the pertinent text highlighted:

[Figure: Compiled with Lua]

I mentioned the terms dissector and tap, as if this all makes sense to everyone reading this blog. As I realize that this may not be the case, let me take a step back and discuss these components.

Wireshark/Tshark has a core engine which processes each frame, one frame at a time. For each frame Wireshark/Tshark invokes dissectors, plugins, filters and taps.

  • dissectors decode the various fields and their values
  • plugins are typically external dissectors (dissectors which are not part of the core distribution)
  • filters restrict the data displayed
  • taps work with dissected data, often to create statistics

When a frame is processed, dissectors and plugins populate an ephemeral “protocol tree” with the various decoded values from the current frame. In a typical scenario, the frame dissector is called first and populates the tree with frame-related data, such as the arrival time, capture length, etc. If the frame type is Ethernet, the Ethernet dissector is called. If the payload type is IP, the IP dissector is called, and so on. This continues until there are no further dissectors available and/or no frame data left to dissect. Filters and taps can retrieve information from the tree, but cannot update it. Filters are used to limit which frames are displayed/tapped, and taps are often used to create additional metadata from dissected data.

The following graphic summarizes this relationship:

[Figure: core]

Lua Taps

From the command line, we can run a Lua tap script as follows:

tshark -X lua_script:streamXpert1.lua -r "small.pcapng" -q

This will cause Tshark to initialize the Lua script “streamXpert1.lua” and process the trace file named “small.pcapng”. The “-q” option prevents Tshark from displaying frames on the console, so only the output of our script is printed. For each frame, Tshark calls the tap’s .packet() function in the Lua script. tap.packet() “extracts” field data from each packet, stores this data, and optionally performs calculations, etc. After all packets have been processed, Tshark calls the tap’s .draw() function to produce output. Thus, a useful Tshark Lua script will likely include the following components:
  • Tap definition
  • Field extractors
  • array to hold extracted data and potentially other variables for program flow control, etc.
  • .packet() function
  • .draw() function
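
To make the list above concrete, here is a minimal sketch of a tap script containing all five components. It is not streamXpert1.lua; it only counts TCP segments per stream. The names count_segment, report and per_stream are made up for illustration, and I am assuming that Listener.new("tcp") is how the script attaches to TCP traffic. The "if Listener and Field" guard is there only so the file also loads in a stand-alone Lua interpreter, where the Wireshark API is absent.

```lua
-- Table holding the extracted data: stream number -> segments seen.
local per_stream = {}

-- Called once per segment with that segment's tcp.stream number.
local function count_segment(conn)
  per_stream[conn] = (per_stream[conn] or 0) + 1
end

-- Builds the report as a list of lines (pairs() order is not guaranteed).
local function report()
  local lines = {}
  for conn, n in pairs(per_stream) do
    lines[#lines + 1] = string.format("stream %d: %d segments", conn, n)
  end
  return lines
end

-- The Wireshark-specific wiring only runs inside Wireshark/Tshark.
if Listener and Field then
  local tap = Listener.new("tcp")             -- tap definition
  local tcp_stream = Field.new("tcp.stream")  -- field extractor

  function tap.packet(pinfo, tvb, tapinfo)    -- called for each TCP segment
    count_segment(tonumber(tostring(tcp_stream())))
  end

  function tap.draw()                         -- called after the last packet
    for _, line in ipairs(report()) do print(line) end
  end
end
```

Keeping the counting and reporting in ordinary functions means they can be exercised without a capture file; the two callbacks stay as thin wrappers.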

streamXpert1.lua

streamXpert1.lua examines each TCP segment and keeps, per connection (stream), a count of total frames, retransmissions, duplicate ACKs, out-of-order segments and zero windows. Our output should look similar to this:

[Figure: StreamXpert output]

Initialization

At the beginning of our script, we define a tap using Listener.new(). Specifically, we are going to tap TCP traffic, and our tap is going to be named “tap”. The name of the tap determines the names of its functions: had we named our tap “my_tap”, our functions would be named my_tap.packet() and my_tap.draw().

As mentioned previously, when Wireshark/Tshark dissects a packet, it stores the decoded fields of the current packet in a “protocol tree” structure. To extract information from the tree we must create field extractors. A field extractor is essentially a function which, when called, returns the value of its field for the current packet. Field extractors are created via Field.new() and must be defined outside of the tap.packet() and tap.draw() functions.

To obtain the desired functionality, our script creates and maintains metadata for each connection. This metadata is stored in a table we create named “stream”. Additionally, we create a variable called “nxt_stream” to track the number of streams. This variable is initialized to 0 and is incremented each time a new stream is detected. It is used for three purposes:
  • tap.packet() uses nxt_stream to determine whether the segment received is part of a new stream. Wireshark/Tshark allocates stream numbers sequentially whenever it sees a segment from a socket that it has not seen before. As nxt_stream contains the number of the next stream we expect to be created, when we receive a segment whose tcp.stream value is lower than nxt_stream, we can safely assume that this segment is part of an existing stream and update the entry already in our stream table. Otherwise, it’s a new stream and we must create a new entry in the table.
  • tap.draw() uses the value of nxt_stream to determine whether any TCP packets have been seen. If nxt_stream is 0, no TCP traffic has been seen and there is no purpose in creating a column header. We exit tap.draw() with a message onscreen indicating that no TCP segments were found in the trace.
  • tap.draw() uses nxt_stream to iterate the stream{} table. While it is possible to iterate a table without knowing its length using the Lua pairs() function, pairs() does not necessarily return elements in the order in which they were inserted into the table. We set an iterator to 0 (the first numbered stream) and use nxt_stream as the limit of a while loop. This ensures that our streams are output in sequential order.
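
The ordering point in the last bullet is easy to check in plain Lua. In this small sketch the table contents are made up. Note that ipairs() is not an alternative either: it starts at index 1 and would skip stream 0, whereas a counter bounded by nxt_stream visits every stream in numeric order.

```lua
-- Made-up per-stream frame counts, keyed by stream number starting at 0.
local stream = { [0] = 12, [1] = 7, [2] = 31 }
local nxt_stream = 3

-- ipairs() starts at index 1, so stream 0 is never visited.
local seen_by_ipairs = {}
for i in ipairs(stream) do
  seen_by_ipairs[#seen_by_ipairs + 1] = i
end

-- A counter bounded by nxt_stream visits 0, 1, 2 in that order.
local seen_by_counter = {}
local i = 0
while i < nxt_stream do
  seen_by_counter[#seen_by_counter + 1] = i
  i = i + 1
end

print(table.concat(seen_by_ipairs, ","))   -- 1,2
print(table.concat(seen_by_counter, ","))  -- 0,1,2
```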

tap.packet()

From a high level, Tshark feeds each packet through dissectors, plugins and filters, and then calls the tap.packet() function of the listener we defined. tap.packet() extracts the necessary field data and updates our stream{} table accordingly; for example, if the segment is out of order, we increment the out-of-order counter. Which entry is updated depends on whether the stream is new:
  • If the segment is the first within a stream, we create a “record” for the stream and populate the record accordingly.
  • If the packet is part of a stream for which we have an existing record, we update the values of this record accordingly.
The following flow chart outlines the programmatic logic contained within our tap.packet() function.

[Figure: tap.packet() flow chart]
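
As a rough sketch of the logic described above (not the code from streamXpert1.lua), the new-versus-existing decision can be written as a plain Lua function. The function name record_segment, the record keys and the flags argument are assumptions for illustration; in a real tap the flag values would come from field extractors.

```lua
local stream = {}      -- stream number -> record (a table of counters)
local nxt_stream = 0   -- the next stream number we expect to see

local COUNTERS = { "retrans", "dupack", "ooo", "zerowin" }

-- conn:  the tcp.stream number of the current segment
-- flags: table with optional boolean entries named as in COUNTERS
local function record_segment(conn, flags)
  if conn >= nxt_stream then
    -- First segment of a new stream (normally conn == nxt_stream here).
    stream[conn] = { frames = 0, retrans = 0, dupack = 0, ooo = 0, zerowin = 0 }
    nxt_stream = conn + 1
  end

  local rec = stream[conn]   -- one table lookup, reused below
  rec.frames = rec.frames + 1
  for _, name in ipairs(COUNTERS) do
    if flags[name] then rec[name] = rec[name] + 1 end
  end
end

-- Two segments on stream 0 (one a retransmission), one on stream 1.
record_segment(0, {})
record_segment(0, { retrans = true })
record_segment(1, { zerowin = true })
print(stream[0].frames, stream[0].retrans, nxt_stream)
```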

Stream data structure

Lua has no built-in list or record types. Data structures are based entirely on tables (associative arrays) containing key/value pairs. In our example, in order to replicate a record structure we create a table called stream. In the stream table, the key is the stream number and the value is a reference to another table containing the values for that particular stream. Visually, this could be represented as follows:
[Figure: stream structure]
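
In Lua source, such a table of tables might be written as follows. The key names and numbers are made up for illustration. Because Lua holds tables by reference, a local variable assigned from stream[1] refers to the same record rather than to a copy.

```lua
-- Outer table: stream number -> record. Each record is itself a table.
local stream = {
  [0] = { frames = 120, retrans = 3, dupack = 5, ooo = 0, zerowin = 0 },
  [1] = { frames = 48,  retrans = 0, dupack = 0, ooo = 1, zerowin = 2 },
}

-- rec and stream[1] are two names for the same table, not two copies.
local rec = stream[1]
rec.retrans = rec.retrans + 1

print(stream[1].retrans)   -- 1
print(rec == stream[1])    -- true
```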

tap.draw()

When all packets have been processed, Tshark calls tap.draw(). tap.draw() reads our stream structure and writes the results to the screen. The following flow chart outlines the programmatic logic contained within our tap.draw() function.

[Figure: tap.draw() flow chart]
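
A minimal sketch of that output logic, again in plain Lua and not taken from streamXpert1.lua: the function below returns the report as a list of lines so that it can be tried without Tshark. The name render, the column layout and the record keys are assumptions.

```lua
-- Returns the report as a list of lines; tap.draw() would print each one.
local function render(stream, nxt_stream)
  if nxt_stream == 0 then
    -- No streams were recorded, so skip the column header entirely.
    return { "No TCP segments found in trace" }
  end

  local fmt = "%-8s %8s %8s %8s %8s %8s"
  local lines = {
    string.format(fmt, "Stream", "Frames", "Retrans", "DupAck", "OOO", "ZeroWin"),
  }

  -- Count from stream 0 up to nxt_stream - 1 so rows appear in stream order.
  for conn = 0, nxt_stream - 1 do
    local rec = stream[conn]
    if rec then
      lines[#lines + 1] = string.format(fmt, conn, rec.frames, rec.retrans,
                                        rec.dupack, rec.ooo, rec.zerowin)
    end
  end
  return lines
end

-- Made-up data: a single stream with three frames, one retransmitted.
local demo = { [0] = { frames = 3, retrans = 1, dupack = 0, ooo = 0, zerowin = 0 } }
for _, line in ipairs(render(demo, 1)) do print(line) end
```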

Additional Notes

Wherever possible, I have used local variables to minimize table lookups and unnecessary function calls. If a global variable or function is referenced more than once within tap.draw() or tap.packet(), we assign it to a local variable.

  • Within tap.packet() and tap.draw() we create a local variable that directly references the record of the stream on which we are working. This eliminates the need to search the stream table repeatedly for the same key. For example, if the frame is a retransmission, we need to increment the frame count as well as the retransmission count. A local variable referring directly to the current stream’s record means the key is looked up only once.
  • One interesting line to which I would like to call attention is the following:
local conn=tonumber(tostring(tcp_stream()))
We are storing the current stream number in a local variable, so we only need to call the field extractor once. However, we need to convert the field extractor’s output to a string before converting it to a number, because tonumber() only accepts a string or a number. The return value of tcp_stream() is userdata, and passing it to tonumber() directly yields nil. I have spent many hours trying to figure out why my scripts were not working, only to discover that it was due to a variable type mismatch. For debugging I often add a line such as “print(type(variable))” so that I can determine whether my issue is due to a variable type mismatch.
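
The effect can be reproduced without Tshark. In this sketch, a table with a __tostring metamethod stands in for the userdata a field extractor returns; that stand-in is an assumption made purely so the example runs in a stand-alone interpreter. tonumber() rejects the object itself but accepts its string form.

```lua
-- Stand-in for the value returned by tcp_stream(): neither a number nor a
-- string, but able to describe itself through __tostring.
local fake_fieldinfo = setmetatable({}, {
  __tostring = function() return "7" end,
})

print(type(fake_fieldinfo))                 -- table (userdata in Tshark)
print(tonumber(fake_fieldinfo))             -- nil
print(tonumber(tostring(fake_fieldinfo)))   -- 7
```

The Wireshark Lua API also documents a value attribute on FieldInfo objects, so tcp_stream().value may avoid the double conversion, depending on the Wireshark version in use.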

Summary

The purpose of writing this blog was to provide an introductory tutorial on Lua tapping. While the example script may seem simplistic, it can easily be extended to provide much more value. Future blogs will build upon the basic logic presented here. I hope you enjoyed this exercise and look forward to any questions, comments, corrections and/or suggestions.
