Grasp Software Extras: Fitspipe Network Protocol
This page describes the protocol used by the 'fitspipe' server bundled as part of the 'extras' package in the software tarball. It gives a general overview of the server and documents the basic commands used by clients sending or receiving image frames to/from the server.
Emphasis is placed on clients retrieving data; it is expected that external entities will be more likely to be taking frames from a system supplied with GRASP controllers rather than putting data.
Description
"fitspipe" is a network server process that is used to transmit streams of image data from various instruments. Due to its background in astronomy, it is primarily focused on handling data that conforms to the FITS format. As such, some of the conventions used are somewhat FITS-centric.
While it is capable of managing frames from still cameras, its original intent was distribution of video.
To summarise its overall purpose, fitspipe exists to:
- Allow producer clients to distribute image data without needing knowledge of consumers.
- Allow multiple consumer clients to receive image data independently from one another.
- Act as a frame buffer to lessen the requirement that clients always keep up with video frame rates.
- Keep potentially high-frame-rate, high-bandwidth streams away from disk storage. fitspipe predates the existence of fast SSDs, and not relying on disk storage avoids problems with access times, transfer rates, file management, SSD flash wear, etc.
- Decouple the operation of the actual instrument producing image data (protocols, etc.) from the transport of data.
For example, the IR cameras for DL-NIRSP transport data via UDP jumbo frames with no facility to retransmit missed packets. Since attempting to distribute data that way would be a nightmare, the camera server publishes each frame as it's received to fitspipe, translating to TCP, big buffers and normal MTUs.
This document is intended for those that need to interface with fitspipe at the socket level. For users that simply want to quickly use fitspipe in a system, the suggested first step is to use the command-line tools available in the STARGRASP software tarball at https://www.stargrasp.org/wiki/GraspSw .
General Operational Concepts
fitspipe allows multiple independent streams of data to exist simultaneously. In fitspipe parlance, these are termed "feeds". Each feed represents one or more buffered frames of image data. Each frame received is assigned an incrementing sequence number, which is used by clients to retrieve data.
The buffer size for a stream is a function of image dimensions and "depth" in fitspipe parlance. The "depth" here refers to the number of buffered frames that will be kept in memory (not image format or bits per pixel; only 16 bpp is currently supported by fitspipe). The depth can be set on the server command line.
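As a rough illustration of how dimensions and depth drive memory use, the sketch below estimates the pixel payload buffered for one feed. It assumes 2 bytes per pixel (the only supported format) and ignores FITS headers and any server bookkeeping, whose sizes are not specified here. `feed_payload_bytes` is a hypothetical helper, not part of fitspipe.

```python
BYTES_PER_PIXEL = 2  # only 16 bits per pixel is currently supported


def feed_payload_bytes(naxis1: int, naxis2: int, depth: int) -> int:
    """Estimate the pixel bytes buffered for one feed.

    Assumption: headers and server overhead are excluded, so the real
    memory footprint will be somewhat larger than this figure.
    """
    return naxis1 * naxis2 * BYTES_PER_PIXEL * depth


if __name__ == "__main__":
    size = feed_payload_bytes(2048, 2048, 300)
    print(f"{size} bytes ({size / 2**30:.2f} GiB)")
```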
Clients sending frames of image data to the fitspipe server ("put") specify which feed the data should be published to. New feeds are created on demand.
Clients retrieving frames of image data from the server ("get") are responsible for keeping track of which frames are available in the server's buffer versus which frames they have yet to receive. To assist data retrieval without polling, a blocking mechanism exists to allow a client to wait for the next frame to arrive.
All image data transferred from fitspipe is sent in network byte order, i.e. big-endian.
Protocol Stack
Layers of network communications are often classified using the OSI Reference Model. Because the camera system is not just a concept, but a physical server with a hardwired connection to the observatory, it is possible to describe all layers now:
Layer | Purpose | Commands | Data |
Layer 1 | Physical | 10GBase-T | 10GBase-T |
Layer 2 | Data Link | Ethernet | Ethernet |
Layer 3 | Network | IP | IP |
Layer 4 | Transport | TCP | TCP |
Layer 5 | Session | Socket | Socket |
Layer 6 | Presentation | custom | custom |
Layer 7 | Application | cash: Camera Shell | Raw/FITS data |
In so many words, a “socket connection over TCP/IP on Ethernet” could describe interface layers 1-5, but to be explicit it is worth describing each layer with some detail to avoid mistaken assumptions (e.g. Will we support IPv6? Jumbo packets?)
Physical Layer
Between the camera server running fitspipe and the rest of an observatory (CSS in the case of DKIST) will be a series of switches, fibers and/or copper wiring connecting the camera server to the facility network.
Data Link Layer
The camera server expects to operate over 10-Gigabit Ethernet with normal (1500 byte) max. transmission units (MTU). Operation over lower-bandwidth links is possible, but will eventually result in missing data as the aggregate data rate of the camera at the required frame rates is above 1 Gigabit/sec. Operation over higher-bandwidth links is permissible.
Network Layer
Internet Protocol v4 will be used. The camera server running fitspipe will be on a private IP subnet (not Internet). If necessary, switches will route communications with the facility.
Transport Layer
TCP will be used on top of IPv4.
By default, the command server will listen for new connections on TCP port 9999. However, clients should have the flexibility to connect on a different port.
Session Layer
Clients of the camera command interface establish a session by opening a socket. The number of active sessions is limited only by the operating system and CPU resources; the latter becomes increasingly important for high-resolution, high-frame-rate applications.
It is possible for any given client to use a new socket for every frame retrieved, but it is also acceptable to re-use a single socket connection.
Presentation Layer
A client need only be concerned with managing the application layer (described next). While sockets, TCP/IP and Ethernet are all standard, the presentation layer is the first custom layer, so a description is necessary.
Communication at this layer is organised as line-based messages.
First, a “line” is defined as any string of up to 32767 ASCII characters. Within the communications stream, lines are terminated by `\r` or `\n`. Note that this does not imply that the application layer supports command line lengths of 32767. Line type prefixes and encoding of control characters and non-7-bit ASCII characters within the message itself may occupy extra room.
Image data is presented as binary data instead of ASCII with no encoding, FITS header included.
Input direction (command)
Command input is only valid when the previous command response has been received completely, or when a new socket has just been opened. Input consists of a maximum of 32767 characters between ASCII 32 and 127 inclusive, followed by a single \r or \n (ASCII 13 or 10).
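The input rules above can be checked on the client side before anything is written to the socket. The following is a minimal sketch: `encode_command` is a hypothetical helper name, and the choice of `\n` as the terminator is arbitrary since either terminator is accepted.

```python
MAX_LINE = 32767  # maximum number of characters in one input line


def encode_command(line: str) -> bytes:
    """Validate one command line and append a terminating newline."""
    if len(line) > MAX_LINE:
        raise ValueError("command exceeds 32767 characters")
    for ch in line:
        # Only ASCII 32..127 inclusive is valid in command input.
        if not 32 <= ord(ch) <= 127:
            raise ValueError(f"character {ch!r} is outside ASCII 32-127")
    return line.encode("ascii") + b"\n"
```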
TBD More details on encoding.
Output direction (response)
Each command will return one or more lines in response.
After the initial command-response transaction, both the `put` and `get` commands transfer binary data in the relevant direction.
Each response line begins with one of the following:
String | Description |
'> ' | Followed by echo of command, acknowledges that command was received and processing has started. |
'+ ' | Followed by any message line (except the last) of output generated by the command. |
'. ' | Followed by final message from command completing successfully. |
'! ' | Followed by error description from command which has failed. |
'# ' | Followed by 1-line description of image frame, then binary data. Only used in response to `get`. |
The client must not feed a new line of input until ’.’ or ’!’ has been received (or, in the case of 'get', until the initial '#' line has been received, followed by any header data requested and then the image data). The client is also not expected to close the session until such complete response has been received.
The prefix ’* ’ followed by an asynchronous notification is reserved for urgent/out of band messages not associated with any specific command. For example,
’* warning: system shutting down in 5 minutes’
could appear at any time (including in the midst of a response to a command, but never in a way that breaks another line.)
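A client can dispatch on the two-character prefix of each response line. The sketch below collects the output of a text-only command such as `ls`. It assumes every line carries a full two-character prefix and that asynchronous `* ` lines may be set aside for later handling. The names `classify` and `read_response` are illustrative only.

```python
PREFIXES = {
    "> ": "echo",
    "+ ": "continuation",
    ". ": "success",
    "! ": "error",
    "# ": "frame",
    "* ": "async",
}


def classify(line: str) -> tuple[str, str]:
    """Split a response line into (kind, message text)."""
    kind = PREFIXES.get(line[:2])
    if kind is None:
        raise ValueError(f"unrecognised response prefix: {line[:2]!r}")
    return kind, line[2:].rstrip("\r\n")


def read_response(lines):
    """Consume lines until '. ' or '! ' and return (output, notices)."""
    output, notices = [], []
    for line in lines:
        kind, text = classify(line)
        if kind == "async":
            notices.append(text)      # not tied to this command
        elif kind == "continuation":
            output.append(text)
        elif kind == "success":
            return output, notices
        elif kind == "error":
            raise RuntimeError(text)
        # An echo line is informational only and is ignored here.
    raise EOFError("stream ended before the command completed")
```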
In addition to the Conductor Camera Shell namespace, the camera server command interface and the low-level interface to the FPGA boards also use this presentation layer.
Application Layer
If desired, C Language bindings will be provided to interface to the presentation layer. The status server application programming interface (API) is described in detail in the Status Server Client C API document. TBD: A separate C library for accessing the command interface will allow the user to:
- Set callback to receive response message strings.
- Set callback to receive asynchronous messages.
- Check for new messages.
- Send an ASCII command string, wait for it to complete, and return binary (PASS or FAIL) status of the request.
- Interrupt the system (from another session.)
Summary of command string input syntax:
- Command names are lower case letters, digits, and underscores only.
- Command parameters are typically specified as a series of `name=value` pairs.
- Command parameters follow the command name, separated by whitespace.
- Parameter names are case insensitive letters, digits, and underscores only.
- Parameter names may be defined containing a single '*' star character indicating that the parameter name may be abbreviated beyond that point.
  - The root portion of the parameter name (up to any '*' star character given in the definition) is required. Additional characters, up to the full name, are optional but may not be mistyped.
  - Abbreviations are intended as a convenience for manual operation modes. Automated clients and scripts should use the full name.
- Quotation: '...' and "..." are accepted.
- Any characters after '#' outside of quotation are taken as comments.
- Command parameters are not sensitive to white space outside of quotation.
- Commands may define their parameters as positional and non-positional.
  - Positional parameters may be supplied only as a value without the name= portion of the parameter pair.
  - Positional parameters are defined by their order in the set of command parameters.
  - If the first character in a parameter argument is not a legal parameter name character, or if the first non-legal character is not '=' (equal sign) then the entire argument is interpreted as a value for a positional parameter.
  - Non-positional parameters are fully specified by a name=value pair.
  - Non-positional parameters are order-independent; each pair fully specifies the name of the parameter, so the parameter may appear anywhere in the set.
  - Positional rules are a convenience for manual operation. Automated clients and scripts should always generate parameters with the full option name and an '='.
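For automated clients, the syntax rules above suggest always emitting full `name=value` pairs. The sketch below builds such a command string. The quoting policy is an assumption: values containing whitespace, `#`, or a quote character are wrapped in whichever quote style they do not contain, since no escape mechanism is documented here. `build_command` and `quote` are hypothetical helpers.

```python
import re

_COMMAND = re.compile(r"[a-z0-9_]+\Z")      # lower case, digits, underscore
_PARAM = re.compile(r"[A-Za-z0-9_]+\Z")     # case-insensitive names


def quote(value) -> str:
    """Quote a value only when it needs it (assumed policy, see lead-in)."""
    text = str(value)
    if text and not any(c.isspace() or c in "#'\"" for c in text):
        return text
    if '"' not in text:
        return f'"{text}"'
    if "'" not in text:
        return f"'{text}'"
    raise ValueError("value contains both quote characters")


def build_command(command: str, **params) -> str:
    """Build 'command name=value ...' using full parameter names."""
    if not _COMMAND.match(command):
        raise ValueError(f"illegal command name: {command!r}")
    parts = [command]
    for name, value in params.items():
        if not _PARAM.match(name):
            raise ValueError(f"illegal parameter name: {name!r}")
        parts.append(f"{name}={quote(value)}")
    return " ".join(parts)
```

For example, `build_command("get", feed="default", frame=3629, fullheader=1)` yields `get feed=default frame=3629 fullheader=1`.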
The actual commands intended for the application layer are described in the following section.
Command Set
Fitspipe supports a very limited command set. There are single commands for feed discovery/interrogation, to put a frame to the server, and to get one. These are performed with `ls`, `put` and `get` respectively.
ls
`ls` lists all of the available feeds, and provides some information about each one. The command serves two purposes:
- Discovery of available feeds,
- Retrieving information about each feed prior to retrieving a frame of image data.
The `ls` command takes no parameters.
If one or more feeds exist, the client receives one line per feed in response. Each per-feed line begins with the `+ ` continuation prefix documented above, followed by the following `name=value` parameter pairs:
Name | Type | Description |
'feed=' | String | Name of feed. |
'naxis1=' | Integer | Width of image frame. |
'naxis2=' | Integer | Height of image frame. |
'depth=' | Integer | Number of buffered frames. |
'oldest=' | Integer | Sequence number of oldest frame currently held in buffer. |
'newest=' | Integer | Sequence number of most recent frame currently held in buffer. |
This is then followed by a final line indicating successful completion of the command.
If no feeds have been created, then there will be no lines of data ahead of the completion indication, but the command will indicate that it was processed successfully.
For example:
ls
+ feed=default naxis1=2048 naxis2=2048 depth=300 oldest=3330 newest=3629
. OK
The above response indicates that there is a single feed named "default" available, representing 300 buffered frames of 2048x2048-pixel image data. The oldest frame can be retrieved with sequence number 3330, and the most recent by requesting frame 3629. Frames with sequence numbers below that range have been dropped from the buffer and no longer exist, and frames with higher sequence numbers have yet to be received.
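A per-feed line like the one in the example can be turned into a dictionary with a few lines of code. This is a sketch only: it assumes the line has already been read as text, and it uses `shlex.split` as a stand-in for the server's quoting rules, which may differ in detail. `parse_ls_line` is a hypothetical name.

```python
import shlex

INTEGER_FIELDS = ("naxis1", "naxis2", "depth", "oldest", "newest")


def parse_ls_line(line: str) -> dict:
    """Parse one '+ ' line from the ls response into a dict."""
    if not line.startswith("+ "):
        raise ValueError("not a per-feed continuation line")
    info = {}
    for pair in shlex.split(line[2:]):
        name, _, value = pair.partition("=")
        info[name.lower()] = value
    for key in INTEGER_FIELDS:
        info[key] = int(info[key])
    return info


if __name__ == "__main__":
    feed = parse_ls_line(
        "+ feed=default naxis1=2048 naxis2=2048 depth=300 oldest=3330 newest=3629"
    )
    # Number of frames currently retrievable from the buffer.
    print(feed["feed"], feed["newest"] - feed["oldest"] + 1)
```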
put
The `put` command publishes a single frame of FITS image data to the server. Given the name of a feed, the server then expects a binary transfer of a full simple FITS image, including a complete header and padding as required in a FITS file if it were on disk.
Because the transfer includes the header, the client doesn't need to specify any other information about the image being transferred other than the feed name; the FITS header defines the dimensions of the data, etc. This can be useful for general-purpose clients that don't need to know anything about the data they're uploading.
The put command takes the following parameters:
Name | Type | Required? | Description |
'feed=' | String | Required | Name of the feed to publish to. If the feed doesn't already exist, it is created. |
The server replies with a single line indicating success or failure. Upon success, the client then writes the entire FITS image, header included, as binary data.
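The sketch below shows one way a producer might assemble a simple FITS image and publish it. Several details are assumptions rather than documented behaviour: the header carries only a minimal keyword set, raw pixels are unsigned 16-bit values stored with BZERO=32768, any reply line not starting with `! ` (other than an echo or asynchronous line) is treated as success, and `rfile`/`wfile` are binary file objects such as those returned by `socket.makefile`. `card`, `build_fits` and `put_frame` are hypothetical names.

```python
import struct

BLOCK = 2880  # FITS headers and data are padded to multiples of this


def card(keyword: str, value: str) -> bytes:
    """Format one 80-byte header card with a right-justified value."""
    return f"{keyword:<8}= {value:>20}".ljust(80).encode("ascii")


def build_fits(width: int, height: int, pixels) -> bytes:
    """Build a padded simple FITS image from unsigned 16-bit pixels."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match dimensions")
    header = b"".join([
        card("SIMPLE", "T"), card("BITPIX", "16"), card("NAXIS", "2"),
        card("NAXIS1", str(width)), card("NAXIS2", str(height)),
        card("BZERO", "32768"), card("BSCALE", "1"),
        b"END".ljust(80),
    ])
    header += b" " * (-len(header) % BLOCK)       # header pads with spaces
    # Stored value = raw - BZERO, written as big-endian signed 16-bit.
    data = struct.pack(f">{len(pixels)}h", *(p - 32768 for p in pixels))
    data += b"\0" * (-len(data) % BLOCK)          # data pads with zeroes
    return header + data


def put_frame(rfile, wfile, feed: str, fits_bytes: bytes) -> None:
    """Send 'put', wait for the one-line reply, then send the image."""
    wfile.write(f"put feed={feed}\n".encode("ascii"))
    wfile.flush()
    while True:
        reply = rfile.readline()
        if not reply:
            raise EOFError("connection closed before reply")
        if reply.startswith(b"! "):
            raise RuntimeError(reply[2:].decode("ascii", "replace").strip())
        if reply[:2] not in (b"> ", b"* "):       # assumed success line
            break
    wfile.write(fits_bytes)
    wfile.flush()
```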
get
The `get` command retrieves a single frame of image data from the server. The client passes the name of the feed and a sequence number for the frame to be retrieved, along with an indication of whether the FITS header should also be downloaded.
The get command takes the following parameters:
Name | Type | Required? | Description |
'feed=' | String | Required | Name of the feed to fetch a frame from. |
'frame=' | Integer | Optional | Sequence number of the frame to be retrieved. If not given, default behaviour is to fetch the most recent frame. |
'fullheader=' | Boolean | Optional | Indicates whether FITS header should be transferred. '1' to send FITS header, '0' to send only the image data. Default value is 0. |
The response to the get command is a little different from all other commands in a couple of ways:
- If the command was parsed successfully and the requested image is available, the response will be a single line beginning with `# ` as indicated above, followed by a minimal amount of data about the image. The intent is to signify that the first line is not part of the actual data, but is more like a comment providing more information about the following image data.
- If the command was parsed successfully, the feed exists, but the frame number is for a future frame that does not yet exist, the response line will begin with `# `, but further output to that client will block until that frame becomes available. Once the frame has been uploaded to the server, the rest of the metadata comment line will be sent, followed by the image data.
- The complete response line is always a fixed 40 bytes in length, including the trailing newline.
The fixed-length response takes the following format:
'# frame width x height \n'
where the fields occupy fixed positions and have fixed widths:
frame:: The frame sequence number, as recorded by the server (integer, 10 digits wide, chars 2-11).
width:: The width of the image in pixels (integer, 10 digits wide, chars 13-22).
height:: The height of the image in pixels (integer, 10 digits wide, chars 26-35).
There is a guaranteed space between the leading `#` and the first field and also between each of the fields. The fields are always arranged in the fixed offsets indicated above. The line ends with three spaces and a newline to fill out the remaining 40 bytes.
Note that if the client requests a frame number that has been discarded (i.e. is lower than the oldest buffered frame), the server currently responds by sending the latest frame. It is up to the client to check that the frame number received is the same as that requested, and to deal with the case where it is not.
Both this behaviour and the fixed 40-byte format stem from a previous system involving small images transferred at high rates, keeping the overhead down as far as possible in terms of bytes transferred (and thus the number of packets per frame) and latency. The fixed-width/fixed-position fields also reduce the complexity of parsing the received data, further reducing latency.
Only 16-bit-per-pixel image data is supported, so at this point, the size of the image data in bytes is known.
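Because the fields sit at fixed offsets, the 40-byte line can be parsed by slicing. The sketch below reads the character ranges as zero-based offsets, with the `#` at offset 0. That is an interpretation, but it is the one under which the fields, separators, three trailing spaces and newline add up to exactly 40 bytes. `parse_get_line` is a hypothetical name.

```python
RESPONSE_LEN = 40
BYTES_PER_PIXEL = 2  # only 16-bit-per-pixel data is supported


def parse_get_line(line: bytes) -> tuple[int, int, int, int]:
    """Return (frame, width, height, image_bytes) from the '# ' line."""
    if len(line) != RESPONSE_LEN:
        raise ValueError("response line is not 40 bytes long")
    if not line.startswith(b"# ") or not line.endswith(b"\n"):
        raise ValueError("malformed response line")
    frame = int(line[2:12].decode("ascii"))     # chars 2-11
    width = int(line[13:23].decode("ascii"))    # chars 13-22
    height = int(line[26:36].decode("ascii"))   # chars 26-35
    return frame, width, height, width * height * BYTES_PER_PIXEL
```

The last element of the returned tuple is the number of image bytes the client should expect after any requested header.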
After the initial one-line response, the server sends the frame as binary data.
If a header was requested, then the client is expected to detect and handle the additional data by following the standard for FITS headers. The client will not know ahead of time the size of the header, but the rules for FITS headers make it possible to determine the extent of the header. While it is outside the scope of this document to fully describe the FITS header format, the relevant rules can be summarised as follows:
- FITS headers consist of one or more blocks of exactly 2880 bytes.
- Each block consists of 36 80-byte 'cards'.
- Each card begins with an 8-byte text keyword that names the card.
- If any keyword in a block is `END     ` ("END" followed by 5 ASCII spaces), that indicates the block is the last in the header.
- The `END` keyword may appear at any card position within a block.
- No cards containing data will be found after the `END` keyword.
- Alternatively, if no `END` keyword is present in a header block, the header continues, and the current block will be followed by at least one more 2880-byte block.
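The rules above translate into a short loop: read 2880-byte blocks until one of them contains a card whose keyword is `END`. The following sketch works on any binary file-like object with a `read` method. `read_exact` and `read_fits_header` are hypothetical helper names.

```python
BLOCK = 2880   # bytes per header block
CARD = 80      # bytes per card
END_KEYWORD = b"END     "  # "END" followed by 5 ASCII spaces


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes, looping over short reads."""
    buf = bytearray()
    while len(buf) < count:
        chunk = stream.read(count - len(buf))
        if not chunk:
            raise EOFError("stream ended mid-transfer")
        buf += chunk
    return bytes(buf)


def read_fits_header(stream) -> bytes:
    """Read whole 2880-byte blocks up to and including the END block."""
    header = bytearray()
    while True:
        block = read_exact(stream, BLOCK)
        header += block
        # Check the 8-byte keyword at the start of each of the 36 cards.
        for offset in range(0, BLOCK, CARD):
            if block[offset:offset + 8] == END_KEYWORD:
                return bytes(header)
```

After `read_fits_header` returns, the stream is positioned at the first byte of image data.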
After the last 2880-byte block is received for the header, the server sends the image data. The image data is contiguous with the last header block, and consists of 16-bit-per-pixel values of the dimensions indicated in the first line of the response. The first pixel sent is (0,0), followed by the rest of the first row of pixels, followed by the next row, etc.
If no header is requested, then the header is skipped and the image data is sent immediately after the one-line text response.
Note that it is strongly recommended that clients always take the header for each image. The overhead of receiving a few header blocks per frame versus the full image frame is usually minimal, and the headers may contain metadata that could be useful to be passed on from the camera system.
Common FITS Pitfalls
While not strictly related to the fitspipe protocol, the FITS standard does lay a few traps for the unwary. A brief recap is therefore probably worthwhile:
As mentioned, image data is sent in network byte order, and handily the FITS standard requires big-endian image data. It may be tempting to think that little-endian systems need to do a lot of byte-swapping to "fix" the 16-bit pixels, but this is not the case so long as the data remains in FITS format. However, the client must consider endianness if it translates the image data to some other format.
The FITS standard does not allow for unsigned integer values for pixels; all stored values are signed. However, raw image data is typically a series of unsigned A/D converter measurements. To get around this, a standard FITS header will include two keywords, `BZERO` and `BSCALE` which are used (as per the astropy docs) to transform stored values to the original physical values as follows:
physical value = BSCALE * (storage value) + BZERO
Because the client that published the data to fitspipe followed the FITS standard, the reverse transform was applied to the raw data before uploading, and all of the pixel values were shifted and scaled. If the client intends to write data out to a non-FITS format, it must transform every one of the pixels received before passing the data along. Failure to do so will result in odd value wrapping problems, etc.
In general, for most 16-bit-per-pixel systems it is somewhat safe to assume that BZERO is 32768 and BSCALE is 1.0. However, it is more correct to transfer the header and retrieve the values directly from the relevant keyword values.
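The transform can be applied to a received buffer as in the sketch below. It decodes big-endian signed 16-bit stored values and applies `BSCALE * stored + BZERO`. The default arguments are the common values mentioned above and should be replaced by the header values where available. `to_physical` is a hypothetical name.

```python
import struct


def to_physical(data: bytes, bzero: float = 32768.0, bscale: float = 1.0):
    """Convert big-endian signed 16-bit stored values to physical values."""
    count = len(data) // 2
    # '>' selects big-endian (network) order, 'h' is a signed 16-bit int.
    stored = struct.unpack(f">{count}h", data[:count * 2])
    return [bscale * value + bzero for value in stored]


if __name__ == "__main__":
    sample = struct.pack(">3h", -32768, 0, 32767)
    print(to_physical(sample))
```

With the default values, the stored extremes -32768 and 32767 map back to the unsigned range 0 to 65535.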
If the client intends to write a FITS image using the supplied header and image data, please note that the FITS standard requires the image data to be padded out to the next 2880-byte boundary with zeroes.
The image data from the server does not include this padding; it is the client's responsibility to generate and append this padding as needed.
The rationale for this is threefold:
- It is trivial to generate the required padding within the client.
- It is far less expensive to generate locally than transferring trailing zeroes across the network (a client can calloc() a 2880-byte buffer once and then keep it on hand for padding all frames).
- Padding is only applicable to clients that intend to produce FITS data. If the server sent it, clients that intend to translate from FITS to some other format would always have to expect the trailing padding and then remove it.
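Generating the padding locally takes one line of arithmetic. The sketch below computes the zero padding for a given image size and writes a complete FITS file from a received header and pixel buffer. `fits_padding` and `write_fits` are hypothetical names.

```python
BLOCK = 2880  # FITS block size in bytes


def fits_padding(nbytes: int) -> bytes:
    """Return the zero bytes needed to reach the next 2880-byte boundary."""
    return b"\0" * (-nbytes % BLOCK)


def write_fits(out, header: bytes, pixels: bytes) -> None:
    """Write header, image data and trailing padding to a binary file."""
    out.write(header)
    out.write(pixels)
    out.write(fits_padding(len(pixels)))
```

For a 2048x2048 frame at 16 bits per pixel (8388608 bytes), the padding is 832 bytes.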
Examples
Some examples of client-server interactions to transfer frames from fitspipe.
Note: trailing newline characters on each message are omitted for clarity.
Client retrieves latest frame
The client grabs one frame (the latest in the server's buffer) from the server with no header.
Client retrieves latest frame (including header)
The client grabs one frame (the latest in the server's buffer) from the server, preceded by a header.
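A minimal sketch of this exchange at the socket level follows. It makes several assumptions: echo (`> `) and asynchronous (`* `) lines, if the server sends them, arrive as whole lines ahead of the `# ` line and can be skipped, the 40-byte line uses zero-based field offsets, and the host and feed names are placeholders. `get_frame` and `fetch_latest` are hypothetical names, not part of the STARGRASP tools.

```python
import socket

BLOCK = 2880


def read_exact(rfile, count: int) -> bytes:
    """Read exactly count bytes from a binary file object."""
    buf = bytearray()
    while len(buf) < count:
        chunk = rfile.read(count - len(buf))
        if not chunk:
            raise EOFError("connection closed mid-transfer")
        buf += chunk
    return bytes(buf)


def get_frame(rfile, wfile, feed, frame=None, fullheader=True):
    """Issue 'get' and return (frame_number, header_bytes, pixel_bytes)."""
    command = f"get feed={feed}"
    if frame is not None:
        command += f" frame={frame}"
    command += f" fullheader={1 if fullheader else 0}\n"
    wfile.write(command.encode("ascii"))
    wfile.flush()
    while True:
        line = rfile.readline()
        if not line:
            raise EOFError("connection closed before response")
        if line.startswith(b"! "):
            raise RuntimeError(line[2:].decode("ascii", "replace").strip())
        if line.startswith(b"# "):
            break  # anything else (echo, async notice) is skipped
    if len(line) != 40:
        raise ValueError("response line is not 40 bytes long")
    got = int(line[2:12].decode("ascii"))
    width = int(line[13:23].decode("ascii"))
    height = int(line[26:36].decode("ascii"))
    header = b""
    while fullheader:
        block = read_exact(rfile, BLOCK)
        header += block
        if any(block[i:i + 8] == b"END     " for i in range(0, BLOCK, 80)):
            break
    pixels = read_exact(rfile, width * height * 2)  # 16 bits per pixel
    return got, header, pixels


def fetch_latest(host, feed="default", port=9999):
    """Open a connection and fetch the most recent frame with its header."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("rb") as rfile, sock.makefile("wb") as wfile:
            return get_frame(rfile, wfile, feed)
```

Passing `fullheader=False` to `get_frame` gives the header-less variant. The same pair of file objects can be reused for further requests on the same connection.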
Client Gets Multiple Frames
A client that wishes to get multiple frames has a couple of different strategies it can use, all of which are similar to the latest frame example above. After the initial `ls` to get a sense of the current frame sequence number, the client can attack this in a couple of ways:
- The client queries the server with 'ls' ahead of each and every frame.
- The client can maintain its own running sequence number without repeated queries of the server.
In either case, it is the client's responsibility to ensure that the sequence number in the initial `# ` response line matches that which was requested.
It is also the client's choice as to how to react to a mismatch. For example, the `fitspipe-get` example tool in the STARGRASP tarball doesn't attempt to do anything intelligent - if it falls so far behind that the server returns the latest frame instead of a frame it no longer has, `fitspipe-get` simply declares the missing frames lost forever while printing a warning, instead of trying to do something more dynamic to attempt to drop as few frames as possible.
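The simple policy described above (accept the gap, count the lost frames, carry on from the frame actually received) can be sketched as follows. This illustrates the bookkeeping only and is not the code of `fitspipe-get`. `reconcile` is a hypothetical name.

```python
def reconcile(requested: int, received: int) -> tuple[int, int]:
    """Return (frames_lost, next_frame_to_request).

    If the server substituted the latest frame for one that was already
    discarded, every frame from 'requested' up to 'received - 1' is lost.
    """
    if received < requested:
        raise ValueError("server returned a frame older than requested")
    lost = received - requested
    return lost, received + 1


if __name__ == "__main__":
    lost, nxt = reconcile(requested=100, received=450)
    if lost:
        print(f"warning: {lost} frames lost, resuming at {nxt}")
```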
Client Queries Server on Each Frame
In this example, the client makes no assumptions about which is the "current" frame, so it queries the server ahead of each transfer.
Note: if the client relies solely on the server's view of the latest frame without maintaining any sort of state for itself, it may be vulnerable to skipping frames if the client momentarily takes longer than a single frame period between requests. For example, if it requests and gets frame 300, then some sort of garbage collection mechanism kicks in (or similar) and the camera keeps producing new data in the meantime, it may come back and find on the next query that the current frame is 302. If it blindly requests that frame, then it never gets frame 301 from the server, and drops that frame.
For that reason, so long as latency is not a concern, it is strongly recommended that the client maintain its own counter and request sequential frames even if it falls behind, whether or not it also queries the server each time. For a given application, fitspipe's buffer should have been sized appropriately to allow the client to lag for a short time and then catch up again, dropping no data.
On the other hand, if latency is more important than receiving and processing each and every frame (e.g. for guide video), it may be preferable to assume older frames are stale and should be skipped. As such, it is probably preferable in this case to always request the latest frame regardless of however many other frames are buffered; applications such as this might also consider running fitspipe with a very small frame buffer.
Client Maintains its own Frame Count
Alternatively, the client, after initially querying the server, can maintain its own frame counter and request sequential frames without checking with the server each time.
The client may request and receive headers on each frame; this is omitted for brevity.
Example Client
As an added bonus, an example implementation of a simple fitspipe-get client in Python was produced from reading this document to illustrate fetching images from a fitspipe server with full FITS headers, and then producing either concatenated frames on stdout to be piped to other fitspipe-style tools, or a series of individual simple FITS image files.
This is NOT provided for production use. The example code does not make any serious attempts at error detection (never mind correction), does the bare minimum to get things up and running on the network, makes a whole bunch of assumptions that are easily exploitable, and is likely not a good guide to Python programming practices in general.
Attachments
- example_client_latest_frame.png (25 kB) - Sequence diagram showing example of client getting latest frame from server (no FITS header), added by crae on Tue Sep 10 10:58:38 2019.
- example_client_latest_frame_with_header.png (34 kB) - Sequence diagram showing example of client getting latest frame from server (including FITS header), added by crae on Tue Sep 10 10:59:14 2019.
- example_client_new_frame_ls.png (33 kB) - Sequence diagram showing example of client waiting for new data, querying server feed ahead of each new frame, added by crae on Tue Sep 10 10:59:59 2019.
- example_client_new_frame_counter.png (59 kB) - Sequence diagram showing example of client waiting for new data, initially querying server feed then maintaining its own frame counter, added by crae on Tue Sep 10 11:12:36 2019.
- fitspipe-get.py (13 kB) - Example implementation of 'fitspipe-get' in python. NOT FOR PRODUCTION USE, added by crae on Tue Sep 10 11:18:02 2019.