(Bulk) Transport

The Meshy Space Interface (MSI) defines standard elements which are reusable for complex client-service connections. The only reason for such a connection, of course, is to send information and to receive information back. In many cases the amount of data is small and can therefore be in-lined. However, Meshy Space is designed to support huge data quantities: hundreds of gigabytes per connection per day. Transporting the data related to a single call may therefore take hours.

The standard building block for the larger components of MSI messages supports various kinds of transport by default. Which transport method is used to pass the data is hidden from the application. Many strategies are possible to decide which method will be used, including manual configuration.

Negotiation

MSI will not make a choice between packaging methods, compression methods, checksums, encryption, and so on. This is very much on purpose: it needs to be able to evolve organically over time.

The negotiation works a bit like HTTP headers, where your browser uses Accept and Accept-Encoding to tell the web server what it prefers to get. In the case of MSI, the service description lists what the service can deliver, and the client picks its preference. A choice may add cost or impact performance.

For instance, CommonCrawl publishes the content of websites in many WARC files. The files are 5GB uncompressed each. Their service may decide to offer the standard gz-compressed version and an xz-compressed version. The second may be produced only on demand, for an additional price, but is much smaller to download: it is up to the client to decide which version it takes.
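
The choice between the two versions could be sketched as follows. This is only an illustration, not part of MSI: the `Offer` type, the `pick_offer` function, and the sizes and prices are all invented for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Offer:
    """One variant the service says it can deliver (hypothetical type)."""
    compression: str    # e.g. "gzip" or "xz"
    size_bytes: int     # download size of this variant (illustrative numbers)
    extra_cost: float   # additional price; 0.0 when the variant already exists


def pick_offer(offers: Sequence[Offer],
               preferred: Sequence[str],
               max_extra_cost: float) -> Optional[Offer]:
    """Walk the client's preference order and return the first affordable offer."""
    for compression in preferred:
        for offer in offers:
            if offer.compression == compression and offer.extra_cost <= max_extra_cost:
                return offer
    return None  # nothing both sides can agree on


# What the service description might list (values are made up).
offers = [
    Offer("gzip", size_bytes=1_000_000_000, extra_cost=0.0),
    Offer("xz", size_bytes=700_000_000, extra_cost=0.5),
]

# A client that prefers xz but refuses to pay extra ends up with gzip.
choice = pick_offer(offers, preferred=["xz", "gzip"], max_extra_cost=0.0)
```

A client willing to pay would pass a higher `max_extra_cost` and receive the xz variant instead.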

Preparing the data for transport

All transported data has a unique identifier. Always. When the client wants to send data to the service, or the service returns data to the client, that query or answer refers to the data by that identifier plus the transport choices made.

Available transport choices:

  • Content type (html, plain, odt)
  • Content format (xml, json, yaml)
  • Applied compression (gzip, xz)
  • Applied packaging (tar, zip, cpio)
  • Checksum (md5sum, sha256sum)
  • Signature
  • Encryption

Be aware that you can only use choices which both client and service support. In the beginning, very little will be supported.
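
A minimal sketch of how the usable choices could be computed, and how data might then be referred to by identifier plus choices. The names `common_choices` and `DataRef`, the dictionary layout, and the identifier value are assumptions made for this example; MSI does not define them here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


def common_choices(client: Dict[str, List[str]],
                   service: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Per transport aspect, keep the client's options the service also
    supports, preserving the client's order of preference."""
    common = {}
    for aspect, wanted in client.items():
        offered = set(service.get(aspect, []))
        common[aspect] = [option for option in wanted if option in offered]
    return common


@dataclass(frozen=True)
class DataRef:
    """Reference to transported data: identifier plus the choices made
    (hypothetical structure)."""
    identifier: str
    compression: Optional[str] = None
    packaging: Optional[str] = None
    checksum: Optional[str] = None


def first_or_none(options: List[str]) -> Optional[str]:
    """Take the most preferred shared option, or None when there is none."""
    return options[0] if options else None


client = {"compression": ["xz", "gzip"],
          "packaging": ["tar"],
          "checksum": ["sha256sum", "md5sum"]}
service = {"compression": ["gzip"],
           "packaging": ["zip", "tar"],
           "checksum": ["md5sum"]}

common = common_choices(client, service)
ref = DataRef(
    identifier="data-0001",  # made-up identifier
    compression=first_or_none(common["compression"]),
    packaging=first_or_none(common["packaging"]),
    checksum=first_or_none(common["checksum"]),
)
```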

Sending data

Sending data with the client's query has three modes:

In-line
The service-specific data may be fully integrated in the message: using the same format, namespace, etc. This means that the MSI implementation and the service code integrate cleanly.
Uploaded beforehand
The client can ask the service to provide a way (for instance FTP) to upload the data before the operation is started. The service provides location, authorization, restrictions and expiration.
Pass via side-channel
The client asks the service to load the data before it starts the operation. The service will respond with an "upload in progress" refusal on the operation until it has collected the data.

The latter two are useful when the data exceeds megabytes: they are more work to implement, but seriously sized data needs more attention anyway.
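
One possible policy for choosing between the three modes is sketched below. The one-megabyte threshold, the `service_can_fetch` flag, and all names are assumptions for illustration; an actual client may decide on entirely different grounds, including manual configuration.

```python
from enum import Enum


class SendMode(Enum):
    INLINE = "in-line"
    UPLOADED_BEFOREHAND = "uploaded beforehand"
    SIDE_CHANNEL = "pass via side-channel"


# Assumed cut-off: anything larger is not in-lined.
INLINE_LIMIT_BYTES = 1_000_000


def choose_send_mode(size_bytes: int,
                     service_can_fetch: bool,
                     inline_limit: int = INLINE_LIMIT_BYTES) -> SendMode:
    """Pick a sending mode from the data size and whether the data already
    sits somewhere the service can load it from."""
    if size_bytes <= inline_limit:
        return SendMode.INLINE
    if service_can_fetch:
        # The service loads the data itself; until it is done, the
        # operation is refused with "upload in progress".
        return SendMode.SIDE_CHANNEL
    # Otherwise ask the service for an upload location first.
    return SendMode.UPLOADED_BEFOREHAND


small = choose_send_mode(20_000, service_can_fetch=False)
large_remote = choose_send_mode(5_000_000_000, service_can_fetch=True)
large_local = choose_send_mode(5_000_000_000, service_can_fetch=False)
```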

Receiving data

The service may return data in various ways. The configuration is parallel to the options for sending data, but the actual processing sometimes differs.

TODO: explain data flows of the different mechanisms. This does really depend on the simplest way to formulate it in the message schema.


mark@overmeer.net      Web-pages generated on 2023-12-19