Retrieving Files from Documents
A Therefore™ document may contain one or more files (e.g. two Word documents, a PDF file, and three JPEG images), usually belonging to the same topic or task. These individual files in a Therefore™ document are often referred to as streams.
Here is a comparison table of options available in Therefore Web API for file retrieval.
Therefore™ Web API Method Name | HTTP verb | Transfer-Encoding: chunked (restun-streamed) | Binary Payload (restun, restwin) | Base64 String (restun, restwin) | JSON Byte Array (restun, restwin) | Binary Payload (soapun, soapwin) | Base64 String (soapun, soapwin) | JSON Byte Array (soapun, soapwin) |
---|---|---|---|---|---|---|---|---|
 | POST | - | - | YES, see IsStreamDataBase64JSONNeeded param in request | YES | - | YES | - |
 | POST | - | - | - (only in XML encoded response message, use Accept: application/xml) | YES | - | YES | - |
 | POST | - | - | YES, see IsStreamDataBase64JSONNeeded param in request | YES | - | YES | - |
 | POST | - | - | - (only in XML encoded response message, use Accept: application/xml) | YES | | YES | - |
 | POST | YES | YES | - | - | YES | - | - |
 | POST | YES | YES | - | - | YES | - | - |
 | GET | YES | YES | - | - | YES | - | - |
There is a streamed endpoint available in Therefore™ Web API. It differs from the other endpoints in that its methods return the HTTP response with Transfer-Encoding: chunked.
Methods on the streamed (chunked) endpoint consume less memory on both the server and the client. Use it to retrieve large files. Its performance is similar to retrieving files as binary data.
Read more (with an example) in the Endpoints\Streamed JSON endpoint to download big files section of the Therefore™ Web API Documentation.
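As a rough illustration of why a chunked response keeps memory use low on the client, the sketch below writes a response body to disk through a fixed-size buffer instead of loading the whole file first. The names `ChunkedDownload`, `CopyInChunks`, and `DownloadToFileAsync` are hypothetical, and the request itself (URL, body, authentication) is left to the caller because those details are not given here.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class ChunkedDownload
{
    // Copies 'source' to 'destination' through a fixed-size buffer,
    // so only 'bufferSize' bytes are held in memory at a time.
    // Returns the total number of bytes copied.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize = 81920)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Sends a caller-supplied request and streams the body to 'targetPath'.
    // ResponseHeadersRead makes HttpClient return as soon as the headers arrive
    // instead of buffering the full body.
    public static async Task<long> DownloadToFileAsync(
        HttpClient client, HttpRequestMessage request, string targetPath)
    {
        using (HttpResponseMessage response =
            await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (Stream body = await response.Content.ReadAsStreamAsync())
            using (FileStream file = File.Create(targetPath))
            {
                return CopyInChunks(body, file);
            }
        }
    }
}
```

The copy loop is independent of the Web API, so it can be tried against any readable stream before it is wired to a real request.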
To retrieve a file as binary data, use the GetDocumentStreamRaw or GetConvertedDocStreamsRaw (HTTP POST) or GetDocumentStreamFile (HTTP GET) Web API methods instead of GetDocumentStream (HTTP POST, Base64 or byte array). This is the recommended approach because these three methods perform better than GetDocumentStream or GetDocument (when used for stream retrieval) and produce smaller messages. Their performance is similar to that of the streamed data transfer option.
Here is an example of calling the GetConvertedDocStreamsRaw method in Postman. The binary file content is returned in the body of the response.
Notice two headers in the response: Content-Type and Content-Disposition.
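A client can use the Content-Disposition header to choose a file name for the downloaded bytes. The following is a minimal sketch, assuming the header carries a `filename` (or `filename*`) parameter. The exact header value returned by the Web API is not shown here, so the value used in the usage line is illustrative only, and `ResponseFileName` is a hypothetical helper name.

```csharp
using System;
using System.IO;
using System.Net.Http.Headers;

public static class ResponseFileName
{
    // Extracts a safe file name from a Content-Disposition header value.
    // Returns 'fallback' when the header is missing, unparsable, or has no file name.
    public static string FromContentDisposition(string headerValue, string fallback)
    {
        if (string.IsNullOrWhiteSpace(headerValue))
            return fallback;

        ContentDispositionHeaderValue parsed;
        if (!ContentDispositionHeaderValue.TryParse(headerValue, out parsed))
            return fallback;

        // Prefer the extended filename* parameter when the server sends it.
        string name = parsed.FileNameStar ?? parsed.FileName;
        if (string.IsNullOrEmpty(name))
            return fallback;

        // FileName keeps surrounding quotes; strip them, then drop any directory part.
        name = Path.GetFileName(name.Trim('"'));
        return name.Length == 0 ? fallback : name;
    }
}
```

For example, `ResponseFileName.FromContentDisposition("attachment; filename=\"report.pdf\"", "download.bin")` yields `report.pdf`.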
To retrieve streams from a Therefore™ document, you can use the GetDocument Web API method (with the parameter IsStreamsInfoAndDataNeeded set to true), GetDocumentStream, or GetConvertedDocStreams. The GetThumbnail method returns a thumbnail of a document.
Data returned in the StreamData property is encoded as a byte array in JSON responses and as a Base64 string in XML responses.
Use the following HTTP header to get the response back as XML from the JSON endpoints (restun, restwin):
Accept: application/xml
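The two StreamData encodings need different handling on the client. The sketch below decodes each form into a `byte[]`, assuming the StreamData value has already been isolated from the response message. `StreamDataDecoder` is a hypothetical helper, and `System.Text.Json` is an assumption about the client environment (it ships with .NET Core 3.0 and later, and is a NuGet package on .NET Framework).

```csharp
using System;
using System.Linq;
using System.Text.Json;

public static class StreamDataDecoder
{
    // JSON responses: StreamData is an array of numbers, e.g. [37,80,68,70].
    // It is read as int[] first because System.Text.Json maps byte[] to a Base64 string.
    public static byte[] FromJsonByteArray(string jsonArray)
    {
        int[] values = JsonSerializer.Deserialize<int[]>(jsonArray);
        // checked() throws OverflowException if a value does not fit in a byte.
        return values.Select(v => checked((byte)v)).ToArray();
    }

    // XML responses: StreamData is a Base64 string, e.g. "JVBERg==".
    public static byte[] FromBase64(string base64)
    {
        return Convert.FromBase64String(base64.Trim());
    }
}
```

Both `FromJsonByteArray("[37,80,68,70]")` and `FromBase64("JVBERg==")` produce the same four bytes, the ASCII characters `%PDF`.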
C# Code Example. Using GetDocument Web API method
This step-by-step guide illustrates how to extract one or all of the streams from a document via the Web API.
1. Create a channel factory to the web service endpoint.
2. Create a channel to the web service endpoint.
3. Create a request parameters instance. Set the requested document number and set the IsStreamsInfoAndDataNeeded parameter to true.
4. Retrieve the document from the server.
5. Close the channel and channel factory.
6. Extract all file streams to the specified directory.
// Namespaces used by this example
using System;
using System.IO;
using System.ServiceModel;

// 1. Create a channel factory to the web service endpoint
...
ChannelFactory<IThereforeService> channelFactory = new ChannelFactory<IThereforeService>(binding, endpoint);

// 2. Create a channel to the web service endpoint
IThereforeService service = channelFactory.CreateChannel();

// 3. Create request parameters
GetDocumentParams parameters = new GetDocumentParams();
parameters.DocNo = 123; // TODO: specify document number here
parameters.IsStreamsInfoAndDataNeeded = true; // include the file contents in the response

// 4. Retrieve the document from the server
GetDocumentResponse response = service.GetDocument(parameters);

// 5. Close the channel and channel factory
((IClientChannel)service).Close();
channelFactory.Close();

// 6. Extract all file streams to the specified directory
string extractDir = "D:\\temp\\";
string output = string.Empty;
foreach (var streamInfo in response.StreamsInfo)
{
    // StreamData holds the raw bytes of one file (stream) in the document
    string extractFileName = Path.Combine(extractDir, streamInfo.FileName);
    File.WriteAllBytes(extractFileName, streamInfo.StreamData);
    output += String.Format("Document stream extracted to {0}{1}", extractFileName, Environment.NewLine);
}
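The loop above extracts every stream. To extract a single stream instead, the same data can be filtered by file name, as in the sketch below. `StreamEntry` is a hypothetical stand-in that mirrors only the two properties used above (`FileName` and `StreamData`), so that the example compiles without the generated service types. `StreamExtractor.ExtractOne` is likewise an illustrative name, not part of the Web API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in for the items in response.StreamsInfo.
public sealed class StreamEntry
{
    public string FileName { get; set; }
    public byte[] StreamData { get; set; }
}

public static class StreamExtractor
{
    // Writes the first stream whose file name matches 'fileName' (case-insensitive)
    // into 'extractDir'. Returns the full path written, or null if nothing matched.
    public static string ExtractOne(IEnumerable<StreamEntry> streams, string fileName, string extractDir)
    {
        StreamEntry match = streams.FirstOrDefault(
            s => string.Equals(s.FileName, fileName, StringComparison.OrdinalIgnoreCase));
        if (match == null)
            return null;

        Directory.CreateDirectory(extractDir);

        // Path.GetFileName drops any directory part so the file stays inside extractDir.
        string target = Path.Combine(extractDir, Path.GetFileName(match.FileName));
        File.WriteAllBytes(target, match.StreamData);
        return target;
    }
}
```

With the real service types, the same filter can be applied directly to `response.StreamsInfo` before calling `File.WriteAllBytes`.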