Description
I've got a suite of applications that produce very large binary buffers that I'd like to intermittently write to and read from S3. I'd like to avoid large reallocations and copies of the data and instead transfer the buffer directly to/from S3.
The PutObjectRequest wants an IOStream, and the GetObjectRequest lets you provide a factory that creates an IOStream. So the best workaround I've got so far is to create a StringStream and use pubsetbuf() to override the stream's underlying buffer (in the snippets below, I've got void* buffer and size_t bufferSize).
For the put:
std::shared_ptr<Aws::StringStream> stream =
    Aws::MakeShared<Aws::StringStream>(ALLOCATION_TAG);
// Point the stream's underlying streambuf at the existing buffer (no copy).
stream->rdbuf()->pubsetbuf(static_cast<char*>(buffer), bufferSize);
// Move the position to the end so the stream reports bufferSize bytes of content...
stream->rdbuf()->pubseekpos(bufferSize);
// ...then rewind the get position so the SDK reads from the start.
stream->seekg(0);
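For context, here's how I'm attaching that stream to the request (bucket/key are placeholders, s3Client is my Aws::S3::S3Client, and aws/s3/S3Client.h plus aws/s3/model/PutObjectRequest.h are included):

Aws::S3::Model::PutObjectRequest request;
request.SetBucket("my-bucket");       // placeholder
request.SetKey("my-object-key");      // placeholder
request.SetBody(stream);              // the stream wrapping my preallocated buffer
request.SetContentLength(static_cast<long long>(bufferSize));
auto outcome = s3Client.PutObject(request);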
And for the get:
request.SetResponseStreamFactory(
    [buffer, bufferSize]()
    {
        // Build the response stream on top of the existing buffer so the
        // downloaded bytes land there directly (no copy).
        std::unique_ptr<Aws::StringStream> stream(
            Aws::New<Aws::StringStream>(ALLOCATION_TAG));
        stream->rdbuf()->pubsetbuf(static_cast<char*>(buffer), bufferSize);
        // The SDK takes ownership of the raw pointer returned by the factory.
        return stream.release();
    });
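The rest of the get looks like this (same placeholders as above, request being the GetObjectRequest the factory was set on, with aws/s3/model/GetObjectRequest.h included):

request.SetBucket("my-bucket");       // placeholder
request.SetKey("my-object-key");      // placeholder
auto outcome = s3Client.GetObject(request);
if (outcome.IsSuccess())
{
    // As far as I can tell, the response body is written through the
    // factory-created StringStream straight into buffer, with no extra copy.
}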
This appears to work in the testing I've done so far, and as far as I understand streams it should accomplish what I want in terms of avoiding large allocations or copies. Just wondering whether you have any thoughts on an easier/better way to do this, or any plans to add overrides to the API that would make it a little easier.
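To be concrete about the kind of override I'm imagining: a stream buffer that wraps a caller-owned array would cover my use case. This is only a sketch, and the class name is made up:

#include <cstddef>
#include <streambuf>

// Sketch of a streambuf over a caller-owned buffer: a stream built on top of
// it never allocates or copies the payload.
class FixedBufferStreamBuf : public std::streambuf
{
public:
    FixedBufferStreamBuf(char* buf, std::size_t size)
    {
        setg(buf, buf, buf + size);   // get area spans the whole buffer
        setp(buf, buf + size);        // put area spans the whole buffer
    }
};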
Also, to confirm: a single PutObjectRequest/GetObjectRequest will automatically do multi-part uploads for large transfers, right? And if I don't override the rate limiter, will it use as much network bandwidth as it can get by default?
Thanks.
-Adam