What is Multipart Upload?
Learn more about Multipart upload.
Multipart Upload allows a single object to be uploaded as a collection of parts rather than in one piece. The parts can be uploaded in parallel. If one part fails, it can be reuploaded without affecting any of the other parts. Once all parts are uploaded, they are assembled into the object. In general, Multipart Upload is recommended for all objects over 100 MB.
This technology has a variety of benefits, including:
Improved throughput: Uploading parts in parallel improves the overall throughput of the object upload.
Quick recovery following network interruptions: If the connection is interrupted during an upload, the upload can be resumed without having to reupload the entire object.
Start and stop object uploads: A multipart upload can be stopped and restarted as needed, and there is no deadline by which the remaining parts must be uploaded.
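The first two benefits can be illustrated with a minimal sketch that uploads parts concurrently and retries only the parts that fail. The function name, the `upload_part_fn` callable, and the retry settings are hypothetical choices for this illustration and are not part of any S3 API; the callable stands in for whatever actually sends a part to the service.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_parts_in_parallel(parts, upload_part_fn, max_workers=4, max_attempts=3):
    """Upload every part concurrently, retrying only the parts that fail.

    parts: dict mapping part number -> bytes for that part.
    upload_part_fn: hypothetical callable (part_number, data) -> ETag string
        that raises OSError (for example ConnectionError) on a network failure.
    Returns a dict mapping part number -> ETag.
    """

    def upload_with_retry(part_number):
        last_error = None
        for _ in range(max_attempts):
            try:
                etag = upload_part_fn(part_number, parts[part_number])
                return part_number, etag
            except OSError as error:  # only this part is retried
                last_error = error
        raise RuntimeError(
            f"part {part_number} failed after {max_attempts} attempts"
        ) from last_error

    # Each part is handled by its own task, so one slow or failing part
    # does not force the others to be uploaded again.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(upload_with_retry, sorted(parts)))
```

A failure in one part only causes that part to be sent again; parts that already succeeded are left alone.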
Multipart Upload is recommended when:
Uploading large objects. With multipart upload, your object upload can make full use of the available bandwidth of your network connection.
Uploading over an unstable network. Because an interrupted multipart upload does not have to be restarted from the beginning, your upload is resilient to network errors. You only need to retry the parts that were interrupted.
The Multipart Upload Process:
The multipart upload process has three steps.
1. Initiation of the upload
When you initiate a multipart upload, the S3 service returns a response containing an Upload ID, which is a unique identifier for your multipart upload. This Upload ID must be included whenever you upload parts, list the parts, and complete or stop the upload. Any metadata for the object must be included in the initiation request.
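As a minimal sketch of the initiation step, the helper below assumes a boto3-style S3 client is passed in as `s3_client`. The helper name and its parameters are hypothetical; `create_multipart_upload` and its `Bucket`, `Key`, and `Metadata` arguments are the boto3 client call.

```python
def start_multipart_upload(s3_client, bucket, key, metadata=None):
    """Initiate a multipart upload and return its Upload ID.

    s3_client is assumed to be a boto3-style S3 client. Object metadata
    is supplied here because it cannot be added by later part uploads.
    """
    response = s3_client.create_multipart_upload(
        Bucket=bucket,
        Key=key,
        Metadata=metadata or {},
    )
    # The Upload ID must accompany every later request for this upload.
    return response["UploadId"]
```

With boto3 installed, a client created with `boto3.client("s3")` (configured for your service's endpoint and credentials) could be passed as `s3_client`.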
2. The upload of the parts of the object
When you upload a part of an object using multipart upload, you need to specify a part number between 1 and 10,000. This number uniquely identifies the part and its relative position in the object being uploaded. Part numbers do not need to be in a consecutive sequence. If you upload a new part with the same part number as a previously uploaded part, the previous part will be overwritten.
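One way to assign part numbers is to slice the object into fixed-size ranges before uploading. The sketch below is only an illustration: `plan_parts`, `MAX_PARTS`, and the choice of consecutive numbering are assumptions made for simplicity (consecutive numbers are not required), and any minimum part size your service enforces is not checked here.

```python
MAX_PARTS = 10_000  # part numbers must fall between 1 and 10,000


def plan_parts(object_size, part_size):
    """Return a list of (part_number, offset, length) tuples covering the object."""
    if object_size <= 0 or part_size <= 0:
        raise ValueError("object_size and part_size must be positive")

    part_count = -(-object_size // part_size)  # ceiling division
    if part_count > MAX_PARTS:
        raise ValueError(
            f"{part_count} parts exceed the limit of {MAX_PARTS}; use a larger part_size"
        )

    plan = []
    for index in range(part_count):
        offset = index * part_size
        # The final part holds whatever bytes remain and may be shorter.
        length = min(part_size, object_size - offset)
        plan.append((index + 1, offset, length))
    return plan
```

Each tuple tells the uploader which byte range of the source object to read for a given part number.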
Whenever a part is uploaded, the S3 service returns an ETag header in its response. This ETag value must be included in the request to complete the multipart upload.
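A minimal sketch of uploading one part and keeping its ETag for later, again assuming a boto3-style client passed in as `s3_client`. The helper name is hypothetical; `upload_part` and its `Bucket`, `Key`, `UploadId`, `PartNumber`, and `Body` arguments are the boto3 client call.

```python
def upload_part(s3_client, bucket, key, upload_id, part_number, data):
    """Upload one part and return the record needed to complete the upload."""
    if not 1 <= part_number <= 10_000:
        raise ValueError("part_number must be between 1 and 10,000")

    response = s3_client.upload_part(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        PartNumber=part_number,
        Body=data,
    )
    # Keep the part number together with its ETag; both are required
    # in the request that completes the multipart upload.
    return {"PartNumber": part_number, "ETag": response["ETag"]}
```

Collecting the returned records in a list as parts finish gives you everything the completion request needs.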
3. Confirmation that the upload has completed
Once all parts have been uploaded and you send the request to complete the upload, the parts are assembled to create the object. The object can then be accessed just like any other object stored in a bucket. After a successful upload, the individual parts no longer exist as separate items.
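The completion step can be sketched as follows, assuming a boto3-style client passed in as `s3_client` and a list of part records, each holding a `PartNumber` and the `ETag` returned when that part was uploaded. The helper name is hypothetical. Sorting the parts by number reflects how the AWS S3 API expects the list; whether your service is equally strict is an assumption.

```python
def finish_multipart_upload(s3_client, bucket, key, upload_id, parts):
    """Complete a multipart upload and return the ETag of the assembled object.

    parts: list of {"PartNumber": int, "ETag": str} records, in any order.
    """
    # List the parts in ascending part-number order, which also defines
    # the order in which they are assembled into the object.
    ordered = sorted(parts, key=lambda part: part["PartNumber"])
    response = s3_client.complete_multipart_upload(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        MultipartUpload={"Parts": ordered},
    )
    return response["ETag"]
```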
The S3 response confirming the upload includes an ETag that uniquely identifies the object and its data. This ETag is not necessarily the same as an MD5 hash of the object’s data.
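To see why the two values can differ, the sketch below compares a plain MD5 of the data with a multipart-style ETag. It assumes the convention used by AWS S3 for unencrypted multipart objects (the MD5 of the concatenated binary MD5 digests of the parts, followed by a hyphen and the part count). Whether your service follows the same convention is an assumption, and both function names are hypothetical.

```python
import hashlib


def md5_hex(data):
    """Plain MD5 of the whole object, as a hex string."""
    return hashlib.md5(data).hexdigest()


def multipart_style_etag(parts):
    """ETag in the assumed AWS-style multipart convention.

    parts: list of bytes objects, in part-number order.
    """
    # Hash each part, concatenate the raw digests, then hash the result.
    digests = b"".join(hashlib.md5(part).digest() for part in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"
```

Under this convention, an object uploaded in two parts gets an ETag ending in "-2", which will not match the MD5 of the same bytes uploaded in one piece. For that reason, treat the ETag as an opaque identifier rather than as a checksum.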