Applying data loading best practices for ML training with Amazon S3 clients

In this post, we present practical techniques and recommendations for optimizing throughput in ML training workloads that read data directly from Amazon S3 general purpose buckets. Although the focus is on Amazon S3, many of the data loading optimizations discussed here apply broadly to other storage fabrics.