Amazon Redshift - Fundamentals

In late 2017, we set out to replace and upgrade our existing reporting and analytics infrastructure with something that would be a better fit for our workloads. Keeping costs and required maintenance to a minimum would be a nice plus, making for an easy sell. After a bit of research, it was obvious that Amazon Redshift had the potential to tick all the right boxes. While steadily porting the most problematic workloads away from our existing infrastructure, I started writing an investigative article on the fundamental concepts of Amazon Redshift. I learned a lot from studying each individual building block, which allowed me to make some small but impactful changes to our own setup along the way.

The outcome is a 10,000-word document (about 1 hour of reading time), covering 7 topics:

Storage
Distribution
Importing data
Table maintenance
Exporting data
Query processing
Workload management

The text is available in three formats:

HTML
EPUB
MOBI (Kindle)

The project is open source and available on GitHub.

Thanks to everyone who proofread earlier iterations and provided me with indispensable feedback.

I hope this work can teach you as much as it taught me. I’m looking forward to your feedback.