PostgreSQL Internal Storage Workshop – Pages & Heap Explained

Published: 04 June 2026
Channel: Sepahram Data Eng. School

In this hands-on workshop from Sepahram Data Engineering School, we explored how PostgreSQL stores data physically on disk — diving deep into Pages, Heap, and the foundations of MVCC.

🧠 Using PostgreSQL 18 inside Docker, we examined the internal storage files under the base and global directories and understood how OIDs relate to database and table identifiers.
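To make the OID-to-file relationship concrete, here is a minimal Python sketch of how a relation's on-disk path is put together in the default tablespace. The function name `relation_file` and the OID values used with it are hypothetical, chosen for illustration only. On a real server, `SELECT pg_relation_filepath('your_table');` reports the actual path, and the file name is the relation's filenode, which can differ from the table OID after operations such as TRUNCATE or VACUUM FULL.

```python
from pathlib import PurePosixPath


def relation_file(db_oid: int, relfilenode: int, segment: int = 0,
                  shared: bool = False) -> PurePosixPath:
    """Build the path of a relation file relative to the data directory.

    Assumes the default tablespace. Shared catalogs live under global/,
    everything else under base/<database OID>/.
    """
    directory = PurePosixPath("global") if shared else PurePosixPath("base") / str(db_oid)
    # Segment 0 has no suffix; later 1 GB segments are named <filenode>.1, .2, ...
    name = str(relfilenode) if segment == 0 else f"{relfilenode}.{segment}"
    return directory / name


# Hypothetical OIDs, for illustration only.
print(relation_file(16384, 16385))             # first segment of a table
print(relation_file(16384, 16385, segment=1))  # second 1 GB segment
```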

A Page (or Block) in PostgreSQL is an 8 KB unit of storage by default, and it is the smallest chunk that PostgreSQL reads from or writes to disk.
We explored how each Page contains:
✨ a header with metadata,
✨ line pointers (item identifiers),
✨ free space, and
✨ actual row data (tuples).
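The layout listed above can be sketched in Python by decoding the fixed 24-byte page header. This is a rough illustration, not the workshop's own code: it assumes a little-endian machine (PostgreSQL writes pages in the server's native byte order), and `parse_page_header` and `make_demo_page` are hypothetical helper names. On a live database, the `pageinspect` extension's `page_header(get_raw_page('your_table', 0))` gives the same fields without manual parsing.

```python
import struct

# pd_lsn (2 x uint32), pd_checksum, pd_flags, pd_lower, pd_upper,
# pd_special, pd_pagesize_version (6 x uint16), pd_prune_xid (uint32).
PAGE_HEADER_FORMAT = "<IIHHHHHHI"   # "<" = little-endian, an assumption
HEADER_SIZE = struct.calcsize(PAGE_HEADER_FORMAT)  # 24 bytes
LINE_POINTER_SIZE = 4               # each item identifier is 4 bytes


def parse_page_header(page: bytes) -> dict:
    (lsn_hi, lsn_lo, checksum, flags, lower, upper,
     special, size_version, prune_xid) = struct.unpack_from(PAGE_HEADER_FORMAT, page)
    return {
        "lsn": (lsn_hi << 32) | lsn_lo,
        "checksum": checksum,
        "flags": flags,
        "lower": lower,      # end of the line pointer array
        "upper": upper,      # start of tuple data (tuples grow downward)
        "special": special,  # start of special space (end of page for heap)
        "page_size": size_version & 0xFF00,
        "layout_version": size_version & 0x00FF,
        "prune_xid": prune_xid,
        "line_pointers": (lower - HEADER_SIZE) // LINE_POINTER_SIZE,
        "free_bytes": upper - lower,  # the gap between pointers and tuples
    }


def make_demo_page(n_items: int = 3, upper: int = 8000) -> bytes:
    """Build a synthetic 8 KB page with only the header filled in."""
    header = struct.pack(PAGE_HEADER_FORMAT, 0, 0, 0, 0,
                         HEADER_SIZE + LINE_POINTER_SIZE * n_items,
                         upper, 8192, 8192 | 4, 0)
    return header.ljust(8192, b"\x00")


info = parse_page_header(make_demo_page())
print(info["line_pointers"], info["free_bytes"])
```

To try it on a real table file, read the first 8192 bytes of the file and pass them to `parse_page_header`.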

When inserting or updating rows, PostgreSQL uses the Free Space Map to look for an existing Page with enough free space; if none is found, it allocates a new Page.
We saw how updated rows are not overwritten but stored as new versions, forming the basis for MVCC and concurrency control.
Later, we created a table with fillfactor=70, which leaves roughly 30% of each Page unused during inserts, and observed how that free space allowed updated rows to stay in the same Page.
This behavior demonstrates the Heap storage model — unordered data storage where each row version is independently maintained.
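The insert and update behavior described above can be imitated with a toy model in Python. It is a deliberately simplified sketch, not PostgreSQL's real algorithm: tuple headers, alignment, vacuum, and the actual Free Space Map structure are ignored, and `ToyHeap`, `Page`, and the row sizes are hypothetical. The only part taken from PostgreSQL is the reserve arithmetic, where the space kept free on insert is the block size times (100 - fillfactor) / 100 with integer division.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 8192   # default page size in bytes
PAGE_HEADER = 24    # fixed page header
LINE_POINTER = 4    # one item identifier per tuple


@dataclass
class Page:
    tuples: list = field(default_factory=list)  # [row_id, version, size, live]

    def free_bytes(self) -> int:
        # Dead versions still occupy space until they are cleaned up.
        used = sum(t[2] + LINE_POINTER for t in self.tuples)
        return BLOCK_SIZE - PAGE_HEADER - used


class ToyHeap:
    def __init__(self, fillfactor: int = 100):
        # Space kept free per page on INSERT (integer division).
        self.reserved = BLOCK_SIZE * (100 - fillfactor) // 100
        self.pages = [Page()]
        self.where = {}  # row_id -> (page_no, tuple) of the live version

    def _store(self, page_no, row_id, version, size):
        tup = [row_id, version, size, True]
        self.pages[page_no].tuples.append(tup)
        self.where[row_id] = (page_no, tup)
        return page_no

    def _find_page(self, size, keep_free):
        need = size + LINE_POINTER + keep_free
        for page_no, page in enumerate(self.pages):
            if page.free_bytes() >= need:
                return page_no
        self.pages.append(Page())  # no room anywhere: extend the table
        return len(self.pages) - 1

    def insert(self, row_id, size):
        # Inserts respect fillfactor by leaving `reserved` bytes untouched.
        return self._store(self._find_page(size, self.reserved), row_id, 1, size)

    def update(self, row_id, size):
        page_no, old = self.where[row_id]
        old[3] = False  # the old version stays on its page, now marked dead
        # Updates may use the reserved space, so try the same page first.
        if self.pages[page_no].free_bytes() < size + LINE_POINTER:
            page_no = self._find_page(size, 0)
        return self._store(page_no, row_id, old[1] + 1, size)


for ff in (100, 70):
    heap = ToyHeap(fillfactor=ff)
    for i in range(78):
        heap.insert(i, 100)
    print(ff, heap.reserved, heap.update(0, 100))
```

In this model, with fillfactor=100 the first page is packed full, so the new version of row 0 lands on a different page; with fillfactor=70 the 2457 reserved bytes let the new version stay on page 0 next to the old one.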

🧰 Workshop highlights:

🔰 Run PostgreSQL 18 with Docker
🔰 Explore data folders (base, global, and OIDs)
🔰 Insert and update rows to observe Page behavior
🔰 Learn about Heap storage and the impact of fillfactor

💾 Workshop files available on GitHub:
👉 https://github.com/sepahram-school/wo...

🗣️ Persian summary (translated):
In this hands-on workshop from Sepahram Data Engineering School, we learned how PostgreSQL physically stores data. We took a close look at the Page and Heap concepts and saw how Postgres manages data at the file and page level.

#PostgreSQL #DataEngineering #SepahramSchool #DatabaseInternals #MVCC #Heap #Fillfactor #Pages #Docker #PostgresInternals