In this hands-on workshop from Sepahram Data Engineering School, we explored how PostgreSQL stores data physically on disk — diving deep into Pages, Heap, and the foundations of MVCC.
🧠 Using PostgreSQL 18 inside Docker, we examined the internal storage files under the base and global directories and understood how OIDs relate to database and table identifiers.
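As a rough illustration of how those identifiers map to files, the sketch below builds the relative path of a table's data file and locates a given block inside it. It assumes a default build (8 KB blocks, 1 GB segment files) and a table in the default tablespace. The function name `locate_block` and the example numbers are hypothetical. The file is named after the table's relfilenode, which usually starts out equal to the table's OID but can change after operations such as TRUNCATE or VACUUM FULL.

```python
BLOCK_SIZE = 8192            # default page size in bytes (assumed)
BLOCKS_PER_SEGMENT = 131072  # 1 GB segment / 8 KB blocks (default build, assumed)

def locate_block(db_oid, relfilenode, block_no):
    """Return (relative file path, byte offset) for a heap block."""
    segment = block_no // BLOCKS_PER_SEGMENT
    path = f"base/{db_oid}/{relfilenode}"
    if segment > 0:
        # Segments after the first get a numeric suffix: .1, .2, ...
        path += f".{segment}"
    offset = (block_no % BLOCKS_PER_SEGMENT) * BLOCK_SIZE
    return path, offset

print(locate_block(5, 16384, 0))       # ('base/5/16384', 0)
print(locate_block(5, 16384, 131073))  # ('base/5/16384.1', 8192)
```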
A Page (or Block) in PostgreSQL is a fixed-size unit of storage, 8 KB by default, and the smallest chunk that PostgreSQL reads from or writes to disk.
We explored how each Page contains:
✨ a header with metadata,
✨ line pointers (item identifiers),
✨ free space, and
✨ actual row data (tuples).
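The four parts listed above can be read straight from the 24-byte page header. The sketch below is a minimal illustration, not the workshop's own code: it builds a synthetic page (not a real data file) and derives the line pointer count and free space from the `pd_lower` and `pd_upper` fields. It assumes a little-endian machine and the default 8 KB page size. `summarize_page` is a hypothetical helper name.

```python
import struct

PAGE_SIZE = 8192
# pd_lsn (2 x uint32), pd_checksum, pd_flags, pd_lower, pd_upper,
# pd_special, pd_pagesize_version (uint16 each), pd_prune_xid (uint32)
HEADER_FMT = "<IIHHHHHHI"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 24 bytes
LINE_POINTER_SIZE = 4

def summarize_page(page: bytes) -> dict:
    """Derive the page layout from its header fields."""
    (_, _, _, _, pd_lower, pd_upper,
     pd_special, size_ver, _) = struct.unpack_from(HEADER_FMT, page)
    return {
        "line_pointers": (pd_lower - HEADER_SIZE) // LINE_POINTER_SIZE,
        "free_bytes": pd_upper - pd_lower,      # gap between pointers and tuples
        "tuple_bytes": pd_special - pd_upper,   # tuples grow down from the end
        "page_size": size_ver & 0xFF00,
    }

# Synthetic page: 3 line pointers and 3 tuples of 64 bytes each.
pd_lower = HEADER_SIZE + 3 * LINE_POINTER_SIZE
pd_upper = PAGE_SIZE - 3 * 64
header = struct.pack(HEADER_FMT, 0, 0, 0, 0, pd_lower, pd_upper,
                     PAGE_SIZE, PAGE_SIZE | 4, 0)
page = header + bytes(PAGE_SIZE - HEADER_SIZE)

print(summarize_page(page))
# {'line_pointers': 3, 'free_bytes': 7964, 'tuple_bytes': 192, 'page_size': 8192}
```

Line pointers grow forward from the header while tuples grow backward from the end of the page, so the free space is simply the gap between the two.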
When inserting or updating rows, PostgreSQL looks for free space in existing Pages (via the Free Space Map); if no existing Page has enough room, it allocates a new Page.
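The placement rule can be sketched with a toy model. This is a deliberate simplification: the real Free Space Map is a separate file that tracks free space approximately, whereas the hypothetical `ToyHeap` class below just keeps an exact free-byte count per page and scans it in order.

```python
PAGE_SIZE = 8192
PAGE_HEADER = 24
LINE_POINTER = 4

class ToyHeap:
    """Toy stand-in for a heap file plus its free space bookkeeping."""

    def __init__(self):
        self.free = []  # free bytes per page; list index = block number

    def insert(self, tuple_size):
        """Place a tuple and return the block number it landed in."""
        needed = tuple_size + LINE_POINTER
        for block_no, free in enumerate(self.free):
            if free >= needed:
                self.free[block_no] -= needed
                return block_no
        # No existing page has room: extend the heap with a new page.
        self.free.append(PAGE_SIZE - PAGE_HEADER - needed)
        return len(self.free) - 1

heap = ToyHeap()
print([heap.insert(2000) for _ in range(5)])  # [0, 0, 0, 0, 1]
print(heap.insert(100))                       # 0 (fits in the gap left in page 0)
```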
We saw how updated rows are not overwritten in place but stored as new row versions, which is the basis for MVCC and concurrency control.
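A minimal sketch of that versioning idea follows. It borrows the names of PostgreSQL's `xmin` and `xmax` system columns, but the visibility rule is a toy: real visibility checks also consider whether each transaction committed and which transactions were in progress when the snapshot was taken. The `TupleVersion`, `update`, and `visible` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TupleVersion:
    value: str
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that replaced or deleted it

def update(versions, new_value, xid):
    """Append a new version instead of overwriting the current one."""
    versions[-1].xmax = xid
    versions.append(TupleVersion(new_value, xmin=xid))

def visible(versions, snapshot_xid):
    """Toy rule: a version is visible if it was created at or before
    snapshot_xid and had not been replaced as of snapshot_xid."""
    for v in versions:
        if v.xmin <= snapshot_xid and (v.xmax is None or v.xmax > snapshot_xid):
            return v.value
    return None

row = [TupleVersion("v1", xmin=100)]
update(row, "v2", xid=105)

print(len(row))           # 2 (both versions are still stored)
print(visible(row, 102))  # v1
print(visible(row, 105))  # v2
```

Because the old version stays on the page, a reader that started earlier keeps seeing "v1" while newer readers see "v2".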
Later, we created a table with fillfactor=70 and observed how the free space reserved in each Page allowed the new versions of updated rows to stay in the same Page.
This behavior demonstrates the Heap storage model — unordered data storage where each row version is independently maintained.
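The effect of fillfactor can be approximated with some back-of-the-envelope arithmetic. The sketch below is an estimate only: it assumes 8 KB pages, a 24-byte page header, 4-byte line pointers, and a hypothetical on-disk tuple size of 104 bytes (tuple header included). It ignores alignment and other details of the real insert path.

```python
PAGE_SIZE = 8192
PAGE_HEADER = 24
LINE_POINTER = 4

def rows_per_page(tuple_size, fillfactor=100):
    """Estimate how many rows inserts place in one page."""
    reserved = PAGE_SIZE * (100 - fillfactor) // 100  # space kept back for updates
    usable = PAGE_SIZE - PAGE_HEADER - reserved
    return usable // (tuple_size + LINE_POINTER)

def spare_versions(tuple_size, fillfactor):
    """Estimate how many extra row versions fit in the space inserts left behind."""
    used = rows_per_page(tuple_size, fillfactor) * (tuple_size + LINE_POINTER)
    left = PAGE_SIZE - PAGE_HEADER - used
    return left // (tuple_size + LINE_POINTER)

print(rows_per_page(104, 100), spare_versions(104, 100))  # 75 0
print(rows_per_page(104, 70), spare_versions(104, 70))    # 52 23
```

Under these assumptions a full page has no room for new row versions, so an update must go to another page, while a page filled to 70% can absorb about 23 new versions before that happens.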
🧰 Workshop highlights:
🔰 Run PostgreSQL 18 with Docker
🔰 Explore data folders (base, global, and OIDs)
🔰 Insert and update rows to observe Page behavior
🔰 Learn about Heap storage and the impact of fillfactor
💾 Workshop files available on GitHub:
👉 https://github.com/sepahram-school/wo...
🗣️ Persian summary (translated):
In this hands-on workshop from Sepahram Data Engineering School, we learned how PostgreSQL physically stores data. We took a close look at the Page and Heap concepts and saw how Postgres manages data at the file and page level.
#PostgreSQL #DataEngineering #SepahramSchool #DatabaseInternals #MVCC #Heap #Fillfactor #Pages #Docker #PostgresInternals