Note: These are notes made from the MySQL for Developers course by Planetscale. For more verbosity regarding schema and other topics like indexing, queries etc. refer to their course here.
Schema design is the real performance hack nobody brags about
Most people meet databases through queries.
You write a SELECT, it’s slow.
So you add an index.
Still slow.
So you rewrite the query.
Still slow.
So you stare at the screen like it personally betrayed you.
What usually doesn’t get enough blame in this story is the schema. The boring part. The table definitions. The “I’ll fix it later” part.
But schema design is kind of like picking the foundation of a house. You don’t see it once the house is built, but if it’s bad, everything on top of it feels cursed.
I learned this the annoying way: by working on apps where queries looked fine, indexes were fine, hardware was fine… and things were still weirdly slow. The problem wasn’t how we were asking questions. It was how the data was shaped in the first place.
So yeah. Before queries. Before indexes. Before any “advanced” stuff. Schema.
First: what people actually mean by “schema”
Schema is just a fancy word for: how your tables are structured.
Column types. Sizes. Nullable or not. How much space each row takes. What kind of data you’re promising will live there.
Basically: the contract between your app and your database.
And that contract can either make your database’s life easy… or quietly miserable.
“Disks are cheap, who cares?”
True. Disks are cheap.
But memory isn’t infinite. Caches aren’t infinite. CPU cycles aren’t free. And databases don’t read “one value.” They read pages. Chunks. Blocks of data.
If your rows are fat, fewer of them fit in memory. If fewer fit in memory, more disk reads happen. If more disk reads happen, everything slows down in ways no amount of clever SQL can fully undo.
Compact rows = more data per page = fewer reads = happier database.
This is why “smallest reasonable data type” is not pedantry. It’s performance.
Not in a micro-optimization way. In a “your system still feels fine at 10x data” way.
The three rules I wish I followed earlier
Whenever I’m defining a column now, three questions run in my head:
1. What’s the smallest type that can actually hold this?
Don’t store page counts in BIGINT just because it exists. A book with 9 quintillion pages would be impressive, but also concerning.
2. What’s the simplest type that matches reality? Numbers as numbers. Dates as dates. Not everything needs to be a string just because strings feel safe.
3. Does this match the real world, or just my mood today? If something must exist, don’t make it nullable. If something can be unknown, don’t fake it with empty strings.
Schema is you telling the database the truth about your data. The more honest you are, the more it can help you.
Integers: stop throwing BIGINT at everything
MySQL gives you a whole family of integers: tiny ones, small ones, medium ones, normal ones, big ones.
They’re not there for decoration. They’re there because storage size matters.
If a value will never cross 65k, SMALLINT is already living its best life. Two bytes. Cheap. Cache-friendly. Index-friendly.
And if something can never be negative, say that. UNSIGNED literally doubles your positive range for free.
Also, quick myth kill:
INT(11) does not mean “11 digits.” It never meant that. It was about display width. It’s deprecated. It does nothing useful. Let it go.
Decimals: money does not like vibes
If you are storing money, measurements, or anything where “close enough” is not enough - use DECIMAL.
FLOAT and DOUBLE are amazing for scientific stuff. Simulations. Probabilities. Physics. Graphics.
They are terrible for “why is my total 20.4000000000006?”
That’s not a MySQL bug. That’s how floating point works. It’s doing math in binary, not in base 10, and some numbers are just… unrepresentable.
If you care about exactness, DECIMAL is boring, predictable, and worth it.
And honestly, boring and predictable is elite-tier database energy.
Strings: fixed, variable, and “why is this 4GB”
Most strings fall into two camps:
If it’s always the same size (hashes, codes, flags): CHAR.
If it varies (names, titles, emails): VARCHAR.
The trap people fall into is going straight to TEXT because it feels unlimited and future-proof.
TEXT is for actually large text. Articles. Documents. Descriptions that could go wild.
It’s stored differently. It behaves differently. It’s annoying to index. It’s annoying to sort. And if you select it casually, you pay for it.
If your data usually fits in a couple hundred characters, VARCHAR is your friend. Let TEXT earn its place.
Same idea with BLOB. Yes, you can store images in MySQL. You can also microwave your phone. The fact that something is possible is not a recommendation.
Store files somewhere built for files. Store paths in the database.
Binary strings: when characters are just getting in the way
Sometimes your “string” isn’t text. It’s bytes. Hashes. UUIDs. Tokens.
That’s where BINARY and VARBINARY quietly slap.
No collations. No encodings. No “is this accent-sensitive?” debates. Just bytes.
Storing a UUID in binary instead of a 36-char string saves space and speeds up indexes. Not glamorous. But very effective.
Enums: small, sharp, and not free
Enums are kind of cool. You write strings. MySQL stores tiny integers. You get validation for free.
That’s powerful.
They’re great for values that truly don’t change often. Statuses. Fixed categories. Known finite states.
The cost is flexibility. Adding a new value means altering the table. On big tables, that’s not nothing.
Also, ordering is by internal number, not alphabetical order. Which is fun the first time and confusing forever after.
Use them. Just… deliberately.
Dates and time: pick your poison consciously
If you only need a date, use DATE. Don’t drag time into it.
If you need date and time, you’re choosing between DATETIME and TIMESTAMP.
TIMESTAMP is smaller and timezone-aware, but has a limited range.
DATETIME is bigger, simpler, and doesn’t care about time zones.
Neither is “better.” They solve different problems. The mistake is not knowing which problem you have.
And yes, timezones will hurt you emotionally at some point. Schema design won’t save you from that. But it can avoid making it worse.
JSON: powerful, heavy, and not a substitute for thinking
MySQL’s JSON support is genuinely good. Validation. Path queries. Generated columns. Indexing pieces of it.
It shines when the structure is messy, external, or evolving fast.
Third-party payloads. Config blobs. Nested data that doesn’t want to be relational.
What it is not: an excuse to skip schema design.
If you know the shape of your data, model it. Your future queries will thank you. Your indexes will actually work. Your CPU will run cooler. Your on-call self will be less angry.
Use JSON when structure is unclear, not when structure is inconvenient.
Generated columns: letting the database do the boring math
Generated columns are one of those features that feel like cheating the first time you use them.
You define a column that’s derived from another column. MySQL keeps it up to date. You can even index it.
Extracting domains from emails. Pulling fields out of JSON. Precomputing values you filter on constantly.
It’s the database saying: “Please stop recomputing this in every query. I got you.”
Sometimes it stores the value. Sometimes it computes it on the fly. You choose. Both have trade-offs. Both beat copy-pasting logic everywhere.
The quiet conclusion
Good schema design doesn’t feel like optimization.
It feels like… describing your data clearly.
And that’s kind of the point.
You’re not trying to outsmart MySQL. You’re trying to give it fewer reasons to struggle.
Pick smaller types. Pick honest types. Pick types that match reality, not vibes.
Indexes and queries can do a lot. But schema decides what’s even possible.
And yeah, nobody tweets about choosing SMALLINT over INT.
But six months later, when your app is still fast and nothing weird is happening at scale… that’s the reward.
