✏️

Chapter 3: Data Manipulation

INSERT, UPDATE, DELETE & batch operations

1 INSERT — Adding Data

Single Row Insert

The most basic form of INSERT adds one row at a time. Specify columns explicitly for clarity and maintainability.

-- Explicit column list (recommended)
INSERT INTO users (name, email, age)
VALUES ('Frank Miller', 'frank@example.com', 29);

-- Omitting column list (not recommended - fragile to schema changes)
INSERT INTO users
VALUES (NULL, 'Grace Park', 'grace@example.com', 27, NOW());

-- Omitting optional columns (they'll use defaults)
INSERT INTO users (name, email)
VALUES ('Henry Wilson', 'henry@example.com');

Multi-Row Insert

Insert multiple rows in a single statement for better performance. This reduces round trips to the server and is significantly faster than individual inserts.

-- Insert multiple rows at once
INSERT INTO users (name, email, age) VALUES
('Ivy Chen', 'ivy@example.com', 24),
('Jack Davis', 'jack@example.com', 33),
('Kate Adams', 'kate@example.com', 26),
('Leo Garcia', 'leo@example.com', 41),
('Mia Taylor', 'mia@example.com', 30);

-- Check how many rows were affected
-- Query OK, 5 rows affected
Performance tip: Multi-row INSERT is typically 5-10x faster than individual inserts. MySQL's max_allowed_packet limits the total size of a single statement (default 64MB). For very large imports, use LOAD DATA INFILE.

INSERT ... SELECT

Copy data from one table to another using a SELECT query as the data source.

-- Create an archive table with the same structure
CREATE TABLE users_archive LIKE users;

-- Copy all users older than 35 to archive
INSERT INTO users_archive (name, email, age, created_at)
SELECT name, email, age, created_at
FROM users
WHERE age > 35;

-- Copy with transformation
INSERT INTO users_archive (name, email, age)
SELECT
    CONCAT('[ARCHIVED] ', name),
    email,
    age
FROM users
WHERE created_at < '2024-01-01';

ON DUPLICATE KEY UPDATE

Handle conflicts when a UNIQUE or PRIMARY KEY already exists. This is MySQL's upsert mechanism — insert if new, update if existing.

-- Insert or update if email already exists
INSERT INTO users (name, email, age)
VALUES ('Alice Updated', 'alice@example.com', 29)
ON DUPLICATE KEY UPDATE
    name = VALUES(name),
    age = VALUES(age);

-- Using the new MySQL 8.0.19+ alias syntax
INSERT INTO users (name, email, age)
VALUES ('Bob Updated', 'bob@example.com', 36)
AS new_data
ON DUPLICATE KEY UPDATE
    name = new_data.name,
    age = new_data.age;

-- Increment a counter on conflict
INSERT INTO page_views (url, view_count)
VALUES ('/home', 1)
ON DUPLICATE KEY UPDATE
    view_count = view_count + 1;
💡 PostgreSQL comparison: PostgreSQL uses INSERT ... ON CONFLICT (column) DO UPDATE SET ... instead. The PostgreSQL syntax is more flexible, allowing you to specify which conflict target to handle: ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name.

2 UPDATE — Modifying Data

Basic UPDATE

Modify existing rows that match a WHERE condition. Always include a WHERE clause to avoid updating all rows accidentally.

-- Update a single column
UPDATE users SET age = 29 WHERE id = 1;

-- Update with a condition
UPDATE users SET age = 23 WHERE name = 'Charlie Lee';

-- Verify the change
SELECT * FROM users WHERE name = 'Charlie Lee';

Multi-Column UPDATE

Update multiple columns in a single statement by separating assignments with commas.

-- Update multiple columns at once
UPDATE users
SET
    name = 'Alice J. Johnson',
    email = 'alice.j@example.com',
    age = 29
WHERE id = 1;

-- Update all rows matching a condition
UPDATE users
SET age = age + 1
WHERE age IS NOT NULL;

UPDATE with Expressions

Use expressions, functions, and computed values in SET clauses.

-- Increment age by 1 for everyone
UPDATE users SET age = age + 1 WHERE age IS NOT NULL;

-- Use string functions
UPDATE users SET email = LOWER(email);

-- Use conditional logic
UPDATE users
SET age = CASE
    WHEN age IS NULL THEN 25
    WHEN age < 20 THEN 20
    ELSE age
END;

-- Use GREATEST / LEAST to clamp values
UPDATE users SET age = GREATEST(age, 18) WHERE age IS NOT NULL;

UPDATE with JOIN

Update rows in one table based on values from another table. This is a powerful pattern for syncing data between tables.

-- Suppose we have an orders table
CREATE TABLE orders (
    id          INT AUTO_INCREMENT PRIMARY KEY,
    user_id     INT NOT NULL,
    amount      DECIMAL(10, 2) NOT NULL,
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO orders (user_id, amount) VALUES
(1, 99.99), (1, 149.50), (2, 75.00), (3, 200.00);

-- Add a total_spent column to users
ALTER TABLE users ADD COLUMN total_spent DECIMAL(10, 2) DEFAULT 0;

-- Update users.total_spent based on orders
UPDATE users u
JOIN (
    SELECT user_id, SUM(amount) AS total
    FROM orders
    GROUP BY user_id
) o ON u.id = o.user_id
SET u.total_spent = o.total;

-- Verify
SELECT name, total_spent FROM users WHERE total_spent > 0;
-- Alice Johnson | 249.49
-- Bob Smith     |  75.00
-- Charlie Lee   | 200.00
⚠️ Warning: An UPDATE without a WHERE clause will modify every row in the table. Always double-check your WHERE condition. Use SELECT first with the same WHERE to preview which rows will be affected.

3 DELETE — Removing Data

Basic DELETE

Remove rows that match a condition. Like UPDATE, always use a WHERE clause to avoid deleting all data.

-- Delete a specific row
DELETE FROM users WHERE id = 5;

-- Delete rows matching a condition
DELETE FROM users WHERE age IS NULL;

-- Delete with LIMIT (useful for batch deletion)
DELETE FROM users WHERE age < 20 LIMIT 100;

-- Check how many rows were deleted
-- Query OK, 1 row affected

DELETE with Subquery

Use subqueries to delete rows based on conditions from other tables.

-- Delete users who have never placed an order
DELETE FROM users
WHERE id NOT IN (
    SELECT DISTINCT user_id FROM orders
);

-- Delete using EXISTS (often faster than NOT IN)
DELETE FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.user_id = u.id
);

-- Delete with JOIN (MySQL-specific syntax)
DELETE u FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.id IS NULL;

TRUNCATE vs DELETE

Both remove data, but they work very differently under the hood.

Feature DELETE TRUNCATE
WHERE clause Supported Not supported
Speed Slower (row by row) Faster (drops and recreates table)
Transaction safe Yes (can ROLLBACK) No (implicit COMMIT)
AUTO_INCREMENT Not reset Reset to 1
Triggers Fires DELETE triggers Does not fire triggers
Logging Logs each row deletion Minimal logging
Foreign keys Checks constraints Fails if referenced
-- Delete all rows (slow, can rollback)
DELETE FROM users_archive;

-- Truncate all rows (fast, cannot rollback, resets AUTO_INCREMENT)
TRUNCATE TABLE users_archive;
💡 PostgreSQL comparison: In PostgreSQL, TRUNCATE can be included in a transaction and rolled back. PostgreSQL also supports TRUNCATE ... CASCADE to automatically truncate dependent tables, and TRUNCATE ... RESTART IDENTITY to reset sequences.

4 REPLACE & UPSERT Patterns

REPLACE INTO

REPLACE is a MySQL-specific statement that deletes the existing row (if a duplicate key is found) and inserts the new one. Be careful: it deletes first, which resets AUTO_INCREMENT values and fires DELETE triggers.

-- If email 'alice@example.com' exists, delete and re-insert
REPLACE INTO users (name, email, age)
VALUES ('Alice New', 'alice@example.com', 30);

-- REPLACE with multiple rows
REPLACE INTO users (name, email, age) VALUES
('Bob Replaced', 'bob@example.com', 36),
('Charlie Replaced', 'charlie@example.com', 23);
⚠️ REPLACE gotcha: Since REPLACE does DELETE + INSERT, the row gets a new AUTO_INCREMENT id. If other tables reference the old id via foreign keys, this will break referential integrity. Prefer ON DUPLICATE KEY UPDATE in most cases.

Comparing UPSERT Strategies

Method Behavior Preserves ID Use Case
INSERT ... ON DUPLICATE KEY UPDATE Insert or update in place Yes Most upsert scenarios
REPLACE INTO Delete + insert No Full row replacement
INSERT IGNORE Insert or silently skip N/A Idempotent inserts

INSERT IGNORE

Silently skip rows that would cause a duplicate key error. Useful for idempotent batch imports.

-- Insert and silently ignore duplicates
INSERT IGNORE INTO users (name, email, age) VALUES
('Alice Johnson', 'alice@example.com', 28),
('New User', 'newuser@example.com', 25);

-- Only 'New User' is inserted; Alice is skipped without error
-- Query OK, 1 row affected, 1 warning

Practical Upsert Example

A real-world scenario: maintaining a user settings table where each user has one row per setting key.

CREATE TABLE user_settings (
    user_id     INT NOT NULL,
    setting_key VARCHAR(50) NOT NULL,
    setting_val VARCHAR(255),
    updated_at  DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, setting_key)
);

-- Upsert a setting: insert if new, update if exists
INSERT INTO user_settings (user_id, setting_key, setting_val)
VALUES (1, 'theme', 'dark')
ON DUPLICATE KEY UPDATE
    setting_val = VALUES(setting_val);

-- Batch upsert multiple settings
INSERT INTO user_settings (user_id, setting_key, setting_val) VALUES
(1, 'theme', 'light'),
(1, 'language', 'en'),
(1, 'notifications', 'true')
ON DUPLICATE KEY UPDATE
    setting_val = VALUES(setting_val);
💡 PostgreSQL comparison: PostgreSQL does not have REPLACE INTO or INSERT IGNORE. Instead, it uses INSERT ... ON CONFLICT DO UPDATE (upsert) and INSERT ... ON CONFLICT DO NOTHING (skip duplicates), which are more explicit and less error-prone.

5 Safe Operation Tips

Data manipulation is powerful but dangerous. Follow these practices to protect against accidental data loss.

1. Preview Before Modifying

Always run a SELECT with the same WHERE clause before running UPDATE or DELETE. Verify the row count and affected data match your expectations.

-- Step 1: Preview
SELECT * FROM users WHERE age < 20;
-- Verify these are the rows you want to delete

-- Step 2: Execute
DELETE FROM users WHERE age < 20;
2. Enable Safe Updates Mode

MySQL's safe update mode prevents UPDATE and DELETE without a WHERE clause or without using a key column in the condition.

-- Enable safe updates for this session
SET SQL_SAFE_UPDATES = 1;

-- This will now be rejected:
-- UPDATE users SET age = 0;
-- ERROR 1175: safe update mode, no WHERE with KEY column

-- This works because it uses a key column:
UPDATE users SET age = 0 WHERE id = 1;
3. Use Transactions for Multi-Step Operations

Wrap related operations in a transaction so you can roll back if something goes wrong.

-- Start a transaction
START TRANSACTION;

-- Perform operations
UPDATE users SET age = age + 1 WHERE id = 1;
DELETE FROM orders WHERE user_id = 1 AND amount < 10;

-- Review the changes
SELECT * FROM users WHERE id = 1;
SELECT * FROM orders WHERE user_id = 1;

-- If everything looks good:
COMMIT;

-- If something went wrong:
-- ROLLBACK;
4. Back Up Before Bulk Operations

For large-scale data changes, create a backup table first. This provides a quick recovery path without needing full database restores.

-- Create a backup before bulk operations
CREATE TABLE users_backup AS SELECT * FROM users;

-- Perform your bulk operation
UPDATE users SET email = LOWER(email);

-- If something went wrong, restore from backup
-- INSERT INTO users SELECT * FROM users_backup
--   ON DUPLICATE KEY UPDATE
--     name = VALUES(name), email = VALUES(email),
--     age = VALUES(age), created_at = VALUES(created_at);
5. Use LIMIT for Batch Processing

For large-scale deletions or updates, process in batches to avoid locking the table for extended periods.

-- Delete old records in batches of 1000
DELETE FROM logs WHERE created_at < '2024-01-01' LIMIT 1000;
-- Repeat until 0 rows affected

-- In a shell script or application:
-- WHILE affected_rows > 0:
--   DELETE FROM logs WHERE created_at < '2024-01-01' LIMIT 1000;
--   SLEEP 0.1;  -- small pause to reduce lock contention
⚠️ Golden Rule: Never run UPDATE or DELETE on a production database without first running the equivalent SELECT. In a team environment, always have a peer review bulk data changes, and prefer running them during low-traffic hours.

6 Chapter Summary

INSERT Patterns

Single row, multi-row, INSERT SELECT, and ON DUPLICATE KEY UPDATE. Multi-row inserts are significantly faster than individual statements.

UPDATE Techniques

Basic updates, multi-column changes, expression-based updates, and JOIN updates for cross-table modifications. Always use WHERE.

DELETE Operations

Simple deletes, subquery-based deletes, and JOIN deletes. TRUNCATE is faster but cannot be rolled back and doesn't fire triggers.

UPSERT Strategies

ON DUPLICATE KEY UPDATE (preserves ID), REPLACE (delete+insert), INSERT IGNORE (skip duplicates). Prefer ON DUPLICATE KEY UPDATE.

Safe Operations

Preview with SELECT, enable SQL_SAFE_UPDATES, use transactions, backup before bulk changes, and process large deletes in batches.

Cross-DB Awareness

MySQL's REPLACE and INSERT IGNORE are non-standard. PostgreSQL uses ON CONFLICT. SQL Server uses MERGE. Always check dialect differences.

Previous: Ch2 Queries 📚 Index Next: Ch4 Table Design