From Thought to Query

The Process of Writing SQL-Queries

Beginner SQL JSON
Tom Gerzabek Tom Gerzabek on Jan 10, 2025

Have you ever wondered how to get from a rough question in data analysis to a working query?

In this blog post, I would like to give you an insight into my workflow and show you how I developed an SQL query step by step - from the idea to the final implementation.

The Problem: Inconsistent Data

I started of with inconsistent data in a column of a database table. Initially, the data was saved as JSON via a migration. Later, the format changed, which led to inconsistencies when processing and analyzing the comments.

The task was clear: the inconsistencies had to be resolved and the database had to be cleaned up so that the comments could be analyzed without any problems.

Step 1: Overview of the Data

Before you start with the actual work, it is important to understand the structure of the data. A simple SELECT query helps us to get an overview.

SELECT comment FROM postings;

This query displays all values from the comment column. However, I was only interested in the part of the data that was actually formatted as JSON. Fortunately, there is a function in MySQL for this, JSON_VALID().

The function expects an argument, typically a column name, and returns a Boolean.

The statement can simply be extended by a WHERE clause.

SELECT id FROM postings WHERE JSON_VALID(comment) IS TRUE;

The JSON_VALID operator checks whether the content of the comment column is a valid JSON object and only returns the corresponding rows.

Step 2: Working with JSON Data

The JSON_EXTRACT function is used to extract a specific text from a JSON object. This function expects two arguments.

  1. the column name (or JSON value) that contains the data.
  2. the JSON path that leads to the desired value.

For example,

SELECT JSON_EXTRACT(comment, '$.blocks[0].data.text') FROM postings;

In this case, I extract the text from the JSON structure that is stored in the first block (blocks[0]) stored in data.text.

Important to note: The dollar sign indicates the root of the JSON object.

Step 3: The Final Query

To clean up the database, I had to combine the two queries. The goal was to save the extracted text from the JSON structure in the comment column.

This caused another problem, as the JSON_EXTRACT function identifies the extracted text as a string and delimits it with quotation marks. Of course, you don't expect quotation marks in a plain text column.

That's why I needed a function that can replace certain character strings within a string. I found this in the MySQL manual under REPLACE.

This function expects the following arguments.

  1. The string which needs to be searched in, typically called haystack in computer lingo.
  2. The pattern string to be replaced, commonly named needle.
  3. The replacement characters.

We have now collected all the pieces of the puzzle and are able to formulate the final update statement.

UPDATE postings p SET p.comment = REPLACE(JSON_EXTRACT(p.comment, '$.blocks[0].data.text'), '"', '') WHERE JSON_VALID(p.comment) IS TRUE;

This UPDATE query replaces the values in the comment column with the extracted text if it is in a valid JSON format.

Since this query is relatively complex and has to iterate over more than 8,000 rows in the table, I was interested in how long the query would take. The result surprised me. MySQL completed this task in an incredible 23 ms.

Problems or tasks that seem overwhelming at first can often be overcome with the right approach.

In this case, it helped to break the problem down into smaller steps. This made everything clearer and the path to the solution was ultimately quicker and easier than I had initially assumed.

Tom Gerzabek

About the author

In his final year of apprenticeship, Tom Gerzabek combines hands-on experience with a strong willingness to learn.

He values reliability, teamwork, and clear communication, and enjoys taking on new challenges.

Currently focused on finishing his training and building a solid foundation for the future.

→ Connect on LinkedIn

Stay Close to How We Think About Systems

I write occasionally about system architecture, developer experience, and how complex products stay coherent as they grow.

This is not a marketing newsletter.

It’s a continuation of the thinking behind articles like this one.

If that resonates, you’ll probably want to stay.

    We respect your privacy: read our privacy policy.