From Thought to Query

The Process of Writing SQL-Queries

Beginner SQL JSON
Tom Gerzabek Tom Gerzabek on Jan 10, 2025

Have you ever wondered how to get from a rough question in data analysis to a working query?

In this blog post, I would like to give you an insight into my workflow and show you how I developed an SQL query step by step - from the idea to the final implementation.

The Problem: Inconsistent Data

I started of with inconsistent data in a column of a database table. Initially, the data was saved as JSON via a migration. Later, the format changed, which led to inconsistencies when processing and analyzing the comments.

The task was clear: the inconsistencies had to be resolved and the database had to be cleaned up so that the comments could be analyzed without any problems.

Step 1: Overview of the Data

Before you start with the actual work, it is important to understand the structure of the data. A simple SELECT query helps us to get an overview.

SELECT comment FROM postings;

This query displays all values from the comment column. However, I was only interested in the part of the data that was actually formatted as JSON. Fortunately, there is a function in MySQL for this, JSON_VALID().

The function expects an argument, typically a column name, and returns a Boolean.

The statement can simply be extended by a WHERE clause.

SELECT id FROM postings WHERE JSON_VALID(comment) IS TRUE;

The JSON_VALID operator checks whether the content of the comment column is a valid JSON object and only returns the corresponding rows.

Step 2: Working with JSON Data

The JSON_EXTRACT function is used to extract a specific text from a JSON object. This function expects two arguments.

  1. the column name (or JSON value) that contains the data.
  2. the JSON path that leads to the desired value.

For example,

SELECT JSON_EXTRACT(comment, '$.blocks[0].data.text') FROM postings;

In this case, I extract the text from the JSON structure that is stored in the first block (blocks[0]) stored in data.text.

Important to note: The dollar sign indicates the root of the JSON object.

Step 3: The Final Query

To clean up the database, I had to combine the two queries. The goal was to save the extracted text from the JSON structure in the comment column.

This caused another problem, as the JSON_EXTRACT function identifies the extracted text as a string and delimits it with quotation marks. Of course, you don't expect quotation marks in a plain text column.

That's why I needed a function that can replace certain character strings within a string. I found this in the MySQL manual under REPLACE.

This function expects the following arguments.

  1. The string which needs to be searched in, typically called haystack in computer lingo.
  2. The pattern string to be replaced, commonly named needle.
  3. The replacement characters.

We have now collected all the pieces of the puzzle and are able to formulate the final update statement.

UPDATE postings p SET p.comment = REPLACE(JSON_EXTRACT(p.comment, '$.blocks[0].data.text'), '"', '') WHERE JSON_VALID(p.comment) IS TRUE;

This UPDATE query replaces the values in the comment column with the extracted text if it is in a valid JSON format.

Since this query is relatively complex and has to iterate over more than 8,000 rows in the table, I was interested in how long the query would take. The result surprised me. MySQL completed this task in an incredible 23 ms.

Problems or tasks that seem overwhelming at first can often be overcome with the right approach.

In this case, it helped to break the problem down into smaller steps. This made everything clearer and the path to the solution was ultimately quicker and easier than I had initially assumed.

Join the LaravelUi5 Insider List

Get curated updates about LaravelUi5 — product milestones, architectural insights, and upcoming events.

Our newsletter is written for technical decision-makers who want to stay ahead of the curve without being overwhelmed by noise.

We send updates only when something significant happens — releases, webinars, or new SDK capabilities worth your attention.

    We respect your privacy: read our privacy policy.