Skip to content

[Tools] Automating SQL Data Cleanup in Development and Sandbox Environments

Published: at 12:00 PM

Introduction

In our development and sandbox environments, data can quickly accumulate, leading to bloated databases filled with old transaction records, exchange rates, and logs. Without a dedicated housekeeper, these environments can become cluttered, impacting performance and increasing storage costs. To address this issue, I developed the SQL Data Cleanup program, a powerful tool designed to clean up old records from multiple SQL databases based on a provided configuration.

Why Clean Up Data?

Before we dive into the technical details, let’s talk about why this is important:

Configuration

The configuration for the SQL Data Cleanup program is stored in the appsettings.json file. Here is a simplified example configuration for one database and one table:

{
  "DbCleanup": {
    // Number of days to keep data before considering it old and eligible for cleanup
    "OlderThanDays": 365, // Keep Data for 1 year

    // Connection string template for connecting to the SQL databases
    "ConnectionString": "YOUR_CONNECTION_STRING",

    // Primary field used for identifying records in the tables
    "PrimaryField": "Id",

    // Fields used to determine the age of the records for cleanup
    "ConditionFields": ["CreatedOn"],

    // Configuration for individual databases
    "Databases": {
      // Configuration for the "random-database-1" database
      "database-1": {
        // Primary field used for identifying records in this database
        "PrimaryField": "Id",

        // Fields used to determine the age of the records for cleanup in this database
        "ConditionFields": ["UpdateOn"],

        // Configuration for individual tables within this database
        "Tables": {
          // Configuration for the "random-table-1" table
          "table-1": {
            // Primary field used for identifying records in this table
            "PrimaryField": "Id"
          },
          // Configuration for the "random-table-2" table
          "random-table-2": {
            // Primary field used for identifying records in this table
            "PrimaryField": "Id"
          }
        }
      }
    }
  }
}

Configuration Explanation

The configuration is divided into three levels: Global, Database, and Table. Each level allows you to specify ConditionFields and PrimaryField settings, providing flexibility and control over the cleanup process.

Usage

For detailed setup instructions, please refer to the SQL Data Cleanup GitHub repository. The repository contains comprehensive instructions on how to clone the project, update the configuration, build, and run the program.

Code Structure

Conclusion

By using the SQL Data Cleanup program, you can efficiently manage and maintain your SQL databases, ensuring that old and unnecessary data is regularly cleaned up. This not only improves database performance but also helps in reducing storage costs. The program allows you to configure and clean up only the tables you want (whitelist), providing flexibility and control over the cleanup process. Give it a try and let us know your feedback!


Thank you for your time! If you have any further questions, feel free to ask. 🌟✨🎁

Steven GitHub