CRUD is the acronym for CREATE, READ, UPDATE and DELETE. These terms describe the four essential operations for creating and managing persistent data elements, mainly in relational and NoSQL databases.
This post will describe how CRUD operations are used for data processing. We will also show the issues that sysadmins or DevOps engineers may find when monitoring a database.
As mentioned, CRUD operations are used in persistent storage applications, meaning these applications will keep their data even after the system powers down. These are different from operations on data stored in volatile storage, like Random Access Memory or cache files.
CRUD is extensively used in database applications. This includes Relational Database Management Systems (RDBMS) like Oracle, MySQL, and PostgreSQL. It also includes NoSQL databases like MongoDB, Apache Cassandra, and AWS DynamoDB.
Operations similar to CRUD can be performed on persistent data structures like files. For example, you can create a Microsoft Word document, update it, read it, and even delete it from the file explorer. However, files are not record-oriented (or document-oriented in the case of MongoDB or Couchbase). The CRUD terminology is specifically related to record-oriented operations instead of flat file operations.
The CREATE operation adds a new record to a database. In RDBMS, a database table row is referred to as a record, while columns are called attributes or fields. The CREATE operation adds one or more new records with distinct field values in a table.
The same principle applies to NoSQL databases. If the NoSQL database is document-oriented, then a new document (for example, a JSON formatted document with its attributes) is added to the collection, which is the equivalent of an RDBMS table. Similarly, in NoSQL databases like DynamoDB, the CREATE operation adds an item (which is equivalent to a record) to a table.
READ returns records (or documents or items) from a database table (or collection or bucket) based on some search criteria. The READ operation can return all records and some or all fields.
UPDATE is used to modify existing records in the database. For example, this can be the change of address in a customer database or price change in a product database. Similar to READ, UPDATEs can be applied across all records or only a few, based on criteria.
An UPDATE operation can modify and persist changes to a single field or to multiple fields of the record. If multiple fields are to be updated, the database system ensures they are all updated or not at all. Some big data systems don’t implement UPDATE but allow only a timestamped CREATE operation, adding a new version of the row each time.
DELETE operations allow the user to remove records from the database. A hard delete removes the record altogether, while a soft delete flags the record but leaves it in place. For example, this is important in payroll where employment records need to be maintained even after an employee has left the company.
In RDBMS, CRUD operations are performed through Structure Query Language (SQL) commands.
CRUD operations in NoSQL databases will depend on the language of the specific database platform. For example, the Cassandra CQL looks very similar to SQL. In MongoDB, on the other hand, the operations are performed with built-in functions:
Database developers or DBAs often run CRUD statements manually against the database from a client tool. However, in most production use cases, these statements are embedded within the programming language code. When the program runs, the API for the target database takes the CRUD statement and translates it into the native language of the database.
For example, when an ecommerce site visitor initiates the user registration process, a microservice written in Python or Java may read the input values (such as first name, last name, email, address, and so on), and dynamically build an Oracle PL/SQL command. This statement is then sent to the Oracle driver library, which runs it against the database.
Let’s take the ecommerce store example further.
In an online travel agency, a user can CREATE a booking request, READ available flights for the requested route, and make a purchase. This will UPDATE a list of available seats for the flight and CREATE multiple records in the “itinerary” table. If the user terminates the session halfway, then all rows related to this transaction are DELETEd.
Software operations involving CRUD are usually black-box tested. When the testers perform certain operations, they check the backend database rather than analyzing the code to see if the intended changes were made or the correct data returned. Such testing aims to validate each CRUD operation resulting from various possible user interactions in different scenarios.
Efficient database design is the prerequisite for optimal CRUD operations. Without good database design, CRUD operations can adversely affect the performance of a database.
For example, operations like UPDATE or DELETE require exclusive locks on the rows (and their related resources, like data pages or indexes). Locks ensure that when one more row is modified, they are not available to other processes or users for any CRUD operation. This is to ensure the integrity of the data.
You can’t read a record when it’s being deleted or allow two or more users to update a single record simultaneously. Other types of locks, such as shared locks allow simultaneous READs. Locks can be configured at the database or statement level, and different types of locking will dictate which CRUD operations are allowed and how the CRUD operation will behave.
Needless to say, the type of locking and the number of simultaneous locks due to user sessions will affect the performance of a database. For example, a busy ecommerce site with hundreds or thousands of simultaneous users will have many locks operating at the same time. The result could be slow responsiveness as the user waits for locks to be released.
This performance challenge is why database administrators work to ensure CRUD operations can complete as quickly as possible. This requires query tuning based on feedback from monitoring solutions. Such monitoring solutions can show current database locks, metrics, and logs to help the administrator identify possible bottlenecks.
Falcon LogScale Community Edition (previously Humio) offers a free modern log management platform for the cloud. Leverage streaming data ingestion to achieve instant visibility across distributed systems and prevent and resolve incidents.
Falcon LogScale Community Edition, available instantly at no cost, includes the following:
Get Started Free
Arfan Sharif is a product marketing lead for the Observability portfolio at CrowdStrike. He has over 15 years experience driving Log Management, ITOps, Observability, Security and CX solutions for companies such as Splunk, Genesys and Quest Software. Arfan graduated in Computer Science at Bucks and Chilterns University and has a career spanning across Product Marketing and Sales Engineering.