Unlocking the Power of Enterprise Data with PostgreSQL: Be Big Data Ready

When it comes to choosing a database for enterprise data storage, there are a variety of options available. One of the most popular choices among enterprise businesses is Postgresql. This open source relational database management system has a lot of advantages that make it an excellent choice for enterprise data storage. To interact with the database I use DBeaver – it has been an indispensable tool which I use daily. Speaking about beavers – let’s talk about the elephant in the room (pun intended)!

An image of dbeaver’s table visualisation feature

Relational Databases:

A relational database is a type of database management system that is used to store and provide access to data points that are related to one another. This type of database is based on the relational model, which is a logical and intuitive way of representing data in tables.

The basic structure of a relational database consists of one or more tables, each of which is composed of rows and columns. Each row in the table is a record that represents a single instance of the data being stored, and each column represents a particular attribute or characteristic of the data.

In a relational database, each row in the table is uniquely identified by a key. This key is typically an integer value that is automatically generated by the database system, and it serves as a way to distinguish one record from another.

The columns in the table hold attributes of the data, and each record usually has a value for each attribute. This makes it easy to establish relationships among data points, as the values in one column can be used to link to values in another column.

For example, consider a relational database that stores information about customers and their orders. The customer table might include columns for the customer’s name, address, and phone number, as well as a unique ID for each customer. The order table might include columns for the order date, the order total, and a reference to the customer ID of the person who placed the order.

Using this structure, it is easy to establish a relationship between a customer and their orders. By linking the customer ID column in the customer table to the corresponding column in the order table, it is possible to retrieve all of the orders that were placed by a particular customer.

One of the key advantages of a relational database is its flexibility. Because the data is stored in tables with well-defined relationships, it is easy to add new data to the database or modify existing data as needed. This makes it an ideal choice for applications where the data is constantly changing, such as customer relationship management systems or inventory management systems.

Overall, a relational database is an intuitive and straightforward way of storing and accessing data that is related to one another. By using tables with well-defined relationships, it is possible to easily establish connections between data points and retrieve the information needed to make informed decisions.

Built in Functions

One of the key advantages of Postgresql is its built-in functions. Postgresql has a wide range of built-in functions that can be used to manipulate and analyze data. These functions make it easier to work with the data stored in the database and can save time and effort for developers and data analysts. The documentation available from postgresql directly is a great source for learning about more of the available functions.

Here’s an example of using PostgreSQL built-in functions to extract a substring from a description column, convert it to a number, and multiply it by 2:

SELECT CAST(SUBSTRING(description, '(\d+)') AS INTEGER) * 2 AS result

FROM your_table;

In this example, we are using the SUBSTRING function to extract a substring containing one or more digits from the description column. The regular expression ‘(\d+)’ matches one or more digits and captures them in a group. The CAST function is used to convert the matched substring to an integer. Finally, we multiply the resulting integer value by 2 and return it as result.

You can replace your_table with the name of your actual table, and modify the regular expression or other parts of the query as needed to match your data.

Data Types

Postgresql also offers a large range of data types, including date ranges. While this can be both a positive and a negative, depending on your needs, having a wide range of data types available can be incredibly useful. For example, data types such as time ranges can help with storing transient data and tracking data valid time periods, making it easier and more efficient to work with the data. Making use of these additional data types does commit the data to remain within postgresql to be useful. I did implement this into a database once, and the next time I wanted to use a similar capability I resorted to using 2 columns with a standard datetime datatype, just to simplify queries, and make them less postgresql centric. This made sense for the shorter duration of the requirement, and the simpler SQL query syntax. For an application based used case with many users, I wouldn’t hesitate to use the timerage data type, since it makes queries more efficient.

Custom Functions

In addition to its built-in functions and wide range of data types, Postgresql also offers flexible user-defined functions. This makes the database programmable and allows developers to create their own functions to meet their specific needs. This flexibility can be incredibly valuable in an enterprise setting where custom solutions are often necessary. One function which I use a lot is one that updates a timestamp when the record has been modified. This simplifies some of the functions that interact with the database inserting / ‘upserting’ records. Technically postgres does not have an upsert in-built function, but again, a custom function can be written to create a temporary table and perform and delete and insert on conflict of an insert with the primary key

Here is an example of a function that updates a field when a record is modified.

CREATE OR REPLACE FUNCTION update_last_updated() RETURNS TRIGGER AS $$
BEGIN
    NEW.last_updated = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER users_update_trigger
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_last_updated();

ODBC Compatibility – Excel Integration

Postgresql is also ODBC compatible, with drivers that can be found on the ODBC website. This enables connectivity to tools such as Excel and Power BI, making it easier for users not familiar with database architectures to work with the data stored in Postgresql. This can make the transition to using Postgresql smoother for those who are used to working with more user-friendly interfaces. With a little vba, a user could write a query in excel, and the results could be available within the same excel, users may feel that this suits their workflow better, or at least – it may be more familiar.

Conclusions

Of course, like any database, there are some downsides to using Postgresql. One of the main drawbacks is that there is a slight learning curve when setting it up. It takes a little more effort to configure users and schemas than some other databases, such as MySQL. However, the use of schemas instead of databases can actually make it easier to query across different parts of the database, as schemas are accessible in a similar way as accessing different folders in the same system.

In conclusion, Postgresql is an excellent choice for enterprise data storage. Its built-in functions, wide range of data types, flexible user-defined functions, and ODBC compatibility make it a versatile and powerful database system. While there is a slight learning curve to setting it up, the benefits of using Postgresql for enterprise data storage far outweigh the initial investment of time and effort.

Have a look at my related posts here.