Data Blog

PostgreSQL application_name

The PostgreSQL parameter application_name can be set in the connection string. The view pg_stat_activity then shows the application_name, which helps identify sessions. The article shows how to set application_name and how to benefit from it. It is highly recommended to set the...

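As a minimal sketch of the idea in this teaser, the snippet below builds a libpq-style connection URI that carries application_name and keeps a query against pg_stat_activity that filters on it. The host, database, user and application names are hypothetical placeholders, and the helper build_dsn is an illustration only, not code from the article.

```python
from urllib.parse import urlencode


def build_dsn(host: str, dbname: str, user: str, application_name: str) -> str:
    """Return a libpq connection URI that sets application_name.

    All argument values used below are hypothetical examples.
    """
    query = urlencode({"application_name": application_name})
    return f"postgresql://{user}@{host}/{dbname}?{query}"


# Hypothetical connection details, for illustration only.
dsn = build_dsn("localhost", "sales", "report_user", "nightly_report")
print(dsn)

# Query to find the sessions of one application; pass the name as a
# bound parameter through whichever PostgreSQL driver is in use.
SESSION_QUERY = """
SELECT pid, usename, application_name, state
FROM pg_stat_activity
WHERE application_name = %s
"""
```

The resulting URI can be handed to any driver that accepts libpq connection URIs. Running SESSION_QUERY with the same name should then list only the sessions opened by that application.
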
PostgreSQL columnar extension cstore_fdw

The PostgreSQL columnar extension cstore_fdw is a storage extension that is suited to OLAP/DWH-style queries and data-intensive applications. Columnar analytical databases have characteristics that differ from row-oriented data access. Many commercial products exist:...

PostgreSQL partitioning guide

PostgreSQL partitioning is a powerful feature when dealing with huge tables. Partitioning breaks a table into smaller chunks, called partitions. Logically, queries see a single table, but physically the data is stored in several partitions....

Anonymization techniques and data privacy

Anonymization techniques are essential for data analytics and for test/dev databases. Anonymization and pseudonymization are very different but often confused. GDPR no longer applies to anonymized data. GDPR still applies to pseudonymized data that can be...

Log-based Change Data Capture - lessons learnt

My article on Medium summarizes experiences from various projects with log-based change data capture (CDC). There are many use cases for which CDC is beneficial. Some databases even have CDC functionality built in, without requiring a separate tool. The article first...

Calvin: distributed ACID transactions

Most distributed databases do not offer ACID transactions. Supporting linear scalability is the main reason why distributed NoSQL databases such as MongoDB, Cassandra, AWS DynamoDB and many others offer reduced transactional support. Abadi et al. propose in a paper...

Study on Knowledge Sharing – Spotify Guilds / CoPs

Communications of the ACM published a study on Spotify Guilds / CoPs (Communities of Practice). A CoP is a group of people with similar interests who share their knowledge, solve problems or establish standards. The study examines the challenge of knowledge sharing...

The Zettabyte challenge

IDC published a white paper about the challenge of Big Data volume in a data-driven world. IDC expects the data volume to grow from 45 zettabytes (ZB) in 2020 to 175 ZB in 2025. The data will be produced in various forms like transactional data, text, voices,...

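To put the two volumes quoted in this teaser into perspective, the following sketch derives the compound annual growth rate they imply. Only the figures 45 ZB (2020) and 175 ZB (2025) come from the teaser. The function name and the assumption of steady year-over-year growth are illustrative and not taken from the IDC white paper.

```python
def implied_annual_growth(start_volume: float, end_volume: float, years: int) -> float:
    """Compound annual growth rate implied by a start and an end value."""
    return (end_volume / start_volume) ** (1 / years) - 1


# Figures from the teaser: 45 ZB in 2020, 175 ZB in 2025.
rate = implied_annual_growth(45.0, 175.0, 2025 - 2020)
print(f"Implied growth: {rate:.1%} per year")
```

Under that assumption the data volume would grow by roughly 31 percent per year, which means it nearly quadruples within five years.
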
Columnar analytical databases for DWH and Data Analytics

The German magazine BI Spektrum published my article on analytical databases for DWH and Data Analytics. The article discusses the characteristics of columnar databases and some categories of analytical databases. This blog post contains a very brief summary....

Q&A on Data Integration and Big Data

Roberto Zicari did a Q&A with me about Data Integration and Big Data. Topics covered include data integration, Big Data architecture, ETL, SQL, Hadoop, data lakes, data catalogs, data quality and education. The interview is available on odbms.org with the following...

NoSQL, NewSQL, cloud-native databases

The first NoSQL databases were created in the 2000s. Companies such as Google, Amazon and Twitter developed their own databases for their specific needs. Over time, many of these databases were made available as open source. This blog post gives an overview of...

JSON and ISO SQL Standard

JSON was initially developed to exchange data via RESTful APIs (Representational State Transfer Application Programming Interfaces). The encoding is always Unicode, mostly UTF-8. Programmable Web contains a variety of links to APIs like Twitter, LinkedIn, Strava, GitHub....
