Standalone MariaDB ColumnStore 1.2.3 installation on Ubuntu - Getting started

Introduction

A standalone instance MariaDB ColumnStore 1.2.3 is installed and configured on Ubuntu 18 in this article.

  • The installation is performed with a non root user (user mcs). MariaDB ColumnStore will run with the user mcs.
  • The storage is internal.
  • The MySQL/MariaDB front end database is installed in the directory /sqlpac/mcs/mysql.
  • One DBRoot ColumnStore with one disk is created (DBRoot1) and installed in the directory /sqlpac/mcs/data1.
architecture MariaDB Columnstore Sandbox

Pre-requisites

Preparing the user mcs

MariaDB ColumnStore will be installed and will run with the user mcs (id: 10010, $HOME : /opt/mcs).

For a non root installation :

  • The home directory ($HOME) must be the root installation directory of MariaDB ColumnStore.
  • The group name must be the same than the user name (mcs).
  • When planning multiple instances installation, the user id and group id must be the same on all hosts.
root@vps$ useradd -d /opt/mcs -m -u 10010 -s /bin/bash mcs

System requirements

Increase the number of opened files in the system configuration file /etc/security/limits.conf for the user mcs.

/etc/security/limits.conf
mcs hard nofile 65536
mcs soft nofile 65536

Packages dependencies installation

The following packages are mandatory :

  • libboost
  • libjemalloc (memory allocator)
  • libsnappy (compression library)

If not installed, as root user :

root@vps$ apt-get install libboost-all-dev
          
root@vps$ apt-get install libjemalloc1
          
root@vps$ apt-get install libsnappy1v5

MariaDB ColumnStore download and installation

Download MariaDB ColumnStore 1.2.3 from MariaDB website : mariadb-columnstore-1.2.3-1-bionic.amd64.bin.tar.gz.

As user mcs, unzip and extract the archive in the directory $HOME.

mcs@vps$ cd $HOME
          
mcs@vps$ gunzip mariadb-columnstore-1.2.3-1-bionic.amd64.bin.tar.gz
          
mcs@vps$ tar xvf mariadb-columnstore-1.2.3-1-bionic.amd64.bin.tar
Do not change the directory structure during the extraction, the installation directory must absolutely be $HOME/mariadb/columnstore, otherwise setup fails.

Configuration and setup

Modifying databases locations

Before running any post installation scripts and building the system, in this article the MySQL Front End database and the DBRoot ColumnStore are not installed in their default locations, respectively $HOME/mariadb/columnstore/mysql/db and $HOME/mariadb/columnstore/data1. In such case, soft links must be created :

mcs@vps$ ln -s /sqlpac/mcs/mysql /opt/mcs/mariadb/columnstore/mysql/db
mcs@vps$ ln -s /sqlpac/mcs/data1 /opt/mcs/mariadb/columnstore/data1

Check the user mcs is the owner of the new locations.

post-install

As user mcs, run the script post-install.

Be sure the script post-install is executed when the current directory is $HOME.
mcs@vps$ ./mariadb/columnstore/bin/post-install --installdir=$HOME/mariadb/columnstore

Various configurations are prepared by this script (links to libraries, directories…).

The output gives the next steps to be performed depending on the context.

NOTE: For non-root install, you will need to run the following commands as root user to
      setup the MariaDB ColumnStore System Logging

export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
/opt/mcs/mariadb/columnstore/bin/syslogSetup.sh --installdir=/opt/mcs/mariadb/columnstore --user=mcs install


The next steps are:

If installing on a pm1 node using non-distributed install

export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
/opt/mcs/mariadb/columnstore/bin/postConfigure -i /opt/mcs/mariadb/columnstore

If installing on a pm1 node using distributed install

export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
/opt/mcs/mariadb/columnstore/bin/postConfigure -i /opt/mcs/mariadb/columnstore -d

If installing on a non-pm1 using the non-distributed option:

export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
/opt/mcs/mariadb/columnstore/bin/columnstore start

syslogSetup.sh

Only one script must be executed as root user : syslogSetup.sh. This script configures logging functionalities in the directory /var/log/mariadb/columnstore. If not performed, log files won’t be available for diagnose purposes (crit.log, warn.log, info.log, debug.log, warn.log…).

root@vps$ export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
root@vps$ export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib

root@vps$ /opt/mcs/mariadb/columnstore/bin/syslogSetup.sh --installdir=/opt/mcs/mariadb/columnstore --user=mcs install
System logging being used: rsyslog

postConfigure

The columnstore system is then built with the script postConfigure as user mcs. A performance module pm1 using a non distributed install is generated.

mcs@vps$ export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
mcs@vps$ export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib

mcs@vps$ /opt/mcs/mariadb/columnstore/bin/postConfigure -i /opt/mcs/mariadb/columnstore

A wizard starts :

  • System server type : single
  • System name : mcs1
  • Type of data storage : internal
  • DBRoot IDs assigned to module pm1 : 1
This is the MariaDB ColumnStore System Configuration and Installation tool.
It will Configure the MariaDB ColumnStore System and will perform a Package
Installation of all of the Servers within the System that is being configured.

IMPORTANT: This tool requires to run on the Performance Module #1

Prompting instructions:

        Press 'enter' to accept a value in (), if available or
        Enter one of the options within [], if available, or
        Enter a new value


===== Setup System Server Type Configuration =====

There are 2 options when configuring the System Server Type: single and multi

  'single'  - Single-Server install is used when there will only be 1 server configured
              on the system. It can also be used for production systems, if the plan is
              to stay single-server.

  'multi'   - Multi-Server install is used when you want to configure multiple servers now or
              in the future. With Multi-Server install, you can still configure just 1 server
              now and add on addition servers/modules in the future.

Select the type of System Server install [1=single, 2=multi] (2) > 1

Performing the Single Server Install.

Enter System Name (columnstore-1) > mcs1

===== Setup Storage Configuration =====


----- Setup Performance Module DBRoot Data Storage Mount Configuration -----

There are 2 options when configuring the storage: internal or external

  'internal' -    This is specified when a local disk is used for the DBRoot storage.
                  High Availability Server Failover is not Supported in this mode

  'external' -    This is specified when the DBRoot directories are mounted.
                  High Availability Server Failover is Supported in this mode.

Select the type of Data Storage [1=internal, 2=external] (1) > 1

Enter the list (Nx,Ny,Nz) or range (Nx-Nz) of DBRoot IDs assigned to module 'pm1' (1) >1

When the installation is successfull :

===== Performing Configuration Setup and MariaDB ColumnStore Startup =====

NOTE: Setting 'NumBlocksPct' to 50%
      Setting 'TotalUmMemory' to 25% of total memory.

Running the MariaDB ColumnStore setup scripts

post-mysqld-install Successfully Completed
post-mysql-install Successfully Completed

Starting MariaDB Columnstore Database Platform

Starting MariaDB ColumnStore Database Platform Starting, please wait ..... DONE

System Catalog Successfull Created

MariaDB ColumnStore Install Successfully Completed, System is Active

Enter the following command to define MariaDB ColumnStore Alias Commands

. /opt/mcs/.profile

Enter 'mcsmysql' to access the MariaDB ColumnStore SQL console
Enter 'mcsadmin' to access the MariaDB ColumnStore Admin console

NOTE: The MariaDB ColumnStore Alias Commands are in /etc/profile.d/columnstoreAlias.sh

Post installation environment variables

After the installation, the file .profile for the user mcs is updated. Environment variables and useful aliases are defined.

$HOME/.profile
# MariaDB Columnstore Non-Root Environment Variables
export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib

# MariaDB Columnstore Non-Root Alias Variables
# MariaDB Columnstore Alias Commands
#
alias mcsmysql='/opt/mcs/mariadb/columnstore/mysql/bin/mysql --defaults-extra-file=/opt/mcs/mariadb/columnstore/mysql/my.cnf -u root'
alias ma=/opt/mcs/mariadb/columnstore/bin/mcsadmin
alias mcsadmin=/opt/mcs/mariadb/columnstore/bin/mcsadmin
alias cpimport=/opt/mcs/mariadb/columnstore/bin/cpimport
alias mcshome='cd /opt/mcs/mariadb/columnstore'
alias log='cd /var/log/mariadb/columnstore/'
alias core='cd /var/log/mariadb/columnstore/corefiles'
alias tmsg='tail -f /var/log/messages'
alias tdebug='tail -f /var/log/mariadb/columnstore/debug.log'
alias tinfo='tail -f /var/log/mariadb/columnstore/info.log'
alias terror='tail -f /var/log/mariadb/columnstore/err.log'
alias twarning='tail -f /var/log/mariadb/columnstore/warning.log'
alias tcrit='tail -f /var/log/mariadb/columnstore/crit.log'
alias dbrm='cd /opt/mcs/mariadb/columnstore/data1/systemFiles/dbrm'
alias mcsmodule='cat /opt/mcs/mariadb/columnstore/local/module'

Mariadb ColumnStore processes architecture

11 processes run in a MariaDB ColumnStore performance module, ProcMon (ProcessMonitor) is the parent process of all the other processes except mysql :

mcs@vps$ ps -fauxwww  | grep 'mcs'

\_ /opt/mcs/mariadb/columnstore/bin/ProcMon
   \_ [ProcMgr]
   \_ /opt/mcs/mariadb/columnstore/bin/controllernode fg
   \_ /opt/mcs/mariadb/columnstore/bin/ServerMonitor
   \_ /opt/mcs/mariadb/columnstore/bin/workernode DBRM_Worker1 fg
   \_ [PrimProc]
   \_ [ExeMgr]
   \_ [WriteEngineServ]
   \_ [DDLProc]
   \_ [DMLProc]

\_ /opt/mcs/mariadb/columnstore/mysql//bin/mysqld --basedir=/opt/mcs/mariadb/columnstore/mysql/ \
            --datadir=/opt/mcs/mariadb/columnstore/mysql/db \
            --plugin-dir=/opt/mcs/mariadb/columnstore/mysql/lib/plugin \
            --log-error=/opt/mcs/mariadb/columnstore/mysql/db/vps.err \
            --pid-file=/opt/mcs/mariadb/columnstore/mysql/db/vps.pid \
            --socket=/opt/mcs/mariadb/columnstore/mysql/lib/mysql/mysql.sock \
            --port=3306

Processes listen on various ports, ports defined in the configuration file $HOME/mariadb/columnstore/etc/Columnstore.xml :

mcs@vps$ lsof -i -P -n | grep LISTEN | grep 'mcs'

mysqld    16684  mcs   28u  IPv6 182203      0t0  TCP *:3306 (LISTEN)
controlle 16789  mcs    5u  IPv4 183378      0t0  TCP *:8616 (LISTEN)
ServerMon 16810  mcs    3u  IPv4 183939      0t0  TCP *:8622 (LISTEN)
workernod 16836  mcs    9u  IPv4 183420      0t0  TCP *:8700 (LISTEN)
PrimProc  16895  mcs   10u  IPv4 183479      0t0  TCP *:8620 (LISTEN)
ExeMgr    16979  mcs   11u  IPv4 183577      0t0  TCP *:8601 (LISTEN)
WriteEngi 17039  mcs    9u  IPv4 183653      0t0  TCP *:8630 (LISTEN)
DDLProc   17079  mcs   10u  IPv4 183747      0t0  TCP *:8612 (LISTEN)
DMLProc   17138  mcs   10u  IPv4 183871      0t0  TCP *:8614 (LISTEN)
ProcMon   24558  mcs    4u  IPv4 150911      0t0  TCP *:8800 (LISTEN)
ProcMon   24558  mcs   11u  IPv4 150920      0t0  TCP *:8604 (LISTEN)
ProcMgr   24680  mcs    9u  IPv4 151557      0t0  TCP *:8603 (LISTEN)
ProcMgr   24680  mcs   10u  IPv4 151555      0t0  TCP *:8606 (LISTEN)

A schema is more explicit :

Processes architecture

To manage Mariadb ColumnStore, use the admin console mcsadmin.

List processes - getProcessStatus

mcs@vps$ mcsadmin getProcessStatus
getprocessstatus   Thu Dec 12 13:50:04 2019

MariaDB ColumnStore Process statuses

Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      pm1       ACTIVE            Fri Nov 15 13:37:48 2019       22045
ProcessManager      pm1       ACTIVE            Fri Nov 15 13:37:54 2019       22171
DBRMControllerNode  pm1       ACTIVE            Fri Nov 15 13:38:00 2019       22581
ServerMonitor       pm1       ACTIVE            Fri Nov 15 13:38:02 2019       22604
DBRMWorkerNode      pm1       ACTIVE            Fri Nov 15 13:38:03 2019       22656
PrimProc            pm1       ACTIVE            Fri Nov 15 13:38:06 2019       22694
ExeMgr              pm1       ACTIVE            Fri Nov 15 13:38:10 2019       22771
WriteEngineServer   pm1       ACTIVE            Fri Nov 15 13:38:14 2019       22853
DDLProc             pm1       ACTIVE            Fri Nov 15 13:38:19 2019       22916
DMLProc             pm1       ACTIVE            Fri Nov 15 13:38:23 2019       22981
mysqld              pm1       ACTIVE            Fri Nov 15 13:38:00 2019       22470

Starting / Stopping the ColumnStore system

Use mcsadmin to start and stop the database columnstore system :

mcs@vps$ mcsadmin startSystem
startsystem   Thu Dec 12 14:32:13 2019

startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules

   System being started, please wait...........
   Successful start of System
mcs@vps$ mcsadmin shutdownSystem y
shutdownsystem   Thu Dec 12 14:38:02 2019

This command stops the processing of applications on all Modules within the MariaDB ColumnStore System

   Checking for active transactions
   Stopping System...
   Successful stop of System

   Shutting Down System...
   Successful shutdown of System

The option shutdownSystem stops the whole system. When using the option stopSystem, all processes are stopped except ProcessMonitor and ProcessManager.

Configuration, resource usage

mcsadmin displays useful informations about configuration (DBRoot…), resources usage (CPU, Memory…) :

mcs@vps$ mcsadmin getStorageConfig
getstorageconfig   Thu Dec 12 15:17:49 2019

System Storage Configuration

Performance Module (DBRoot) Storage Type = internal
System Assigned DBRoot Count = 1
DBRoot IDs assigned to 'pm1' = 1
mcs@vps$ mcsadmin getResourceUsage
getsystemresourceusage   Thu Dec 12 15:18:23 2019

System Resource Usage per Module

Module 'pm1' Resource Usage

CPU: 3% Usage
Mem:  7973380k total, 657728k used, 584824k cache, 8% Usage
Swap: 0 k total, 0k used, 0% Usage
Top CPU Process Users: f2b/server 6%
Top Memory Process Users: mysqld 2%, systemd-journald 1%, mono 1%, DMLProc 1%, workernode 1%
Disk Usage: / 78%
mcs@vps$ mcsadmin getModuleMemory pm1
getmodulememory   Thu Dec 12 15:19:32 2019

Module Memory Usage (in K bytes)

Module  Mem Total  Mem Used cache    Mem Usage % Swap Total Swap Used Swap Usage %
------  ---------  -------  -------  ----------  ---------- --------- -----------
pm1     7973380    658000    584840       8        0           0               0

For the full list of the available options : use mcsadmin help.

The MySQL/MariaDB front end database

mcsmysql

Locally, use the alias mcsmysql to connect to the database. MariaDB 10.3.13 is embedded with MariaDB ColumnStore 1.2.3.

mcs@vps$ mcsmysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 11
Server version: 10.3.13-MariaDB-log Columnstore 1.2.3-1

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]>

Storage engines, plugins, variables

The ColumnStore storage engine is installed and InnoDB is the default storage engine :

MariaDB [(none)]> show storage engines;
+--------------------+---------+----------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------+--------------+------+------------+
| Columnstore        | YES     | Columnstore storage engine             | YES          | NO   | NO         |
| ...                |         |                                        |              |      |            |
| MyISAM             | YES     | Non-transactional engine with good     | NO           | NO   | NO         |
|                    |         | performance and small data footprint   |              |      |            |
| ...                |         |                                        |              |      |            |
| InnoDB             | DEFAULT | Supports transactions,                 | YES          | YES  | YES        |
|                    |         | row-level locking, foreign keys        |              |      |            |
|                    |         | and encryption for tables              |              |      |            |
| ...                |         |                                        |              |      |            |
| Aria               | YES     | Crash-safe tables with MyISAM          | NO           | NO   | NO         |
|                    |         | heritage                               |              |      |            |
| ...                |         |                                        |              |      |            |
+--------------------+---------+----------------------------------------+--------------+------+------------+

Many plugins related to the ColumnStore storage engine are enabled :

MariaDB [(none)]> show plugins;
+-------------------------------+----------+--------------------+---------------------------+---------+
| Name                          | Status   | Type               | Library                   | License |
+-------------------------------+----------+--------------------+---------------------------+---------+
| Columnstore                   | ACTIVE   | STORAGE ENGINE     | libcalmysql.so            | GPL     |
| InfiniDB                      | ACTIVE   | STORAGE ENGINE     | libcalmysql.so            | GPL     |
| COLUMNSTORE_TABLES            | ACTIVE   | INFORMATION SCHEMA | is_columnstore_tables.so  | GPL     |
| COLUMNSTORE_COLUMNS           | ACTIVE   | INFORMATION SCHEMA | is_columnstore_columns.so | GPL     |
| COLUMNSTORE_EXTENTS           | ACTIVE   | INFORMATION SCHEMA | is_columnstore_extents.so | GPL     |
| COLUMNSTORE_FILES             | ACTIVE   | INFORMATION SCHEMA | is_columnstore_files.so   | GPL     |

Historically, the ColumnStore storage engine was first developed by the Calpont company, company that does not exist anymore, and was called InfiniDB. MariaDB ColumnStore is a fork of Calpont/InfiniDB. That’s why libraires are called for example libcalmysql.so, cal for Calpont.

Variables related to the ColumnStore storage engine are named infinidb% :

MariaDB [(none)]> show variables like 'infinidb%';
+---------------------------------------------+-------+
| Variable_name                               | Value |
+---------------------------------------------+-------+
| infinidb_compression_type                   | 2     |
| infinidb_decimal_scale                      | 8     |
| infinidb_diskjoin_bucketsize                | 100   |
| infinidb_diskjoin_largesidelimit            | 0     |
| infinidb_diskjoin_smallsidelimit            | 0     |
| infinidb_double_for_decimal_math            | OFF   |
| infinidb_import_for_batchinsert_delimiter   | 7     |
| infinidb_import_for_batchinsert_enclosed_by | 17    |
| infinidb_local_query                        | 0     |
| infinidb_ordered_only                       | OFF   |
| infinidb_string_scan_threshold              | 10    |
| infinidb_stringtable_threshold              | 20    |
| infinidb_um_mem_limit                       | 0     |
| infinidb_use_decimal_scale                  | OFF   |
| infinidb_use_import_for_batchinsert         | ON    |
| infinidb_varbin_always_hex                  | OFF   |
| infinidb_vtable_mode                        | 1     |
+---------------------------------------------+-------+

By default, the compression is set to 2 (infinidb_compression_type : 2), compression is enabled.

Client configuration

ColumnStore uses a dedicated database called infinidb_vtable for creation of all temporary tables used for ColumnStore tables query processing. The permission "CREATE TEMPORARY TABLES" must be given for all user accounts in the database infinidb_vtable.

MariaDB [(none)]> create user flightsdbo@'%' identified by '************';

MariaDB [(none)]> grant create temporary tables on infinidb_vtable.* to flightsdbo;

MariaDB [(none)]> flush privileges;

When this permission is not given, an error occurs when processing queries :

ERROR 1044 (42000): Access denied for user 'flightsdbo'@'%' to database 'infinidb_vtable'

Creating a ColumnStore table

To create a ColumnStore table, only use the option engine=columnstore in the CREATE TABLE statement.

MariaDB [(none)]> use flights;

MariaDB [(flights)]>

create table airlines (
    iata_code char(2),
    airline varchar(30)
) engine=columnstore default character set=utf8;

Loading data with cpimport

cpimport is the utility to bulk load data from flat files.

Example : loading airports in the table airports (database flights) from the file airports.csv.

mcs@vps$ cpimport  -m 2 -s ',' -E '"' flights airports -l $SCHEMA_DIR/airports.csv

2019-11-18 12:36:14 (25992) INFO : Running distributed import (mode 2) on all PMs...
2019-11-18 12:36:16 (25992) INFO : For table flights.airports: 342 rows processed and 342 rows inserted.
2019-11-18 12:36:16 (25992) INFO : Bulk load completed, total run time : 2.11828 seconds

The demo database flights

A useful demo database is available : Mariadb ColumnStore Flights Demo Database.

In this demo database, three tables : airlines, airports and flights. All american flights during the year 2018 are loaded.

/opt/mcs/flights/data/2018-03.csv
2019-11-18 12:01:07 (2547) INFO : Running distributed import (mode 2) on all PMs...
2019-11-18 12:01:10 (2547) INFO : For table flights.flights: 665810 rows processed and 665810 rows inserted.
2019-11-18 12:01:10 (2547) INFO : Bulk load completed, total run time : 3.24067 seconds
/opt/mcs/flights/data/2018-04.csv
2019-11-18 12:01:11 (2674) INFO : Running distributed import (mode 2) on all PMs...
2019-11-18 12:01:15 (2674) INFO : For table flights.flights: 649023 rows processed and 649023 rows inserted.
2019-11-18 12:01:15 (2674) INFO : Bulk load completed, total run time : 4.22641 seconds

The variable infinidb_vtable_mode

An important variable : infinidb_vtable_mode. This parameter can be defined at the server level and at the session level.

MariaDB [(none)]> show variables like 'infinidb_vtable_mode%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| infinidb_vtable_mode | 1     |
+----------------------+-------+
ValueBehaviour
0 Some WHERE clause components can be processed by ColumnStore, but joins are processed entirely by mysqld using a nested-loop join mechanism
1 Default : query syntax is evaluated by ColumnStore for compatibility with distributed execution and incompatible queries are rejected.
2 Auto-switch mode : ColumnStore will attempt to process the query internally, if it cannot, it will automatically switch the query to run in row-by-row mode.

The needed value will depend on the target application querying system.

ColumnStore metadata informations

Space usage

In the database columnstore_info, the stored procedure total_usage returns the total space used in the ColumnStore system :

MariaDB [(none)]> call columnstore_info.total_usage();

+-----------------+------------------+
| TOTAL_DATA_SIZE | TOTAL_DISK_USAGE |
+-----------------+------------------+
| 701.76 MB       | 764.35 MB        |
+-----------------+------------------+

To get space used for the ColumnStore tables, use the stored procedure columnstore_info.table_usage().

MariaDB [(none)]> call columnstore_info.table_usage(null,null);
+--------------+------------+-----------------+-----------------+-------------+
| TABLE_SCHEMA | TABLE_NAME | DATA_DISK_USAGE | DICT_DATA_USAGE | TOTAL_USAGE |
+--------------+------------+-----------------+-----------------+-------------+
| flights      | airlines   | 2.52 MB         | 2.01 MB         | 4.52 MB     |
| flights      | airports   | 9.55 MB         | 6.02 MB         | 15.58 MB    |
| flights      | flights    | 744.25 MB       | 0.00 Bytes      | 744.25 MB   |
+--------------+------------+-----------------+-----------------+-------------+

MariaDB [(none)]> call columnstore_info.table_usage('flights','flights');
+--------------+------------+-----------------+-----------------+-------------+
| TABLE_SCHEMA | TABLE_NAME | DATA_DISK_USAGE | DICT_DATA_USAGE | TOTAL_USAGE |
+--------------+------------+-----------------+-----------------+-------------+
| flights      | flights    | 744.25 MB       | 0.00 Bytes      | 744.25 MB   |
+--------------+------------+-----------------+-----------------+-------------+

Compression ratio

The average compression ratio is calculated with the stored procedure columnstore_info.compression_ratio.

MariaDB [(none)]> call columnstore_info.compression_ratio();
+-------------------+
| COMPRESSION_RATIO |
+-------------------+
| 2.7716:1          |
+-------------------+

INFORMATION_SCHEMA

MariaDB ColumnStore implements four Information Schema tables that expose informations about the tables and columns storage :

  • COLUMNSTORE_TABLES
  • COLUMNSTORE_COLUMNS
  • COLUMNSTORE_EXTENTS
  • COLUMNSTORE_FILES

The stored procedures columnstore_info.total_usage(), columnstore_info.table_usage(), columnstore_info.compression_ratio() relies on these information tables.

MariaDB [information_schema]> select * from columnstore_tables;
+--------------+------------+-----------+---------------+--------------+---------------+
| TABLE_SCHEMA | TABLE_NAME | OBJECT_ID | CREATION_DATE | COLUMN_COUNT | AUTOINCREMENT |
+--------------+------------+-----------+---------------+--------------+---------------+
| flights      | airlines   |      3000 | 2019-11-18    |            2 |          NULL |
| flights      | airports   |      3004 | 2019-11-18    |            7 |          NULL |
| flights      | flights    |      3015 | 2019-11-18    |           32 |          NULL |
+--------------+------------+-----------+---------------+--------------+---------------+

To find where is stored a column, its properties and its size, for example the column year in the table flights :

MariaDB [information_schema]>
SELECT 
       b.table_name as "Table",
       b.column_name as "Column",
       a.filename,
       a.file_size,
       a.compressed_data_size,
       c.min_value as "Min",
       c.max_value as "Max"
FROM COLUMNSTORE_FILES a, 
     COLUMNSTORE_COLUMNS b,
     COLUMNSTORE_EXTENTS c
WHERE a.object_id = b.object_id
  AND b.object_id = c.object_id
  AND b.COLUMN_NAME='year'
  AND b.table_name='flights'

+---------+--------+----------------------------------------------------------------------------------------+-----------+----------------------+
| Table   | Column | filename                                                                               | file_size | compressed_data_size |
+---------+--------+----------------------------------------------------------------------------------------+-----------+----------------------+
| flights | year   | /opt/mcs/mariadb/columnstore/data1/000.dir/000.dir/012.dir/136.dir/000.dir/FILE001.cdf |  16785408 |               860160 | 
+---------+--------+----------------------------------------------------------------------------------------+-----------+----------------------+
| Min     | Max    |
+---------+--------+
|  2015   |  2017  |

The min and max values are important informations when the ColumnStore engine processes query, these values govern partitions eliminations.

Conclusion

Installing a standalone MariaDB ColumnStore is quite easy and adequate for getting started before planning architectures with multiple performance modules.

First tests performed with analytical queries : the performances are very good. But there is this big inconvenience : unable to create unicity constraints to enforce data quality. If duplicate rows are inserted due to an error, duplicate tuples must be removed manually.

Another inconvenience (for the moment), feeding is efficient only when using cpimport. Like many other column store systems, atomic feeding (INSERT, UPDATE, DELETE) is not optimized in MariaDB ColumnStore : performance and storage degradations. That’s probably the reason why replication is not (yet) implemented and supported.