ClickHouse


What Is ClickHouse?

ClickHouse® is a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP). It is available both as open-source software and as a cloud offering.

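ClickHouse is a column-oriented database management system (DBMS) that Yandex, a Russian company, open-sourced in 2016. It is used mainly for online analytical processing (OLAP) and can generate analytical reports in real time from SQL queries.

Source: https://blog.csdn.net/weixin_32265569/article/details/111822811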

What is OLAP?

OLAP scenarios require real-time responses on top of large datasets for complex analytical queries with the following characteristics:

  • Datasets can be massive - billions or trillions of rows
  • Data is organized in tables that contain many columns
  • Only a few columns are selected to answer any particular query
  • Results must be returned in milliseconds or seconds

What Is Columnar Storage?

Take the following table as an example:

Id Name Age
1 张三 18
2 李四 22
3 王五 34

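With row-oriented storage, the data is organized on disk like this:

[Figure: row-oriented storage layout on disk]

The advantage is that fetching all attributes of one person takes a single disk seek followed by a sequential read. Fetching everyone's age, however, requires either repeated seeks or a full table scan, and most of the data read along the way is not needed.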

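With column-oriented storage, the data is organized on disk like this:

[Figure: column-oriented storage layout on disk]

Now fetching everyone's age only requires reading the age column.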
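
The difference can be tried out with plain files. The following is a toy sketch, not how ClickHouse actually stores data: it writes the sample table once as a row-oriented CSV file and once as one file per column, then reads all ages from each layout. The file names and the temporary directory are made up for illustration.

```bash
# Toy comparison of row-oriented and column-oriented layouts using plain files.
# File names are illustrative; ClickHouse's real on-disk format is more involved.
workdir=$(mktemp -d)

# Row-oriented: each line holds every attribute of one person.
printf '%s\n' '1,张三,18' '2,李四,22' '3,王五,34' > "$workdir/rows.csv"

# Column-oriented: one file per column, values kept in row order.
cut -d, -f1 "$workdir/rows.csv" > "$workdir/id.col"
cut -d, -f2 "$workdir/rows.csv" > "$workdir/name.col"
cut -d, -f3 "$workdir/rows.csv" > "$workdir/age.col"

# "All ages" from the row layout: every row is read and split, names included.
ages_from_rows=$(cut -d, -f3 "$workdir/rows.csv" | paste -sd' ' -)

# "All ages" from the column layout: only the age file is read.
ages_from_cols=$(paste -sd' ' "$workdir/age.col")

echo "row layout:    $ages_from_rows"
echo "column layout: $ages_from_cols"
echo "bytes read (rows):   $(wc -c < "$workdir/rows.csv")"
echo "bytes read (column): $(wc -c < "$workdir/age.col")"

rm -rf "$workdir"
```

Both layouts return the same ages, but the column layout touches only a fraction of the bytes, and that gap grows with the number of columns in the table.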

Quick Deployment

https://clickhouse.com/docs/en/install#quick-install

Quick Install

TIP
For production installs of a specific release version see the installation options down below.

On Linux and macOS:

  1. If you are just getting started and want to see what ClickHouse can do, the simplest way to download ClickHouse locally is to run the following command. It downloads a single binary for your operating system that can be used to run the ClickHouse server, clickhouse-client, clickhouse-local, ClickHouse Keeper, and other tools:

curl https://clickhouse.com/ | sh

  2. Run the following command to start the ClickHouse server:

./clickhouse server

The first time you run this script, the necessary files and folders are created in the current directory, then the server starts.

  3. Open a new terminal and use the clickhouse-client to connect to your service:

./clickhouse client

```response
./clickhouse client
ClickHouse client version 23.2.1.1501 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 23.2.1 revision 54461.

local-host :)
```

You are ready to start sending DDL and SQL commands to ClickHouse!

TIP
The Quick Start walks through the steps for creating tables and inserting data.
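
Besides the interactive prompt, the client can run a single statement and exit, which is handy for a first smoke test. A minimal sketch, assuming the binary downloaded in step 1 sits in the current directory; the helper name `run_query` and the `CLICKHOUSE_BIN` variable are made up for this example:

```bash
# Hypothetical helper: run one SQL statement through the single-binary client.
# CLICKHOUSE_BIN is an illustrative variable; it defaults to the binary
# downloaded into the current directory in step 1.
run_query() {
  "${CLICKHOUSE_BIN:-./clickhouse}" client --query "$1"
}

# Example calls (they require the server from step 2 to be running):
#   run_query "SELECT version()"
#   run_query "SHOW DATABASES"
```

Wrapping the call in a function keeps the binary path in one place, so the same helper still works if the binary is later moved or installed system-wide.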

Production Deployments

For production deployments of ClickHouse, choose from one of the following install options.

From RPM Packages

It is recommended to use official pre-compiled rpm packages for CentOS, RedHat, and all other rpm-based Linux distributions.

Set up the RPM repository

First, you need to add the official repository:

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo

Install ClickHouse server and client

sudo yum install -y clickhouse-server clickhouse-client

Start ClickHouse server

sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server
sudo systemctl status clickhouse-server
clickhouse-client # or "clickhouse-client --password" if you set up a password.
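
`systemctl status` shows whether the service started, not whether the server already accepts requests. One way to check the latter is to poll the HTTP interface, which answers `Ok.` on `/ping`. A rough sketch, assuming the default HTTP port 8123 and that curl is installed; the function name `wait_for_clickhouse` and its parameters are hypothetical:

```bash
# Hypothetical helper: poll the HTTP interface until the server answers.
# Assumes the default HTTP port 8123 and that curl is installed.
# Arguments (all optional): URL, number of attempts, delay in seconds.
wait_for_clickhouse() {
  url=${1:-http://localhost:8123/ping}
  attempts=${2:-10}
  delay=${3:-1}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # /ping answers "Ok." once the server accepts HTTP requests.
    if [ "$(curl -fsS --max-time 2 "$url" 2>/dev/null)" = "Ok." ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example call after "systemctl start clickhouse-server":
#   wait_for_clickhouse && echo "server is up"
```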

Install standalone ClickHouse Keeper

TIP

If you are going to run ClickHouse Keeper on the same server as ClickHouse server, you do not need to install ClickHouse Keeper separately, because it is included with ClickHouse server. This command is only needed on standalone ClickHouse Keeper servers.

sudo yum install -y clickhouse-keeper

Enable and start ClickHouse Keeper

sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper

# Deprecated method for installing rpm packages
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.com/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.com/rpm/clickhouse.repo
sudo yum install clickhouse-server clickhouse-client

sudo /etc/init.d/clickhouse-server start
clickhouse-client # or "clickhouse-client --password" if you set up a password.

You can replace stable with lts to use different release kinds based on your needs.

Then run these commands to install packages:

sudo yum install clickhouse-server clickhouse-client

You can also download and install packages manually from here.

From Tgz Archives

It is recommended to use official pre-compiled tgz archives for all Linux distributions, where installation of deb or rpm packages is not possible.

The required version can be downloaded with curl or wget from the repository https://packages.clickhouse.com/tgz/. The downloaded archives should then be unpacked and installed with their installation scripts. Example for the latest stable version:

LATEST_VERSION=$(curl -s https://packages.clickhouse.com/tgz/stable/ | \
    grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
export LATEST_VERSION

case $(uname -m) in
  x86_64) ARCH=amd64 ;;
  aarch64) ARCH=arm64 ;;
  *) echo "Unknown architecture $(uname -m)"; exit 1 ;;
esac

for PKG in clickhouse-common-static clickhouse-common-static-dbg clickhouse-server clickhouse-client clickhouse-keeper
do
  curl -fO "https://packages.clickhouse.com/tgz/stable/$PKG-$LATEST_VERSION-${ARCH}.tgz" \
    || curl -fO "https://packages.clickhouse.com/tgz/stable/$PKG-$LATEST_VERSION.tgz"
done

tar -xzvf "clickhouse-common-static-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-common-static-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-$LATEST_VERSION/install/doinst.sh"

tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh"

tar -xzvf "clickhouse-server-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-server-$LATEST_VERSION.tgz"
sudo "clickhouse-server-$LATEST_VERSION/install/doinst.sh" configure
sudo /etc/init.d/clickhouse-server start

tar -xzvf "clickhouse-client-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-client-$LATEST_VERSION.tgz"
sudo "clickhouse-client-$LATEST_VERSION/install/doinst.sh"

# Deprecated method for installing tgz archives
export LATEST_VERSION=$(curl -s https://repo.clickhouse.com/tgz/stable/ | \
    grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-client-$LATEST_VERSION.tgz

tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh

tar -xzvf clickhouse-common-static-dbg-$LATEST_VERSION.tgz
sudo clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh

tar -xzvf clickhouse-server-$LATEST_VERSION.tgz
sudo clickhouse-server-$LATEST_VERSION/install/doinst.sh
sudo /etc/init.d/clickhouse-server start

tar -xzvf clickhouse-client-$LATEST_VERSION.tgz
sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh

For production environments, it is recommended to use the latest stable version. You can find its number on the GitHub page https://github.com/ClickHouse/ClickHouse/tags, where it carries the postfix -stable.
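
The version-detection pipeline used in the install scripts above can be tried offline. The sketch below feeds it a made-up directory listing (the version numbers are sample data, not real releases) and contrasts `sort -V` with a plain lexical sort, which would pick the wrong entry:

```bash
# Made-up sample of what the directory listing might contain (not real data).
sample_listing='
<a href="clickhouse-server-23.2.4.12.tgz">clickhouse-server-23.2.4.12.tgz</a>
<a href="clickhouse-server-23.10.1.1976.tgz">clickhouse-server-23.10.1.1976.tgz</a>
<a href="clickhouse-server-23.9.2.56.tgz">clickhouse-server-23.9.2.56.tgz</a>
'

# Same pipeline as in the install script, fed from the sample instead of curl:
# extract every four-part version, sort as versions (newest first), keep one.
latest=$(printf '%s\n' "$sample_listing" \
  | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
echo "$latest"    # prints 23.10.1.1976

# For contrast, a plain lexical sort compares character by character,
# so "23.9" ranks above "23.10".
lexical=$(printf '%s\n' "$sample_listing" \
  | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | LC_ALL=C sort -r | head -n 1)
echo "$lexical"   # prints 23.9.2.56
```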

From Docker Image

To run ClickHouse inside Docker, follow the guide on Docker Hub. The official images use the official deb packages inside. The command below uses the Bitnami image (bitnami/clickhouse) instead:

docker run -itd  \
    --name clickhouse \
    --ulimit nofile=262144:262144  \
    --restart always \
    --net host \
    -u root \
    --volume clickdata:/bitnami/ \
    --privileged=true \
    --env ALLOW_EMPTY_PASSWORD=yes \
    bitnami/clickhouse:latest
# --volume clickdata:/bitnami/ must be mounted as a data volume; otherwise the container fails to start
# docker volume ls
DRIVER    VOLUME NAME
local     clickdata

# ll /var/lib/docker/volumes/clickdata/_data/clickhouse/
total 0
drwxrwxr-x 14 polkitd root 267 Apr  6 12:19 data
drwxrwxr-x  2 polkitd root   6 Apr  4 19:12 etc
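
Once the container is up, a quick way to confirm that the server inside it answers is to run a statement through `docker exec`. A minimal sketch, assuming the container name `clickhouse` from the `docker run` command above and assuming that `clickhouse-client` is on the PATH inside the image (not verified here); the helper name `query_container` and the `DOCKER_BIN` and `CONTAINER` variables are made up for this example:

```bash
# Hypothetical helper: run one SQL statement inside the container started above.
# Assumes the container is named "clickhouse" (as in the docker run command)
# and that clickhouse-client is on the PATH inside the image.
query_container() {
  "${DOCKER_BIN:-docker}" exec "${CONTAINER:-clickhouse}" \
    clickhouse-client --query "$1"
}

# Example call:
#   query_container "SELECT version()"
```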