What Is ClickHouse?
ClickHouse® is a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP). It is available as both open-source software and a cloud offering.
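ClickHouse is a column-oriented database management system (DBMS) that Yandex, a Russian company, open-sourced in 2016. It is mainly used for online analytical processing (OLAP) queries and can generate analytical reports in real time from SQL queries.
Source: https://blog.csdn.net/weixin_32265569/article/details/111822811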
What is OLAP?
OLAP scenarios require real-time responses on top of large datasets for complex analytical queries with the following characteristics:
- Datasets can be massive - billions or trillions of rows
- Data is organized in tables that contain many columns
- Only a few columns are selected to answer any particular query
- Results must be returned in milliseconds or seconds
What is columnar storage?
Take the following table as an example:
Id | Name | Age |
---|---|---|
1 | 张三 | 18 |
2 | 李四 | 22 |
3 | 王五 | 34 |
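With row-oriented storage, the data is laid out on disk row by row: 1, 张三, 18, 2, 李四, 22, 3, 王五, 34.
The advantage is that all attributes of one person can be read with a single disk seek followed by a sequential read. To get everyone's age, however, the database must either seek repeatedly or scan the whole table, and most of the data it passes over is not needed.
With column-oriented storage, the data is laid out on disk column by column: 1, 2, 3, 张三, 李四, 王五, 18, 22, 34.
Getting everyone's age now only requires reading the Age column.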
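The difference between the two layouts can be sketched in a few lines of bash. This is an illustration only and says nothing about how ClickHouse actually writes files to disk; the array names (row_layout, col_layout, and so on) are made up for the example.

```bash
#!/usr/bin/env bash
# Illustrative only: the three-row sample table, flattened in two ways.
ids=(1 2 3)
names=(张三 李四 王五)
ages=(18 22 34)

# Row-oriented layout: all fields of the first row, then the second, ...
row_layout=()
for i in "${!ids[@]}"; do
  row_layout+=("${ids[i]}" "${names[i]}" "${ages[i]}")
done

# Column-oriented layout: every Id, then every Name, then every Age.
col_layout=("${ids[@]}" "${names[@]}" "${ages[@]}")

echo "row layout:    ${row_layout[*]}"
echo "column layout: ${col_layout[*]}"

# Reading all ages from the row layout means stepping over the other cells.
ages_from_rows=()
for ((i = 2; i < ${#row_layout[@]}; i += 3)); do
  ages_from_rows+=("${row_layout[i]}")
done

# In the column layout the ages are one contiguous slice.
n=${#ids[@]}
ages_from_cols=("${col_layout[@]:2*n:n}")

echo "ages (row layout):    ${ages_from_rows[*]}"
echo "ages (column layout): ${ages_from_cols[*]}"
```

Both reads return the same values; the point is that the columnar read touches one contiguous range while the row-oriented read has to skip over Id and Name for every row.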
Quick deployment
https://clickhouse.com/docs/en/install#quick-install
Quick Install
TIP
For production installs of a specific release version, see the installation options below.
On Linux and macOS:
- If you are just getting started and want to see what ClickHouse can do, the simplest way to download ClickHouse locally is to run the following command. It downloads a single binary for your operating system that can be used to run the ClickHouse server, clickhouse-client, clickhouse-local, ClickHouse Keeper, and other tools:
curl https://clickhouse.com/ | sh
- Run the following command to start the ClickHouse server:
./clickhouse server
The first time you run this script, the necessary files and folders are created in the current directory, then the server starts.
- Open a new terminal and use the clickhouse-client to connect to your service:
./clickhouse client
```response
./clickhouse client
ClickHouse client version 23.2.1.1501 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 23.2.1 revision 54461.

local-host :)
```
You are ready to start sending DDL and SQL commands to ClickHouse!
TIP
The Quick Start walks through the steps for creating tables and inserting data.
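As a minimal sketch of what sending DDL and SQL from the shell can look like, the loop below passes three statements to the client with --query. It assumes the server from the previous step is still running and that the clickhouse binary is in the current directory. The people table and its schema are hypothetical; they reuse the sample rows from the columnar storage section.

```bash
#!/usr/bin/env bash
# Sketch: assumes a running local server and the single "clickhouse" binary
# in the current directory. Table name and schema are hypothetical.
CH=./clickhouse

statements=(
  "CREATE TABLE IF NOT EXISTS people (Id UInt32, Name String, Age UInt8) ENGINE = MergeTree ORDER BY Id"
  "INSERT INTO people VALUES (1, '张三', 18), (2, '李四', 22), (3, '王五', 34)"
  "SELECT avg(Age) FROM people"
)

if [ -x "$CH" ]; then
  # One statement per client call; stop at the first failure.
  for sql in "${statements[@]}"; do
    "$CH" client --query "$sql" || break
  done
else
  echo "clickhouse binary not found at $CH; nothing was run" >&2
fi
```

The last statement reads only the Age column, which is the access pattern a column-oriented engine is built for. Running the script twice inserts the sample rows twice.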
Production Deployments
For production deployments of ClickHouse, choose from one of the following install options.
From RPM Packages
It is recommended to use official pre-compiled rpm packages for CentOS, RedHat, and all other rpm-based Linux distributions.
Set up the RPM repository
First, you need to add the official repository:
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
Install ClickHouse server and client
sudo yum install -y clickhouse-server clickhouse-client
Start ClickHouse server
sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server
sudo systemctl status clickhouse-server
clickhouse-client # or "clickhouse-client --password" if you set up a password.
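Before opening the client, it can help to confirm that the server is actually answering. The following is a minimal sketch, assuming the default HTTP port 8123 has not been changed in the server configuration and that curl is installed. wait_for_clickhouse is a hypothetical helper name, not something shipped with ClickHouse.

```bash
#!/usr/bin/env bash
# Sketch: poll the HTTP interface until it answers.
# Assumes the default HTTP port 8123; wait_for_clickhouse is a made-up name.
wait_for_clickhouse() {
  local url=${1:-http://localhost:8123/ping}
  local attempts=${2:-10}
  local i
  for ((i = 1; i <= attempts; i++)); do
    # The /ping endpoint answers "Ok." when the server is up.
    if [ "$(curl -fsS "$url" 2>/dev/null)" = "Ok." ]; then
      echo "server is up after $i attempt(s)"
      return 0
    fi
    sleep 1
  done
  echo "server did not answer at $url" >&2
  return 1
}

# Usage: wait_for_clickhouse && clickhouse-client --query "SELECT version()"
```

Polling for a short while is useful because systemctl start can return before the server has finished loading its tables.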
Install standalone ClickHouse Keeper
TIP
If you are going to run ClickHouse Keeper on the same server as ClickHouse server you do not need to install ClickHouse Keeper as it is included with ClickHouse server. This command is only needed on standalone ClickHouse Keeper servers.
sudo yum install -y clickhouse-keeper
Enable and start ClickHouse Keeper
sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper
#Deprecated Method for installing rpm packages
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.com/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.com/rpm/clickhouse.repo
sudo yum install clickhouse-server clickhouse-client
sudo /etc/init.d/clickhouse-server start
clickhouse-client # or "clickhouse-client --password" if you set up a password.
You can replace stable with lts to use different release kinds based on your needs.
Then run these commands to install packages:
sudo yum install clickhouse-server clickhouse-client
You can also download and install packages manually from here.
From Tgz Archives
It is recommended to use official pre-compiled tgz archives for all Linux distributions where installation of deb or rpm packages is not possible.
The required version can be downloaded with curl or wget from the repository https://packages.clickhouse.com/tgz/. After that, the downloaded archives should be unpacked and installed with the installation scripts. Example for the latest stable version:
LATEST_VERSION=$(curl -s https://packages.clickhouse.com/tgz/stable/ | \
grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
export LATEST_VERSION
case $(uname -m) in
  x86_64) ARCH=amd64 ;;
  aarch64) ARCH=arm64 ;;
  *) echo "Unknown architecture $(uname -m)"; exit 1 ;;
esac
for PKG in clickhouse-common-static clickhouse-common-static-dbg clickhouse-server clickhouse-client clickhouse-keeper
do
  curl -fO "https://packages.clickhouse.com/tgz/stable/$PKG-$LATEST_VERSION-${ARCH}.tgz" \
    || curl -fO "https://packages.clickhouse.com/tgz/stable/$PKG-$LATEST_VERSION.tgz"
done
tar -xzvf "clickhouse-common-static-$LATEST_VERSION-${ARCH}.tgz" \
|| tar -xzvf "clickhouse-common-static-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-$LATEST_VERSION/install/doinst.sh"
tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION-${ARCH}.tgz" \
|| tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh"
tar -xzvf "clickhouse-server-$LATEST_VERSION-${ARCH}.tgz" \
|| tar -xzvf "clickhouse-server-$LATEST_VERSION.tgz"
sudo "clickhouse-server-$LATEST_VERSION/install/doinst.sh" configure
sudo /etc/init.d/clickhouse-server start
tar -xzvf "clickhouse-client-$LATEST_VERSION-${ARCH}.tgz" \
|| tar -xzvf "clickhouse-client-$LATEST_VERSION.tgz"
sudo "clickhouse-client-$LATEST_VERSION/install/doinst.sh"
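The version-selection pipeline and the architecture mapping used in the script above can be tried offline, without touching the repository. This is a sketch, assuming GNU sort (needed for -V). The listing is a made-up sample, and pick_latest_version and arch_suffix are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Sketch with a made-up directory listing; helper names are hypothetical.

# Read text on stdin and print the highest four-part version found in it.
pick_latest_version() {
  grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1
}

# Map a `uname -m` value to the suffix used in the archive names.
arch_suffix() {
  case $1 in
    x86_64) echo amd64 ;;
    aarch64) echo arm64 ;;
    *) echo "Unknown architecture $1" >&2; return 1 ;;
  esac
}

listing='clickhouse-server-23.2.1.1501-amd64.tgz
clickhouse-server-23.10.3.5-amd64.tgz
clickhouse-server-23.9.4.11-amd64.tgz'

latest=$(printf '%s\n' "$listing" | pick_latest_version)
echo "latest: $latest"                 # prints "latest: 23.10.3.5"
echo "suffix: $(arch_suffix x86_64)"   # prints "suffix: amd64"
```

sort -V compares version components numerically, so 23.10 ranks above 23.9. A plain lexical sort would pick 23.9.4.11 from this sample instead.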
#Deprecated Method for installing tgz archives
export LATEST_VERSION=$(curl -s https://repo.clickhouse.com/tgz/stable/ | \
grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-client-$LATEST_VERSION.tgz
tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh
tar -xzvf clickhouse-common-static-dbg-$LATEST_VERSION.tgz
sudo clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh
tar -xzvf clickhouse-server-$LATEST_VERSION.tgz
sudo clickhouse-server-$LATEST_VERSION/install/doinst.sh
sudo /etc/init.d/clickhouse-server start
tar -xzvf clickhouse-client-$LATEST_VERSION.tgz
sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh
For production environments, it's recommended to use the latest stable version. You can find its number on the GitHub page https://github.com/ClickHouse/ClickHouse/tags with the postfix -stable.
From Docker Image
To run ClickHouse inside Docker, follow the guide on Docker Hub. Those images use official deb packages inside.
docker run -itd \
  --name clickhouse \
  --ulimit nofile=262144:262144 \
  --restart always \
  --net host \
  -u root \
  --volume clickdata:/bitnami/ \
  --privileged=true \
  --env ALLOW_EMPTY_PASSWORD=yes \
  bitnami/clickhouse:latest
# --volume clickdata:/bitnami/ must be mounted as a named data volume; otherwise the container fails to start
# docker volume ls
DRIVER VOLUME NAME
local clickdata
# ll /var/lib/docker/volumes/clickdata/_data/clickhouse/
total 0
drwxrwxr-x 14 polkitd root 267 Apr 6 12:19 data
drwxrwxr-x 2 polkitd root 6 Apr 4 19:12 etc