2022-05-06

Setting up sslh as transparent proxy for a remote container

 I have an NGINX server that is publicly accessible. It has been deployed in the following manner:

  • Machine A
    • Port forwarding with socat: localhost:4443 ==>  0.0.0.0:443
  • Machine B
    • Running NGINX in a Docker container
    • Port forwarding by Docker: <container_ip>:443 ==> localhost:4443
    • Port forwarding by SSH to Machine A: localhost(B):4443 ==> localhost(A):4443
This generally works: Machine A is published under my domain, and traffic to port 443 is forwarded to NGINX in a few hops.

However there is a problem: the NGINX server never sees the real IP address of the client, so it is impossible to deploy fail2ban or other IP-address-based tools. So I wanted to fix that.


Step 1: VPN

The first step is to connect Machine A and Machine B with a VPN. I suspect it would also work without one, but the iptables rules could be trickier. 

WireGuard is my choice. I made a simple setup (a config sketch follows this list):
  • Machine A has IP: 10.0.0.2/24
  • Machine B has IP: 10.0.0.1/24
  • On both machines, the interface is called wg0, AllowedIPs of the other peer is <other_peer_ip>/32 
  • wg-quick and systemd are used to manage the interface.
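
A minimal sketch of the two wg-quick config files (keys and the endpoint are placeholders; I'm assuming Machine A is the publicly reachable side):

# /etc/wireguard/wg0.conf on Machine A
[Interface]
Address = 10.0.0.2/24
ListenPort = 51820
PrivateKey = <machine_a_private_key>

[Peer]
PublicKey = <machine_b_public_key>
AllowedIPs = 10.0.0.1/32

# /etc/wireguard/wg0.conf on Machine B
[Interface]
Address = 10.0.0.1/24
PrivateKey = <machine_b_private_key>

[Peer]
PublicKey = <machine_a_public_key>
AllowedIPs = 10.0.0.2/32
Endpoint = <machine_a_public_address>:51820
PersistentKeepalive = 25

# On both machines:
systemctl enable --now wg-quick@wg0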

Step 2: Machine A

Configure sslh:

sslh --user sslh --transparent --listen 0.0.0.0:443 --tls 10.0.0.1:4443

This way sslh will create a transparent socket that talks to Machine B. When the reply packets come back, we need to redirect them to the transparent socket:

# Packets that belong to a transparent socket (i.e. sslh's upstream connections)
# are marked and handed to routing table 100, which delivers them locally.
iptables -t mangle -N MY-SERVER
iptables -t mangle -I PREROUTING -p tcp -m socket --transparent -j MY-SERVER
iptables -t mangle -A MY-SERVER -j MARK --set-mark 0x1
iptables -t mangle -A MY-SERVER -j ACCEPT
ip rule add fwmark 0x1 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100

Here I'm matching every packet that belongs to a transparent socket, which is OK because sslh is the only program on this machine that creates such sockets.

Step 3: Machine B

Now Machine A will start forwarding packets whose source address is that of the real HTTP client, not Machine A's. However, WireGuard on Machine B will drop them because of AllowedIPs. 

To unblock:

wg set wg0 peer MACHINE_A_PUB_KEY allowed-ips 10.0.0.2/32,0.0.0.0/0

Note that I cannot simply add 0.0.0.0/0 to AllowedIPs in the conf file, because wg-quick would then automatically install a matching route.
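
A sketch of one way to make the widened AllowedIPs survive restarts: keep the conf file narrow and widen it only after wg-quick has installed its routes (PostUp is a standard wg-quick option; MACHINE_A_PUB_KEY stands for the real public key):

# In the [Interface] section of /etc/wireguard/wg0.conf on Machine B
PostUp = wg set wg0 peer MACHINE_A_PUB_KEY allowed-ips 10.0.0.2/32,0.0.0.0/0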

My Linux distro and Docker already set up some good default values for forwarding traffic towards containers:
  • IP forwarding is enabled
  • -j DNAT is set to translate the destination IP address and port.
Now NGINX can see the real IP addresses of clients. It will also send response traffic back to those real IPs, so I need to make sure that this traffic is sent back via Machine A.

Note that if NGINX proactively initiates traffic to the Internet, I still want it to go through the default routing on machine B. But I suppose it is also OK to route all traffic to machine A if preferred/needed.

iptables -N MY-SERVER
# Tag incoming traffic towards NGINX
iptables -I FORWARD -i wg0 -o docker0 -m conntrack --ctorigdst 10.0.0.1 --ctorigdstport 4443 -j MY-SERVER
iptables -A MY-SERVER -j CONNMARK --set-xmark 0x01/0x0f
iptables -A MY-SERVER -j ACCEPT
# Tag response traffic from NGINX
iptables -t mangle -I PREROUTING -i docker0 -m connmark --mark 0x01/0x0f -j CONNMARK --restore-mark --mask 0x0f

# Route all tagged traffic via wg0
ip rule add fwmark 0x1 lookup 100
ip route add 0.0.0.0/0 dev wg0 via 10.0.0.2 table 100

Now everything should work.

Notes

I mainly referred to sslh's official guide. I also referred to a few other sources such as the Arch Wiki. 

In practice, some instructions did not apply to my case:

  • I did not need to grant CAP_NET_RAW or CAP_NET_ADMIN to sslh, although they are mentioned in the sslh docs and manpage. Maybe the sslh package already handles this automatically.
  • On Machine A I did not need to enable IP forwarding. This actually makes sense, because the routing happens on Machine B.
  • I did not need to enable route_localnet on Machine A.

2022-04-02

Home Server Tinkering

Weeks ago I purchased a secondhand machine. Since then I have been tinkering with this little box.

The Perfect Media Server site is a good place to start. The Arch Linux Wiki is my go-to learning resource, even though I use Ubuntu.

Filesystem

I wanted to be super paranoid and careful, as this is my first time manually configuring a disk array. Basically my options include:

  • ZFS
  • btrfs
  • SnapRAID (possibly combined with ZFS/btrfs)
  • Unraid
My considerations include:
  • Data integrity, which is the most important.
  • Maintenance. I want everything to be easy to set up and maintain.
  • Popularity. There will be more docs/tutorials/discussions if the technology is more popular.
Eventually I decided to use ZFS with raidz2 on 4 disks. 

I also took this chance to learn how to configure disk encryption. I decided to use LUKS beneath ZFS. I could have just used ZFS's built-in encryption, but I thought LUKS would be fun to learn. It really was; the commands are way more user-friendly than I had expected.
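
A rough sketch of the commands involved (device names, mapper names and the pool name are placeholders, not my actual layout):

# Encrypt each disk with LUKS, then open it
cryptsetup luksFormat /dev/disk/by-id/ata-DISK1
cryptsetup open /dev/disk/by-id/ata-DISK1 crypt1
# ... repeat for the other three disks ...

# Create a raidz2 pool on top of the opened mappings
zpool create -o ashift=12 tank raidz2 \
  /dev/mapper/crypt1 /dev/mapper/crypt2 /dev/mapper/crypt3 /dev/mapper/crypt4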

Hardening SSH

Most popular best practices include (a config sketch follows the list):
  • Use a non-guessable port.
  • Use public key authentication and disable password authentication.
  • Optionally add OTP authentication (e.g. Google Authenticator).
  • Set up chroot and command restrictions where applicable, e.g. for backup users.
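
A sketch of the corresponding sshd_config entries (the port number, group name and chroot path are just examples):

# /etc/ssh/sshd_config
Port 22222
PasswordAuthentication no
PubkeyAuthentication yes
PermitRootLogin no

# Restrict a dedicated backup group to SFTP inside a chroot
Match Group backup
    ChrootDirectory /srv/backup/%u
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no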

Various Routines

  • Set up remote disk decryption via SSH, with dropbear.
  • Set up mail/postfix, so I will receive all kinds of system errors/warnings. E.g. from cron.
  • Set up ZED. Schedule scrubbing with sanoid.
  • Set up samba and other services.
  • Set up backup routines.

Containers

I also took the chance to learn about Docker, and tried a couple of images. Not all of them turned out to be useful, but a few very much were:
  • Grafana + Prometheus. Monitoring the system, UPS, air quality, etc.
  • Photoprism. Managing personal photos.
  • Pi-hole. Well, I do have it running on my Pi, but I guess it's nice to have another option.
  • Hosting GUI software with web access, e.g. Firefox.
However there may be security concerns. See below.

Security Considerations

While I'd like to run useful software and services, I also want to keep my data safe. 

I want to protect my data from two scenarios:
  • Malicious/untrusted code. I have heard plenty of news about malicious NPM packages in the last few years.
  • Human/script errors. It happened with a popular package, where a whitespace was unintentionally added to the install script, so that the command became "rm -rf / usr/lib/...". Horrible. For similar reasons, I don't trust scripts that I wrote myself either.
At this moment I am not worried about DoS attacks.

User and File Permissions

The easiest option is to use different users for different tasks. Avoid using root when possible. Also limit the resources that each user can access. 

This is a natural choice when I want other devices to back up data to my server. It is also useful when I need to run code in a "sandbox-like environment". This is explained well in the Gentoo Wiki.

There are two issues with this approach:
  1. It is not really a sandbox. It is straightforward to prevent a user from reading/writing certain files, but it's trickier to limit other resources, like network, memory, etc.
  2. It is tricky to maintain permissions for multiple users, especially when they need to access the same files with different scopes. ACLs are better than the classic Linux permission bits, yet they can still become too complicated. I believe that complicated rules equal security holes.

I created a dedicated user for each Docker image, but it was not enough. More on this below.


Mandatory Access Control (MAC)

Examples include AppArmor and SELinux.

Funnily enough, years ago I thought AppArmor was quite annoying because it kept showing popup messages. Now I proactively write AppArmor profiles.

I decided to write AppArmor profiles for my Docker images and for all the scripts in my crontab. I feel more assured knowing that my backup scripts cannot silently delete all my data.
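
A bare-bones sketch of what such a profile can look like (the script path and data directories are hypothetical; in practice I refine profiles with aa-logprof):

# /etc/apparmor.d/home.me.bin.backup.sh
#include <tunables/global>

/home/me/bin/backup.sh {
  #include <abstractions/base>
  #include <abstractions/bash>

  /bin/bash rix,
  /usr/bin/rsync rix,
  /home/me/bin/backup.sh r,

  # The script may read the data, but may only write to the backup target
  /data/** r,
  /backup/** rw,
}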

Sandboxing

I had thought that chroot was a nice security tool, until I learned that it isn't. I found a couple of sandboxing options on the Arch Wiki.

However I don't see them fitting well in my case. They should work for my scripts, but a dedicated user + MAC sounds simpler to me. I also want to protect against malicious install scripts (e.g. from NPM packages), and I feel that firejail/bubblewrap won't help much there.

Sandboxing seems most useful for heavyweight software like web browsers. However I also need to access such software remotely, so I'd just go for containers or VMs.

I suppose I may find useful scenarios later.

Docker / Container / VM

I use Docker when
  • User + MAC is not enough.
  • I do not trust the code.
  • It is difficult to deploy to the host.
Security best practices include (see the example after this list):
  • Do not run as root.
  • Drop all unnecessary capabilities. (Most images don't need any at all.)
  • Set no-new-privileges to true.
  • Apply AppArmor profiles.
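
For example, a hardened docker run invocation roughly follows this shape (the image name, user ID and profile name are placeholders):

docker run -d \
  --user 1000:1000 \
  --cap-drop ALL \
  --security-opt no-new-privileges:true \
  --security-opt apparmor=my-nginx-profile \
  some-image:latest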
[UPDATE: Obviously I did not see the whole story. Added a new section below]
I was quite surprised when I learned that root@container == root@host, unless I'm running rootless docker. What's worse, almost all docker images that I found use root by default.

While I managed to run most containers without root, many GUI-related containers really want to run as root. I really hate that, and I started looking for rootless options.

Instead of multiple GUI containers, I decided to run an entire OS. This will be my playground, which has no access to my data. Docker is not designed for this task, although it could probably still do the job if configured correctly.

VMs (e.g. VirtualBox) are my last resort. They are quite laggy on my box, and it is tricky to balance the load dynamically, e.g. I have to specify the maximum CPU/RAM beforehand.

I learned that Kata Containers is good for this task. It is fast and considered very secure. However I didn't find an easy way of deploying it. (Somehow I don't like Snap and have disabled it on my machine. Now that Snap is almost required by Kata Containers, LXD and Firefox, maybe I should give it a go some time?)

Eventually I turned to LXC. It was quite easy to deploy an Ubuntu box, and I am very happy with the toolchain and the design choices. For example, the root filesystem of a container is exposed as a plain directory tree on the host, instead of a (virtual disk) image.

[UPDATE] Containers without root@host

I really dislike that processes in containers can run as root@host. Therefore I was looking for "rootless" options. There are in fact two:

  1. The container daemon runs as root; containers run as non-root.
  2. Both the container daemon and the containers run as non-root.

#1 means turning on user namespace mapping for containers in Docker or LXC. #2 means additionally configuring the Docker or LXC daemon itself to run without root.
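
For #1 with Docker, the daemon-level switch is a one-liner (the value "default" makes Docker create and use a dedicated dockremap user; a specific user:group can be given instead):

# /etc/docker/daemon.json
{
  "userns-remap": "default"
}

# Then restart the daemon:
systemctl restart docker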

#2 seems more secure, but it requires another kernel feature, CONFIG_USER_NS_UNPRIVILEGED, which might have security concerns of its own. So, funnily enough, it is both "more secure" and "less secure" than #1.

Without knowing much more in depth, I'm slightly leaning towards #1. I will keep an eye on #2 and maybe switch to it once the security concerns are resolved.

What's Next

Probably I will try to improve the box so that services/containers reload on failure/reboot. Maybe systemd is enough, or maybe I need something like Kubernetes or Ansible. Or maybe I can live well without them.


2022-03-29

Fix broken sudoers files

Lesson learned today: an invalid sudoers file can break the sudo command, which in turn prevents the sudoers file from being edited via sudo. 

The good practice is to always use visudo to modify sudoers files. In my case I needed to modify a file inside /etc/sudoers.d, where I should have used `visudo -f`.
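
For example (the file name is just an illustration):

# Edit a drop-in file with syntax checking
visudo -f /etc/sudoers.d/my-rule

# Check an existing file without opening an editor
visudo -c -f /etc/sudoers.d/my-rule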


To recover from invalid sudoers files, it is possible to run `pkexec bash` to gain root access. However I got an error "polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie"


Solution to this error:

Source: https://github.com/NixOS/nixpkgs/issues/18012#issuecomment-335350903

- Open two terminals. (tmux also works)

- In terminal #1, get PID by running `echo $$`

- In terminal #2, run `pkttyagent --process <PID>`

- In terminal #1, run `pkexec bash`



2022-01-16

On Data Backup

Around 2013, I would burn all my important data onto a single 4.7GB DVD every year. Nowadays I have ~5TB of data, and I don't even bother optimizing away 5GB of it.

Background

I realized that it is time to think about backups. I guess I have seen enough signs.
  • The NAS shows that the disks are quite full.
  • I just happened to see articles and videos about data backup.
  • I found corrupted data on my old DVDs.
  • I realized that most of my important data is not properly backed up.
  • I have a few scripts that manage different files, and they might contain bugs.
The goal is to have good coverage under acceptable cost.

The Plan

All my data are categorized into 4 classes.

Class 1: Most Important + Frequently Accessed

Roughly ~50GB in total. Average file size is ~5MB. 
Examples include official documents, my artworks and source code.

The plan: sync to multiple locations to maximize robustness. Sometimes I choose a smaller subset when I don't have enough free space.
In case some copies are down/corrupted, I can still access the data quickly.

Class 2: Important + Frequently Modified

Roughly ~500GB in total. Average file size is ~500KB.
Typically there are groups of small files, which must be used together.
Examples include source code, git repo and backup repo.
Note that it overlaps with Class 1.

The plan: hot backup with versioning/snapshots,  yearly cold archives.

Class 3: Important + Frozen

Roughly ~1TB in total. Average file size is ~50MB.
Frozen means they are never (or at least rarely) changed once created.
Most data of this class are labeled, for example /Media/Video/2020/2020-01-03.mp4
Examples include raw GoPro footage.
Note that it overlaps with Class 1.

The plan: hot backup with versioning/snapshots; labeled data is synced directly to cold storage, unlabeled data goes into the yearly cold archives.

Class 4: Unimportant

The rest of the data is not important. I wouldn't worry too much if it were lost, but I'm happy to keep it at minimal cost.
Examples include downloaded Steam games.

The plan: upload some of it to hot backup storage, should I have enough quota. 
No cold archive is planned.

Thoughts

I put a lot of thought into designing the plan, and I'm happy with the result. 
On the other hand, I had quite a few headaches throughout the process. 
To name a few:


Hot Backup or Cold Archive
I had a hard time choosing between hot backups and cold archives. Hot backups are more up to date, but cold archives are safer.

Originally I had planned to use only one per data class (and the classes were defined slightly differently). But I just couldn't decide. 

The decision is to try both then revisit later.


Format of Cold Archives
There are two possibilities:
  1. Directly uploading the files, with the same local file structure
  2. Create an archive and upload it. This also includes chunk-based backup methods.
Note that cold storage is special:
  • Files on cold storage cannot be modified/moved/renamed, or more precisely, it is expensive to do so. 
  • There is typically a cost per API call/object, so there is a penalty for having too many small objects.
With option 1 I can easily access individual files on the storage, but if I rename or move some files locally, it will be a disaster in the next backup cycle.
With option 2 there is no problem with too many files, but I have to download a whole archive in order to access a single file inside it. Also I will need to make sure that the archives do not overlap (too much), or they will just waste space.

My solution is to organize and label the data, mostly by year. The good news is that most frozen data can be labeled this way, and it mostly consists of large files. This makes it reasonably safe to upload them directly: files may be added or removed, but they are unlikely to be renamed or modified.

For unlabeled data, I'd just create archives every year; their size is small compared with the labeled data, so I wouldn't worry about it.


Format of Hot Backups
There are also two possibilities:
  1. File-based. Every time a file is modified or removed, the old version is saved somewhere else.
  2. Chunk-based. All files are broken into chunks and stored as such, like git repos.
There are lots of things to consider, e.g. size, speed, safety/robustness, and ease of access.

The decision is to go for #1 for all relevant data classes. My thoughts are:
  • In the worst case, a whole chunk-based repo may be affected by a few rotten bits. This is not the case for file-based solutions.
  • I want to be able to access individual files without special tools.
  • Benefits of chunk-based approaches include deduplication and smaller sizes (mostly for changed files). But that does not really apply to my data: most of my big files are large video files, which are rarely changed and cannot be compressed much.
On the other hand, I do plan to try out some chunk-based software in the future.


Backup for Repos
In my NAS I have a few git repos and (chunk-based) backup repos. So how should I back them up?

On one hand, the repos already keep versions of the source files, so simply syncing them to cloud storage should work well enough.
On the other hand, should there be local data corruption, the cloud version will also be damaged after one data sync.

The decision is to keep versions in the repo backups as well. Fortunately they are not very big.
I plan to revisit this later. And hopefully I never need to recover a repo like this. 



Backup Storage

I don't have enough spare HDDs to back up all my data. Anyway, I prefer cloud storage for this task.
Hot and cold storage need to be discussed separately.


Hot Backup

Most cloud storage services would work well as a hot backup repo. In general the files are always available for reading and writing, which makes them suitable for simple rsync-like backups or chunk-based backups.

It is not too difficult to fit Class 1 data into free quota, although I do need to subset it.

It is trickier to choose one service for the other classes, as I'd like to keep all backup data together. 
I have checked a number of services and found the following especially interesting.
  • Backblaze B2
  • Amazon S3
  • Google Cloud Storage
  • Google Storage
I'd just pick one while balancing cost, speed and reputation etc. I wouldn't worry too much about software support, since all of them are popular.


Cold Archive

I only learned recently about cold archives, from Jeff Geerling's backup plan. After some reading I find the concept really interesting.

I'd mostly narrow it down to the following:
  • Amazon S3 Glacier Deep Archive
  • Google Archival Cloud Storage
  • Azure Archive
I remember also seeing similar storage classes from Huawei and Tencent, but their support among the open source tools I have found is not as good.


Software

I'd like to manage all backup tasks on my Raspberry Pi. 

rclone, the so-called Swiss army knife for cloud storage, is an easy winner. I just couldn't find another tool that matches it. The more I learn about the tool, the more I like it. To name a few observations:
  • Great coverage on cloud storage providers.
  • Comprehensive and well-written documents.
  • Active community.
  • Lots of useful features and safety checks.
  • Outputs both human-readable and machine-readable information.
So I ended up writing my own scripts that call rclone. It is easy to express a task like "copy all files from A to B, and in case some files in B would be modified, save a copy in C". So I just needed to focus on defining my tasks, scoping the data and setting up the routines. It is not trivial though; more on that later.
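
As an illustration, that kind of task maps to a single rclone invocation (remote and path names are placeholders):

# Mirror A to B; anything that would be overwritten or deleted in B
# is moved into a dated directory under C instead
rclone sync /data/A remote:B --backup-dir remote:C/$(date +%F)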

I also spent quite some time researching chunk-based backup tools, mainly BorgBackup, restic and Duplicacy. I also checked a few others, but not as extensively as these three. (I also found a few useful comparison lists online.)
I pulled myself out of the rabbit hole as soon as I realized that I don't need them at the moment, though I still cannot decide which one I would use should I need a chunk-based backup today. Here's a summary of my 2-page notes on these tools:
  • BorgBackup is mature (it descends from Attic, which dates back to around 2010), but has limited backend support.
  • restic is relatively new (first GitHub commit in 2014). It used to have performance issues with pruning, which seem to have been fixed. The repository format is not finalized (yet).
  • Duplicacy is even younger (first GitHub commit in 2016). The license is not a standard one, which concerns many people. Thanks to its lock-free design it benchmarks faster than the others, especially when multiple clients connect to the same repo. However it might waste some space to achieve that.
Maybe things will change in a few years. I will keep an eye on them.


Technical Issues

I had quite a few issues when using OneDrive + WebDAV.
  • Limit on max path length
  • Limit on max file length
  • No checksums
  • No quota metrics.
Fortunately most of them are not big problems.

Another issue, with 7-Zip, is that I cannot add empty directories to an archive without also adding the files inside those directories. This is particularly important for my yearly cold archives.

Eventually I used Python and tarfile to achieve this. I could probably do the same with a 7z Python library, but I used tarfile because it is part of the Python standard library, plus I realized that most of the archives cannot be compressed effectively anyway.
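
A minimal sketch of the trick (paths are hypothetical; my real script does more filtering):

# Add directory entries (including empty ones) to a tar archive
# without pulling in the files they contain.
import tarfile

dirs = ["photos/2021", "photos/2021/raw"]  # hypothetical directory list

with tarfile.open("archive-2021.tar", "w") as tar:
    for d in dirs:
        info = tarfile.TarInfo(name=d)
        info.type = tarfile.DIRTYPE
        info.mode = 0o755
        tar.addfile(info)  # directory entry only, no contents
    # Selected files can then be added explicitly, e.g.:
    # tar.add("photos/2021/index.txt", recursive=False)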


Next Steps

I will probably add a few more scripts to monitor and verify the backups, for example downloading and verifying ~10GB of data randomly selected from the backup repo.
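
One possible shape for such a check, using rclone (remote and local paths are placeholders, and I'm assuming the original data still lives locally under /data; --download makes rclone compare actual file contents rather than just hashes; the sample here is by file count rather than by size, for simplicity):

# List all files in the backup, pick a random sample, then verify them
rclone lsf -R --files-only remote:backup > all_files.txt
shuf -n 200 all_files.txt > sample.txt
rclone check /data remote:backup --files-from sample.txt --download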

I will also keep an eye on chunk-based solutions.

2021-11-11

KVM Software for Windows

KVM here just means sharing a mouse and keyboard across machines.

I used to use Synergy between Windows and Ubuntu, but when I checked recently I found that it has become paid software.

There is a fork on GitHub called Barrier. After downloading it I spent a long time on configuration, but it never worked well. It may have something to do with my monitor setup: one of the computers is connected to multiple monitors, and some of them are disabled. In the end I gave up.

Later I found Microsoft's Mouse without Borders, which worked after a simple setup. Not bad at all.

The URL is http://aka.ms/mm

2021-11-01

Euclidea 14.5: Solution and Proof

Euclidea 14.5

As shown in the figure, three pairwise tangent circles O, O_1 and O_2 are given. The three centers are collinear.

Task: with compass and straightedge, construct a circle tangent to all three given circles.



Construction:


  1. Let A be the point of tangency of circle O and circle O_1.
  2. Draw segment OA.
  3. Through O, draw the line perpendicular to OA, meeting circle O at D.
  4. Draw the circle centered at D with radius DA, meeting circle O_1 at E and circle O_2 at F.
  5. Draw lines EO_1 and FO_2; they meet at G.
  6. Draw the circle centered at G with radius GE.
  7. Circle G is the required circle.



Sketch of proof:
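
(A quick reminder of the standard inversion facts used below: an inversion with center A and radius r maps a point P ≠ A to the point P' on ray AP with |AP|·|AP'| = r^2; circles and lines not through A map to circles, while circles through A map to lines not through A and vice versa; a circle orthogonal to the inversion circle maps to itself; and tangency is preserved.)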

  1. As shown in the figure, let the points of tangency of the three circles be A, B and C. Clearly A, B, C, O, O_1 and O_2 all lie on one line.
  2. Apply an inversion centered at A, choosing the inversion circle orthogonal to circle O_2. Then circle O_2 is unchanged by the inversion, and B and C are inverses of each other.
  3. Through B and C respectively, draw the lines L_B and L_C perpendicular to AB. It is easy to see that circles O_1 and O map to L_B and L_C respectively under the inversion.
  4. As in the figure, draw circle G' tangent to circle O_2, L_B and L_C, touching L_B at E' and circle O_2 at F'.
  5. Draw the line L through E' and F'. It is easy to show that C lies on L and that ∠E'CB = 45°.
  6. Let circle D' be the image of L under the inversion, and consider its properties:
    1. Line AD' is perpendicular to L.
    2. Circle D' passes through A (because L does not pass through A).
    3. Circle D' passes through B (because L passes through C, and B and C are inverses of each other).
  7. Hence circle D is exactly circle D', and therefore E and E', as well as F and F', are pairs of inverse points.
  8. Let circle G'' be the image of circle G' under the inversion. Since circle G' is tangent to L_B and circle O_2 at E' and F', circle G'' is tangent to circle O_1 and circle O_2 at E and F.
  9. Therefore G'' is the intersection of lines EO_1 and FO_2, and the radius of circle G'' is G''E. Hence circle G is exactly circle G''.
  10. Since circle G' is tangent to L_B, circle O_2 and L_C, circle G is tangent to all three given circles. QED.

Reference: Pappus chain


2021-04-28

Music Treasure Hunting, Again

Every now and then, a melody pops into my head out of nowhere, uninvited. It feels very familiar, and it is certainly not something I made up myself, but I just cannot recall its title, the singer, or where I heard it.

The easiest case is when I remember some of the lyrics, or can find it by humming. The hardest is probably soundtracks from films and shows; I think a successful soundtrack makes you remember the "feeling" of the moment rather than the details of the tune itself.

Following the last round of music treasure hunting, another melody showed up and left me unable to eat or sleep in peace.

After several days of searching I finally found it: 《Up Side Down 永遠の環》 by 井上昌己, an ED of Saint Tail (圣少女). The process was quite interesting.

1. I roughly remembered the melody of the intro and the overall rhythm of the lyrics, something like a cipai (词牌). But several attempts at humming-based search all returned nothing.
2. I vaguely remembered some lyric fragments and searched for them on various sites. It turned out that the fragments I remembered were wrong.
3. I "felt" that it was the ED of a Japanese anime, so I went through the OPs/EDs of Japanese anime imported in the 80s and 90s, and of those popular between 2000 and 2008, without finding it. This part was tricky: Saint Tail had indeed been imported, but only selected OPs and EDs were included. I also went through the Japanese version, checking the OPs/EDs of the first few and last few episodes; I never expected the one I wanted to be in the middle.
4. I "felt" that the anime was about magical girls, so I went through the music collections of Cardcaptor Sakura and Marybell, and found nothing. I actually did skim the Saint Tail collection at that time as well, but somehow missed it.
5. I then relaxed the constraints and searched the music collections of similar anime. I did not find the tune, but unexpectedly found 《優しい心》 from Oh My Goddess! (我的女神), which was also a melody that had popped into my head before. I also came across 《願い》 from the same series, which felt similar to what I was looking for, although in the end they were not similar at all, apart from both being cheerful tunes.
6. Then I gave up. I had tried everything I could and left the rest to chance.
7. Later I adjusted my thinking and wondered whether it might belong to some other category altogether, so I went through Teresa Teng's (邓丽君) Japanese songs, and even 《梅兰梅兰我爱你》, again with no result.
8. In the end it really did come down to chance: I went through the Saint Tail music collection once more and there it was.

The two things I find most interesting about this process:
First, the "feelings" in my memory: Japanese anime, ED, magical girls. These memories were stronger than the tune itself, which also made the search harder.
Second, the uncertain parts of my memory, including the melody and the lyrics. I had imagined several possible instrument combinations and lyric rhymes, all of which felt plausible; in the end that part of my memory turned out to be inaccurate.