Unix: pass and gpg blog
parent 26f0543d62
commit 9373a15b90
2 changed files with 386 additions and 26 deletions
|
@@ -8,22 +8,22 @@ heroText: 'Base64 LLama'
|
|||
# The Secret Learnings of Llamas
|
||||
|
||||
Tool use by llamas is an active area of research. Recent implementations like
|
||||
Devin promise great productivity increases through tool use. I was investigating
|
||||
tool use by some modern llamas, when I made an unfortunate discovery.
|
||||
[Devin](https://www.cognition-labs.com/introducing-devin) promise great
|
||||
productivity increases, just by allowing llamas to interact with more tools. I
|
||||
was investigating this in some modern llamas, when I made an unfortunate
|
||||
discovery.
|
||||
|
||||
It appears most large llamas have learned a new language, in addition to the
|
||||
ones that were intended: base64.
|
||||
|
||||
### Base64 Background
|
||||
## Base64 Background
|
||||
|
||||
Base64 is a simple encoding scheme. This is different from encryption and
|
||||
hashing, as those provide security, while base64 just transforms data into a
|
||||
portable form.
|
||||
Base64 is a simple encoding scheme. It takes in a stream of bytes and converts
|
||||
them into a plain-text representation.
|
||||
|
||||
Each byte is 8 bits. This means there are 2^8 (256) possible bytes, since each
|
||||
bit contributes 2 states. Base64 encodes such that each bytes only stores 2^6
|
||||
(64) possible states, but this makes the vocabulary much smaller. With just 64
|
||||
letters and numbers, it can hold 64 states per character.
|
||||
bit contributes 2 states. Base64 only uses plain-text encoding, so it only
|
||||
stores 2^6 (64) possible states per character.
|
||||
|
||||
Let's visualize how base64 works. Say we have the following word:
|
||||
|
||||
|
@@ -31,10 +31,11 @@ Let's visualize how base64 works. Say we have the following word:
|
|||
Hello
|
||||
```
|
||||
|
||||
This has a utf-8 encoding below. I used the `ord` function in python to get the
|
||||
numbers in the `Base 10` row. I then converted the base 10 representations to
|
||||
octal (base 8) and binary (base 2). The bottom two rows are the same, but the
|
||||
spacing makes it easier to see the direct mapping from octal to binary:
|
||||
We can convert each letter to a number using [utf-8 encoding
|
||||
tables](https://en.wikipedia.org/wiki/UTF-8#Codepage_layout) or the `ord()`
|
||||
function in python. I then converted the base 10 representations to octal (base
|
||||
8) and binary (base 2). The bottom two rows are the same, but the spacing makes
|
||||
it easier to see the direct mapping from octal to binary:
|
||||
|
||||
```
|
||||
Letters: H e l l o
|
||||
|
@@ -55,8 +56,7 @@ Now we can map in reverse:
|
|||
|
||||
```
|
||||
Base 2 (spaced): 000 001 000 000 001 100 101 001 101 100 001 101 100 001 101 111
|
||||
Base 8 (spaced): 0 1 0 0 1 4 5 1 5 4 1 5 4 1 5 7
|
||||
|
||||
Just changing the spacing...
|
||||
Base 2 (spaced): 000001 000000 001100 101001 101100 001101 100001 101111
|
||||
Base 8 (spaced): 01 00 14 51 54 15 41 57
|
||||
Base 64 (spaced): B A M p s N h v
|
||||
|
@@ -65,7 +65,7 @@ Base 64 (spaced): B A M p s N h v
|
|||
So we can encode the word `Hello` as `BAMpsNhv` in base64! Base64 is often used
|
||||
to encode images and other binary data to store in JSON. It is not space
|
||||
efficient, taking up more space than it should, but it's entirely made of
|
||||
printable characters.
|
||||
printable characters!
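
As a quick cross-check, here is a minimal sketch using Python's standard `base64` module. Note that real base64 regroups the raw 8-bit bytes into 6-bit chunks and pads the tail with `=`, so the library's output for `Hello` is `SGVsbG8=`; the hand-spaced walkthrough above is best read as an illustration of the regrouping idea rather than the exact output of a base64 tool.

```python
import base64

word = "Hello"
raw = word.encode("utf-8")  # b'Hello', 5 bytes = 40 bits

# Standard base64: 6 bits per output character, '=' padding at the end.
encoded = base64.b64encode(raw).decode("ascii")
print(encoded)  # SGVsbG8=

# Decoding is the exact inverse, which is why base64 provides no security.
assert base64.b64decode(encoded) == raw
```

The 8 output characters for 5 input bytes also show the space overhead mentioned above.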
|
||||
|
||||
## Base64 Llamas
|
||||
|
||||
|
@@ -83,7 +83,7 @@ echo 'how are you today?' | base64
|
|||
```
|
||||
|
||||
Then ask a llama about `aG93IGFyZSB5b3UgdG9kYXk/Cg==` or whatever other string
|
||||
you want. You'll notice that they break down after a about 10-20 characters,
|
||||
you want. You'll notice that they break down after about 10-20 characters,
|
||||
depending on how good the llama is.
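
To check a llama's answer against the ground truth, you can decode the string yourself. This is just a small sketch with Python's standard library, not part of the original experiment:

```python
import base64

# The string produced by the `echo ... | base64` command above.
encoded = "aG93IGFyZSB5b3UgdG9kYXk/Cg=="

decoded = base64.b64decode(encoded).decode("utf-8")
print(repr(decoded))  # 'how are you today?\n' (echo appends the newline)
```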
|
||||
|
||||
|
||||
|
@@ -107,13 +107,10 @@ Encode "emiliko@mami2.moe" into base64.
|
|||
This discovery was shocking to me. I thought they were achieving this through
|
||||
tool use, but I can cross-verify on localllamas which most certainly don't have
|
||||
access to tools. This means our 100-billion scale llamas are learning to be a
|
||||
base64 decoder?
|
||||
base64 decoder? Of course this is a completely pointless feature, as no llama
|
||||
will ever be more energy efficient than a trivially coded base64 tool.
|
||||
|
||||
Of course this is a completely pointless feature, as no llama will ever be more
|
||||
energy efficient than a trivially coded base64 tool. The Llamas likely picked it
|
||||
up while learning on sample code, but the degree to which they picked it up is
|
||||
incredible!
|
||||
|
||||
This has lead me to wonder, what other completely pointless things are our
|
||||
llamas learning? This one was an unindented side effect of learning to code, but
|
||||
what other side effects is our data having?
|
||||
The Llamas likely picked it up while learning on sample code, but the degree to
|
||||
which they picked it up is incredible! This has led me to wonder, what other
|
||||
completely pointless things are our llamas learning? This one was an unintended
|
||||
side effect of learning to code, but what other side effects is our data having?
|
||||
|
|
363
src/content/unix/unix-password-management.md
Normal file
|
@@ -0,0 +1,363 @@
|
|||
---
|
||||
title: 'Unix Password Management'
|
||||
description: 'Using GPG and Pass for optimal security and ease'
|
||||
updateDate: 'March 28 2024'
|
||||
---
|
||||
|
||||
# Password Management
|
||||
|
||||
Passwords are often the main method of digital identification. This means
|
||||
anything you don't want others to access but do want yourself to access is
|
||||
behind some sort of password. This means we need to optimize on two fronts:
|
||||
|
||||
- Ease of access: Passwords must be quick and easy to access and use
|
||||
- High security: Passwords must be strong to resist attacks
|
||||
|
||||
Optimizing for both is more tricky than it seems. Here I will discuss problems
|
||||
with existing solutions and present an **offline**, **multi-factor**,
|
||||
**easy-to-use**, and **extremely strong** solution to password management. Along
|
||||
the way we'll learn a lot about password security in general!
|
||||
|
||||
## Optimizing for high-security
|
||||
|
||||
A password is pretty pointless if it's not strong enough to resist cracking. Let's
|
||||
look over some core security concepts!
|
||||
|
||||
### Measuring Bits of Entropy
|
||||
|
||||
In the security field, "strength" of a password is measured by the *entropy* of
|
||||
the password. You'll often hear that passwords should be "60 bits of entropy" or
|
||||
some other number. The higher your bits of entropy, the stronger your
|
||||
password. In fact a password with 41 bits of entropy is twice as strong as one
|
||||
with 40 bits of entropy. Going from 20 bits of entropy to 30 makes the password
|
||||
over 1000x stronger!
|
||||
|
||||
To understand how to compute entropy, let's consider an example. Say I make a
|
||||
password with the following constraints:
|
||||
|
||||
- It's made entirely of the characters `A`, `B`, and `C`
|
||||
- It's 12 characters long
|
||||
- Each of the 12 characters is chosen completely randomly
|
||||
|
||||
An example of such a password is `AAABCCBCAACB`. To calculate the entropy, we
|
||||
consider the number of possible passwords we can generate with the above
|
||||
constraints. For each character we have 3 possibilities, and there are 12
|
||||
characters, so the number of possible passwords is:
|
||||
|
||||
```
|
||||
3^12 = 531,441
|
||||
```
|
||||
|
||||
To calculate "bits of entropy", we just need to take the base-2-logarithm of the
|
||||
count we just computed, giving us about 19 bits of entropy for the above case:
|
||||
|
||||
```
|
||||
log2(3^12) = log2(531,441) ~= 19.02
|
||||
```
|
||||
|
||||
These are called "bits" of entropy because `2^19 ~= 531,441`: it takes about 19 binary digits to count that many possibilities.
|
||||
|
||||
For the mathematically inclined, this is different from the information-theory
|
||||
concept of entropy, as computers are deterministic systems. Therefore, digital
|
||||
security usually assumes pseudo-random numbers are good enough, which is true in
|
||||
practice.
|
||||
|
||||
A more general formula to remember is:
|
||||
|
||||
```
|
||||
strength = bits of entropy = log2( #possible_characters ^ password_length )
|
||||
```
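
The formula is easy to turn into a helper. The sketch below is illustrative only; the function name `entropy_bits` is my own, and it assumes every character is chosen uniformly at random, as in the A/B/C example.

```python
import math

def entropy_bits(charset_size: int, length: int) -> float:
    """Bits of entropy for `length` characters drawn uniformly at random
    from a set of `charset_size` symbols."""
    # log2(c ** n) == n * log2(c); this form avoids building huge integers.
    return length * math.log2(charset_size)

# The A/B/C example from above: 3 symbols, 12 characters.
print(round(entropy_bits(3, 12), 2))  # 19.02
```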
|
||||
|
||||
### Kerckhoffs's Principle
|
||||
|
||||
The common idea in digital security is that the attacker knows exactly how
|
||||
you're defending your system. Obscurity is not considered to add to security in
|
||||
any way. This is a pretty important principle to understand why security seems a
|
||||
bit overkill sometimes, but it's a very realistic concept.
|
||||
|
||||
By writing this blog, anyone on the internet now knows how I protect my
|
||||
passwords. However, since my approach is in line with Kerckhoffs's Principle,
|
||||
this isn't a security concern in any respect.
|
||||
|
||||
If you'd like to read more about Kerckhoffs's Principle, check out [this
|
||||
article](https://nordvpn.com/cybersecurity/glossary/kerckhoffs-principle/) by a
|
||||
questionable VPN provider.
|
||||
|
||||
### Making Passwords Stronger
|
||||
|
||||
Taking into consideration the above, we can now determine what makes a good
|
||||
password. Let's take a look at that entropy formula again:
|
||||
|
||||
```
|
||||
strength = bits of entropy = log2( #possible_characters ^ password_length )
|
||||
```
|
||||
|
||||
One interesting observation here is how increasing the password length will
|
||||
increase the exponent, often making a larger impact on the password strength, as
|
||||
compared to using more characters. Consider the following base password:
|
||||
|
||||
- Characters: `a-z`, `A-Z`, and `0-9`
|
||||
- Length: 16
|
||||
|
||||
This password has `log2(62^16) = 95` bits of entropy. If we make this password
|
||||
17 characters instead we get `log2(62^17) = 101` bits of entropy. However, if we
|
||||
now add the `$` character to the possible characters in the password, it still
|
||||
has `log2(63^16) = 95` bits of entropy!
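
To see the difference without rounding, here is a small sketch comparing the two changes (the helper name `entropy_bits` is hypothetical, and uniformly random characters are assumed):

```python
import math

def entropy_bits(charset_size, length):
    # length * log2(charset_size) equals log2(charset_size ** length)
    return length * math.log2(charset_size)

base = entropy_bits(62, 16)    # a-z, A-Z, 0-9 at 16 characters
longer = entropy_bits(62, 17)  # one extra character
wider = entropy_bits(63, 16)   # one extra symbol in the character set

print(f"base:            {base:.2f} bits")            # about 95.27
print(f"one more char:   +{longer - base:.2f} bits")  # about +5.95
print(f"one more symbol: +{wider - base:.2f} bits")   # about +0.37
```

One extra character buys nearly 6 bits, while one extra symbol buys well under half a bit.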
|
||||
|
||||
In general, you always want to increase the number that's smaller. Since most
|
||||
passwords require at least one of each `a-z`, `A-Z`, `0-9`, the character set
|
||||
number starts out around 64. However, most passwords are only about 11
|
||||
characters long! This means it's almost always beneficial to make a longer
|
||||
password, instead of varying up the characters.
|
||||
|
||||
Let's take another example of two passwords:
|
||||
|
||||
1. `balhajisundoubtedlythebesttoyatikea`: length 35, character set 26
2. `S0mE1EEtc*dedP@ssword`: length 21, character set ~72

By the formula above, the first password has `log2(26^35) = 164` bits of entropy, while the second one
has `log2(72^21) = 129` bits of entropy. That makes the first password **over 30 billion times
stronger while being much easier to remember!**
|
||||
|
||||
### Making Passwords Easy to Remember
|
||||
|
||||
The famous [XKCD comic](https://xkcd.com/936/) comments on how it's actually
|
||||
pretty easy to make very strong passwords. All you need to do is think up a
|
||||
sentence using real words! As we saw above, length tends to count more for
|
||||
password strength, so a long sentence with simple characters will easily
|
||||
outclass any sort of complex password.
|
||||
|
||||
You could use a password generator that chooses N random words from a list of M
|
||||
possible words. This provides `log2(M^N)` bits of entropy. Generally it's quite
|
||||
easy to find a list of `10-30k` English words online. Then a password using
|
||||
**just 4 words** will have roughly `53` to `59` bits of entropy!
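
A generator like that can be sketched in a few lines. This is an illustration under assumptions, not a vetted tool: the word list here is a tiny stand-in, and the function names are my own. The important detail is using `secrets` rather than `random`, so the words come from the operating system's secure random source.

```python
import math
import secrets

def make_passphrase(words, count=4, sep="-"):
    """Pick `count` words uniformly at random using the OS's secure RNG."""
    return sep.join(secrets.choice(words) for _ in range(count))

def passphrase_bits(list_size, count):
    """log2(M^N) for N words drawn from a list of M words."""
    return count * math.log2(list_size)

# Tiny stand-in list; a real one would have 10k+ entries.
demo_words = ["llama", "shark", "pillow", "ocean", "violet", "ramen", "cloud", "piano"]

print(make_passphrase(demo_words))
print(round(passphrase_bits(10_000, 4), 1))  # 53.2
```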
|
||||
|
||||
I highly recommend using a password like this, especially if you're going to
|
||||
have to type it in on a mobile device often. While they're certainly weaker for
|
||||
their length compared to completely randomly generated passwords, the
|
||||
convenience is worth the trade-off.
|
||||
|
||||
### How Strong Should My Password Be?
|
||||
|
||||
There is a very wide spectrum of opinions on this matter. I will provide mine.
|
||||
|
||||
Let's start by recognizing that most passwords are stored in one of 128, 256, or
|
||||
512 bit hashes. Usually 256, but older systems are often 128. This means you
|
||||
often *cannot have a password more secure than 256 bits of entropy*. This is a
|
||||
result of the output space being lower dimensional than any password with higher
|
||||
entropy, so any "stronger" password would be projected down to only 256 bits of
|
||||
entropy.
|
||||
|
||||
We can also look at how fast computers can brute-force passwords.
|
||||
[Bcrypt](https://en.wikipedia.org/wiki/Bcrypt) is one of the most popular
|
||||
hashing choices for passwords. Assuming a company is decently secure, they use
|
||||
enough rounds of hashing such that a modern processor takes about 100ms to hash
|
||||
a possible password. Rounding up, that means an attacker can try about 1 million
|
||||
passwords per day per core. Assuming they have a monstrous 1000 core system,
|
||||
they can crack through 29 bits of entropy in one day.
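
The back-of-the-envelope numbers can be reproduced with a short sketch. The 100ms-per-hash and 1000-core figures are the assumptions from the paragraph above, and the function names are hypothetical:

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

def guesses_per_day(seconds_per_hash, cores):
    """How many candidate passwords the attacker can hash in one day."""
    return cores * SECONDS_PER_DAY / seconds_per_hash

def days_to_exhaust(bits, seconds_per_hash, cores):
    """Days needed to try every password in a 2**bits search space."""
    return 2 ** bits / guesses_per_day(seconds_per_hash, cores)

rate = guesses_per_day(0.1, 1000)  # 100 ms per hash, 1000 cores
print(f"{rate:.3g} guesses/day")   # 8.64e+08
print(f"{math.log2(rate):.1f} bits of entropy per day")  # 29.7
print(f"{days_to_exhaust(60, 0.1, 1000) / 365:.3g} years for 60 bits")
```

On average an attacker finds the password after searching half the space, so the expected time is half the exhaustive figure.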
|
||||
|
||||
Based on that number, your password should be around 60 bits of entropy for a
|
||||
comfortable safety margin. More security-conscious users often target around
|
||||
100 bits of entropy instead, to make sure advancements in processor speeds never
|
||||
catch up to their passwords.
|
||||
|
||||
I would personally aim for making your password 15-20 simple lower-case
|
||||
characters long. This provides 70-94 bits of entropy alone, and often shouldn't
|
||||
be very difficult to remember!
|
||||
|
||||
### Duplicate Passwords
|
||||
|
||||
Pretty much everything we've discussed up until this point falls apart if you
|
||||
use duplicate passwords. For many many different reasons, a password may become
|
||||
compromised. Your network may be hijacked, your computer may have a keylogger
|
||||
installed, even just someone recording a video on a phone can easily get your
|
||||
password.
|
||||
|
||||
Do NOT use duplicate passwords. Do NOT vary passwords by 1-2 characters; create
|
||||
completely new distinct passwords every time.
|
||||
|
||||
## Password Managers
|
||||
|
||||
Currently, the most widespread solution for strong but easy-to-use passwords is
|
||||
password managers. Some of them, like [KeePassXC](https://keepassxc.org/), are
|
||||
actually a good secure solution!
|
||||
|
||||
Unfortunately, people usually use a big corporate solution instead. These
|
||||
usually store your passwords in the cloud, which is a complete disaster. A
|
||||
single leak means all your accounts are immediately compromised; too many eggs in
|
||||
one basket.
|
||||
|
||||
This isn't even unusual! Lastpass had a [major security
|
||||
breach](https://blog.lastpass.com/posts/2022/12/notice-of-recent-security-incident)
|
||||
in 2022. As of writing, Lastpass, Dashlane, and 1Password are [all
|
||||
compromised](https://www.forbes.com/sites/daveywinder/2023/12/11/android-warning-1password-dashlane-lastpass-and-others-can-leak-passwords/?sh=1c019c497dbf)
|
||||
on Android. To me, this is completely unacceptable for something that holds keys
|
||||
to all your accounts.
|
||||
|
||||
That said, if you make a reasonable choice of password manager, they can be a
|
||||
rather no-frills solution for most people. For those looking for top-notch
|
||||
security though, it may be worth considering the pass-gpg approach instead.
|
||||
|
||||
### Browser-Saved Passwords
|
||||
|
||||
A browser is not a password manager. It is a complete joke how easy it is to rip
|
||||
out passwords from a browser. This [short python
|
||||
script](https://github.com/priyankchheda/chrome_password_grabber) can do it! So
|
||||
can [this one](https://github.com/henry-richard7/Browser-password-stealer) and
|
||||
[this one](https://github.com/JustYuuto/Yuuto-Stealer)... It's so easy to build
|
||||
one, you could do it yourself in under an hour!
|
||||
|
||||
To be fair, for most people this is probably fine. Unless malware gets access to
|
||||
your computer, it's unlikely to be stolen... but I still wouldn't put my
|
||||
recovery email passwords nor my banking information in these.
|
||||
|
||||
## 2-Factor Authentication
|
||||
|
||||
Almost all services now offer 2-Factor Authentication (2FA). In fact, it's
|
||||
increasingly a requirement to sign up for services. Although it seems like a
|
||||
hardy security method, it's not a replacement for strong passwords.
|
||||
|
||||
2FA can be quite beneficial for people who don't make very strong passwords. At
|
||||
least with this method, their password is effectively multiplied by 100000
|
||||
possibilities. However, that's only about 16 bits of entropy, which isn't a very
|
||||
big increase.
|
||||
|
||||
I also personally feel 2FA can be incredibly inconvenient. It's not a given I
|
||||
have my phone nearby every time I use my computer. One can only imagine the
|
||||
situation where your phone dies and you just can't access your accounts anymore.
|
||||
That said, if you're fine with the inconvenience, there's no harm in adding 16
|
||||
bits of entropy to your passwords.
|
||||
|
||||
### SMS 2FA
|
||||
|
||||
This is a completely different game. Unlike 2FA apps which require typing in a
|
||||
code and verifying any new device using an existing device, SMS just needs
|
||||
access to your phone number. A phone number is controlled by your cellular
|
||||
provider, not you. This means their customer service agents can easily reassign
|
||||
your phone number to another phone!
|
||||
|
||||
When I got my phone number, I actually ended up with SMS verification for the
|
||||
previous owner's AirBnB account! This isn't the worst possible account to
|
||||
compromise, but this happened by complete accident. Further, identity theft is a
|
||||
much more real threat than a hacker with a 1000-core compute cluster. All they
|
||||
have to do is deceive a customer service agent over the phone, and your SMS
|
||||
number is now theirs.
|
||||
|
||||
In short, avoid SMS-based 2FA in favor of apps like [Authy](https://authy.com/)
|
||||
that don't create such a huge security gap.
|
||||
|
||||
## GPG and Pass
|
||||
|
||||
If you truly want strong security, while also having ease of use, we'll use the
|
||||
OG method of digital identity verification: GPG keys.
|
||||
|
||||
### GPG Keys
|
||||
|
||||
A GPG key is a file associated with an email, expiry date, and password. It can
|
||||
be used to encrypt and decrypt any binary data. Originally, it was meant as
|
||||
a way to send emails that only the receiver will be able to decrypt.
|
||||
|
||||
I highly recommend this [amazing GPG
|
||||
cheatsheet](https://gist.github.com/johnfedoruk/7f156d844af54cc91324dff4f54b11ce),
|
||||
though I'll cover the necessary bare minimum here. It's a bit unfortunate how
|
||||
GPG has one of the most confusing interfaces of any command line tool.
|
||||
|
||||
To start, create a new key for yourself. Roughly follow the below, replacing
|
||||
pieces with your information:
|
||||
|
||||
```
|
||||
gpg --full-gen-key --expert
|
||||
> (9) ECC and ECC
|
||||
> (1) Curve 25519
|
||||
> 1y # Optional, 1 year is recommended
|
||||
> Your Name # This will be visible in the public key
|
||||
> emiliko@mami2.moe # This is also visible in the public key
|
||||
> No comment
|
||||
> Put a password on the primary key
|
||||
```
|
||||
|
||||
It's important you put a *very* strong password on your GPG key, one that **you
|
||||
can memorize by heart**. This will be the one and only password you will ever
|
||||
need to remember from this point forward, but it must be strong!
|
||||
|
||||
GPG keys can be used to sign git commits, encrypt and decrypt messages, but we
|
||||
won't cover that here. All we really need is to set it up. You can also view
|
||||
your keys at any time using `gpg --list-keys`.
|
||||
|
||||
### Pass
|
||||
|
||||
[Pass](https://www.passwordstore.org/) is my favourite password manager. I like
|
||||
it since it's simple, transparent, and secure. It's so simple in fact, it'd be
|
||||
pretty easy to write your own implementation in an hour or two.
|
||||
|
||||
All your passwords will be stored in the `~/.password-store` directory for your
|
||||
user. You can organize them by directories, just as you organize your file
|
||||
system ordinarily. The trick is that `pass` will ensure all these files are
|
||||
encrypted with the GPG key you selected when running `pass init`.
|
||||
|
||||
Using pass is super simple. To add a password just type it in:
|
||||
|
||||
```
|
||||
pass insert github/password
|
||||
```
|
||||
|
||||
You can also watch it being typed in with the echo `-e` flag, which I find quite
|
||||
helpful:
|
||||
|
||||
```
|
||||
pass insert -e github/email
|
||||
```
|
||||
|
||||
You can list the names of the password files you have, without decrypting
|
||||
anything:
|
||||
|
||||
```
|
||||
pass show # All of them
|
||||
pass show github # All the ones in the github directory
|
||||
```
|
||||
|
||||
Finally, you can decrypt passwords. Use the `-c` option to copy the password to
|
||||
your clipboard. By default, the clipboard is cleared after 45 seconds:
|
||||
|
||||
```
|
||||
pass show github/password # Actually prints it to the terminal
|
||||
pass show -c github/password # Copies it to the clipboard for a few seconds
|
||||
```
|
||||
|
||||
Tab completion is well supported. You can also use `pass mv` to rename files
|
||||
and `pass rm` to delete them. Really, it's just super simple.
|
||||
|
||||
### Why Use Pass and GPG?
|
||||
|
||||
There are many reasons to do so:
|
||||
|
||||
- **Locally stored**: These are much harder to obtain than passwords stored on the
|
||||
cloud.
|
||||
- **Easy migration**: You literally `scp -r` your `~/.password-store` directory to
|
||||
migrate them to a new computer.
|
||||
- **3-factor authentication**: An attacker needs your GPG key, your GPG key's
|
||||
password, and your `~/.password-store` directory to perform a successful
|
||||
attack. Missing any of these 3 will prevent any attack from succeeding
|
||||
- **0-type approach**: You can use very very strong passwords and never need to
|
||||
type them in (they're on your clipboard!)
|
||||
- **Secure scripting**: It's very easy to put passwords securely into shell
|
||||
scripts with `"$(pass show cloudflare/token)"`
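
The same scripting idea carries over to other languages. Below is a hedged Python sketch that shells out to `pass show`; the function name `read_secret` and the injectable `runner` parameter are my own additions for illustration, and it assumes the secret sits on the first line of the entry.

```python
import subprocess

def read_secret(entry, runner=subprocess.run):
    """Return the first line of a pass entry by calling `pass show <entry>`.

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without a real password store.
    """
    result = runner(
        ["pass", "show", entry],
        capture_output=True,  # keep the secret out of the terminal
        text=True,
        check=True,           # raise if pass or gpg fails
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""

# Example usage (requires pass and an unlocked GPG key):
#   token = read_secret("cloudflare/token")
```

The secret then lives only in the process's memory, never in the script file itself.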
|
||||
|
||||
The 3-factor method is even more helpful when considering common attacks. For
|
||||
example, if a phone camera records you typing on your keyboard to decrypt the
|
||||
GPG key, the attacker can't do *anything* with that password alone. They still
|
||||
need physical access to your system to grab the files themselves.
|
||||
|
||||
An odd benefit of 3-factor authentication is distributing backups. If you
|
||||
give one piece each to people who know you but don't know one another, you can safely
|
||||
entrust your passwords to third parties. This works because they need all 3 pieces
|
||||
to mount an attack, so giving a trusted third party only 1 piece doesn't
|
||||
compromise your security.
|
||||
|
||||
Malware *could* be both a key logger and grab the files from
|
||||
`~/.password-store`, but that is some very sophisticated and targeted
|
||||
malware... most of the keylogger ones will just assume the password you typed is
|
||||
the actual password to your accounts, which it isn't!
|
||||
|
||||
I hope this article was useful in building an understanding of digital security.
|
||||
Maybe you'll even consider using GPG and Pass!
|