Unix: pass and gpg blog

2024-03-29 00:11:34 -06:00 · 2024-03-29 00:11:34 -06:00 · 9373a15b90
commit 9373a15b90
parent 26f0543d62
2 changed files with 386 additions and 26 deletions
--- a/src/content/llama/the-secret-learnings-of-llamas.md
+++ b/src/content/llama/the-secret-learnings-of-llamas.md
@ -8,22 +8,22 @@ heroText: 'Base64 LLama'
 # The Secret Learnings of Llamas
 Tool use by llamas is an active area of research. Recent implementations like
-Devin promise great productivity increases through tool use. I was investigating
+[Devin](https://www.cognition-labs.com/introducing-devin) promise great
-tool use by some modern llamas, when I made an unfortunate discovery.
+productivity increases, just by allowing llamas to interact with more tools. I
 was investigating this in some modern llamas, when I made an unfortunate
 discovery.
 It appears most large llamas have learned a new language, in addition to the
 ones that were intended: base64.
-### Base64 Background
+## Base64 Background
-Base64 is a simple encoding scheme. This is different from encryption and
+Base64 is a simple encoding scheme. It takes in a stream of bytes and converts
-hashing, as those provide security, while base64 just transforms data into a
+them into a plain-text representation.
 portable form.
 Each byte is 8 bits. This means there are 2^8 (256) possible bytes, since each
-bit contributes 2 states. Base64 encodes such that each bytes only stores 2^6
+bit contributes 2 states. Base64 only uses plain-text encoding, so it only
-(64) possible states, but this makes the vocabulary much smaller. With just 64
+stores 2^6 (64) possible states per character.
 letters and numbers, it can hold 64 states per character.
 Let's visualize how base64 works. Say we have the following word:
@ -31,10 +31,11 @@ Let's visualize how base64 works. Say we have the following word:
 Hello
 ```
-This has a utf-8 encoding below. I used the `ord` function in python to get the
+We can convert each letter to a number using [utf-8 encoding
-numbers in the `Base 10` row. I then converted the base 10 representations to
+tables](https://en.wikipedia.org/wiki/UTF-8#Codepage_layout) or the `ord()`
-octal (base 8) and binary (base 2). The bottom two rows are the same, but the
+function in python. I then converted the base 10 representations to octal (base
-spacing makes it easier to see the direct mapping from octal to binary:
+8) and binary (base 2). The bottom two rows are the same, but the spacing makes
 it easier to see the direct mapping from octal to binary:
 ```
 Letters:         H         e         l         l         o
@ -55,8 +56,7 @@ Now we can map in reverse:
 ```
 Base 2 (spaced):  000 001 000 000 001 100 101 001 101 100 001 101 100 001 101 111
-Base 8 (spaced):    0   1   0   0   1   4   5   1   5   4   1   5   4   1   5   7
+Just changing the spacing...
 Base 2 (spaced):  000001 000000 001100 101001 101100 001101 100001 101111
 Base 8 (spaced):      01     00     14     51     54     15     41     57
 Base 64 (spaced):      B      A      M      p      s      N      h      v
@ -65,7 +65,7 @@ Base 64 (spaced):      B      A      M      p      s      N      h      v
 So this walk-through encodes the word `Hello` as `BAMpsNhv`. A real base64 tool
 packs the raw 8-bit bytes directly, so it outputs `SGVsbG8=` for the same word.
 Base64 is often used to encode images and other binary data to store in JSON. It
 is not space efficient, taking up more space than it should, but it's entirely made of
-printable characters.
+printable characters!
 ## Base64 Llamas
@ -83,7 +83,7 @@ echo 'how are you today?' | base64
 ```
 Then ask a llama about `aG93IGFyZSB5b3UgdG9kYXk/Cg==` or whatever other string
-you want. You'll notice that they break down after a about 10-20 characters,
+you want. You'll notice that they break down after about 10-20 characters,
 depending on how good the llama is.
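
If you'd rather check a llama's answer without leaving Python, here's a minimal sketch using the standard `base64` module. The `llama_answer` variable is just a stand-in for whatever string the llama actually gives you.

```python
import base64

# Same bytes the shell command feeds to base64: `echo` appends a newline.
message = b"how are you today?\n"

encoded = base64.b64encode(message).decode("ascii")
print(encoded)

# Hypothetical llama reply; paste the real one here to check it.
llama_answer = "aG93IGFyZSB5b3UgdG9kYXk/Cg=="
print(base64.b64decode(llama_answer) == message)
```

Comparing the decoded bytes rather than eyeballing the string makes it easy to spot exactly where a llama starts to break down.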
@ -107,13 +107,10 @@ Encode "emiliko@mami2.moe" into base64.
 This discovery was shocking to me. I thought they were achieving this through
 tool use, but I can cross-verify on localllamas which most certainly don't have
 access to tools. This means our 100-billion scale llamas are learning to be a
-base64 decoder?
+base64 decoder? Of course this is a completely pointless feature, as no llama
 will ever be more energy efficient than a trivially coded base64 tool.
-Of course this is a completely pointless feature, as no llama will ever be more
+The Llamas likely picked it up while learning on sample code, but the degree to
-energy efficient than a trivially coded base64 tool. The Llamas likely picked it
+which they picked it up is incredible! This has led me to wonder, what other
-up while learning on sample code, but the degree to which they picked it up is
+completely pointless things are our llamas learning? This one was an unintended
-incredible!
+side effect of learning to code, but what other side effects is our data having?
 This has lead me to wonder, what other completely pointless things are our
 llamas learning? This one was an unindented side effect of learning to code, but
 what other side effects is our data having?
--- a/src/content/unix/unix-password-management.md
+++ b/src/content/unix/unix-password-management.md
@ -0,0 +1,363 @@
 ---
 title: 'Unix Password Management'
 description: 'Using GPG and Pass for optimal security and ease'
 updateDate: 'March 28 2024'
 ---
 # Password Management
 Passwords are often the main method of digital identification. This means
 anything you don't want others to access but do want yourself to access is
 behind some sort of password. This means we need to optimize on two fronts:
 - Ease of access: Passwords must be quick and easy to access and use
 - High security: Passwords must be strong to resist attacks
 Optimizing for both is more tricky than it seems. Here I will discuss problems
 with existing solutions and present an **offline**, **multi-factor**,
 **easy-to-use**, and **extremely strong** solution to password management. Along
 the way we'll learn a lot about password security in general!
 ## Optimizing for high-security
 A password is pretty pointless if it's weak enough to be cracked. Let's
 look over some core security concepts!
 ### Measuring Bits of Entropy
 In the security field, "strength" of a password is measured by the *entropy* of
 the password. You'll often hear that passwords should be "60 bits of entropy" or
 some other number. The higher your bits of entropy, the stronger your
 password. In fact a password with 41 bits of entropy is twice as strong as one
 with 40 bits of entropy. Going from 20 bits of entropy to 30 makes the password
 over 1000x stronger!
 To understand how to compute entropy, let's consider an example. Say I make a
 password with the following constraints:
 - It's made entirely of the characters `A`, `B`, and `C`
 - It's 12 characters long
 - Each of the 12 characters is chosen completely randomly
 An example of such a password is `AAABCCBCAACB`. To calculate the entropy, we
 first count the number of possible passwords we can generate with the above
 constraints. For each character we have 3 possibilities, and there are 12
 characters, so the number of possible passwords is:
 ```
 3^12 = 531,441
 ```
 To calculate "bits of entropy", we just need to take the base-2-logarithm of the
 count we just computed, giving us about 19 bits of entropy for the above case:
 ```
 log2(3^12) = log2(531,441) ~= 19.02
 ```
 This is called "bits of entropy" because `2^19 ~= 531,441`: it takes about 19
 bits to number every possible password.
 For the mathematically inclined, this is different from the information-theory
 concept of entropy, as computers are deterministic systems. Therefore, digital
 security usually assumes pseudo-random numbers are good enough, which is true in
 practice.
 A more general formula to remember is:
 ```
 strength = bits of entropy = log2( #possible_characters ^ password_length )
 ```
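
As a quick sanity check, the formula fits in a few lines of Python. This is just a sketch: `entropy_bits` is a name I made up, and it assumes every character is picked uniformly at random.

```python
import math


def entropy_bits(charset_size: int, length: int) -> float:
    """Bits of entropy for `length` symbols drawn uniformly at random
    from `charset_size` possible symbols."""
    # log2(c ** n) == n * log2(c); this form avoids building a huge integer.
    return length * math.log2(charset_size)


# The A/B/C example: 3 possible characters, 12 characters long.
print(3 ** 12)
print(round(entropy_bits(3, 12), 2))
```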
 ### Kerckhoffs's Principle
 The common idea in digital security is that the attacker knows exactly how
 you're defending your system. Obscurity is not considered to add to security in
 any way. This is a pretty important principle to understand why security seems a
 bit overkill sometimes, but it's a very realistic concept.
 By writing this blog, anyone on the internet now knows how I protect my
 passwords. However, since my approach is in line with Kerckhoffs's Principle,
 this isn't a security concern in any respect.
 If you'd like to read more about Kerckhoffs's Principle, check out [this
 article](https://nordvpn.com/cybersecurity/glossary/kerckhoffs-principle/) by a
 questionable VPN provider.
 ### Making Passwords Stronger
 Taking into consideration the above, we can now determine what makes a good
 password. Let's take a look at that entropy formula again:
 ```
 strength = bits of entropy = log2( #possible_characters ^ password_length )
 ```
 One interesting observation here is how increasing the password length will
 increase the exponent, often making a larger impact on the password strength, as
 compared to using more characters. Consider the following base password:
 - Characters: `a-z`, `A-Z`, and `0-9`
 - Length: 16
 This password has `log2(62^16) = 95` bits of entropy. If we make this password
 17 characters instead we get `log2(62^17) = 101` bits of entropy. However, if we
 now add the `$` character to the possible characters in the password, it still
 has `log2(63^16) = 95` bits of entropy!
 In general, you always want to increase the number that's smaller. Since most
 passwords require at least one of each `a-z`, `A-Z`, `0-9`, the character set
 number starts out around 64. However, most passwords are only about 11
 characters long! This means it's almost always beneficial to make a longer
 password, instead of varying up the characters.
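
To see the length-versus-character-set trade-off in numbers, here's a small sketch comparing the three variants above. The helper name is hypothetical, and it again assumes uniformly random characters.

```python
import math


def entropy_bits(charset_size, length):
    # Same formula as above: length * log2(number of possible characters).
    return length * math.log2(charset_size)


base = entropy_bits(62, 16)    # a-z, A-Z, 0-9 at 16 characters
longer = entropy_bits(62, 17)  # one extra character
wider = entropy_bits(63, 16)   # one extra symbol in the character set

print(f"base: {base:.1f}  longer: {longer:.1f}  wider: {wider:.1f}")
print(f"gain from length: {longer - base:.2f} bits")
print(f"gain from charset: {wider - base:.2f} bits")
```

One extra character buys almost 6 bits, while one extra symbol in the character set buys well under half a bit.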
 Let's take another example of two passwords:
 1. `balhajisundoubtedlythebesttoyatikea`: length 35, character set 26
 2. `S0mE1EEtc*dedP@ssword`: length 21, character set ~72
 The first password has `log2(26^35) = 164` bits of entropy, while the second one
 has `log2(72^21) = 129` bits of entropy. The first password is **over 30 billion
 times stronger while being much easier to remember!**
 ### Making Passwords Easy to Remember
 The famous [XKCD comic](https://xkcd.com/936/) comments on how it's actually
 pretty easy to make very strong passwords. All you need to do is think up a
 sentence using real words! As we saw above, length tends to count more for
 password strength, so a long sentence with simple characters will easily
 outclass any sort of complex password.
 You could use a password generator that chooses N random words from a list of M
 possible words. This provides `log2(M^N)` bits of entropy. Generally it's quite
 easy to find a list of `10-30k` English words online. Then a password using
 **just 4 words** will have around `53-59` bits of entropy!
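
Here's a rough sketch of such a generator in Python. The function names and the tiny `demo_words` list are placeholders of mine; a real list would hold thousands of words. It uses `secrets` rather than `random` so the choices come from the OS's cryptographic random source.

```python
import math
import secrets


def make_passphrase(words, count=4, sep=" "):
    """Pick `count` words independently and uniformly at random."""
    return sep.join(secrets.choice(words) for _ in range(count))


def passphrase_bits(list_size, count):
    """log2(M^N) for N words drawn from a list of M words."""
    return count * math.log2(list_size)


# Placeholder list for illustration only; far too small for real use.
demo_words = ["llama", "shark", "pillow", "terminal",
              "orbit", "maple", "cobalt", "river"]

print(make_passphrase(demo_words))
print(round(passphrase_bits(10_000, 4), 1))
print(round(passphrase_bits(30_000, 4), 1))
```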
 I highly recommend using a password like this, especially if you're going to
 have to type it in on a mobile device often. While they're certainly weaker for
 their length compared to completely randomly generated passwords, the
 convenience is worth the trade-off.
 ### How Strong Should My Password Be?
 There is a very wide spectrum of opinions on this matter. I will provide mine.
 Let's start by recognizing that most passwords are stored in one of 128, 256, or
 512 bit hashes. Usually 256, but older systems are often 128. This means you
 often *cannot have a password more secure than 256 bits of entropy*. A hash has
 a fixed-size output, so any "stronger" password still gets mapped down to one of
 only 2^256 possible values.
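
You can see the fixed output size for yourself. This sketch uses SHA-256 from Python's `hashlib` purely as an illustration of a 256-bit hash; I'm not claiming any particular service stores passwords this way.

```python
import hashlib

# Two inputs of wildly different length...
short = hashlib.sha256(b"hunter2").digest()
long = hashlib.sha256(b"x" * 10_000).digest()

# ...both come out as 32 bytes, i.e. 256 bits.
print(len(short) * 8, len(long) * 8)
```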
 We can also look at how fast computers can brute-force passwords.
 [Bcrypt](https://en.wikipedia.org/wiki/Bcrypt) is one of the most popular
 hashing choices for passwords. Assuming a company is decently secure, they use
 enough rounds of hashing such that a modern processor takes about 100ms to hash
 a possible password. Rounding up, that means an attacker can try about 1 million
 passwords per day per core. Assuming they have a monstrous 1000-core system,
 they can crack through roughly 30 bits of entropy in one day.
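
The back-of-the-envelope math above is easy to redo with your own numbers. This is a sketch under the same assumptions (100ms per hash, 1000 cores, no smarter-than-brute-force attack); the function names are mine.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def guesses_per_day(seconds_per_hash, cores):
    """How many candidate passwords the attacker can hash in a day."""
    return cores * SECONDS_PER_DAY / seconds_per_hash


def days_to_exhaust(bits, seconds_per_hash=0.1, cores=1000):
    """Days needed to try every password in a space of 2^bits."""
    return 2 ** bits / guesses_per_day(seconds_per_hash, cores)


daily = guesses_per_day(0.1, 1000)
print(f"{daily:.3g} guesses per day, about {math.log2(daily):.1f} bits")
print(f"60 bits takes about {days_to_exhaust(60) / 365:.2g} years")
```

On average an attacker finds the password after searching half the space, so halve the result for an expected time rather than a worst case.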
 Based on that number, your password should be around 60 bits of entropy for a
 comfortable safety margin. More security-conscious users often target around
 100 bits of entropy instead, to make sure advancements in processor speeds never
 catch up to their passwords.
 I would personally aim for making your password 15-20 simple lower-case
 characters long. This provides 70-94 bits of entropy alone, and often shouldn't
 be very difficult to remember!
 ### Duplicate Passwords
 Pretty much everything we've discussed up until this point falls apart if you
 use duplicate passwords. For many many different reasons, a password may become
 compromised. Your network may be hijacked, your computer may have a keylogger
 installed, even just someone recording a video on a phone can easily get your
 password.
 Do NOT use duplicate passwords. Do NOT vary passwords by 1-2 characters; create
 completely new, distinct passwords every time.
 ## Password Managers
 The most widespread solution to strong but easy-to-use passwords today is a
 password manager. Some of them, like [KeePassXC](https://keepassxc.org/), are
 actually a good secure solution!
 Unfortunately, people usually use a big corporate solution instead. These
 usually store your passwords in the cloud, which is a complete disaster. A
 single leak means all your accounts are immediately compromised; too many eggs in
 one basket.
 This isn't even unusual! Lastpass had a [major security
 breach](https://blog.lastpass.com/posts/2022/12/notice-of-recent-security-incident)
 in 2022. As of writing, Lastpass, Dashlane, and 1Password are [all
 compromised](https://www.forbes.com/sites/daveywinder/2023/12/11/android-warning-1password-dashlane-lastpass-and-others-can-leak-passwords/?sh=1c019c497dbf)
 on Android. To me, this is completely unacceptable for something that holds the
 keys to all your accounts.
 That said, if you make a reasonable choice of password manager, they can be a
 rather no-frills solution for most people. For those looking for top-notch
 security though, it may be worth considering the pass-gpg approach instead.
 ### Browser-Saved Passwords
 A browser is not a password manager. It is a complete joke how easy it is to rip
 out passwords from a browser. This [short python
 script](https://github.com/priyankchheda/chrome_password_grabber) can do it! So
 can [this one](https://github.com/henry-richard7/Browser-password-stealer) and
 [this one](https://github.com/JustYuuto/Yuuto-Stealer)... It's so easy to build
 one, you could do it yourself in under an hour!
 To be fair, for most people this is probably fine. Unless malware gets access to
 your computer, it's unlikely to be stolen... but I still wouldn't put my
 recovery email passwords nor my banking information in these.
 ## 2-Factor Authentication
 Almost all services now offer 2-Factor Authentication (2FA). In fact, it's
 increasingly a requirement to sign up for services. Although it seems like a
 hardy security method, it's not a replacement for strong passwords.
 2FA can be quite beneficial for people who don't make very strong passwords. At
 least with this method, the number of guesses an attacker needs is effectively
 multiplied by 100000. However, that's only about 16 bits of entropy, which isn't
 a very big increase.
 I also personally feel 2FA can be incredibly inconvenient. It's not a given I
 have my phone nearby every time I use my computer. One can only imagine the
 situation where your phone dies and you just can't access your accounts anymore.
 That said, if you're fine with the inconvenience, there's no harm in adding 16
 bits of entropy to your passwords.
 ### SMS 2FA
 This is a completely different game. Unlike 2FA apps which require typing in a
 code and verifying any new device using an existing device, SMS just needs
 access to your phone number. A phone number is controlled by your cellular
 provider, not you. This means their customer service agents can easily reassign
 your phone number to another phone!
 When I got my phone number, I actually ended up with SMS verification for the
 previous owner's AirBnB account! This isn't the worst possible account to
 compromise, but this happened by complete accident. Further, identity theft is a
 much more real threat than a hacker with a 1000-core compute cluster. All they
 have to do is deceive a customer service agent over the phone, and your SMS
 number is now theirs.
 In short, avoid SMS-based 2FA in favor of apps like [Authy](https://authy.com/)
 that don't create such a huge security gap.
 ## GPG and Pass
 If you truly want strong security along with ease of use, you can use the
 OG method of digital identity verification: GPG keys.
 ### GPG Keys
 A GPG key is a file associated with an email, expiry date, and password. It can
 be used to encrypt and decrypt any binary data. Originally, it was meant as
 a way to send emails that only the receiver will be able to decrypt.
 I highly recommend this [amazing GPG
 cheatsheet](https://gist.github.com/johnfedoruk/7f156d844af54cc91324dff4f54b11ce),
 though I'll cover the necessary bare minimum here. It's a bit unfortunate how
 GPG has one of the most confusing interfaces of any command line tool.
 To start, create a new key for yourself. Roughly follow the below, replacing
 pieces with your information:
 ```
 gpg --full-gen-key --expert
 > (9) ECC and ECC
 > (1) Curve 25519
 > 1y  # Optional, 1 year is recommended
 > Your Name          # This will be visible in the public key
 > emiliko@mami2.moe  # This is also visible in the public key
 > No comment
 > Put a password on the primary key
 ```
 It's important you put a *very* strong password on your GPG key, one that **you
 can memorize by heart**. This will be the one and only password you will ever
 need to remember from this point forward, but it must be strong!
 GPG keys can also be used to sign git commits and to encrypt and decrypt
 messages, but we won't cover that here. All we really need is to set it up. You can also view
 your keys at any time using `gpg --list-keys`.
 ### Pass
 [Pass](https://www.passwordstore.org/) is my favourite password manager. I like
 it since it's simple, transparent, and secure. It's so simple in fact, it'd be
 pretty easy to write your own implementation in an hour or two.
 All your passwords will be stored in the `~/.password-store` directory for your
 user. You can organize them by directories, just as you organize your file
 system ordinarily. The trick is that `pass` will ensure all these files are
 encrypted with your default GPG key.
 Using pass is super simple. To add a password just type it in:
 ```
 pass insert github/password
 ```
 You can also watch it being typed in with the echo `-e` flag, which I find quite
 helpful:
 ```
 pass insert -e github/email
 ```
 You can list the names of the password files you have, without decrypting
 anything:
 ```
 pass show  # All of them
 pass show github  # All the ones in the github directory
 ```
 Finally, you can decrypt passwords. Use the `-c` option to copy the password to
 your clipboard. The clipboard will clear after a few seconds:
 ```
 pass show github/password  # Actually prints it to the terminal
 pass show -c github/password  # Copies it to the clipboard for a few seconds
 ```
 Tab completion is well supported. You can also use `pass mv` to rename files
 and `pass rm` to delete them. Really, it's just super simple.
 ### Why Use Pass and GPG?
 There are many reasons to do so:
 - **Locally stored**: These are much harder to obtain than passwords stored on the
   cloud.
 - **Easy migration**: You literally `scp -r` your `~/.password-store` directory to
   migrate them to a new computer.
 - **3-factor authentication**: An attacker needs your GPG key, your GPG key's
   password, and your `~/.password-store` directory to perform a successful
   attack. Missing any of these 3 will prevent any attack from succeeding.
 - **0-type approach**: You can use very very strong passwords and never need to
   type them in (they're on your clipboard!)
 - **Secure scripting**: It's very easy to put passwords securely into shell
   scripts with `"$(pass show cloudflare/token)"`
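
The same scripting trick works outside the shell too. Below is a minimal Python sketch that shells out to `pass show`; the `read_secret` helper and its injectable `run` parameter are my own invention (the latter just makes it testable without a real password store), and it assumes `pass` is on your `PATH`.

```python
import subprocess


def read_secret(name, run=subprocess.run):
    """Return the first line printed by `pass show <name>`.

    `run` defaults to subprocess.run; swap in a fake to test without
    touching a real password store.
    """
    result = run(["pass", "show", name],
                 capture_output=True, text=True, check=True)
    # By convention the password sits on the first line of the entry.
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


# Example usage (needs a real entry, so it's left commented out):
# token = read_secret("cloudflare/token")
```

Since the secret only ever lives in a variable, it never ends up hard-coded in the script or in your shell history.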
 The 3-factor method is even more helpful when considering common attacks. For
 example, if a phone camera records you typing on your keyboard to decrypt the
 GPG key, the attacker can't do *anything* with that password alone. They still
 need physical access to your system to grab the files themselves.
 An odd benefit of 3-factor authentication is distributing backups. If you give
 each piece to a different person who knows you, but who don't know one another,
 you can safely entrust your backups to third parties. This works because an
 attacker needs all 3 pieces, so giving a trusted third party only 1 piece doesn't
 compromise your security.
 Malware *could* be both a key logger and grab the files from
 `~/.password-store`, but that is some very sophisticated and targeted
 malware... most of the keylogger ones will just assume the password you typed is
 the actual password to your accounts, which it isn't!
 I hope this article was useful in building an understanding of digital security.
 Maybe you'll even consider using GPG and Pass!