From 9373a15b9051f5a84459da50abd1de0f45cdbe7f Mon Sep 17 00:00:00 2001
From: Akemi Izuko
Date: Fri, 29 Mar 2024 00:11:34 -0600
Subject: [PATCH] Unix: pass and gpg blog

---
 .../llama/the-secret-learnings-of-llamas.md  |  49 ++-
 src/content/unix/unix-password-management.md | 363 ++++++++++++++++++
 2 files changed, 386 insertions(+), 26 deletions(-)
 create mode 100644 src/content/unix/unix-password-management.md

diff --git a/src/content/llama/the-secret-learnings-of-llamas.md b/src/content/llama/the-secret-learnings-of-llamas.md
index 0d10d2f..e9fc9d4 100644
--- a/src/content/llama/the-secret-learnings-of-llamas.md
+++ b/src/content/llama/the-secret-learnings-of-llamas.md
@@ -8,22 +8,22 @@ heroText: 'Base64 LLama'
 # The Secret Learnings of Llamas
 
 Tool use by llamas is an active area of research. Recent implementations like
-Devin promise great productivity increases through tool use. I was investigating
-tool use by some modern llamas, when I made an unfortunate discovery.
+[Devin](https://www.cognition-labs.com/introducing-devin) promise great
+productivity increases, just by allowing llamas to interact with more tools. I
+was investigating this in some modern llamas, when I made an unfortunate
+discovery.
 
 It appears most large llamas have learned a new language, in addition to the
 ones that were intended: base64.
 
-### Base64 Background
+## Base64 Background
 
-Base64 is a simple encoding scheme. This is different from encryption and
-hashing, as those provide security, while base64 just transforms data into a
-portable form.
+Base64 is a simple encoding scheme. It takes in a stream of bytes and converts
+them into a plain-text representation.
 
 Each byte is 8 bits. This means there are 2^8 (256) possible bytes, since each
-bit contributes 2 states. Base64 encodes such that each bytes only stores 2^6
-(64) possible states, but this makes the vocabulary much smaller. With just 64
-letters and numbers, it can hold 64 states per character.
+bit contributes 2 states. Base64 only uses plain-text encoding, so it only
+stores 2^6 (64) possible states per character.
 
 Let's visualize how base64 works. Say we have the following word:
 
@@ -31,10 +31,11 @@ Let's visualize how base64 works. Say we have the following word:
 Hello
 ```
 
-This has a utf-8 encoding below. I used the `ord` function in python to get the
-numbers in the `Base 10` row. I then converted the base 10 representations to
-octal (base 8) and binary (base 2). The bottom two rows are the same, but the
-spacing makes it easier to see the direct mapping from octal to binary:
+We can convert each letter to a number using [utf-8 encoding
+tables](https://en.wikipedia.org/wiki/UTF-8#Codepage_layout) or the `ord()`
+function in python. I then converted the base 10 representations to octal (base
+8) and binary (base 2). The bottom two rows are the same, but the spacing makes
+it easier to see the direct mapping from octal to binary:
 
 ```
 Letters: H e l l o
@@ -55,8 +56,7 @@ Now we can map in reverse:
 
 ```
 Base 2 (spaced): 000 001 000 000 001 100 101 001 101 100 001 101 100 001 101 111
-Base 8 (spaced): 0 1 0 0 1 4 5 1 5 4 1 5 4 1 5 7
-
+Just changing the spacing...
 Base 2 (spaced): 000001 000000 001100 101001 101100 001101 100001 101111
 Base 8 (spaced): 01 00 14 51 54 15 41 57
 Base 64 (spaced): B A M p s N h v
@@ -65,7 +65,7 @@ Base 64 (spaced): B A M p s N h v
 So we can encode the word `Hello` as `BAMpsNhv` in base64! Base64 is often used
 to encode images and other binary data to store in JSON. It is not space
 efficient, taking up more space than it should, but it's entirely made of
-printable characters.
+printable characters!
 
 ## Base64 Llamas
 
@@ -83,7 +83,7 @@ echo 'how are you today?' | base64
 ```
 
 Then ask a llama about `aG93IGFyZSB5b3UgdG9kYXk/Cg==` or whatever other string
-you want. You'll notice that they break down after a about 10-20 characters,
+you want. You'll notice that they break down after about 10-20 characters,
 depending on how good the llama is.
 
 
@@ -107,13 +107,10 @@ Encode "emiliko@mami2.moe" into base64.
 This discovery was shocking to me. I thought they were achieving this through
 tool use, but I can cross-verify on localllamas which most certainly don't have
 access to tools. This means our 100-billion scale llamas are learning to be a
-base64 decoder?
+base64 decoder? Of course this is a completely pointless feature, as no llama
+will ever be more energy efficient than a trivially coded base64 tool.
 
-Of course this is a completely pointless feature, as no llama will ever be more
-energy efficient than a trivially coded base64 tool. The Llamas likely picked it
-up while learning on sample code, but the degree to which they picked it up is
-incredible!
-
-This has lead me to wonder, what other completely pointless things are our
-llamas learning? This one was an unindented side effect of learning to code, but
-what other side effects is our data having?
+The Llamas likely picked it up while learning on sample code, but the degree to
+which they picked it up is incredible! This has led me to wonder, what other
+completely pointless things are our llamas learning? This one was an unintended
+side effect of learning to code, but what other side effects is our data having?
diff --git a/src/content/unix/unix-password-management.md b/src/content/unix/unix-password-management.md
new file mode 100644
index 0000000..70b4b28
--- /dev/null
+++ b/src/content/unix/unix-password-management.md
@@ -0,0 +1,363 @@
+---
+title: 'Unix Password Management'
+description: 'Using GPG and Pass for optimal security and ease'
+updateDate: 'March 28 2024'
+---
+
+# Password Management
+
+Passwords are often the main method of digital identification. This means
+anything you don't want others to access but do want yourself to access is
+behind some sort of password. This means we need to optimize on two fronts:
+
+ - Ease of access: Passwords must be quick and easy to access and use
+ - High security: Passwords must be strong to resist attacks
+
+Optimizing for both is more tricky than it seems. Here I will discuss problems
+with existing solutions and present an **offline**, **multi-factor**,
+**easy-to-use**, and **extremely strong** solution to password management. Along
+the way we'll learn a lot about password security in general!
+
+## Optimizing for high-security
+
+A password is pretty pointless if it's not strong enough to resist cracking. Let's
+look over some core security concepts!
+
+### Measuring Bits of Entropy
+
+In the security field, "strength" of a password is measured by the *entropy* of
+the password. You'll often hear that passwords should be "60 bits of entropy" or
+some other number. The higher your bits of entropy, the stronger your
+password. In fact a password with 41 bits of entropy is twice as strong as one
+with 40 bits of entropy. Going from 20 bits of entropy to 30 makes the password
+over 1000x stronger!
+
+To understand how to compute entropy, let's consider an example. Say I make a
+password with the following constraints:
+
+ - It's made entirely of the characters `A`, `B`, and `C`
+ - It's 12 characters long
+ - Each of the 12 characters is chosen completely randomly
+
+An example of such a password is `AAABCCBCAACB`. To calculate the entropy, we
+consider the number of possible passwords we can generate with the above
+constraints. For each character we have 3 possibilities, and there are 12
+characters, so the number of possible passwords is:
+
+```
+3^12 = 531,441
+```
+
+To calculate "bits of entropy", we just need to take the base-2-logarithm of the
+number we just computed, giving us about 19 bits of entropy for the above case:
+
+```
+log2(3^12) = log2(531,441) = 19
+```
+
+This is called "bits of entropy" because `2^19 ~= 531,441`.
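
The calculation above can be sketched in a few lines of Python. This is only an illustration: `entropy_bits` is a name made up here, and it assumes every character is picked uniformly at random, as in the `A`/`B`/`C` example.

```python
import math


def entropy_bits(charset_size: int, length: int) -> float:
    """Bits of entropy of a password whose characters are chosen uniformly at random."""
    # log2(charset_size ** length) is the same as length * log2(charset_size)
    return length * math.log2(charset_size)


# The A/B/C example: 3 possible characters, 12 characters long
possible_passwords = 3 ** 12
bits = entropy_bits(3, 12)
print(f"{possible_passwords:,} possible passwords, about {bits:.2f} bits of entropy")
```

Multiplying the length by `log2(charset_size)` avoids building a huge integer first, which is handy for long passwords.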
+
+For the mathematically inclined, this is different from the information-theory
+concept of entropy, as computers are deterministic systems. Therefore, digital
+security usually assumes pseudo-random numbers are good enough, which is true in
+practice.
+
+A more general formula to remember is:
+
+```
+strength = bits of entropy = log2( #possible_characters ^ password_length )
+```
+
+### Kerckhoffs's Principle
+
+The common idea in digital security is that the attacker knows exactly how
+you're defending your system. Obscurity is not considered to add to security in
+any way. This is a pretty important principle to understand why security seems a
+bit overkill sometimes, but it's a very realistic concept.
+
+By writing this blog, anyone on the internet now knows how I protect my
+passwords. However, since my approach is in line with Kerckhoffs's Principle,
+this isn't a security concern in any respect.
+
+If you'd like to read more about Kerckhoffs's Principle, check out [this
+article](https://nordvpn.com/cybersecurity/glossary/kerckhoffs-principle/) by a
+questionable VPN provider.
+
+### Making Passwords Stronger
+
+Taking into consideration the above, we can now determine what makes a good
+password. Let's take a look at that entropy formula again:
+
+```
+strength = bits of entropy = log2( #possible_characters ^ password_length )
+```
+
+One interesting observation here is how increasing the password length will
+increase the exponent, often making a larger impact on the password strength, as
+compared to using more characters. Consider the following base password:
+
+ - Characters: `a-z`, `A-Z`, and `0-9`
+ - Length: 16
+
+This password has `log2(62^16) = 95` bits of entropy. If we make this password
+17 characters instead we get `log2(62^17) = 101` bits of entropy. However, if we
+now add the `$` character to the possible characters in the password, it still
+has `log2(63^16) = 95` bits of entropy!
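
To see how lopsided that comparison is, here is a small Python sketch of the same three cases. The helper name `entropy_bits` is hypothetical, and the numbers assume uniformly random characters.

```python
import math


def entropy_bits(charset_size: int, length: int) -> float:
    """Bits of entropy for a uniformly random password."""
    return length * math.log2(charset_size)


base = entropy_bits(62, 16)    # a-z, A-Z, 0-9 at 16 characters
longer = entropy_bits(62, 17)  # same character set, one character longer
wider = entropy_bits(63, 16)   # same length, `$` added to the character set

print(f"one extra character: +{longer - base:.2f} bits")
print(f"one extra symbol:    +{wider - base:.2f} bits")
```

The extra character is worth almost 6 bits, while the extra symbol adds well under half a bit.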
+
+In general, you always want to increase the number that's smaller. Since most
+passwords require at least one of each `a-z`, `A-Z`, `0-9`, the character set
+number starts out around 64. However, most passwords are only about 11
+characters long! This means it's almost always beneficial to make a longer
+password, instead of varying up the characters.
+
+Let's take another example of two passwords:
+
+ 1. `balhajisundoubtedlythebesttoyatikea`: length 35, character set 26
+ 2. `S0mE1EEtc*dedP@ssword`: length 21, character set ~72
+
+The first password has `log2(26^35) = 164` bits of entropy, while the second one
+has `log2(72^21) = 129` bits of entropy. The first password is **over 30 billion times
+stronger while being much easier to remember!**
+
+### Making Passwords Easy to Remember
+
+The famous [XKCD comic](https://xkcd.com/936/) comments on how it's actually
+pretty easy to make very strong passwords. All you need to do is think up a
+sentence using real words! As we saw above, length tends to count more for
+password strength, so a long sentence with simple characters will easily
+outclass any sort of complex password.
+
+You could use a password generator that chooses N random words from a list of M
+possible words. This provides `log2(M^N)` bits of entropy. Generally it's quite
+easy to find a list of `10-30k` English words online. Then a password using
+**just 4 words** will have around `54` bits of entropy!
+
+I highly recommend using a password like this, especially if you're going to
+have to type it in on a mobile device often. While they're certainly weaker for
+their length compared to completely randomly generated passwords, the
+convenience is worth the trade-off.
+
+### How Strong Should My Password Be?
+
+There is a very wide spectrum of opinions on this matter. I will provide mine.
+
+Let's start by recognizing that most passwords are stored in one of 128, 256, or
+512 bit hashes. Usually 256, but older systems are often 128. This means you
+often *cannot have a password more secure than 256 bits of entropy*. This is a
+result of the output space being lower dimensional than any password with higher
+entropy, so any "stronger" password would be projected down to only 256 bits of
+entropy.
+
+We can also look at how fast computers can brute-force passwords.
+[Bcrypt](https://en.wikipedia.org/wiki/Bcrypt) is one of the most popular
+hashing choices for passwords. Assuming a company is decently secure, they use
+enough rounds of hashing such that a modern processor takes about 100ms to hash
+a possible password. Rounding up, that means an attacker can try about 1 million
+passwords per day per core. Assuming they have a monstrous 1000 core system,
+they can crack through 29 bits of entropy in one day.
+
+Based on that number, your password should be around 60 bits of entropy for a
+comfortable margin of security. More security-conscious users often target around
+100 bits of entropy instead, to make sure advancements in processor speeds never
+catch up to their passwords.
+
+I would personally aim for making your password 15-20 simple lower-case
+characters long. This provides 70-94 bits of entropy alone, and often shouldn't
+be very difficult to remember!
+
+### Duplicate Passwords
+
+Pretty much everything we've discussed up until this point falls apart if you
+use duplicate passwords. For many many different reasons, a password may become
+compromised. Your network may be hijacked, your computer may have a keylogger
+installed, even just someone recording a video on a phone can easily get your
+password.
+
+Do NOT use duplicate passwords. Do NOT vary passwords by 1-2 characters, create
+completely new distinct passwords every time.
+
+## Password Managers
+
+The current most wide-spread solution to strong but easy-to-use passwords are
+password managers. Some of them, like [KeePassXC](https://keepassxc.org/), are
+actually a good secure solution!
+
+Unfortunately, people usually use a big corporate solution instead. These
+usually store your passwords in the cloud, which is a complete disaster. A
+single leak means all your accounts are immediately compromised; too many eggs in
+one basket.
+
+This isn't even unusual! Lastpass had a [major security
+breach](https://blog.lastpass.com/posts/2022/12/notice-of-recent-security-incident)
+in 2022. As of writing, Lastpass, Dashlane, and 1Password are [all
+compromised](https://www.forbes.com/sites/daveywinder/2023/12/11/android-warning-1password-dashlane-lastpass-and-others-can-leak-passwords/?sh=1c019c497dbf)
+on Android. To me, this is completely unacceptable for something that holds keys
+to all your accounts.
+
+That said, if you make a reasonable choice of password manager, they can be a
+rather no-frills solution for most people. For those looking for top-notch
+security though, it may be worth considering the pass-gpg approach instead.
+
+### Browser-Saved Passwords
+
+A browser is not a password manager. It is a complete joke how easy it is to rip
+out passwords from a browser. This [short python
+script](https://github.com/priyankchheda/chrome_password_grabber) can do it! So
+can [this one](https://github.com/henry-richard7/Browser-password-stealer) and
+[this one](https://github.com/JustYuuto/Yuuto-Stealer)... It's so easy to build
+one, you can do it yourself in under an hour!
+
+To be fair, for most people this is probably fine. Unless malware gets access to
+your computer, it's unlikely to be stolen... but I still wouldn't put my
+recovery email passwords nor my banking information in these.
+
+## 2-Factor Authentication
+
+Almost all services now offer 2-Factor Authentication (2FA). In fact, it's
+increasingly a requirement to sign up for services. Although it seems like a
+hardy security method, it's not a replacement for strong passwords.
+
+2FA can be quite beneficial for people who don't make very strong passwords. At
+least with this method, their password is effectively multiplied by 100000
+possibilities. However, that's only about 16 bits of entropy, which isn't a very
+big increase.
+
+I also personally feel 2FA can be incredibly inconvenient. It's not a given I
+have my phone nearby every time I use my computer. One can only imagine the
+situation where your phone dies and you just can't access your accounts anymore.
+That said, if you're fine with the inconvenience, there's no harm in adding 16
+bits of entropy to your passwords.
+
+### SMS 2FA
+
+This is a completely different game. Unlike 2FA apps which require typing in a
+code and verifying any new device using an existing device, SMS just needs
+access to your phone number. A phone number is controlled by your cellular
+provider, not you. This means their customer service agents can easily reassign
+your phone number to another phone!
+
+When I got my phone number, I actually ended up with SMS verification for the
+previous owner's AirBnB account! This isn't the worst possible account to
+compromise, but this happened by complete accident. Further, identity theft is a
+much more real threat than a hacker with a 1000-core compute cluster. All they
+have to do is deceive a customer service agent over the phone, and your SMS
+number is now theirs.
+
+In short, avoid SMS-based 2FA in favor of apps like [Authy](https://authy.com/)
+that don't create such a huge security gap.
+
+## GPG and Pass
+
+If you truly want strong security, while also having ease of use, we'll use the
+OG method of digital identity verification: GPG keys.
+
+### GPG Keys
+
+A GPG key is a file associated with an email, expiry date, and password. It can
+be used to encrypt and decrypt any binary data. Originally, it was meant as
+a way to send emails that only the receiver will be able to decrypt.
+
+I highly recommend this [amazing GPG
+cheatsheet](https://gist.github.com/johnfedoruk/7f156d844af54cc91324dff4f54b11ce),
+though I'll cover the necessary bare minimum here. It's a bit unfortunate how
+GPG has one of the most confusing interfaces of any command line tool.
+
+To start, create a new key for yourself. Roughly follow the below, replacing
+pieces with your information:
+
+```
+gpg --full-gen-key --expert
+  > (9) ECC and ECC
+  > (1) Curve 25519
+  > 1y                  # Optional, 1 year is recommended
+  > Your Name           # This will be visible in the public key
+  > emiliko@mami2.moe   # This is also visible in the public key
+  > No comment
+  > Put a password on the primary key
+```
+
+It's important you put a *very* strong password on your GPG key, one that **you
+can memorize by heart**. This will be the one and only password you will ever
+need to remember from this point forward, but it must be strong!
+
+GPG keys can be used to sign git commits, encrypt and decrypt messages, but we
+won't cover that here. All we really need is to set it up. You can also view
+your keys at any time using `gpg --list-keys`.
+
+### Pass
+
+[Pass](https://www.passwordstore.org/) is my favourite password manager. I like
+it since it's simple, transparent, and secure. It's so simple in fact, it'd be
+pretty easy to write your own implementation in an hour or two.
+
+All your passwords will be stored in the `~/.password-store` directory for your
+user. You can organize them by directories, just as you organize your file
+system ordinarily. The trick is that `pass` will ensure all these files are
+encrypted with your default GPG key.
+
+Using pass is super simple. To add a password just type it in:
+
+```
+pass insert github/password
+```
+
+You can also watch it being typed in with the echo `-e` flag, which I find quite
+helpful:
+
+```
+pass insert -e github/email
+```
+
+You can list the names of the password files you have, without decrypting
+anything:
+
+```
+pass show          # All of them
+pass show github   # All the ones in the github directory
+```
+
+Finally, you can decrypt passwords. Use the `-c` option to copy the password to
+your clipboard. The clipboard will clear after a few seconds:
+
+```
+pass show github/password      # Actually prints it to the terminal
+pass show -c github/password   # Copies it to the clipboard for a few seconds
+```
+
+Tab completion is well supported. You can also use `pass mv` to rename files
+and `pass rm` to delete them. Really, it's just super simple.
+
+### Why Use Pass and GPG?
+
+There are many reasons to do so:
+
+ - **Locally stored**: These are much harder to obtain than passwords stored on the
+   cloud.
+ - **Easy migration**: You literally `scp -r` your `~/.password-store` directory to
+   migrate them to a new computer.
+ - **3-factor authentication**: An attacker needs your GPG key, your GPG key's
+   password, and your `~/.password-store` directory to perform a successful
+   attack. Missing any of these 3 will prevent any attack from succeeding.
+ - **0-type approach**: You can use very very strong passwords and never need to
+   type them in (they're on your clipboard!)
+ - **Secure scripting**: It's very easy to put passwords securely into shell
+   scripts with `"$(pass show cloudflare/token)"`
+
+The 3-factor method is even more helpful when considering common attacks. For
+example, if a phone camera records you typing on your keyboard to decrypt the
+GPG key, the attacker can't do *anything* with that password alone. They still
+need physical access to your system to grab the files themselves.
+
+An odd benefit of 3-factor authentication is distributing backups. If you
+pick people who know you, but don't know one another, you can safely
+entrust your passwords to them as third parties. This works since an attacker needs all 3 pieces
+to mount an attack, so giving a trusted third party only 1 piece doesn't
+compromise your security.
+
+Malware *could* be both a key logger and grab the files from
+`~/.password-store`, but that is some very sophisticated and targeted
+malware... most of the keylogger ones will just assume the password you typed is
+the actual password to your accounts, which it isn't!
+
+I hope this article was useful in building an understanding of digital security.
+Maybe you'll even consider using GPG and Pass!