Password login implementation and https fundamentals

This article describes in detail the encrypted transmission, encrypted storage, and setting of cookies used in the implementation of password login.

1 Encrypted transmission

1.1 http (hypertext transfer protocol) clear text transfer

Disadvantages of http:

  • Use of plaintext for communications is subject to eavesdropping
  • Does not authenticate the identity of the communicating party
  • No proof of message integrity, may have been tampered with (man-in-the-middle attack, MITM)

A man-in-the-middle attack (https://baike.baidu.com/item/%E4%B8%AD%E9%97%B4%E4%BA%BA%E6%94%BB%E5%87%BB) (MITM attack) is an “indirect” intrusion: by various technical means, the intruder virtually places a computer under their control between two communicating computers on a network connection. That computer is called the “man in the middle”. Attack methods include DNS spoofing (by compromising DNS servers, controlling routers, etc.) and unreliable proxy servers.

1.2 Symmetric encryption

Since plaintext transmission is not secure, we can encrypt the traffic with symmetric encryption. Symmetric encryption means using the same key S to both encrypt and decrypt.

However, if the key is fixed, anyone who obtains it after a leak can decrypt the traffic. Therefore a random key needs to be generated for each session. But how can the two sides securely tell each other the key? If the server sends the key to the client directly, the key can be intercepted in transit, so the exchange is not secure.
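
As a minimal sketch of the idea that one key S both encrypts and decrypts, the snippet below uses the Node.js built-in crypto module with AES-256-GCM. The cipher choice, the sample message, and the helper names encrypt and decrypt are assumptions made for illustration; the article does not prescribe a specific algorithm.

```javascript
const crypto = require('crypto')

// Shared secret key S: both sides must hold the same 32 random bytes.
const keyS = crypto.randomBytes(32)

// Encrypt a string with key S; a fresh random nonce is used for every message.
function encrypt(plaintext, key) {
  const iv = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv)
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()])
  return { iv, ciphertext, tag: cipher.getAuthTag() }
}

// Decrypt with the same key S; final() throws if the key or the data is wrong.
function decrypt({ iv, ciphertext, tag }, key) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv)
  decipher.setAuthTag(tag)
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8')
}

const message = encrypt('username=alice', keyS)
const recovered = decrypt(message, keyS)
```

Anyone holding keyS can read the message, which is exactly why the key exchange discussed next matters.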

1.3 Asymmetric encryption + symmetric encryption

How can the negotiation itself be protected? Cryptography offers a class of algorithms called “asymmetric encryption”.

Its defining property is that ciphertext encrypted with the private key can be decrypted with the public key, while ciphertext encrypted with the public key can only be decrypted with the private key. Only the owner holds the private key, while the public key can be given to everyone.

The client encrypts the key S with the public key and sends it to the server. The server decrypts it with the private key to obtain S. In this way, the client can safely inform the server of the key, and asymmetric encryption secures the negotiation of the symmetric key.
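
The exchange can be sketched with the Node.js built-in crypto module. RSA with a 2048-bit modulus is an assumption chosen for illustration, and both "sides" run in one process here; real TLS handshakes use more elaborate key agreement.

```javascript
const crypto = require('crypto')

// Server side: generate an RSA key pair; only the public key is handed out.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 })

// Client side: pick a random symmetric key S and encrypt it with the server's public key.
const keyS = crypto.randomBytes(32)
const encryptedKey = crypto.publicEncrypt(publicKey, keyS)

// Server side: recover S with the private key, which never leaves the server.
const serverKeyS = crypto.privateDecrypt(privateKey, encryptedKey)
```

An eavesdropper who sees only encryptedKey and publicKey cannot recover S without the private key.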

Why not use asymmetric encryption for everything, instead of only for negotiating the symmetric key? Because the public key is known to everyone: anything the server encrypts with its private key can be decrypted by a middleman using the public key. In other words, asymmetric encryption alone only guarantees that the client can send information to the server securely, not that the server can send information to the client securely.

But another question arises: how does the client get the public key? If the server sends the public key directly to the client, a middleman may swap it for their own.

But it’s also impractical to have every browser on every client save the public keys of all websites by default.

1.4 Digital Certificates and Digital Signatures

1.4.1 Digital certificates

Therefore, we use a third-party organization (CA, Certificate Authority) to issue digital certificates.

The CA encrypts our public key with its own private key to produce a certificate, which is then delivered to the client. The client decrypts it with the CA’s public key. In this way, the browser only needs to ship with the CA’s public key (in practice, the public keys of a set of trusted organizations).

However, a CA does not issue certificates only to your company; it may also issue one to a party with bad intentions, such as a middleman. The middleman then has the opportunity to swap your certificate for theirs, and the client cannot tell whether it received your certificate or the middleman’s, because both can be decrypted with the CA’s public key.

So what does a client have to do to properly identify the other side?

1.4.2 Digital signatures

The client can verify the digital signature on the certificate to identify where it came from. To create the signature, the CA first uses a hash function to generate a digest of the certificate content, then encrypts the digest with its private key to produce the “digital signature”.

After the client gets the certificate, it decrypts the digital signature with the CA public key and computes a digest of the certificate itself, using the method named on the certificate. If the computed digest matches the digest obtained by decrypting the signature, the certificate is genuine and has not been tampered with. The client then checks the website information on the certificate: if it matches the website currently being visited, the certificate comes from the right source and the identity of the other party is confirmed.
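
The sign-then-verify flow can be sketched with the Node.js built-in crypto module. The CA key pair, the JSON "certificate" layout, and the helper name isAuthentic are hypothetical simplifications; real certificates follow the X.509 format.

```javascript
const crypto = require('crypto')

// Hypothetical CA key pair, generated here only for the demo.
const ca = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 })

// Simplified certificate body: the site name plus a placeholder for the site's public key.
const certificate = JSON.stringify({ subject: 'example.com', publicKey: 'site-public-key' })

// CA side: hash the certificate with SHA-256 and sign the digest with the CA private key.
const signature = crypto.sign('sha256', Buffer.from(certificate), ca.privateKey)

// Client side: check the signature with the CA public key, then compare the site name.
function isAuthentic(cert, sig, expectedHost) {
  const untampered = crypto.verify('sha256', Buffer.from(cert), ca.publicKey, sig)
  return untampered && JSON.parse(cert).subject === expectedHost
}
```

Changing even one character of the certificate makes the signature check fail, and a valid certificate issued for a different site fails the host comparison.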

1.5 https (Hypertext Transfer Protocol Secure)

https = http + TLS/SSL. HTTP is the application layer protocol and TCP is the transport layer protocol; TLS, the successor to SSL (Secure Sockets Layer), is added between the two. The sections above describe how the client and server can securely negotiate a symmetric key, and this is mainly what the TLS protocol in HTTPS does.

Seems flawless? Not quite. If the middleman uses a certificate issued by a CA, the URL recorded in that certificate will not match the URL you are visiting, which means the certificate may have been used fraudulently, and the browser will issue a warning. However, if the user clicks through and continues browsing the site, the attack still succeeds.

And if the middleman uses a forged certificate of their own, the same warning is issued, and if the user chooses to trust the certificate the attack still succeeds. So the attacker only has to trick the user into installing the forged certificate, for example through phishing or other disreputable websites.

1.6 https + one-way encrypted passwords

As you can see, https is quite secure, but on its own it is not secure enough for transmitting passwords. So, in addition to https, we apply a one-way algorithm (a hash) to the password before transmitting it.

2 One-way encryption

2.1 Why encrypted storage?

Never store passwords in plaintext! If passwords are stored in plaintext (whether in a database or in logs), then once the data leaks, every user’s password is fully exposed to attackers, and all the effort spent on encrypting the transmission becomes meaningless. The earlier news story “GitHub inadvertently recorded some plaintext passwords in internal logs” is a lesson here.

2.2 Hash algorithm encrypted passwords

Commonly used hash algorithms are MD5 and the SHA family (e.g. SHA1, SHA256, SHA384, SHA512).

md5('truepassword')

But this is easy to crack. Attackers precompute the hash values of a large number of passwords under various hash algorithms and store each password with its hash in a table (often called a rainbow table). Cracking a password then only requires looking up the leaked hash in the prepared table.
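
The lookup can be sketched as follows. This is a plain precomputed lookup table rather than a true rainbow table (which compresses the table with hash chains), and the password list is a tiny hypothetical sample; md5 is defined here through the Node.js built-in crypto module.

```javascript
const crypto = require('crypto')

const md5 = (text) => crypto.createHash('md5').update(text).digest('hex')

// Attacker side: precompute hashes for a (tiny, hypothetical) list of common passwords.
const commonPasswords = ['123456', 'password', 'qwerty', 'truepassword']
const lookup = new Map(commonPasswords.map((p) => [md5(p), p]))

// A leaked unsalted hash is reversed with a single table lookup.
const leakedHash = md5('truepassword')
const cracked = lookup.get(leakedHash)
```

The expensive work happens once, ahead of time; each leaked hash afterwards costs only one lookup.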

2.3 Adding “salt” to improve security

A salt is a random string; salting a plaintext password means concatenating the password with that random string. We first salt the plaintext password, then hash the salted password.

// demo only: a real salt should come from a cryptographically secure random source
const salt = Math.round(Math.random() * 10000).toString()
const encrypted = md5(`truepassword@${salt}`) // separated by @

Although salting is effective against rainbow table attacks, it still does not offer a high level of security, because computing a hash is extremely fast: an attacker can still crack the password by exhaustive search, it just takes somewhat longer.
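
To make the salting step concrete, here is a minimal sketch in which the same password stored for two users produces two different hashes, so one precomputed table no longer covers both. It uses the Node.js built-in crypto module; the helper names saltedHash and matches are hypothetical, and MD5 is kept only to mirror the example above.

```javascript
const crypto = require('crypto')

const md5 = (text) => crypto.createHash('md5').update(text).digest('hex')

// Hash a password with a fresh random salt; the salt is stored next to the hash.
function saltedHash(password, salt = crypto.randomBytes(16).toString('hex')) {
  return { salt, hash: md5(`${password}@${salt}`) }
}

const userA = saltedHash('truepassword')
const userB = saltedHash('truepassword')

// Verification re-hashes the candidate with the stored salt and compares.
const matches = (candidate, record) => saltedHash(candidate, record.salt).hash === record.hash
```

Because the salt is stored alongside the hash, an attacker who steals the table can still test guesses one by one, which is the weakness the next section addresses.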

2.4 Increase cracking difficulty with BCrypt or PBKDF2

The defining feature of these two algorithms is that the number of repeated computations can be set through a parameter: the more repetitions, the longer each hash takes. If computing one hash value takes 1 second or more, brute-force cracking stops being practical: even a 6-digit numeric password has 10^6 candidates, so trying them all takes about 10^6 seconds, or roughly 11.5 days, let alone a stronger password. The price of this security is performance, since each computation takes far longer than a plain salted hash. Here is an example using bcryptjs:

const bcrypt = require('bcryptjs')
const salt = bcrypt.genSaltSync(10) // rounds sets the cost (work factor) of the hash
const hash = bcrypt.hashSync('truepassword', salt)

A bcrypt hash string looks like $2a$10$asdjflkaydgigadfahgl.asdfaoygoqhgasldhf, where $ is only a separator; 2a is the bcrypt version identifier; 10 is the cost value; the next 22 characters are the salt; and the rest of the string is the password hash.
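
The layout can be checked with plain string handling, as in the sketch below. The helper name parseBcrypt is hypothetical and the sample string is only an illustrative value in the standard layout, where the 22 salt characters are followed by 31 hash characters.

```javascript
// Split a bcrypt hash string into its fields (pure string handling, no library needed).
function parseBcrypt(hashString) {
  // The string starts with "$", so the first element of the split is empty.
  const [, version, cost, rest] = hashString.split('$')
  return {
    version,                 // version identifier, e.g. "2a"
    cost: Number(cost),      // cost value
    salt: rest.slice(0, 22), // first 22 characters: the encoded salt
    hash: rest.slice(22),    // remaining characters: the password hash
  }
}

const parts = parseBcrypt('$2a$12$R9h/cIPz0gi.URNNX3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW')
```

Since the salt and cost travel inside the string, a single database column is enough to store everything needed for later verification.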

3 Setting Passwords and Verifying Passwords

The above covers the encrypted transmission and encrypted storage used when setting a password. To summarize, the process is as follows:

  1. use the https protocol
  2. hash the password (one-way) with bcryptjs before transmitting it
  3. store the hashed password in the database

When verifying a password, the submitted password is hashed in the same way (with the stored salt) and the result is compared with the hash from the database. When verification passes, a login state needs to be generated and written to a cookie. The next section covers this in detail.
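
The set-then-verify flow can be sketched without third-party packages by using PBKDF2 from the Node.js built-in crypto module, swapped in here for bcryptjs so the example is self-contained. The iteration count, the record layout, and the function names are assumptions, not values taken from this article.

```javascript
const crypto = require('crypto')

const ITERATIONS = 100000 // assumed work factor; tune it to your hardware

// Called when the user sets a password: returns the record to store in the database.
function hashPassword(password) {
  const salt = crypto.randomBytes(16)
  const hash = crypto.pbkdf2Sync(password, salt, ITERATIONS, 32, 'sha256')
  return { salt: salt.toString('hex'), hash: hash.toString('hex'), iterations: ITERATIONS }
}

// Called at login: recompute with the stored salt and compare in constant time.
function verifyPassword(password, record) {
  const expected = Buffer.from(record.hash, 'hex')
  const salt = Buffer.from(record.salt, 'hex')
  const actual = crypto.pbkdf2Sync(password, salt, record.iterations, expected.length, 'sha256')
  return crypto.timingSafeEqual(actual, expected)
}

const stored = hashPassword('truepassword')
```

With bcryptjs the same check is a single call, bcrypt.compareSync(password, hash), because the salt and cost are embedded in the stored hash string.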

4 Setting Cookies

4.1 Why Set Cookies

Since the http protocol itself is stateless, it has no concept of a “login state”; the application must implement it.

So, how do you recognize a user’s login state?

A session is a series of related http requests. After login, the server sets a sessionid (often simply the userid) in a cookie. Every request in the session carries this cookie, so the server can recognize the user through it.

Storing only a user id in the cookie and looking up user information from it is safer than storing all user information directly in the cookie.

4.2 Generating a sessionid

Why not just put the user id in a cookie? Because cookies are carried in the HTTP headers and can be read by intermediaries, sensitive information should not be transmitted via cookies. With a plaintext user id, an attacker can easily guess how user ids are generated (usually sequential numbers) and then impersonate other users.

Therefore, generate the sessionid with an algorithm that guarantees uniqueness and randomness (e.g., uuid), store the mapping between sessionid and userid in a store such as redis, and set an expiration time. When the client sends another request, the server looks up the userid by sessionid, then queries redis or memcached by userid to get the user information. One userid can map to several sessionids, which lets the user stay logged in on several devices at the same time.
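
A minimal sketch of this mapping is shown below, with an in-memory Map standing in for redis so the example runs on its own. The 30-minute lifetime and the function names are assumptions; crypto.randomUUID comes from the Node.js built-in crypto module.

```javascript
const crypto = require('crypto')

// In-memory stand-in for redis: sessionId -> { userId, expiresAt }.
const sessions = new Map()
const SESSION_TTL_MS = 30 * 60 * 1000 // assumed 30-minute lifetime

// Create a new session for a user; each call yields a separate sessionid.
function createSession(userId, now = Date.now()) {
  const sessionId = crypto.randomUUID() // random version 4 UUID
  sessions.set(sessionId, { userId, expiresAt: now + SESSION_TTL_MS })
  return sessionId
}

// Resolve a sessionid to a userid, or null if unknown or expired.
function getUserId(sessionId, now = Date.now()) {
  const session = sessions.get(sessionId)
  if (!session) return null
  if (session.expiresAt <= now) {
    sessions.delete(sessionId) // expired: drop the entry
    return null
  }
  return session.userId
}

// Deleting the entry logs that one device out immediately.
const destroySession = (sessionId) => sessions.delete(sessionId)
```

Because the server owns the mapping, a stolen sessionid can be invalidated at any time by deleting its entry.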

Alternatively, use a symmetric encryption algorithm with the key kept on the server: encrypt the userid together with a timestamp to form the sessionid, and on each request decrypt it to recover the userid and timestamp and check the timestamp. The drawback of this scheme is that once a sessionid is stolen, it cannot be revoked before it expires.
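
The stateless variant can be sketched as follows, again with the Node.js built-in crypto module. AES-256-GCM, the hex token layout (nonce, then authentication tag, then ciphertext), the validity period, and the function names are all assumptions made for illustration.

```javascript
const crypto = require('crypto')

const serverKey = crypto.randomBytes(32) // kept only on the server
const MAX_AGE_MS = 30 * 60 * 1000        // assumed validity period

// Encrypt the userid together with the issue time to form the sessionid.
function issueToken(userId, now = Date.now()) {
  const iv = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', serverKey, iv)
  const payload = JSON.stringify({ userId, issuedAt: now })
  const body = Buffer.concat([cipher.update(payload, 'utf8'), cipher.final()])
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString('hex')
}

// Decrypt the sessionid, then check the timestamp; returns the userid or null.
function readToken(token, now = Date.now()) {
  try {
    const raw = Buffer.from(token, 'hex')
    const decipher = crypto.createDecipheriv('aes-256-gcm', serverKey, raw.subarray(0, 12))
    decipher.setAuthTag(raw.subarray(12, 28))
    const json = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString('utf8')
    const { userId, issuedAt } = JSON.parse(json)
    return now - issuedAt <= MAX_AGE_MS ? userId : null
  } catch (err) {
    return null // tampered or malformed token
  }
}
```

Nothing is stored on the server, which is why a stolen token stays valid until its timestamp expires.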

  • Time of expiration (Expires) or period of validity (Max-Age)

Persistent cookies can specify a specific expiration time (Expires) or validity period (Max-Age).

  • Secure and HttpOnly

Cookies marked with the Secure flag are only sent to the server over requests encrypted with HTTPS. However, even with the Secure flag set, sensitive information should not be transmitted via cookies, because cookies are inherently insecure and the Secure flag does not provide real protection. Starting with Chrome 52 and Firefox 52, insecure sites (http:) cannot set cookies with the Secure flag. To mitigate cross-site scripting (XSS) attacks, cookies with the HttpOnly flag cannot be accessed through JavaScript’s Document.cookie API and are only sent to the server.

  • Domain and Path

The Domain and Path identifiers define the scope of the cookie: i.e., which URLs the cookie should be sent to. The Domain identifier specifies which hosts can accept cookies; if not specified, it defaults to the host of the current document (without subdomains). If Domain is specified, subdomains are generally included. The Path identifier specifies which paths under the host can accept cookies (the URL path must be present in the request URL). With the character %x2F (“/”) as the path separator, subpaths are also matched.

  • SameSite

The SameSite attribute lets a server request that a cookie not be sent with cross-site requests, which helps prevent cross-site request forgery (CSRF) attacks. SameSite was experimental when it was introduced, but current versions of the major browsers support it.
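
The attributes above can be combined into a Set-Cookie header value as in the following sketch. The helper name buildSetCookie, its option names, and the sample values are hypothetical; only the attribute names themselves come from the cookie specification.

```javascript
// Build a Set-Cookie header value from the attributes discussed above.
function buildSetCookie(name, value, options = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`]
  if (options.maxAge !== undefined) parts.push(`Max-Age=${options.maxAge}`) // lifetime in seconds
  if (options.domain) parts.push(`Domain=${options.domain}`)
  if (options.path) parts.push(`Path=${options.path}`)
  if (options.secure) parts.push('Secure')     // HTTPS only
  if (options.httpOnly) parts.push('HttpOnly') // hidden from document.cookie
  if (options.sameSite) parts.push(`SameSite=${options.sameSite}`)
  return parts.join('; ')
}

const header = buildSetCookie('sessionid', 'abc123', {
  maxAge: 1800, path: '/', secure: true, httpOnly: true, sameSite: 'Lax',
})
// header is "sessionid=abc123; Max-Age=1800; Path=/; Secure; HttpOnly; SameSite=Lax"
```

In a Node.js http handler the value would be sent with res.setHeader('Set-Cookie', header).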

See HTTP Cookies for details.

This post is licensed under CC BY 4.0 by the author.