What is a JWT for?
To understand the point of JWTs, you have to first understand the problem they are trying to solve. Imagine you are a server, patiently waiting in a rack, in a data center. And you've received a message that says "show me the home page". Or, to speak HTTP, a "GET /index" request.
You generate the home page and send it back. So far so good.
Now imagine that on the home page, you've been instructed by your programmer overlords to display the first name of the person who requested the page, if they are logged in.
Or imagine that the person requests a page such as /secret, and you know this page is for employees only, or reserved for those who have subscribed. How do you handle that, from the server's point of view? To state things differently: how, when the server receives a message, does it know who sent it?
Sure, at some point, the user logged in. He or she entered an email and a password, so we (the server) knew then who it was. When the server receives a message on /login with an email and a password, it checks in the database that the email and the password match, and returns an error if they don't.
But let's imagine that this message to /login was sent, say, 10 minutes ago. Let's imagine that there are hundreds of people, each with different rights, who are also logging in. How does the server know that *this* message (or request) is from that user?
This is where request headers come in. We can put information in them. We could add the user's email to identify them. But that would be trivial to spoof. We could add the password, but that would imply two things:
- First, the database would have to be checked each time a request is received, which is costly
- Secondly, the password would need to be sent in each and every message, which is risky.
Clearly, this is not ideal.
A first solution
The solution that has been used for a long time is to create a kind of temporary password (a secret) that the users receive as a response when they log in. The server remembers the link between this secret and that user.
And when the browser talks to the server, it sends this secret (via the request headers). Then the server, when it receives the message, checks in memory if the secret corresponds to a user. This secret is what is called the "session" identifier. All this works very well when you have a single server.
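As a minimal sketch of this session mechanism (not a production implementation), the server can keep a dictionary in memory that links each secret to a user. The names `sessions`, `log_in` and `user_for` are hypothetical, chosen for illustration, and the sketch assumes the password has already been checked against the database.

```python
import secrets
from typing import Dict, Optional

# Hypothetical in-memory store: session identifier -> user email.
# It lives in this one server's memory, which is why the approach
# only works as-is with a single server.
sessions: Dict[str, str] = {}


def log_in(email: str) -> str:
    """Create a session for a user whose password was already verified."""
    session_id = secrets.token_urlsafe(32)  # random, hard-to-guess secret
    sessions[session_id] = email
    return session_id


def user_for(session_id: str) -> Optional[str]:
    """Return the user linked to this session identifier, or None if unknown."""
    return sessions.get(session_id)
```

The browser would send the value returned by `log_in` back with every request, and the server would call `user_for` to find out who is talking to it.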
But imagine that you're managing an application with hundreds of servers. And in front of these servers, there is a load balancer, which sends each message to whichever server is least busy. So when a message arrives, there is little chance that it will land on the same server that handled the login.
So we can't rely on the server keeping the secrets in memory.
We can create a central database, possibly in memory. But it would be much better if we could make our servers "stateless". That is to say, able to process requests without having to keep a state in memory between requests. If we could do that, it would allow much easier scalability. Because there would be no need to communicate a shared state between servers.
And this is where the JWT comes in.
Structure of a JWT
A JWT looks like this gibberish hodgepodge of letters and numbers: eyJhbGciOiJIUzI1NiJ9.eyJuYW1lIjoiSm9lIENvZGVyIn0.5dlp7GmziL2QS06sZgK4mtaqv0_xX4oFUuTDh1zHK4U
As you can see if you look carefully, there are three blocks of text separated by dots. These blocks are the three parts of the JWT, namely: the header, the body, and the signature.
Each of the parts is encoded in Base64url, a URL-safe variant of Base64. Not for security purposes (anyone can decode it), but to allow the token to transit peacefully via an HTTP message. Let's have a look at these three parts:
1/ The header is a JSON document that specifies the algorithm used for the signature, in the "alg" field (short for "algorithm").
Here, in this example, HS256 indicates that the token is signed using HMAC with SHA-256, an algorithm that relies on a single secret key.
2/ The second part, the body, contains the claims. Here too it is a JSON document. This is where the token states (for example) which user the token belongs to, its id, and its rights. We can (and, if we want to do this right, we must) specify the creation and expiration dates of the token.
The JWT standard specifies the name of those fields. For example, the "exp" field specifies the timestamp of expiration. (A timestamp is a date specified as the number of seconds since the Unix epoch, that is to say since January 1st, 1970, UTC). So the body (before Base64url encoding) might look something like this:
{ "id": 42, "name": "Kodaps", "role": "admin", "exp": 1647807974 }
This is the part that stores the state and ultimately allows the server to work in a stateless way.
3/ Finally comes the **cryptographic signature**. This is the part that allows us to validate the token. The idea is for the token to be signed with a secret that only the server knows.
This way, if the signature matches the one that I (the server) recompute from the header, the body and my secret, I know that I can trust this token and its claims. Without needing to consult a database.
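The three parts can be illustrated with a short sketch that uses only Python's standard library. It assumes the HS256 algorithm from the example, and the helper names (`b64url_encode`, `b64url_decode`, `sign`, `verify`) are hypothetical. It is meant to show the idea, not to replace a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_encode(raw: bytes) -> str:
    # JWTs use URL-safe Base64 without the trailing "=" padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(part: str) -> bytes:
    # Restore the padding that was stripped before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def sign(header: dict, claims: dict, secret: bytes) -> str:
    """Build a token: base64url(header).base64url(claims).base64url(signature)."""
    parts = [
        b64url_encode(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
        for obj in (header, claims)
    ]
    signing_input = ".".join(parts).encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return ".".join(parts + [b64url_encode(signature)])


def verify(token: str, secret: bytes, now=None):
    """Return the claims if the token is valid and not expired, else None."""
    try:
        header_b64, body_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(body_b64))
        signature = b64url_decode(signature_b64)
    except ValueError:
        return None  # wrong number of parts, bad Base64 or bad JSON
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # this sketch only accepts the algorithm it expects
    # Recompute the signature over the first two parts with our own secret.
    signing_input = f"{header_b64}.{body_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return None  # forged or altered token
    now = time.time() if now is None else now
    if "exp" in claims and now >= claims["exp"]:
        return None  # token has expired
    return claims
```

Decoding the first two parts of the example token above with `b64url_decode` gives {"alg":"HS256"} and {"name":"Joe Coder"}, which shows that the content of a token is readable by anyone. Only the signature check in `verify` tells the server whether the claims can be trusted, and it does so without any database lookup.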
JWT's first advantage, then, is that it allows the servers to be "stateless", i.e. without needing to store any state in their own memory.
The second advantage is that JWTs can be used in the context of mobile applications, where cookies or sessions make less sense. Or in the context of communication with an API, where in the same way, sessions don't really exist.
A third advantage is that using an asymmetric key pair, a token signed with a private key can be validated by anyone who has the public key.
So this means that a central body (one which we know to be trustworthy) can issue a verifiable certificate, which states (for example): "This user has these characteristics. And you can verify with the public key that I published that their JWT is valid."
And now it's not only the emitting server that can validate the certificate but anyone who has access to the public key.
A JWT is an object with a cryptographic signature, which encapsulates a state expressed as "claims". Depending on the algorithm used, the token can be validated either by the issuing server alone (with a shared secret) or by anyone who holds the public key. This is useful in the case of mobile applications, API calls, or stateless servers.
And using a private/public key pair gives us the means to set up trusted third-party mechanisms.
JWTs are not always preferable to sessions (they do have some drawbacks such as adding to the message size) but in the right situation, they can be very useful.