HTML URL encode
What is URL Encoding?
URL encoding, also known as percent-encoding, is a mechanism for representing characters in a Uniform Resource Identifier (URI) that cannot appear there literally. It is most often needed in the query string or the fragment of a URL.
For instance, in the URL http://example.com/query?q=hello world, the space character in the query value is not safe to include directly in the URL, so it must be encoded.
Characters can be unsafe for a variety of reasons:
- Reserved characters: Certain characters have special meanings in a URL. For instance, the '?' character is used to indicate the start of a query string, and '&' separates query parameters.
- Non-alphanumeric characters: Apart from a small set of unreserved characters (hyphen, period, underscore, and tilde in RFC 3986), most non-alphanumeric characters must be encoded.
- Non-ASCII characters: URLs are transmitted over the internet using the ASCII character set, so non-ASCII characters must be converted into a valid ASCII format.
- Unsafe characters: Some characters may be unsafe to use because they could be used in malicious ways, or they might be misinterpreted by some older systems.
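So, in URL encoding, each unsafe character is replaced with a '%' followed by two hexadecimal digits that give the value of a byte. An ASCII character is a single byte, so '&' (byte 0x26) becomes %26. A non-ASCII character is first converted to its UTF-8 bytes, and each byte is encoded separately, so 'å' becomes %C3%A5.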
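The replacement rule can be sketched in a few lines of JavaScript. This is only an illustration: percentEncodeAll is a hypothetical helper name, and unlike a real encoder it escapes every byte, including characters that are safe to leave alone.

```javascript
// Hypothetical helper: percent-encode every byte of a string's UTF-8 form.
// Real encoders leave safe characters untouched; this sketch does not.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the string
  return Array.from(
    bytes,
    (byte) => '%' + byte.toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

console.log(percentEncodeAll(' ')); // %20
console.log(percentEncodeAll('&')); // %26
console.log(percentEncodeAll('å')); // %C3%A5 (two UTF-8 bytes)
```
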
Let's dive into some examples to get a clearer understanding.
URL Encoding in HTML
HTML forms use URL encoding when submitting data with the HTTP GET method. The encoding process replaces unsafe characters with a "%" followed by two hexadecimal digits. In form submissions a space is replaced by "+"; elsewhere in a URL it is written as "%20".
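The two spellings of the space can be compared with a short JavaScript sketch. It assumes an environment that provides the standard URLSearchParams API (modern browsers and Node.js), which serializes values the way a form submission does; the variable names are illustrative.

```javascript
const value = 'hello world';

// Generic percent-encoding writes the space as %20.
const asComponent = encodeURIComponent(value);
console.log(asComponent); // hello%20world

// Form-style serialization writes the space as '+'.
const asFormField = new URLSearchParams({ q: value }).toString();
console.log(asFormField); // q=hello+world

// Form-style parsing accepts both spellings of the space.
console.log(new URLSearchParams('q=hello+world').get('q'));   // hello world
console.log(new URLSearchParams('q=hello%20world').get('q')); // hello world

// decodeURIComponent does not treat '+' as a space.
console.log(decodeURIComponent('hello+world')); // hello+world
```
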
Let's take a look at how to encode a URL in HTML.
Suppose we have a simple HTML form like this:
<!DOCTYPE html>
<html>
<body>
<h2>HTML Form</h2>
<form action="/action_page.php" method="get">
First name:<br>
<input type="text" name="firstname" value="John&Doe">
<br>
Last name:<br>
<input type="text" name="lastname" value="Doe">
<br><br>
<input type="submit" value="Submit">
</form>
</body>
</html>
When you click the Submit button in the form above, the browser takes the form data and appends it to the URL as query parameters. If the values were appended without any encoding, the URL would look like this:
http://www.example.com/action_page.php?firstname=John&Doe&lastname=Doe
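The above URL is incorrect because the '&' character has a special meaning in a URL: it separates query parameters, so a server would read firstname=John followed by a separate parameter named Doe. Hence the '&' inside the value needs to be encoded, which the browser does automatically. The correctly encoded URL looks like this: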
http://www.example.com/action_page.php?firstname=John%26Doe&lastname=Doe
Here, %26 is the URL encoded form of the '&' character.
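The difference between the two URLs can be checked by parsing their query strings. The sketch below assumes the standard URL and URLSearchParams APIs are available (modern browsers and Node.js); the variable names are illustrative.

```javascript
// Unencoded: the '&' inside the value is read as a parameter separator.
const ambiguous = new URLSearchParams('firstname=John&Doe&lastname=Doe');
console.log([...ambiguous.keys()].join(', ')); // firstname, Doe, lastname
console.log(ambiguous.get('firstname'));       // John

// Encoded: %26 is decoded back to '&' inside the value.
const encoded = new URLSearchParams('firstname=John%26Doe&lastname=Doe');
console.log([...encoded.keys()].join(', ')); // firstname, lastname
console.log(encoded.get('firstname'));       // John&Doe

// Building the URL through the API applies the encoding automatically.
const url = new URL('http://www.example.com/action_page.php');
url.searchParams.set('firstname', 'John&Doe');
url.searchParams.set('lastname', 'Doe');
console.log(url.href);
// http://www.example.com/action_page.php?firstname=John%26Doe&lastname=Doe
```
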
The rest of this article shows how to URL encode and decode programmatically in JavaScript, with further examples. JavaScript provides built-in functions for both operations, and the following sections walk through their usage.
JavaScript's encodeURIComponent function
encodeURIComponent is a JavaScript function that encodes a URI component by replacing each character outside a small set of safe characters with one, two, three, or four escape sequences representing the UTF-8 encoding of the character.
Here is a basic usage example:
let uri = "my test.asp?name=ståle&car=saab";
let encoded = encodeURIComponent(uri);
console.log(encoded);
// Outputs: my%20test.asp%3Fname%3Dst%C3%A5le%26car%3Dsaab
In this example, encodeURIComponent treats the whole string as a single component and encodes its unsafe characters: the space, '?', '=', '&', and the non-ASCII 'å'.
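Because encodeURIComponent also escapes separators such as ':' and '/', it is meant for individual components, not whole URLs. The sketch below contrasts it with the related built-in encodeURI, which leaves URL separators intact; the example URL is an assumption made up for illustration.

```javascript
const fullUrl = 'http://example.com/my test.asp?name=ståle&car=saab';

// encodeURI keeps ':', '/', '?', '=' and '&' so the URL structure survives.
const withEncodeURI = encodeURI(fullUrl);
console.log(withEncodeURI);
// http://example.com/my%20test.asp?name=st%C3%A5le&car=saab

// encodeURIComponent escapes those separators as well.
const withEncodeURIComponent = encodeURIComponent(fullUrl);
console.log(withEncodeURIComponent);
// http%3A%2F%2Fexample.com%2Fmy%20test.asp%3Fname%3Dst%C3%A5le%26car%3Dsaab
```
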
JavaScript's decodeURIComponent function
decodeURIComponent is a JavaScript function that decodes a URI component that was previously created by encodeURIComponent or by a similar routine.
Here is how you can use it:
let uri = "my%20test.asp%3Fname%3Dst%C3%A5le%26car%3Dsaab";
let decoded = decodeURIComponent(uri);
console.log(decoded);
// Outputs: my test.asp?name=ståle&car=saab
In this example, the decodeURIComponent function is used to decode the string back to its original form.
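One detail worth knowing is that decodeURIComponent throws a URIError when its input contains a malformed escape sequence, such as a lone '%'. A minimal sketch of a defensive wrapper follows; safeDecode and its fallback parameter are hypothetical names, not part of JavaScript.

```javascript
// Hypothetical wrapper: return a fallback instead of throwing on bad input.
function safeDecode(text, fallback = null) {
  try {
    return decodeURIComponent(text);
  } catch (err) {
    if (err instanceof URIError) {
      return fallback; // malformed escape sequence, e.g. a lone '%'
    }
    throw err; // anything else is unexpected, so rethrow it
  }
}

console.log(safeDecode('John%26Doe')); // John&Doe
console.log(safeDecode('100%'));       // null
```
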
Encoding Query Strings in JavaScript
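encodeURIComponent escapes every reserved character, including ':', '/', '?', and '&', so applying it to a full URL would destroy the URL's structure. In practice you usually only need to encode the names and values in the query string. Here is a simple function that encodes an object of data into a query string with properly URL encoded name-value pairs: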
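function encodeQueryData(data) {
  // Encode each name and value separately, then join the pairs with '&'.
  const ret = [];
  for (const [name, value] of Object.entries(data)) {
    ret.push(encodeURIComponent(name) + '=' + encodeURIComponent(value));
  }
  return ret.join('&');
}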
let data = {name: "John&Doe", age: 30};
let queryString = encodeQueryData(data);
console.log(queryString); // Outputs: name=John%26Doe&age=30
In the example above, we are only encoding the query parameters and not the entire URL.
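The reverse direction can be sketched in the same style. decodeQueryData is a hypothetical counterpart to encodeQueryData, not a built-in; it assumes spaces were written as %20 (it does not convert '+'), and every value comes back as a string.

```javascript
// Hypothetical inverse of encodeQueryData: parse "a=1&b=2" into an object.
function decodeQueryData(queryString) {
  const result = {};
  if (queryString === '') {
    return result;
  }
  for (const pair of queryString.split('&')) {
    // Split on the first '=' only; a pair without '=' gets an empty value.
    const index = pair.indexOf('=');
    const rawName = index === -1 ? pair : pair.slice(0, index);
    const rawValue = index === -1 ? '' : pair.slice(index + 1);
    result[decodeURIComponent(rawName)] = decodeURIComponent(rawValue);
  }
  return result;
}

console.log(decodeQueryData('name=John%26Doe&age=30'));
// { name: 'John&Doe', age: '30' }
```

Note that age is returned as the string '30', because a query string carries no type information.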
URL encoding preserves the integrity of the data being sent in the URL by encoding any potentially unsafe characters. This is especially important when dealing with data input by users, because it prevents characters such as '&' or '=' in that input from being interpreted as part of the URL's structure.
Summary
URL encoding, also known as percent-encoding, is a technique that allows us to include characters in a URL that would otherwise not be allowed. It works by replacing each unsafe character with a '%' sign and a two-digit hexadecimal number for each byte of the character's UTF-8 encoding, which for ASCII characters is simply the ASCII code.
We have covered how URL encoding works in the context of HTML and JavaScript. In HTML, the browser automatically encodes the form data for us when we submit a form. In JavaScript, we can use the encodeURIComponent and decodeURIComponent functions to manually encode and decode URL components.