Ketarin forum
Adam Piggott

Ketarin not decoding data in HTML

Hi - I'm not sure if this is a bug in Ketarin, .NET, a fault of the web dev of the site, or a bit of none/all of the above.

On the download page of the encryption software VeraCrypt, the link to the download has an HTML-encoded "+" character, which shows up in the page source as "&#43;". Firefox decodes this automagically and the link works, but Ketarin requests the URL as-is.

The offending line in the page is as follows:

<a href="https://launchpad.net/veracrypt/trunk/1.21/&#43;download/VeraCrypt%20Setup%201.21.exe">

Requesting this via Ketarin gets a 404, the same with cURL. I assume the web server is seeing the "#" and assuming it's part of a document anchor. Or something else weird. Either way I'm not sure where the fault lies but I'm erring towards the site being "less incorrect" :-)
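
To illustrate the "#" guess above, here is a minimal Python sketch of how a standard URL parser splits the href when the entity is left undecoded. Whether Ketarin or cURL handle the fragment exactly this way is an assumption; the sketch only shows where the split falls.

```python
from urllib.parse import urlsplit

# The href exactly as it appears in the page source (entity not decoded).
raw_href = (
    "https://launchpad.net/veracrypt/trunk/1.21/"
    "&#43;download/VeraCrypt%20Setup%201.21.exe"
)

parts = urlsplit(raw_href)

# Everything after the first "#" is parsed as the fragment, which a client
# normally keeps to itself rather than sending to the server.
print(parts.path)      # /veracrypt/trunk/1.21/&
print(parts.fragment)  # 43;download/VeraCrypt%20Setup%201.21.exe
```

If the client drops the fragment, the server is only asked for the path ending in "&", which would explain a 404.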

For the moment I've just split the URL variable into two parts but figured I'd report this in case it is a bug in Ketarin.

It's bad form within the URI specification, but it's an edge case, so browsers will usually allow it anyway. This is "HTML-encoded" (uses an ampersand escape), not "URL-encoded" (uses a percent escape). URIs are supposed to be encoded with URL-encoding.
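
As a rough illustration of the two encodings, here is a short Python sketch using only the standard library. It is not what Ketarin does internally; it just shows that the ampersand escape and the percent escape are undone by different steps.

```python
import html
from urllib.parse import unquote

# The href as it appears in the page source.
raw_href = (
    "https://launchpad.net/veracrypt/trunk/1.21/"
    "&#43;download/VeraCrypt%20Setup%201.21.exe"
)

# Step 1: undo the HTML encoding (ampersand escapes), as a browser does
# when it reads the attribute value. Percent escapes are left alone.
url = html.unescape(raw_href)
print(url)
# https://launchpad.net/veracrypt/trunk/1.21/+download/VeraCrypt%20Setup%201.21.exe

# Step 2 (for display only): undo the URL encoding (percent escapes).
file_name = unquote(url).rsplit("/", 1)[-1]
print(file_name)  # VeraCrypt Setup 1.21.exe
```

The URL after step 1 is the one to request; the percent escapes belong in the request and should stay.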

In situations like these, I would recommend you pre-parse the URL by performing a replacement operation on it.

{url:replace:&#43;:+}

Alternatively, you could pass it to multireplace to swap out a series of broken encodings like this (or any other string selections you wanted to replace).
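
For anyone doing the same clean-up outside Ketarin, the idea of swapping out a series of broken encodings can be sketched in Python. This is not Ketarin's multireplace syntax; the `REPLACEMENTS` table and the `fix_encodings` helper are hypothetical names made up for this example.

```python
# Hypothetical replacement table: HTML escapes found in scraped URLs,
# mapped to the characters they stand for. Extend as needed.
REPLACEMENTS = {
    "&#43;": "+",
    "&amp;": "&",
}


def fix_encodings(url: str) -> str:
    """Apply each replacement once, in table order, to a scraped URL."""
    for broken, fixed in REPLACEMENTS.items():
        url = url.replace(broken, fixed)
    return url


scraped = (
    "https://launchpad.net/veracrypt/trunk/1.21/"
    "&#43;download/VeraCrypt%20Setup%201.21.exe"
)
print(fix_encodings(scraped))
# https://launchpad.net/veracrypt/trunk/1.21/+download/VeraCrypt%20Setup%201.21.exe
```

A fixed table like this only covers the escapes you list; a general HTML unescape would catch the rest, at the cost of being less explicit about what gets changed.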
