Ketarin forum

[SOLVED] Get binaries from HTML-Redirects like Sourceforge easily!


Gozi


Hi all,

 

I finally found an easy way to download files from redirecting sites like SourceForge.

 

Requirement

Get curl for Win32:

http://curl.haxx.se/download.html

 

There are two versions, one with SSL and one without.

 

The manpage

http://curl.haxx.se/docs/manpage.html

 

Demonstration

- Go to SourceForge and open a project, for example UNetbootin: http://sourceforge.net/projects/unetbootin/

- Now click "Download Now!"

- You'll see the page for choosing a mirror, along with the direct link. You know and love this page!

- Just copy the URL of that page: http://sourceforge.net/projects/unetbootin/files/UNetbootin/494/unetbootin-windows-494.exe/download

 

Open cmd.exe (the DOS box) and test this:

curl -L -o myfilename.exe "http://sourceforge.net/projects/unetbootin/files/UNetbootin/494/unetbootin-windows-494.exe/download"

You'll see curl fetching the HTML and waiting for the binary. After a few seconds the download starts, just as it does in IE or Firefox, and you've successfully downloaded the binary.
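For anyone who wants to see the same idea outside of curl, here is a minimal Python sketch of what `curl -L -o` does: request a URL, follow any HTTP redirects, and write the final response body to a file. The function name `download` is my own invention, and whether SourceForge actually hands out the binary this way depends on how the server treats the client, so treat it as an illustration only.

```python
import shutil
import urllib.request


def download(url, dest):
    """Fetch url, following HTTP redirects, and save the body to dest."""
    # urlopen follows HTTP 3xx redirects by default, roughly like curl -L.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        # Stream the body to disk, like curl -o.
        shutil.copyfileobj(response, out)
    return dest
```

Calling `download("<the /download URL from above>", "myfilename.exe")` would be the rough equivalent of the curl command line shown earlier.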

 

That's all!

 

You just need to define the filename and get the version, and then you can use Ketarin's commands to fetch your files.

 

Take a look at the man page. curl supports FTP, user agents, referers, resume, AND COOKIES!

 

Isn't that great?

 

A Quick Example:

<?xml version='1.0' encoding='utf-8'?>
<Jobs>
 <ApplicationJob xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Guid="5b2368fd-4ed1-4a73-855b-c271bed46a45">
   <SourceTemplate><![CDATA[]]></SourceTemplate>
   <WebsiteUrl />
   <UserAgent>Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)</UserAgent>
   <UserNotes />
   <LastFileSize>152930</LastFileSize>
   <LastFileDate>2010-10-13T22:35:01+02:00</LastFileDate>
   <IgnoreFileInformation>false</IgnoreFileInformation>
   <DownloadBeta>Default</DownloadBeta>
   <DownloadDate xsi:nil="true" />
   <CheckForUpdatesOnly>false</CheckForUpdatesOnly>
   <VariableChangeIndicator>version</VariableChangeIndicator>
   <CanBeShared>true</CanBeShared>
   <ShareApplication>false</ShareApplication>
   <ExclusiveDownload>false</ExclusiveDownload>
   <HttpReferer />
   <SetupInstructions />
   <Variables>
     <item>
       <key>
         <string>SFlatest</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>RegularExpression</VariableType>
           <Regex />
           <Url>http://sourceforge.net/projects/unetbootin/files/</Url>
           <Name>SFlatest</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>Match1</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>RegularExpression</VariableType>
           <Regex>dload filename { url: '(.*?)\.exe</Regex>
           <Url>http://sourceforge.net/projects/unetbootin/files/</Url>
           <Name>Match1</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>version</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <Url>http://sourceforge.net/projects/unetbootin/files/</Url>
           <TextualContent>{Match1:split:-:-1}</TextualContent>
           <Name>version</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>url</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <TextualContent>{Match1}.exe</TextualContent>
           <Name>url</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>ForTesting</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <TextualContent>df</TextualContent>
           <Name>ForTesting</Name>
         </UrlVariable>
       </value>
     </item>
   </Variables>
   <ExecuteCommand />
   <ExecutePreCommand>curl -L -o "{2DIR}\UNetbootin-{version}.exe" "{url}"
</ExecutePreCommand>
   <ExecuteCommandType>Batch</ExecuteCommandType>
   <ExecutePreCommandType>Batch</ExecutePreCommandType>
   <Category> Tools</Category>
   <SourceType>FixedUrl</SourceType>
   <PreviousLocation>q:\software\UNetbootin_494.jpg</PreviousLocation>
   <DeletePreviousFile>true</DeletePreviousFile>
   <Enabled>true</Enabled>
   <FileHippoId />
   <LastUpdated>2010-10-13T22:57:29.7176+02:00</LastUpdated>
   <TargetPath>{2DIR}\{appname}_{version}.{url:ext}</TargetPath>
   <FixedDownloadUrl>http://www.wetterzentrale.de/pics/D2u.jpg</FixedDownloadUrl>
   <Name>UNetbootin</Name>
 </ApplicationJob>
</Jobs>
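To make the variable chain in this job easier to follow, here is a rough Python sketch of how Match1, version, url and the pre-update command resolve. Several things are assumptions on my part: the sample page fragment is made up to fit the regex (the real SourceForge markup may differ), `{Match1:split:-:-1}` is taken to mean "split on - and keep the last piece", and `q:\software` merely stands in for the custom `{2DIR}` variable.

```python
import re

# Hypothetical page fragment; the real SourceForge markup may differ.
SAMPLE_PAGE = (
    "dload filename { url: 'http://sourceforge.net/projects/unetbootin/"
    "files/UNetbootin/494/unetbootin-windows-494.exe/download' }"
)

# The job's Match1 regex, with the brace and the dot escaped for Python.
MATCH1_REGEX = r"dload filename \{ url: '(.*?)\.exe"


def resolve_variables(page, target_dir):
    """Mimic how Match1, version, url and the pre-command are filled in."""
    match1 = re.search(MATCH1_REGEX, page).group(1)
    # Assumed meaning of {Match1:split:-:-1}: split on "-", keep the last piece.
    version = match1.split("-")[-1]
    url = match1 + ".exe"  # corresponds to {Match1}.exe
    command = 'curl -L -o "{}\\UNetbootin-{}.exe" "{}"'.format(
        target_dir, version, url
    )
    return {"Match1": match1, "version": version, "url": url, "command": command}


# target_dir is a stand-in for {2DIR}, which is not defined in the job itself.
resolved = resolve_variables(SAMPLE_PAGE, r"q:\software")
print(resolved["version"])
print(resolved["command"])
```

With the sample fragment this prints 494 and a curl command that writes UNetbootin-494.exe into the stand-in directory, which is what the pre-update command in the job hands to curl.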

 

Open the log to see what's happening.

 

Remember:

Ketarin always needs a link to initialize the download, so I used a dummy link to a JPG that changes regularly.

 

@Flo:

This could be the next request: the pre-update command should always be executed if an update is available (checking {version}), even if the URL is empty.

OR

Ketarin should have a "on update" command ;-)

Edited by Gozi

Did you download the Windows compiled version with dependencies? Anyway, I didn't know cURL, but after a quick search I see it supports a lot more protocols than Wget:

HTTP, HTTPS, FTP, FTPS, SCP, SFTP, TFTP, LDAP, LDAPS, DICT, TELNET, FILE, IMAP, POP3, SMTP and RTSP

Who knows, maybe someday I'll start tracking downloads via POP3 or IMAP :) I also found a comparison written by the cURL developer, and it looks interesting. I will be testing this soon. Thanks for sharing.

 

http://daniel.haxx.se/docs/curl-vs-wget.html


What you're actually seeing is a special behavior that SF applies to the curl and wget engines. Because SF is used as a distribution network for Linux libraries and other packages, it allows curl and wget a direct "pass" to the real files, effectively bypassing the whole framed redirection and distribution network.

 

I really should have realized that before when I was creating the template, 'cause it could save a lot of time just by adding a wget or curl UA header to the SF apps. Sigh.
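As a rough illustration of the user-agent idea, here is a small Python sketch that builds a request announcing itself as curl. The helper name `request_as_curl` and the bare "curl" string are assumptions for the example, and whether SourceForge still treats such clients specially is something you would have to test yourself.

```python
import urllib.request


def request_as_curl(url, user_agent="curl"):
    """Build a request whose User-Agent header claims to be curl."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = request_as_curl(
    "http://sourceforge.net/projects/unetbootin/files/UNetbootin/494/"
    "unetbootin-windows-494.exe/download"
)
# urllib stores header names capitalized, hence "User-agent" here.
print(req.get_header("User-agent"))  # prints "curl"
```

Passing `req` to `urllib.request.urlopen` would then send the request with that header; nothing is sent over the network by the snippet itself.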


Glad to see the feedback :) I just downloaded the compiled Win32 version without SSL.

 

cURL is really impressive, and there are other projects based on cURL, like CurlFtpFS (which uses FTP as a filesystem).

Edited by Gozi

  • 2 months later...
What you're actually seeing is a special behavior that SF applies to the curl and wget engines. Because SF is used as a distribution network for Linux libraries and other packages, it allows curl and wget a direct "pass" to the real files, effectively bypassing the whole framed redirection and distribution network.

 

I really should have realized that before when I was creating the template, 'cause it could save a lot of time just by adding a wget or curl UA header to the SF apps. Sigh.

 

Great information! I was revisiting my SourceForge entries in Ketarin occasionally, but had more or less given up on them. Now I just spoofed the user agent as "curl" in the settings and, just like that, those entries work again without any trouble.

