Ketarin forum

Still looking for a way to download Oracle/Sun JRE


appyface

Thought I'd give this URL a try:

hxxp://www.java.com/en/download/manual.jsp

 

Ketarin is not able to load this page in the variable pane, so I can't scrape it.

 

I use WatchThatPage for my page-watching service, and it loads this page fine.

 

Is there a way to get the server to render the page contents for Ketarin? TIA,

 

--appyface

 

P.S. I know there are other web sources for downloading the Java JRE and JDK. I'm only interested in methods to get the original sites' downloads via Ketarin, as these are the problematic ones (e.g. java.com, oracle.com/technetwork/java/javase/downloads/index.html, etc.). Please don't point me to third-party sites, thanks :-)

Edited by appyface

Still not quite right... the page that loads into Ketarin does not contain the same links that I get when looking at the page with IE8.

 

For example, in IE8 look at the second Windows download entry (it's for 64-bit). Hover over the orange download arrow to the left of it and note the bundle-id is 41293.

 

However, I can't find that bundle-id anywhere in the text that Ketarin loads.

Edited by appyface

Appy, consider using this instead:

 

<?xml version='1.0' encoding='utf-8'?>
<Jobs>
 <ApplicationJob xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Guid="56063164-e18f-4c70-806d-04ed851f2222">
   <WebsiteUrl>http://www.java.com/en/</WebsiteUrl>
   <UserNotes />
   <LastFileSize>16299808</LastFileSize>
   <LastFileDate>2010-09-12T02:29:12.4995617</LastFileDate>
   <IgnoreFileInformation>false</IgnoreFileInformation>
   <DownloadBeta>Default</DownloadBeta>
   <DownloadDate>2009-07-09T13:28:52</DownloadDate>
   <CheckForUpdatesOnly>false</CheckForUpdatesOnly>
   <VariableChangeIndicator />
   <CanBeShared>false</CanBeShared>
   <ShareApplication>false</ShareApplication>
   <ExclusiveDownload>false</ExclusiveDownload>
   <HttpReferer />
   <Variables>
     <item>
       <key>
         <string>aversion</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>RegularExpression</VariableType>
           <Regex>Recommended Version ([^&lt;&gt;]+?)\s*&lt;/strong</Regex>
           <Url>http://java.com/en/download/manual.jsp</Url>
           <Name>aversion</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>version</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <Url>http://java.com/en/download/manual.jsp</Url>
           <StartText>Recommended Version </StartText>
           <EndText> &lt;/strong</EndText>
           <TextualContent>{aversion:regexreplace:[^\d]+:u}</TextualContent>
           <Name>version</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>URL</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>StartEnd</VariableType>
           <Regex />
           <Url>http://java.com/en/download/manual.jsp#win</Url>
           <StartText>Offline" href="http://javadl.sun.com/webapps/download/AutoDL?BundleId=</StartText>
           <EndText>" onclick</EndText>
           <Name>URL</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>swebsite</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <TextualContent>http://www.java.com/winoffline_installer/</TextualContent>
           <Name>swebsite</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>schangelog</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <TextualContent>http://java.sun.com/javase/{version:split:u:0}/webnotes/ReleaseNotes.html</TextualContent>
           <Name>schangelog</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>snotes</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <TextualContent />
           <Name>snotes</Name>
         </UrlVariable>
       </value>
     </item>
   </Variables>
   <ExecuteCommand />
   <ExecutePreCommand />
   <Category>Plugins</Category>
   <SourceType>FixedUrl</SourceType>
   <DeletePreviousFile>true</DeletePreviousFile>
   <Enabled>true</Enabled>
   <FileHippoId />
   <TargetPath>.\{category}\{appname:regexreplace:([\s\t\r\n\-\\&amp;\/]+):_}-{version}.{url:ext}</TargetPath>
   <FixedDownloadUrl>http://javadl.sun.com/webapps/download/AutoDL?BundleId={URL}</FixedDownloadUrl>
   <Name>Java x86</Name>
 </ApplicationJob>
</Jobs>

 

Or FileHippo.
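
For anyone unsure what the two version functions in the profile above do, here is a minimal Python sketch of the same transformations. It assumes the page text reads something like "Recommended Version 6 Update 26" (the sample string is an assumption, not scraped output), and it assumes `regexreplace` and `split` behave like Python's `re.sub` and `str.split`. The helper names are hypothetical.

```python
import re

# Hypothetical helper mimicking {aversion:regexreplace:[^\d]+:u}:
# every run of non-digit characters is replaced with the letter "u".
def ketarin_style_version(aversion: str) -> str:
    return re.sub(r"[^\d]+", "u", aversion)

# Hypothetical helper mimicking {version:split:u:0}:
# split on "u" and keep the first piece (the major version).
def major_version(version: str) -> str:
    return version.split("u")[0]

# Assumed sample of what the "aversion" regex captures from the page.
sample = "6 Update 26"

version = ketarin_style_version(sample)
changelog = (
    "http://java.sun.com/javase/"
    + major_version(version)
    + "/webnotes/ReleaseNotes.html"
)

print(version)
print(changelog)
```

With that sample input, `version` is what ends up in the downloaded file name and the major version is what gets substituted into the release notes URL.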


  • 8 months later...
  • 1 month later...

For whatever reason (a mistake on Oracle's part with the web page? My dumb luck?) I've been able to scrape the JRE offline installers from http://www.java.com/en/download/manual.jsp for a while now, until this latest update, 6u26.

 

Now I'm back to the problem of Ketarin not loading the same content that IE8 and Firefox show for the webpage (user agent issue?).

 

So... back to my thread... any ideas on how to scrape the offline installers from this page again?

 

Kind regards,

--appyface

Edited by appyface

The page uses either a cookie (one that does high-byte math) or the User-Agent header to make sure it offers the 64-bit download to 64-bit computers. Use this user-agent header and it'll work fine:

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Win64; x64; Trident/4.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; Media Center PC 5.0; SLCC1; Tablet PC 2.0; .NET4.0C)
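
To check outside Ketarin what the server returns for that header, something like the following Python sketch could be used. It only relies on the standard library; the `build_request` and `fetch` helpers are hypothetical names, and the sketch does not verify how the server currently responds.

```python
import urllib.request

# User-agent string copied from the post above (64-bit IE8 on Windows 7).
IE8_X64_UA = (
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Win64; x64; "
    "Trident/4.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; "
    ".NET CLR 3.0.30729; Media Center PC 6.0; Media Center PC 5.0; "
    "SLCC1; Tablet PC 2.0; .NET4.0C)"
)

def build_request(url: str, user_agent: str = IE8_X64_UA) -> urllib.request.Request:
    # Attach the custom User-Agent so the server sees a 64-bit browser.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def fetch(url: str) -> str:
    # Performs the actual network call; not executed at import time.
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")

req = build_request("http://www.java.com/en/download/manual.jsp")
# urllib stores header names capitalized, hence "User-agent" here.
print(req.get_header("User-agent"))
```

Calling `fetch(...)` with and without the custom header and comparing the two results is a quick way to see whether the user agent is what changes the page content.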

 

Here's a revised 64-bit app profile:

<?xml version='1.0' encoding='utf-8'?>
<Jobs>
 <ApplicationJob xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Guid="019d2cdb-5370-45b8-9d8d-14c012033dd0">
   <WebsiteUrl>http://www.java.com/en/</WebsiteUrl>
   <UserAgent>Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Win64; x64; Trident/4.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; Media Center PC 5.0; SLCC1; Tablet PC 2.0; .NET4.0C)</UserAgent>
   <UserNotes />
   <LastFileSize>16852768</LastFileSize>
   <LastFileDate>2011-06-07T14:21:40.8008915</LastFileDate>
   <IgnoreFileInformation>false</IgnoreFileInformation>
   <DownloadBeta>Default</DownloadBeta>
   <DownloadDate>2009-07-09T13:28:52</DownloadDate>
   <CheckForUpdatesOnly>false</CheckForUpdatesOnly>
   <VariableChangeIndicator />
   <CanBeShared>true</CanBeShared>
   <ShareApplication>false</ShareApplication>
   <ExclusiveDownload>false</ExclusiveDownload>
   <HttpReferer />
   <SetupInstructions />
   <Variables>
     <item>
       <key>
         <string>aversion</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>RegularExpression</VariableType>
           <Regex>Recommended Version ([^&lt;&gt;]+?)\s*&lt;/strong</Regex>
           <Url>http://java.com/en/download/manual.jsp</Url>
           <Name>aversion</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>version</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <Url>http://java.com/en/download/manual.jsp</Url>
           <StartText>Recommended Version </StartText>
           <EndText> &lt;/strong</EndText>
           <TextualContent>{aversion:regexreplace:[^\d]+:u}</TextualContent>
           <Name>version</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>URL</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>RegularExpression</VariableType>
           <Regex>Windows\s*\(64\-bit\)" href="http\://[^'"]+BundleId=(\d+)"</Regex>
           <Url>http://java.com/en/download/manual.jsp#win</Url>
           <StartText>Offline" href="http://javadl.sun.com/webapps/download/AutoDL?BundleId=</StartText>
           <EndText>" onclick</EndText>
           <Name>URL</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>swebsite</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <TextualContent>http://www.java.com/winoffline_installer/</TextualContent>
           <Name>swebsite</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>schangelog</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <TextualContent>http://java.sun.com/javase/{version:split:u:0}/webnotes/ReleaseNotes.html</TextualContent>
           <Name>schangelog</Name>
         </UrlVariable>
       </value>
     </item>
     <item>
       <key>
         <string>snotes</string>
       </key>
       <value>
         <UrlVariable>
           <RegexRightToLeft>false</RegexRightToLeft>
           <VariableType>Textual</VariableType>
           <Regex />
           <TextualContent />
           <Name>snotes</Name>
         </UrlVariable>
       </value>
     </item>
   </Variables>
   <ExecuteCommand />
   <ExecutePreCommand />
   <ExecuteCommandType>Batch</ExecuteCommandType>
   <ExecutePreCommandType>Batch</ExecutePreCommandType>
   <Category>Plugins</Category>
   <SourceType>FixedUrl</SourceType>
   <DeletePreviousFile>true</DeletePreviousFile>
   <Enabled>true</Enabled>
   <FileHippoId />
   <LastUpdated>2011-06-07T14:21:40.8008915</LastUpdated>
   <TargetPath>.\{category}\{appname:regexreplace:([\s\t\r\n\-\\&amp;\/]+):_}-{version}.{url:ext}</TargetPath>
   <FixedDownloadUrl>http://javadl.sun.com/webapps/download/AutoDL?BundleId={URL}</FixedDownloadUrl>
   <Name>Java x64</Name>
 </ApplicationJob>
</Jobs>

 

Of course, you could always just use FileHippo to get 'em, too, as this avoids the maintenance burden completely. :)
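
If you want to test the 64-bit regex from the profile outside Ketarin, here is a rough Python sketch. The HTML snippet is a made-up sample shaped like the link described earlier in the thread (bundle-id 41293), not the real page source, and `extract_bundle_id` is a hypothetical helper.

```python
import re
from typing import Optional

# Same pattern as the <Regex> element of the "URL" variable in the 64-bit profile.
BUNDLE_RE = re.compile(
    r'''Windows\s*\(64\-bit\)" href="http\://[^'"]+BundleId=(\d+)"'''
)

# Made-up sample markup; the real page may be structured differently.
SAMPLE_HTML = (
    '<a title="Download Java software for Windows (64-bit)" '
    'href="http://javadl.sun.com/webapps/download/AutoDL?BundleId=41293" '
    'onclick="track()">Windows x64</a>'
)

def extract_bundle_id(html: str) -> Optional[str]:
    # Return the first captured bundle id, or None if the pattern is absent.
    match = BUNDLE_RE.search(html)
    return match.group(1) if match else None

bundle_id = extract_bundle_id(SAMPLE_HTML)
# Mirrors the profile's FixedDownloadUrl with {URL} substituted.
download_url = (
    "http://javadl.sun.com/webapps/download/AutoDL?BundleId=" + str(bundle_id)
)

print(bundle_id)
print(download_url)
```

If `extract_bundle_id` returns `None` for the text Ketarin actually loads, the page served to Ketarin probably lacks the 64-bit link, which points back at the user-agent issue.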
