Ketarin forum

Help with site inspection for grabbing json data


Yuri-Tech

I could use some help, please, as I don't have much experience with .NET regex or JavaScript/HTML5.

 

I'm trying to get the SUMo version from the changelog URL (and not the download URL):

{ChangelogURL}

    http://www.kcsoftwares.com/bugs/changelog_page.php?project_id=11

{VersionFromChangeLogURL} regex I use with success:

(?<=changelog_page\.php\?project_id\=11\"\>.*?SUMo.*?changelog_page\.php\?version_id\=\d+\"\>).*?([\d\.]+).*?(?=<\/)
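For comparison, here is a minimal Python sketch of the same idea using a capture group instead of a lookbehind (Python's `re` module only accepts fixed-width lookbehinds, unlike .NET). The `SAMPLE_HTML` string and the version number in it are made up for illustration; the real changelog markup may differ, so treat the pattern as a starting point only.

```python
import re

# Hypothetical stand-in for the changelog page; the real markup may differ.
SAMPLE_HTML = (
    '<a href="changelog_page.php?project_id=11">SUMo</a> - '
    '<a href="changelog_page.php?version_id=954">5.17.0</a>'
)

# Anchor on the project link, then capture the text of the first version link.
VERSION_RE = re.compile(
    r'changelog_page\.php\?project_id=11">.*?SUMo.*?'
    r'changelog_page\.php\?version_id=\d+">[^<\d]*([\d.]+)',
    re.DOTALL,
)


def version_from_changelog(html):
    """Return the first version string after the SUMo project link, or None."""
    match = VERSION_RE.search(html)
    return match.group(1) if match else None


print(version_from_changelog(SAMPLE_HTML))
```

The capture group keeps the anchoring text out of the result without needing lookarounds, which usually makes the pattern easier to read and adjust when the page changes.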

 

I wonder if there is a safer and better-performing way to scrape the version number.

 

I tried Chromium's site inspection (F12 / Ctrl+Shift+I),

and I can see a reference to this page:

http://www.kcsoftwares.com/bugs/changelog_page.php?version_id=954

which gives the regex less input to search through, but I can't figure out how to scrape the `id` value, which currently equals `954`.

Maybe there's also a way to grab the version number with POST data or JSON data?
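One way to pick up the `version_id` value is a short pattern with a single capture group, sketched below in Python. The `SAMPLE_HTML` fragment is hypothetical (the actual link markup is an assumption), and `first_version_id` and `version_page_url` are illustrative helper names, not part of Ketarin.

```python
import re

# Hypothetical fragment of the project changelog page; the real markup may differ.
SAMPLE_HTML = '<a href="changelog_page.php?version_id=954">SUMo</a>'

# Capture only the digits that follow "version_id=".
VERSION_ID_RE = re.compile(r'changelog_page\.php\?version_id=(\d+)')


def first_version_id(html):
    """Return the first version_id found in the page as an int, or None."""
    match = VERSION_ID_RE.search(html)
    return int(match.group(1)) if match else None


def version_page_url(version_id):
    """Build the per-version changelog URL from an id."""
    return f"http://www.kcsoftwares.com/bugs/changelog_page.php?version_id={version_id}"


version_id = first_version_id(SAMPLE_HTML)
if version_id is not None:
    print(version_page_url(version_id))
```

Whether the first `version_id` on the page is always the newest release is an assumption that would need checking against the live page.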

 

Hope someone can help me figure out this example so I'll be able to do it myself next time.

 

Thank you


Thank you MAPJe71,

That's one clever regular expression.

Would you mind explaining how you approach a problem like this to solve it (if you have a procedure :)?

 

 

Actually, I thought there would be a better way to access the version number.

If anyone has an idea that uses site inspection, that would be great and mind-enriching.

If there is no easy method, that would also be good to know.

 


  • 3 weeks later...

MAPJe71, thanks for your answer and the regex.

Actually, I already use the download page.

I asked whether there's a better way to scrape the version number from the changelog for 2 reasons:

1) To learn a cleaner approach that I can use on other sites, so it won't have future glitches.

2) Wherever possible, I use 2 variables for the version number and compare them with a script, without coding, so it'll be easier to identify website changes that may break something.

 

I'm looking for a cleaner solution, for example regexing this URL for the version:

https://justgetflux.com/update/v4/windows-download.json

When I grabbed it, the version did not appear on the main download site (https://justgetflux.com/),

but if you go to site inspection -> Sources -> Page -> update/v4

there's the JSON file with the version.

I really had to dig to find it, and I'm looking for an explanation or a simple way of scraping it.

 

Any suggestions along these lines?
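Once a JSON endpoint like this is known, parsing it is usually sturdier than a regex. Below is a minimal Python sketch that searches a parsed JSON document for a key. The `SAMPLE_JSON` payload and the key name `version` are assumptions, since the real file's layout is not shown in this thread; `find_key` and `version_from_json` are illustrative names.

```python
import json

# Hypothetical payload: the real file's keys and layout are not shown in this thread.
SAMPLE_JSON = '{"update": {"version": "1.2.3", "url": "https://example.com/setup.exe"}}'


def find_key(node, key):
    """Depth-first search for the first non-null value stored under `key`."""
    if isinstance(node, dict):
        if node.get(key) is not None:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def version_from_json(text, key="version"):
    """Parse `text` as JSON and return the first value found under `key`, or None."""
    return find_key(json.loads(text), key)


print(version_from_json(SAMPLE_JSON))
```

In practice the text would come from downloading the JSON URL found under Sources in the browser's developer tools; the Network tab in the same tools lists every request the page makes, which tends to be a quicker way to spot such files.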
