For more advanced usage (like clicking, or submitting a search request) it would need some kind of scenario, like:
"Click on this" -> "wait till this loads" -> "type something here" -> "scroll to this" -> load data.
This is possible with headless Chrome, so the trick is making it general and easy to use (something like recording what the user does through a Chrome plugin). Maybe in future versions :)
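One way to make that general: represent the recorded session as plain data and replay it step by step. A minimal sketch, where the step names, the `run_scenario` dispatcher, and the driver interface are all hypothetical (a real driver would wrap Capybara or raw CDP commands rather than just logging):

```ruby
# A recorded user session as plain data -- e.g. exported by a browser plugin.
scenario = [
  { 'action' => 'click',   'selector' => '#search-button' },
  { 'action' => 'wait',    'selector' => '.results' },
  { 'action' => 'type',    'selector' => 'input[name=q]', 'text' => 'chrome' },
  { 'action' => 'scroll',  'selector' => '.footer' },
  { 'action' => 'extract', 'selector' => '.results li' }
]

# Replay each step against a driver object.
def run_scenario(steps, driver)
  steps.each do |step|
    case step['action']
    when 'click'   then driver.click(step['selector'])
    when 'wait'    then driver.wait_for(step['selector'])
    when 'type'    then driver.type(step['selector'], step['text'])
    when 'scroll'  then driver.scroll_to(step['selector'])
    when 'extract' then driver.extract(step['selector'])
    end
  end
end

# Stub driver that just records what it was asked to do,
# standing in for a real headless-Chrome driver.
class LoggingDriver
  attr_reader :log

  def initialize
    @log = []
  end

  def click(sel)      = @log << "click #{sel}"
  def wait_for(sel)   = @log << "wait #{sel}"
  def type(sel, text) = @log << "type #{text} into #{sel}"
  def scroll_to(sel)  = @log << "scroll #{sel}"
  def extract(sel)    = @log << "extract #{sel}"
end

driver = LoggingDriver.new
run_scenario(scenario, driver)
```

The nice property of keeping the scenario as data is that a recorder plugin only has to emit JSON, and the same file can be replayed against different backends.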
fill_in('Password', with: 'Seekrit')    # fill a labeled text field
choose('A Radio Button')                # pick a radio button by label
select('Option', from: 'Select Box')    # pick an option from a select box
* shameless plug * Our little startup, Feedity - https://feedity.com, helps create custom RSS feeds for any webpage, using Chrome for full rendering plus many other tweaks & techniques under the hood for seamless & scalable indexing.
"selector":".bloc-blanc > p:nth-child(1)"
"text":" 0 école(s) correspondent à votre recherche "
Edit: the page changed and it's not working anymore. Sorry for the false alarm, my bad.
It would be great if the page analyzer could supply a list of all the assets loaded with the web page; for example, any asset with a media type of image/* is listed in an images array, and so forth.
Also, looking at Pinterest: it's server-rendered through ReactJS, so there is an #initial-state script tag with the first few images preloaded as URLs. If you only care about the images above the fold (no scrolling), this is the safest bet.
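Pulling those URLs out doesn't even need a browser; a sketch with a simplified page (the JSON structure below is invented for illustration, not Pinterest's real state schema):

```ruby
require 'json'

# Simplified server-rendered page: the initial state embedded as JSON
# in a script tag, the way React apps often serialize it.
html = <<~HTML
  <html><body>
  <script id="initial-state" type="application/json">
    {"resources": {"pins": [{"image_url": "https://example.com/a.jpg"},
                            {"image_url": "https://example.com/b.jpg"}]}}
  </script>
  </body></html>
HTML

# Grab the script body with a regex and parse it as JSON.
state_json = html[%r{<script id="initial-state"[^>]*>(.*?)</script>}m, 1]
state = JSON.parse(state_json)

urls = state['resources']['pins'].map { |pin| pin['image_url'] }
```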
Btw the fact that it ran for 5 minutes is a bug that I will look into, since there is a timeout of 2 minutes and there are no hanging runs or runs that ended with a timeout.
Btw when it comes to ToS and scraping, this is not much different from accessing their website through a normal browser; the only difference is that instead of the rendered content we show you the analyzed data. The page is only loaded once, same as in a browser.
I just sent an application for the Junior Web Developer position.
Looking forward to hearing back!