Scraping Cloudflare Sites [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 months ago.
I am trying to read data from a number of sites, and most of them use Cloudflare's anti-bot protection, which blocks my scrape attempts.
I don't know much about how the Cloudflare anti-bot works, but I was wondering if you could explain how I could do this without using an external library.
Thanks.

It works very well, and how it works is not fully disclosed, since it is more of a paid service than a free one.
Cloudflare's default challenge window is 30 minutes, and they have stopped serving captchas to identify real users, replacing them with a more UX-oriented method (the browser evaluates a hash for a few seconds). Once the first challenge is solved, your browser gets a token (cf-something) that lets you query the underlying resource for half an hour (unless the site owner lowered it), so your scrape process must finish within 30 minutes.
The real question is: why is no API available for the services you want to scrape?
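The 30-minute window described above can be tracked with plain JDK classes. This is only a minimal sketch: `ClearanceWindow` and its method names are hypothetical, the 30-minute default is taken from the answer and may be lower on a given site, and the code says nothing about how the token itself is obtained.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: tracks how long a token issued at `issuedAt`
// stays usable, assuming the 30-minute default mentioned in the answer.
final class ClearanceWindow {
    static final Duration DEFAULT_LIFETIME = Duration.ofMinutes(30);

    private final Instant issuedAt;
    private final Duration lifetime;

    ClearanceWindow(Instant issuedAt, Duration lifetime) {
        this.issuedAt = issuedAt;
        this.lifetime = lifetime;
    }

    // True while `now` is strictly before issuedAt + lifetime.
    boolean isValidAt(Instant now) {
        return now.isBefore(issuedAt.plus(lifetime));
    }

    // Time left before the token expires; zero if it already has.
    Duration remainingAt(Instant now) {
        Duration left = Duration.between(now, issuedAt.plus(lifetime));
        return left.isNegative() ? Duration.ZERO : left;
    }
}
```

A job that needs longer than `remainingAt(...)` reports would have to be split so that each batch fits inside one window.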

Related

Is there a simple way to allow access to my website only through my Android application? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
Is there any method to allow direct access to my website only from the Android application?
Requests coming from anywhere else should be blocked or redirected to a restricted-access page or similar.
Any ideas or simple solutions, ideally with a step-by-step procedure, would be greatly appreciated.
A little idea here:
Route all your requests through the index page and then redirect according to the user agent.
On the index page, check the request headers for the User-Agent, and if it's not Android, redirect to another site.
Info about the User-Agent header is available on the Mozilla site.
Edit: As MrWhite says, make sure that the Android app sends a unique user-agent.
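The header check described above could look roughly like this. It is a sketch under stated assumptions: `MyAndroidApp/1.0` stands in for whatever unique user-agent the app sends, and both page paths are hypothetical. The method takes the raw header value so it can sit behind any server framework.

```java
final class UserAgentGate {
    // Hypothetical unique User-Agent prefix the Android app is assumed to send.
    static final String APP_USER_AGENT = "MyAndroidApp/1.0";
    // Hypothetical paths; replace with the site's real pages.
    static final String APP_PAGE = "/app/index.html";
    static final String RESTRICTED_PAGE = "/restricted.html";

    // Returns the path to redirect to for a given User-Agent header value.
    // A missing header (null) is treated as "not the app".
    static String redirectTarget(String userAgentHeader) {
        boolean fromApp = userAgentHeader != null
                && userAgentHeader.startsWith(APP_USER_AGENT);
        return fromApp ? APP_PAGE : RESTRICTED_PAGE;
    }
}
```

Because any client can set this header to an arbitrary value, the check only keeps out casual visitors.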

Data from a website [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I was just sitting and thinking what I should do in programming.
And I thought of a csgo item value calculator.
Then I thought how would I get the prices of the items.
So my question is: how do I get information from, for example, http://steamcommunity.com/market/listings/730/Chroma%202%20Case and retrieve the price of the item, which is in this: market_commodity_orders_header_promote
What I would recommend is getting some kind of API tool such as Postman or Fiddler and using it to sniff the website. See if it's calling any kind of API, and if it is, see if you can take advantage of that API to do what you need to do.
Website scraping is also a valid option. Either way, you should have a look and see which method would work best for you given your problem and experience level.
The most important part is research, research, research. Google everything before you ask. We all wish you the best of luck!
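For the scraping route, a minimal sketch of pulling the text out of an element might look like the following. It assumes `market_commodity_orders_header_promote` is a class name on an element whose text is the price, and that the HTML has already been downloaded into a string. The class `PromoteTextExtractor` and the sample markup in the test are hypothetical, and a regex is used only to stay within the JDK.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PromoteTextExtractor {
    // Matches the text of the first element whose class attribute is exactly
    // market_commodity_orders_header_promote (the markup shape is an assumption).
    private static final Pattern PROMOTE = Pattern.compile(
            "class=\"market_commodity_orders_header_promote\"[^>]*>([^<]*)<");

    // Returns the trimmed element text, or empty if no such element is found.
    static Optional<String> firstPromoteText(String html) {
        Matcher m = PROMOTE.matcher(html);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }
}
```

If the page fills that element with JavaScript after loading, the downloaded HTML will not contain the value, which is where sniffing for an underlying API becomes the more practical option.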

Please suggest approach for Java-based web app [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I'm a Java developer who hasn't coded in about 5 years and wants to polish up my skills. I am going to create a small app that uses an OAuth 2.0 authentication flow and then makes a few REST calls and displays the results. I've got my credentials set up with the OAuth provider.
I used Eclipse back in the day. Is that still a solid IDE for this type of project? If I want to share the app with others to show my work, where could I host the code?
Thanks for these and any other pointers.
First off: yes, Eclipse is still a good choice.
If you can, make your app a web application, and then you can host it on a PaaS such as Google AppEngine. The app itself will then always be accessible from any machine that is connected to the web, so you will be able to show it to anyone you want.
If you only want to show the code, then GitHub or Google Code are good choices.
HTH

What is preferable : detecting device using its size or user agent string? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I need to differentiate between mobile/tablet and desktop browsers, and I was wondering whether it is better to parse the user agent string or to look at the width and height of the device. Which method is preferable, and why?
TIA
You typically need to do both. The User Agent isn't enough to distinguish between Android phones and tablets.
See this link on how Google Web Toolkit does it:
https://code.google.com/p/google-web-toolkit/source/browse/trunk/samples/mobilewebapp/src/com/google/gwt/sample/mobilewebapp/FormFactor.gwt.xml?r=10041
Basically, you check the user agent for "iphone" or "ipad", else check for "android" (if so, use the size to determine mobile/tablet), otherwise it's a "desktop".
Is this information directly used by your application, or is it used by business/marketing to figure out who is looking at your site? If it is a marketing need, then you should use Google Analytics: you simply embed a piece of code into your HTML, and your business folks will get all the info they'd ever want about visitors' devices, browsing patterns, drop-off pages, flows, etc.
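The decision order just described can be sketched in Java as below. This is not the GWT implementation: `FormFactorDetector` is a hypothetical name, mapping "iphone" to mobile and "ipad" to tablet is an assumption about the intended result, and the 600-pixel cutoff for Android tablets is an illustrative value only.

```java
import java.util.Locale;

final class FormFactorDetector {
    enum FormFactor { MOBILE, TABLET, DESKTOP }

    // Hypothetical cutoff: an Android screen whose shorter side is at least
    // this many pixels is treated as a tablet.
    static final int TABLET_MIN_SHORT_SIDE = 600;

    static FormFactor detect(String userAgent, int width, int height) {
        String ua = userAgent == null ? "" : userAgent.toLowerCase(Locale.ROOT);
        if (ua.contains("iphone")) return FormFactor.MOBILE;
        if (ua.contains("ipad")) return FormFactor.TABLET;
        if (ua.contains("android")) {
            // The user agent alone cannot tell phone from tablet, so use size.
            int shortSide = Math.min(width, height);
            return shortSide >= TABLET_MIN_SHORT_SIDE
                    ? FormFactor.TABLET
                    : FormFactor.MOBILE;
        }
        return FormFactor.DESKTOP;
    }
}
```

Using the shorter side makes the result independent of whether the device is held in portrait or landscape.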

Access website without browser [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am just learning Java. I want to make a simple application to access a website.
There is a website that I want to log in to through Java:
and then interact with it through my own interface. Basically, after logging in, I would be writing into some text boxes and sending the result.
I have looked in many places for how to do it and studied the HTTP protocol, but I still can't make it work.
Can someone help me out?
Accessing a website, logging in, and interacting with forms on it is somewhat complex work, so it might not be the best choice for a first Java project.
But if you want to do it, you should probably use Apache HttpComponents/HttpClient.
There are useful examples at the above link as well, which may help you get started.
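As a rough illustration of what a form login involves, here is a sketch that uses the JDK's built-in `java.net.http` package (Java 11+) rather than the Apache HttpClient recommended above. The class `FormLoginRequest`, the login URL, and the field names are all hypothetical; a real site's form fields have to be read from its login page.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

final class FormLoginRequest {
    // Encodes fields as application/x-www-form-urlencoded, in map iteration order.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Builds (but does not send) a POST request carrying the login form.
    static HttpRequest build(String loginUrl, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
    }
}
```

To actually log in, the request would be sent with an `HttpClient` built with a `java.net.CookieManager` as its cookie handler, so that the session cookie returned by the server is reused when submitting later forms.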
