I'm using Flying Saucer to create a pdf from xhtml, hosted on a tomcat server. Most of the images included in the pdf are publicly available (logos and so on), but some of them are protected behind a login (that is, they are streamed through a servlet if the user is logged in).
When I paste the url in the browser, the image is of course displayed fine, because the browser sends the session with the request. But when Flying Saucer renders the pdf, it doesn't include the protected image because it doesn't know anything about the session.
So my question is: is there any way to supply the byte streams for Flying Saucer to resolve, just as it is possible to add resolvable fonts? I have tried something like this, but there is no easy way to set the UserAgentCallback (UAC) on the ITextRenderer, and it complained every time I tried.
You can set the UserAgentCallback this way, and Flying Saucer will use it to resolve the urls (tested, works with Release 8):
ITextRenderer renderer = new ITextRenderer();
renderer.getSharedContext().setUserAgentCallback(new MyUAC());
MyUAC should extend the NaiveUserAgent, and override the resolveAndOpenStream method as the other page suggests.
I overrode ITextUserAgent as well - from the source, looks like that's what ITextRenderer uses. You have to provide the output device in the constructor, which you can get from the renderer object. One other gotcha was you have to set the "shared context" explicitly using the setter method - otherwise you will get an NPE during rendering. Here is the code to set up the object:
ITextRenderer renderer = new ITextRenderer();
MyUserAgentCallback uac = new MyUserAgentCallback(renderer.getOutputDevice());
uac.setSharedContext(renderer.getSharedContext());
renderer.getSharedContext().setUserAgentCallback(uac);
Also, here is the basic idea of MyUserAgentCallback, using basic authentication:
private static class MyUserAgentCallback extends ITextUserAgent
{
public MyUserAgentCallback(ITextOutputDevice outputDevice)
{
super(outputDevice);
}
@Override
protected InputStream resolveAndOpenStream(String uri)
{
if (_isProtectedResource(uri))
{
java.io.InputStream is = null;
uri = resolveURI(uri);
try {
URL url = new URL(uri);
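// java.util.Base64 (Java 8+) replaces the internal sun.misc.BASE64Encoder, which newer JDKs no longer ship
String encoding = java.util.Base64.getEncoder().encodeToString("username:password".getBytes(java.nio.charset.StandardCharsets.UTF_8));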
URLConnection uc = url.openConnection();
uc.setRequestProperty ("Authorization", "Basic " + encoding);
is = uc.getInputStream();
Log.debug("got input stream");
}
catch (java.net.MalformedURLException e) {
Log.error("bad URL given: " + uri, e);
}
catch (java.io.FileNotFoundException e) {
Log.error("item at URI " + uri + " not found");
}
catch (java.io.IOException e) {
Log.error("IO problem for " + uri, e);
}
return is;
}
else
{
return super.resolveAndOpenStream(uri);
}
}
private boolean _isProtectedResource(String uri)
{
// does this require authentication?
}
}
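Since the images in the question are protected by a servlet session rather than by basic authentication, the same override could forward the session cookie instead. Here is a minimal sketch using only java.net. It assumes the session ID is handed to the rendering code by the calling servlet (for example from HttpSession.getId()) and that the container uses Tomcat's default cookie name, JSESSIONID. The class and method names are hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

class SessionStreams {
    // Builds the Cookie header value; JSESSIONID is Tomcat's default session cookie name.
    static String sessionCookie(String sessionId) {
        return "JSESSIONID=" + sessionId;
    }

    // Opens the URI with the caller's session attached, so the servlet
    // treats the request as coming from the logged-in user.
    static InputStream openWithSession(String uri, String sessionId) throws IOException {
        URLConnection connection = new URL(uri).openConnection();
        connection.setRequestProperty("Cookie", sessionCookie(sessionId));
        return connection.getInputStream();
    }
}
```

Inside resolveAndOpenStream, the protected branch would then call SessionStreams.openWithSession(resolveURI(uri), sessionId), wrapped in the same kind of try/catch as the basic-authentication version.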
I need to figure out how to validate my XML files against schemas offline. After looking around for a couple of days, what I was able to find was basically that I needed to have an internal reference to the schemas. I needed to find them, download them, and change the reference to a local system path. What I was unable to find was exactly how to do that. Where and how can I change the reference to point internally instead of externally? What is the best way to download the schemas?
There are three ways you could do this. What they all have in common is that you need a local copy of the schema document(s). I'm assuming that the instance documents currently use xsi:schemaLocation and/or xsi:noNamespaceSchemaLocation to point to a location holding the schema document(s) on the web.
(a) Modify your instance documents to refer to the local copy of the schema documents. This is usually inconvenient.
(b) Redirect the references so that a request for a remote file is redirected to a local file. The way to set this up depends on which schema validator you are using and how you are invoking it.
(c) Tell the schema processor to ignore the values of xsi:schemaLocation and xsi:noNamespaceSchemaLocation, and to validate instead against a schema that you supply using your schema processor's invocation API. Again the details depend on which schema processor you are using.
My preferred approach is (c): if only because when you are validating a source document, then by definition you don't fully trust it - so why should you trust it to contain a correct xsi:schemaLocation attribute?
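For Java's built-in validator (javax.xml.validation), approach (c) could look roughly like the sketch below: the schema is passed in explicitly, so validation does not depend on the instance document pointing at the right place. The class name, the tiny inline schema, and the sample documents are made up for illustration; in practice the schema source would be your local copy of the .xsd file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class LocalSchemaCheck {
    // Illustrative schema: a single <note> element holding a string.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='note' type='xs:string'/>"
        + "</xs:schema>";

    // Returns true if the XML text is valid against the supplied schema text.
    static boolean isValid(String xml, String xsd) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            // With no ErrorHandler set, validation errors surface as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, LocalSchemaCheck.isValid("<note>hi</note>", LocalSchemaCheck.XSD) accepts the document, while a document whose root element is not declared in the schema is rejected.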
XmlValidate is a simple but powerful command-line tool that can perform offline validation of single or multiple XML files against target schemas. It can scan local xml files by file name, directory, or URL.
XmlValidate automatically adds the schemaLocation based on the schema namespace and a config file that maps each namespace to a local file. The tool will validate against whatever XML Schema is referenced in the config file.
Here are example mappings of namespace to target Schema in config file:
http://www.opengis.net/kml/2.2=${XV_HOME}/schemas/kml22.xsd
http://appengine.google.com/ns/1.0=C:/xml/appengine-web.xsd
urn:oasis:names:tc:ciq:xsdschema:xAL:2.0=C:/xml/xAL.xsd
Note that ${XV_HOME} token above is simply an alias for the top-level directory that XmlValidate is running from. The location can likewise be a full file path.
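To make the mapping format concrete, here is a rough sketch of how such namespace=location lines could be read into a map. This is not XmlValidate's own parser, just an illustration; the class and method names are hypothetical. Note that java.util.Properties is a poor fit here, because it treats the ':' in a namespace URI as a key/value separator, so the sketch splits on the first '=' instead.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SchemaMappings {
    // Parses "namespace=location" lines, expanding the ${XV_HOME} token to the given directory.
    static Map<String, String> parse(List<String> lines, String home) {
        Map<String, String> mappings = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            int eq = trimmed.indexOf('=');
            if (trimmed.isEmpty() || eq < 0) {
                continue; // skip blank or malformed lines
            }
            String namespace = trimmed.substring(0, eq);
            String location = trimmed.substring(eq + 1).replace("${XV_HOME}", home);
            mappings.put(namespace, location);
        }
        return mappings;
    }
}
```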
XmlValidate is an open-source project (source code available) that runs with the Java Runtime Environment (JRE). The bundled application (Java jars, examples, etc.) can be downloaded here.
If XmlValidate is run in batch mode against multiple XML files, it will provide a summary of validation results.
Errors: 17 Warnings: 0 Files: 11 Time: 1506 ms
Valid files 8/11 (73%)
You can set your own implementation of LSResourceResolver and LSInput on the SchemaFactory so that the call of LSInput.getCharacterStream() will provide a schema from a local path.
I have written an extra class to do offline validation. You can call it like
new XmlSchemaValidator().validate(xmlStream, schemaStream, "https://schema.datacite.org/meta/kernel-4.1/",
"schemas/datacite/kernel-4.1/");
Two InputStreams are passed: one for the XML and one for the schema. A baseUrl and a localPath (relative to the classpath) are passed as the third and fourth parameters. The validator uses these last two parameters to look up additional schemas locally at localPath or relative to the provided baseUrl.
I have tested with a set of schemas and examples from https://schema.datacite.org/meta/kernel-4.1/ .
Complete Example:
@Test
public void validate4() throws Exception {
InputStream xmlStream = Thread.currentThread().getContextClassLoader().getResourceAsStream(
"schemas/datacite/kernel-4.1/example/datacite-example-complicated-v4.1.xml");
InputStream schemaStream = Thread.currentThread().getContextClassLoader()
.getResourceAsStream("schemas/datacite/kernel-4.1/metadata.xsd");
new XmlSchemaValidator().validate(xmlStream, schemaStream, "https://schema.datacite.org/meta/kernel-4.1/",
"schemas/datacite/kernel-4.1/");
}
The XmlSchemaValidator will validate the xml against the schema and will search locally for included Schemas. It uses a ResourceResolver to override the standard behaviour and to search locally.
public class XmlSchemaValidator {
/**
* @param xmlStream
* xml data as a stream
* @param schemaStream
* schema as a stream
* @param baseUri
* to search for relative paths on the web
* @param localPath
* to search for schemas on a local directory
* @throws SAXException
* if validation fails
* @throws IOException
* not further specified
*/
public void validate(InputStream xmlStream, InputStream schemaStream, String baseUri, String localPath)
throws SAXException, IOException {
Source xmlFile = new StreamSource(xmlStream);
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
factory.setResourceResolver((type, namespaceURI, publicId, systemId, baseURI) -> {
LSInput input = new DOMInputImpl();
input.setPublicId(publicId);
input.setSystemId(systemId);
input.setBaseURI(baseUri);
input.setCharacterStream(new InputStreamReader(
getSchemaAsStream(input.getSystemId(), input.getBaseURI(), localPath)));
return input;
});
Schema schema = factory.newSchema(new StreamSource(schemaStream));
javax.xml.validation.Validator validator = schema.newValidator();
validator.validate(xmlFile);
}
private InputStream getSchemaAsStream(String systemId, String baseUri, String localPath) {
InputStream in = getSchemaFromClasspath(systemId, localPath);
// You could just return in; , if you are sure that everything is on
// your machine. Here I call getSchemaFromWeb as last resort.
return in == null ? getSchemaFromWeb(baseUri, systemId) : in;
}
private InputStream getSchemaFromClasspath(String systemId, String localPath) {
System.out.println("Try to get stuff from localdir: " + localPath + systemId);
return Thread.currentThread().getContextClassLoader().getResourceAsStream(localPath + systemId);
}
/*
* You can leave out the webstuff if you are sure that everything is
* available on your machine
*/
private InputStream getSchemaFromWeb(String baseUri, String systemId) {
try {
URI uri = new URI(systemId);
if (uri.isAbsolute()) {
System.out.println("Get stuff from web: " + systemId);
return urlToInputStream(uri.toURL(), "text/xml");
}
System.out.println("Get stuff from web: Host: " + baseUri + " Path: " + systemId);
return getSchemaRelativeToBaseUri(baseUri, systemId);
} catch (Exception e) {
// maybe the systemId is not a valid URI or
// the web has nothing to offer under this address
}
return null;
}
private InputStream urlToInputStream(URL url, String accept) {
HttpURLConnection con = null;
InputStream inputStream = null;
try {
con = (HttpURLConnection) url.openConnection();
con.setConnectTimeout(15000);
con.setRequestProperty("User-Agent", "Name of my application.");
con.setReadTimeout(15000);
con.setRequestProperty("Accept", accept);
con.connect();
int responseCode = con.getResponseCode();
if (responseCode == HttpURLConnection.HTTP_MOVED_PERM
|| responseCode == HttpURLConnection.HTTP_MOVED_TEMP || responseCode == 307
|| responseCode == 303) {
String redirectUrl = con.getHeaderField("Location");
try {
URL newUrl = new URL(redirectUrl);
return urlToInputStream(newUrl, accept);
} catch (MalformedURLException e) {
URL newUrl = new URL(url.getProtocol() + "://" + url.getHost() + redirectUrl);
return urlToInputStream(newUrl, accept);
}
}
inputStream = con.getInputStream();
return inputStream;
} catch (SocketTimeoutException e) {
throw new RuntimeException(e);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
private InputStream getSchemaRelativeToBaseUri(String baseUri, String systemId) {
try {
URL url = new URL(baseUri + systemId);
return urlToInputStream(url, "text/xml");
} catch (Exception e) {
e.printStackTrace();
throw new RuntimeException(e);
}
}
}
prints
Try to get stuff from localdir: schemas/datacite/kernel-4.1/http://www.w3.org/2009/01/xml.xsd
Get stuff from web: http://www.w3.org/2009/01/xml.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-titleType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-contributorType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-dateType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-resourceType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-relationType-v4.1.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-relatedIdentifierType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-funderIdentifierType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-descriptionType-v4.xsd
Try to get stuff from localdir: schemas/datacite/kernel-4.1/include/datacite-nameType-v4.1.xsd
The print shows that the validator was able to validate against a set of local schemas. Only http://www.w3.org/2009/01/xml.xsd was not available locally and therefore fetched from the internet.
I'm trying to write unit tests for my program and use mock data. I'm a little confused on how to intercept an HTTP Get request to a URL.
My program calls a URL on our API, which returns a simple XML file. Instead of fetching the XML file from the live API, I would like the test to receive a predetermined XML file that I supply, so that I can compare the output to the expected output and determine whether everything is working correctly.
I was pointed to Mockito and have been seeing many different examples such as this SO post, How to use mockito for testing a REST service? but it's not becoming clear to me how to set it all up and how to mock the data (i.e., return my own xml file whenever the call to the URL is made).
The only thing I can think of is having another program made that's running locally on Tomcat and in my test pass a special URL that calls the locally running program on Tomcat and then return the xml file that I want to test with. But that just seems like overkill and I don't think that would be acceptable. Could someone please point me in the right direction.
private static InputStream getContent(String uri) {
HttpURLConnection connection = null;
try {
URL url = new URL(uri);
connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");
connection.setRequestProperty("Accept", "application/xml");
return connection.getInputStream();
} catch (MalformedURLException e) {
LOGGER.error("internal error", e);
} catch (IOException e) {
LOGGER.error("internal error", e);
} finally {
if (connection != null) {
connection.disconnect();
}
}
return null;
}
I am using Spring Boot and other parts of the Spring Framework if that helps.
Part of the problem is that you're not breaking things down into interfaces. You need to wrap getContent in an interface and provide a concrete class implementing the interface. This concrete class will then need to be passed into any class that uses the original getContent. (This is essentially dependency inversion.) Your code will end up looking something like this.
public interface IUrlStreamSource {
InputStream getContent(String uri);
}
public class SimpleUrlStreamSource implements IUrlStreamSource {
protected final Logger LOGGER;
public SimpleUrlStreamSource(Logger LOGGER) {
this.LOGGER = LOGGER;
}
// pulled out to allow test classes to provide
// a version that returns mock objects
protected URL stringToUrl(String uri) throws MalformedURLException {
return new URL(uri);
}
public InputStream getContent(String uri) {
HttpURLConnection connection = null;
try {
URL url = stringToUrl(uri);
connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");
connection.setRequestProperty("Accept", "application/xml");
return connection.getInputStream();
} catch (MalformedURLException e) {
LOGGER.error("internal error", e);
} catch (IOException e) {
LOGGER.error("internal error", e);
} finally {
if (connection != null) {
connection.disconnect();
}
}
return null;
}
}
Now code that was using the static getContent should go through an IUrlStreamSource instance's getContent(). You then provide the object that you want to test with a mocked IUrlStreamSource rather than a SimpleUrlStreamSource.
If you want to test SimpleUrlStreamSource (but there's not much to test), then you can create a derived class that provides an implementation of stringToUrl that returns a mock (or throws an exception).
The other answers in here advise you to refactor your code to using a sort of provider which you can replace during your tests - which is the better approach.
If that isn't a possibility for whatever reason you can install a custom URLStreamHandlerFactory that intercepts the URLs you want to "mock" and falls back to the standard implementation for URLs that shouldn't be intercepted.
Note that this is irreversible, so you can't remove the InterceptingUrlStreamHandlerFactory once it's installed - the only way to get rid of it is to restart the JVM. You could implement a flag in it to disable it and return null for all lookups - which would produce the same results.
URLInterceptionDemo.java:
public class URLInterceptionDemo {
private static final String INTERCEPT_HOST = "dummy-host.com";
public static void main(String[] args) throws IOException {
// Install our own stream handler factory
URL.setURLStreamHandlerFactory(new InterceptingUrlStreamHandlerFactory());
// Fetch an intercepted URL
printUrlContents(new URL("http://dummy-host.com/message.txt"));
// Fetch another URL that shouldn't be intercepted
printUrlContents(new URL("http://httpbin.org/user-agent"));
}
private static void printUrlContents(URL url) throws IOException {
try(InputStream stream = url.openStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(stream))) {
String line;
while((line = reader.readLine()) != null) {
System.out.println(line);
}
}
}
private static class InterceptingUrlStreamHandlerFactory implements URLStreamHandlerFactory {
@Override
public URLStreamHandler createURLStreamHandler(final String protocol) {
if("http".equalsIgnoreCase(protocol)) {
// Intercept HTTP requests
return new InterceptingHttpUrlStreamHandler();
}
return null;
}
}
private static class InterceptingHttpUrlStreamHandler extends URLStreamHandler {
@Override
protected URLConnection openConnection(final URL u) throws IOException {
if(INTERCEPT_HOST.equals(u.getHost())) {
// This URL should be intercepted, return the file from the classpath
return URLInterceptionDemo.class.getResource(u.getHost() + "/" + u.getPath()).openConnection();
}
// Fall back to the default handler, by passing the default handler here we won't end up
// in the factory again - which would trigger infinite recursion
return new URL(null, u.toString(), new sun.net.www.protocol.http.Handler()).openConnection();
}
}
}
dummy-host.com/message.txt:
Hello World!
When run, this app will output:
Hello World!
{
"user-agent": "Java/1.8.0_45"
}
It's pretty easy to change the criteria of how you decide which URLs to intercept and what you return instead.
The answer depends on what you are testing.
If you need to test the processing of the InputStream
If getContent() is called by some code that processes the data returned by the InputStream, and you want to test how the processing code handles specific sets of input, then you need to create a seam to enable testing. I would simply move getContent() into a new class, and inject that class into the class that does the processing:
public interface ContentSource {
InputStream getContent(String uri);
}
You could create an HttpContentSource that uses URL.openConnection() (or, better yet, the Apache HttpClient code).
Then you would inject the ContentSource into the processor:
public class Processor {
private final ContentSource contentSource;
@Inject
public Processor(ContentSource contentSource) {
this.contentSource = contentSource;
}
...
}
The code in Processor could be tested with a mock ContentSource.
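A mocking library is not strictly required for this. Because ContentSource has a single method, a test can supply a hand-written fake that returns canned XML. The sketch below repeats the interface so that it is self-contained; the fetchText method and the ProcessorFakes helper are hypothetical stand-ins for whatever your real processing does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

interface ContentSource {
    InputStream getContent(String uri);
}

class Processor {
    private final ContentSource contentSource;

    Processor(ContentSource contentSource) {
        this.contentSource = contentSource;
    }

    // Hypothetical processing step: read the whole response as UTF-8 text.
    String fetchText(String uri) {
        try (InputStream in = contentSource.getContent(uri)) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class ProcessorFakes {
    // A fake ContentSource that ignores the URI and always serves the given XML.
    static ContentSource canned(String xml) {
        return uri -> new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }
}
```

A test would then construct new Processor(ProcessorFakes.canned("<result>ok</result>")) and compare the processor's output with the expected value, without any network access.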
If you need to test the fetching of the content
If you want to make sure that getContent() works, you could create a test that starts a lightweight in-memory HTTP server that serves the expected content, and have getContent() talk to that server. That does seem overkill.
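If you do go down that road, the JDK ships a small HTTP server (com.sun.net.httpserver) that is enough for this. Here is a rough sketch, assuming loopback networking is available in the test environment; the class name, the /data.xml path, and the helper method are made up for illustration. In a real test you would point getContent() at the server URL instead of using url.openStream().

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class LocalServerDemo {
    // Starts a throwaway server on a free port, fetches the canned body over HTTP, then stops it.
    static String fetchFromLocalServer(String cannedXml) {
        try {
            byte[] body = cannedXml.getBytes(StandardCharsets.UTF_8);
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/data.xml", exchange -> {
                exchange.getResponseHeaders().add("Content-Type", "application/xml");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/data.xml");
                try (InputStream in = url.openStream()) {
                    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                    byte[] chunk = new byte[4096];
                    int n;
                    while ((n = in.read(chunk)) != -1) {
                        buffer.write(chunk, 0, n);
                    }
                    return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
                }
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```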
If you need to test a large subset of the system with fake data
If you want to make sure things work end to end, write an end-to-end system test. Since you indicated you use Spring, you can use Spring to wire together parts of the system (or to wire the entire system, but with different properties). You have two choices:
Have the system test start a local HTTP server, and when you have your test create your system, configure it to talk to that server. See the answers to this question for ways to start the HTTP server.
Configure spring to use a fake implementation of ContentSource. This gets you slightly less confidence that everything works end-to-end, but it will be faster and less flaky.
I want to login to a https website with username and password, go to one url in that website and download the page at the url (and maybe parse contents of that page). I want to do this using only core Java apis and not htmlunit, jsoup etc. I got the below code to learn how to do this, but it does not show me how to login to a website. Please tell me how I can login, maintain a session and then finally close the connection.
Source - http://www.mkyong.com/java/java-https-client-httpsurlconnection-example/
import java.net.MalformedURLException;
import java.net.URL;
import java.security.cert.Certificate;
import java.io.*;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLPeerUnverifiedException;
public class HttpsClient{
public static void main(String[] args)
{
new HttpsClient().testIt();
}
private void testIt(){
String https_url = "https://www.google.com/";
URL url;
try {
url = new URL(https_url);
HttpsURLConnection con = (HttpsURLConnection)url.openConnection();
//dump all cert info
print_https_cert(con);
//dump all the content
print_content(con);
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
private void print_https_cert(HttpsURLConnection con){
if(con!=null){
try {
System.out.println("Response Code : " + con.getResponseCode());
System.out.println("Cipher Suite : " + con.getCipherSuite());
System.out.println("\n");
Certificate[] certs = con.getServerCertificates();
for(Certificate cert : certs){
System.out.println("Cert Type : " + cert.getType());
System.out.println("Cert Hash Code : " + cert.hashCode());
System.out.println("Cert Public Key Algorithm : "
+ cert.getPublicKey().getAlgorithm());
System.out.println("Cert Public Key Format : "
+ cert.getPublicKey().getFormat());
System.out.println("\n");
}
} catch (SSLPeerUnverifiedException e) {
e.printStackTrace();
} catch (IOException e){
e.printStackTrace();
}
}
}
private void print_content(HttpsURLConnection con){
if(con!=null){
try {
System.out.println("****** Content of the URL ********");
BufferedReader br =
new BufferedReader(
new InputStreamReader(con.getInputStream()));
String input;
while ((input = br.readLine()) != null){
System.out.println(input);
}
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
Every website manages logins differently. You will need to scout the website, find out how the session is maintained, and mimic the functions in such a way that the server can't tell that it is not a browser.
In general, a web server stores a secret hash (the session identifier) in a cookie. Here is the process:
Post the login and password to the login URL, using HttpsURLConnection to send the form.
The server responds with a Set-Cookie header containing the hash that it wants stored in the cookie. The cookie usually has "session" in its name.
Send subsequent requests with that hash in the Cookie header.
All of the above can be done only using URL and HttpsURLConnection, but you will need to mimic a browser exactly to trick the server.
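The cookie handling in the steps above can be sketched with java.net.HttpCookie. The helper below turns the Set-Cookie values from the login response into a Cookie header value for later requests. The class name and the sample cookie names are hypothetical, and a real site may need more than this (hidden form fields, redirects, CSRF tokens).

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.List;

class SessionCookies {
    // Reduces Set-Cookie header values to the "name=value; name=value" form
    // that is sent back in the Cookie request header.
    static String cookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        for (String headerValue : setCookieValues) {
            for (HttpCookie cookie : HttpCookie.parse(headerValue)) {
                pairs.add(cookie.getName() + "=" + cookie.getValue());
            }
        }
        return String.join("; ", pairs);
    }
}
```

With HttpsURLConnection, the input would come from con.getHeaderFields().get("Set-Cookie") after the login POST, and the result would be passed to setRequestProperty("Cookie", ...) on each following connection. Alternatively, installing a java.net.CookieManager via CookieHandler.setDefault lets the JDK track cookies across connections for you.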
For scouting, I would recommend using a tool like Fiddler. It captures all communication to and from the web server, so that you can see exactly what is going on at the HTTP level and mimic it in your Java code.
Here is an overview of Fiddler. I have never looked at the logs. Fiddler has a sweet interface. The video is really boring, but it gives an overview of the interface. You want to look at the raw text view, and mimic that.
For your other question, OWASP is a great resource for best practices. The reality is that there is a lot of insecure and bad code out there that does stuff that you would never expect. I have seen a server put the boolean value inside of a script tag to be stored as a JavaScript variable. You just have to carefully watch how the server changes the responses after you log in. A popular website following best practices will use the above method.
I've written a bean that takes photos from a webcam. I'd like to display these images in a JSF 2.0 page and update them every n seconds.
If I give the path name of the file in Eclipse like this, it works:
public String getNewPhoto() {
try {
File dir = new File("C:/Users/User/MyApp/images/");
FileUtils.cleanDirectory(dir);
}catch(Exception ex) {
ex.printStackTrace();
}
String timeStamp = new SimpleDateFormat("yyyyMMdd_HHmmss").format(Calendar.getInstance().getTime());
try {
webcam = Webcam.getDefault();
webcam.open();
ImageIO.write(webcam.getImage(), "PNG", new File("C:/Users/User/MyApp/images/"+ timeStamp + ".png"));
webcam.close();
}catch(Exception ex) {
ex.printStackTrace();
}
return "C:/Users/User/MyApp/images/" + timeStamp + ".png";
}
With the following XHTML:
<p:graphicImage value="#{myBean.newPhoto}" id="photo" cache="false" />
<p:poll interval="1" listener="#{myBean.increment}" update="photo" />
As expected, all of the above works fine in my dev environment in Eclipse. I'd like to deploy this to my server (Linux). When I change the paths from what you see above to
/var/lib/tomcat7/webapps/MyApp/images
Then the images get saved, but I can't display them in h:graphicImage. I also tried passing:
http://hostname:8080/MyApp/images/....
to h:graphicImage and still no dice, I'm sure I'm missing something real simple. Any help is appreciated!
You need to change it in the image saving line as well . . .
ImageIO.write(webcam.getImage(), "PNG", new File("C:/Users/User/MyApp/images/"+ timeStamp + ".png"));
How about the following:
View
<p:graphicImage value="#{photoBean.newPhoto}" id="photo" cache="false" />
<p:poll listener="#{photoBean.updatePhoto}" interval="1"
update="photo" />
Managed Bean
@ManagedBean
@RequestScoped
public class PhotoBean {
private String realPath;
private String relativePath;
@PostConstruct
public void init() {
relativePath = "/images/webcam.png";
ExternalContext externalContext = FacesContext.getCurrentInstance()
.getExternalContext();
realPath = externalContext.getRealPath(relativePath);
}
public void updatePhoto() {
try {
File file = new File(realPath);
file.delete(); // cleanup
// Use file object to write image...
} catch (IOException e) {
// Implement some exception handling here
e.printStackTrace();
}
}
public String getNewPhoto() {
return relativePath;
}
}
Some more notes (getters)
It's not a good idea to process the webcam image in the getter (getNewPhoto) as getters may be called multiple times by JSF.
See: Why JSF calls getters multiple times
Timestamp
I've removed the timestamp from the filename. I think you've added it to prevent browser caching. This is not required as cache=false already creates unique image URIs.
I have created a web browser using WebView in Android. My aim is to control the content of the WebView before it is loaded. Whenever the content of the WebView makes a request to any domain server, it has to pass through shouldInterceptRequest(). If the URL points to any video uploading site (youtube.com, vimeo.com), I can change it to some Access Denied URL so that the video will not be loaded.
@Override
public WebResourceResponse shouldInterceptRequest(final WebView view, String url) {
try {
if (access.permission(url)) {
return super.shouldInterceptRequest(view, url);
}
} catch (Exception e) {
e.printStackTrace();
}
return getResponseData();
}
private WebResourceResponse getResponseData() {
try {
String str = "Access Denied";
InputStream data = new ByteArrayInputStream(str.getBytes("UTF-8"));
return new WebResourceResponse("text/css", "UTF-8", data);
} catch (IOException e) {
return null;
}
}
But shouldInterceptRequest() is only available from API 11. I NEED it to work from API 8.
Is there any alternative way to implement it? I need to block the URL if it points to any video uploading site BEFORE LOADING ANY DATA.
How about using the http://developer.android.com/reference/android/webkit/WebViewClient.html#shouldOverrideUrlLoading(android.webkit.WebView, java.lang.String) event?
You can block the url and then call http://developer.android.com/reference/android/webkit/WebView.html#loadUrl(java.lang.String) to show anything you want (and even run arbitrary javascript with the "javascript:do_something()" notation)
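Whichever callback you use, the decision itself can live in a plain helper that does not depend on the Android API level. Here is a minimal sketch in plain Java. The class name, the blocked list, and the choice to block unparsable URLs are assumptions, and this is only a guess at what the question's access.permission(url) might check.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class UrlBlocker {
    // Hypothetical block list taken from the sites named in the question.
    static final List<String> BLOCKED = Arrays.asList("youtube.com", "vimeo.com");

    // True if the URL's host is a blocked domain or one of its subdomains.
    static boolean isBlocked(String url) {
        String host;
        try {
            host = new URI(url).getHost();
        } catch (URISyntaxException e) {
            return true; // assumption: treat URLs that cannot be parsed as blocked
        }
        if (host == null) {
            return false; // no host component, e.g. a javascript: URL
        }
        host = host.toLowerCase(Locale.ROOT);
        for (String domain : BLOCKED) {
            if (host.equals(domain) || host.endsWith("." + domain)) {
                return true;
            }
        }
        return false;
    }
}
```

In shouldOverrideUrlLoading you would return true and load your Access Denied page when UrlBlocker.isBlocked(url) is true. Matching on the host rather than on the whole string avoids blocking unrelated URLs that merely contain "youtube.com" somewhere in their path.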