what options do I have to profile a page request in a spring mvc app?
I want to get a breakdown of how long the page request takes, along with the various stages like how long it takes to render the freemarker template, hibernate db calls, etc.
We just accomplished something similar with an interceptor and a custom tag. This solution is "light" enough to be used in production, presents its data as HTML comments at the bottom of the response, and allows you to opt into the more verbose logging with a request parameter. You apply the interceptor below to all request paths you want to profile, and you add the custom tag to the bottom of the desired pages. The placement of the custom tag is important; it should be invoked as close to the end of request processing as possible, as it's only aware of time spent (and objects loaded) prior to its invocation.
package com.foo.web.interceptor;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.handler.HandlerInterceptorAdapter;
public class PageGenerationTimeInterceptor extends HandlerInterceptorAdapter {
public static final String PAGE_START_TIME = "page_start_time";
public static final String PAGE_GENERATION_TIME = "page_generation_time";
public boolean preHandle(HttpServletRequest request, HttpServletResponse response,
Object handler) throws Exception {
request.setAttribute(PAGE_START_TIME, new Long(System.currentTimeMillis()));
return true;
}
public void postHandle(HttpServletRequest request, HttpServletResponse response,
Object handler, ModelAndView modelAndView) throws Exception {
Long startTime = (Long) request.getAttribute(PAGE_START_TIME);
if (startTime != null) {
request.setAttribute(PAGE_GENERATION_TIME, new Long(System.currentTimeMillis() - startTime.longValue()));
}
}
}
The custom tag looks for the request attributes, and uses them to compute the handler time, the view time, and the total time. It can also query the current Hibernate session for first-level cache statistics, which can shed some light on how many objects were loaded by the handler and view. If you don't need the Hibernate information, you can delete the big if block.
package com.foo.web.taglib;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;
import javax.servlet.ServletContext;
import javax.servlet.jsp.JspException;
import javax.servlet.jsp.JspWriter;
import javax.servlet.jsp.tagext.Tag;
import javax.servlet.jsp.tagext.TryCatchFinally;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.engine.CollectionKey;
import org.hibernate.engine.EntityKey;
import org.hibernate.stat.SessionStatistics;
import org.springframework.beans.factory.BeanFactoryUtils;
import org.springframework.context.ApplicationContext;
import org.springframework.web.bind.ServletRequestUtils;
import org.springframework.web.context.support.WebApplicationContextUtils;
import org.springframework.web.servlet.tags.RequestContextAwareTag;
import com.foo.web.interceptor.PageGenerationTimeInterceptor;
public class PageInfoTag extends RequestContextAwareTag implements TryCatchFinally {
private static final long serialVersionUID = -8448960221093136401L;
private static final Logger LOGGER = LogManager.getLogger(PageInfoTag.class);
public static final String SESSION_STATS_PARAM_NAME = "PageInfoTag.SessionStats";
#Override
public int doStartTagInternal() throws JspException {
try {
JspWriter out = pageContext.getOut();
Long startTime = (Long)pageContext.getRequest().getAttribute(PageGenerationTimeInterceptor.PAGE_START_TIME);
Long handlerTime = (Long)pageContext.getRequest().getAttribute(PageGenerationTimeInterceptor.PAGE_GENERATION_TIME);
if (startTime != null && handlerTime != null) {
long responseTime = System.currentTimeMillis() - startTime.longValue();
long viewTime = responseTime - handlerTime;
out.append(String.format("<!-- total: %dms, handler: %dms, view: %dms -->", responseTime, handlerTime, viewTime));
}
if (ServletRequestUtils.getBooleanParameter(pageContext.getRequest(), SESSION_STATS_PARAM_NAME, false)) {
//write another long HTML comment with information about contents of Hibernate first-level cache
ServletContext servletContext = pageContext.getServletContext();
ApplicationContext context = WebApplicationContextUtils.getRequiredWebApplicationContext(servletContext);
String[] beans = BeanFactoryUtils.beanNamesForTypeIncludingAncestors(context,
SessionFactory.class, false, false);
if (beans.length > 0) {
SessionFactory sessionFactory = (SessionFactory) context.getBean(beans[0]);
Session session = sessionFactory.getCurrentSession();
SessionStatistics stats = session.getStatistics();
Map<String, NamedCount> entityHistogram = new HashMap<String, NamedCount>();
out.append("\n<!-- session statistics:\n");
out.append("\tObject keys (").append(String.valueOf(stats.getEntityCount())).append("):\n");
for (Object obj: stats.getEntityKeys()) {
EntityKey key = (EntityKey)obj;
out.append("\t\t").append(key.getEntityName()).append("#").append(key.getIdentifier().toString()).append("\n");
increment(entityHistogram, key.getEntityName());
}
out.append("\tObject key histogram:\n");
SortedSet<NamedCount> orderedEntityHistogram = new TreeSet<NamedCount>(entityHistogram.values());
for (NamedCount entry: orderedEntityHistogram) {
out.append("\t\t").append(entry.name).append(": ").append(String.valueOf(entry.count)).append("\n");
}
Map<String, NamedCount> collectionHistogram = new HashMap<String, NamedCount>();
out.append("\tCollection keys (").append(String.valueOf(stats.getCollectionCount())).append("):\n");
for (Object obj: stats.getCollectionKeys()) {
CollectionKey key = (CollectionKey)obj;
out.append("\t\t").append(key.getRole()).append("#").append(key.getKey().toString()).append("\n");
increment(collectionHistogram, key.getRole());
}
out.append("\tCollection key histogram:\n");
SortedSet<NamedCount> orderedCollectionHistogram = new TreeSet<NamedCount>(collectionHistogram.values());
for (NamedCount entry: orderedCollectionHistogram) {
out.append("\t\t").append(entry.name).append(": ").append(String.valueOf(entry.count)).append("\n");
}
out.append("-->");
}
}
} catch (IOException e) {
LOGGER.error("Unable to write page info tag");
throw new RuntimeException(e);
}
return Tag.EVAL_BODY_INCLUDE;
}
protected void increment(Map<String, NamedCount> histogram, String key) {
NamedCount count = histogram.get(key);
if (count == null) {
count = new NamedCount(key);
histogram.put(key, count);
}
count.count++;
}
class NamedCount implements Comparable<NamedCount> {
public String name;
public int count;
public NamedCount(String name) {
this.name = name;
count = 0;
}
#Override
public int compareTo(NamedCount other) {
//descending count, ascending name
int compared = other.count - this.count;
if (compared == 0) {
compared = this.name.compareTo(other.name);
}
return compared;
}
}
}
Take a look here:
Profiling with Eclipse and remote profile agents on Linux
Tutorial: Profiling with TPTP and Tomcat
An introduction to profiling Java applications using TPTP
TPTP = Eclipse Test and Performance Tools Platform
More links to the stack:
Open Source Profilers in Java
Related
I would like to know if there is a better way (without reflection) to get the java.security.Permissions for a specific URL and Role.
for example:
boolean canAccess = SecurityController.isAllowedToAccessUrl("/pages/confirmOrders.action", Collections.singletonList(new UserPrincipal("Dave")));
would work with the following constraint (web.xml):
<security-constraint>
<web-resource-collection>
<web-resource-name></web-resource-name>
<url-pattern>/pages/confirmOrders.action</url-pattern>
</web-resource-collection>
<auth-constraint>
<role-name>Dave</role-name>
</auth-constraint>
The code, I wrote bellow works well. What I don't like is that I have to use reflection to invoke getContextPolicy from DelegatingPolicy.getInstance() and invoke getPermissionsForRole from ContextPolicy.
import org.jboss.security.jacc.ContextPolicy;
import org.jboss.security.jacc.DelegatingPolicy;
import javax.security.jacc.PolicyConfigurationFactory;
import javax.security.jacc.PolicyContext;
import javax.security.jacc.PolicyContextException;
import javax.security.jacc.WebResourcePermission;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.security.Permissions;
import java.security.Principal;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
public class SecurityController {
private static final Logger LOG = Logger.getLogger(SecurityController.class.getName());
static boolean isAllowedToAccessUrl(final String url, final List<Principal> principalRoles) {
initializeConfigurationInService();
boolean result = false;
for (Principal principalRole : principalRoles) {
try{
final ContextPolicy contextPolicy = getContextPolicy();
final Permissions permissions = getPermissionsFromContextPolicy(contextPolicy, principalRole.getName());
result |= permissions.implies(new WebResourcePermission(url, new String[] {"GET","POST"}));
}catch (Exception e){
LOG.log(Level.SEVERE, "checkAllowed failed checking if : ", e);
}
}
return result;
}
private static void initializeConfigurationInService() {
try {
final PolicyConfigurationFactory policyConfigurationFactory = PolicyConfigurationFactory.getPolicyConfigurationFactory();
policyConfigurationFactory.getPolicyConfiguration(PolicyContext.getContextID(), false);
} catch (PolicyContextException | ClassNotFoundException e) {
LOG.log(Level.INFO, "initializeConfigurationInService", e);
}
}
private static Permissions getPermissionsFromContextPolicy(ContextPolicy contextPolicy, String loginName) throws NoSuchMethodException, IllegalAccessException, InvocationTargetException {
final Method getPermissionsForRole = contextPolicy.getClass().getDeclaredMethod("getPermissionsForRole", String.class);
getPermissionsForRole.setAccessible(true);
return (Permissions) getPermissionsForRole.invoke(contextPolicy, loginName);
}
private static ContextPolicy getContextPolicy() throws NoSuchMethodException, IllegalAccessException, InvocationTargetException {
final DelegatingPolicy delegatingPolicy = DelegatingPolicy.getInstance();
final Method getContextPolicy = delegatingPolicy.getClass().getDeclaredMethod("getContextPolicy", String.class);
getContextPolicy.setAccessible(true);
return (ContextPolicy) getContextPolicy.invoke(delegatingPolicy, PolicyContext.getContextID());
}
}
I read programmatically retrieve security constraints from web.xml but found it not very useful.
Any comments, ideas are really welcome. Thanks!
A similar standard method to do the 'isAllowedToAccessUrl` function is available in Java EE 8.
boolean hasAccessToWebResource(String resource, String... methods)
Checks whether the caller has access to the provided "web resource"
using the given methods, as specified by section 13.8 of the Servlet
specification. A caller has access if the web resource is either not
protected (constrained), or when it is protected by a role and the
caller is in that role.
See: SecurityContext#hasAccessToWebResource
Thanks to the comment of Uux I was able to shorten my code and get rid of using reflection. I am now able to verify if a specific role is allowed to access a specific URL in my code.
workable code below:
import javax.security.jacc.WebResourcePermission;
import java.security.CodeSource;
import java.security.Policy;
import java.security.Principal;
import java.security.ProtectionDomain;
import java.security.cert.Certificate;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
public class SecurityController {
private static final Logger LOG = Logger.getLogger(SecurityController.class.getName());
static boolean isAllowedToAccessUrl(final String url, final List<Principal> principalRoles) {
try {
final CodeSource codesource = new CodeSource(null, (Certificate[]) null);
final Principal[] principals = principalRoles.toArray(new Principal[0]);
final ProtectionDomain domain = new ProtectionDomain(codesource, null, null, principals);
return Policy.getPolicy().implies(domain, (new WebResourcePermission(url, new String[] {"GET", "POST"})));
} catch (Exception e) {
LOG.log(Level.SEVERE, "checkAllowed failed checking if : ", e);
}
return false;
}
}
I searched on net, found similar issues. As I'm newbie to LDAP, had to reach out for help.
Right now code brings all the groups for a user. When user1 logins, it brings Group A.
New Requirement is:
If Group A is member of Group B, we need to retrieve Group B as well along with Group A.
I'm trying to achieve this by tweaking query. I read about some matching rules OID 1.2.840.113556.1.4.1941 & LDAP_MATCHING_RULE_IN_CHAIN. But couldn't figure out how to implement in my code.
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;
import javax.naming.ldap.InitialLdapContext;
import javax.naming.ldap.LdapContext;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Hashtable;
import java.util.List;
public abstract class SAPSecurityFilter implements Filter {
protected abstract SAPPrincipal buildGroups(SAPPrincipal principal, NamingEnumeration<SearchResult> results) throws NamingException;
private static final String SECURE_ENTERPRISE_DIRECTORY = "ldaps://ldap.abc.com:636/o=abc.com";
private static final String PRINCIPAL_NAME = "SAPPrincipal";
private static final String ENTERPRISE_DIRECTORY = "ldap://ldap.abc.com:389/o=abc.com";
private static final String USER_KEY = "HTTP_SM_USER";
private static final String BASE = "ou=Groups";
private static final String GROUP_QUERY = "(member=uid=%s,ou=People,o=abc.com)";
private final CacheManager cacheManager;
private List<String> excludeUrlPatterns = new ArrayList<String>();
public SAPSecurityFilter() {
// Setup Cache for principals
// cache Manager
URL url = getClass().getResource("/data-cache.xml");
cacheManager = new CacheManager(url);
}
public void destroy() {
// TODO Auto-generated method stub
}
/**
* doFilter
* <p/>
* Read the request headers for the HTTP_SM_USER value
* This value is the users email address.
* Using the email address lookup the users values in Enterprise directory
* Populate the principal and place it in request scope.
*/
public void doFilter(ServletRequest request, ServletResponse response,
FilterChain chain) throws IOException, ServletException {
//SAPt the request into HttpServletRequest
String path = ((HttpServletRequest) request).getPathInfo();
if (patternExcluded(path) || "OPTIONS".equalsIgnoreSAPe(((HttpServletRequest) request).getMethod())) {
chain.doFilter(request, response);
} else {
String smUser = ((HttpServletRequest) request).getRemoteUser();
HttpSession session = ((HttpServletRequest) request).getSession();
if (smUser == null) throw new ServletException("USER TOKEN MISSING");
// use the smUser to get the data needed to build a principal
LdapContext ctx = null;
// build SAP principal //
SAPPrincipal principal = new SAPPrincipal();
principal.setName(smUser);
//Cache cache = cacheManager.getCache("principalCache");
//Element element = cache.get(smUser);
// Cache miss for user
if (session.getAttribute(PRINCIPAL_NAME) == null) {
try {
ctx = getLdapContext(smUser);
SearchControls constraints = new SearchControls();
constraints.setSearchScope(SearchControls.SUBTREE_SCOPE);
String[] attrs = {"cn"};
constraints.setReturningAttributes(attrs);
String filter = String.format(GROUP_QUERY, smUser);
NamingEnumeration<SearchResult> results = ctx.search(BASE, filter, constraints);
principal = buildGroups(principal, results);
//cache.put(new Element(smUser, principal));
session.setAttribute(PRINCIPAL_NAME, principal);
} catch (NamingException ne) {
throw new ServletException(ne);
} finally {
try {
if (ctx != null) ctx.close();
} catch (NamingException ne) {
// swallow on purpose
}
}
// Cache Hit for user
} else {
principal = (SAPPrincipal) session.getAttribute(PRINCIPAL_NAME);
}
// add principal to securityContext and SAPContext//
SAPContext.setPrincipal(principal);
chain.doFilter(new SecurityRequestWrapper(principal, (HttpServletRequest) request), response);
}
}
Your filter needs to be something like:
(member:1.2.840.113556.1.4.1941:=(CN=UserName,CN=Users,DC=YOURDOMAIN,DC=NET))
form:http://ldapwiki.willeke.com/wiki/Active%20Directory%20User%20Related%20Searches
-jim
Ok, the case is simple. I need to be able to enable/disable logging for a JDK class (HttpURLConnection) programmatically.
public class HttpLoggingTest {
/**
Just a dummy to get some action from HttpURLConnection
*/
private static void getSomething(String urlStr) throws MalformedURLException, IOException {
System.out.println("----- " + urlStr);
HttpURLConnection conn = (HttpURLConnection) new URL("http://www.google.com").openConnection();
for (Entry<String, List<String>> header : conn.getHeaderFields().entrySet()) {
System.out.println(header.getKey() + "=" + header.getValue());
}
conn.disconnect();
}
public static void main(String[] args) throws MalformedURLException, IOException {
// HERE : Enable JDK logging for class
// sun.net.www.protocol.http.HttpURLConnection
getSomething("http://www.goodle.com");
// HERE: Disable JDK logging for class
// sun.net.www.protocol.http.HttpURLConnection
getSomething("http://www.microsoft.com");
}
}
In other words: before the first URL call the logging must be enabled and then disabled before the next call.
That is the challenge !
I'm unable to figure out how to do it.
Must work with Java 7.
Note:
I can do it by using configuration file, logging.properties :
sun.net.www.protocol.http.HttpURLConnection.level = ALL
but I want to have a programmatic solution.
UPDATE
Here's code that works in Java 6 but not in Java 7:
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.List;
import java.util.Map.Entry;
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;
public class HttpLoggingTest {
/**
Just a dummy to get some action from HttpURLConnection
*/
private static void getSomething(String urlStr) throws MalformedURLException, IOException {
System.out.println("----- " + urlStr);
HttpURLConnection conn = (HttpURLConnection) new URL("http://www.google.com").openConnection();
for (Entry<String, List<String>> header : conn.getHeaderFields().entrySet()) {
System.out.println(header.getKey() + "=" + header.getValue());
}
conn.disconnect();
}
private static void enableConsoleHandler() {
//get the top Logger
Logger topLogger = java.util.logging.Logger.getLogger("");
// Handler for console (reuse it if it already exists)
Handler consoleHandler = null;
//see if there is already a console handler
for (Handler handler : topLogger.getHandlers()) {
if (handler instanceof ConsoleHandler) {
//found the console handler
consoleHandler = handler;
break;
}
}
if (consoleHandler == null) {
//there was no console handler found, create a new one
consoleHandler = new ConsoleHandler();
topLogger.addHandler(consoleHandler);
}
consoleHandler.setLevel(Level.ALL);
}
public static void main(String[] args) throws MalformedURLException, IOException {
enableConsoleHandler();
final Logger httpLogger = Logger.getLogger("sun.net.www.protocol.http.HttpURLConnection");
// Enable JDK logging for class
//sun.net.www.protocol.http.HttpURLConnection
httpLogger.setLevel(java.util.logging.Level.FINE);
getSomething("http://www.goodle.com");
// Disable JDK logging for class
// sun.net.www.protocol.http.HttpURLConnection
httpLogger.setLevel(java.util.logging.Level.INFO);
getSomething("http://www.microsoft.com");
}
}
UPDATE2
In order to make sure that a solution only enables output from our target class (and not all sorts of other JDK internal classes) I've created this minimal JAXB example. Here JAXB is simply an example of 'something else', it could have been any other part of the JDK that also use PlatformLogger.
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
/**
* Minimal dummy JAXB example. Only purpose is to provoke
* some JAXB action. Non-prod quality!
*/
#XmlRootElement(name = "book")
public class Celebrity {
#XmlElement
public String getFirstName() {
return "Marilyn";
}
#XmlElement
public String getLastName() {
return "Monroe";
}
public void printXML() {
JAXBContext context;
try {
context = JAXBContext.newInstance(Celebrity.class);
Marshaller m = context.createMarshaller();
m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
m.marshal(this, System.out);
} catch (JAXBException ex) {
}
}
}
Instantiate an instance of the Celebrity class and call printXML(). Put that into getSomething() method. This must not generate JAXB internal logging output ... or else you've enabled logging for more than you thought.
Stumbled over PlatformLoggingMXBean the other day. I'll need to try something like:
PlatformLoggingMXBean platformLoggingMXBean =
ManagementFactory.getPlatformMXBean(PlatformLoggingMXBean.class);
platformLoggingMXBean.setLoggerLevel(
"sun.net.www.protocol.http.HttpURLConnection", "FINE");
and see it it works.
Try:
java.util.logging.Logger logger =
java.util.logging.Logger.getLogger(
"sun.net.www.protocol.http.HttpURLConnection");
logger.setLevel(java.util.logging.Level.FINE);
Is there a good way to get the logged in user count in a Java web application that is running in a cluster?
I wrote a simple HttpSessionListener with a static field, but I suppose this doesn't work in cluster. I can see there is a Spring Security solution, but I read in some forums that this is still not ok in cluster.
The product in which I have to implement this user count is trying to be application server independent, currently we support Tomcat, Weblogic and JBoss. At the moment I need a solution for Weblogic 10.3 clusters.
You can maintain the counter in database which will work in cluster env.
A simple tutorial to demonstrate how to determine active users / sessions in a Java Web Application.
package com.hubberspot.javaee.listener;
import javax.servlet.annotation.WebListener;
import javax.servlet.http.HttpSessionEvent;
import javax.servlet.http.HttpSessionListener;
#WebListener
public class OnlineUsersCounter implements HttpSessionListener {
private static int numberOfUsersOnline;
public OnlineUsersCounter() {
numberOfUsersOnline = 0;
}
public static int getNumberOfUsersOnline() {
return numberOfUsersOnline;
}
public void sessionCreated(HttpSessionEvent event) {
System.out.println("Session created by Id : " + event.getSession().getId());
synchronized (this) {
numberOfUsersOnline++;
}
}
public void sessionDestroyed(HttpSessionEvent event) {
System.out.println("Session destroyed by Id : " + event.getSession().getId());
synchronized (this) {
numberOfUsersOnline--;
}
}
}
Running the below servlet on three different browsers will provide output as : (see fig below)
package com.hubberspot.javaee;
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebInitParam;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;
import com.hubberspot.javaee.listener.OnlineUsersCounter;
// #WebServlet annotation has a initParams field which takes
// in initialization parameters for a servlet.
// #WebInitParam annotation takes in a name and value for the
// initialization parameters for the current Servlet.
#WebServlet(name = "HelloWorldServlet" , urlPatterns = { "/HelloWorldServlet" }
, initParams = { #WebInitParam(name = "user" , value = "Jonty") })
public class HelloWorldServlet extends HttpServlet {
protected void doGet(
HttpServletRequest request,
HttpServletResponse response
) throws ServletException, IOException {
response.setContentType("text/html");
PrintWriter out = response.getWriter();
// sessionCreated method gets executed
HttpSession session = request.getSession();
session.setMaxInactiveInterval(60);
try {
out.println("<html>");
out.println("<body>");
out.println("<h2>Number of Users Online : "
+ OnlineUsersCounter.getNumberOfUsersOnline()
+ "</h2>");
out.println("</body>");
out.println("</html>");
} finally {
out.close();
}
}
}
Output of the program :
Eclipse Browser ->
Firefox Browser ->
Internet Explorer Browser ->
Console Output ->
For more: http://www.hubberspot.com/2013/09/how-to-determine-active-users-sessions.html
hello:
I'm writing code in java for nutch(open source search engine) to remove the movments from arabic words in the indexer.
I don't know what is the error in it.
Tthis is the code:
package com.mycompany.nutch.indexing;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.log4j.Logger;
import org.apache.nutch.crawl.CrawlDatum;
import org.apache.nutch.crawl.Inlinks;
import org.apache.nutch.indexer.IndexingException;
import org.apache.nutch.indexer.IndexingFilter;
import org.apache.nutch.indexer.NutchDocument;
import org.apache.nutch.parse.getData().parse.getData();
public class InvalidUrlIndexFilter implements IndexingFilter {
private static final Logger LOGGER =
Logger.getLogger(InvalidUrlIndexFilter.class);
private Configuration conf;
public void addIndexBackendOptions(Configuration conf) {
// NOOP
return;
}
public NutchDocument filter(NutchDocument doc, Parse parse, Text url,
CrawlDatum datum, Inlinks inlinks) throws IndexingException {
if (url == null) {
return null;
}
char[] parse.getData() = input.trim().toCharArray();
for(int p=0;p<parse.getData().length;p++)
if(!(parse.getData()[p]=='َ'||parse.getData()[p]=='ً'||parse.getData()[p]=='ُ'||parse.getData()[p]=='ِ'||parse.getData()[p]=='ٍ'||parse.getData()[p]=='ٌ' ||parse.getData()[p]=='ّ'||parse.getData()[p]=='ْ' ||parse.getData()[p]=='"' ))
new String.append(parse.getData()[p]);
return doc;
}
public Configuration getConf() {
return conf;
}
public void setConf(Configuration conf) {
this.conf = conf;
}
}
I think that the error is in using parse.getdata() but I don't know what I should use instead of it?
The line
char[] parse.getData() = input.trim().toCharArray();
will give you a compile error because the left hand side is not a variable. Please replace parse.getData() by a unique variable name (e.g. parsedData) in this line and the following lines.
Second the import of
import org.apache.nutch.parse.getData().parse.getData();
will also fail. Looks a lot like a text replace issue.